[WIP] adding chatbot-e2e #462

HamidShojanazeri · 2024-04-24T17:05:12Z

Main Goal

Building an e2e recipe for building chatbots where we need to fine-tune a model and wont be able to rely only on RAG.

This is just a work in progress and have many gaps to fill/ fix. Please dont review it as a ready work.

High level idea

We want to focus on following stages:

Data pipelines for creating datasets for chatbots
Data processing/ quality assurance practices/pipelines/tooling
Evaluation process
Fine-tuning a model/ Best practices for fine-tuning/ LORA/QLORA/ hyper params.

Use-case

We want to use this as an example to develop the e2e recipe and share it with broader community to customize for their own use-cases. As an example we use LLAMA (or could be PyTorch) FAQ model.

Llama FAQ model using OSS llama docs, github docs, papers, website, etc.
Proposed data pipeline, using Llama 70b as the teacher model to create Q&A pairs from Llama docs as mentioned above.[ Open to any other ideas here]
Data Quality/ Eval using same teacher model [ Open to any other ideas here]

facebook-github-bot added the cla signed label Apr 24, 2024

HamidShojanazeri changed the title ~~adding chatbot-e2e~~ [WIP] adding chatbot-e2e Apr 24, 2024

HamidShojanazeri and others added 4 commits May 7, 2024 13:26

adding chatbot-e2e

a036811

adding support for vllm local endpoint and llama3 model

230c557

working draft for vllm using llama3-70B

b07cbad

fix generate_qa function

6204d5a

wukaixingxp force-pushed the chatbot-e2e branch from 96f9f9a to 6204d5a Compare May 7, 2024 20:28

wukaixingxp added 6 commits May 7, 2024 15:55

changed requirement.txt and readme.md

d5767a1

adding self-curation using LLM

274ed14

restructured folders and added eval pipeline

9add30a

end-to-end testing on the pipeline

bb96a88

working draft of end-to-end pipelines

6a83585

fixed chatbot_dataset.py

40c03da

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] adding chatbot-e2e #462

[WIP] adding chatbot-e2e #462

HamidShojanazeri commented Apr 24, 2024

[WIP] adding chatbot-e2e #462

Are you sure you want to change the base?

[WIP] adding chatbot-e2e #462

Conversation

HamidShojanazeri commented Apr 24, 2024

Main Goal

High level idea

Use-case