[WIP] adding chatbot-e2e #462

Open
wants to merge 10 commits into
base: main
from
Conversation

HamidShojanazeri
Contributor

Main Goal

Create an e2e recipe for building chatbots in cases where we need to fine-tune a model and cannot rely on RAG alone.

This is a work in progress and still has many gaps to fill and fix. Please don't review it as finished work.

High level idea

We want to focus on the following stages:

  • Data pipelines for creating chatbot datasets
  • Data processing and quality assurance: practices, pipelines, and tooling
  • Evaluation process
  • Fine-tuning a model: best practices, LoRA/QLoRA, and hyperparameters

Use-case

We want to use this as an example to develop the e2e recipe and share it with the broader community, so that others can customize it for their own use cases. As an example, we use a Llama (or possibly PyTorch) FAQ model.

  • Llama FAQ model built from OSS Llama docs, GitHub docs, papers, the website, etc.
  • Proposed data pipeline: use Llama 70B as the teacher model to create Q&A pairs from the Llama docs mentioned above. [Open to any other ideas here]
  • Data quality/eval using the same teacher model. [Open to any other ideas here]
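
To make the proposed data pipeline concrete, here is a minimal sketch of the teacher-model step. It is not the code in this PR: the prompt wording, the JSON reply format, and the names `chunk_text`, `parse_pairs`, and `build_dataset` are all assumptions made for illustration. The `teacher` argument is a hypothetical callable that sends a prompt to whatever endpoint serves the teacher model and returns its text reply.

```python
import json
from typing import Callable, Dict, List

# Hypothetical prompt; the real recipe may word this differently.
PROMPT_TEMPLATE = (
    "Read the documentation excerpt below and write {n} question-answer pairs "
    "that it fully answers. Reply with a JSON list of objects with the keys "
    '"question" and "answer".\n\nExcerpt:\n{chunk}'
)


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Split a document into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def parse_pairs(raw: str) -> List[Dict[str, str]]:
    """Parse the teacher reply, dropping malformed or empty entries."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    pairs = []
    for item in items:
        if not isinstance(item, dict):
            continue
        question = str(item.get("question", "")).strip()
        answer = str(item.get("answer", "")).strip()
        if question and answer:
            pairs.append({"question": question, "answer": answer})
    return pairs


def build_dataset(
    docs: List[str],
    teacher: Callable[[str], str],  # assumption: prompt in, raw text reply out
    pairs_per_chunk: int = 2,
    max_words: int = 200,
) -> List[Dict[str, str]]:
    """Generate Q&A pairs per chunk and drop repeated questions."""
    seen = set()
    dataset = []
    for doc in docs:
        for chunk in chunk_text(doc, max_words):
            prompt = PROMPT_TEMPLATE.format(n=pairs_per_chunk, chunk=chunk)
            for pair in parse_pairs(teacher(prompt)):
                key = pair["question"].lower()
                if key not in seen:
                    seen.add(key)
                    dataset.append(pair)
    return dataset
```

Passing the teacher in as a plain callable keeps the sketch independent of any particular serving stack, and the same hook could later be reused for the quality/eval pass with a different prompt.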

HamidShojanazeri changed the title from "adding chatbot-e2e" to "[WIP] adding chatbot-e2e" on Apr 24, 2024

3 participants