
How to use the finetuned Mistral model for inference with Medusa #75

pradeepdev-1995 opened this issue Jan 24, 2024 · 7 comments

@pradeepdev-1995
How do I use a finetuned Mistral model for inference with Medusa?

@ctlllll (Contributor) commented Jan 25, 2024

As an example, you can refer to the Zephyr model (python -m medusa.inference.cli --model FasterDecoding/medusa-1.0-zephyr-7b-beta) :)
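For anyone following along, the command above can be run as-is against the released Zephyr checkpoint. The local path in the second invocation is a hypothetical placeholder for a checkpoint you have finetuned yourself, not something from the thread:

```shell
# Run the Medusa inference CLI against the released Zephyr-7B checkpoint
# (fetches the Medusa heads and the base model from the Hugging Face Hub).
python -m medusa.inference.cli --model FasterDecoding/medusa-1.0-zephyr-7b-beta

# For a model you trained yourself, point --model at your own output
# directory instead (the path below is a placeholder):
python -m medusa.inference.cli --model ./my-medusa-mistral-checkpoint
```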

@pradeepdev-1995 (Author) commented Jan 25, 2024

@ctlllll
It seems that this command expects a Medusa model:

python -m medusa.inference.cli --model [path of medusa model]

But in my case I am using the Mistral model (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), which is not Medusa-based. So can I use the Medusa library to improve my Mistral model's inference time?

@eldhosemjoy
You will need to train Medusa heads on top of the Hugging Face model before you can use it for inference.

@pradeepdev-1995 (Author)

@eldhosemjoy How do I train the Hugging Face model with Medusa heads? Can you share a reference?

@eldhosemjoy

You can use this script: https://github.com/FasterDecoding/Medusa/blob/main/medusa/train/train_legacy.py
I believe it is a Llama example, but you can try it with Mistral.
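As a rough sketch of what invoking that script could look like: the flag names below follow the training recipe in the Medusa README, but the dataset file, batch size, epoch count, and output directory are placeholder assumptions that should be checked against the script's own argument parser before use.

```shell
# Hypothetical sketch: train Medusa heads on top of Mistral-7B-Instruct.
# Flag names follow the Medusa README's training recipe; values are
# placeholders, and --data_path must point to a ShareGPT-style JSON file.
torchrun --nproc_per_node=4 medusa/train/train_legacy.py \
    --model_name_or_path mistralai/Mistral-7B-Instruct-v0.2 \
    --data_path ShareGPT_V4.3_unfiltered_cleaned_split.json \
    --bf16 True \
    --output_dir medusa-mistral-7b \
    --num_train_epochs 2 \
    --per_device_train_batch_size 8 \
    --learning_rate 1e-3 \
    --medusa_num_heads 3 \
    --medusa_num_layers 1
```

Note that in the Medusa-1 setup only the extra decoding heads are trained while the backbone stays frozen, so this is considerably cheaper than full finetuning.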

@MoOo2mini commented Feb 5, 2024

Is there a way to run inference without training? I don't have the computing resources to train, so I would like to run inference without it.

@gangooteli