
How to use the finetuned Mistral model for inference with Medusa #75

pradeepdev-1995 opened this issue Jan 24, 2024 · 7 comments

@pradeepdev-1995
How do I use a finetuned Mistral model for inference with Medusa?

@ctlllll (Contributor) commented Jan 25, 2024

As an example, you can refer to the Zephyr model (python -m medusa.inference.cli --model FasterDecoding/medusa-1.0-zephyr-7b-beta) :)
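For anyone following along, the command above can be run as-is against the released Zephyr checkpoint. The local path in the second invocation is a hypothetical placeholder for a checkpoint you have finetuned yourself, not something from the thread:

```shell
# Run the Medusa inference CLI against the released Zephyr-7B checkpoint
# (fetches the Medusa heads and the base model from the Hugging Face Hub).
python -m medusa.inference.cli --model FasterDecoding/medusa-1.0-zephyr-7b-beta

# For a model you trained yourself, point --model at your own output
# directory instead (the path below is a placeholder):
python -m medusa.inference.cli --model ./my-medusa-mistral-checkpoint
```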

@pradeepdev-1995 (Author) commented Jan 25, 2024

@ctlllll
It seems that this command expects a Medusa model:

python -m medusa.inference.cli --model [path of medusa model]

But in my case I am using the Mistral model (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), which is not Medusa-based. So can I use the Medusa library to improve my Mistral model's inference time?

@eldhosemjoy
You will need to train Medusa heads on top of the Hugging Face model before you can use it for inference.

@pradeepdev-1995 (Author)

@eldhosemjoy How do I train the Hugging Face model with Medusa heads? Can you share a reference?

@eldhosemjoy

You can use this script: https://github.com/FasterDecoding/Medusa/blob/main/medusa/train/train_legacy.py
I believe it is a Llama example, but you can try it with Mistral.
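As a rough sketch of what invoking that script could look like: the flag names below follow the training recipe in the Medusa README, but the dataset file, batch size, epoch count, and output directory are placeholder assumptions that should be checked against the script's own argument parser before use.

```shell
# Hypothetical sketch: train Medusa heads on top of Mistral-7B-Instruct.
# Flag names follow the Medusa README's training recipe; values are
# placeholders, and --data_path must point to a ShareGPT-style JSON file.
torchrun --nproc_per_node=4 medusa/train/train_legacy.py \
    --model_name_or_path mistralai/Mistral-7B-Instruct-v0.2 \
    --data_path ShareGPT_V4.3_unfiltered_cleaned_split.json \
    --bf16 True \
    --output_dir medusa-mistral-7b \
    --num_train_epochs 2 \
    --per_device_train_batch_size 8 \
    --learning_rate 1e-3 \
    --medusa_num_heads 3 \
    --medusa_num_layers 1
```

Note that in the Medusa-1 setup only the extra decoding heads are trained while the backbone stays frozen, so this is considerably cheaper than full finetuning.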

@MoOo2mini commented Feb 5, 2024

Is there a way to run inference without training? I don't have the computing resources to train, so I would like to run inference without it.

@gangooteli