The Qwen 7B/14B models look strong. I understand we don't have access to their dataset, but it would still be extremely useful to have a Medusa fine-tuned on a smaller Chinese/English dataset.
I tried to implement this myself, but without success. Training runs, but it throws a bunch of out-of-memory and token-mismatch errors. Help is needed on this one.
Overall, I think it's important to support Llama2, Mistral, and Qwen, as these are the three most popular models in the community right now.