[Feature Request] Qwen model support #52

JianbangZ · 2023-09-26T14:50:20Z

The Qwen 7B/14B models look strong. I understand we don't have access to their dataset, but it would still be extremely useful to have a Medusa model fine-tuned on a smaller Chinese/English dataset.

JianbangZ · 2023-10-11T14:53:40Z

I tried to implement this myself, but without success. Training runs, but it throws a bunch of out-of-memory errors and token mismatch errors. Help is needed on this one.
Overall, I think it's important to support Llama2, Mistral, and Qwen, as these are the three most popular models in the community right now.
