
Support Quantized Model #20

Open
SeanHH86 opened this issue Mar 13, 2024 · 1 comment
SeanHH86 (Collaborator) commented Mar 13, 2024

Support quantized models, for example:
https://huggingface.co/THUDM/chatglm2-6b-int4
https://huggingface.co/Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
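
For reference, a GPTQ Int4 checkpoint like the Qwen one above can usually be loaded with Hugging Face `transformers` plus `optimum` and `auto-gptq`; the chatglm2-6b-int4 checkpoint additionally needs `trust_remote_code=True`. A minimal sketch, assuming the serving backend wraps `transformers` (this is not necessarily this project's loading path):

```python
# Minimal sketch: load and run a GPTQ-quantized chat model with transformers.
# Assumes: pip install transformers optimum auto-gptq (and a CUDA-capable GPU).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-72B-Chat-GPTQ-Int4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard the quantized weights across available GPUs
    torch_dtype="auto",  # keep the dtype stored in the checkpoint
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```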

SeanHH86 assigned SeanHH86 and unassigned SeanHH86 on Mar 13, 2024
SeanHH86 (Collaborator, Author) commented
Inference speed is slow.
