Cannot run llama3 8b instruct: AssertionError: Fail to convert pytorch model
#1522
Comments
Thanks for reporting it, we will check the issue.
@N3RDIUM Hi, according to the errors,
it seems you didn't download the model successfully. Please download the model from HF to local disk and try again, just setting the model_id to the local path.
Another issue is that you didn't define the variable model.device.
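The advice above (download the checkpoint locally, then point `model_id` at the folder) can be sketched as follows. `snapshot_download` is the real `huggingface_hub` API; `resolve_model_id` is a hypothetical helper added here purely for illustration:

```python
from pathlib import Path


def resolve_model_id(local_dir: str, hub_id: str) -> str:
    """Hypothetical helper: prefer a completed local download over the Hub id.

    A local directory only counts as a usable checkpoint if it contains a
    config.json; a partially downloaded folder would fail conversion later
    with errors like the one reported in this thread.
    """
    if Path(local_dir, "config.json").is_file():
        return local_dir
    return hub_id


# Example usage (assumption: network access and Hub credentials are set up):
#   from huggingface_hub import snapshot_download
#   local_dir = snapshot_download("meta-llama/Meta-Llama-3-8B-Instruct")
#   model_id = resolve_model_id(local_dir, "meta-llama/Meta-Llama-3-8B-Instruct")
```

Passing the resolved local path as `model_id` avoids a silent re-download and makes conversion failures easier to diagnose.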
I tried downloading the model again and using the local path as the model ID, but it gives me this error now:
Does this lib support *.pth models? I could go for the original/ dir: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/tree/main/original
Hi,
The code you provided may be incompatible, which means your ITREX or Neural Speed version is a little old. https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py I ran the code successfully last time I replied to you. Please try reinstalling the latest main branch of ITREX and Neural Speed from source.
Okay, will try. Thanks for the quick reply!
It's running out of memory on
Whoops! Closed it by mistake. Anyway, is there any way to reduce memory usage when loading the model from HF? I tried without ITREX and it runs just fine :(
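When debugging out-of-memory failures like the one above, it helps to report the process's memory high-water mark around the loading step. A minimal sketch using the standard library `resource` module (Unix only; `peak_rss_mb` is a hypothetical helper, not part of ITREX or Neural Speed):

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return this process's peak resident set size in MiB.

    ru_maxrss is reported in KiB on Linux and in bytes on macOS,
    so normalise accordingly before converting to MiB.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        peak /= 1024  # bytes -> KiB
    return peak / 1024  # KiB -> MiB


# Example: print the high-water mark before and after loading a model
# print(f"peak RSS before load: {peak_rss_mb():.1f} MiB")
# model = ...  # load / convert here
# print(f"peak RSS after load:  {peak_rss_mb():.1f} MiB")
```

Comparing the two readings shows how much of the spike comes from the load/convert step itself, which is useful context to include in a bug report.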
Great, now I get
Hi, @N3RDIUM
Everyone uses the same function to load the model from HF. The possible difference is in https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py#L1485: please set low_cpu_usage_mem=False before installation. According to my earlier tests, it can sometimes reduce virtual memory usage.
No worries. Just set up a new conda env and reinstall requirements.txt plus ITREX and NS from source; these issues should disappear, I think. I have checked the installation pipeline again using the latest ITREX and NS branches, and it works. Successful installation screenshots (check whether you installed successfully):
I have the same versions as you, yet it gives me the same error:
Oops, did it again, extremely sorry.
I'm not using
Here is the error now:
Which versions of transformers and PyTorch are you on?
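Version questions like this one come up repeatedly in the thread, so it is worth pasting exact installed versions rather than guessing. A small sketch using the standard library's `importlib.metadata` (`get_version` is a hypothetical helper for illustration):

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Example: report the versions the maintainers asked about
# for pkg in ("transformers", "torch",
#             "intel-extension-for-transformers", "neural-speed"):
#     print(pkg, get_version(pkg) or "not installed")
```

Including this output in a report lets maintainers immediately spot a stale ITREX or Neural Speed build.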
Facing the same issue for the given Dockerfile. |
Hey there! I'm trying to run llama3-8b-instruct with Intel Extension for Transformers.
Here's my code:
Here's the error: