
Downloading LLama2 and Other LLMs Running So Slow #437

Open
AfamO opened this issue Jan 9, 2024 · 0 comments
AfamO commented Jan 9, 2024

Describe the bug
I noticed that downloading and running LLMs such as Llama 2 is very slow. On my local system it takes a long time to download the model before any completions are generated. The same workflow is typically faster on Colab when quantization is used.
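
For reference, here is a rough sketch of the kind of 4-bit quantized loading that makes the Colab runs faster. This uses Hugging Face transformers with bitsandbytes, not LLM-VM itself; the model id and prompt are placeholders for illustration:

```python
# Hypothetical sketch: 4-bit quantized loading with transformers + bitsandbytes.
# This is not LLM-VM's API; the model id and prompt are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"  # assumed Llama 2 checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 4-bit weights
    device_map="auto",  # spread layers across available devices (needs accelerate)
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```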

To Reproduce
Steps to reproduce the behavior:

  1. Write any valid LLM-VM completion-generation code (see the sketch after this list).
  2. Pass 'llama2' as the 'big_model' parameter.
  3. Run the code.
  4. Observe that downloading the model and generating completions are very slow.
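
A minimal sketch of the kind of completion code I mean, based on LLM-VM's documented client usage (the prompt text is a placeholder):

```python
# Minimal LLM-VM completion sketch; the prompt is a placeholder.
from llm_vm.client import Client

# Selecting Llama 2 as the big model triggers the slow download.
client = Client(big_model='llama2')

# Generating a completion is also very slow locally.
response = client.complete(prompt='What is the capital of France?', context='')
print(response)
```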

Expected behavior
The whole process of downloading the LLM model (shards, checkpoints, etc.) and generating completions should be faster, ideally finishing within 3-6 minutes. On Colab it takes 3-4 minutes on average.

Screenshots
[Screenshot: llama-running-slowly]

Desktop (please complete the following information):

  • OS: Windows 10 Pro
  • Python: 3.11.2
  • RAM: 16GB

Additional context
I am not running the code from a notebook; I have yet to try it in a Jupyter notebook. Instead, I am running it from the command prompt that ships with the PyCharm IDE.

AfamO changed the title from "Downloading LLama2 and Other Models Running So Slow" to "Downloading LLama2 and Other LLMs Running So Slow" on Jan 9, 2024