
Preload model for the ollama provider #1190

Open · 1 of 2 tasks
sgwhat opened this issue Apr 26, 2024 · 2 comments
Labels: enhancement (New feature or request)

Comments

sgwhat commented Apr 26, 2024

Validations

  • I believe this is a way to improve. I'll try to join the Continue Discord for questions
  • I'm not able to find an open issue that requests the same enhancement

Problem

I've noticed that when I chat with the ollama provider, the model is loaded during the first round of conversation, which makes the first round much slower than subsequent ones.

I'm exploring ways to have ollama preload the model. Even though I ran ollama run llama2:latest before starting a conversation, the model still loads at the start of the first round.
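For illustration, the cold-start gap described above can be measured by timing two identical requests back to back; a minimal sketch in TypeScript (Node 18+ for the built-in fetch), assuming an Ollama server at the default localhost:11434 with llama2:latest already pulled:

```typescript
// Time two identical non-streaming generate requests: the first includes
// the model load, the second hits a model that is already in memory.
async function timedGenerate(prompt: string): Promise<number> {
  const start = Date.now();
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "llama2:latest", prompt, stream: false }),
  });
  await res.text(); // wait for the full (non-streamed) response body
  return Date.now() - start;
}

console.log("first round (ms):", await timedGenerate("Hello"));
console.log("second round (ms):", await timedGenerate("Hello"));
```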

Solution

No response

sgwhat added the enhancement (New feature or request) label Apr 26, 2024
sestinj (Contributor) commented Apr 26, 2024

@sgwhat https://github.com/ollama/ollama/blob/main/docs/api.md#load-a-model

This might be a good solution

sgwhat (Author) commented Apr 26, 2024

I have tried that, but it still doesn't work.
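For context, the linked API doc describes loading a model into memory by sending a generate request that names the model but omits the prompt. A minimal sketch of that preload call in TypeScript (Node 18+ for the built-in fetch), assuming an Ollama server at the default localhost:11434; the preloadModel helper name is hypothetical:

```typescript
// Preload a model into memory without generating any tokens, per
// ollama/docs/api.md#load-a-model: a generate request with no prompt
// loads the model and returns without producing a completion.
async function preloadModel(model: string): Promise<void> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // keep_alive: -1 asks Ollama to keep the model in memory indefinitely
    // (by default it is unloaded after about five minutes of inactivity).
    body: JSON.stringify({ model, keep_alive: -1 }),
  });
  if (!res.ok) {
    throw new Error(`Preload failed: ${res.status} ${res.statusText}`);
  }
}

// Warm up the model before the first chat round.
await preloadModel("llama2:latest");
```

One possible explanation for a preload appearing not to work: Ollama's default keep_alive unloads a model roughly five minutes after the last request, so a model preloaded too early may already be evicted by the time the first chat round starts.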
