
feat: loadbalance multiple ollama servers in kubernetes #1745

Open
keisni opened this issue Apr 25, 2024 · 3 comments

Comments

keisni commented Apr 25, 2024

Maybe this should be regarded as a question.

I deployed open-webui & ollama with the Helm chart in the source code, and I want to make sure that if I run more than one replica of the ollama server, they are load balanced properly and don't cause any inconsistency.

Clearly, these ollama servers will be load balanced through the cluster Service. Does this mean that if I upload a model, it will be uploaded to only one of them? And if a chat request is then routed to a server without the loaded model, will I see an error reply? I know there is an "Update all models" button, but I'm not sure it still works through the single ClusterIP.

keisni commented Apr 26, 2024

The Helm chart in the kubernetes directory creates a StatefulSet and a ClusterIP Service for ollama. Behind a single URL like http://ollama-service.open-webui.svc.cluster.local:11434, open-webui loses track of which instance has which model. So I think this is a mistake. Using a headless Service with a StatefulSet is common practice in k8s, and open-webui could fetch all ollama instance IPs through DNS resolution. Did I miss something?
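For reference, a headless Service for the ollama StatefulSet might look like the sketch below. The Service name and namespace are taken from the URL above; the `app: ollama` selector label is an assumption and would have to match the labels on the StatefulSet's pods.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ollama-service
  namespace: open-webui
spec:
  clusterIP: None        # headless: DNS returns one A record per ready pod
  selector:
    app: ollama          # assumed pod label; must match the StatefulSet's pod template
  ports:
    - port: 11434        # Ollama's default port
      targetPort: 11434
```

With `clusterIP: None`, resolving `ollama-service.open-webui.svc.cluster.local` returns the IPs of all ready pods instead of a single virtual IP, and each StatefulSet pod also gets a stable per-pod name such as `ollama-0.ollama-service.open-webui.svc.cluster.local`, so a client could address each instance individually.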

Expro commented May 3, 2024

Works fine as-is. You can share a volume across multiple instances of ollama, so that a model uploaded to one is visible to all of the instances.
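A minimal sketch of the shared-volume approach, assuming a StorageClass that supports `ReadWriteMany` (e.g. NFS-backed); the names, replica count, and sizes here are illustrative, not taken from the chart:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-models
  namespace: open-webui
spec:
  accessModes: ["ReadWriteMany"]   # all replicas mount the same volume
  resources:
    requests:
      storage: 100Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: open-webui
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama
          ports:
            - containerPort: 11434
          volumeMounts:
            - name: models
              mountPath: /root/.ollama   # Ollama's default model directory
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: ollama-models
```

Since every replica reads from the same `/root/.ollama` directory, a model pulled through any one pod is immediately available to the others, so ordinary ClusterIP load balancing no longer causes "model not found" replies.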

@AdaptiveStep

Does anyone have an idea of how to limit this properly for multiple users?
