
Support for Google Gemini OOTB #1405

Open
mhebrard-bigid opened this issue Apr 30, 2024 · 5 comments

Comments

@mhebrard-bigid

LiteLLM already supports Gemini, so it's probably already doable. It would be nice to support it OOTB, as Gemini has a large context window.
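For reference, LiteLLM can talk to Gemini directly via its gemini/ model prefix; a minimal sketch (assuming pip install litellm and an API key from Google AI Studio):

# Sketch: stand up a local LiteLLM proxy backed by Gemini to confirm support.
export GEMINI_API_KEY="your-key-here"   # placeholder, not a real key
litellm --model gemini/gemini-pro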

@zarlor

zarlor commented May 19, 2024

It is doable now, but I find I have to add an extra step. After you set up the custom LLM for Gemini, and while the Docker containers are running, you then have to run the following two commands (assuming you're using Docker for this):

docker exec -it danswer-stack-api_server-1 pip install -q google-generativeai
docker exec -it danswer-stack-background-1 pip install -q google-generativeai

You'll have to run them every time you restart Danswer. If you're running from source, then you can just run the "pip install google-generativeai" command locally and you should be good to go. But agreed, it would be nice to not have to do that and just have Gemini "work" right out of the box. I've mostly been using Gemini 1.5 Pro lately, and it does seem to do a pretty decent job (and it doesn't hurt that it's free for the moment! :D)
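To avoid retyping those after each restart, a small wrapper script works (a sketch only, assuming the default danswer-stack container names; check docker ps if yours differ):

#!/bin/sh
# Re-install the Gemini SDK into both Danswer containers after a restart.
# Container names below assume the default danswer-stack compose project.
for c in danswer-stack-api_server-1 danswer-stack-background-1; do
  docker exec "$c" pip install -q google-generativeai
done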

@gmoulard

I can run these two commands without any error messages, but I don't know how to set up and use Gemini.
I don't see any changes in the admin interface.

docker exec -it danswer-stack-api_server-1 pip install -q google-generativeai
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

docker exec -it danswer-stack-background-1 pip install -q google-generativeai
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

@zarlor

zarlor commented May 28, 2024

I can run these two commands without any error messages, but I don't know how to set up and use Gemini. I don't see any changes in the admin interface.

docker exec -it danswer-stack-api_server-1 pip install -q google-generativeai
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

docker exec -it danswer-stack-background-1 pip install -q google-generativeai
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

So that warning isn't a problem; it's just pip cautioning about being run as the root user, which is expected inside those containers. You didn't get any real errors, so you are all set for Gemini to work. However, did you configure a Gemini LLM in Danswer? You still need to go to the Admin Panel and, under Model Configs, select LLM.

From there, hit the button at the bottom for "Add Custom LLM Provider", because Gemini is not configured by default as an available LLM; that's why you wouldn't see any changes. Fill in a Display Name of "gemini", set Provider to "gemini" as well, and put your API key in the field for that. Then, all the way down under Model Names, use whichever (or all) of the model names listed near the bottom of https://docs.litellm.ai/docs/providers/gemini, and use one of those names in the Default Model Name field too. You don't really need anything more than that, so just hit "Test" to verify it's all good, then save it.
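As a quick sanity check before filling in the form, you can list the model names your key can actually see straight from the Gemini REST API (a sketch, not part of Danswer; assumes GEMINI_API_KEY holds the same key you paste into the provider config):

# Lists the Gemini models available to this key. Names come back as
# "models/gemini-1.5-pro-latest" etc.; drop the "models/" prefix when
# entering them in the form (the "gemini" provider supplies LiteLLM's
# own "gemini/" routing prefix).
curl -s "https://generativelanguage.googleapis.com/v1beta/models?key=${GEMINI_API_KEY}"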

@mhebrard-bigid
Copy link
Author

Works nicely with @zarlor's instructions.
I'm not sure Danswer is optimized for leveraging the 1M-token window, though. The best I could do was increase the number of chunks from 10 to 20 when configuring a new assistant, so at most we use about 8K tokens for context injection.

@zarlor

zarlor commented May 29, 2024

I did the same, but I haven't seen any issues on the sending side in terms of context. I have seen what seem to be limitations on the receiving side, though. I was doing a small project here to see what Danswer might be able to do for creating code for a Discord connector for Danswer, and larger responses would get cut off. There's probably someplace to change the incoming and outgoing context windows, but I'm not sure where or what that would be, personally.
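For anyone who wants to hunt for that setting, one starting point is to see which generation-related variables the containers actually receive (a sketch; the pattern below is a guess based on Danswer's other setting names, not a confirmed variable):

# Inspect the API server's environment for generation-related knobs.
# "gen_ai" as a pattern is an assumption; widen the grep if nothing matches.
docker exec danswer-stack-api_server-1 printenv | grep -i gen_ai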

It also doesn't (the last time I checked) accept uploads of things like images within Danswer; it just says the model isn't capable of accepting/processing images, even though it is (I guess it assumes any custom LLM can't handle them). So there are definitely some extra limitations for now, but I don't find them too horrible. Even with the code project, I was able to tell Gemini things like "don't include any extra explanatory text, just the code" to get it all, or something like "I received up to the last two lines that say 'blah-blah'. Was that the end of your response and, if not, would you please send the rest of the response starting with those two lines?" I could also copy/paste entire code snippets back to it for verification (which, come to think of it, was a decently large context; Danswer didn't complain about sending it, and it seemed like Gemini got the whole thing, so...)
