Prepare model for deployment to Private Vertex AI endpoint #55

Open
BriianPowell opened this issue Apr 4, 2024 · 4 comments
Labels
type:support Support issues

Comments

@BriianPowell

Hello, I have a use case where I'd like to deploy this model to a private Vertex AI endpoint. Is there any documentation or literature on how to do that?

@pkgoogle

Hi @BriianPowell, welcome. Does this answer your question? https://cloud.google.com/vertex-ai/generative-ai/docs/open-models/use-gemma Let us know if it does not.

@BriianPowell
Author

Hey there, @pkgoogle, thanks for getting back to me. Currently there is no way to deploy the Gemma version from the Model Garden to a private Vertex AI endpoint. I have some constraints on my project where the Vertex AI endpoint needs to be attached to a VPC.

I am thinking about following this guide from one of the links located in that article.

Would this work in conjunction with the image that's being created in this repo, or are they totally different things?

@pkgoogle

Hi @BriianPowell, I believe it will work -- can you give it a try and see if you run into any issues? Thanks.

@BriianPowell
Author

@pkgoogle Just thinking here, but my understanding is that the current state of this project doesn't allow hosting the image as an api_server? I think I may have to go the vLLM or Hex-LLM route.

@tilakrayal added the type:support (Support issues) label on Apr 24, 2024