OSS-Pole-Emploi/happy_vllm

License: AGPL v3

📚 Documentation: https://oss-pole-emploi.github.io/happy_vllm/


happy_vLLM is a REST API for vLLM, developed with production use in mind. It adds functionality on top of vLLM.

Installation

You can install happy_vLLM using pip:

pip install happy_vllm

Or build it from source:

git clone https://github.com/OSS-Pole-Emploi/happy_vllm.git
cd happy_vllm
pip install -e .

Quickstart

Use the happy-vllm entrypoint (see arguments for the full list of available arguments):

happy_vllm --model path_to_model --host 127.0.0.1 --port 5000 --model-name my_model

This launches the API, which you can query directly. For example, to get various information about the application:

curl 127.0.0.1:5000/v1/info

To generate your first LLM response with happy_vLLM:

curl 127.0.0.1:5000/v1/completions -H "Content-Type: application/json" -d '{"prompt": "Hey,", "model": "my_model"}'

See endpoints for more details on all the endpoints provided by happy_vLLM.
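
The same two queries can also be sent from Python. The following is a minimal sketch using only the standard library, not an official client: it reuses the host, port, model name and payload from the Quickstart, and it assumes that both endpoints answer with JSON. The helper names (build_completion_request, fetch_json, query_quickstart_server) are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Host and port taken from the Quickstart launch command.
BASE_URL = "http://127.0.0.1:5000"


def build_completion_request(prompt, model, base_url=BASE_URL):
    """Build (without sending) a POST request for /v1/completions."""
    body = json.dumps({"prompt": prompt, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url, timeout=30):
    """Send the request and decode the body, assuming the server replies with JSON."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def query_quickstart_server():
    """Run both Quickstart queries; needs a happy_vLLM server that is already running."""
    info = fetch_json(f"{BASE_URL}/v1/info")
    completion = fetch_json(build_completion_request("Hey,", "my_model"))
    return info, completion
```

Calling query_quickstart_server() returns the two decoded responses. Their exact structure is not described here, so inspect them (or the endpoints documentation) before relying on specific fields.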

Deploy with Docker image

A Docker image is available from the GitHub Container Registry:

docker pull ghcr.io/oss-pole-emploi/happy_vllm:latest

See deploying_with_docker for more details on how to serve happy_vLLM with Docker.

Swagger

You can reach the Swagger UI at the /docs endpoint (by default, at 127.0.0.1:5000/docs). It lists all the endpoints, with examples of how to use them.
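
Swagger UI pages are generated from an OpenAPI schema, so the endpoint list can also be read programmatically. The sketch below assumes that the schema is served at /openapi.json, which is the default location for FastAPI applications but is not confirmed here for happy_vLLM. The function names are hypothetical.

```python
import json
import urllib.request


def list_endpoints(openapi_schema):
    """Return (METHOD, path) pairs from an OpenAPI schema dict, sorted by path."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    pairs = []
    for path, operations in openapi_schema.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in http_methods:
                pairs.append((method.upper(), path))
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))


def fetch_openapi_schema(base_url="http://127.0.0.1:5000", timeout=30):
    """Download the schema; needs a running server."""
    # Assumption: the schema lives at /openapi.json (FastAPI's default).
    url = f"{base_url}/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, list_endpoints(fetch_openapi_schema()) gives a quick overview of the available routes without opening a browser.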