An endpoint server for efficiently serving quantized open-source LLMs for code.
Embedding-based semantic search app for poetry [App and EDA notebooks]
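Apps like this typically follow one pattern: embed the corpus once, embed each query, and rank by cosine similarity. A minimal sketch using sentence-transformers (the model name and sample corpus are illustrative, not taken from this repo):

```python
from sentence_transformers import SentenceTransformer, util

# Small general-purpose encoder; any sentence-embedding model works here.
model = SentenceTransformer("all-MiniLM-L6-v2")

poems = [
    "Two roads diverged in a yellow wood",
    "Shall I compare thee to a summer's day?",
    "Because I could not stop for Death",
]

# Encode the corpus once; at query time only the query is embedded.
corpus_embeddings = model.encode(poems, convert_to_tensor=True)

query = "a choice between two paths"
query_embedding = model.encode(query, convert_to_tensor=True)

# Cosine-similarity search returns the top-k closest corpus entries.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {poems[hit['corpus_id']]}")
```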
Preserving entities by integrating knowledge graphs, Llama 2, vLLM, and LangChain.
This repository demonstrates LLM execution on CPUs using packages like llamafile, emphasizing low latency, high throughput, and cost-effectiveness for inference and serving.
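As a rough illustration of the serving side: a llamafile running locally in server mode exposes llama.cpp's OpenAI-compatible chat endpoint, so it can be queried with plain HTTP. The port and payload below are assumptions for the sketch, not taken from this repo:

```python
import requests

# Assumes a llamafile is already running locally as a server on its
# default port and serves the OpenAI-compatible chat completions route.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local",  # a llamafile serves a single bundled model
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```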
EchoSight is a tool that helps visually impaired individuals by audibly describing images taken with a Raspberry Pi Camera or provided via an image path or URL, across different operating systems.
Run inference-only code benchmarks quickly using vLLM
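For reference, an inference-only throughput measurement with vLLM can be this short (the model and prompts below are placeholders, not this repo's benchmark suite):

```python
import time
from vllm import LLM, SamplingParams

# Illustrative model and prompts; substitute your own.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(max_tokens=128, temperature=0.0)
prompts = ["Explain what a hash map is."] * 32  # one batch of requests

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

# Count only generated tokens to get decode throughput.
generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated / elapsed:.1f} generated tokens/s over {elapsed:.1f}s")
```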
Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster, cheaper, and without rate limits. Compare the quality and latency to your current LLM API provider.
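Constrained decoding of this kind masks the sampler at each step so the output can only ever form valid JSON under a given schema. A hedged sketch in the style of earlier outlines releases (the outlines API has shifted across versions, and the model and schema here are illustrative):

```python
import outlines

# Any transformers-compatible model; this one is just an example.
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Plain JSON Schema; the generator guarantees output conforming to it.
schema = """{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}"""

generator = outlines.generate.json(model, schema)
person = generator("Invent a character: ")
print(person)  # always parses as JSON matching the schema
```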
MLOps library for LLM deployment with the vLLM engine on RunPod's infrastructure.
Low latency JSON generation using LLMs ⚡️
A simple implementation of U-Net, because all the implementations I've seen are way too complicated.
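In that same minimal spirit: a U-Net needs little more than double convolutions, pooling on the way down, and transposed-convolution upsampling with skip connections on the way up. A hypothetical two-level PyTorch sketch, not this repository's code:

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU: the basic U-Net building block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=1):
        super().__init__()
        self.down1 = double_conv(in_ch, 64)
        self.down2 = double_conv(64, 128)
        self.bottom = double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.conv2 = double_conv(256, 128)  # 128 skip + 128 upsampled
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.conv1 = double_conv(128, 64)   # 64 skip + 64 upsampled
        self.out = nn.Conv2d(64, out_ch, 1)

    def forward(self, x):
        d1 = self.down1(x)              # full resolution
        d2 = self.down2(self.pool(d1))  # 1/2 resolution
        b = self.bottom(self.pool(d2))  # 1/4 resolution
        # Upsample and concatenate the matching encoder feature maps.
        u2 = self.conv2(torch.cat([self.up2(b), d2], dim=1))
        u1 = self.conv1(torch.cat([self.up1(u2), d1], dim=1))
        return self.out(u1)

x = torch.randn(1, 1, 64, 64)
print(TinyUNet()(x).shape)  # torch.Size([1, 1, 64, 64])
```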
A Discord bot that can call LLMs using either Hugging Face or vLLM on Windows, combined with function calling.
Context layer on top of your unstructured universe
Call many AIs from a single API.