A Python script designed to streamline the process of quantizing models to exllamav2 format
Updated May 17, 2024 - Python
A constrained generation filter for local LLMs that makes them quote properly from a source document
JavaScript WebSocket API for ExLlamav2
A lightweight, fast, parallel inference server for Llama
A.L.I.C.E (Artificial Labile Intelligence Cybernated Existence): a REST API for an AI companion, for building more complex systems
Run GGUF LLM models in the latest version of TextGen-webui
LLM Telegram bot