# llama

Here are 913 public repositories matching this topic...

CrazyBoyM / llama3-Chinese-chat

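Llama3 Chinese repository (aggregated resources: interesting weights from community and vendor fine-tunes and modified versions, plus training, inference, evaluation, and deployment tutorial videos and docs)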

llama llama2 llama3 llama3-chinese llama3-finetune

Updated May 17, 2024
Python

hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

Updated May 17, 2024
Python

unslothai / unsloth

Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory

ai llama lora gemma mistral fine-tuning finetuning llms qlora llama2

Updated May 17, 2024
Python

superiorlu / AITreasureBox

🤖 Collect practical AI repos, tools, websites, papers and tutorials on AI. A practical AI treasure box 💎

machine-learning ai deep-learning agi openai awesome-list llama gpt alpaca gpt-4 aigc stable-diffusion llms chatgpt llama2

Updated May 17, 2024
Ruby

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

llama cuda-kernels deepspeed llm fastertransformer llm-inference turbomind internlm llama2 codellama llama3

Updated May 17, 2024
Python

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda inference pytorch transformer llama gpt rocm model-serving mlops llm inferentia llmops llm-serving trainium

Updated May 17, 2024
Python

fishaudio / fish-speech

Brand new TTS solution

tts transformer llama valle vqvae vits vqgan

Updated May 17, 2024
Python

tenstorrent / tt-metal

🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.

metal accelerator ml falcon resnet llama low-level-programming mistral llm stable-diffusion mixtral tenstorrent

Updated May 17, 2024
C++

HqWu-HITCS / Awesome-Chinese-LLM

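A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.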

nlp chinese llama awesome-lists llm chatglm

Updated May 17, 2024

ggerganov / llama.cpp

LLM inference in C/C++

Updated May 17, 2024
C++

xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Updated May 17, 2024
Python

modelscope / swift

ms-swift: Use PEFT or Full-parameter to finetune 200+ LLMs or 15+ MLLMs

Updated May 17, 2024
Python

vectorch-ai / ScaleLLM

A high-performance inference system for large language models, designed for production environments.

performance gpu model production cuda efficiency inference transformer llama speculative serving llm llm-inference llama3

Updated May 17, 2024
C++

PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.

nlp search-engine compression sentiment-analysis transformers information-extraction question-answering llama pretrained-models embedding bert semantic-analysis distributed-training ernie neural-search uie document-intelligence paddlenlp llm

Updated May 17, 2024
Python

janelu9 / EasyLLM

Run large language models easily.

pipeline llama zero deepspeed llm

Updated May 17, 2024
Python

fatwang2 / search2ai

Help your LLMs online

search online gemini openai llama mistral groq openai-api functioncalling toolcall

Updated May 17, 2024
JavaScript

google / jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference

inference pytorch batching attention llama gemma model-serving tpu llm llm-inference llama2

Updated May 17, 2024
Python

google / JetStream

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

gpu inference pytorch transformer llama gpt gemma model-serving tpu jax mlops large-language-models llm llmops llm-inference llama2

Updated May 17, 2024
Python

fingerthief / minimal-chat

MinimalChat is a lightweight, open-source chat application that allows you to interact with various large language models.

javascript chat vuejs vue chatbot selfhosted self-hosted artificial-intelligence llama gpt chat-application webapplication vue3 llm chatgpt chatgpt-api gpt-vision claude-3 meta-llama

Updated May 17, 2024
Vue

reorproject / reor

Private & local AI personal knowledge management app.

markdown ai llama note-taking pkm rag local-first vector-database second-brain llamacpp lancedb ollama

Updated May 17, 2024
TypeScript
