Llama3 中文仓库(聚合资料,各种网友及厂商微调、魔改版本有趣权重 & 训练、推理、评测、部署教程视频 & 文档)
-
Updated
May 17, 2024 - Python
Llama3 中文仓库(聚合资料,各种网友及厂商微调、魔改版本有趣权重 & 训练、推理、评测、部署教程视频 & 文档)
Unify Efficient Fine-Tuning of 100+ LLMs
Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory
🤖 Collect practical AI repos, tools, websites, papers and tutorials on AI. 实用的AI百宝箱 💎
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
A high-throughput and memory-efficient inference and serving engine for LLMs
🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
A high-performance inference system for large language models, designed for production environments.
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
Help your LLMs online
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference"
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
MinimalChat is a lightweight, open-source chat application that allows you to interact with various large language models.
Private & local AI personal knowledge management app.
Add a description, image, and links to the llama topic page so that developers can more easily learn about it.
To associate your repository with the llama topic, visit your repo's landing page and select "manage topics."