speculative-decoding

Here are 12 public repositories matching this topic...

mscheong01 / speculative_decoding.c

Minimal C implementation of speculative decoding based on llama2.c

c artificial-intelligence llm llama2 speculative-decoding

Updated Apr 22, 2024
C

pinqian77 / Dynasurge

Dynasurge: Dynamic Tree Speculation for Prompt-Specific Decoding

large-language-models speculative-decoding

Updated Apr 29, 2024
Python

romsto / Speculative-Decoding

Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan et al. 2023.

fast-inference llm llm-inference speculative-decoding llm-optimization

Updated May 30, 2024
Python

hemingkx / SpecDec

Code for the paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

non-autoregressive speculative-decoding

Updated Dec 9, 2023
Python

AutonomicPerfectionist / PipeInfer

PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

inference llm llamacpp speculative-decoding

Updated Apr 15, 2024
C++

PopoDev / CSE481N_Project

Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder

fast-inference llm speculative-decoding

Updated May 30, 2024
Python

u-hyszk / japanese-speculative-decoding

Verification of the effect of speculative decoding in Japanese.

nlp japanese fast-inference speculative-decoding

Updated Mar 4, 2024
Python

kssteven418 / BigLittleDecoder

[NeurIPS'23] Speculative Decoding with Big Little Decoder

decoding efficient-inference speculative-execution fast-inference llm speculative-decoding

Updated Feb 6, 2024
Python

Infini-AI-Lab / TriForce

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

acceleration efficiency inference llm long-context llm-inference speculative-decoding

Updated Apr 20, 2024
Python

Infini-AI-Lab / Sequoia

Scalable and robust tree-based speculative decoding algorithm

efficiency inference llm speculative-decoding

Updated May 22, 2024
Python

SafeAILab / EAGLE

[ICML'24] EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

large-language-models llm-inference speculative-decoding

Updated May 26, 2024
Python

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

retrieval chatbot rag habana large-language-model chatpdf llm-inference 4-bits speculative-decoding llm-cpu streamingllm intel-optimized-llamacpp neural-chat neural-chat-7b autoround gaudi3

Updated May 31, 2024
Python

