Pull requests: ggerganov/llama.cpp

Pull requests list

Add an option to build without CUDA VMM
#7067 opened May 4, 2024 by WilliamTambellini
Fix Linux /sys cpu path to guess number of cores
#7064 opened May 3, 2024 by viric
CUDA: generalize FP16 fattn vec kernel
#7061 opened May 3, 2024 by JohannesGaessler
add_special option for server tokenize endpoint
#7059 opened May 3, 2024 by JohanAR
tests : add test-tokenizer-0.sh [high priority]
#7036 opened May 2, 2024 by ggerganov
Disable benchmark on forked repo
#7034 opened May 2, 2024 by CISC
convert-hf : reduce repeated boilerplate from write_tensors [need feedback] [refactoring]
#7031 opened May 1, 2024 by compilade (3 of 18 tasks)
Add token healing example
#7028 opened May 1, 2024 by mare5x (Draft)
chore: Add hashsum for stablelm models
#7018 opened May 1, 2024 by teleprint-me
Tidy Android Instructions README.md
#7016 opened Apr 30, 2024 by Jeximo
Fix flash attention for ROCm
#7011 opened Apr 30, 2024 by jdecourval (Draft)
Attempt at OpenElm
#6986 opened Apr 29, 2024 by joshcarp (Draft)
llama3 custom regex split
#6965 opened Apr 28, 2024 by jaime-m-p