Pull requests: ggerganov/llama.cpp

Pull requests list

Add an option to build without CUDA VMM
#7067 opened May 4, 2024 by WilliamTambellini
Fix Linux /sys cpu path to guess number of cores
#7064 opened May 3, 2024 by viric
CUDA: generalize FP16 fattn vec kernel
#7061 opened May 3, 2024 by JohannesGaessler
add_special option for server tokenize endpoint
#7059 opened May 3, 2024 by JohanAR
tests : add test-tokenizer-0.sh [high priority]
#7036 opened May 2, 2024 by ggerganov
Disable benchmark on forked repo
#7034 opened May 2, 2024 by CISC
convert-hf : reduce repeated boilerplate from write_tensors [need feedback] [refactoring]
#7031 opened May 1, 2024 by compilade (3 of 18 tasks)
Add token healing example
#7028 opened May 1, 2024 by mare5x (Draft)
chore: Add hashsum for stablelm models
#7018 opened May 1, 2024 by teleprint-me
Tidy Android Instructions README.md
#7016 opened Apr 30, 2024 by Jeximo
Fix flash attention for ROCm
#7011 opened Apr 30, 2024 by jdecourval (Draft)
Attempt at OpenElm
#6986 opened Apr 29, 2024 by joshcarp (Draft)
llama3 custom regex split
#6965 opened Apr 28, 2024 by jaime-m-p