I reviewed the Discussions, and have a new bug or useful enhancement to share.
Feature Description
Hi! I am experimenting with using llama.cpp as a general-purpose code completion backend, similar to TabNine.
I am encountering a small problem: if the completion prompt ends mid-word, the results are not very accurate. For example, for a prompt such as `Five, Four, Thre` [sic], the model will often ignore the typo and suggest `, Two` (forming `Thre, Two`).
I think the following behavior would be useful as an option to the `/completion` server API:

1. Tokenize the text.
2. Chop off the last token.
3. Run the prediction with the remaining tokens, but only consider tokens whose bytes start with the bytes of the removed token.
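The steps above can be sketched roughly as follows. This is an illustrative Python sketch with a toy vocabulary standing in for a real llama.cpp tokenizer; the names (`heal_prompt`, `vocab`) are hypothetical, not part of any actual API.

```python
def heal_prompt(tokens, vocab):
    """Drop the last prompt token and return (remaining tokens,
    allowed continuations). Allowed continuations are the vocab
    entries whose text starts with the removed token's text."""
    *prefix, last = tokens
    last_text = vocab[last]
    allowed = [t for t, text in enumerate(vocab) if text.startswith(last_text)]
    return prefix, allowed

# Toy vocabulary: token id -> token text.
vocab = ["Five", ",", " Four", " Thre", " Three", " Two"]
prompt = [0, 1, 2, 3]  # "Five, Four, Thre" -- ends mid-word

prefix, allowed = heal_prompt(prompt, vocab)
# prefix == [0, 1, 2]  ("Five, Four")
# allowed == [3, 4]    (" Thre" and " Three" both extend " Thre")
```

With the sampler restricted to `allowed`, the model can complete the partial word as `" Three"` instead of continuing after the typo.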
Thanks!
Ok. This can be demonstrated in one of the examples. One way would be to add it to `main` or `simple` and extend `llama_sampling_sample` with the necessary functionality.
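One common way to restrict sampling to an allowed token set, which the extended sampler could use, is to mask the logits of every disallowed token before picking a token. A minimal sketch (plain Python with greedy selection for brevity; a real implementation would operate on llama.cpp's logit array and sampler chain):

```python
import math

def constrained_sample(logits, allowed):
    """Set the logit of every token not in `allowed` to -inf,
    then return the argmax (greedy sampling, for simplicity)."""
    allowed = set(allowed)
    masked = [l if i in allowed else -math.inf for i, l in enumerate(logits)]
    return max(range(len(masked)), key=lambda i: masked[i])

# Suppose tokens 3 (" Thre") and 4 (" Three") are the allowed healers.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 3.0]
print(constrained_sample(logits, allowed={3, 4}))  # -> 4
```

Softmax-based samplers work the same way: tokens with `-inf` logits receive zero probability, so only the allowed continuations can be drawn.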
Hi @ilyannn, do you still want to work on this? I've created a draft PR (#7028) that demonstrates token healing, but I still haven't added it to main or server. We can collaborate on that, if you'd like.
@mare5x Sorry, I have not actually started so please don't wait for me. I'll try to take a look at your PR this week though and will be happy to help in any way I can.