
feature request: add override-kv, or at least a way to specify the pretokenizer #819

Open
schmorp opened this issue May 2, 2024 · 5 comments

schmorp commented May 2, 2024

llama.cpp has an `--override-kv` option that can be used to override, well, model kv values. This can be useful with the myriad of existing GGUFs that don't have a pretokenizer specified. It would be nice if koboldcpp had such an option: either a generic `--override-kv` option, or a way to specify/override the pretokenizer string. Or both. `--override-kv` can be useful in a variety of other ways, but is of course more effort to implement than just being able to specify the pretokenizer type.

Having a llama.cpp-compatible syntax for override-kv would also be a plus for users who could use instructions written for llama.cpp.
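For reference, llama.cpp's override syntax is `KEY=TYPE:VALUE`; a sketch of what overriding the pretokenizer looks like there (the exact key name, `tokenizer.ggml.pre`, and the set of supported value types may vary between llama.cpp versions):

```shell
# llama.cpp: override a GGUF metadata value at load time.
# TYPE is one of int, float, bool, or (in newer builds) str.
./main -m model.gguf --override-kv tokenizer.ggml.pre=str:llama3
```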

@LostRuins
Owner

The CLI args for that flag are horrendous. I'm sure there must be a better way to do it.


schmorp commented May 3, 2024

Well, override-kv is more of a low-level tool, but it can be very useful. For the concrete problem of setting the pre-tokenizer type, a simple "--pretokenizer llama3" or so would suffice: less generic, but much less horrendous, I would assume.

The other advantage of override-kv, besides being generic, would be compatibility with llama.cpp, whose override-kv instructions I see a lot.

But having any way to override the pretokenizer with koboldcpp would greatly help. I don't see hundreds or even thousands of models being requantized anytime soon, and this affects all kinds of models, not just Llama 3 ones (DeepSeek, Command-R, practically anything that is not Llama 2).

It would be pretty much in the same vein as --contextsize or --ropeconfig, which also override model-provided kv values.
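To illustrate the kind of option being requested, here is a minimal sketch of parsing a llama.cpp-style `KEY=TYPE:VALUE` override string. The function name `parse_override` and the type set are assumptions for illustration, not koboldcpp's or llama.cpp's actual code:

```python
def parse_override(arg: str):
    """Parse a 'KEY=TYPE:VALUE' override string into (key, typed_value).

    Illustrative sketch only; llama.cpp's real parser lives in C++.
    """
    key, _, typed = arg.partition("=")
    typ, _, raw = typed.partition(":")
    casts = {
        "int": int,
        "float": float,
        "bool": lambda s: s.lower() in ("1", "true"),
        "str": str,
    }
    if not key or typ not in casts:
        raise ValueError(f"bad override: {arg!r}")
    return key, casts[typ](raw)

# e.g. parse_override("tokenizer.ggml.pre=str:llama3")
#   -> ("tokenizer.ggml.pre", "llama3")
```

A hypothetical `--pretokenizer llama3` flag would then just be sugar for one specific key.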

Rotatingxenomorph commented May 6, 2024

I can't use koboldcpp anymore for Command-R Plus, because nobody wants to requantize it instead of just using --override-kv in llama.cpp.

@LostRuins
Owner

There's a fix for command-r plus coming out in the next version. In the meantime, you can try using an older version first.

@Rotatingxenomorph

> There's a fix for command-r plus coming out in the next version. In the meantime, you can try using an older version first.

Cool, thank you!
