
EOS is not read from gguf format #446

Open
Alisa-lisa opened this issue Dec 19, 2023 · 1 comment
@Alisa-lisa

I have discovered that running the same model with the same parameters from llm (the gguf branch) and from llama.cpp results in different behavior. llm does not appear to read the EOS token, so the model keeps generating output until the max token count is reached.
Here is llama.cpp:
[screenshot: llama.cpp output]
And the same model from llm:
[screenshot: llm output]

According to a discussion on Discord, this may indeed be a bug.
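
For context, a hedged sketch of why a wrong EOS id produces exactly this symptom: a typical sampling loop only stops early when the sampled token equals the EOS id it was given. All names here are hypothetical stand-ins, not llm's actual API:

```rust
// Minimal sketch of a sampling loop (hypothetical names, not llm's API).
trait Model {
    fn sample_next(&self, context: &[u32]) -> u32;
}

fn generate(model: &dyn Model, prompt: &[u32], eos_token_id: u32, max_tokens: usize) -> Vec<u32> {
    let mut tokens = prompt.to_vec();
    for _ in 0..max_tokens {
        let next = model.sample_next(&tokens);
        // If `eos_token_id` is wrong for this model, this comparison never
        // matches and generation only stops at `max_tokens`.
        if next == eos_token_id {
            break;
        }
        tokens.push(next);
    }
    tokens
}
```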

philpax self-assigned this Dec 19, 2023

@philpax (Collaborator) commented Dec 19, 2023

Thanks for reporting this! For my own reference, the issue is that this doesn't get the EOS token from the tokenizer; instead, it assumes that it's the hardcoded token </s>. This made sense in the early days of LLaMA, but is no longer true:

self.tokenizer().id("</s>".as_bytes()).unwrap_or(2)
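
A minimal sketch of the direction a fix could take, assuming the GGUF metadata map is exposed at load time: prefer the `tokenizer.ggml.eos_token_id` key defined by the GGUF spec, and only fall back to the old `</s>` lookup and hardcoded id 2 when that key is absent. The `MetadataValue` and `Tokenizer` types below are hypothetical stand-ins, not llm's actual types:

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for llm's GGUF metadata and tokenizer types.
enum MetadataValue {
    U32(u32),
    // other GGUF value types elided
}

impl MetadataValue {
    fn as_u32(&self) -> Option<u32> {
        match self {
            MetadataValue::U32(v) => Some(*v),
        }
    }
}

struct Tokenizer;
impl Tokenizer {
    // A real implementation would look the token up in the vocabulary.
    fn id(&self, _token: &[u8]) -> Option<u32> {
        None
    }
}

// Prefer the EOS id recorded in the GGUF metadata under the spec key
// `tokenizer.ggml.eos_token_id`; fall back to the old `</s>` lookup and
// the hardcoded id 2 only if that key is missing.
fn eos_token_id(metadata: &HashMap<String, MetadataValue>, tokenizer: &Tokenizer) -> u32 {
    metadata
        .get("tokenizer.ggml.eos_token_id")
        .and_then(MetadataValue::as_u32)
        .or_else(|| tokenizer.id(b"</s>"))
        .unwrap_or(2)
}
```

With something along these lines, models whose GGUF files declare an EOS id other than 2 should stop generating at the right point instead of running to the token limit.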
