new tokenizer-verifier tool to check gguf tokenizer parameters #6988

Draft: wants to merge 1 commit into master

Commits on Apr 30, 2024

  1. examples : new program to verify gguf tokenizer parameters

    This program verifies that a given gguf model file can tokenize every
    potentially valid character. Since llama.cpp currently raises an
    exception when tokenization is not possible[1], this tool helps
    verify that valid ASCII and UTF-8 input will always be tokenized
    properly.
    
    [1] ggerganov#2580
    anisse committed Apr 30, 2024
    a808370
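
For readers who want a feel for the kind of check the commit message describes, here is a minimal sketch in Python. It is not the code from this PR. It assumes the tokenizer is available as a plain callable that takes a string and either returns a list of token ids or raises. The names `valid_codepoints`, `find_untokenizable` and `make_byte_tokenizer` are hypothetical, and the byte-level stand-in tokenizer only imitates a vocabulary with missing entries; it does not read a gguf file.

```python
from typing import Callable, Iterable, List, Set


def valid_codepoints() -> Iterable[int]:
    """Yield every Unicode scalar value.

    Surrogates (U+D800..U+DFFF) are skipped because they cannot be
    encoded as valid UTF-8.
    """
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue
        yield cp


def find_untokenizable(tokenize: Callable[[str], List[int]]) -> List[int]:
    """Return the codepoints for which `tokenize` raises or yields no tokens."""
    failures = []
    for cp in valid_codepoints():
        try:
            if not tokenize(chr(cp)):
                failures.append(cp)
        except Exception:
            # Assumption: a failed tokenization surfaces as an exception.
            failures.append(cp)
    return failures


def make_byte_tokenizer(known_bytes: Set[int]) -> Callable[[str], List[int]]:
    """Hypothetical stand-in tokenizer: one token per UTF-8 byte.

    Raises ValueError for any byte that is missing from `known_bytes`,
    imitating a vocabulary without complete byte coverage.
    """
    def tokenize(text: str) -> List[int]:
        tokens = []
        for b in text.encode("utf-8"):
            if b not in known_bytes:
                raise ValueError(f"no token for byte 0x{b:02x}")
            tokens.append(b)
        return tokens

    return tokenize


if __name__ == "__main__":
    complete = make_byte_tokenizer(set(range(256)))
    ascii_only = make_byte_tokenizer(set(range(128)))
    print("failures with full byte coverage:", len(find_untokenizable(complete)))
    print("failures with ASCII-only coverage:", len(find_untokenizable(ascii_only)))
```

With the stand-in tokenizers, full byte coverage produces no failures, while the ASCII-only vocabulary fails on every codepoint from U+0080 upward. A real check would replace `make_byte_tokenizer` with a call into the model's own tokenizer.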