How to tokenize Vicuna #20

beyondli · 2023-11-20T01:47:28Z

Hi,

I want to test the token speed of minigpt4, but loading the tokenizer fails. I tried two calls. The first,

AutoTokenizer.from_pretrained('maknee/ggml-vicuna-v0-quantized/13B')

fails with:

huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'maknee/ggml-vicuna-v0-quantized/13B'. Use repo_type argument if needed.

The second,

AutoTokenizer.from_pretrained('maknee/ggml-vicuna-v0-quantized')

fails with:

Repository Not Found for url: https://huggingface.co/maknee/ggml-vicuna-v0-quantized/resolve/main/tokenizer_config.json.
Please make sure you specified the correct repo_id and repo_type.

What is the correct command for loading the tokenizer? Thanks.

Maknee · 2023-11-23T04:52:42Z

The tokenizer used is the llama.cpp tokenizer, so unfortunately you can't load it directly with AutoTokenizer from Hugging Face :(

To measure its speed, call add_strings in C++ and wrap the call in a timer. Internally, the tokenizer calls llama_tokenize_internal in llama.cpp.
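
The "set a timer" advice applies to any tokenizer, so here is a minimal Python sketch of the timing pattern only. It does not call minigpt4 or llama.cpp: `tokens_per_second` and `whitespace_tokenize` are hypothetical names invented for illustration, and the whitespace splitter is just a stand-in for whatever tokenizer call you end up wrapping (for example, a binding around the C++ code).

```python
import time
from typing import Callable, Sequence


def tokens_per_second(tokenize: Callable[[str], Sequence], text: str, repeats: int = 100) -> float:
    """Return the average number of tokens produced per second by `tokenize`."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    total_tokens = 0
    start = time.perf_counter()  # monotonic, high-resolution clock
    for _ in range(repeats):
        total_tokens += len(tokenize(text))
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast calls or coarse clocks.
    return total_tokens / elapsed if elapsed > 0 else float("inf")


def whitespace_tokenize(text: str) -> list:
    # Hypothetical stand-in, NOT the llama.cpp tokenizer: splits on whitespace.
    return text.split()


if __name__ == "__main__":
    prompt = "Describe the image in detail."
    rate = tokens_per_second(whitespace_tokenize, prompt, repeats=1000)
    print(f"{rate:.0f} tokens/s")
```

Repeating the call many times and dividing by the total elapsed time smooths out per-call noise. The same structure carries over to C++ with a clock read before and after the loop.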

beyondli · 2023-11-27T05:59:50Z

Hi Maknee,
Thanks!
