
converting llama3 models with added tokens #3519

Open
l3utterfly opened this issue May 6, 2024 · 3 comments

Labels: enhancement (Not as big of a feature, but technically not a bug. Should be easy to fix)

Comments

@l3utterfly

Following up from this: #3303

Converting finetuned llama3 models with the same special tokens works.

How can we convert llama3 finetunes that have added tokens? For example, this model: https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B is tuned on the ChatML format.

Converting with the same script produces this error:

RuntimeError: Error(s) in loading state_dict for Transformer:
        size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
        size mismatch for output.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).

I believe this is because the finetuned model has a different number of tokens (128260 vs. the base 128256).
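A quick way to confirm is to inspect the embedding shape stored in the converted checkpoint directly. A minimal sketch (key names are the ones reported in the error; paths taken from the export command below):

import torch

# Minimal sketch: check how many token rows the finetuned checkpoint carries.
# Key names ("tok_embeddings.weight", "output.weight") are the ones the error
# message reports for the Meta-style llama checkpoint layout.
ckpt = torch.load(
    "/home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth",
    map_location="cpu",
)
print(ckpt["tok_embeddings.weight"].shape)  # torch.Size([128260, 4096]) for this finetune
print(ckpt["output.weight"].shape)          # torch.Size([128260, 4096]) for this finetune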

@iseeyuan added the enhancement label on May 6, 2024
@larryliu0820
Contributor

@l3utterfly Can you share the full error message? I thought it happened at load_state_dict, but it seems strict=False is passed there, so it shouldn't error out. https://github.com/pytorch/executorch/blob/main/examples/models/llama2/model.py#L197
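(Note: strict=False in torch.nn.Module.load_state_dict only suppresses missing/unexpected-key errors; parameter shape mismatches are still collected and raised, which is consistent with the traceback below. A minimal standalone sketch of that behavior, not executorch code:)

import torch
import torch.nn as nn

# Toy module built with the base vocab size.
emb = nn.Embedding(128256, 8)
# "Checkpoint" with 4 extra rows, like a finetune with added tokens.
state = {"weight": torch.zeros(128260, 8)}

# strict=False skips missing/unexpected keys, but the size mismatch still raises:
# RuntimeError: Error(s) in loading state_dict for Embedding: size mismatch for weight ...
emb.load_state_dict(state, strict=False)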

@l3utterfly
Author

Full error here:

python -m examples.models.llama2.export_llama --checkpoint /home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth -p /home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json -d=fp32 -X -qmode 8da4w -kv --use_sdpa_with_kv_cache --output_name="Einstein-v6.1-Llama3-8B_kv_sdpa_xnn_qe_4_32_ctx2048.pte" --group_size 256 --metadata '{"get_bos_id":128000, "get_eos_id":128001}' --embedding-quantize 4,32 --max_seq_len 2048
[INFO 2024-05-07 11:19:11,348 builder.py:84] Loading model with checkpoint=/home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth, params=/home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json, use_kv_cache=True, weight_type=WeightType.LLAMA
Traceback (most recent call last):
  File "/home/layla/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/layla/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/layla/src/executorch/examples/models/llama2/export_llama.py", line 30, in <module>
    main()  # pragma: no cover
  File "/home/layla/src/executorch/examples/models/llama2/export_llama.py", line 26, in main
    export_llama(modelname, args)
  File "/home/layla/src/executorch/examples/models/llama2/export_llama_lib.py", line 302, in export_llama
    return _export_llama(modelname, args)
  File "/home/layla/src/executorch/examples/models/llama2/export_llama_lib.py", line 380, in _export_llama
    builder_exported_to_edge = _prepare_for_llama_export(
  File "/home/layla/src/executorch/examples/models/llama2/export_llama_lib.py", line 352, in _prepare_for_llama_export
    load_llama_model(
  File "/home/layla/src/executorch/examples/models/llama2/builder.py", line 87, in load_llama_model
    model, example_inputs, _ = EagerModelFactory.create_model(
  File "/home/layla/src/executorch/examples/models/model_factory.py", line 44, in create_model
    model = model_class(**kwargs)
  File "/home/layla/src/executorch/examples/models/llama2/model.py", line 195, in __init__
    self.model_.load_state_dict(
  File "/home/layla/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2191, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Transformer:
        size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
        size mismatch for output.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).

@larryliu0820
Contributor

@l3utterfly vocab_size is configurable. Can you change the value in /home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json to the new vocab size and retry?
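For reference, a minimal sketch of that change, assuming params.json exposes a vocab_size field (as in the stock Meta release); it reads the new size from the checkpoint's embedding table instead of hard-coding it:

import json
import torch

# Paths from the export command above.
params_path = "/home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json"
ckpt_path = "/home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth"

# The finetuned checkpoint's embedding row count is the new vocab size.
ckpt = torch.load(ckpt_path, map_location="cpu")
new_vocab_size = ckpt["tok_embeddings.weight"].shape[0]  # 128260 here

with open(params_path) as f:
    params = json.load(f)
print("old vocab_size:", params.get("vocab_size"))  # 128256 in the base params.json
params["vocab_size"] = new_vocab_size

with open(params_path, "w") as f:
    json.dump(params, f, indent=2)

After this, rerunning the same export_llama command should pick up the larger embedding table.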

3 participants