
converting llama3 models with added tokens #3519

Open
l3utterfly opened this issue May 6, 2024 · 3 comments

Labels: enhancement (Not as big of a feature, but technically not a bug. Should be easy to fix)

Comments

@l3utterfly

Following up from this: #3303

Converting finetuned llama3 models with the same special tokens works.

How can we convert llama3 finetunes that have added tokens? For example, this model: https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B is tuned on the ChatML format.

Converting with the same script produces this error:

RuntimeError: Error(s) in loading state_dict for Transformer:
        size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
        size mismatch for output.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).

I believe this is because the finetuned model has a different number of tokens (128260 vs. the base 128256).
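A quick way to confirm is to inspect the embedding shape stored in the converted checkpoint directly. A minimal sketch (key names are the ones reported in the error; paths taken from the export command below):

import torch

# Minimal sketch: check how many token rows the finetuned checkpoint carries.
# Key names ("tok_embeddings.weight", "output.weight") are the ones the error
# message reports for the Meta-style llama checkpoint layout.
ckpt = torch.load(
    "/home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth",
    map_location="cpu",
)
print(ckpt["tok_embeddings.weight"].shape)  # torch.Size([128260, 4096]) for this finetune
print(ckpt["output.weight"].shape)          # torch.Size([128260, 4096]) for this finetune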

@iseeyuan added the enhancement label on May 6, 2024
@larryliu0820
Contributor

@l3utterfly Can you share the full error message? I thought it happened at load_state_dict, but it seems strict=False is passed there, so it shouldn't error out. https://github.com/pytorch/executorch/blob/main/examples/models/llama2/model.py#L197
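(Note: strict=False in torch.nn.Module.load_state_dict only suppresses missing/unexpected-key errors; parameter shape mismatches are still collected and raised, which is consistent with the traceback below. A minimal standalone sketch of that behavior, not executorch code:)

import torch
import torch.nn as nn

# Toy module built with the base vocab size.
emb = nn.Embedding(128256, 8)
# "Checkpoint" with 4 extra rows, like a finetune with added tokens.
state = {"weight": torch.zeros(128260, 8)}

# strict=False skips missing/unexpected keys, but the size mismatch still raises:
# RuntimeError: Error(s) in loading state_dict for Embedding: size mismatch for weight ...
emb.load_state_dict(state, strict=False)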

@l3utterfly
Author

Full error here:

python -m examples.models.llama2.export_llama --checkpoint /home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth -p /home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json -d=fp32 -X -qmode 8da4w -kv --use_sdpa_with_kv_cache --output_name="Einstein-v6.1-Llama3-8B_kv_sdpa_xnn_qe_4_32_ctx2048.pte" --group_size 256 --metadata '{"get_bos_id":128000, "get_eos_id":128001}' --embedding-quantize 4,32 --max_seq_len 2048
[INFO 2024-05-07 11:19:11,348 builder.py:84] Loading model with checkpoint=/home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth, params=/home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json, use_kv_cache=True, weight_type=WeightType.LLAMA
Traceback (most recent call last):
  File "/home/layla/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/layla/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/layla/src/executorch/examples/models/llama2/export_llama.py", line 30, in <module>
    main()  # pragma: no cover
  File "/home/layla/src/executorch/examples/models/llama2/export_llama.py", line 26, in main
    export_llama(modelname, args)
  File "/home/layla/src/executorch/examples/models/llama2/export_llama_lib.py", line 302, in export_llama
    return _export_llama(modelname, args)
  File "/home/layla/src/executorch/examples/models/llama2/export_llama_lib.py", line 380, in _export_llama
    builder_exported_to_edge = _prepare_for_llama_export(
  File "/home/layla/src/executorch/examples/models/llama2/export_llama_lib.py", line 352, in _prepare_for_llama_export
    load_llama_model(
  File "/home/layla/src/executorch/examples/models/llama2/builder.py", line 87, in load_llama_model
    model, example_inputs, _ = EagerModelFactory.create_model(
  File "/home/layla/src/executorch/examples/models/model_factory.py", line 44, in create_model
    model = model_class(**kwargs)
  File "/home/layla/src/executorch/examples/models/llama2/model.py", line 195, in __init__
    self.model_.load_state_dict(
  File "/home/layla/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2191, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Transformer:
        size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
        size mismatch for output.weight: copying a param with shape torch.Size([128260, 4096]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).

@larryliu0820
Contributor

@l3utterfly vocab_size is configurable. Can you change the value in /home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json to the new vocab size and retry?
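For reference, a minimal sketch of that change, assuming params.json exposes a vocab_size field (as in the stock Meta release); it reads the new size from the checkpoint's embedding table instead of hard-coding it:

import json
import torch

# Paths from the export command above.
params_path = "/home/layla/src/text-generation-webui/models/Meta-Llama-3-8B-Instruct/original/params.json"
ckpt_path = "/home/layla/src/text-generation-webui/models/Einstein-v6.1-Llama3-8B/checkpoint.pth"

# The finetuned checkpoint's embedding row count is the new vocab size.
ckpt = torch.load(ckpt_path, map_location="cpu")
new_vocab_size = ckpt["tok_embeddings.weight"].shape[0]  # 128260 here

with open(params_path) as f:
    params = json.load(f)
print("old vocab_size:", params.get("vocab_size"))  # 128256 in the base params.json
params["vocab_size"] = new_vocab_size

with open(params_path, "w") as f:
    json.dump(params, f, indent=2)

After this, rerunning the same export_llama command should pick up the larger embedding table.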

3 participants