
merge_lora_weights_and_save_hf_model.py Error while deserializing header: HeaderTooLarge #172

Open
Spongeorge opened this issue Jan 23, 2024 · 0 comments


I am trying to run the following command:

python /projects/geba2844/LongLora/LongLoRA/merge_lora_weights_and_save_hf_model.py \
        --base_model /projects/geba2844/LongLora/Llama-2-7b-hf \
        --peft_model /projects/geba2844/LongLora/Llama-2-7b-longlora-8k \
        --context_size 8192 \
        --save_path /projects/geba2844/LongLora/Llama-2-7b-longlora-8k-hf

It fails with the following error:

bash-4.4$ bash merge.sh
base model /projects/geba2844/LongLora/Llama-2-7b-hf
peft model /projects/geba2844/LongLora/Llama-2-7b-longlora-8k
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/projects/geba2844/LongLora/LongLoRA/merge_lora_weights_and_save_hf_model.py", line 113, in <module>
    main(args)
  File "/projects/geba2844/LongLora/LongLoRA/merge_lora_weights_and_save_hf_model.py", line 68, in main
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 565, in from_pretrained
    return model_class.from_pretrained(
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3307, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3681, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/modeling_utils.py", line 463, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
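
A possible diagnostic, offered as a sketch rather than a confirmed fix: a safetensors file starts with an 8-byte little-endian unsigned integer giving the length of the JSON header that follows. `HeaderTooLarge` means that integer is larger than the library accepts, which usually indicates the shard is not a real safetensors file. One common cause (an assumption here, not verified for this setup) is that the base model directory contains Git LFS pointer files instead of the downloaded weights. The helper name `inspect_safetensors_header` and its status strings are hypothetical and only serve this illustration.

```python
import struct
import sys
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_PREFIX = b"version https://git-lfs"


def inspect_safetensors_header(path):
    """Return (status, header_len) for a file that should be a safetensors shard.

    status is one of "lfs_pointer", "truncated", "bad_header", "ok".
    header_len is the declared JSON header length, or None if it was not read.
    """
    with open(path, "rb") as f:
        head = f.read(len(LFS_PREFIX))

    # A pointer file means the real weights were never downloaded.
    if head.startswith(LFS_PREFIX):
        return "lfs_pointer", None

    # The format needs at least 8 bytes for the header length field.
    if len(head) < 8:
        return "truncated", None

    # First 8 bytes: little-endian u64 length of the JSON header.
    (header_len,) = struct.unpack("<Q", head[:8])

    # A header cannot be longer than the rest of the file.
    if header_len > Path(path).stat().st_size - 8:
        return "bad_header", header_len

    return "ok", header_len


if __name__ == "__main__":
    # Pass shard paths on the command line, e.g. the *.safetensors files
    # inside the directory given as --base_model.
    for shard in sys.argv[1:]:
        print(shard, inspect_safetensors_header(shard))
```

If a shard reports `lfs_pointer` or `bad_header`, re-downloading the base model weights (for example with `git lfs pull` inside the model repository) would be the first thing to try. A shard that is only a few hundred bytes on disk points the same way.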