Remove .attention from skipped tensors to match more accurately #7051

bartowski1182 · 2024-05-02T20:36:57Z

This change fixes #7046

https://huggingface.co/nvidia/ChatQA-1.5-8B has tensors named model.layers.x.self_attn.rotary_emb.inv_freq rather than model.layers.x.self_attn.attention.rotary_emb.inv_freq. This change matches both forms so that they are properly skipped.
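To illustrate why dropping the .attention prefix catches both names, here is a minimal sketch, assuming the skip check is a plain suffix match on the tensor name. The function name should_skip_tensor and both suffix tuples are hypothetical and only mirror the names quoted above; the converter's real list may contain other entries.

```python
# Hypothetical suffix lists for illustration only; not copied from the converter.
OLD_SKIPPED_SUFFIXES = (".attention.rotary_emb.inv_freq",)
NEW_SKIPPED_SUFFIXES = (".rotary_emb.inv_freq",)


def should_skip_tensor(name: str, suffixes: tuple) -> bool:
    """Return True if the tensor name ends with any of the given suffixes."""
    # str.endswith accepts a tuple and checks each suffix in turn.
    return name.endswith(suffixes)


# The two naming schemes mentioned in the description, with a concrete layer index.
chatqa_name = "model.layers.0.self_attn.rotary_emb.inv_freq"
prefixed_name = "model.layers.0.self_attn.attention.rotary_emb.inv_freq"

for tensor_name in (chatqa_name, prefixed_name):
    old = should_skip_tensor(tensor_name, OLD_SKIPPED_SUFFIXES)
    new = should_skip_tensor(tensor_name, NEW_SKIPPED_SUFFIXES)
    print(f"{tensor_name}: old={old} new={new}")
```

Under these assumptions, the shorter suffix is a tail of the longer one, so every name the old pattern matched is still matched, and the ChatQA-style name is now matched as well.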

compilade

I'd like to note that this is also done in #7031, but I'm fine with this being fixed separately here.

bartowski1182 · 2024-05-02T23:05:10Z

Ah, good catch :) I'll let you know if this gets merged so you can avoid conflicts.

Remove .attention from skipped tensors to match more accurately

f700301

compilade approved these changes May 2, 2024

slaren merged commit 60325fa into ggerganov:master May 2, 2024
22 checks passed

nopperl pushed a commit to nopperl/llama.cpp that referenced this pull request May 5, 2024

Remove .attention from skipped tensors to match more accurately (gger…

ee8f5ac

…ganov#7051)

teleprint-me pushed a commit to teleprint-me/llama.cpp that referenced this pull request May 7, 2024

Remove .attention from skipped tensors to match more accurately (gger…

23a57e3

…ganov#7051)
