The conversion code for the attention weights in this repository is as follows:
In some other weight-conversion scripts I have seen, the attention weights first go through a view/dimension-reordering step, like this:
Both approaches then call chunk() afterwards to split the tensor for tensor parallelism, but the two produce different results, don't they?
Why does the code in this repository not apply the view/dimension reordering?
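To make the difference concrete, here is a minimal, self-contained sketch of the two layouts being contrasted. All shapes and names are illustrative assumptions, not the actual converter code from either repository:

```python
import torch

num_heads, head_dim, hidden = 4, 8, 32   # hypothetical small config
tp_size = 2                               # tensor-parallel degree

q = torch.randn(num_heads * head_dim, hidden)
k = torch.randn(num_heads * head_dim, hidden)
v = torch.randn(num_heads * head_dim, hidden)

# Variant 1: plain concatenation, no view/reorder.
# Rows along dim 0: [all Q rows | all K rows | all V rows]
qkv_plain = torch.cat([q, k, v], dim=0)

# Variant 2: per-head interleaving via view + cat + reshape.
# Rows along dim 0: [q_head0 | k_head0 | v_head0 | q_head1 | ...]
qkv_interleaved = torch.cat(
    [w.view(num_heads, head_dim, hidden) for w in (q, k, v)], dim=1
).reshape(num_heads * 3 * head_dim, hidden)

# The same tensor-parallel split applied to both:
shard_plain = torch.chunk(qkv_plain, tp_size, dim=0)[0]
shard_inter = torch.chunk(qkv_interleaved, tp_size, dim=0)[0]

# Rank 0 of the plain layout holds Q for every head plus the first half of K;
# rank 0 of the interleaved layout holds Q, K and V for the first half of the
# heads. The two shards are genuinely different tensors.
print(torch.equal(shard_plain, shard_inter))  # False
```

So whether the view/reorder step is needed depends on which layout the downstream attention code expects from the fused QKV weight.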
@li-yi-dong Could you help answer the question above, or is this a bug in the code?
The places in this repository where Attention consumes the QKV tensor were modified accordingly: https://github.com/alibaba/Megatron-LLaMA/blob/main/megatron/model/transformer.py#L553
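In other words, the converter can skip the reordering because the attention forward itself was adapted to the interleaved layout. A minimal sketch of what consuming a per-head-interleaved fused QKV projection can look like (assumed shapes and names, not the actual code at the linked line):

```python
import torch

def split_fused_qkv(mixed: torch.Tensor, num_heads: int, head_dim: int):
    # mixed: [seq, batch, num_heads * 3 * head_dim], the output of the fused
    # QKV projection, with Q/K/V interleaved per head along the last dim.
    seq, batch, _ = mixed.shape
    mixed = mixed.view(seq, batch, num_heads, 3 * head_dim)
    # Each result: [seq, batch, num_heads, head_dim]
    q, k, v = torch.split(mixed, head_dim, dim=-1)
    return q, k, v
```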