[SME][TOPI] Add conv2d NHWC SME fp32 schedule #17003
@tvm-bot rerun
Failed to re-run CI in https://github.com/apache/tvm/actions/runs/9147913719
Thanks @Anndrey24, great work! It's awesome to see this coming together. I left a few comments, largely nitpicks, and a couple of questions.
This commit adds a scalable `arm_cpu` conv2d NHWC schedule for fp32 which generates SME instructions by using the tensor intrinsics introduced in apache#16921. Alongside the SME schedule, the logic of the TE schedule `schedule_conv2d_gemm_native()` for both non-scalable and scalable vector implementations has also been translated into the new TIR schedule. This means that the TE compute definition `compute_conv2d_NHWC_hybrid()` is now compatible with both the original TE schedules (e.g. `schedule_conv2d_NHWC_hybrid()`) and the newly introduced TIR schedule `schedule_conv2d_NHWC_hybrid_TIR()`. The corresponding TOPI test has been extended to reflect that.
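For readers unfamiliar with the GEMM-based approach that the name `schedule_conv2d_gemm_native()` suggests, here is a minimal pure-Python sketch of an NHWC conv2d lowered to an im2col step followed by a matrix multiply. It is not TVM code and does not reflect the PR's actual compute or schedules. The function name `conv2d_nhwc_gemm`, the HWIO kernel layout, and the absence of padding and dilation are all assumptions made for illustration.

```python
def conv2d_nhwc_gemm(data, kernel, stride=1):
    """Hypothetical reference: NHWC conv2d via im2col + GEMM (no padding).

    data:   nested lists shaped [N][H][W][C_in]
    kernel: nested lists shaped [KH][KW][C_in][C_out] (HWIO layout, assumed)
    """
    n, h, w = len(data), len(data[0]), len(data[0][0])
    c_in = len(data[0][0][0])
    kh, kw = len(kernel), len(kernel[0])
    c_out = len(kernel[0][0][0])
    oh = (h - kh) // stride + 1
    ow = (w - kw) // stride + 1

    # im2col: one row per output pixel, one column per (dy, dx, ci) tap.
    a = []
    for b in range(n):
        for y in range(oh):
            for x in range(ow):
                row = []
                for dy in range(kh):
                    for dx in range(kw):
                        row.extend(data[b][y * stride + dy][x * stride + dx])
                a.append(row)

    # Flatten the kernel to a (KH*KW*C_in) x C_out matrix in the same tap order.
    bmat = [kernel[dy][dx][ci]
            for dy in range(kh) for dx in range(kw) for ci in range(c_in)]

    # Plain GEMM: C = A x B. A hardware-specific schedule would tile this loop nest.
    k = kh * kw * c_in
    c = [[sum(a_row[i] * bmat[i][j] for i in range(k)) for j in range(c_out)]
         for a_row in a]

    # Reshape the GEMM result back to [N][OH][OW][C_out].
    return [[[c[(b * oh + y) * ow + x] for x in range(ow)]
             for y in range(oh)] for b in range(n)]


# Usage: a 1x3x3x1 input with a 2x2 all-ones kernel sums each 2x2 window.
data = [[[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]]
kernel = [[[[1]], [[1]]], [[[1]], [[1]]]]
print(conv2d_nhwc_gemm(data, kernel))  # [[[[12], [16]], [[24], [28]]]]
```

Expressing the convolution as a single matrix multiply is what lets one compute definition be paired with different schedules: only the way the GEMM loop nest is tiled and mapped to instructions changes.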
LGTM! It looks like there's a merge conflict that needs to be fixed :/
de63e5e to 8e93462
Resolved the conflict!
Looks very cool!
Thanks @Anndrey24 and @lhutton1, this is now merged!
Thanks @Anndrey24 @lhutton1 @ekalda. It seems we have a breakage or flaky failure, likely related to this PR: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-arm/detail/main/1980/pipeline (in the lint, arm, and cpu jobs).
I created a temporary revert, #17038, to unblock the CI. If there is an alternative fix, that would also be good; either way, we can follow up with a redo quickly.
Fixes a merge conflict between apache#16981 and apache#17003. Change-Id: Ifcc983ef0b8c00250568a048fd682933adfdcde4
This commit adds a scalable `arm_cpu` conv2d NHWC schedule for fp32 which generates SME instructions by using the tensor intrinsics introduced in #16921.
Alongside the SME schedule, the logic of the TE schedule `schedule_conv2d_gemm_native()` for both non-scalable and scalable vector implementations has also been translated into the new TIR schedule. This means that the TE compute definition `compute_conv2d_NHWC_hybrid()` is now compatible with both the original TE schedules (e.g. `schedule_conv2d_NHWC_hybrid()`) and the newly introduced TIR schedule `schedule_conv2d_NHWC_hybrid_TIR()`. The corresponding TOPI test has been extended to reflect that.
cc @ekalda @lhutton1