Mlas int4 int8 with avx2/512 #20687

Draft
wants to merge 10 commits into
base: main

Conversation

liqunfu

@liqunfu commented on May 15, 2024

Description

Motivation and Context

…en32, symmetric1 hasBias0 Int8

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
Signed-off-by: Liqun Fu <liqfu@microsoft.com>
…tric:1/ComputeType:4/real_time_mean 1542487160 ns 1539062500 ns

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
…048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 1434872720 ns

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
…NBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 1265060620 ns 1265625000 ns

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
…TGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 1214042220 ns

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
…6/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 784668090 ns; SQNBITGEMM<4>/BlkLen:64/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 754939430 ns

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
Signed-off-by: Liqun Fu <liqfu@microsoft.com>
@liqunfu requested a review from a team as a code owner on May 15, 2024 17:01
@liqunfu marked this pull request as draft on May 15, 2024 17:03
@@ -38,6 +38,8 @@ onnxruntime_add_static_library(onnxruntime_mlas
 ${MLAS_SRC_DIR}/qdwconv_kernelsize.cpp
 ${MLAS_SRC_DIR}/sqnbitgemm.h
 ${MLAS_SRC_DIR}/sqnbitgemm.cpp
+${MLAS_SRC_DIR}/llama.cpp.sgemm.h
+${MLAS_SRC_DIR}/llama.cpp.sgemm.cpp

where are these used?

…ymmetric:1/ComputeType:4/real_time_mean 664029830 ns

Signed-off-by: liqunfu <liqun.fu@microsoft.com>
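
The commit messages above record `real_time_mean` values for the `SQNBITGEMM<4>` benchmark. As a rough aid for comparing them, here is a minimal sketch that parses such a line and converts the mean time into an effective operation rate. It is not part of the PR: the helper names `parse_benchmark_line` and `effective_gops` are hypothetical, and counting a GEMM as 2·M·N·K operations is a conventional assumption rather than something the PR states. Only the two messages that show `BlkLen:32` together with M, N, and K are used, since the other messages are truncated and their configuration cannot be read off.

```python
import re


def parse_benchmark_line(line):
    """Split '.../Key:Val/.../real_time_mean <ns> ns' into (params, mean_ns)."""
    match = re.search(r"real_time_mean\s+(\d+)\s*ns", line)
    if match is None:
        raise ValueError("no real_time_mean value found")
    # Collect every 'Key:Value' pair with an integer value (BlkLen, M, N, K, ...).
    params = {key: int(val) for key, val in re.findall(r"(\w+):(\d+)", line)}
    return params, int(match.group(1))


def effective_gops(params, mean_ns):
    """Effective rate in 10^9 ops/s, assuming 2*M*N*K operations per GEMM."""
    ops = 2 * params["M"] * params["N"] * params["K"]
    return ops / mean_ns  # operations per nanosecond equals Gops per second


# Two lines copied from the commit messages (leading names are truncated there too).
earlier = ("NBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/"
           "Symmetric:1/ComputeType:4/real_time_mean 1265060620 ns")
later = ("TGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/"
         "Symmetric:1/ComputeType:4/real_time_mean 1214042220 ns")

earlier_params, earlier_ns = parse_benchmark_line(earlier)
later_params, later_ns = parse_benchmark_line(later)

speedup = earlier_ns / later_ns
print(f"speedup between the two commits: {speedup:.3f}x")
print(f"effective rate after the later commit: "
      f"{effective_gops(later_params, later_ns):.1f} Gops/s")
```

Comparing only runs with matching `BlkLen`, `M`, `N`, `K`, and `Threads` keeps the ratio meaningful; the truncated messages should not be mixed in unless their full benchmark names are recovered from the commit history.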
@liqunfu changed the title from "Mlas int4 int8 with avx2" to "Mlas int4 int8 with avx2/512" on May 26, 2024
Signed-off-by: Liqun Fu <liqfu@microsoft.com>

2 participants