CUDA: generalize FP16 fattn vec kernel #7061

Merged

JohannesGaessler merged 7 commits into ggerganov:master from JohannesGaessler:cuda-fa-no-tc-5

May 9, 2024

Commits on May 9, 2024

CUDA: generalize FP16 fattn vec kernel

JohannesGaessler committed May 9, 2024
48463c0

disable unsupported head sizes for AMD in test

JohannesGaessler committed May 9, 2024
86636bd

try AMD fix

JohannesGaessler committed May 9, 2024
617f129

fix batch size 2-8

JohannesGaessler committed May 9, 2024
d9bcb92

partially revert changes

JohannesGaessler committed May 9, 2024
fa81c3a

fix performance regression

JohannesGaessler committed May 9, 2024
2272765

fix compiler warning

JohannesGaessler committed May 9, 2024
fece1fe
