End-to-end FP8 training #45

Open
1 of 8 tasks
xrsrke opened this issue Nov 26, 2023 · 1 comment
Labels: help wanted, High Priority

Comments

@xrsrke
Owner

xrsrke commented Nov 26, 2023

Notes

  • Write an FP8Tensor that inherits from torch.Tensor (only needed to support type hints).
  • Write an FP8Linear that binds to TransformerEngine's FP8 kernel in the forward pass.
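
A minimal sketch of what the FP8Tensor note could look like, assuming the FP8 payload is stored as raw bytes in a uint8 tensor. The `fp8_scale` attribute and the `describe` helper are hypothetical names used only for illustration; nothing here performs real FP8 quantization or calls TransformerEngine.

```python
import torch


class FP8Tensor(torch.Tensor):
    """Marker subclass so signatures can say `FP8Tensor` instead of `torch.Tensor`."""

    def __new__(cls, data: torch.Tensor, scale: float = 1.0) -> "FP8Tensor":
        # Assumption: the FP8 payload is kept as raw bytes in a uint8 tensor.
        if data.dtype != torch.uint8:
            raise TypeError("FP8Tensor expects a uint8 payload")
        tensor = data.detach().as_subclass(cls)
        # Hypothetical metadata: the scaling factor used at quantization time.
        tensor.fp8_scale = scale
        return tensor


def describe(x: FP8Tensor) -> str:
    # The subclass makes both the type hint and the runtime check meaningful.
    if not isinstance(x, FP8Tensor):
        raise TypeError("expected an FP8Tensor")
    return f"FP8Tensor(shape={tuple(x.shape)}, scale={x.fp8_scale})"


payload = torch.zeros(2, 3, dtype=torch.uint8)
fp8 = FP8Tensor(payload, scale=0.5)
print(describe(fp8))
```

Because the subclass is still a `torch.Tensor`, it can be passed to existing code unchanged, while functions that require FP8 inputs can check for it explicitly.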

TODO

  • nn.Linear but in FP8
  • Recursively convert all nn.Linear to FP8 Linear
  • nn.Embedding in FP8
  • DP in FP8
  • TP in FP8
  • MoE in FP8
  • ZeRO-1
  • PP in FP8 (you get it for free)
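
The "recursively convert all nn.Linear" item could be sketched roughly as below. `FP8Linear` here is a hypothetical placeholder that simply subclasses `nn.Linear` and still computes in the original precision; a real version would dispatch to an FP8 kernel in its forward pass. `convert_linear_to_fp8` is likewise an assumed helper name, not an existing API.

```python
import torch
from torch import nn


class FP8Linear(nn.Linear):
    """Hypothetical stand-in: a real version would call an FP8 GEMM in forward()."""


def convert_linear_to_fp8(module: nn.Module) -> nn.Module:
    """Recursively replace every nn.Linear child with an FP8Linear, in place."""
    # Copy the child list first so replacing entries cannot disturb iteration.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear) and not isinstance(child, FP8Linear):
            fp8_child = FP8Linear(
                child.in_features,
                child.out_features,
                bias=child.bias is not None,
                device=child.weight.device,
                dtype=child.weight.dtype,
            )
            # Reuse the existing parameters so no weights are re-initialized.
            fp8_child.weight = child.weight
            if child.bias is not None:
                fp8_child.bias = child.bias
            setattr(module, name, fp8_child)
        else:
            convert_linear_to_fp8(child)
    return module


model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4)),
)
original_weight = model[0].weight
convert_linear_to_fp8(model)
```

Note that this sketch only swaps children, so a bare `nn.Linear` passed as the root module is returned unchanged.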
xrsrke assigned and then unassigned xrsrke Nov 27, 2023
@3outeille
Collaborator

@xrsrke On it

xrsrke added the help wanted and High Priority labels Dec 10, 2023
Projects
Status: In Progress
Development

No branches or pull requests

2 participants