Pinned build mode for GPU, with prebuilt Transformer Engine bdist #547
Also undo some unnecessary change
The TE version has been updated in anticipation of #555 landing soon. Once it lands, I will rebase and run sanity checks on all builds.
Also add flash attention smoketest
Also update constraints
setup.sh
May I know what's "--extra-index-url https://us-python.pkg.dev/gce-ai-infra/maxtext-build-support-packages/simple/" used for?
This is so that pip knows where to find the prebuilt wheel, which is currently stored at the referenced artifact registry path.
Got it. Is it accessible publicly?
Yes - any developer will be able to download the wheel, although uploading to the registry requires credentials.
Thanks In-Ho! This is really helpful!
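For readers following the thread, here is a minimal sketch of how the extra index might be passed to pip together with the requirements and constraints files. It is not taken from setup.sh: the function name `install_pinned` is hypothetical, and only the URL and the file names requirements.txt and constraints.txt come from this PR.

```shell
#!/usr/bin/env bash
# Sketch only: install pinned dependencies while letting pip also search
# the artifact registry that hosts the prebuilt wheel.
# The URL is the one quoted in the review comment above.
EXTRA_INDEX_URL="https://us-python.pkg.dev/gce-ai-infra/maxtext-build-support-packages/simple/"

# Hypothetical helper; the real setup.sh may invoke pip differently.
install_pinned() {
  # -r: packages to install; -c: version constraints applied to them.
  # --extra-index-url adds an index that is searched in addition to PyPI.
  python3 -m pip install \
    -r requirements.txt \
    -c constraints.txt \
    --extra-index-url "$EXTRA_INDEX_URL"
}

# Example (run from a directory containing both files):
# install_pinned
```

Because `--extra-index-url` adds to the default index rather than replacing it, packages that are not in the registry are still resolved from PyPI.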
Also add constraints to GPU stable build which are cherry-picked from #522 and #525
To update the wheel, take a look at maxtext_transformerengine_builder.Docker. Once the image is built, you can extract the wheel file from /root/TransformerEngine/dist in the image. Unfortunately, there is no one-liner to copy a file out of an offline Docker image, but there is no shortage of workarounds: https://stackoverflow.com/q/25292198

To upload the wheel, a builder who is authorized to upload to the artifact registry can do:

Where $GITHASH is the commit hash of the Transformer Engine version.

For nightly builds, we don't apply constraints, so the image can be built with the latest and greatest packages.
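One common workaround for extracting the wheel, as mentioned above, is to create a stopped container from the image and copy the files out of it. The sketch below assumes that approach; the function name `extract_wheel` and the example image tag are hypothetical, and only the /root/TransformerEngine/dist path comes from this PR.

```shell
#!/usr/bin/env bash
# Sketch only: copy the built wheel(s) out of a Docker image without running it.

# Hypothetical helper. $1 = image name or tag, $2 = host destination directory.
extract_wheel() {
  local image="$1"
  local dest="${2:-./dist}"
  local container rc

  # `docker create` makes a stopped container from the image; nothing is executed.
  container="$(docker create "$image")" || return 1
  mkdir -p "$dest"

  # The trailing "/." copies the directory's contents, not the directory itself.
  docker cp "$container:/root/TransformerEngine/dist/." "$dest"
  rc=$?

  # Remove the temporary container whether or not the copy succeeded.
  docker rm "$container" >/dev/null
  return "$rc"
}

# Example (the image tag is an assumption, not a name used by this PR):
# extract_wheel maxtext_te_builder:latest ./dist
```

The temporary container is removed even when the copy fails, so repeated runs do not leave stopped containers behind.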
We also copy setup.sh, requirements.txt and constraints.txt first, before executing setup.sh, so that the setup script is not rerun when no dependency has changed. We also mount the pip cache from the host to speed up the build process.