
Support for T5 #560

Open
kishorenc opened this issue Mar 27, 2024 · 4 comments

Comments

@kishorenc

Do you have plans to support encoder-decoder models like T5? It would be great to have T5 with flash attention 😄

@rwitten
Collaborator

rwitten commented Mar 27, 2024

What specific model would you like supported? We would only take this on if we saw sufficient interest (but in practice we see heavy movement towards decoder-only models).

@kishorenc
Author

Decoder-only models are great for generative use cases, but the T5 family is the workhorse for many discriminative tasks. For example, the flan-t5-base model had 2M downloads on Hugging Face in the last month. Support for flan-t5 would add huge value for the community.

@versae

versae commented Apr 8, 2024

It'd be great to have T5 models here as well.

@emergenz

I'm going to try to turn MaxText into an encoder-decoder anyway, so native support is of course also appreciated :)
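For anyone exploring that route: the main structural piece an encoder-decoder adds on top of a decoder-only stack is cross-attention, where decoder states attend to the encoder's outputs. Here is a minimal NumPy sketch of that operation with illustrative shapes and identity weights; it is not MaxText's actual implementation, just the shape of the change.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(decoder_h, encoder_h, wq, wk, wv):
    # decoder_h: (tgt_len, d) decoder hidden states (queries)
    # encoder_h: (src_len, d) encoder outputs (keys and values)
    q = decoder_h @ wq
    k = encoder_h @ wk
    v = encoder_h @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (tgt_len, src_len)
    weights = softmax(scores, axis=-1)        # each target token attends over source tokens
    return weights @ v                        # (tgt_len, d)

rng = np.random.default_rng(0)
d = 8
dec = rng.standard_normal((5, d))   # 5 target tokens
enc = rng.standard_normal((7, d))   # 7 source tokens
wq = wk = wv = np.eye(d)            # placeholder projections for illustration
out = cross_attention(dec, enc, wq, wk, wv)
print(out.shape)  # (5, 8)
```

In a real port this block would sit inside each decoder layer between self-attention and the MLP, with learned projections and a flash-attention kernel in place of the dense softmax.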
