[FR] (Q)DoRA #893
Comments
Thanks for filing this issue! The torchtune team has definitely been following some of the progress around (Q)DoRA and is super interested in adding it. Stay tuned for more thoughts and discussion soon!
@DreamGenX If this is something you're interested in contributing to the library, we'd be happy to work with you on it :)
@RdoubleA I added a DoRA-based update to my fork. If you approve, I can submit a pull request.
Hi @Prakyathkantharaju, thanks for sharing the fork. If I understand correctly, you are returning …
Hello @ebsmothers, yes, that is correct. The method I implemented differs slightly from the one presented in the paper; I based my code on this repo: https://github.com/rasbt/dora-from-scratch/blob/main/Using-LinearDoRA.ipynb I then updated the code based on the author's recommendation, which apparently makes inference faster on GPU. Here is the reference for the author's comment: huggingface/peft#1474 (comment) I have added references in the code as well. I also opened pull request #936 so that any changes can be tracked more clearly.
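For context on why such a reformulation can be faster: one optimization that is often applied to DoRA-style layers is avoiding materializing the full merged weight on every forward pass. Column-scaling the merged weight by `m / ||V||_c` is mathematically equivalent to scaling the input first, which lets the LoRA branch stay low-rank. A minimal NumPy sketch of that equivalence (shapes and names are illustrative only, and this is not the code from the PR or the linked comment):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 16, 4

W0 = rng.standard_normal((d_out, d_in))  # frozen base weight
A = rng.standard_normal((r, d_in))       # LoRA down-projection
B = rng.standard_normal((d_out, r))      # LoRA up-projection
m = rng.standard_normal(d_in)            # DoRA magnitude vector, one per column
x = rng.standard_normal(d_in)            # input vector

# Naive path: materialize the merged DoRA weight W' = m * V / ||V||_c.
V = W0 + B @ A                           # LoRA-updated direction
s = m / np.linalg.norm(V, axis=0)        # per-column scale, shape (d_in,)
y_naive = (V * s) @ x                    # full (d_out, d_in) matrix built each step

# Equivalent path: scale the input instead, keeping the LoRA branch low-rank.
xs = s * x
y_fast = W0 @ xs + B @ (A @ xs)          # never forms W0 + B @ A explicitly

assert np.allclose(y_naive, y_fast)
```

The equivalence holds because scaling column `j` of a matrix by `s[j]` before multiplying by `x` is the same as multiplying the unscaled matrix by `s * x`.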
Thanks for the clarification @Prakyathkantharaju, and thanks for opening the PR. I agree, let's consolidate the discussion on the PR. I left a few initial comments; let's continue the conversation there.
(Q)DoRA, an alternative to (Q)LoRA, is quickly proving to be a superior technique in terms of closing the gap between full fine-tuning (FFT) and PEFT.
Known existing implementations:
Field reports / comparisons to LoRA:
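For readers unfamiliar with the technique: DoRA decomposes a pretrained weight into a magnitude vector and a direction matrix, applies the LoRA update only to the direction, and renormalizes column-wise. A minimal NumPy sketch of the weight composition under those assumptions (function name and shapes are illustrative, not any particular implementation):

```python
import numpy as np

def dora_merged_weight(W0, A, B, m):
    """Compose the effective DoRA weight W' = m * (W0 + B @ A) / ||W0 + B @ A||_c,
    where ||.||_c is the per-column L2 norm and m holds one learned magnitude
    per column. Shapes (illustrative): W0 (out, in), A (r, in), B (out, r), m (in,)."""
    V = W0 + B @ A                        # LoRA-updated direction
    col_norm = np.linalg.norm(V, axis=0)  # column-wise norms, shape (in,)
    return (V / col_norm) * m             # unit columns rescaled by learned magnitudes

# At initialization B is zero and m is set to ||W0||_c, so W' recovers W0 exactly.
rng = np.random.default_rng(0)
W0 = rng.standard_normal((8, 16))
A = rng.standard_normal((4, 16))
B = np.zeros((8, 4))
m = np.linalg.norm(W0, axis=0)
assert np.allclose(dora_merged_weight(W0, A, B, m), W0)
```

During training, only `A`, `B`, and `m` receive gradients while `W0` stays frozen, which is what keeps the parameter count in LoRA territory.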