Fine-tuning 6-billion-parameter GPT-J (and other models) in Colab with LoRA and 8-bit compression (Open in Colab)

This notebook is a simple example of fine-tuning GPT-J-6B with limited memory. A detailed explanation of how it works can be found in this model card. It is heavily based on this Colab; huge thanks to Hivemind!

You can also fine-tune GPT-Neo-2.7B, French GPT-J (Cedille's Boris), and T0-3B with limited memory.
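As a rough illustration of the general recipe (not the notebook's exact code, which follows Hivemind's custom 8-bit GPT-J implementation), here is a minimal sketch assuming the Hugging Face transformers and peft libraries:

```python
# Illustrative sketch only: the notebook itself uses Hivemind's custom
# 8-bit GPT-J checkpoint and hand-rolled LoRA adapters, not peft.
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

# Load the 6B model with 8-bit weights so it fits on a single Colab GPU.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    load_in_8bit=True,
    device_map="auto",
)

# Attach low-rank adapters to the attention projections; only these small
# matrices are trained, while the 8-bit base weights stay frozen.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```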

Models trained with this method:

Sauge Divine: @saugedivine. Trained on philosophical, trippy and mystical content.
La Voix du Bot: @lavoixdubot. Trained on French news.

LoRA: https://arxiv.org/abs/2106.09685
8-bit Optimizers: https://arxiv.org/abs/2110.02861
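An 8-bit optimizer from bitsandbytes can further cut memory by storing optimizer state in 8 bits. A hypothetical usage for the trainable LoRA parameters (optimizer settings here are illustrative, not taken from the notebook):

```python
# Hypothetical 8-bit optimizer setup with bitsandbytes; learning rate and
# parameter selection are example values, not the notebook's settings.
import bitsandbytes as bnb

optimizer = bnb.optim.Adam8bit(
    (p for p in model.parameters() if p.requires_grad),  # LoRA params only
    lr=1e-4,
)
```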

Twitter: @gustavecortal