
Continued Pretraining

@danielhanchen released this 04 Jun 20:11 · 8 commits to main since this release · 27fa021

Continued pretraining

You can now do continued pretraining with Unsloth. See https://unsloth.ai/blog/contpretraining for more details!

Continued pretraining is 2x faster and uses 50% less VRAM than HF + FA2 QLoRA. We offload embed_tokens and lm_head to disk to save VRAM!
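To start, load a 4-bit base model as usual. Below is a minimal sketch; the model name and max_seq_length are illustrative choices for this example, not fixed by this release:

from unsloth import FastLanguageModel

# Load a 4-bit quantized base model. The model name and sequence length
# below are example choices; pick whichever model you want to pretrain.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-v0.3-bnb-4bit",
    max_seq_length = 2048,
    dtype = None,        # auto-detects bfloat16 if supported, else float16
    load_in_4bit = True, # QLoRA-style 4-bit quantization
)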

You can now simply add both embed_tokens and lm_head to the target modules, as below:

model = FastLanguageModel.get_peft_model(
    model,
    r = 128, # Choose any number > 0! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head",], # Add for continual pretraining
    lora_alpha = 32,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = True,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

We also support two learning rates, one for the embedding matrices and another for the LoRA adapters:

from unsloth import is_bfloat16_supported
from unsloth import UnslothTrainer, UnslothTrainingArguments

trainer = UnslothTrainer(
    args = UnslothTrainingArguments(
        # ... your other training arguments ...
        learning_rate = 5e-5,            # learning rate for the LoRA adapters
        embedding_learning_rate = 5e-6,  # smaller rate (10x smaller here) for embed_tokens / lm_head
    ),
)
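Putting it together, a full training setup might look like the sketch below. It assumes UnslothTrainer accepts the usual SFTTrainer-style arguments (model, tokenizer, train_dataset, dataset_text_field) and that your dataset has a "text" column; the batch size, step count, and output directory are illustrative:

# A minimal sketch, assuming UnslothTrainer takes the usual SFTTrainer
# arguments and that `dataset` has a "text" column of raw documents.
trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = UnslothTrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 8,
        max_steps = 120,                  # illustrative; use num_train_epochs for full runs
        warmup_ratio = 0.1,
        learning_rate = 5e-5,             # LoRA adapters
        embedding_learning_rate = 5e-6,   # embed_tokens / lm_head
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        output_dir = "outputs",
    ),
)
trainer.train()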

We also share a free Colab notebook that fine-tunes Mistral v3 to learn Korean (you can pick any language you like) using Wikipedia and the Aya Dataset: https://colab.research.google.com/drive/1tEd1FrOXWMnCU9UIvdYhs61tkxdMuKZu?usp=sharing

We are also sharing a free Colab notebook for general continued pretraining / text completion: https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing
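For continued pretraining the data is just raw text, so a common preparation step is simply appending the tokenizer's EOS token to each document so samples are separated. A minimal sketch (the dataset name, split, and "text" column here are illustrative assumptions, not the exact setup used in the notebooks):

from datasets import load_dataset

# Illustrative example: raw Korean Wikipedia text for continued pretraining.
# The dataset name, split, and "text" column are assumptions for this sketch.
dataset = load_dataset("wikimedia/wikipedia", "20231101.ko", split = "train[:1%]")

EOS_TOKEN = tokenizer.eos_token  # separate documents with the EOS token

def format_for_pretraining(examples):
    return {"text": [doc + EOS_TOKEN for doc in examples["text"]]}

dataset = dataset.map(format_for_pretraining, batched = True)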

What's Changed

New Contributors

Full Changelog: May-2024...June-2024