Releases: hiyouga/LLaMA-Factory

v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration

16 Dec 13:48

🚨🚨 Core refactor

  • Deprecate checkpoint_dir and use adapter_name_or_path instead
  • Replace resume_lora_training with create_new_adapter
  • Move the patches in model loading to llmtuner.model.patcher
  • Bump to Transformers 4.36.1 to adapt to the Mixtral models
  • Broaden FlashAttention-2 support (LLaMA, Falcon, Mistral)
  • Temporarily disable LongLoRA due to breaking changes; support will be restored later

The above changes were made by @hiyouga in #1864

New features

  • Add DPO-ftx: mixing fine-tuning gradients into DPO via the dpo_ftx argument, suggested by @lylcst in #1347 (comment)
  • Integrate AutoGPTQ into the model export via the export_quantization_bit and export_quantization_dataset arguments
  • Support loading datasets from ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
  • Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b
  • Support a system column in both the alpaca and sharegpt dataset formats
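
The "noisy mean initialization" used when resizing token embeddings can be sketched as below. This is an illustrative reconstruction, not LLaMA-Factory's actual code: the function name and the 1/sqrt(dim) noise scale are assumptions; the idea is that each new row starts at the mean of the existing embeddings plus small Gaussian noise.

```python
import numpy as np

def resize_embeddings_noisy_mean(embed: np.ndarray, new_size: int,
                                 seed: int = 0) -> np.ndarray:
    """Grow an embedding matrix: each new row is the mean of the existing
    rows plus small Gaussian noise (hypothetical sketch)."""
    old_size, dim = embed.shape
    if new_size <= old_size:
        return embed[:new_size]
    rng = np.random.default_rng(seed)
    mean = embed.mean(axis=0, keepdims=True)
    # A noise scale around 1/sqrt(dim) keeps new rows close to the mean.
    noise = rng.standard_normal((new_size - old_size, dim)) / np.sqrt(dim)
    return np.concatenate([embed, mean + noise], axis=0)
```

Starting new tokens near the embedding mean (rather than at random) tends to keep the initial loss stable, since untrained tokens behave like an "average" token until they are updated.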

New models

  • Base models
    • Mixtral-8x7B-v0.1
  • Instruct/Chat models
    • Mixtral-8x7B-Instruct-v0.1
    • Mistral-7B-Instruct-v0.2
    • XVERSE-65B-Chat
    • Yi-6B-Chat

Bug fix

v0.3.3: ModelScope Integration, Reward Server

03 Dec 14:17

New features

  • Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
  • Support launching a reward model server in the demo API by specifying --stage=rm in api_demo.py
  • Support using a reward model server in PPO training by specifying --reward_model_type api
  • Support adjusting the shard size of exported models via the export_size argument
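
Capping the shard size of an exported model (what export_size controls) amounts to packing tensors into files so that no file exceeds a byte limit. A rough sketch under assumed names and a simple greedy policy — not the library's actual algorithm:

```python
def shard_state_dict(sizes: dict, max_shard_bytes: int) -> list:
    """Greedily pack tensor names into shards so each shard stays under
    max_shard_bytes; a single tensor larger than the cap gets its own shard.
    `sizes` maps tensor name -> size in bytes (illustrative sketch)."""
    shards, current, used = [], [], 0
    for name, size in sizes.items():
        if current and used + size > max_shard_bytes:
            shards.append(current)   # close the full shard
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        shards.append(current)
    return shards
```

Smaller shards make checkpoints easier to download and to load on memory-constrained machines, at the cost of more files.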

New models

  • Base models
    • DeepseekLLM-Base (7B/67B)
    • Qwen (1.8B/72B)
  • Instruct/Chat models
    • DeepseekLLM-Chat (7B/67B)
    • Qwen-Chat (1.8B/72B)
    • Yi-34B-Chat

New datasets

  • Supervised fine-tuning datasets
  • Preference datasets

Bug fix

v0.3.2: Patch release

21 Nov 05:41

New features

  • Support training GPTQ-quantized models #729 #1481 #1545
  • Support resuming reward model training #1567

Bug fix

v0.3.0: Full-Parameter RLHF

16 Nov 08:24

New features

  • Support full-parameter RLHF training (RM & PPO)
  • Refactor llmtuner core in #1525 by @hiyouga
  • Better LLaMA Board: full-parameter RLHF and demo mode

New models

  • Base models
    • ChineseLLaMA-1.3B
    • LingoWhale-8B
  • Instruct/Chat models
    • ChineseAlpaca-1.3B
    • Zephyr-7B-Alpha/Beta

Bug fix

v0.2.2: Patch release

13 Nov 15:16

Bug fix

v0.2.1: Variant Models, NEFTune Trick

09 Nov 08:30

New features

  • Support NEFTune trick for supervised fine-tuning by @anvie in #1252
  • Support loading datasets in the sharegpt format (see data/readme for details)
  • Support generating multiple responses in demo API via the n parameter
  • Support caching the pre-processed dataset files via the cache_path argument
  • Better LLaMA Board (pagination, controls, etc.)
  • Support push_to_hub argument #1088
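
NEFTune perturbs the token embeddings with uniform noise during fine-tuning, with magnitude alpha / sqrt(L * d) for sequence length L and hidden dimension d. A minimal numpy sketch of the noise step (the real implementation is a forward hook on the PyTorch embedding layer; names here are illustrative):

```python
import numpy as np

def neftune_noise(embeds: np.ndarray, alpha: float = 5.0,
                  seed: int = 0) -> np.ndarray:
    """Add Uniform(-eps, eps) noise to token embeddings, where
    eps = alpha / sqrt(seq_len * hidden_dim), following the NEFTune recipe."""
    seq_len, dim = embeds.shape
    eps = alpha / np.sqrt(seq_len * dim)
    rng = np.random.default_rng(seed)
    return embeds + rng.uniform(-eps, eps, size=embeds.shape)
```

The noise is applied only at training time; inference uses the clean embeddings.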

New models

  • Base models
    • ChatGLM3-6B-Base
    • Yi (6B/34B)
    • Mistral-7B
    • BlueLM-7B-Base
    • Skywork-13B-Base
    • XVERSE-65B
    • Falcon-180B
    • Deepseek-Coder-Base (1.3B/6.7B/33B)
  • Instruct/Chat models
    • ChatGLM3-6B
    • Mistral-7B-Instruct
    • BlueLM-7B-Chat
    • Zephyr-7B
    • OpenChat-3.5
    • Yayi (7B/13B)
    • Deepseek-Coder-Instruct (1.3B/6.7B/33B)

New datasets

  • Pre-training datasets
    • RedPajama V2
    • Pile
  • Supervised fine-tuning datasets
    • OpenPlatypus
    • ShareGPT Hyperfiltered
    • ShareGPT4
    • UltraChat 200k
    • AgentInstruct
    • LMSYS Chat 1M
    • Evol Instruct V2

Bug fix

v0.2.0: Web UI Refactor, LongLoRA

15 Oct 13:06

New features

  • Support LongLoRA for the LLaMA models
  • Support training the Qwen-14B and InternLM-20B models
  • Support training state recovery for the all-in-one Web UI
  • Support Ascend NPU by @statelesshz in #975
  • Integrate MMLU, C-Eval and CMMLU benchmarks

Modifications

  • Rename the repository to LLaMA Factory (formerly LLaMA Efficient Tuning)
  • Use the cutoff_len argument instead of max_source_length and max_target_length #944
  • Add a train_on_prompt option #1184
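
The train_on_prompt option decides whether the loss covers prompt tokens or only the response. A hedged sketch of how the labels might be assembled (the -100 ignore index follows the usual Hugging Face convention; the helper name is hypothetical):

```python
IGNORE_INDEX = -100  # positions labeled -100 are excluded from the loss

def build_labels(prompt_ids: list, response_ids: list,
                 train_on_prompt: bool) -> list:
    """Labels mirror the input ids; with train_on_prompt off, prompt
    positions are masked out so only the response contributes to the loss."""
    if train_on_prompt:
        prompt_labels = list(prompt_ids)
    else:
        prompt_labels = [IGNORE_INDEX] * len(prompt_ids)
    return prompt_labels + list(response_ids)
```

Masking the prompt is the common default for instruction tuning; training on the prompt as well can help when prompts carry content worth modeling.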

Bug fix


v0.1.8: FlashAttention-2 and Baichuan2

11 Sep 09:55

New features

  • Support FlashAttention-2 for LLaMA models (an RTX 4090, A100, A800, or H100 GPU is required)
  • Support training the Baichuan2 models
  • Use right-padding to avoid overflow in fp16 training
  • Align the computation method of the reward score with DeepSpeed-Chat (better generation)
  • Support the --lora_target all argument, which automatically finds the modules applicable for LoRA training

Bug fix

v0.1.7: Script Preview and RoPE Scaling

18 Aug 09:39

New features

  • Preview training script in Web UI by @codemayq in #479 #511
  • Support resuming from checkpoints by @niuba in #434 (transformers>=4.31.0 required)
  • Support two RoPE scaling methods for LLaMA models: linear and NTK-aware scaling (transformers>=4.31.0 required)
  • Support training the ChatGLM2-6B model
  • Support PPO training in the bfloat16 data type #551
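
The two RoPE scaling methods differ in what they stretch: linear scaling divides the position index by the factor (equivalently, shrinks every rotary frequency), while NTK-aware scaling enlarges the rotary base so low frequencies stretch more than high ones. A simplified sketch, not the exact transformers implementation:

```python
import numpy as np

def rope_inv_freq(dim: int, base: float = 10000.0, factor: float = 1.0,
                  method: str = "linear") -> np.ndarray:
    """Inverse RoPE frequencies under two context-extension schemes.
    linear: divide positions by `factor` (folded into the frequencies here).
    ntk:    rescale the base by factor ** (dim / (dim - 2)), the
            NTK-aware interpolation trick."""
    if method == "ntk" and factor > 1.0:
        base = base * factor ** (dim / (dim - 2))
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    if method == "linear":
        inv_freq = inv_freq / factor
    return inv_freq
```

Linear scaling compresses all positions uniformly, so short-context behavior shifts slightly; NTK-aware scaling leaves the highest frequencies nearly untouched, which tends to preserve short-range attention while extending the usable context.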

Bug fix

v0.1.6: DPO Training and Qwen-7B

11 Aug 15:43