Releases · hiyouga/LLaMA-Factory
v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration
🚨🚨 Core refactor
- Deprecate `checkpoint_dir` and use `adapter_name_or_path` instead
- Replace `resume_lora_training` with `create_new_adapter`
- Move the patches in model loading to `llmtuner.model.patcher`
- Bump to Transformers 4.36.1 to adapt to the Mixtral models
- Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
- Temporarily disable LongLoRA due to breaking changes; it will be supported again later
The above changes were made by @hiyouga in #1864
New features
- Add DPO-ftx: mix supervised fine-tuning gradients into DPO via the `dpo_ftx` argument, suggested by @lylcst in #1347 (comment)
- Integrate AutoGPTQ into model export via the `export_quantization_bit` and `export_quantization_dataset` arguments
- Support loading datasets from the ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
- Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b
- Support system column in both alpaca and sharegpt dataset formats
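The noisy mean initialization for resized token embeddings can be sketched roughly as follows; the function name `resize_embeddings`, its arguments, and the `noise_std` default are illustrative assumptions, not LLaMA-Factory's actual API:

```python
import random
import statistics

def resize_embeddings(emb, num_new, noise_std=0.02, seed=0):
    """Extend an embedding table (list of rows) with num_new vectors,
    each initialized to the column-wise mean of the existing rows plus
    small Gaussian noise (hypothetical sketch, not the library's code)."""
    rng = random.Random(seed)
    dim = len(emb[0])
    mean = [statistics.fmean(row[j] for row in emb) for j in range(dim)]
    new_rows = [[m + rng.gauss(0.0, noise_std) for m in mean]
                for _ in range(num_new)]
    return emb + new_rows
```

With zero noise the new rows would be exactly the mean embedding; the noise keeps newly added tokens from starting out identical to each other.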
New models
- Base models
- Mixtral-8x7B-v0.1
- Instruct/Chat models
- Mixtral-8x7B-Instruct-v0.1
- Mistral-7B-Instruct-v0.2
- XVERSE-65B-Chat
- Yi-6B-Chat
Bug fix
v0.3.3: ModelScope Integration, Reward Server
New features
- Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
- Support launching a reward model server in the demo API by specifying `--stage=rm` in `api_demo.py`
- Support using a reward model server in PPO training by specifying `--reward_model_type api`
- Support adjusting the shard size of exported models via the `export_size` argument
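A shard-size limit like `export_size` boils down to grouping weight tensors into files that stay under a byte budget. A minimal greedy sketch of such packing (the helper `plan_shards` and its policy are assumptions for illustration, not the actual exporter):

```python
def plan_shards(tensor_sizes, shard_limit):
    """Greedily pack (name, num_bytes) pairs into shards whose total
    size stays at or below shard_limit (illustrative sketch only)."""
    shards, current, used = [], [], 0
    for name, size in tensor_sizes:
        # start a new shard when the next tensor would exceed the limit
        if current and used + size > shard_limit:
            shards.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        shards.append(current)
    return shards
```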
New models
- Base models
- DeepseekLLM-Base (7B/67B)
- Qwen (1.8B/72B)
- Instruct/Chat models
- DeepseekLLM-Chat (7B/67B)
- Qwen-Chat (1.8B/72B)
- Yi-34B-Chat
New datasets
- Supervised fine-tuning datasets
- Preference datasets
Bug fix
v0.3.2: Patch release
v0.3.0: Full-Parameter RLHF
v0.2.2: Patch release
v0.2.1: Variant Models, NEFTune Trick
New features
- Support the NEFTune trick for supervised fine-tuning by @anvie in #1252
- Support loading datasets in the sharegpt format (read `data/readme` for details)
- Support generating multiple responses in the demo API via the `n` parameter
- Support caching pre-processed dataset files via the `cache_path` argument
- Better LLaMA Board (pagination, controls, etc.)
- Support the `push_to_hub` argument #1088
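NEFTune adds uniform noise to the input embeddings during fine-tuning, scaled by alpha / sqrt(seq_len * dim). A pure-Python sketch of that scaling rule (the function name and defaults are illustrative, not the trainer's implementation):

```python
import math
import random

def neftune_noise(embeddings, alpha=5.0, seed=0):
    """Add uniform noise in [-eps, eps] to a seq_len x dim embedding
    matrix, where eps = alpha / sqrt(seq_len * dim), following the
    NEFTune scaling rule (illustrative sketch)."""
    rng = random.Random(seed)
    seq_len, dim = len(embeddings), len(embeddings[0])
    eps = alpha / math.sqrt(seq_len * dim)
    return [[x + rng.uniform(-eps, eps) for x in row] for row in embeddings]
```

Scaling by sequence length and hidden size keeps the perturbation magnitude comparable across batch shapes and model widths.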
New models
- Base models
- ChatGLM3-6B-Base
- Yi (6B/34B)
- Mistral-7B
- BlueLM-7B-Base
- Skywork-13B-Base
- XVERSE-65B
- Falcon-180B
- Deepseek-Coder-Base (1.3B/6.7B/33B)
- Instruct/Chat models
- ChatGLM3-6B
- Mistral-7B-Instruct
- BlueLM-7B-Chat
- Zephyr-7B
- OpenChat-3.5
- Yayi (7B/13B)
- Deepseek-Coder-Instruct (1.3B/6.7B/33B)
New datasets
- Pre-training datasets
- RedPajama V2
- Pile
- Supervised fine-tuning datasets
- OpenPlatypus
- ShareGPT Hyperfiltered
- ShareGPT4
- UltraChat 200k
- AgentInstruct
- LMSYS Chat 1M
- Evol Instruct V2
Bug fix
v0.2.0: Web UI Refactor, LongLoRA
New features
- Support LongLoRA for the LLaMA models
- Support training the Qwen-14B and InternLM-20B models
- Support training state recovery for the all-in-one Web UI
- Support Ascend NPU by @statelesshz in #975
- Integrate MMLU, C-Eval and CMMLU benchmarks
Modifications
- Rename repository to LLaMA Factory (former LLaMA Efficient Tuning)
- Use the `cutoff_len` argument instead of `max_source_length` and `max_target_length` #944
- Add a `train_on_prompt` option #1184
Bug fix
v0.1.8: FlashAttention-2 and Baichuan2
New features
- Support FlashAttention-2 for LLaMA models (an RTX 4090, A100, A800, or H100 GPU is required)
- Support training the Baichuan2 models
- Use right-padding to avoid overflow in fp16 training
- Align the computation method of the reward score with DeepSpeed-Chat (better generation)
- Support the `--lora_target all` argument, which automatically finds the applicable modules for LoRA training
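The right-padding change means batches are padded at the end of each sequence rather than the front. A minimal sketch of that collation step (`right_pad` is a hypothetical helper, not the project's collator):

```python
def right_pad(batch, pad_id=0):
    """Pad every sequence on the right to the batch's max length,
    returning (padded_ids, attention_masks). Illustrative sketch."""
    max_len = max(len(seq) for seq in batch)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    masks = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return padded, masks
```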
Bug fix
- Use efficient EOS tokens to align with the Baichuan training (baichuan-inc/Baichuan2#23)
- Remove PeftTrainer to save model checkpoints in DeepSpeed training
- Fix bugs in the Web UI by @beat4ocean in #596, by @codemayq in #644 #651 #678 #741, and by @kinghuin in #786
- Add dataset explanation by @panpan0000 in #629
- Fix a bug in the DPO data collator
- Fix a bug of the ChatGLM2 tokenizer in right-padding
- #608 #617 #649 #757 #761 #763 #809 #818
v0.1.7: Script Preview and RoPE Scaling
New features
- Preview training script in Web UI by @codemayq in #479 #511
- Support resuming from checkpoints by @niuba in #434 (`transformers>=4.31.0` required)
- Two RoPE scaling methods: linear and NTK-aware scaling for LLaMA models (`transformers>=4.31.0` required)
- Support training the ChatGLM2-6B model
- Support PPO training in bfloat16 data type #551
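Of the two RoPE scaling methods, linear scaling simply divides position indices by a factor so longer contexts map back into the trained positional range. A sketch of the rotary angles under that scheme (the function and its defaults are illustrative assumptions):

```python
import math

def rope_angles(position, dim, base=10000.0, scale=1.0):
    """Rotary-embedding angles for one position; linear scaling
    divides the position index by `scale` (illustrative sketch)."""
    return [(position / scale) / (base ** (2 * i / dim))
            for i in range(dim // 2)]
```

With scale=2, position 8 yields the same angles as position 4 unscaled, which is why the model can reuse its trained positional range at doubled context length.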
Bug fix
v0.1.6: DPO Training and Qwen-7B
- Adapt DPO training from the TRL library
- Support fine-tuning the Qwen-7B, Qwen-7B-Chat, XVERSE-13B, and ChatGLM2-6B models
- Implement the "safe" ChatML template for Qwen-7B-Chat
- Better Web UI
- Pretty readme by @codemayq #382
- New features: #395 #451
- Fix InternLM-7B inference #312
- Fix bugs: #351 #354 #361 #376 #408 #417 #420 #423 #426