Releases · hiyouga/LLaMA-Factory
v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration
🚨🚨 Core refactor
- Deprecate `checkpoint_dir` and use `adapter_name_or_path` instead
- Replace `resume_lora_training` with `create_new_adapter`
- Move the patches in model loading to `llmtuner.model.patcher`
- Bump to Transformers 4.36.1 to adapt to the Mixtral models
- Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
- Temporarily disable LongLoRA due to breaking changes; it will be supported again later
The above changes were made by @hiyouga in #1864
New features
- Add DPO-ftx: mix supervised fine-tuning gradients into DPO via the `dpo_ftx` argument, suggested by @lylcst in #1347 (comment)
- Integrate AutoGPTQ into model export via the `export_quantization_bit` and `export_quantization_dataset` arguments
- Support loading datasets from the ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
- Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b
- Support system column in both alpaca and sharegpt dataset formats
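The noisy mean initialization for resized token embeddings can be sketched roughly as follows; the function name `resize_embeddings`, its arguments, and the `noise_std` default are illustrative assumptions, not LLaMA-Factory's actual API:

```python
import random
import statistics

def resize_embeddings(emb, num_new, noise_std=0.02, seed=0):
    """Extend an embedding table (list of rows) with num_new vectors,
    each initialized to the column-wise mean of the existing rows plus
    small Gaussian noise (hypothetical sketch, not the library's code)."""
    rng = random.Random(seed)
    dim = len(emb[0])
    mean = [statistics.fmean(row[j] for row in emb) for j in range(dim)]
    new_rows = [[m + rng.gauss(0.0, noise_std) for m in mean]
                for _ in range(num_new)]
    return emb + new_rows
```

With zero noise the new rows would be exactly the mean embedding; the noise keeps newly added tokens from starting out identical to each other.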
New models
- Base models
- Mixtral-8x7B-v0.1
- Instruct/Chat models
- Mixtral-8x7B-Instruct-v0.1
- Mistral-7B-Instruct-v0.2
- XVERSE-65B-Chat
- Yi-6B-Chat
Bug fix
v0.3.3: ModelScope Integration, Reward Server
New features
- Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
- Support launching a reward model server in the demo API by specifying `--stage=rm` in `api_demo.py`
- Support using a reward model server in PPO training by specifying `--reward_model_type api`
- Support adjusting the shard size of exported models via the `export_size` argument
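A shard-size limit like `export_size` boils down to grouping weight tensors into files that stay under a byte budget. A minimal greedy sketch of such packing (the helper `plan_shards` and its policy are assumptions for illustration, not the actual exporter):

```python
def plan_shards(tensor_sizes, shard_limit):
    """Greedily pack (name, num_bytes) pairs into shards whose total
    size stays at or below shard_limit (illustrative sketch only)."""
    shards, current, used = [], [], 0
    for name, size in tensor_sizes:
        # start a new shard when the next tensor would exceed the limit
        if current and used + size > shard_limit:
            shards.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        shards.append(current)
    return shards
```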
New models
- Base models
- DeepseekLLM-Base (7B/67B)
- Qwen (1.8B/72B)
- Instruct/Chat models
- DeepseekLLM-Chat (7B/67B)
- Qwen-Chat (1.8B/72B)
- Yi-34B-Chat
New datasets
- Supervised fine-tuning datasets
- Preference datasets
Bug fix
v0.3.2: Patch release
v0.3.0: Full-Parameter RLHF
v0.2.2: Patch release
v0.2.1: Variant Models, NEFTune Trick
New features
- Support the NEFTune trick for supervised fine-tuning by @anvie in #1252
- Support loading datasets in the sharegpt format (read `data/readme` for details)
- Support generating multiple responses in the demo API via the `n` parameter
- Support caching pre-processed dataset files via the `cache_path` argument
- Better LLaMA Board (pagination, controls, etc.)
- Support the `push_to_hub` argument #1088
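NEFTune adds uniform noise to the input embeddings during fine-tuning, scaled by alpha / sqrt(seq_len * dim). A pure-Python sketch of that scaling rule (the function name and defaults are illustrative, not the trainer's implementation):

```python
import math
import random

def neftune_noise(embeddings, alpha=5.0, seed=0):
    """Add uniform noise in [-eps, eps] to a seq_len x dim embedding
    matrix, where eps = alpha / sqrt(seq_len * dim), following the
    NEFTune scaling rule (illustrative sketch)."""
    rng = random.Random(seed)
    seq_len, dim = len(embeddings), len(embeddings[0])
    eps = alpha / math.sqrt(seq_len * dim)
    return [[x + rng.uniform(-eps, eps) for x in row] for row in embeddings]
```

Scaling by sequence length and hidden size keeps the perturbation magnitude comparable across batch shapes and model widths.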
New models
- Base models
- ChatGLM3-6B-Base
- Yi (6B/34B)
- Mistral-7B
- BlueLM-7B-Base
- Skywork-13B-Base
- XVERSE-65B
- Falcon-180B
- Deepseek-Coder-Base (1.3B/6.7B/33B)
- Instruct/Chat models
- ChatGLM3-6B
- Mistral-7B-Instruct
- BlueLM-7B-Chat
- Zephyr-7B
- OpenChat-3.5
- Yayi (7B/13B)
- Deepseek-Coder-Instruct (1.3B/6.7B/33B)
New datasets
- Pre-training datasets
- RedPajama V2
- Pile
- Supervised fine-tuning datasets
- OpenPlatypus
- ShareGPT Hyperfiltered
- ShareGPT4
- UltraChat 200k
- AgentInstruct
- LMSYS Chat 1M
- Evol Instruct V2
Bug fix
v0.2.0: Web UI Refactor, LongLoRA
New features
- Support LongLoRA for the LLaMA models
- Support training the Qwen-14B and InternLM-20B models
- Support training state recovery for the all-in-one Web UI
- Support Ascend NPU by @statelesshz in #975
- Integrate MMLU, C-Eval and CMMLU benchmarks
Modifications
- Rename repository to LLaMA Factory (former LLaMA Efficient Tuning)
- Use the `cutoff_len` argument instead of `max_source_length` and `max_target_length` #944
- Add a `train_on_prompt` option #1184
Bug fix
v0.1.8: FlashAttention-2 and Baichuan2
New features
- Support FlashAttention-2 for LLaMA models (an RTX 4090, A100, A800, or H100 GPU is required)
- Support training the Baichuan2 models
- Use right-padding to avoid overflow in fp16 training
- Align the computation method of the reward score with DeepSpeed-Chat (better generation)
- Support the `--lora_target all` argument, which automatically finds the applicable modules for LoRA training
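The right-padding change means batches are padded at the end of each sequence rather than the front. A minimal sketch of that collation step (`right_pad` is a hypothetical helper, not the project's collator):

```python
def right_pad(batch, pad_id=0):
    """Pad every sequence on the right to the batch's max length,
    returning (padded_ids, attention_masks). Illustrative sketch."""
    max_len = max(len(seq) for seq in batch)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    masks = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return padded, masks
```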
Bug fix
- Use efficient EOS tokens to align with the Baichuan training (baichuan-inc/Baichuan2#23)
- Remove PeftTrainer to save model checkpoints in DeepSpeed training
- Fix bugs in the Web UI by @beat4ocean in #596, by @codemayq in #644 #651 #678 #741, and by @kinghuin in #786
- Add dataset explanation by @panpan0000 in #629
- Fix a bug in the DPO data collator
- Fix a bug of the ChatGLM2 tokenizer in right-padding
- #608 #617 #649 #757 #761 #763 #809 #818
v0.1.7: Script Preview and RoPE Scaling
New features
- Preview training script in Web UI by @codemayq in #479 #511
- Support resuming from checkpoints by @niuba in #434 (`transformers>=4.31.0` required)
- Two RoPE scaling methods: linear and NTK-aware scaling for LLaMA models (`transformers>=4.31.0` required)
- Support training the ChatGLM2-6B model
- Support PPO training in bfloat16 data type #551
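Of the two RoPE scaling methods, linear scaling simply divides position indices by a factor so longer contexts map back into the trained positional range. A sketch of the rotary angles under that scheme (the function and its defaults are illustrative assumptions):

```python
import math

def rope_angles(position, dim, base=10000.0, scale=1.0):
    """Rotary-embedding angles for one position; linear scaling
    divides the position index by `scale` (illustrative sketch)."""
    return [(position / scale) / (base ** (2 * i / dim))
            for i in range(dim // 2)]
```

With scale=2, position 8 yields the same angles as position 4 unscaled, which is why the model can reuse its trained positional range at doubled context length.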
Bug fix
v0.1.6: DPO Training and Qwen-7B
- Adapt DPO training from the TRL library
- Support fine-tuning the Qwen-7B, Qwen-7B-Chat, XVERSE-13B, and ChatGLM2-6B models
- Implement the "safe" ChatML template for Qwen-7B-Chat
- Better Web UI
- Pretty readme by @codemayq #382
- New features: #395 #451
- Fix InternLM-7B inference #312
- Fix bugs: #351 #354 #361 #376 #408 #417 #420 #423 #426