v0.8.0: GLM-4, Qwen2, PaliGemma, KTO, SimPO

@hiyouga hiyouga released this 07 Jun 22:26
· 5 commits to main since this release

Stronger LlamaBoard 💪😀

  • Support single-node distributed training in Web UI
  • Add dropdown menu for easily resuming from checkpoints and picking saved configurations by @hiyouga and @hzhaoy in #4053
  • Support selecting checkpoints of full/freeze tuning
  • Add throughput metrics to LlamaBoard by @injet-zhou in #4066
  • Faster UI loading

New features

  • Add KTO algorithm by @enji-zhou in #3785
  • Add SimPO algorithm by @hiyouga
  • Support passing max_lora_rank to the vLLM backend by @jue-jue-zi in #3794
  • Support preference datasets in sharegpt format and remove big files from git repo by @hiyouga in #3799
  • Support setting system messages in CLI inference by @ycjcl868 in #3812
  • Add num_samples option in dataset_info.json by @seanzhang-zhichen in #3829
  • Add NPU docker image by @dongdongqiang2018 in #3876
  • Improve NPU documentation by @MengqingCao in #3930
  • Support SFT packing with greedy knapsack algorithm by @AlongWY in #4009
  • Add llamafactory-cli env for bug reports
  • Support image input in the API mode
  • Support random initialization via the train_from_scratch argument
  • Initialize CI
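
For readers unfamiliar with the packing entry above, the following is a minimal sketch of greedy knapsack packing for SFT sequences. It is an illustration only, not the code merged in #4009: the function name `greedy_knapsack`, its parameters, and the choice to reject over-length sequences are all assumptions made for this example.

```python
import bisect


def greedy_knapsack(lengths, capacity):
    """Group sequence lengths into bins whose totals stay within capacity.

    Hypothetical helper for illustration; not LLaMA-Factory's actual API.
    """
    if any(n > capacity for n in lengths):
        # Assumption: callers truncate sequences to the cutoff length first.
        raise ValueError("every length must fit within the capacity")

    remaining = sorted(lengths)  # ascending, so bisect can search it
    bins = []
    while remaining:
        current, room = [], capacity
        while True:
            # Index of the largest remaining length that still fits.
            idx = bisect.bisect_right(remaining, room) - 1
            if idx < 0:
                break  # nothing left fits in this bin
            room -= remaining[idx]
            current.append(remaining.pop(idx))
        bins.append(current)
    return bins


packed = greedy_knapsack([5, 3, 2, 7, 1], capacity=8)
print(packed)
```

Each bin is filled by repeatedly taking the longest sequence that still fits, which tends to leave less padding than packing sequences in arrival order.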

New models

  • Base models
    • Qwen2 (0.5B/1.5B/7B/72B/MoE) 📄
    • PaliGemma-3B (pt/mix) 📄🖼️
    • GLM-4-9B 📄
    • Falcon-11B 📄
    • DeepSeek-V2-Lite (16B) 📄
  • Instruct/Chat models
    • Qwen2-Instruct (0.5B/1.5B/7B/72B/MoE) 📄🤖
    • Mistral-7B-Instruct-v0.3 📄🤖
    • Phi-3-small-8k-instruct (7B) 📄🤖
    • Aya-23 (8B/35B) 📄🤖
    • OpenChat-3.6-8B 📄🤖
    • GLM-4-9B-Chat 📄🤖
    • TeleChat-12B-Chat by @hzhaoy in #3958 📄🤖
    • Phi-3-medium-8k-instruct (14B) 📄🤖
    • DeepSeek-V2-Lite-Chat (16B) 📄🤖
    • Codestral-22B-v0.1 📄🤖

New datasets

  • Pre-training datasets
    • FineWeb (en)
    • FineWeb-Edu (en)
  • Supervised fine-tuning datasets
    • Ruozhiba-GPT4 (zh)
    • STEM-Instruction (zh)
  • Preference datasets
    • Argilla-KTO-mix-15K (en)
    • UltraFeedback (en)

Bug fix