Releases · InternLM/xtuner
XTuner Release V0.1.19
What's Changed
- [Fix] LLaVA-v1.5 official settings by @LZHgrla in #594
- [Feature] Release LLaVA-Llama-3-8B by @LZHgrla in #595
- [Improve] Add single-gpu configs for LLaVA-Llama-3-8B by @LZHgrla in #596
- [Docs] Add wisemodel badge by @LZHgrla in #597
- [Feature] Support load_json_file with json.load by @HIT-cwh in #610
- [Feature] Support Microsoft Phi3 4K&128K Instruct Models by @pppppM in #603
- [Fix] set `dataloader_num_workers=4` for llava training by @LZHgrla in #611
- [Fix] Do not set `attn_implementation` to `flash_attention_2` or `sdpa` if users already set it in XTuner configs by @HIT-cwh in #609
- [Release] LLaVA-Phi-3-mini by @LZHgrla in #615
- Update README.md by @eltociear in #608
- [Feature] Refine sp api by @HIT-cwh in #619
- [Feature] Add conversion scripts for LLaVA-Llama-3-8B by @LZHgrla in #618
- [Fix] Convert nan to 0 just for logging by @HIT-cwh in #625
- [Docs] Delete colab and add speed benchmark by @HIT-cwh in #617
- [Feature] Support dsz3+qlora by @HIT-cwh in #600
- [Feature] Add qwen1.5 110b cfgs by @HIT-cwh in #632
- check transformers version before dispatch by @HIT-cwh in #672
- [Fix] `convert_xtuner_weights_to_hf` with frozen ViT by @LZHgrla in #661
- [Fix] Fix batch-size setting of single-card LLaVA-Llama-3-8B configs by @LZHgrla in #598
- [Feature] add HFCheckpointHook to auto save hf model after the whole training phase by @HIT-cwh in #621
- Remove test info in DatasetInfoHook by @hhaAndroid in #622
- [Improve] Support `safe_serialization` saving by @LZHgrla in #648
- bump version to 0.1.19 by @HIT-cwh in #675
New Contributors
- @eltociear made their first contribution in #608
Full Changelog: v0.1.18...v0.1.19
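Several entries in this release tweak config values, e.g. `dataloader_num_workers=4` in #611. For readers unfamiliar with XTuner's MMEngine-style Python configs, here is a hypothetical fragment sketching where such a setting lives; all field names and values besides `dataloader_num_workers` are illustrative, not copied from the real LLaVA config:

```python
# Hypothetical excerpt of an XTuner/MMEngine-style config file.
# Only dataloader_num_workers=4 comes from PR #611; the surrounding
# dataloader fields are illustrative placeholders.
dataloader_num_workers = 4  # the value PR #611 settles on for LLaVA runs

train_dataloader = dict(
    batch_size=16,                                # per-device batch size (illustrative)
    num_workers=dataloader_num_workers,
    dataset=dict(type='LLaVADataset'),            # placeholder dataset spec
    collate_fn=dict(type='default_collate_fn'),   # placeholder collate fn
)
```

Raising `num_workers` above 0 lets the dataloader prefetch and decode images in background processes instead of on the training loop's critical path.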
XTuner Release V0.1.18
What's Changed
- set dev version by @LZHgrla in #537
- [Fix] Fix typo by @KooSung in #547
- [Feature] support mixtral varlen attn by @HIT-cwh in #564
- [Feature] Support qwen sp and varlen attn by @HIT-cwh in #565
- [Fix] Fix attention mask in `default_collate_fn` by @pppppM in #567
- Accept pytorch==2.2 as the bugs in triton 2.2 are fixed by @HIT-cwh in #548
- [Feature] Refine Sequence Parallel API by @HIT-cwh in #555
- [Fix] Enhance `split_list` to support `value` at the beginning by @LZHgrla in #568
- [Feature] Support cohere by @HIT-cwh in #569
- [Fix] Fix rotary_seq_len in varlen attn in qwen by @HIT-cwh in #574
- [Docs] Add sequence parallel related to readme by @HIT-cwh in #578
- [Bug] `SUPPORT_FLASH1 = digit_version(torch.version) >= digit_version('2…` by @HIT-cwh in #587
- [Feature] Support Llama 3 by @LZHgrla in #585
- [Docs] Add llama3 8B readme by @HIT-cwh in #588
- [Bugs] Check whether cuda is available when choosing torch_dtype in sft.py by @HIT-cwh in #577
- [Bugs] fix bugs in tokenize_ftdp_datasets by @HIT-cwh in #581
- [Feature] Support qwen moe by @HIT-cwh in #579
- [Docs] Add tokenizer to sft in Case 2 by @HIT-cwh in #583
- bump version to 0.1.18 by @HIT-cwh in #590
Full Changelog: v0.1.17...v0.1.18
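PR #587 in this release fixes a version gate of the form `SUPPORT_FLASH1 = digit_version(...) >= digit_version(...)`. As a rough, hypothetical sketch of how such a `digit_version` comparison works (this is not mmengine's actual implementation, just the idea):

```python
import re

def digit_version(version_str):
    """Parse 'X.Y.Z' into a comparable tuple of ints.

    A simplified, hypothetical stand-in for the digit_version helper
    referenced in PR #587: it keeps only the leading digits of each
    dot-separated chunk, so suffixes like 'rc1' are ignored.
    """
    parts = []
    for chunk in version_str.split('.'):
        m = re.match(r'\d+', chunk)
        parts.append(int(m.group()) if m else 0)
    return tuple(parts)

# Gate a fast attention path on an (illustrative) torch version string;
# tuple comparison gives correct ordering, unlike comparing raw strings.
SUPPORT_FLASH1 = digit_version('2.2.0') >= digit_version('2.0.0')
```

String comparison would wrongly rank '1.13.1' above '2.0.0' lexicographically at the second character; tuple-of-ints comparison does not.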
XTuner Release V0.1.17
What's Changed
- [Fix] Fix PyPI package by @LZHgrla in #540
- [Improve] Add LoRA fine-tuning configs for LLaVA-v1.5 by @LZHgrla in #536
- [Configs] Add sequence_parallel_size and SequenceParallelSampler to configs by @HIT-cwh in #538
- Check shape of attn_mask during attn forward by @HIT-cwh in #543
- bump version to v0.1.17 by @LZHgrla in #542
Full Changelog: v0.1.16...v0.1.17
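PR #538 above wires `sequence_parallel_size` and `SequenceParallelSampler` into the shipped configs. A hypothetical config fragment showing roughly where those two names appear; everything besides those two names is illustrative:

```python
# Hypothetical XTuner config fragment; only sequence_parallel_size and
# SequenceParallelSampler are taken from the PR #538 title, the
# surrounding fields and values are illustrative.
sequence_parallel_size = 2  # shard each training sequence across 2 GPUs

train_dataloader = dict(
    batch_size=1,
    num_workers=0,
    sampler=dict(type='SequenceParallelSampler', shuffle=True),
)
```

The sampler has to cooperate with the parallel size so that the ranks sharing one sequence all receive the same sample indices.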
XTuner Release V0.1.16
What's Changed
- set dev version by @LZHgrla in #487
- Fix type error when the visual encoder is not CLIP by @hhaAndroid in #496
- [Feature] Support Sequence parallel by @HIT-cwh in #456
- [Bug] Fix bugs in flash_attn1_pytorch by @HIT-cwh in #513
- [Fix] delete cat in varlen attn by @HIT-cwh in #508
- bump version to 0.1.16 by @HIT-cwh in #520
- [Improve] Add `generation_kwargs` for `EvaluateChatHook` by @LZHgrla in #501
- [Bugs] Fix bugs when training in non-distributed env by @HIT-cwh in #522
- [Fix] Support transformers>=4.38 and require transformers>=4.36.0 by @HIT-cwh in #494
- [Fix] Fix throughput hook by @HIT-cwh in #527
- Update README.md by @JianxinDong in #528
- [Fix] dispatch internlm rope by @HIT-cwh in #530
- Limit transformers != 4.38 by @HIT-cwh in #531
New Contributors
- @hhaAndroid made their first contribution in #496
- @JianxinDong made their first contribution in #528
Full Changelog: v0.1.15...v0.1.16
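Among the items above, PR #501 adds `generation_kwargs` to `EvaluateChatHook`. A hypothetical hook configuration sketching how such kwargs might be passed through; every field besides `EvaluateChatHook` and `generation_kwargs` is illustrative:

```python
# Hypothetical hook config; only EvaluateChatHook and generation_kwargs
# come from the PR #501 title, the remaining fields are illustrative.
custom_hooks = [
    dict(
        type='EvaluateChatHook',
        evaluation_inputs=['Give three tips for staying healthy.'],
        every_n_iters=500,
        generation_kwargs=dict(   # forwarded to the model's generate()
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
        ),
    ),
]
```

Exposing the kwargs in config means sampling settings for the periodic eval chats can be changed without touching the hook code.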
XTuner Release V0.1.15
What's Changed
- set dev version by @LZHgrla in #437
- [Bugs] Fix bugs when using EpochBasedRunner by @HIT-cwh in #439
- [Feature] Support processing ftdp dataset and custom dataset offline by @HIT-cwh in #410
- Update prompt_template.md by @aJupyter in #441
- [Doc] Split finetune_custom_dataset.md to 6 parts by @HIT-cwh in #445
- [Improve] Add notes for demo_data examples by @LZHgrla in #458
- [Fix] Gemma prompt_template by @LZHgrla in #454
- [Feature] Add LLaVA-InternLM2-1.8B by @LZHgrla in #449
- show more info about datasets by @amulil in #464
- [Fix] write text with `encoding='utf-8'` by @LZHgrla in #477
- support offline process llava data by @HIT-cwh in #448
- [Fix] `msagent_react_map_fn` error by @LZHgrla in #470
- [Improve] Reorg `xtuner/configs/llava/` configs by @LZHgrla in #483
- limit pytorch version <= 2.1.2 as there may be some bugs in triton2… by @HIT-cwh in #452
- [Fix] fix batch sampler bs by @HIT-cwh in #468
- bump version to v0.1.15 by @LZHgrla in #486
Full Changelog: v0.1.14...v0.1.15
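PR #477 above pins text writing to `encoding='utf-8'`. A minimal sketch of why the explicit encoding matters; the helper names are hypothetical, not XTuner's:

```python
import os
import tempfile

def write_text(path, s):
    """Write `s` with an explicit UTF-8 encoding.

    Hypothetical helper mirroring the PR #477 fix: without
    encoding='utf-8', open() falls back to the platform default
    (e.g. cp1252 on Windows), which can raise UnicodeEncodeError or
    corrupt non-ASCII characters in saved outputs.
    """
    with open(path, 'w', encoding='utf-8') as f:
        f.write(s)

def read_text(path):
    # Read back with the same explicit encoding for a lossless round-trip.
    with open(path, encoding='utf-8') as f:
        return f.read()
```

Datasets and chat transcripts routinely contain non-ASCII text, so the default codec is not safe to rely on.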
XTuner Release V0.1.14
What's Changed
- set dev version by @LZHgrla in #341
- [Feature] More flexible `TrainLoop` by @LZHgrla in #348
- [Feature] Support CEPH by @pppppM in #266
- [Improve] Add `--repetition-penalty` for `xtuner chat` by @LZHgrla in #351
- [Feature] Support MMBench DDP Evaluate by @pppppM in #300
- [Fix] `KeyError` of `encode_fn` by @LZHgrla in #361
- [Fix] Fix `batch_size` of full fine-tuning LLaVA-InternLM2 by @LZHgrla in #360
- [Fix] Remove `system` for `alpaca_map_fn` by @LZHgrla in #363
- [Fix] Use `DEFAULT_IMAGE_TOKEN` instead of `'<image>'` by @LZHgrla in #353
- [Feature] Support internlm sft by @HIT-cwh in #302
- [Fix] Add `attention_mask` for `default_collate_fn` by @LZHgrla in #371
- [Fix] Update requirements by @LZHgrla in #369
- [Fix] Fix rotary_base, add `colors_map_fn` to `DATASET_FORMAT_MAPPING` and rename 'internlm_repo' to 'intern_repo' by @HIT-cwh in #372
- update by @HIT-cwh in #377
- Delete useless codes and refactor process_untokenized_datasets by @HIT-cwh in #379
- [Feature] support flash attn 2 in internlm1, internlm2 and llama by @HIT-cwh in #381
- [Fix] Fix installation docs of mmengine in `intern_repo_dataset.md` by @LZHgrla in #384
- [Fix] Update InternLM2 `apply_rotary_pos_emb` by @LZHgrla in #383
- [Feature] support saving eval output before save checkpoint by @HIT-cwh in #385
- fix lr scheduler setting by @gzlong96 in #394
- [Fix] Remove pre-defined `system` of `alpaca_zh_map_fn` by @LZHgrla in #395
- [Feature] Support `Qwen1.5` by @LZHgrla in #407
- [Fix] Fix no space in chat output using InternLM2 (#357) by @KooSung in #404
- [Fix] typo: `--system-prompt` to `--system-template` by @LZHgrla in #406
- [Improve] Add `output_with_loss` for dataset process by @LZHgrla in #408
- [Fix] Fix dispatch to support transformers>=4.36 & Add USE_TRITON_KERNEL environment variable by @HIT-cwh in #411
- [Feature] Add InternLM2-Chat-1_8b full config by @KMnO4-zx in #396
- [Fix] Fix extract_json_objects by @fanqiNO1 in #419
- [Fix] Fix pth_to_hf error by @LZHgrla in #426
- [Feature] Support `Gemma` by @PommesPeter in #429
- add refcoco to llava by @LKJacky in #425
- [Fix] Inconsistent BatchSize of `LengthGroupedSampler` by @LZHgrla in #436
- bump version to v0.1.14 by @LZHgrla in #431
New Contributors
- @gzlong96 made their first contribution in #394
- @KooSung made their first contribution in #404
- @KMnO4-zx made their first contribution in #396
- @fanqiNO1 made their first contribution in #419
- @PommesPeter made their first contribution in #429
- @LKJacky made their first contribution in #425
Full Changelog: v0.1.13...v0.1.14
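One of the fixes above touches `LengthGroupedSampler` (#436). Its core idea can be sketched in a few lines of plain Python; this is a hypothetical simplification, as the real sampler also shuffles groups and handles distributed sharding:

```python
def length_grouped_indices(lengths, group_size):
    """Group sample indices so each batch holds similar-length samples.

    Hypothetical sketch of the idea behind LengthGroupedSampler:
    sorting indices by sample length before chunking minimizes the
    padding needed inside each batch.
    """
    # Indices ordered by ascending sample length.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    # Chunk the sorted order into batches of group_size.
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]
```

With a wild length mix, a random batch pads every sample up to its longest member; grouping by length keeps batch members close in size.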
XTuner Release V0.1.13
What's Changed
- set dev version by @LZHgrla in #329
- [Docs] Add LLaVA-InternLM2 results by @LZHgrla in #332
- Update internlm2_chat template by @RangiLyu in #339
- [Fix] Fix examples demo_data configs by @LZHgrla in #334
- bump version to v0.1.13 by @LZHgrla in #340
Full Changelog: v0.1.12...v0.1.13
XTuner Release V0.1.12
What's Changed
- set dev version by @LZHgrla in #281
- [Fix] Update LLaVA results by @LZHgrla in #283
- [Fix] Update LLaVA results (based on VLMEvalKit) by @LZHgrla in #285
- [Fix] Fix filter bug for test data by @LZHgrla in #293
- [Fix] Fix `ConcatDataset` by @LZHgrla in #298
- [Improve] Redesign the `prompt_template` by @LZHgrla in #294
- [Fix] Fix errors about `stop_words` by @LZHgrla in #313
- [Fix] Fix Mixtral LoRA setting by @LZHgrla in #312
- [Feature] Support DeepSeek-MoE by @LZHgrla in #311
- [Fix] Set `torch.optim.AdamW` as the default optimizer by @LZHgrla in #318
- [Fix] Fix `pth_to_hf` for LLaVA model by @LZHgrla in #316
- [Improve] Add `demo_data` examples by @LZHgrla in #278
- [Feature] Support InternLM2 by @LZHgrla in #321
- [Fix] Fix the resume of seed by @LZHgrla in #309
- [Feature] Accelerate `xtuner xxx` by @pppppM in #307
- [Fix] Fix InternLM2 url by @LZHgrla in #325
- [Fix] Limit the version of python, `>=3.8, <3.11` by @LZHgrla in #327
- [Fix] Add `trust_remote_code=True` for AutoModel by @LZHgrla in #328
- [Docs] Improve README by @LZHgrla in #326
- bump version to v0.1.12 by @pppppM in #323
Full Changelog: v0.1.11...v0.1.12
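PR #328 above adds `trust_remote_code=True` when loading models through AutoModel; Hugging Face requires this opt-in before executing a repo's custom modeling code, as InternLM2 ships. A hypothetical model spec in XTuner's config style; every field besides `trust_remote_code=True` is illustrative:

```python
# Hypothetical model fragment; only trust_remote_code=True comes from
# the PR #328 title, the model id and other fields are illustrative.
llm = dict(
    type='AutoModelForCausalLM.from_pretrained',
    pretrained_model_name_or_path='internlm/internlm2-chat-7b',
    trust_remote_code=True,  # allow the repo's custom modeling code to run
)
```

Without the flag, loading a model whose architecture lives in the checkpoint repo rather than in transformers itself fails with a prompt to pass `trust_remote_code=True`.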
XTuner Release V0.1.11
What's Changed
- [Docs] Update Mixtral 8x7b docs by @LZHgrla in #265
- [Bug] Fix bugs when chat with --lagent by @ooooo-create in #269
- [Feature] Support setting the random seed for
xtuner train
by @LZHgrla in #272 - [Fix] Update Mixtral-8x7b repo_id; Add mixtral template by @LZHgrla in #275
- [Feature] Add Qwen 72b config by @xiaohangguo in #254
- [Improve] Add notes for requirements; Improve badges by @LZHgrla in #277
- [Feature] Support LLaVA by @LZHgrla in #196
- [Feature] Add `warmup` for all configs by @LZHgrla in #274
- bump version to v0.1.11 by @LZHgrla in #280
New Contributors
- @ooooo-create made their first contribution in #269
Full Changelog: v0.1.10...v0.1.11
XTuner Release V0.1.10
What's Changed
- [Feature] Support for full-scale fine-tuning of large language models such as Llama2 70B. by @HIT-cwh in #231
- [Feature] Support to process internlm-style datasets by @HIT-cwh in #232
- [Fix] Fix bugs of llama dispatch by @LZHgrla in #229
- [Bug] Resolve the bug introduced by higher versions of DeepSpeed. by @HIT-cwh in #240
- [Doc] Add internlm dataset doc by @HIT-cwh in #242
- add `wizardcoder` template by @xiaohangguo in #243
- [Feature] Filter negative labels by @xiaohangguo in #244
- [Bug] Support auto detect torch_dtype in chat.py by @HIT-cwh in #250
- [Feature] Add Qwen 1.8b config by @xiaohangguo in #252
- [Feature] Add Deepseekcoder config by @xiaohangguo in #253
- [Bug] Fix bugs when grad clip == 0 by @HIT-cwh in #262
- [Feature] Support Mixtral 8x7b by @pppppM in #263
- bump version to v0.1.10 by @pppppM in #264
New Contributors
- @xiaohangguo made their first contribution in #243
- @pppppM made their first contribution in #263
Full Changelog: v0.1.9...v0.1.10
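PR #250 above auto-detects `torch_dtype` in chat.py. The decision logic can be sketched as a pure function; this is a hypothetical helper, as the real code queries `torch.cuda` directly:

```python
def auto_torch_dtype(cuda_available: bool, bf16_supported: bool) -> str:
    """Pick a dtype name from hardware capabilities.

    Hypothetical sketch of the auto-detection idea in PR #250: no GPU
    means fall back to full precision rather than erroring out; on GPU,
    prefer bfloat16 where supported (Ampere and newer), else float16.
    """
    if not cuda_available:
        return 'float32'
    return 'bfloat16' if bf16_supported else 'float16'
```

Hard-coding float16 breaks CPU-only inference, which is why a capability check like this is needed at all.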