
Why do the LLaMA training scripts use different conv templates for pretraining and finetuning? #89

Open
shidingz opened this issue Apr 24, 2024 · 1 comment


@shidingz

https://github.com/dvlab-research/MGM/blob/main/scripts/llama/train/stage_1_2_full_v7b_336_hr_768.sh
In this script, pretraining uses --version plain,
while finetuning uses --version v1.
Won't the model get confused by this inconsistency between the two stages?

@yanwei-li
Member

Hi, for LLaMA 7B and 13B we follow the instruction format used in LLaVA. In the pretraining stage the main focus is image captioning, so the plain style works well there.
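
For illustration only, here is a minimal, hypothetical sketch (not the actual MGM/LLaVA source) of how the two prompt styles typically differ in LLaVA-style training code: the plain template simply pairs the image placeholder with the target caption, while the v1 (Vicuna-style) template wraps the data in a system prompt plus USER/ASSISTANT turns. Function names and exact separators below are assumptions.

```python
# Hypothetical sketch of the "plain" vs "v1" prompt formats (assumed names/separators).

def build_plain_prompt(caption: str) -> str:
    # "plain" style (pretraining): no roles, no system prompt,
    # just the image placeholder followed by the caption target.
    return "<image>" + caption + "\n"

def build_v1_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    # "v1" (Vicuna-style) template (finetuning): system prompt
    # plus alternating USER / ASSISTANT turns.
    prompt = system + " "
    for user_msg, assistant_msg in turns:
        prompt += f"USER: {user_msg} ASSISTANT: {assistant_msg}</s>"
    return prompt

if __name__ == "__main__":
    print(build_plain_prompt("A dog playing in the park."))
    print(build_v1_prompt(
        "A chat between a curious user and an artificial intelligence assistant.",
        [("<image>\nWhat is the dog doing?", "It is playing in the park.")],
    ))
```

Since the pretraining stage only trains the projector on caption pairs, the simpler plain format is sufficient there, and the instruction-style v1 format is introduced when the full model is instruction-tuned.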
