
chatglm3, LoRA fine-tuning error #116

Open
Jsonzhang20 opened this issue May 15, 2024 · 1 comment

Comments

@Jsonzhang20

[screenshot of the error traceback]
The dataset is the huanhuan.json provided with the project.

The relevant parameter settings are as follows:

from transformers import DataCollatorForSeq2Seq, TrainingArguments

data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=-100,
    pad_to_multiple_of=None,
    padding=False
)
# Custom TrainingArguments
args = TrainingArguments(
    output_dir="output/ChatGLM",       # model output path
    num_train_epochs=1,                # number of epochs
    per_device_train_batch_size=1,     # batch size
    gradient_accumulation_steps=8,     # gradient accumulation; if VRAM is limited, keep batch_size small and raise the accumulation steps
    logging_steps=5,                   # log every N steps
    save_steps=100,                    # save a checkpoint every N steps
    save_strategy='steps',
    # max_steps=5,                     # total training steps; the official recommendation is 52000
    learning_rate=1e-4,
    # gradient_checkpointing=True      # gradient checkpointing; once enabled, the model must call model.enable_input_require_grads()
)
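For reference, the rest of the fine-tuning setup is assembled roughly like the sketch below. This is a minimal sketch assuming the standard peft/transformers API; the LoRA hyperparameters and the variable name tokenized_dataset are placeholders, not the exact values from the tutorial:

from peft import LoraConfig, TaskType, get_peft_model
from transformers import Trainer

# LoRA configuration (hyperparameters are illustrative only)
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["query_key_value"],  # assumed attention projection module name for ChatGLM3
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
model = get_peft_model(model, lora_config)
model.enable_input_require_grads()  # required if gradient_checkpointing=True is enabled above

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_dataset,  # placeholder: the tokenized huanhuan.json dataset
    data_collator=data_collator,
)
trainer.train()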

@KMnO4-zx
Contributor

The Windows environment is too messy and all kinds of strange bugs keep showing up. We recommend working through this tutorial on Linux, or using the same AutoDL environment as the tutorial.
