Internlm2-chat-7b shows catastrophic forgetting after single-turn dialogue fine-tuning, asking for help #516

Open
Egber1t opened this issue Feb 1, 2024 · 1 comment


Egber1t commented Feb 1, 2024

[image attached in the original issue]

max_length = 2048
pack_to_max_length = True

# Scheduler & Optimizer

batch_size = 1 # per_device
accumulative_counts = 16
dataloader_num_workers = 0
max_epochs = 3
optim_type = AdamW
lr = 2e-4
betas = (0.9, 0.999)
weight_decay = 0
max_norm = 1 # grad clip
warmup_ratio = 0.03
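
For readers less familiar with XTuner's flat config style, these settings give an effective per-device batch size of batch_size * accumulative_counts = 1 * 16 = 16, AdamW as the optimizer, linear warmup over the first 3% of steps, and gradient clipping at norm 1. A minimal plain-PyTorch sketch of the equivalent setup follows; XTuner assembles this internally, and `model` / `train_steps` below are hypothetical placeholders.

import torch
from torch.optim import AdamW

model = torch.nn.Linear(8, 8)   # stand-in module so the sketch runs; the real model is the LLM
train_steps = 1000              # hypothetical total number of training steps

# AdamW with the lr / betas / weight_decay values from the config above.
optimizer = AdamW(model.parameters(), lr=2e-4, betas=(0.9, 0.999), weight_decay=0)

# Linear warmup over the first warmup_ratio = 3% of steps.
warmup_steps = int(0.03 * train_steps)
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1e-3, total_iters=warmup_steps)

# Gradient clipping at max_norm = 1 before each optimizer step.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)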

# Save

save_steps = 500
save_total_limit = 2 # Maximum checkpoints to keep (-1 means unlimited)

# Evaluate the generation performance during the training

evaluation_freq = 500
SYSTEM = ''
evaluation_inputs = [
    '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai'
]
tokenizer = dict(
    type=AutoTokenizer.from_pretrained,
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    trust_remote_code=True,
    padding_side='right')
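
In plain `transformers` terms, this tokenizer block is roughly equivalent to the call below (a sketch; it assumes `pretrained_model_name_or_path` is defined earlier in the config, as in the full XTuner template).

from transformers import AutoTokenizer

# Right-side padding and trust_remote_code match the dict config above.
tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path, trust_remote_code=True, padding_side='right')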

model = dict(
    type=SupervisedFinetune,
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=pretrained_model_name_or_path,
        trust_remote_code=True,
        torch_dtype=torch.float16,
        quantization_config=dict(
            type=BitsAndBytesConfig,
            load_in_4bit=True,
            load_in_8bit=False,
            llm_int8_threshold=6.0,
            llm_int8_has_fp16_weight=False,
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type='nf4')),
    lora=dict(
        type=LoraConfig,
        r=64,
        lora_alpha=16,
        lora_dropout=0.1,
        bias='none',
        task_type='CAUSAL_LM'))
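
The `llm` and `lora` blocks above amount to a standard QLoRA setup: the base model is loaded in 4-bit NF4 with fp16 compute, and a rank-64 LoRA adapter is trained on top while the base weights stay frozen. A rough HuggingFace + PEFT equivalent is sketched below (an illustration only, not what XTuner runs verbatim; the model path is assumed to be the InternLM2-chat-7B HF repo).

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

pretrained_model_name_or_path = 'internlm/internlm2-chat-7b'  # assumed base model path

# 4-bit NF4 quantization with double quantization and fp16 compute, as in the config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4')

base = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    quantization_config=bnb_config)

# Rank-64 LoRA adapter with alpha 16 and dropout 0.1; only the adapter weights are trained.
lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1, bias='none', task_type='CAUSAL_LM')
model = get_peft_model(base, lora_config)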

MING-ZCH (Collaborator) commented

How much training data do you have?
