Hi,

I am trying to train the Phi-3 mini model with a longer context length (8192) than its default of 4096. I understand that RoPE scaling is not supported for models with a sliding window. How can I proceed from here to train a Phi-3 model with a longer context? Should I fine-tune the base model to extend its context length? Which methods can I use? Is there a plan to support this in the future?
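For context, the usual way RoPE scaling extends a context window is linear position interpolation: positions are divided by a scale factor so that a longer sequence maps back into the angle range the model was trained on. The sketch below is purely illustrative math (not Unsloth's or Phi-3's implementation, which Unsloth reports as unsupported here); the function names are hypothetical.

```python
import math

def rope_frequencies(dim, base=10000.0):
    # Inverse frequencies for each rotary dimension pair (hypothetical helper).
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def rope_angles(position, dim, base=10000.0, scale=1.0):
    # Linear position interpolation: dividing the position by `scale`
    # compresses an extended context back into the trained angle range.
    p = position / scale
    return [p * f for f in rope_frequencies(dim, base)]

# With scale=2.0, position 8191 produces the same rotation angles
# as position 4095.5 does without scaling.
a = rope_angles(8191, 64, scale=2.0)
b = rope_angles(4095.5, 64, scale=1.0)
assert all(math.isclose(x, y) for x, y in zip(a, b))
```

This illustrates why interpolation alone is not enough in practice: the model still needs fine-tuning at the longer length, and sliding-window attention (as in Phi-3 mini / Mistral-style models) complicates the scheme, which is presumably why Unsloth blocks it.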
AlgorithmError: ExecuteUserScriptError: ExitCode 1
ErrorMessage: "raise RuntimeError( RuntimeError: Unsloth: Unfortunately Mistral type models do not support RoPE scaling! The maximum sequence length supported is 4096."
Command: "/opt/conda/bin/python3.10 run_unsloth.py --bf16 True --dataset_path /opt/ml/input/data/training --eval_steps 1000 --evaluation_strategy steps --fp16 False --gradient_accumulation_steps 2 --gradient_checkpointing True --learning_rate 0.0002 --load_in_4bit True --logging_dir /opt/ml/output/tensorboard --logging_steps 10 --lr_scheduler_type linear --max_seq_length 8192 --model_name unsloth/Phi-3-mini-4k-instruct-bnb-4kit --neftune_noise_alpha 5 --num_train_epochs 2 --optim adamw_8bit --output_dir /opt/ml/checkpoints --per_device_eval_batch_size 6 --per_device_train_batch_size 6 --report_to tensorboard --save_strategy epoch --seed 3407 --train_filename train.parquet --validation_filename val.parquet --warmup_steps 5 --weight_decay 0.01", exit code: 1
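Going by the error text, the immediate workaround (short of true context extension) is to cap the sequence length at the 4096 maximum the model supports, i.e. change the flag in the logged command:

```shell
# Workaround sketch: keep max_seq_length within the supported limit.
# (Only the changed flag is shown; all other arguments stay as in the log.)
/opt/conda/bin/python3.10 run_unsloth.py \
    --max_seq_length 4096 \
    --model_name unsloth/Phi-3-mini-4k-instruct-bnb-4bit \
    # ... remaining flags unchanged from the original command ...
```

This trains within the default window rather than extending it; actually reaching 8192 tokens would require a model/library combination that supports RoPE scaling with sliding-window attention.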