arawxx/FSDP-Distributed-Training-of-ConvNextV2-on-CIFAR10


FSDP Distributed Training of ConvNextV2 on CIFAR10

A script for training ConvNextV2 on the CIFAR10 dataset using PyTorch's Fully Sharded Data Parallel (FSDP) technique for distributed training. Launch run.py with torchrun. For example, the following command starts two processes on a single node: torchrun --nnodes 1 --nproc_per_node 2 run.py
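As background for the torchrun command, here is a minimal sketch of how a script launched this way typically discovers its place in the job and wraps a model with FSDP. torchrun exports the RANK, LOCAL_RANK, and WORLD_SIZE environment variables to every process it starts. The names `DistEnv`, `read_dist_env`, and `wrap_with_fsdp` are hypothetical and are not taken from run.py, and the NCCL backend and one-GPU-per-process layout are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DistEnv:
    rank: int        # global index of this process across all nodes
    local_rank: int  # index of this process on its own node (used to pick a GPU)
    world_size: int  # total number of processes in the job


def read_dist_env(environ=os.environ) -> DistEnv:
    """Read the variables torchrun exports; fall back to a single-process setup."""
    return DistEnv(
        rank=int(environ.get("RANK", 0)),
        local_rank=int(environ.get("LOCAL_RANK", 0)),
        world_size=int(environ.get("WORLD_SIZE", 1)),
    )


def wrap_with_fsdp(model, env: DistEnv):
    """Hypothetical helper: join the process group and shard `model` with FSDP."""
    # Imported lazily so the environment helper above works without PyTorch.
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Assumption: GPU training with the NCCL backend. By default the process
    # group is initialized from the environment variables set by torchrun.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(env.local_rank)
    device = torch.device("cuda", env.local_rank)
    # FSDP shards parameters, gradients, and optimizer state across processes.
    return FSDP(model.to(device))
```

With `--nnodes 1 --nproc_per_node 2`, each of the two processes would see WORLD_SIZE=2 and a distinct RANK and LOCAL_RANK of 0 or 1.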

run.py script arguments include:

--batch-size
--epochs
--lr
--gamma
--no-cuda
--seed
--run_validation
--save-model

For a description of each argument, run the script with the --help argument.
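The argument list above could be declared with argparse roughly as follows. This is a sketch only: the flag names come from the list, but the types, defaults, and help strings are assumptions and may differ from what run.py actually defines. In particular, treating `--gamma` as a learning-rate decay factor and `--run_validation` as a boolean switch is a guess.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed for run.py."""
    parser = argparse.ArgumentParser(
        description="FSDP training of ConvNextV2 on CIFAR10"
    )
    # All defaults below are illustrative assumptions, not the script's values.
    parser.add_argument("--batch-size", type=int, default=64,
                        help="training batch size per process")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("--lr", type=float, default=0.01,
                        help="initial learning rate")
    parser.add_argument("--gamma", type=float, default=0.7,
                        help="learning-rate decay factor (assumed meaning)")
    parser.add_argument("--no-cuda", action="store_true",
                        help="disable CUDA training")
    parser.add_argument("--seed", type=int, default=1,
                        help="random seed")
    parser.add_argument("--run_validation", action="store_true",
                        help="run validation after each epoch (assumed meaning)")
    parser.add_argument("--save-model", action="store_true",
                        help="save the trained model")
    return parser
```

Because torchrun forwards everything after the script name to the script, arguments can be appended to the launch command, e.g. torchrun --nnodes 1 --nproc_per_node 2 run.py --epochs 5 --lr 0.001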
