@Zasder3
Last active March 4, 2022 10:26
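# Total number of processes: 12 nodes x 8 processes per node
export WORLD_SIZE=96
# Address and port of the master (rendezvous) node
export MASTER_ADDR=172.31.207.212
export MASTER_PORT=13820
cd open_clip
# Make the open_clip sources importable
export PYTHONPATH="$PYTHONPATH:$PWD/src"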
torchrun --nnodes=12 --nproc_per_node=8 --rdzv_id=42 --rdzv_backend=c10d --rdzv_endpoint="$MASTER_ADDR" \
    src/training/main.py \
    --save-frequency 1 \
    --report-to wandb \
    --train-data="pipe:aws s3 cp --quiet s3://laion-us-east-1/laion-data/laion2B-data/{000000..041455}.tar -" \
    --dataset-type="webdataset" \
    --model=ViT-B/32 \
    --batch-size=256 \
    --warmup=2000 \
    --workers=8 \
    --local-loss \
    --gather-with-grad \
    --dist-url="env://"