@mkocabas
Created April 9, 2018 09:41
Download COCO dataset. Run under 'datasets' directory.
mkdir coco
cd coco
mkdir images
cd images
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/test2017.zip
wget http://images.cocodataset.org/zips/unlabeled2017.zip
unzip train2017.zip
unzip val2017.zip
unzip test2017.zip
unzip unlabeled2017.zip
rm train2017.zip
rm val2017.zip
rm test2017.zip
rm unlabeled2017.zip
cd ../
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip
wget http://images.cocodataset.org/annotations/image_info_test2017.zip
wget http://images.cocodataset.org/annotations/image_info_unlabeled2017.zip
unzip annotations_trainval2017.zip
unzip stuff_annotations_trainval2017.zip
unzip image_info_test2017.zip
unzip image_info_unlabeled2017.zip
rm annotations_trainval2017.zip
rm stuff_annotations_trainval2017.zip
rm image_info_test2017.zip
rm image_info_unlabeled2017.zip
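
Once the script above has finished, a quick sanity check of the resulting tree can catch a download that silently failed. The sketch below is only an illustration: `check_coco_layout` is a hypothetical helper name, and it assumes each image archive unpacks into a directory named after it (for example `train2017.zip` into `images/train2017`) and that the annotation archives unpack into `annotations/`.

```shell
# check_coco_layout is a hypothetical helper, not part of the gist.
# Assumption: image archives unpack into images/<name>2017 and the
# annotation archives unpack into annotations/.
check_coco_layout() (
  root=${1:-coco}   # dataset root; defaults to ./coco
  missing=0
  for dir in images/train2017 images/val2017 images/test2017 \
             images/unlabeled2017 annotations; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $root/$dir"
      missing=1
    fi
  done
  # Exit status 0 when every directory exists, 1 otherwise.
  exit "$missing"
)
```

Run from the 'datasets' directory, `check_coco_layout coco` prints one `missing: ...` line per absent directory and returns a non-zero status if anything is missing.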
@bit-scientist

The exact same script, but with the modification proposed by @buttercutter, @ben-xD and @AlessandroMondin (memory constraint).

First, open three separate shells and lay them out for convenient use:

On the first one, do the following (you may need to use sudo if a permission-denied error appears):

mkdir coco
cd coco
mkdir images
cd images
wget -c http://images.cocodataset.org/zips/train2017.zip

On the second:

cd coco/images/
wget -c http://images.cocodataset.org/zips/val2017.zip
wget -c http://images.cocodataset.org/zips/test2017.zip

Note that you will need to press Enter on shell 2 to start the test2017.zip download after val2017.zip finishes.

On the third:

cd coco/images/
wget -c http://images.cocodataset.org/zips/unlabeled2017.zip

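Wait a little while (or do some five-minute stretching 😄) until the processes on shells one and two finish. Shell three must also be done before you unzip unlabeled2017.zip, since unzip fails on a partial download.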

Back on the first shell, issue the following:

unzip train2017.zip
unzip val2017.zip
unzip test2017.zip
unzip unlabeled2017.zip

rm train2017.zip
rm val2017.zip
rm test2017.zip
rm unlabeled2017.zip 

On the second shell:

cd ../
wget -c http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget -c http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip
wget -c http://images.cocodataset.org/annotations/image_info_test2017.zip
wget -c http://images.cocodataset.org/annotations/image_info_unlabeled2017.zip

unzip annotations_trainval2017.zip
rm annotations_trainval2017.zip

unzip stuff_annotations_trainval2017.zip
rm stuff_annotations_trainval2017.zip

unzip image_info_test2017.zip
rm image_info_test2017.zip

unzip image_info_unlabeled2017.zip
rm image_info_unlabeled2017.zip

By the time shells one and two finish, shell three will have finished its job. Hope it helps save some time. 🤝
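
If disk space is the tight resource, one option is to handle a single archive at a time, so that at most one zip file sits on disk. This is my own reading of the constraint, not necessarily what the commenters above proposed, and `fetch_extract_remove` and `fetch_all` are hypothetical helper names. The sketch uses the same `wget -c` as above, extracts quietly with `unzip -q`, and deletes the archive only when extraction succeeded.

```shell
# fetch_extract_remove and fetch_all are hypothetical helpers,
# not part of the gist.

# Download (or resume) one archive, extract it, then delete it.
# The archive is kept if the download or the extraction fails.
fetch_extract_remove() (
  url=$1
  zip=${url##*/}            # file name = last path component of the URL
  wget -c "$url" || exit 1
  unzip -q "$zip" || exit 1
  rm "$zip"
)

# Process every URL given as an argument, one after another,
# stopping at the first failure.
fetch_all() (
  for url in "$@"; do
    fetch_extract_remove "$url" || exit 1
  done
)
```

For example, from inside coco/images: `fetch_all http://images.cocodataset.org/zips/val2017.zip http://images.cocodataset.org/zips/test2017.zip`. This trades the speed of parallel downloads for a lower peak disk footprint.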

@M0E313

M0E313 commented Mar 18, 2024

Same script, but instead of a 5-minute stretch, try doing squats.

@shwu-nyunai

Everything above, but Claude Haiku-fied with the prompt: <script> + "separate and run them in parallel (maybe background processes)".

# Create the coco directory and cd into it
mkdir coco
cd coco

# Create the images directory and cd into it
mkdir images
cd images

# Download the dataset zip files in parallel
wget http://images.cocodataset.org/zips/train2017.zip &
wget http://images.cocodataset.org/zips/val2017.zip &
wget http://images.cocodataset.org/zips/test2017.zip &
wget http://images.cocodataset.org/zips/unlabeled2017.zip &
wait

# Unzip the dataset zip files in parallel
unzip train2017.zip &
unzip val2017.zip &
unzip test2017.zip &
unzip unlabeled2017.zip &
wait

# Remove the zip files
rm train2017.zip
rm val2017.zip
rm test2017.zip
rm unlabeled2017.zip

# Go back to the coco directory
cd ../

# Download the annotation zip files in parallel
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip &
wget http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip &
wget http://images.cocodataset.org/annotations/image_info_test2017.zip &
wget http://images.cocodataset.org/annotations/image_info_unlabeled2017.zip &
wait

# Unzip the annotation zip files in parallel
unzip annotations_trainval2017.zip &
unzip stuff_annotations_trainval2017.zip &
unzip image_info_test2017.zip &
unzip image_info_unlabeled2017.zip &
wait

# Remove the zip files
rm annotations_trainval2017.zip
rm stuff_annotations_trainval2017.zip
rm image_info_test2017.zip
rm image_info_unlabeled2017.zip
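
One caveat with the script above: a bare `wait` does not report whether any background job failed, so the `rm` lines run even after a broken download or extraction. A possible workaround, sketched here with a hypothetical helper `parallel_each`, is to remember each job's PID and `wait` on them individually, since `wait PID` returns that job's exit status.

```shell
# parallel_each is a hypothetical helper, not part of the gist.
# Usage: parallel_each CMD ITEM...
# Runs "CMD ITEM" in the background for every ITEM, then waits for each
# job and returns non-zero if at least one of them failed.
parallel_each() (
  cmd=$1
  shift
  pids=""
  for item in "$@"; do
    "$cmd" "$item" &
    pids="$pids $!"          # remember the PID of the job just started
  done
  status=0
  for pid in $pids; do
    wait "$pid" || status=1  # wait PID yields that job's exit status
  done
  exit "$status"
)
```

With this, the cleanup can be made conditional, for example `parallel_each unzip train2017.zip val2017.zip && rm train2017.zip val2017.zip`, so the archives survive if any extraction fails.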
