@nkashy1 nkashy1/stanford-dogs.bash

Last active Sep 23, 2019
Upload Stanford Dogs dataset to S3 and register against Simiotics Data Registry
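#!/usr/bin/env bash
# Register every Stanford Dogs image with the Simiotics Data Registry, tagged by breed.

# Directory containing one subdirectory per breed, e.g. n02085620-Chihuahua
STANFORD_DOG_IMAGES_DIR=${STANFORD_DOG_IMAGES_DIR:-~/data/stanford-dogs/Images}
# Number of image paths passed to each simiotics_s3 invocation
BATCH_SIZE=${BATCH_SIZE:-100}
# Maximum concurrent invocations (xargs -P; with GNU xargs, 0 means as many as possible)
PARALLELISM=${PARALLELISM:-0}

if [ -z "${SIMIOTICS_SOURCE:-}" ]; then
  echo "ERROR: SIMIOTICS_SOURCE environment variable must be defined" >&2
  exit 1
fi

for dog_path in "$STANFORD_DOG_IMAGES_DIR"/*/; do
  dog_dir=$(basename "$dog_path")
  # Drop the ID prefix up to the first hyphen: n02085620-Chihuahua -> Chihuahua
  breed_tag=${dog_dir#*-}
  # NUL-delimited paths keep file names containing spaces intact
  printf '%s\0' "$dog_path"* |
    xargs -0 -n "$BATCH_SIZE" -P "$PARALLELISM" simiotics_s3 data register -s "$SIMIOTICS_SOURCE" -t "breed=$breed_tag"
done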
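
Before registering the whole dataset, it may help to preview the commands the loop above would run. The following is a minimal dry-run sketch, not part of the original gist: `breed_tag_from_dir` and `dry_run_register` are hypothetical helper names, and `echo` stands in for the real `simiotics_s3` call so that nothing is registered. It uses only the `data register -s ... -t ...` invocation that already appears in the script.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the simiotics_s3 commands instead of executing them.

# Hypothetical helper: strip everything up to the first hyphen,
# e.g. n02085620-Chihuahua -> Chihuahua.
breed_tag_from_dir() {
  local dir_name=$1
  printf '%s\n' "${dir_name#*-}"
}

# Hypothetical helper: print one command line per batch of image paths.
#   $1 = images directory, $2 = source name, $3 = batch size
dry_run_register() {
  local images_dir=$1 source=$2 batch_size=$3
  local breed_dir breed_tag
  for breed_dir in "$images_dir"/*/; do
    # Skip the literal pattern if the directory has no subdirectories.
    [ -d "$breed_dir" ] || continue
    breed_tag=$(breed_tag_from_dir "$(basename "$breed_dir")")
    # echo replaces the real command, so xargs only prints each batch.
    printf '%s\0' "$breed_dir"* |
      xargs -0 -n "$batch_size" echo simiotics_s3 data register -s "$source" -t "breed=$breed_tag"
  done
}

# Demo on a throwaway directory with three fake images and a batch size of 2.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/n02085620-Chihuahua"
touch "$demo_dir/n02085620-Chihuahua/a.jpg" \
      "$demo_dir/n02085620-Chihuahua/b.jpg" \
      "$demo_dir/n02085620-Chihuahua/c.jpg"
dry_run_register "$demo_dir" "demo-source" 2
rm -rf "$demo_dir"
```

With a batch size of 2 and three files, the demo prints two command lines, one per batch, which makes it easy to check the breed tag and batching before pointing the real script at the full dataset.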