@jelc53
Last active June 2, 2023 18:06
pre-processing image data for deepsolar
# download french dataset and unzip in data directory
# sub dir structure: data/bdappv-france/bdappv/[ign,google]/[img,mask]
mkdir -p bdappv-france
unzip bdappv.zip -d bdappv-france
# checkout retrain_pytorch github branch
git checkout retrain_pytorch
git pull
# create "dummy" data sub-directory for testing
# sub dir structure: data/dummy/val/[0,1]
# (created without cd, so later steps still run from the data directory)
mkdir -p dummy/val/0 dummy/val/1
# drop files_to_copy.csv into data directory
# (run scp on the local machine, then mv from the home directory on the remote host)
scp -i ~/.ssh/cs224n.pem files_to_copy.csv ubuntu@ec2-XX-XX-XXX-XXX.us-west-2.compute.amazonaws.com:files_to_copy.csv
mv files_to_copy.csv deepsolar/data/files_to_copy.csv
# convert file list csv to txt file
tr ',' '\n' < files_to_copy.csv > files_to_copy.txt
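# A plain tr leaves behind carriage returns (CSV saved on Windows) and empty
# lines (trailing commas). A minimal sketch that strips both, assuming one path
# per CSV field; csv_to_list is a hypothetical helper name, not part of deepsolar.

```shell
# csv_to_list: hypothetical helper. Prints one path per line from the CSV
# given as $1, dropping carriage returns and blank lines.
csv_to_list() {
  tr ',' '\n' < "$1" | tr -d '\r' | sed '/^[[:space:]]*$/d'
}

# usage: csv_to_list files_to_copy.csv > files_to_copy.txt
```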
# copy out image files specified by file list
# run the following bash script from the data directory
mkdir bar && bash copy_out.sh < files_to_copy.txt # dumps images into tmp bar/ folder
# move images to target directory we will use for testing
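# (copy the contents: "mv bar dummy/val/1" would nest bar/ inside the existing dir)
cp -r bar/. dummy/val/1 && rm -rf bar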
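# copy_out.sh is not included in these notes. A minimal sketch of what such a
# script might do, assuming it reads one source path per line on stdin and
# copies each file into bar/; copy_listed is a hypothetical stand-in name.

```shell
# copy_listed: hypothetical stand-in for copy_out.sh (the real script is not shown).
# Reads one source path per line from stdin and copies each file into the
# directory given as $1 (default: bar). Missing files are reported on stderr
# and skipped.
copy_listed() {
  dest="${1:-bar}"
  mkdir -p "$dest"
  while IFS= read -r path || [ -n "$path" ]; do
    [ -z "$path" ] && continue          # skip empty lines
    if [ -f "$path" ]; then
      cp -- "$path" "$dest/"
    else
      printf 'missing: %s\n' "$path" >&2
    fi
  done
}

# usage: copy_listed bar < files_to_copy.txt
```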
# (optional) repeat for mask files if not already included
# note: to re-create the list with updated paths, use the string_surgery.py
# helper script, which writes files_to_copy_updated.txt
python string_surgery.py files_to_copy.txt --change_str 'mask' --change_idx 8
mkdir -p bar && bash copy_out.sh < files_to_copy_updated.txt
bash rename_files.sh
cp -r bar/. dummy/val/1
rm -rf bar
# (optional) copy the first N files from dir a to dir b
# (run from a; replace N with a number; breaks on file names containing spaces)
cp $(ls | head -n N) path/to/b
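# The one-liner above word-splits the output of ls. A minimal sketch of an
# alternative that loops over a glob instead; copy_first_n is a hypothetical
# helper name, and "first" here means sorted glob order.

```shell
# copy_first_n: hypothetical helper. Copies the first N regular files
# (in sorted glob order) from directory $1 into directory $2, with N given as $3.
copy_first_n() {
  src="$1"; dst="$2"; n="$3"
  mkdir -p "$dst"
  count=0
  for f in "$src"/*; do
    [ -f "$f" ] || continue             # skip sub-directories and unmatched globs
    [ "$count" -ge "$n" ] && break
    cp -- "$f" "$dst/"
    count=$((count + 1))
  done
}

# usage: copy_first_n path/to/a path/to/b 100
```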
# (optional) list the lines of file2 that do not appear in file1
grep -Fxvf file1 file2
# (optional) prepend a path prefix to every line of a file
awk '{print "prefix" $0}' file
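# A small worked example of the two commands above, run in a scratch directory.
# The file contents and the bdappv/google/ prefix are illustrative only.

```shell
work=$(mktemp -d)
printf 'img/a.png\nimg/b.png\n' > "$work/file1"
printf 'img/a.png\nimg/b.png\nimg/c.png\n' > "$work/file2"

# lines of file2 with no exact, whole-line match in file1
missing=$(grep -Fxvf "$work/file1" "$work/file2")

# prepend a directory prefix to each remaining line
prefixed=$(printf '%s\n' "$missing" | awk '{print "bdappv/google/" $0}')
echo "$prefixed"   # prints bdappv/google/img/c.png
```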
# run the deepsolar test script
# note: first edit the data directory setting in train_segmentation_pytorch.py
# so that it points to the new target directory, data/dummy/val
cd ~/deepsolar/src/
python train_segmentation_pytorch.py