This document: https://goo.gl/AqGoE8
By far the most annoying part of getting started with ML is installing researcher-made code and turning it into something fun to play with.
Before doing any of these, please install Miniconda. If you don't have it installed already, here's how:
For OSX, this is:
curl "https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh" -o "Miniconda.sh"
bash Miniconda.sh
For Linux this is:
wget "https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh" -O "Miniconda.sh"
bash Miniconda.sh
For Windows, go to https://conda.io/miniconda.html and run the installer.
After running those commands, you'll have to go through the installation process, which involves:
- Holding down return to scroll past the licence, then typing in yes to accept it.
- Leaving the installation path at the default.
- Answering yes when it asks Do you wish the installer to prepend the Miniconda3 install location to PATH...
Then, run source ~/.bashrc
Now, set up two environments:
conda create -n py3k anaconda numpy scipy scikit-image rasterio python=3.6
conda create -n analogies anaconda numpy scipy scikit-image python=3.6
For all of the models, we need two aligned datasets of images. This basically means that we need two folders, called datasetA/train and datasetB/train, such that datasetA/train/abcd.png corresponds in some way to datasetB/train/abcd.png. For instance, one could be the outline of a handbag and the other the picture of the handbag itself. See pix2pix for some examples.
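For example (the filenames here are just made up), the layout would look something like this, with matching names across the two folders:
datasetA/train/00012.png (a tile rendered in style A)
datasetB/train/00012.png (the same tile rendered in style B)
datasetA/train/00013.png
datasetB/train/00013.png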
Luckily, we'll be working with slippy maps, for which it is easy to create two aligned datasets.
I've created a script that lets you scrape any slippy map once you get a URL in the form http://api.a_website.com/.../{z}/{x}/{y}...
To install the scraper,
git clone https://github.com/stamen/the-ultimate-tile-stitcher
cd the-ultimate-tile-stitcher
source activate py3k
pip install -r requirements.txt
If you get an error regarding libgeos, try brew install geos on OSX or install geos through your package manager on Linux.
For the data, you could create two maps in Mapbox Studio and get the API URL from Styles -> { Select a style } -> Share, develop & use -> Use style in GIS apps -> CartoDB. Or, you can use any other slippy map service that gives you an API of this form (e.g., maps.stamen.com).
Here are two I made earlier (use these URLs in the scraper):
https://api.mapbox.com/styles/v1/aman-tiwari/cj5ms4up63pre2slf4b1v3auu/tiles/256/{z}/{x}/{y}@2x?access_token=pk.eyJ1IjoiYW1hbi10aXdhcmkiLCJhIjoiY2ozajdzOXM4MDBqYjJ3cXNnbHg3YjF3dyJ9.DjsmHW5ahovyG4sYPGQ-Zw
https://api.mapbox.com/styles/v1/mapbox/satellite-streets-v10/tiles/256/{z}/{x}/{y}@2x?access_token=pk.eyJ1IjoiYW1hbi10aXdhcmkiLCJhIjoiY2ozajdzOXM4MDBqYjJ3cXNnbHg3YjF3dyJ9.DjsmHW5ahovyG4sYPGQ-Zw
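To sanity-check that a URL template works before scraping, you can fill in concrete tile coordinates (z=1, x=0, y=0 is always a valid tile) and fetch it with curl; for example, for the first style above:
curl -o test_tile.png "https://api.mapbox.com/styles/v1/aman-tiwari/cj5ms4up63pre2slf4b1v3auu/tiles/256/1/0/0@2x?access_token=pk.eyJ1IjoiYW1hbi10aXdhcmkiLCJhIjoiY2ozajdzOXM4MDBqYjJ3cXNnbHg3YjF3dyJ9.DjsmHW5ahovyG4sYPGQ-Zw"
If test_tile.png opens as a little map image, the URL is good.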
Then, go to geojson.io
and create a GeoJSON containing the area you want to sample the dataset of tiles from.
Then, still in the scraper directory, run the following, where {slippy map url 1} and {slippy map url 2} are the two styles you want to scrape:
mkdir tiles_1
mkdir tiles_1/train
python scraper.py --poly {your geojson} --zoom {the zoom level you want} --url {slippy map url 1} --out-dir tiles_1/train
And also
mkdir tiles_2
mkdir tiles_2/train
python scraper.py --poly {your geojson} --zoom {the zoom level you want} --url {slippy map url 2} --out-dir tiles_2/train
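As a concrete (made-up) example, if you saved your GeoJSON as area.geojson and want zoom level 17, the first command would look something like:
python scraper.py --poly area.geojson --zoom 17 --url "https://api.mapbox.com/styles/v1/aman-tiwari/cj5ms4up63pre2slf4b1v3auu/tiles/256/{z}/{x}/{y}@2x?access_token=pk.eyJ1IjoiYW1hbi10aXdhcmkiLCJhIjoiY2ozajdzOXM4MDBqYjJ3cXNnbHg3YjF3dyJ9.DjsmHW5ahovyG4sYPGQ-Zw" --out-dir tiles_1/train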
Image analogies is a generative technique based on matching patches between two sets of images: https://github.com/awentzonline/image-analogies
To install it,
source activate analogies
pip install neural-image-analogies
Then, if your laptop doesn't have a GPU run pip install tensorflow-cpu, otherwise run pip install Theano==0.8.1. Theano is annoying to install and use, so I'd suggest the first one.
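(If you do go the Theano route: image-analogies runs on Keras, so you may also need to tell Keras to use the Theano backend, e.g. by setting the KERAS_BACKEND environment variable before running anything. This is my assumption about the setup, so check the image-analogies README for the exact instructions.)
export KERAS_BACKEND=theano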
Then, download the vgg16_weights.h5 weights from https://drive.google.com/file/d/0Bz7KyqmuGsilT0J5dmRCM0ROVHc/view
Then, follow the instructions at https://github.com/awentzonline/image-analogies to make your own image analogies between the tiles you scraped.
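As a rough sketch (the script name, argument order and flag below are from my reading of that README, so double-check them there): given an aligned pair A / A' from your two tile folders and another style-A tile B, an invocation would look something like this and produce a B' with the look of A':
make_image_analogy.py tiles_1/train/0001.png tiles_2/train/0001.png tiles_1/train/0002.png out/0002 --vgg-weights=vgg16_weights.h5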
Many of you may be familiar with stuff like:
All of these are instances of image-to-image translation networks, a class of generative models that includes pix2pix; the zebra <-> horse example is made with a CycleGAN.
Today we'll look at Pix2pix, because it's easy to explain and powerful.
We'll use the pytorch implementation located at: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
To install (after following the above conda installation instructions), first activate the conda environment; this means anything you install will be kept neatly inside this environment without trampling over everything else:
source activate py3k
conda install -c menpo opencv
pip install visdom dominate
conda update --all
Then, follow the instructions on http://pytorch.org/ to install PyTorch.
Then, run:
git clone https://github.com/aman-tiwari/pytorch-CycleGAN-and-pix2pix
cd pytorch-CycleGAN-and-pix2pix
mkdir datasets/tiles_dataset
python datasets/combine_A_and_B.py --fold_A {path to tiles_1 that you scraped} --fold_B {path to tiles_2} --fold_AB datasets/tiles_dataset
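For example, if the scraped tiles from earlier live next to this repo in the-ultimate-tile-stitcher/tiles_1 and the-ultimate-tile-stitcher/tiles_2 (adjust the paths to wherever yours actually are):
python datasets/combine_A_and_B.py --fold_A ../the-ultimate-tile-stitcher/tiles_1 --fold_B ../the-ultimate-tile-stitcher/tiles_2 --fold_AB datasets/tiles_dataset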
This will concatenate the images you scraped side by side, preparing them for training the model.
Then, in another terminal, run:
source activate py3k
python -m visdom.server
and then, to begin training, run:
python train.py --dataroot datasets/tiles_dataset --name tiles --model pix2pix --which_model_netG unet_256 --which_direction AtoB --lambda_A 100 --dataset_mode aligned --no_lsgan --norm batch --gpu_ids=-1
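While it trains, you can watch the losses and sample outputs in the visdom dashboard you started above (by default at http://localhost:8097).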
This will take a long time, especially if you don't have a GPU. Here's one I prepared earlier: https://drive.google.com/drive/folders/0B3B6i70h60E2Z2NvcVdMY1dPX00?usp=sharing
Download latest_net_D.pth and latest_net_G.pth and put them in checkpoints/tiles_pretrained/
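That folder won't exist on a fresh clone, so create it first:
mkdir -p checkpoints/tiles_pretrained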
Then, draw some images using the palette of https://goo.gl/nn8oAC, and put them in drawn_imgs
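Again, create the folder if it doesn't exist yet:
mkdir drawn_imgs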
Then, run
python test.py --dataroot drawn_imgs --name tiles_pretrained --model test --which_model_netG unet_256 --which_direction BtoA --dataset_mode single --gpu_ids=-1 --norm batch
(Remove the gpu_ids argument to use the GPU.)
This will generate some fake images! They will be placed in ./checkpoints/tiles_pretrained/web/images
(The pretrained one above was trained on the styles https://api.mapbox.com/styles/v1/aman-tiwari/cj5ms4up63pre2slf4b1v3auu.html?title=true&access_token=pk.eyJ1IjoiYW1hbi10aXdhcmkiLCJhIjoiY2ozajdzOXM4MDBqYjJ3cXNnbHg3YjF3dyJ9.DjsmHW5ahovyG4sYPGQ-Zw#12.9/52.427631/4.927835/0 and https://api.mapbox.com/styles/v1/mapbox/satellite-streets-v10.html?fresh=true&title=true&access_token=pk.eyJ1IjoiYW1hbi10aXdhcmkiLCJhIjoiY2ozajdzOXM4MDBqYjJ3cXNnbHg3YjF3dyJ9.DjsmHW5ahovyG4sYPGQ-Zw#1.82/0/0)
- CycleGAN and Pix2Pix: https://junyanz.github.io/CycleGAN/
- Pix2pix online demo: https://affinelayer.com/pixsrv/
- Pix2Pix explainer: https://ml4a.github.io/guides/Pix2Pix/
- Deepdream, Face generation things: http://mtyka.github.io/
- Terrapattern: http://www.terrapattern.com/
- PENNY: http://penny.digitalglobe.com/
- Invisible Cities: https://opendot.github.io/ml4a-invisible-cities/
- Street-view, GTA5 generation: https://www.youtube.com/watch?v=0fhUJT21-bs
- Neural Image Analogies: https://github.com/awentzonline/image-analogies
- CIA Lab doing work on ML and sat. imagery: https://medium.com/the-downlinq
- Psychogeographically segmenting neighbourhoods: https://medium.com/topos-ai/rethinking-distance-in-new-york-city-d17212d24919
- FB's extremely granular population density maps: https://code.facebook.com/posts/1676452492623525/connecting-the-world-with-better-maps/
- Spike-triggered visualisation of ML networks: https://github.com/timsainb/Tensorflow-MultiGPU-VAE-GAN
- Pix2pix & CycleGAN
- PyTorch: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
- Tensorflow (inc. docker containers): https://github.com/affinelayer/pix2pix-tensorflow
- Torch: https://github.com/phillipi/pix2pix
- Image Analogies: https://github.com/awentzonline/image-analogies
- ResNets
- PyTorch: https://github.com/pytorch/vision
- Lots of other ones, every deep learning library has some
- Fast Style Transfer
- Slow (but higher quality) style transfer:
- Caffe + C++, by Microsoft. Easily state of the art: https://github.com/msracver/Deep-Image-Analogy
- OG style transfer: https://github.com/jcjohnson/neural-style
- Misc:
- Fast exact nearest-neighbour search used for terrapattern: https://github.com/aman-tiwari/ofxCoverTree, https://github.com/manzilzaheer/CoverTree
- Faster approximate nearest-neighbour search by FB: https://github.com/facebookresearch/faiss
- Exploring and sampling generative networks: https://github.com/dribnet/plat
This helped me fix an error message I encountered on macOS (jazzsaxmafia/video_to_sequence#3):
pip install opencv-python