Building footprint detection in satellite images for MapSwipe

Table of contents

Summary and scope

Introduction - why and how does it pay off?
Improvements on the current MapSwipe workflow
How to achieve these improvements: deep neural networks (DNNs)

Training of a DNN on detecting building footprints in satellite images
DNN architectures for semantic segmentation
Training data
Deep learning infrastructure

Project plan
Timeline / Steps
Software architecture overview - relation to the MapSwipe / MissingMaps project

Resources


Summary and scope


This project aims to improve and automate the detection of objects such as roads, buildings or land cover in satellite images. Many humanitarian organizations depend on the availability of up-to-date, accurate geographic data to plan their activities. In remote areas such information is often incomplete, inaccurate or not available at all. A community of volunteer mappers helps to create this important data by using MapSwipe. This approach has proven very useful in many humanitarian interventions in the past. However, the data produced by MapSwipe projects currently faces two challenges: producing it is very time-consuming, and it lacks high-resolution information. We are addressing these shortcomings by leveraging vast amounts of openly available training data for deep learning. In the future this will allow MapSwipe to produce more accurate geographic information in much less time.

Objectives in a nutshell

  • Providing high-resolution geographic data: Semantic segmentation enables pixel-wise classification of satellite images. That means the locations of buildings, roads, etc. can be determined much more accurately.

  • Action -> supervision: The MapSwipe 2.0 workflow gives the user a new role. Instead of labelling data by hand (acting), the user validates data that a DNN has labelled beforehand (supervision).


Introduction - why and how does it pay off?

Overview, background, context, ...

Improvements on the current MapSwipe workflow

MapSwipe is a very successful way to crowdsource and parallelize the task of mapping an area of interest across a community of volunteers. However, the current workflow for detecting objects in satellite images has two disadvantages:

1) The exact location of objects remains unknown -> detect building footprints

Satellite images are only classified as to whether they contain an object or not; no information is given about where that object is located.

2) Labelling is very time consuming -> use AI to automate this workflow

Since each satellite image has to be presented to a user and their feedback recorded, mapping an area of interest can take a considerable amount of time.

How to achieve these improvements: deep neural networks (DNNs)

Different tasks in computer vision

[Figure: classification vs. object detection vs. semantic segmentation]

Semantic segmentation allows pixel-wise building footprint detection in satellite images

[Figure: building footprint segmentation on SpaceNet imagery]

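To make "pixel-wise" concrete, here is a minimal, hypothetical Keras sketch (the function name and layer sizes are arbitrary, and this is deliberately a toy fully-convolutional network, not one of the benchmark architectures below) that maps an RGB tile to a same-sized map of per-pixel building probabilities:

```python
# A toy fully-convolutional network: every pixel of the input tile gets its
# own building probability, instead of one label for the whole tile.
import tensorflow as tf
from tensorflow.keras import layers

def toy_fcn(tile_size=256):
    inputs = tf.keras.Input(shape=(tile_size, tile_size, 3))
    x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)   # downsample to aggregate spatial context
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = layers.UpSampling2D()(x)   # restore the full tile resolution
    # one sigmoid unit per pixel: P(pixel belongs to a building)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs)

model = toy_fcn()
model.summary()  # output shape (None, 256, 256, 1): one probability per pixel
```
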

Training of a DNN on detecting building footprints in satellite images

DNN architectures for semantic segmentation

There is a whole zoo of deep neural network architectures for semantic segmentation. In the following we compare their performance on several standard benchmark datasets, their computational complexity (which drives training time, memory requirements and inference time), and the availability of open-source implementations.

Computational costs (Paszke et al.)

[Figure: top-1 accuracy vs. computational cost in GOps (Paszke et al.)]

Performance on standard benchmarks

All scores are mean intersection over union (mIoU); blank cells mean the paper does not report that benchmark.

| Architecture | mIoU (VOC12) | mIoU (VOC12 + COCO) | mIoU (Pascal Context) | mIoU (Cityscapes) | Paper | Code |
| --- | --- | --- | --- | --- | --- | --- |
| FCN-8s | 62.2 | | 37.8 | 65.3 | http://arxiv.org/abs/1411.4038 | https://github.com/aurora95/Keras-FCN |
| DeepLab | 71.6 | | | | http://arxiv.org/abs/1412.7062 | |
| CRF-RNN | 72.0 | 74.7 | 39.3 | | http://arxiv.org/abs/1502.03240 | https://github.com/sadeepj/crfasrnn_keras |
| DeconvNet | 72.5 | | | | http://arxiv.org/abs/1505.04366 | https://github.com/fabianbormann/Tensorflow-DeconvNet-Segmentation |
| DPN | 74.1 | 77.5 | | | https://arxiv.org/abs/1707.01629 | https://github.com/cypw/DPNs |
| SegNet | | | | | http://arxiv.org/abs/1511.00561 | https://github.com/preddy5/segnet |
| Dilation8 | 75.3 | | | | https://arxiv.org/abs/1511.07122 | https://github.com/DavideA/dilation-keras |
| DeepLab v2 | | 79.7 | 45.7 | 70.4 | https://arxiv.org/abs/1606.00915 | |
| FRRN B | | | | 71.8 | https://arxiv.org/abs/1611.08323 | |
| G-FRNet | 79.3 | | | | https://arxiv.org/abs/1806.11266 | https://github.com/mrochan/gfrnet |
| GCN | | 82.2 | | 76.9 | https://arxiv.org/abs/1703.02719 | https://github.com/ZijunDeng/pytorch-semantic-segmentation |
| RefineNet | | 83.4 | 47.3 | 73.6 | https://arxiv.org/abs/1611.06612 | https://github.com/guosheng/refinenet |
| PSPNet | 82.6 | 85.4 | | 80.2 | https://arxiv.org/abs/1612.01105 | https://github.com/Vladkryvoruchko/PSPNet-Keras-tensorflow |
| DeepLabv3 | | 85.7 | | 81.3 | https://arxiv.org/abs/1706.05587 | |
| EncNet | 82.9 | 85.9 | 51.7 | | https://arxiv.org/abs/1803.08904 | |
| DFN | 82.7 | 86.2 | | 80.3 | https://arxiv.org/abs/1804.09337 | https://github.com/ycszen/TorchSeg |
| DeepLabv3+ | | 87.8 | | 82.1 | https://arxiv.org/abs/1802.02611 | https://github.com/bonlime/keras-deeplab-v3-plus |
| OCNet | | | | 81.2 (81.7) | https://arxiv.org/abs/1809.00916 | https://github.com/PkuRainBow/OCNet.pytorch |
| DUpsampling | 85.3 | 88.1 | 52.5 | | https://arxiv.org/abs/1903.02120 | |
| FastFCN | | | 53.1 | | https://arxiv.org/abs/1903.11816 | https://github.com/wuhuikai/FastFCN |
| U-Net | | | | | https://arxiv.org/abs/1505.04597 | https://github.com/zhixuhao/unet |

Extensive list of code for semantic segmentation: https://github.com/mrgloom/awesome-semantic-segmentation
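
The benchmark scores above are intersection-over-union values: the overlap between the predicted and the ground-truth mask, divided by their union, averaged over classes. As a reference for how the metric works, a minimal NumPy sketch for the binary building/background case:

```python
# IoU for binary masks: |prediction AND truth| / |prediction OR truth|.
import numpy as np

def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """Both arguments are boolean arrays of the same shape."""
    intersection = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return intersection / union if union > 0 else 1.0  # two empty masks match

# Example: a prediction covering half of a ground-truth building.
true_mask = np.zeros((4, 4), dtype=bool); true_mask[:, :2] = True  # 8 px
pred_mask = np.zeros((4, 4), dtype=bool); pred_mask[:, :1] = True  # 4 px
print(iou(pred_mask, true_mask))  # 0.5 -> intersection 4 px, union 8 px
```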

Training data

Satellite image sources

Overview: https://wiki.openstreetmap.org/wiki/Aerial_imagery

Labelled / annotated data sources (ML-datasets)
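
Whichever labelled dataset we end up using, pairs of image tiles and label masks have to be streamed to the network during training. A hedged sketch, assuming a hypothetical layout in which RGB tiles live under tiles/ and same-named binary building masks under masks/, both as PNGs of identical size:

```python
# Minimal tf.data input pipeline pairing image tiles with their masks.
# Directory names and file format are assumptions, not project decisions.
import tensorflow as tf

def load_pair(tile_path):
    # Derive the mask path from the tile path (same filename, masks/ dir).
    mask_path = tf.strings.regex_replace(tile_path, "tiles", "masks")
    tile = tf.image.decode_png(tf.io.read_file(tile_path), channels=3)
    mask = tf.image.decode_png(tf.io.read_file(mask_path), channels=1)
    # Scale pixels to [0, 1]; mask becomes 0 (background) / 1 (building).
    return tf.cast(tile, tf.float32) / 255.0, tf.cast(mask > 0, tf.float32)

dataset = (tf.data.Dataset.list_files("tiles/*.png")
           .map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(8)          # assumes all tiles share the same size
           .prefetch(tf.data.AUTOTUNE))
```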

Deep learning infrastructure

In order to train a DNN on training data from the model regions, we need access to GPU clusters in the cloud. After copying the network architecture definition and the dataset to a GPU server, we can start the training, as sketched below.
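
What "starting the training" could look like on such a server, sketched with Keras. The two-layer model is only a placeholder for whichever architecture we choose from the table above, and random arrays stand in for the real (tile, mask) dataset; the saved file name is a hypothetical choice:

```python
# Hedged training-kickoff sketch with a placeholder model and stand-in data.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

print("GPUs visible:", tf.config.list_physical_devices("GPU"))

# Placeholder for the chosen segmentation architecture.
inputs = tf.keras.Input(shape=(256, 256, 3))
x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # per-pixel probability
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stand-in for the real (tile, mask) dataset copied to the server.
tiles = np.random.rand(16, 256, 256, 3).astype("float32")
masks = (np.random.rand(16, 256, 256, 1) > 0.5).astype("float32")
model.fit(tiles, masks, batch_size=4, epochs=2)
model.save("prototype_model.h5")  # hypothetical artifact name
```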


Project plan

Timeline / Steps

  • Create a first prototype model

    • Choose a DNN architecture
    • Choose geographic model regions
    • Train the DNN on data from the model regions on a Google Cloud GPU cluster
  • Integrate prototype model into the MapSwipe workflow

    • Define a mapping task in one of the model regions
    • Use the prototype model to predict building footprints (see the inference sketch after this list)
    • Visualize & analyze predictions to gain insights
    • Ask MapSwipe users to validate these predictions in the app
  • Start debugging iterations

    • Analyze mistakes
    • Clean training data
    • Improve DNN-training procedure
    • Adapt architecture
    • Develop best-practice mapping task definitions
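
A minimal sketch of the prediction step referenced above, assuming the prototype was saved as prototype_model.h5 (the hypothetical name from the training sketch); the 0.5 threshold and the connected-components grouping are illustrative choices, not fixed decisions:

```python
# Predict a building probability map for one tile and turn it into discrete
# footprint candidates that MapSwipe users could confirm or reject.
import numpy as np
import tensorflow as tf
from scipy import ndimage

model = tf.keras.models.load_model("prototype_model.h5")

tile = np.random.rand(1, 256, 256, 3).astype("float32")  # stand-in for a real tile
probs = model.predict(tile)[0, :, :, 0]  # per-pixel building probability
binary = probs > 0.5                     # threshold is a tunable assumption

# Group connected building pixels into individual footprint candidates.
labels, n_buildings = ndimage.label(binary)
print(f"{n_buildings} building candidates in this tile")
```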

Software architecture overview - relation to the MapSwipe / MissingMaps project

[Figure: software architecture overview]


Resources

Overview papers

Overview blog posts
