Introduction - why and how does it pay off?
Improvements on the current MapSwipe workflow
How to achieve these improvements: artificial intelligence
Training of a DNN on detecting building footprints in satellite images
DNN architectures for semantic segmentation
Training data
Deep learning infrastructure
Project plan
Timeline / Steps
Software architecture overview - relation to the MapSwipe / MissingMaps project
This project aims to improve and automate the detection of objects such as roads, buildings, and land cover in satellite images. Many humanitarian organizations depend on up-to-date and accurate geographic data to plan their activities. In remote areas such information is often incomplete, inaccurate, or missing entirely. A community of volunteer mappers helps to create this important data using MapSwipe, an approach that has proven very useful in many past humanitarian interventions. However, the data produced by MapSwipe projects currently faces two challenges: producing it is very time consuming, and it lacks high-resolution location information. We address these shortcomings by leveraging vast amounts of openly available training data for deep learning. In the future this will allow MapSwipe to produce more accurate geographic information in much less time.
Objectives in a nutshell
- Providing high-resolution geographic data: Semantic segmentation enables pixel-wise classification of satellite images. That means the location of buildings, roads etc. can be determined much more accurately.
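To make "pixel-wise classification" concrete, here is a minimal NumPy sketch (the scores are made-up stand-ins for the output of a segmentation network): every pixel gets its own class label, so object locations inside the image become known.

```python
import numpy as np

# Hypothetical per-pixel class scores for a 4x4 image and two classes:
# 0 = background, 1 = building.
scores = np.zeros((4, 4, 2))
scores[..., 0] = 0.9          # background is most likely everywhere ...
scores[1:3, 1:3, 1] = 0.95    # ... except in a 2x2 patch, where "building" wins

# Pixel-wise classification: pick the most likely class per pixel.
mask = scores.argmax(axis=-1)

print(mask)
# Each pixel now carries a class label, so the building's location
# within the image is known, not just its presence.
```

This is the key difference from tile-level classification, where an image only gets a single yes/no label.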
- Action -> supervision: The new MapSwipe 2.0 workflow gives the user a new role. Instead of labelling data by hand (acting), the user validates data that has previously been labelled by a DNN (supervising).
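The role change can be sketched in a few lines of Python. All names here are illustrative, not the actual MapSwipe API: the DNN proposes labels, and the volunteer only confirms or rejects them.

```python
# Minimal sketch of the "supervision" workflow: the model predicts,
# the volunteer validates. Tile ids and labels are hypothetical.

def review(predictions, user_verdicts):
    """Split DNN predictions into volunteer-confirmed and rejected ones."""
    confirmed, rejected = [], []
    for tile_id, label in predictions.items():
        if user_verdicts.get(tile_id) == "confirm":
            confirmed.append((tile_id, label))
        else:
            rejected.append((tile_id, label))
    return confirmed, rejected

predictions = {"tile_1": "building", "tile_2": "building"}
verdicts = {"tile_1": "confirm", "tile_2": "reject"}
confirmed, rejected = review(predictions, verdicts)
# confirmed predictions can be published; rejected ones can feed
# the next round of model debugging and retraining.
```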
Overview, background, context, ...
MapSwipe is a very successful way of crowdsourcing the task of mapping an area of interest by parallelizing it across a community of volunteers. However, the current workflow of detecting objects in satellite images has two disadvantages:
1) Exact location of objects remains unknown -> detect building footprints
Satellite images are only classified as to whether they contain an object or not - no information is given about where within the image the building is located.
2) Labelling is very time consuming -> use AI to automatize this workflow
Since each satellite image has to be presented to a user and their feedback recorded, mapping an area of interest can take a considerable amount of time.
There exists a whole zoo of deep neural network architectures for semantic segmentation. In the following we compare their performance on several standard benchmark datasets, their computational complexity (training time, memory requirements, inference time), and their availability as open-source code.
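The standard metric used to compare segmentation architectures on these benchmarks is mean intersection over union (mIoU). A minimal NumPy sketch of the per-class IoU computation:

```python
import numpy as np

def iou_per_class(pred, target, num_classes):
    """Intersection over union per class; mIoU is the mean over classes.
    This is the usual benchmark metric for semantic segmentation."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append(inter / union if union else float("nan"))
    return ious

pred   = np.array([[0, 0], [1, 1]])   # predicted label map
target = np.array([[0, 0], [0, 1]])   # ground-truth label map
print(iou_per_class(pred, target, 2))  # [0.666..., 0.5]
```

Benchmark numbers reported for the architectures below are typically exactly this quantity, averaged over the dataset's classes.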
Extensive list of code for semantic segmentation: https://github.com/mrgloom/awesome-semantic-segmentation
Overview: https://wiki.openstreetmap.org/wiki/Aerial_imagery
- Bing Maps: https://www.bingmapsportal.com/
- Mapbox: https://docs.mapbox.com/help/
- Sentinel: https://www.sentinel-hub.com/
- Digital Globe: https://discover.digitalglobe.com/
- OSM data
- Microsoft Building Footprint data https://github.com/Microsoft/USBuildingFootprints
- CrowdAI Mapping Challenge https://www.crowdai.org/challenges/mapping-challenge
- Kaggle challenge https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data
- SpaceNet challenge https://spacenetchallenge.github.io/AOI_Lists/AOI_HomePage.html
- CIESIN Columbia (population density) https://ciesin.columbia.edu/data/hrsl/
- Open Data DC http://opendata.dc.gov/datasets/building-footprints/data
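Most of the vector datasets above provide building footprints as polygons; to serve as targets for semantic segmentation they have to be rasterized into pixel masks. A real pipeline would use a GIS library for arbitrary polygons, but the idea can be sketched with pixel-aligned bounding boxes (a simplifying assumption) in pure NumPy:

```python
import numpy as np

def rasterize_boxes(boxes, height, width):
    """Burn axis-aligned footprint boxes (row0, col0, row1, col1) into a
    binary mask -- a simplified stand-in for polygon rasterization."""
    mask = np.zeros((height, width), dtype=np.uint8)
    for r0, c0, r1, c1 in boxes:
        mask[r0:r1, c0:c1] = 1  # mark footprint pixels as class "building"
    return mask

# One footprint covering rows 1-2 and columns 1-3 of a 5x5 tile:
mask = rasterize_boxes([(1, 1, 3, 4)], 5, 5)
print(mask.sum())  # 6 pixels labelled as building
```

Paired with the corresponding satellite image tile, such a mask is exactly one (image, label) training sample for the DNN.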
In order to train a DNN on training data from model regions we need access to GPU clusters in the cloud. After copying the network architecture definition and the dataset to the GPU cluster, we can start training.
- Google Cloud (TensorFlow / Keras)
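When training data-parallel on a multi-GPU cluster, the dataset is typically split into disjoint shards, one per worker. A small illustrative sketch (worker count and tile ids are hypothetical, not cluster-specific code):

```python
# Round-robin sharding of training tiles across GPU workers, so every
# worker sees a disjoint, near-equal share of the dataset.

def shard(tile_ids, num_workers):
    """Deterministically assign tile ids to workers."""
    shards = [[] for _ in range(num_workers)]
    for i, tile_id in enumerate(tile_ids):
        shards[i % num_workers].append(tile_id)
    return shards

shards = shard(list(range(10)), 4)
# shards[0] == [0, 4, 8]; together the shards cover every tile exactly once
```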
Create a first prototype model
- Choose a DNN architecture
- Choose geographic model regions
- Train the DNN on data from model regions on a Google cloud GPU cluster
Integrate prototype model into the MapSwipe workflow
- Define a mapping task in one of the model regions
- Use the prototype model to predict building footprints
- Visualize & analyze predictions to gain insights
- Ask MapSwipe users to validate these predictions in the app
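For the "visualize predictions" and "ask users to validate" steps, the raw pixel mask has to be turned into individual building candidates. One simple way, sketched here in pure NumPy (connected-component grouping via flood fill; the function name is illustrative), is to return one bounding box per predicted building:

```python
import numpy as np

def footprint_boxes(mask):
    """Group predicted building pixels into 4-connected components and
    return one (row_min, col_min, row_max, col_max) box per building."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    boxes = []
    for r in range(mask.shape[0]):
        for c in range(mask.shape[1]):
            if mask[r, c] and not seen[r, c]:
                stack, pixels = [(r, c)], []   # flood fill from this pixel
                seen[r, c] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                ys, xs = zip(*pixels)
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes

mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1]]
print(footprint_boxes(mask))  # [(0, 0, 1, 1), (2, 3, 2, 3)]
```

Each box can then be overlaid on the satellite image in the app so a volunteer can confirm or reject that single building candidate.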
Start debugging iterations
- Analyse mistakes
- Clean training data
- Improve DNN-training procedure
- Adapt architecture
- Develop best-practice mapping task definitions
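The "analyse mistakes" step in the debugging loop needs a quantitative error breakdown, not just an overall score. A simple starting point is pixel-wise precision and recall for the building class, sketched here in NumPy:

```python
import numpy as np

def pixel_precision_recall(pred, target):
    """Pixel-wise precision and recall for the building class.
    Low precision -> many false building pixels (e.g. noisy labels);
    low recall -> buildings the model misses entirely."""
    pred, target = np.asarray(pred, bool), np.asarray(target, bool)
    tp = np.logical_and(pred, target).sum()   # correctly predicted building
    fp = np.logical_and(pred, ~target).sum()  # predicted but not real
    fn = np.logical_and(~pred, target).sum()  # real but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

pred   = [[1, 1], [0, 0]]
target = [[1, 0], [1, 0]]
print(pixel_precision_recall(pred, target))  # (0.5, 0.5)
```

Tracking these two numbers per debugging iteration shows whether a change (cleaned training data, adapted architecture, new training procedure) actually helped, and in which direction.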
Overview papers
- A Survey of Semantic Segmentation: https://arxiv.org/abs/1602.06541
- Performance & memory: https://arxiv.org/abs/1605.07678
- A Review on Deep Learning Techniques Applied to Semantic Segmentation: https://arxiv.org/abs/1704.06857
- Recent progress in semantic image segmentation: https://arxiv.org/abs/1809.10198
Overview blog posts
- https://www.jeremyjordan.me/semantic-segmentation/
- http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review
- https://zhangbin0917.github.io/2018/09/18/Semantic-Segmentation/
- https://towardsdatascience.com/understanding-semantic-segmentation-with-unet-6be4f42d4b47
- https://towardsdatascience.com/semantic-segmentation-with-deep-learning-a-guide-and-code-e52fc8958823
- https://towardsdatascience.com/semantic-segmentation-popular-architectures-dff0a75f39d0
- https://towardsdatascience.com/review-segnet-semantic-segmentation-e66f2e30fb96
- https://medium.com/@arthur_ouaknine/review-of-deep-learning-algorithms-for-image-semantic-segmentation-509a600f7b57
- https://medium.com/nanonets/how-to-do-image-segmentation-using-deep-learning-c673cc5862ef
- https://medium.com/beyondminds/a-simple-guide-to-semantic-segmentation-effcf83e7e54