Saafke/gsoc19_dnn_superres.md

## gsoc19_dnn_superres.md

      
Google Summer of Code 2019 with OpenCV

Learning-based Super Resolution

Student: Xavier Weber

Mentors: Vladimir Tyan & Yida Wang

Student on the same project: Fanny Monori

Links to accomplished work:

PR in the opencv_contrib repository: opencv_contrib/pull/2231
EDSR implementation in TensorFlow: EDSR repo
FSRCNN implementation in TensorFlow: FSRCNN repo

Intro

Hello, I am Xavier! I was a student developer for GSoC 2019 with OpenCV (Link to Project).
The main goal of this project was to add a new module to OpenCV: dnn_superres. This module upscales images using Convolutional Neural Networks. It also includes easy access to popular super resolution datasets.
The module provides a simple-to-use interface to state-of-the-art super resolution techniques. This lets developers with little or no knowledge of deep learning or super resolution use the tool in their own projects. A developer can input an image or even real-time video, select the desired method and upscaling factor, and get the upscaled imagery as output.
During this project I worked closely with Fanny. She worked on the same module but implemented different models: ESPCN [2] and LapSRN [4]. I implemented EDSR [1] and FSRCNN [3]. All four models are supported in this module.
My Journey

First Period

In the first month of GSoC I implemented the following:


Loading code for three popular SR datasets: Div2k, BSDS and General100.

This allows for easy access to high-quality images inside of the OpenCV environment.


Creating the 'dnn_superres' module which is the interface that connects users to all the trained models.

This simplifies the use of these trained models by handling all the pre- and post-processing.


FSRCNN [3]

This is a fairly lightweight network that upscales images quickly and gives decent image quality.


Second Period

In the second month of GSoC I implemented the following:


EDSR [1]

This is a state-of-the-art neural network with high reconstruction quality, albeit quite slow.


Updated the 'dnn_superres' module with pre- and post-processing to support this new network.


Third Period

In the last month of GSoC I implemented the following:

Finished the models and trained them for scales x2, x3 and x4
Coordinated with Fanny to complete the module
Documentation
Tests
Tutorial

Demonstration

Here I will demonstrate how you can use this module and reveal its POWER!
1. Build

Make sure you build OpenCV with the contrib-modules, including "dnn_superres".
2. Upscale

Now we can upscale any image with state-of-the-art techniques, using only OpenCV!
#include <opencv2/dnn_superres.hpp>
#include <opencv2/imgcodecs.hpp>
#include <string>
using namespace std;
using namespace cv;
using namespace dnn;
using namespace dnn_superres;

int main(int argc, char** argv)
{
    //Path of the image to upscale, taken from the command line
    if (argc < 2) return 1;
    string img_path = argv[1];

    //Create the object
    DnnSuperResImpl sr;

    //Read the desired model - download links below
    string path = "models/EDSR_x4.pb";
    sr.readModel(path);

    //Set the desired model and scale to get correct pre- and post-processing
    sr.setModel("edsr", 4);

    //Upscale
    Mat img = cv::imread(img_path);
    Mat img_new;
    sr.upsample(img, img_new);
    return 0;
}

You can download my models here: EDSR and FSRCNN, and Fanny's models here: ESPCN and LapSRN.
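The model file and the arguments to setModel have to agree: the example reads "models/EDSR_x4.pb" and then sets "edsr" with scale 4. As a minimal sketch, the helper below derives both values from the file name. `parseModelPath` is a hypothetical function, not part of dnn_superres, and it assumes the model files follow the `<ALGO>_x<scale>.pb` pattern seen in the example path.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not part of dnn_superres): splits a model path such as
// "models/EDSR_x4.pb" into a lower-case algorithm name and an integer scale.
// Assumes the "<ALGO>_x<scale>.pb" naming pattern from the example above.
std::pair<std::string, int> parseModelPath(const std::string& path)
{
    // Drop any directory prefix.
    const std::size_t slash = path.find_last_of("/\\");
    const std::string file =
        (slash == std::string::npos) ? path : path.substr(slash + 1);

    // Locate the "_x" separator and the extension dot.
    const std::size_t sep = file.rfind("_x");
    const std::size_t dot = file.rfind('.');
    assert(sep != std::string::npos && dot != std::string::npos && sep + 2 < dot);

    // The algorithm name is everything before "_x", lower-cased.
    std::string algo = file.substr(0, sep);
    std::transform(algo.begin(), algo.end(), algo.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // The scale is the number between "_x" and the extension.
    const int scale = std::stoi(file.substr(sep + 2, dot - (sep + 2)));
    return {algo, scale};
}
```

With this in place, the two values could be passed on as `sr.setModel(result.first, result.second)` after `sr.readModel(path)`, so that the pre- and post-processing always matches the loaded file.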
3. Results

Input:


Bicubic upscale:


FSRCNN upscale:


EDSR upscale:


Original:


Benchmarks on the General100 dataset

All computations were done on an i7-9700K. Metrics used are PSNR and SSIM.
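For readers unfamiliar with the first metric, PSNR is defined as 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the reference and the upscaled image and MAX is the largest possible pixel value (255 for 8-bit images). The sketch below computes it on flat 8-bit pixel buffers using only the standard library. It is an illustration of the formula, not the benchmark code; whether the benchmark measured PSNR on colour channels or on luminance only is not stated here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// PSNR in dB between two 8-bit images given as flat pixel buffers of equal size.
// Returns +infinity when the images are identical (MSE of zero).
double psnr(const std::vector<std::uint8_t>& ref,
            const std::vector<std::uint8_t>& test)
{
    assert(!ref.empty() && ref.size() == test.size());

    // Sum of squared differences over all pixels.
    double sse = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double d = static_cast<double>(ref[i]) - static_cast<double>(test[i]);
        sse += d * d;
    }

    const double mse = sse / static_cast<double>(ref.size());
    if (mse == 0.0)
        return std::numeric_limits<double>::infinity();

    // MAX is 255 for 8-bit data.
    return 10.0 * std::log10(255.0 * 255.0 / mse);
}
```

Higher values mean the upscaled image is closer to the original. OpenCV itself also ships a `cv::PSNR` function for `Mat` inputs.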

2x scaling factor

| Method | Avg inference time in sec (CPU) | Avg PSNR (dB) | Avg SSIM |
| --- | --- | --- | --- |
| ESPCN | 0.008795 | 32.7059 | 0.9276 |
| EDSR | 5.923450 | 34.1300 | 0.9447 |
| FSRCNN | 0.021741 | 32.8886 | 0.9301 |
| LapSRN | 0.114812 | 32.2681 | 0.9248 |
| Bicubic | 0.000208 | 32.1638 | 0.9305 |
| Nearest neighbor | 0.000114 | 29.1665 | 0.9049 |
| Lanczos | 0.001094 | 32.4687 | 0.9327 |
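To make the speed versus quality trade-off in the 2x results easier to read, the following sketch computes the PSNR gain and the slowdown of one method relative to another. The numbers are copied from the 2x table above; the struct and function names (`BenchRow`, `table2x`, `psnrGain`, `slowdown`) are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One row of the benchmark table (names are illustrative).
struct BenchRow {
    std::string method;
    double seconds;  // average inference time on CPU
    double psnr;     // average PSNR in dB
    double ssim;     // average SSIM
};

// Values copied from the 2x scaling factor table.
std::vector<BenchRow> table2x()
{
    return {
        {"ESPCN", 0.008795, 32.7059, 0.9276},
        {"EDSR", 5.923450, 34.1300, 0.9447},
        {"FSRCNN", 0.021741, 32.8886, 0.9301},
        {"LapSRN", 0.114812, 32.2681, 0.9248},
        {"Bicubic", 0.000208, 32.1638, 0.9305},
        {"Nearest neighbor", 0.000114, 29.1665, 0.9049},
        {"Lanczos", 0.001094, 32.4687, 0.9327},
    };
}

// Looks up a row by method name; the name must exist in the table.
const BenchRow& findRow(const std::vector<BenchRow>& rows, const std::string& method)
{
    auto it = std::find_if(rows.begin(), rows.end(),
                           [&](const BenchRow& r) { return r.method == method; });
    assert(it != rows.end());
    return *it;
}

// PSNR difference in dB: positive means `method` beats `baseline`.
double psnrGain(const std::vector<BenchRow>& rows,
                const std::string& method, const std::string& baseline)
{
    return findRow(rows, method).psnr - findRow(rows, baseline).psnr;
}

// How many times slower `method` is than `baseline`.
double slowdown(const std::vector<BenchRow>& rows,
                const std::string& method, const std::string& baseline)
{
    return findRow(rows, method).seconds / findRow(rows, baseline).seconds;
}
```

Against bicubic interpolation at 2x, EDSR gains about 1.97 dB PSNR at roughly 28,000 times the CPU time, while FSRCNN gains about 0.72 dB at roughly 105 times the CPU time.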


4x scaling factor

| Method | Avg inference time in sec (CPU) | Avg PSNR (dB) | Avg SSIM |
| --- | --- | --- | --- |
| ESPCN | 0.004311 | 26.6870 | 0.7891 |
| EDSR | 1.607570 | 28.1552 | 0.8317 |
| FSRCNN | 0.005302 | 26.6088 | 0.7863 |
| LapSRN | 0.121229 | 26.7383 | 0.7896 |
| Bicubic | 0.000311 | 26.0635 | 0.8754 |
| Nearest neighbor | 0.000148 | 23.5628 | 0.8174 |
| Lanczos | 0.001012 | 25.9115 | 0.8706 |
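The bicubic, nearest neighbor and Lanczos rows are classical interpolation baselines with no learned parameters. As a point of reference for the simplest of them, here is a minimal standard-library sketch of nearest neighbor upscaling by an integer factor. `GrayImage` and `upscaleNearest` are illustrative names, and the pixel mapping is the plain integer-division variant, so it is not claimed to match the output of OpenCV's resize bit for bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal grayscale image: row-major 8-bit pixels (illustrative type).
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Nearest neighbor upscaling by an integer factor: every output pixel copies
// the source pixel at (x / scale, y / scale).
GrayImage upscaleNearest(const GrayImage& src, int scale)
{
    assert(scale >= 1 && src.width > 0 && src.height > 0);
    assert(src.pixels.size() ==
           static_cast<std::size_t>(src.width) * static_cast<std::size_t>(src.height));

    GrayImage dst{src.width * scale, src.height * scale, {}};
    dst.pixels.resize(static_cast<std::size_t>(dst.width) *
                      static_cast<std::size_t>(dst.height));

    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            const std::size_t srcIdx =
                static_cast<std::size_t>(y / scale) * src.width + (x / scale);
            const std::size_t dstIdx =
                static_cast<std::size_t>(y) * dst.width + x;
            dst.pixels[dstIdx] = src.pixels[srcIdx];
        }
    }
    return dst;
}
```

Each source pixel simply becomes a scale by scale block, which is why this baseline is the fastest in the tables and also scores the lowest PSNR.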


References

[1] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee, "Enhanced Deep Residual Networks for Single Image Super-Resolution", 2nd NTIRE: New Trends in Image Restoration and Enhancement workshop and challenge on image super-resolution, in conjunction with CVPR 2017. [PDF] [arXiv] [Slide]
[2] Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A., Bishop, R., Rueckert, D. and Wang, Z., "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016. [PDF] [arXiv]
[3] Chao Dong, Chen Change Loy, and Xiaoou Tang, "Accelerating the Super-Resolution Convolutional Neural Network", Proceedings of the European Conference on Computer Vision, ECCV 2016. [PDF] [arXiv] [Project Page]
[4] Lai, W. S., Huang, J. B., Ahuja, N., and Yang, M. H., "Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017. [PDF] [arXiv] [Project Page]