AjitPant/gsoc2020_detection_summary.md

## gsoc2020_detection_summary.md

      
Student: Ajit Pant

Mentors: Gary Bradski, Gholamreza Amayeh

Links to accomplished work:


Pull requests for ColorChecker in opencv_contrib: opencv/opencv_contrib#2532, opencv/opencv_contrib#2644
Repository for AprilTags models and training: https://github.com/AjitPant/AprilTag_Detection
A short presentation with embedded videos showing the output: Link


Objectives:

The first objective was to build a detector for various types of ColorCheckers that supports three standard ColorCheckers and can easily be generalized to new chart types in the future. For this, a module containing all these methods was to be created in opencv_contrib. It should provide the coordinates of the ColorCheckers as well as the colors of the patches. It must also be robust to some level of occlusion and lighting changes.
The second objective was to develop a deep-learning-based detector for AprilTags. The plan was to use a U-Net with 6 output channels: one for segmentation of tags, four for the four corners (all four corners of a tag are treated as different classes), and one for non-corner pixels. Since real training data for AprilTags is hard to obtain, the network was to be trained on synthetic data, and a synthetic dataset generator was to be written for this purpose.

Objectives accomplished:


The detector for the three types of ColorCheckers is complete. It can be run in two modes: the first uses purely classical methods, whereas the second uses a deep network to localize the charts and then runs the classical method on the cropped regions of the image. It's not perfect, but with tuned parameters it can detect charts quite easily in a variety of scenes.
The detector returns a lot of useful data about the charts, such as their position, the bounding box of each patch, and summary statistics of the colors of each patch (mean color, standard deviation, and whether the patch is occluded). The occlusion check is not perfect: if the edges of a patch are not sharp enough, the patch is not detected. A loss value per patch is also returned.
A drawing utility for detected ColorChecker charts. It can draw the bounding box of each patch, scaled to a given size.
A debugging utility that shows the output of every stage of the detector; it can be used to tune the parameters as needed.
A synthetic dataset generator for AprilTags. It can adjust the rotation, size, lighting, blur and some other properties of the tags. The images are not perfectly realistic though.
A U-Net trained on this synthetic dataset. It works almost perfectly on the synthetic data, and finds around 40-50% of the easy detections in the real dataset.
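The per-patch summary statistics mentioned above (mean color and standard deviation) can be illustrated with a minimal sketch. This is not the code of the mcc module; the function name `patch_stats` and the sample pixel values are made up for illustration, and a patch is assumed to be a flat list of RGB tuples.

```python
from statistics import fmean, pstdev


def patch_stats(pixels):
    """Return per-channel (means, stds) for a list of (r, g, b) pixels."""
    # Transpose the pixel list into three tuples: all R, all G, all B values.
    channels = list(zip(*pixels))
    means = tuple(fmean(c) for c in channels)
    # Population standard deviation, since the patch is the whole population.
    stds = tuple(pstdev(c) for c in channels)
    return means, stds


# A hypothetical 2x2 patch, flattened to four RGB pixels.
patch = [(100, 50, 20), (102, 52, 20), (98, 48, 20), (100, 50, 20)]
means, stds = patch_stats(patch)
```

A large standard deviation inside a patch that should be uniform is one plausible hint that the patch is occluded or badly localized, although the source does not say that the detector decides occlusion this way.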


What couldn't be completed:


The main gap is in the AprilTags work: even after extensive training with a lot of augmentations, the network didn't generalize properly to the real dataset. Of the 6 channels mentioned previously, the segmentation works properly, but some of the corners get mixed up.


It would have been great to have some color calibration algorithms that use the detections provided to calibrate the images. But because the AprilTags detector didn't work properly, I couldn't start work on them. There is a small sample showing how to calibrate the color, but it's quite basic, using only a linear 3x3 color matrix.
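To make the idea of a linear 3x3 color matrix concrete, here is a minimal pure-Python sketch. It is not the sample shipped with the module: the function names, the row-vector convention (`measured @ M ~= reference`) and the patch values are all assumptions made for illustration. The matrix is fitted by least squares through the normal equations.

```python
def solve_3x3(a, b):
    """Solve a @ x = b for x (a and b are 3x3) by Gauss-Jordan elimination."""
    n = 3
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_color_matrix(measured, reference):
    """Least-squares 3x3 matrix M such that measured @ M ~= reference."""
    # Normal equations: (A^T A) M = A^T B, with A = measured, B = reference.
    ata = [[sum(p[i] * p[j] for p in measured) for j in range(3)]
           for i in range(3)]
    atb = [[sum(p[i] * q[j] for p, q in zip(measured, reference))
            for j in range(3)] for i in range(3)]
    return solve_3x3(ata, atb)


def apply_color_matrix(pixel, m):
    """Correct one RGB pixel (treated as a row vector) with matrix m."""
    return tuple(sum(pixel[i] * m[i][j] for i in range(3)) for j in range(3))


# Hypothetical data: patch colors as seen by the camera, and the colors they
# should have, generated here from a known matrix so the fit can be checked.
true_m = [[1.1, 0.0, 0.0], [0.05, 0.9, 0.0], [0.0, 0.1, 1.2]]
measured = [(0.8, 0.2, 0.1), (0.1, 0.7, 0.2), (0.2, 0.1, 0.9), (0.5, 0.5, 0.5)]
reference = [apply_color_matrix(p, true_m) for p in measured]
fitted_m = fit_color_matrix(measured, reference)
```

The fit needs at least three linearly independent patch colors, otherwise the normal equations are singular. In practice the `measured` values would come from the mean patch colors returned by the detector and the `reference` values from the chart's published colors.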


Details of Algorithms Implemented:


Most of the code for ColorChecker detection is based on this repository: https://github.com/pedrodiamel/colorchecker-detection. It already contained working code for the standard Macbeth chart, so most of the work was in porting it as an OpenCV module and adding detectors for two new charts. Some changes were made to the detector that make tuning its parameters much easier. Also, the localization neural network used an Nvidia Caffe model, which doesn't work with OpenCV DNN, so I first rewrote it in PyTorch using Mask R-CNN, which also didn't work with OpenCV, so in the end I wrote it in TensorFlow. The current detector is quite robust to occlusion, as it usually needs only 20-30% of the patches to detect the entire chart.


The synthetic data generator takes images of tags from a mosaic and rotates each one around the x, y and z axes, after which it is placed randomly in the background image. The lighting is then adjusted for each tag, and its border is smoothed to make it look more realistic. After all the tags are added, motion blur and global lighting adjustments are applied. To strengthen the augmentation further, colors are changed and random affine transforms are applied while the data is loaded during training.
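The rotation step can be sketched as follows. This is an illustration rather than the generator's actual code: it assumes a square tag centred at the origin, a rotation order of x, then y, then z, and a simple pinhole camera. All names and parameter values are hypothetical.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Return R = Rz @ Ry @ Rx for angles given in radians."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rot_x = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    rot_y = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rot_z = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    return matmul(rot_z, matmul(rot_y, rot_x))


def project_tag_corners(size, rx, ry, rz, distance, focal, center):
    """Rotate a square tag about its centre and project its four corners.

    The tag lies in the z = 0 plane, is pushed `distance` units in front of
    a pinhole camera with focal length `focal` (in pixels), and is projected
    around the image point `center`.
    """
    h = size / 2.0
    corners = [(-h, -h, 0.0), (h, -h, 0.0), (h, h, 0.0), (-h, h, 0.0)]
    r = rotation_matrix(rx, ry, rz)
    projected = []
    for x, y, z in corners:
        xr = r[0][0] * x + r[0][1] * y + r[0][2] * z
        yr = r[1][0] * x + r[1][1] * y + r[1][2] * z
        zr = r[2][0] * x + r[2][1] * y + r[2][2] * z + distance
        projected.append((center[0] + focal * xr / zr,
                          center[1] + focal * yr / zr))
    return projected


# Hypothetical example: a tag of side 2 tilted about the y axis.
tilted = project_tag_corners(2.0, 0.0, 0.5, 0.0, 10.0, 100.0, (320.0, 240.0))
```

The four projected corners define a perspective warp from the flat tag image into the background. A generator could apply it with OpenCV's `getPerspectiveTransform` and `warpPerspective`, and the same corners double as ground-truth labels for the corner channels.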


In the AprilTag detector a U-Net was used. I tried three variants: the first was a plain model trained from scratch, the second had a ResNet-18 encoder, and the third a ResNet-50 encoder. The scratch model was bad, as expected, whereas the ResNet-50 was nearly perfect at detecting AprilTags in synthetic test data but failed badly on the real data. The ResNet-18 had similar performance on both the real and the synthetic data, so I believe that adding more augmentation to the training data, and training for long enough, might make it work on the real data.
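One way the 6-channel output of such a network could be turned into corner coordinates is sketched below. The channel order is an assumption (channel 0 for the tag mask, channels 1 to 4 for the four corner classes, channel 5 for non-corner pixels), as are the function name and the threshold. The sketch also handles a single tag only, so it is not the repository's decoding code.

```python
def decode_corners(output, mask_threshold=0.5):
    """Return {corner_index: (x, y)} from a 6 x H x W volume of scores.

    `output[c][y][x]` is the score of channel c at pixel (x, y). Each corner
    is located at the centroid of the pixels assigned to its class.
    """
    mask, corner_scores = output[0], output[1:6]
    height, width = len(mask), len(mask[0])
    # Per corner class: running sum of x, sum of y, and pixel count.
    sums = {k: [0.0, 0.0, 0] for k in range(4)}
    for y in range(height):
        for x in range(width):
            if mask[y][x] < mask_threshold:
                continue  # pixel is not part of a tag
            scores = [channel[y][x] for channel in corner_scores]
            label = scores.index(max(scores))  # argmax over corner channels
            if label < 4:  # label 4 is the non-corner class
                acc = sums[label]
                acc[0] += x
                acc[1] += y
                acc[2] += 1
    return {k: (sx / n, sy / n) for k, (sx, sy, n) in sums.items() if n > 0}


# Hypothetical 4x4 output with one tag covering the whole image.
H = W = 4
demo = [[[0.0] * W for _ in range(H)] for _ in range(6)]
for y in range(H):
    for x in range(W):
        demo[0][y][x] = 1.0  # tag mask
        demo[5][y][x] = 0.6  # non-corner score everywhere
for k, (x, y) in enumerate([(0, 0), (3, 0), (3, 3), (0, 3)]):
    demo[k + 1][y][x] = 0.9  # corner k wins at its own pixel
corners = decode_corners(demo)
```

With this decoding, the failure described earlier, where corners get mixed up on real images, would show up as pixels near one corner taking the label of another, which shifts or swaps the centroids.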


What I will try to do after GSoC:


My first priority is to get the AprilTags detector working as soon as possible. A lot of work has already been done on this, so I believe that if everything goes fine I will be able to complete it soon.


I will also try to keep updating the mcc module as per the needs of the users and the feedback that I get.


Experience

I had a really great experience working with OpenCV. I got to learn PyTorch Lightning, tricks for making neural networks converge much more effectively, and many classical detection methods. I had a great and productive summer, thanks to my mentors Gary and Reza and the amazing OpenCV community.

References:


Pedro D. Marrero Fernández, Fidel A. Guerrero Peña, Tsang Ing Ren, and Jorge J. G. Leandro. "Fast and Robust Multiple ColorChecker Detection using Deep Convolutional Neural Networks." [PDF] [arxiv]