My Google Summer of Code 2018 Work
Synopsis from the Task Proposal
When OpenCine was conceptualized, there were several free and open-source raw image processing tools, but none of them could process moving images such as movies. To solve this problem, OpenCine was conceived as a powerful raw image processing tool, designed to handle moving images, with many planned features. My goal is to implement different debayering algorithms and to parallelise the implementation with OpenCL or other frameworks, as described in T722 in the Task List.
For this task, I worked on three debayering algorithms: Bilinear Interpolation, Green Edge Directed Interpolation and SHOODAK2. A comparison of the final work can be seen here. Original image here.
Some of the major differences are noticeable on the tip of the glass and on the bird's beak.
Community Bonding Period (April 23, 2018 - May 14, 2018)
- My first task in GSoC was to optimize the Image Pre-Processor, particularly by simplifying the existing Nested For Loops into Single For Loops.
- Commit Links: #777ee30
- I started working on a Bilinear Debayer Class, which would debayer the images. The initial version only worked for a specific Bayer pattern. I also got the class name wrong and had to fix it.
- Then I started developing a version that could debayer every possible pattern, which meant removing the existing processing code. The initial version only supported recognizing the pattern.
- Commit Links: #0be1350
- I postponed further development on the Bilinear Debayer Class, as my mentor suggested that unit tests would help a lot. I started coding unit tests for the Downscaler, along with adding pattern support for it.
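To make the two ideas above a bit more concrete (a single loop over the image, and support for every Bayer pattern), here is a minimal sketch. It is not the OpenCine code: the names `BayerPattern`, `Channel`, `ChannelAt` and `CountChannel` are made up for illustration, and it assumes a row-major buffer and the four common 2x2 Bayer layouts.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; not taken from the OpenCine source.
enum class BayerPattern { RGGB, BGGR, GRBG, GBRG };
enum class Channel { Red, Green, Blue };

// Returns which colour the sensor records at (row, col) for a given pattern.
Channel ChannelAt(BayerPattern pattern, std::size_t row, std::size_t col)
{
    // Position inside the repeating 2x2 tile:
    // 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
    const std::size_t index = (row % 2) * 2 + (col % 2);

    static const Channel tiles[4][4] = {
        {Channel::Red, Channel::Green, Channel::Green, Channel::Blue},  // RGGB
        {Channel::Blue, Channel::Green, Channel::Green, Channel::Red},  // BGGR
        {Channel::Green, Channel::Red, Channel::Blue, Channel::Green},  // GRBG
        {Channel::Green, Channel::Blue, Channel::Red, Channel::Green},  // GBRG
    };
    return tiles[static_cast<std::size_t>(pattern)][index];
}

// Counts the pixels of one colour using a single flattened loop
// instead of a nested row/column loop.
std::size_t CountChannel(BayerPattern pattern, std::size_t width,
                         std::size_t height, Channel channel)
{
    assert(width > 0 && height > 0);
    std::size_t count = 0;
    for (std::size_t index = 0; index < width * height; ++index) {
        // Row and column have to be recovered from the flat index.
        const std::size_t row = index / width;
        const std::size_t col = index % width;
        if (ChannelAt(pattern, row, col) == channel) {
            ++count;
        }
    }
    return count;
}
```

With a table like this, supporting another pattern is a matter of adding one row rather than writing a new loop.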
First Coding Period (May 14, 2018 - June 15, 2018)
- I came back to working on the Bilinear Debayer, making it support every pattern. I also created unit tests to check that it worked as expected.
- I found out that my optimized version of the Image Pre-Processor was responsible for a lot of the bugs that I thought were related to the Bilinear Debayer. I had to revert it to the older version.
- Commit Links: #cff1753
- After fixing the Pre-Processor, I added Nearest Neighbor Interpolation to the Bilinear Debayer Class, along with some name fixes. This settled the final, working pattern offsets for the Bilinear Debayer Class. I had been struggling to find out whether the patterns were incorrect, and it turned out that they weren't: the bugs came from the Pre-Processor. Oh, and I added Unit Tests for the Nearest Neighbor too!
- One thing that was not finished by this time was the demosaicing of the image borders, in the Bilinear Debayer Class. I added it along with Unit Tests for every possible pattern.
- Then I added Green Edge Directed Interpolation. It was very quick to make, as it shares most of its code with Bilinear Interpolation.
- Commit Links: #2901b01
- I started working on the SHOODAK algorithm. As we will see further into this gist, it took a lot of my time. For example, I thought I had finished it at this point, and only much later found out that it had small, hard-to-notice bugs.
- Now it was time to implement an image export feature into OpenCine. I used the lodepng library.
- After a lot of talks with the creator of SHOODAK, I added an edge sensing algorithm. This is when I detected a bug that had gone unnoticed. This marks the end of SHOODAK development, as it was essentially finished. However, I decided to revamp it some day, as it should be faster than GEDI.
- The end of the Researching and Implementation Phase was nearing, and it was time to finish the unit tests for every algorithm and pattern. To prepare for the next phase, the Acceleration Phase, I created benchmark unit tests.
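Since Bilinear and Green Edge Directed Interpolation share so much, the difference is easiest to see side by side. The sketch below shows the textbook form of both for the green value at a red or blue site; it is an assumption about the general technique, not a copy of the OpenCine classes, and the function names are hypothetical. Bilinear averages all four green neighbours, while the edge-directed variant averages only along the direction with the smaller gradient.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Bilinear: green at a red/blue site is the mean of its four green neighbours.
// (row, col) must be an interior pixel of a row-major raw buffer.
uint16_t GreenBilinear(const std::vector<uint16_t>& raw, std::size_t width,
                       std::size_t row, std::size_t col)
{
    assert(row >= 1 && col >= 1 && col + 1 < width);
    assert((row + 2) * width <= raw.size());
    const std::size_t i = row * width + col;
    const int sum = raw[i - 1] + raw[i + 1] + raw[i - width] + raw[i + width];
    return static_cast<uint16_t>(sum / 4);
}

// Edge-directed: interpolate along the direction in which green changes least,
// so that an edge is not smeared across.
uint16_t GreenEdgeDirected(const std::vector<uint16_t>& raw, std::size_t width,
                           std::size_t row, std::size_t col)
{
    assert(row >= 1 && col >= 1 && col + 1 < width);
    assert((row + 2) * width <= raw.size());
    const std::size_t i = row * width + col;
    const int left = raw[i - 1];
    const int right = raw[i + 1];
    const int up = raw[i - width];
    const int down = raw[i + width];

    const int horizontalGradient = std::abs(left - right);
    const int verticalGradient = std::abs(up - down);

    if (horizontalGradient < verticalGradient) {
        return static_cast<uint16_t>((left + right) / 2);
    }
    if (verticalGradient < horizontalGradient) {
        return static_cast<uint16_t>((up + down) / 2);
    }
    // No preferred direction: fall back to the bilinear average.
    return static_cast<uint16_t>((left + right + up + down) / 4);
}
```

On a pixel whose left and right neighbours differ strongly but whose upper and lower neighbours agree, the bilinear result is pulled towards the outlier, while the edge-directed result follows the upper and lower values.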
Second Coding Period (June 15, 2018 - July 13, 2018)
- Before diving into OpenMP, I optimized the algorithms, namely GEDI, as it had a mathematical redundancy in one of its operations. After that, I researched OpenMP and implemented a working OpenMP version of GEDI, with its respective benchmark unit test. This first approach was a significant improvement.
- Following my mentor's suggestions, I changed my approach with OpenMP, which made a big improvement. In benchmarks, it took 9 ms to debayer a sample image with GEDI. I carried this knowledge over to the SHOODAK and Bilinear algorithms by creating OpenMP versions of those classes, along with their respective unit tests for benchmarking and testing.
- It was time to add the OpenMP versions to ProcessingTest, so that we could display the results and compare the algorithms (It is also a great way to check if there are any glitches).
- Commit Links: #2f49a54
- I changed the single for loops in the OpenMP versions of GEDI and Bilinear to nested for loops, as OpenMP optimizes those better. This significantly improved performance: GEDI took 5 ms in my benchmarks with the sample image. This marked the end of my work with OpenMP for GSoC.
- Now it was time to work with OpenCL. I spent several days researching and learning about it, and then developed a basic first version, which didn't make it into the final version. As usual, a unit test was also created.
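The nested-loop layout mentioned above can be sketched roughly like this. It is only an illustration under assumptions: the function `AverageHorizontalNeighbours` is hypothetical and much simpler than a real debayer, but it shows the shape of the loop, with the outer row loop handed to OpenMP and the inner column loop left sequential inside each thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Replaces every interior pixel of each row with the mean of its left and
// right neighbours. Border columns are copied unchanged.
std::vector<uint16_t> AverageHorizontalNeighbours(const std::vector<uint16_t>& input,
                                                  int width, int height)
{
    assert(width >= 3 && height >= 1);
    assert(input.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    std::vector<uint16_t> output(input);

    // Rows are independent, so each thread can take a share of them.
    // Without OpenMP enabled at compile time the pragma is simply ignored.
    #pragma omp parallel for
    for (int row = 0; row < height; ++row) {
        const int rowStart = row * width;
        for (int col = 1; col < width - 1; ++col) {
            const int i = rowStart + col;
            output[i] = static_cast<uint16_t>((input[i - 1] + input[i + 1]) / 2);
        }
    }
    return output;
}
```

With GCC or Clang this would typically be built with `-fopenmp`; the result is the same with or without it, only the timing changes.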
Third Coding Period (July 13, 2018 - August 6, 2018)
- I started working on a Kernel Loader. In order to test it, I created a Kernel file with an Image Filler function that fills the entire image with a specified value.
- And then I started working on a Nearest Neighbor OpenCL version, which was later scrapped and refactored when developing the Bilinear Interpolation in OpenCL.
- After converting the Base OpenCL to a class, with the help of my mentor and by using an OpenCL C++ Wrapper by Khronos, I started developing the kernel for Green Edge Directed Interpolation.
- Now it was time to develop a Debayer Processor! I started with Bilinear Interpolation. The Processor was continuously improved to optimize its speed, although it is still slow despite my best efforts. Currently it takes 41 ms to do what the OpenMP version does in 4 ms...
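As a rough idea of what the Kernel Loader and the Image Filler kernel from the start of this period could look like, here is a minimal sketch. Everything in it is an assumption: the kernel name `imageFill`, its arguments, and the function `LoadKernelSource` are illustrative and not taken from OpenCine. It only covers the file-reading half; the returned string would then be handed to the OpenCL C++ wrapper to be built into a program.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical example kernel: fills every element of a buffer with one value.
// The kernel name and argument layout are illustrative only.
const char* const kExampleFillKernel =
    "__kernel void imageFill(__global ushort* image, ushort value)\n"
    "{\n"
    "    image[get_global_id(0)] = value;\n"
    "}\n";

// Reads a whole kernel file into a string, throwing if it cannot be opened.
std::string LoadKernelSource(const std::string& path)
{
    assert(!path.empty());
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file) {
        throw std::runtime_error("Could not open kernel file: " + path);
    }
    std::ostringstream contents;
    contents << file.rdbuf();
    return contents.str();
}
```

Keeping the kernels in separate files, rather than in string literals as done here for the example, means they can be edited without recompiling the host program.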
What is left to do?
SHOODAK would benefit from a revamp, as it is slow and in theory should be faster than Green Edge Directed Interpolation.
The OpenCL implementations also need improvement, as their speed, error handling and code structure are not yet satisfactory.
- I documented the debayering algorithms in the apertusº Wiki, along with a page about Coding Guidelines followed in OpenCine. Link: https://wiki.apertus.org/index.php/OpenCine.Code_Documentation