This report summarises my project with BeagleBoard.org: accelerating as many layer types as possible using OpenGLES, with Darknet as the deep learning framework.
The main goal of the project is to accelerate as many types of layers as possible using OpenGLES, with Darknet as the deep learning framework. Accelerating deep learning models is crucial for real-time applications. GPUs are widely used for this because they can perform many operations in parallel. OpenGLES is a widely used graphics API that also provides a way to run computations on the GPU. By using OpenGLES for these computations, we can leverage the GPU's parallel processing power to speed up deep learning models.
- Add as many layers as possible to the Darknet framework using OpenGLES.
- Shreyas Atre
- Deepak Khatri
- GSoC Proposal: Opengles Acceleration using Deep learning
- GSoC Blog: Porting darknet to BB AI-64, Matrix Multiplication, Convolutional Layer Forward Propagation, Implementing Region Layer in OpenGLES, Understanding MaxPooling layer
- Weekly Report: Project Weekly Report
- Project Milestones: Project Milestones
- Github Repo: Darknet
- Introductory Video: Introductory Video
- Benchmarked Darknet on the host and on the BeagleBone AI-64.
- Added code for matrix multiplication using shaders.
- Added compute shader code for im2col, fill_gpu, add_bias and the activation function used in forward propagation of a convolutional neural network.
- Added the convolution layer and generated its output.
- Added the Maxpool and Region layers.
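The convolution steps listed above (fill, im2col, matrix multiplication, add_bias, activation) can be sketched as a plain C reference on the CPU. This is only an illustrative sketch for checking shader output against, not the shader code from the project: the function names, the argument layout, the row-major CHW data layout and the leaky activation with slope 0.1 are all assumptions made for this example.

```c
#include <assert.h>

/* Read one input pixel; coordinates that land in the zero padding return 0. */
float im2col_get_pixel(const float *im, int height, int width,
                       int row, int col, int channel, int pad)
{
    row -= pad;
    col -= pad;
    if (row < 0 || col < 0 || row >= height || col >= width) return 0.0f;
    return im[col + width * (row + height * channel)];
}

/* Unroll each ksize x ksize input patch into one column of `col`.
 * `col` has channels*ksize*ksize rows and out_h*out_w columns. */
void im2col(const float *im, int channels, int height, int width,
            int ksize, int stride, int pad, float *col)
{
    int out_h = (height + 2 * pad - ksize) / stride + 1;
    int out_w = (width + 2 * pad - ksize) / stride + 1;
    int rows = channels * ksize * ksize;
    for (int r = 0; r < rows; ++r) {
        int w_off = r % ksize;
        int h_off = (r / ksize) % ksize;
        int ch = r / (ksize * ksize);
        for (int h = 0; h < out_h; ++h) {
            for (int w = 0; w < out_w; ++w) {
                col[(r * out_h + h) * out_w + w] = im2col_get_pixel(
                    im, height, width,
                    h_off + h * stride, w_off + w * stride, ch, pad);
            }
        }
    }
}

/* C (M x N) += A (M x K) * B (K x N), all row-major.
 * On the GPU, each (i, j) pair would map to one shader invocation. */
void gemm_nn(int M, int N, int K, const float *A, const float *B, float *C)
{
    for (int i = 0; i < M; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < K; ++k)
                C[i * N + j] += A[i * K + k] * B[k * N + j];
}

/* Leaky ReLU; the 0.1 slope is an assumption of this sketch. */
float leaky(float x) { return x > 0.0f ? x : 0.1f * x; }

/* Forward pass of one convolution layer for a single image.
 * `workspace` must hold channels*ksize*ksize * out_h*out_w floats. */
void conv_forward(const float *input, const float *weights, const float *biases,
                  int channels, int height, int width,
                  int filters, int ksize, int stride, int pad,
                  float *workspace, float *output)
{
    assert(stride > 0 && ksize > 0);
    int out_h = (height + 2 * pad - ksize) / stride + 1;
    int out_w = (width + 2 * pad - ksize) / stride + 1;
    int n = out_h * out_w;
    int k = channels * ksize * ksize;

    for (int i = 0; i < filters * n; ++i) output[i] = 0.0f;  /* fill */
    im2col(input, channels, height, width, ksize, stride, pad, workspace);
    gemm_nn(filters, n, k, weights, workspace, output);
    for (int f = 0; f < filters; ++f)                        /* bias + activation */
        for (int i = 0; i < n; ++i)
            output[f * n + i] = leaky(output[f * n + i] + biases[f]);
}
```

Running `conv_forward` on a small input and comparing the result element by element with the buffer read back from the GPU is one simple way to validate each shader stage in isolation.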
- Understanding of the Darknet framework.
- Understanding of OpenGLES.
- Understanding why parallel computing capabilities are needed: they improve performance and make large-scale neural network training and inference practical.
- Although the maxpool layer already checks the boundary conditions, the printed output of the layer still went beyond the bounds those checks are meant to enforce. The function therefore needs to be reworked so that it is guaranteed to produce correct output values.
- There is some work left on the region layer logic. Even though I've included the logic in the shader, the `entry_index` function code still has to be added. Boundary checks must also be performed for this layer.
- Benchmarking needs to be done for these layers on the CPU as well as on the GPU.
- More neural network layers need to be added with OpenGLES.
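The two open items on the maxpool and region layers can be illustrated with a small CPU-side sketch in C. Everything here is an assumption made for the example rather than the project's code: `maxpool_invocation` stands in for one shader invocation with a flat index, the workgroup size of 64 is arbitrary, `pad` is treated as the total padding split evenly on both sides, and `entry_index` takes plain integers where Darknet's region layer reads them from its layer struct (the formula should be checked against Darknet's region_layer.c before it is relied on).

```c
#include <assert.h>
#include <float.h>

/* Work of one hypothetical shader invocation, identified by a flat index. */
void maxpool_invocation(int id, const float *in, float *out,
                        int c, int h, int w, int size, int stride, int pad)
{
    int out_h = (h + pad - size) / stride + 1;
    int out_w = (w + pad - size) / stride + 1;
    int total = c * out_h * out_w;

    /* Guard: surplus invocations must not write past the output buffer. */
    if (id < 0 || id >= total) return;

    int j = id % out_w;            /* output column  */
    int i = (id / out_w) % out_h;  /* output row     */
    int k = id / (out_w * out_h);  /* channel        */

    float max = -FLT_MAX;
    for (int n = 0; n < size; ++n) {
        for (int m = 0; m < size; ++m) {
            int cur_h = i * stride + n - pad / 2;
            int cur_w = j * stride + m - pad / 2;
            /* Guard: skip window cells that fall outside the input. */
            if (cur_h < 0 || cur_h >= h || cur_w < 0 || cur_w >= w) continue;
            float val = in[cur_w + w * (cur_h + h * k)];
            if (val > max) max = val;
        }
    }
    out[id] = max;
}

/* Emulate a dispatch whose size is rounded up to whole workgroups,
 * so some invocations have an id beyond the last output element. */
void maxpool_forward(const float *in, float *out,
                     int c, int h, int w, int size, int stride, int pad)
{
    assert(stride > 0 && size > 0);
    int out_h = (h + pad - size) / stride + 1;
    int out_w = (w + pad - size) / stride + 1;
    int total = c * out_h * out_w;
    int local_size = 64;  /* assumed workgroup size */
    int launched = (total + local_size - 1) / local_size * local_size;
    for (int id = 0; id < launched; ++id)
        maxpool_invocation(id, in, out, c, h, w, size, stride, pad);
}

/* Offset of one entry (box coordinate, objectness or class score) in the
 * region layer output, for anchor n = location / (w*h) at cell location % (w*h). */
int entry_index(int w, int h, int coords, int classes, int outputs,
                int batch, int location, int entry)
{
    assert(w > 0 && h > 0);
    int n = location / (w * h);
    int loc = location % (w * h);
    return batch * outputs + n * w * h * (coords + classes + 1)
           + entry * w * h + loc;
}
```

The early return in `maxpool_invocation` is the kind of check that keeps a rounded-up dispatch from writing beyond the output buffer, while the inner check keeps the pooling window inside the input. The same flat-index guard would apply to a region layer shader.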
I am deeply grateful to the BeagleBoard Community for giving me the incredible opportunity to participate in the Google Summer of Code (GSoC) program. My mentors, Shreyas Atre, Deepak Khatri, and Kumar Abhishek, have been instrumental in guiding and supporting me throughout this project. Their expertise and dedication have been invaluable.
I would also like to extend my appreciation to the entire BeagleBoard Community for their open-source contributions and collaborative spirit. This experience has been a significant stepping stone in my journey as a software developer, and I look forward to continued contributions to this vibrant community.
- Adding the region layer and the maxpool layer (partly completed so far).
- Adding other layers to the neural network framework.