Report on work done for Robocomp as a part of Google Summer of Code 2020

Google Summer of Code Report

Hand Gesture Recognition Component

One of the key capabilities of a robot is to communicate efficiently with humans, either through voice or through gestures. With the large amount of work being done in computer vision, hand gestures have become one of the most efficient forms of human-robot interaction. The main aim of this project was to integrate a Hand Gesture Recognition Component into Robocomp's Robolab. The added component enables robots to understand American Sign Language (ASL) alphabets in real time, providing an efficient way of communicating.

Project Details

As a part of this project, I implemented three different Robocomp components, each of which can be used standalone for other use cases and which, combined, perform hand gesture recognition.

  • Client Component: I developed this component in the first phase of the project. It acts as the client for the entire hand gesture recognition pipeline and ties all the components together. It also detects the bounding box of the hand in the input image. For hand bounding box detection, I implemented two methods (MediaPipe and SSD+MobileNet), and the user can choose either one by setting the appropriate parameters.

  • Hand Keypoint Component: This component was developed in the second phase and estimates hand keypoints from the hand bounding box. I implemented it using OpenPose, a real-time multi-person keypoint detection library. However, because OpenPose responded too slowly on systems with low specifications, I also implemented keypoint detection using MediaPipe (handled in the client component itself); a rough sketch of the MediaPipe path is shown after this list.

  • Hand Gesture Component: This component was developed in the final phase of GSoC. It uses the hand keypoints to predict the hand gesture, with the scope limited to American Sign Language (ASL) alphabets. As no dataset mapping keypoints to gestures was available, I created one from a publicly available ASL image dataset. I then used a one-vs-one Support Vector Machine (SVM) model to predict gestures; a sketch of this step is shown further below. I also added a feature that lets the user specify the exact classes they want predictions from.
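
As an illustration of the MediaPipe path mentioned in the list above, the snippet below is a minimal, self-contained sketch (my own simplification, not the component's actual code) of detecting a single hand in a frame and extracting its 21 keypoints with the mediapipe Python package:

```python
# Rough sketch: detect one hand in a BGR frame and return its 21 keypoints
# as pixel coordinates. Illustrative only; the real component wraps this
# logic behind Robocomp interfaces.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def extract_hand_keypoints(frame_bgr):
    """Return a list of (x, y) pixel coordinates for the first detected hand,
    or None if no hand is found."""
    with mp_hands.Hands(static_image_mode=True, max_num_hands=1,
                        min_detection_confidence=0.5) as hands:
        # MediaPipe expects RGB input.
        results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
        if not results.multi_hand_landmarks:
            return None
        h, w = frame_bgr.shape[:2]
        landmarks = results.multi_hand_landmarks[0].landmark
        return [(int(lm.x * w), int(lm.y * h)) for lm in landmarks]

if __name__ == "__main__":
    cap = cv2.VideoCapture(0)      # default webcam
    ok, frame = cap.read()
    cap.release()
    if ok:
        print(extract_hand_keypoints(frame))
```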

Using all these components, I was able to recognize hand gestures in real time, even on systems with basic specifications.
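
For the keypoint-to-gesture step, the sketch below shows roughly how a one-vs-one SVM can be trained on flattened keypoint vectors with scikit-learn; the dataset files, feature layout, and class subset shown here are placeholders for illustration, not the component's real data or interface:

```python
# Minimal sketch of the keypoint-to-gesture classification step: a one-vs-one
# SVM trained on flattened (x, y) keypoint vectors. File names and label
# encoding are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical dataset: each row is 21 keypoints * 2 coordinates = 42 features,
# labels are ASL alphabet letters.
X = np.load("asl_keypoints.npy")   # shape (n_samples, 42), placeholder file
y = np.load("asl_labels.npy")      # shape (n_samples,), e.g. "A".."Z"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# SVC handles multi-class problems with a one-vs-one scheme.
clf = SVC(kernel="rbf", decision_function_shape="ovo")
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

# Restricting predictions to a user-chosen subset of classes (e.g. {"A", "B",
# "C"}) can be done by training or filtering on those classes only.
```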

Blog Posts

Throughout my Google Summer of Code project period, I wrote blog posts containing detailed information about my work.

Links to my Blog Posts:

Project Demo

I have made a video demo explaining the steps to run the components.

YouTube Link: https://youtu.be/1JFvr_lMYTo

To experience this yourself, visit the handGestureClient Component and follow the given instructions.

Pull Requests

I made my commits through several pull requests; links to them are given below.

Components Repositories

I developed three components as part of this GSoC project. GitHub links to all of them are given below.

  1. Client Component
  2. Hand Keypoint Component
  3. Hand Gesture Component

Future Work

In this project, I created a basic structure and properly functioning components for real-time gesture recognition. Some improvements that can still be made are listed below:

  • The accuracy of gesture recognition can be improved by using a larger dataset of keypoints.
  • In this project, the scope of gestures was limited to ASL alphabets; in the future, the component can be extended to a wider range of gestures, such as ASL words. Also, since ASL alphabets are single-handed, we could add support for two-handed sign languages such as British Sign Language.

About Me

My name is Kanav Gupta. My interests lie in Computer Vision, Machine Learning, and Software Development. I also love exploring research articles that help solve a variety of use cases in robotics and vision.

For any queries related to this project, feel free to connect with me on LinkedIn or mail me at kanavgupta0711@gmail.com

Thanking Note

The whole journey of Google Summer of Code has been really exciting. It was a brilliant learning experience, as I was working on a problem of this kind for the first time. I faced some challenges along the way, and solving them was fun.

I would like to thank Aditya Aggarwal, Francisco Andrés, Esteban Martinena Guerrero, and Pilar Bachiller for helping me overcome challenges and giving constructive suggestions throughout the project.

