Report on work done for Robocomp as a part of Google Summer of Code 2020

Google Summer of Code Report

Hand Gesture Recognition Component

One of the key capabilities of a robot is to communicate efficiently with humans, either through voice or through gestures. With the large amount of work being done in computer vision, hand gestures have become one of the most efficient forms of human-robot interaction. The main aim of this project was to integrate a Hand Gesture Recognition Component into Robocomp's Robolab. The added component enables robots to understand American Sign Language (ASL) alphabets in real time, providing an efficient way of communicating.

Project Details

As a part of this project, I implemented three different Robocomp components, each of which can be used standalone for other use cases and which, combined, perform hand gesture recognition.

  • Client Component: I developed this component in the first phase of the project. It acts as the client for the entire hand gesture recognition pipeline and ties all the components together. It also detects the bounding box of the hand in the input image. For hand bounding box detection, I implemented two methods (MediaPipe and SSD+MobileNet), and the user can choose either one by setting the appropriate parameters.

  • Hand Keypoint Component: This component was developed in the second phase and estimates hand keypoints from the hand bounding box. I implemented it using OpenPose, a real-time multi-person keypoint detection library. However, because OpenPose responded too slowly on systems with low specifications, I also implemented keypoint detection using MediaPipe (handled in the client component itself); a rough sketch of the MediaPipe path is shown after this list.

  • Hand Gesture Component: This component was developed in the final phase of GSoC. It uses the hand keypoints to predict the hand gesture, with the scope limited to American Sign Language (ASL) alphabets. As no dataset mapping keypoints to gestures was available, I created one from a publicly available ASL image dataset. I then used a one-vs-one Support Vector Machine (SVM) model to predict gestures; a sketch of this step is shown further below. I also added a feature that lets the user specify the exact classes they want predictions from.
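
As an illustration of the MediaPipe path mentioned in the list above, the snippet below is a minimal, self-contained sketch (my own simplification, not the component's actual code) of detecting a single hand in a frame and extracting its 21 keypoints with the mediapipe Python package:

```python
# Rough sketch: detect one hand in a BGR frame and return its 21 keypoints
# as pixel coordinates. Illustrative only; the real component wraps this
# logic behind Robocomp interfaces.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def extract_hand_keypoints(frame_bgr):
    """Return a list of (x, y) pixel coordinates for the first detected hand,
    or None if no hand is found."""
    with mp_hands.Hands(static_image_mode=True, max_num_hands=1,
                        min_detection_confidence=0.5) as hands:
        # MediaPipe expects RGB input.
        results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
        if not results.multi_hand_landmarks:
            return None
        h, w = frame_bgr.shape[:2]
        landmarks = results.multi_hand_landmarks[0].landmark
        return [(int(lm.x * w), int(lm.y * h)) for lm in landmarks]

if __name__ == "__main__":
    cap = cv2.VideoCapture(0)      # default webcam
    ok, frame = cap.read()
    cap.release()
    if ok:
        print(extract_hand_keypoints(frame))
```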

Using all these components, I was able to recognize hand gestures in real time, even on systems with basic specifications.
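
For the keypoint-to-gesture step, the sketch below shows roughly how a one-vs-one SVM can be trained on flattened keypoint vectors with scikit-learn; the dataset files, feature layout, and class subset shown here are placeholders for illustration, not the component's real data or interface:

```python
# Minimal sketch of the keypoint-to-gesture classification step: a one-vs-one
# SVM trained on flattened (x, y) keypoint vectors. File names and label
# encoding are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical dataset: each row is 21 keypoints * 2 coordinates = 42 features,
# labels are ASL alphabet letters.
X = np.load("asl_keypoints.npy")   # shape (n_samples, 42), placeholder file
y = np.load("asl_labels.npy")      # shape (n_samples,), e.g. "A".."Z"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# SVC handles multi-class problems with a one-vs-one scheme.
clf = SVC(kernel="rbf", decision_function_shape="ovo")
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

# Restricting predictions to a user-chosen subset of classes (e.g. {"A", "B",
# "C"}) can be done by training or filtering on those classes only.
```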

Blog Posts

Throughout my Google Summer of Code project period, I wrote blog posts containing detailed information about my work.

Links to my Blog Posts:

Project Demo

I have made a video demo explaining the steps to run the components.

YouTube Link: https://youtu.be/1JFvr_lMYTo

To experience this yourself, visit the handGestureClient Component and follow the given instructions.

Pull Requests

I made my commits through several pull requests; links to them are given below.

Components Repositories

I developed three components as part of this GSoC project. GitHub links to all of them are given below.

  1. Client Component
  2. Hand Keypoint Component
  3. Hand Gesture Component

Future Work

In this project, I created a basic structure and properly functioning components for real-time gesture recognition. Some improvements that can still be made are listed below:

  • The accuracy of gesture recognition can be improved by using a larger dataset of keypoints.
  • In this project, the scope of gestures was limited to ASL alphabets; in the future, the component can be extended to a wider range of gestures, such as ASL words. Also, since ASL alphabets are single-handed, we could add support for two-handed sign languages such as British Sign Language.

About Me

My name is Kanav Gupta. My interests lie in Computer Vision, Machine Learning, and Software Development. I also love exploring research articles that help solve a variety of use cases in robotics and vision.

For any queries related to this project, feel free to connect with me on LinkedIn or mail me at kanavgupta0711@gmail.com

Thanking Note

The whole journey of Google Summer of Code has been really exciting. It was a brilliant learning experience, as I was working on a problem of this kind for the first time. I faced some challenges along the way, and solving them was fun.

I would like to thank Aditya Aggarwal, Francisco Andrés, Esteban Martinena Guerrero, and Pilar Bachiller for helping me overcome challenges and giving constructive suggestions throughout the project.

