- Year: 2023
- Organisation: TensorFlow
- Project Title: Interactive Web Demos using the MediaPipe Machine Learning Library
- Mentor: Jen Person (@jenperson)
An interactive web app which enables users to perform contactless interactions with the interface using simple human gestures. ✨
Background: The COVID-19 pandemic has increased awareness of hygiene risks associated with touchscreens, with reports indicating that 80% of people find them unhygienic. Touchless, gesture-based, intuitive systems can reduce transmission in public settings and workplaces while offering a seamless and convenient experience. Touchless technology is expected to remain popular in various industries, such as retail, healthcare, and hospitality.
The web app highlights a special ATM which showcases an augmented transaction panel, enabling users to interact accurately through intuitive gestures detected from an input video feed. Users can perform essential operations directly through the interactive floating panel (on screen) via custom simple-to-use gestures, allowing them to experience the checkout process without the need for physical touch.
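The floating-panel interaction described above can be reduced to a simple hit test: MediaPipe returns hand landmarks normalized to [0, 1], so deciding which on-screen button the index fingertip hovers over is rectangle containment. The sketch below illustrates the idea; the button layout and ids are hypothetical, not the demo's actual panel:

```javascript
// Hypothetical panel layout: buttons as rectangles in normalized [0, 1] coordinates.
const PANEL_BUTTONS = [
  { id: "withdraw", x: 0.10, y: 0.20, w: 0.25, h: 0.15 },
  { id: "deposit",  x: 0.10, y: 0.40, w: 0.25, h: 0.15 },
  { id: "cancel",   x: 0.10, y: 0.60, w: 0.25, h: 0.15 },
];

// In the MediaPipe hand model, landmark index 8 is the index fingertip.
const INDEX_FINGER_TIP = 8;

// Return the button the fingertip currently hovers, or null if none.
function hitTest(landmarks, buttons = PANEL_BUTTONS) {
  const tip = landmarks[INDEX_FINGER_TIP];
  if (!tip) return null;
  return (
    buttons.find(
      (b) =>
        tip.x >= b.x && tip.x <= b.x + b.w &&
        tip.y >= b.y && tip.y <= b.y + b.h
    ) ?? null
  );
}
```

Because both the landmarks and the panel rectangles live in normalized coordinates, the same hit test works at any screen resolution.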
Shortened Version ![](https://user-images.githubusercontent.com/48355572/234978665-08b7d16e-dace-479a-a061-478972c43f6b.gif)
In a rapidly evolving technological landscape, the aftermath of the COVID-19 pandemic has amplified concerns regarding hygiene and touch-based interactions. With **80%** of individuals deeming public touchscreens *unhygienic*, there is a compelling need for innovative solutions. Enter touchless gesture-based systems, poised to reshape industries and public spaces. Seamlessly aligning with the post-pandemic era, this technology offers intuitive and convenient interactions. From **ATMs** and airports to healthcare and retail, touchless interactions are on the brink of becoming ubiquitous. This project directly addresses these changing expectations by harnessing the power of the MediaPipe [Hand Landmarker](https://developers.google.com/mediapipe/api/solutions/js/tasks-vision.handlandmarker) task from MediaPipe Solutions. By precisely detecting [21](https://developers.google.com/mediapipe/solutions/vision/hand_landmarker#models) key hand landmarks, this technology powers an interactive web application enabling users to effortlessly engage with interfaces through contactless gestures. Designed for optimal performance in well-lit environments and on larger screens, this project embodies the future of safer, more advanced interactions.
Google's MediaPipe Solutions helps developers add machine learning to their end-user devices, including mobile, web, and IoT. It provides a framework that lets you configure prebuilt processing pipelines that deliver immediate, engaging, and useful output to users.
The demo showcases the capabilities of the MediaPipe Hand Landmarker task, which accurately detects and tracks 21 hand landmarks. These landmarks are utilized in the web app to enable users to perform contactless interactions with the interface using simple gestures.
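One way the 21 detected landmarks can drive simple gestures is a pinch heuristic: measure the distance between the thumb tip and index fingertip, normalized by hand size so the gesture works at any distance from the camera. The landmark indices (wrist = 0, thumb tip = 4, index fingertip = 8, middle knuckle = 9) follow the MediaPipe hand model; the 0.35 threshold is an illustrative assumption, not a value taken from the demo:

```javascript
// Landmark indices from the MediaPipe hand model (21 points per hand).
const WRIST = 0;
const THUMB_TIP = 4;
const INDEX_TIP = 8;
const MIDDLE_MCP = 9;

// Euclidean distance between two normalized landmarks (x, y in [0, 1]).
function dist(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// A simple "pinch" heuristic: thumb tip close to index fingertip,
// normalized by hand size (wrist-to-middle-knuckle distance) so the
// gesture is scale-invariant. The 0.35 threshold is an assumption
// tuned by eye, not a value from the project.
function isPinch(landmarks) {
  const handSize = dist(landmarks[WRIST], landmarks[MIDDLE_MCP]);
  if (handSize === 0) return false;
  return dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) / handSize < 0.35;
}
```

The same normalization trick generalizes to other custom gestures (e.g. an open palm or a thumbs-up) by comparing other landmark pairs against hand size.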
✅ Best experienced in well-lit environments and on larger screens. All data from the input video feed is processed directly on the client side and discarded once inference returns, making the app GDPR compliant.
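A minimal client-side setup with the `@mediapipe/tasks-vision` package might look like the sketch below. The CDN and model URLs are illustrative assumptions, and the per-frame loop keeps all pixels in the browser, consistent with the client-side processing described above:

```javascript
// Sketch, assuming the "@mediapipe/tasks-vision" npm package and its
// CDN-hosted WASM assets; both URLs below are illustrative.
async function createHandLandmarker() {
  // Dynamic import so this sketch only touches the library in the browser.
  const { FilesetResolver, HandLandmarker } = await import("@mediapipe/tasks-vision");

  // Load the WASM backend that runs inference entirely on the client.
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
  );

  // Create the Hand Landmarker in VIDEO mode for per-frame detection.
  return HandLandmarker.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        "https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task",
    },
    runningMode: "VIDEO",
    numHands: 1,
  });
}

// Per-frame loop: frames never leave the device; only the 21 landmarks
// of the first detected hand are handed to the callback.
function startLoop(landmarker, video, onLandmarks) {
  let lastTime = -1;
  function frame() {
    if (video.currentTime !== lastTime) {
      lastTime = video.currentTime;
      const result = landmarker.detectForVideo(video, performance.now());
      if (result.landmarks.length > 0) onLandmarks(result.landmarks[0]);
    }
    requestAnimationFrame(frame);
  }
  requestAnimationFrame(frame);
}
```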
🔸 Play with the Live Demo → Here ✨
🔸 Alternate CodeSandbox Template → Here ✨
🔸 View the Installation Notes → Here ✨
🔸 Explore the Official Repository → Here ✨
💡 The source code for the demo includes detailed comments that explain the implementation and the rationale behind the design decisions.
Throughout the summer, I made multiple contributions to MediaPipe. Note that chunks of git commits have been rebased into each of the following:
✔️ [app]: MediaPipe Interactive Web Demo - Contactless ATM Playground (#209)
✔️ [feat]: Adding Offline Support for Interactive Web Demo (#215)
✔️ [style]: Formatting & Asset Optimization (WIP)
Over the course of my GSoC journey, I have written blog posts to share my insights; here they are, in reverse chronological order:
No. | Blog Title | Description | Link |
---|---|---|---|
1 | Interactive Web Demo | Step-by-step guide to a touchless interactive web demo. | Link |
2 | Predicting Custom Gestures for Interactive Web Demo | Exploring how to predict custom gestures for interactive demos. | Link |
3 | A Holistic Preview of MediaPipe | A comprehensive look into MediaPipe's capabilities and potential. | Link |
4 | GSoC'23 Community Bonding Period | Insights into community bonding during GSoC 2023 preparations. | Link |
✅ This documentation is intended to assist other developers in utilizing the MediaPipe library and implementing similar touchless interaction features in their projects.
✅ If you want to delve deep into the specs of the model, feel free to explore the official docs, which can be found here. You can access the official model card for MediaPipe Hands (Lite/Full) here; it provides detailed information about the model.
DEMOX.mp4
⚠️ A webcam is essential and required for hand detection and gesture recognition. Please ensure your device has a functioning webcam.
[1] MediaPipe Hands Official Paper (Link 🔗)
[2] Applying Hand Gesture Recognition for User Guide Application Using MediaPipe (Paper) (Link 🔗)
[3] MediaPipe Solutions API Docs (Link 🔗)
Copyright 2023 The MediaPipe Authors. Distributed under the Apache License 2.0. See LICENSE for more information.
Participating in Google Summer of Code (GSoC) for the first time was a fantastic experience. I'm deeply grateful to my mentor, Jen Person, for this opportunity; her invaluable feedback propelled the project forward.
Special Thanks to Paul Ruiz (@PaulTR) for providing immense support and guidance throughout the program, & Jason Mayes (@jasonmayes) for his valuable feedback on the proposal.
Beyond GSoC, I'm committed to ongoing contributions. Numerous exciting features remain to be explored. Count on me for consistent patches and updates to keep the project current. Feel free to connect on Twitter or LinkedIn for suggestions and feedback! 🚀