This project developed a gesture recognition application using machine learning algorithms. The application recognizes gestures without color markers by extracting features from images using Hu moments and training a Hidden Markov Model. Common gestures like "ok" and "peace" were mapped to tasks like switching slides. The system was tested and achieved 60% accuracy. Future work could involve adding more gestures and connecting it to other devices.
Final Year Project
Title: Gesture Recognition Applications
Under the guidance of: Shruthi H R, Asst. Professor, Dept. of CSE
Presented by: G. Sai Samhitha, Imon Barua, Rishav Kumar, Sana Aram
Problem Statement
▪ To develop a user-friendly gesture recognition application
by eliminating the dependency on colour markers.
▪ The objectives of this project are to:
– recognize gestures successfully
– perform certain application tasks using hand gestures.
Introduction
▪ The application will allow people to interact with a
machine without physical contact, using gesture
recognition.
▪ It can run on any computer with a camera, detecting
gestures from the physical world to interact with the digital
world.
▪ Based on the gesture, predefined actions are performed.
Machine Learning
Machine learning is the study of algorithms and statistical models
that computers use to perform a given task without explicit
instructions, relying instead on patterns and inference.
The aim of machine learning is to find patterns in a data set and
turn them into a model that can then be used to recognize new data.
Much of this is left to the machine, which learns from the data itself.
Computer Vision
▪ It deals with how computers can be made to gain a high-level
understanding of images and videos. It aims to automate tasks
that the human visual system performs.
▪ These tasks include methods for
– Acquiring
– Processing
– Analyzing
– Understanding digital images
– Extracting high-dimensional data to produce numerical information
Gesture Recognition
Gesture recognition is a domain of computer vision concerned with
recognizing and interpreting human gestures using
mathematical algorithms.
These gestures can be identified from different body motions, which
typically come from the face or hands.
Types of Gestures –
▪ Online gestures: direct-manipulation gestures, processed in real time,
e.g. used to scale or rotate an object.
▪ Offline gestures: gestures that act as an input which is processed
after the user's interaction with the object.
Implementation
▪ LOADING THE DATASET
Raw data – images. These images are not fed directly to the training model.
We use Hu invariant moments for feature extraction.
Before calculating the Hu moments, the images must be thresholded and skin
segmented.
– Thresholding: converts the original image into a binary image of black and white pixels.
– Skin segmentation: done to recognize various skin tones.
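The two steps above can be sketched in code. This is a minimal illustration, not the project's actual pipeline: a real implementation would typically use OpenCV (`cv2.threshold`, `cv2.HuMoments`), and the skin-segmentation step is omitted here. The numpy version below computes the 7 Hu invariants directly from the standard central-moment formulas.

```python
import numpy as np

def threshold(gray, t=127):
    """Convert a grayscale image into a binary (0/1) image."""
    return (gray > t).astype(float)

def hu_moments(img):
    """Compute the 7 Hu invariant moments of a 2-D image array."""
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    cx = (xs * img).sum() / m00          # centroid x
    cy = (ys * img).sum() / m00          # centroid y

    def mu(p, q):                        # central moment mu_pq
        return (((xs - cx) ** p) * ((ys - cy) ** q) * img).sum()

    def eta(p, q):                       # scale-normalized moment eta_pq
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    h3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    h4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    h5 = ((n30 - 3 * n12) * (n30 + n12)
          * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          + (3 * n21 - n03) * (n21 + n03)
          * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    h6 = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
          + 4 * n11 * (n30 + n12) * (n21 + n03))
    h7 = ((3 * n21 - n03) * (n30 + n12)
          * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          - (n30 - 3 * n12) * (n21 + n03)
          * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    return np.array([h1, h2, h3, h4, h5, h6, h7])
```

Because the moments are centered on the image centroid and normalized by area, the resulting 7-element feature vector is invariant to translation and scale, which is what makes it suitable for describing a hand shape regardless of where it appears in the frame.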
Training The Model
▪ Training is a process in machine learning where a model is fed
(fitted) with data to learn from it.
▪ The learning algorithm tries to find patterns in the dataset, maps
the provided input to the target value, and then builds a
machine learning model from the patterns it has learnt.
▪ The algorithms used to train the model –
– K-means
– HMM
– Baum-Welch
K-Means
• K-means is a type of unsupervised learning.
• It is used to classify objects into different groups.
• Each data point is assigned to its nearest centroid.
• Each centroid is updated to the mean of all the
data points assigned to its cluster.
• The optimum number of clusters k is found at the elbow point.
• 30 clusters were used to preprocess the data in this project.
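The assign/update loop described above can be sketched in a few lines of numpy. This is an illustrative implementation, not the project's code (which would use k = 30 on the Hu-moment vectors):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal k-means: assign points to nearest centroid, then
    move each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]  # init from data
    for _ in range(iters):
        # distance of every point to every centroid, shape (n, k)
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # update step: mean of each cluster (keep old centroid if empty)
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):   # converged
            break
        centroids = new
    return labels, centroids
```

In this project the cluster index plays the role of a discrete observation symbol: each 7-dimensional Hu-moment vector is replaced by the index of its nearest centroid, giving the symbol sequences the HMM is trained on.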
HMM
▪ HMM is a type of statistical Markov model.
▪ The system being modeled is assumed to be a Markov process with
unobservable (hidden) states.
Baum-Welch
▪ Central problems with a Hidden Markov Model:
1. Evaluation problem: compute the probability of an observation
sequence given the model.
2. Learning problem: estimate the transition (aij) and emission (bjk)
probabilities using the training sequences.
3. Decoding problem: find the most likely hidden state sequence
for a given observation sequence.
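The decoding problem is classically solved with the Viterbi algorithm. The sketch below is illustrative and not part of the project's described code; it takes the initial distribution `pi`, transition matrix `A`, and emission matrix `B`, and returns the most likely hidden state path:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden state sequence for an observation sequence."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))             # best path probability ending in i
    psi = np.zeros((T, N), dtype=int)    # backpointers
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A   # (N, N): from state i to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # backtrack from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 1 - 1, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]
```

Unlike the forward algorithm, which sums over all paths, Viterbi maximizes over them, so it recovers one concrete explanation of the observations rather than their total probability.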
Cont.
▪ Forward Algorithm:
we use the probability computed at the current time step to
derive the probability of the next time step, which makes it
computationally efficient: O(N²·T).
▪ Backward Algorithm:
the time-reversed version; it computes the
probability that the machine is in hidden state si at time
step t and will generate the remaining part of the sequence of
visible symbols V^T.
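The forward recursion above is short enough to show directly. This is a minimal sketch (no scaling for long sequences), not the project's actual code:

```python
import numpy as np

def forward_likelihood(obs, pi, A, B):
    """P(obs | model) via the forward recursion, O(N^2 * T).

    pi : (N,)   initial state distribution
    A  : (N, N) transition probabilities a_ij
    B  : (N, M) emission probabilities b_jk
    """
    alpha = pi * B[:, obs[0]]            # alpha_1(i) = pi_i * b_i(o_1)
    for o in obs[1:]:
        # reuse the previous time step instead of re-summing all paths
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()                   # sum over final hidden states
```

Each step costs O(N²) (one matrix-vector product), giving O(N²·T) overall, versus the O(N^T) cost of naively enumerating every hidden state path.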
Cont.
▪ The Baum-Welch algorithm is a special case of the Expectation-
Maximization (EM) algorithm.
▪ Steps:
1. Start with initial probability estimates [A, B]: set equal
probabilities, or initialize them randomly.
2. Compute the expectation of how often each transition/emission has been
used, i.e. estimate the latent variables [ξ, γ] (the usual approach
in an EM algorithm).
3. Re-estimate the probabilities [A, B] from those estimates (latent
variables).
4. Repeat until convergence.
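The four steps above can be sketched as a single training loop. This is an unscaled, illustrative version (a library such as `hmmlearn` would be used in practice), with `gamma` and `xi` playing the roles of γ and ξ:

```python
import numpy as np

def baum_welch(obs, N, M, iters=20, seed=0):
    """Fit an N-state HMM with M discrete symbols to one sequence."""
    obs = np.asarray(obs)
    rng = np.random.default_rng(seed)
    # Step 1: random initial estimates, rows normalized to sum to 1
    A = rng.random((N, N)); A /= A.sum(1, keepdims=True)
    B = rng.random((N, M)); B /= B.sum(1, keepdims=True)
    pi = np.full(N, 1 / N)
    T = len(obs)
    for _ in range(iters):
        # Step 2 (E-step): forward (alpha) and backward (beta) passes
        alpha = np.zeros((T, N)); beta = np.zeros((T, N))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        like = alpha[-1].sum()
        gamma = alpha * beta / like                    # gamma[t, i]
        xi = (alpha[:-1, :, None] * A[None]            # xi[t, i, j]
              * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / like
        # Step 3 (M-step): re-estimate pi, A, B from expected counts
        pi = gamma[0]
        A = xi.sum(0) / gamma[:-1].sum(0)[:, None]
        B = np.zeros_like(B)
        for k in range(M):
            B[:, k] = gamma[obs == k].sum(0)
        B /= gamma.sum(0)[:, None]
    # Step 4: in practice, iterate until the likelihood stops improving
    return pi, A, B
```

Each iteration is guaranteed not to decrease the likelihood of the training sequence, which is the standard EM convergence property.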
Further Steps
▪ Getting live images from the camera.
▪ Thresholding and skin segmentation.
▪ Feature extraction using the same process of calculating
Hu moments.
▪ Predicting the gesture.
▪ Performing the task mapped to the gesture.
Implementation
▪ Thresholded result of a hand gesture.
▪ After performing feature extraction using Hu invariant moments, we
store the 7 characteristics of each image in text files.
Cont.
▪ Using the dataset, we classify the gesture values with the
k-means classifier; all similar Hu invariant moments are
clustered together.
▪ Then, using the HMM algorithm, we predict a sequence of unknown
variables from a set of observed variables.
▪ Once the model is trained, we give it live input (gestures) from the
webcam.
▪ Each gesture is mapped to one or more applications; when a
gesture is recognized, the corresponding task is performed.
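The gesture-to-task mapping can be as simple as a dispatch table. The action names below are placeholders for illustration; the real application would send the corresponding key press or command to the target program (e.g. PowerPoint):

```python
# Dispatch table: recognized gesture label -> action callback.
# The string results stand in for real side effects such as key presses.
actions = {
    "ok": lambda: "previous_slide",
    "peace": lambda: "next_slide",
}

def perform(gesture):
    """Run the action mapped to a recognized gesture, if any."""
    action = actions.get(gesture)
    return action() if action else None
```

Keeping the mapping in a table like this makes it easy to add new gestures later: a new entry is one line, with no change to the recognition code.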
Results
▪ The gesture called "ok" is used to
visit the previous slide in the
Microsoft PowerPoint tool.
▪ The "peace" gesture is used in various
applications, e.g. the camera and
moving to the next slide in
PowerPoint.
Future Enhancement
▪ Including more gestures.
▪ Defining different loops within an
application to perform different
tasks within that application.
▪ A gesture-to-speech or gesture-to-text
system to help physically challenged
people: sign language could be
converted directly to text.
▪ Connecting to a mobile device that
can perform actions, to make the
system more usable.
▪ Improving the system so that it can be
used against any kind of background.
Problems Faced
▪ Non-availability of a ready-made Hu
invariant dataset.
▪ Replication of values due to
execution/update latency.
▪ The cache of previous data
could not be cleared
instantly due to the
limited speed of the
system.
Conclusion
▪ Eliminated the dependency on colour markers.
▪ Able to recognize hand gestures successfully using
the mentioned algorithm (HMM), with an accuracy of about
60%.
▪ Hand gestures can now easily be used to perform
various tasks without any coloured markers.
▪ It becomes easy to integrate the physical world with the
digital world.