Leap Motion and Microsoft Kinect Hand Gesture Recognition
Josiah Bailey, Julie Kunnumpurath, Ryan Malon, Joe Nicosia, James Zwilling
State University of New York at Binghamton
Abstract
This paper proposes new features that can be extracted
from the Leap Motion and Microsoft Kinect for use in
recognizing hand gestures. The Leap Motion
(https://www.leapmotion.com) provides 3D hand
information and the Microsoft Kinect
(https://developer.microsoft.com/en-us/windows/kinect)
provides depth information; the combination of the two
allows us to extract a large amount of diverse
information about the hand's shape, enabling more
accurate gesture recognition than is possible with either
camera individually. Using a database of 10 distinct
American Sign Language gestures provided by Marin et
al. [1,2], our new features allow us to achieve a high
recognition accuracy.
1. Introduction
Interest in hand gesture recognition has continued to
grow in recent years as demand for it increases in areas
such as video game development and sign language
translation for human-computer interaction (HCI).
Advances in 3D imaging and Time-of-Flight cameras, as
well as the availability of new devices such as the Leap
Motion and the Microsoft Kinect, have brought hand
gesture recognition within reach of the consumer
market. The Microsoft Kinect provides both an RGB
image and a depth image from which hand information
can be extracted, while the Leap Motion, created
specifically for hand gesture recognition, returns
coordinate information about the hand. Many different
approaches have been taken to distinguish gestures,
such as topological features of holes in the image [3]
and rapid recognition that identifies a gesture before it
is completed [4]. Superpixel Earth Mover's Distance,
which is widely used in image retrieval and pattern
recognition, is another approach that has been applied
to gesture recognition [5,6]. In this paper, we utilize
data from both a Microsoft Kinect and a Leap Motion
to obtain a more complete image of the hand, making it
possible to correctly identify the hand gesture, as
shown in Figure 1.
The Leap Motion has improved significantly over the
last few years and now provides more information than
was previously available. In our approach, we made use
of these new capabilities to improve the accuracy of our
recognition. One feature that is now available from the
Leap is extended fingers: the Leap can detect which
fingers are extended with very high accuracy. This
allows us to distinguish between gestures such as the
ones seen in Figure 1. The database we selected was
created using the older version of the Leap Motion
software, which did not provide the extended finger
feature. To see how this feature improves recognition
accuracy, we manually entered the extended fingers into
the data. Section 2.1 discusses our other new Leap
features: maximum X and Y value, average area, and
X-Y ratio.
The Microsoft Kinect uses a Time-of-Flight camera
to create a grayscale depth image, which makes the hand
easy to distinguish from the background. To find the
hand in the image, we assume the hand is always the
closest object to the Kinect and remove anything behind
it. Other approaches use blob detection to locate the
hand with good results [7]. After finding the hand, we
extract features from the image in order to gather data
about the hand's shape. The features we extract are
Silhouette, Convex Hull, Cell Occupancy Average
Depth, Cell Occupancy Non-Zero, Distance Contour,
Fingertip Distance, and Fingertip Angles. We then run
the data through the Random Forest classifier, which
interprets the data and attempts to correctly identify the
gesture. There has already been much work on using the
Microsoft Kinect for hand gesture recognition, such as
that by researchers at the universities of Padova and
Moscow [1,8]. We have used features inspired by these
as well as new ones of our own to maximize our
recognition rate. The dataset of gestures that we have
used includes 10 distinct hand gestures, which can be
seen in Figure 2.
Fig. 1 Kinect and Leap setup
2. Hand Gesture Features
2.1 Leap Features
The Leap Motion provides information about the hand
that can be accessed through the Leap SDK. Using the
SDK, we are able to extract the fingertip positions, palm
center, extended fingers, and various other points of
interest on the hand. From this information, we
calculate our features.
2.1.1 Scale Factor and Number of Fingers
In order to account for different sized hands, we use a
scale factor. We average the x, y and z values of all
extended fingertips to create a new point, then find the
distance between that averaged point and the center of
the palm. This distance is the scale factor. The number
of fingers is provided by the Leap and is the total
number of extended fingers it detects.
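A minimal Python sketch of this computation is given below; the
fingertip and palm coordinates are hypothetical values standing in
for what the Leap SDK reports.

import numpy as np

def scale_factor(extended_tips, palm_center):
    # Average the extended fingertip positions, then measure the distance
    # from that averaged point to the palm center.
    mean_tip = np.mean(np.asarray(extended_tips, dtype=float), axis=0)
    return float(np.linalg.norm(mean_tip - np.asarray(palm_center, dtype=float)))

# Example with three extended fingers (illustrative coordinates in mm).
tips = [[-20.0, 180.0, 10.0], [0.0, 200.0, 5.0], [25.0, 185.0, 12.0]]
palm = [2.0, 120.0, 15.0]
print(scale_factor(tips, palm))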
2.1.2 Extended Finger Logic Vector
We create a 5-bit vector where the most significant bit
(MSB) represents the thumb and the least significant bit
(LSB) represents the pinky; the remaining fingers follow
this pattern in order. The Leap tells us which fingers are
extended, and the corresponding bits are set to 1. This
feature provides a simple way to differentiate between
two gestures that have the same number of extended
fingers.
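As a sketch, the vector can be packed into a single integer; the
finger names and boolean flags are assumed inputs rather than the
Leap SDK's actual field names.

# Thumb is the MSB, pinky the LSB.
FINGER_ORDER = ["thumb", "index", "middle", "ring", "pinky"]

def finger_logic_vector(extended):
    # `extended` maps finger name -> True if the Leap reports it extended.
    bits = 0
    for finger in FINGER_ORDER:
        bits = (bits << 1) | int(bool(extended.get(finger, False)))
    return bits

# Example: index and middle extended -> binary 01100 -> 12
print(finger_logic_vector({"index": True, "middle": True}))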
2.1.3 Extended Finger Area
We calculate the area of the triangle formed by the two
fingertips that are furthest apart and the center of the
palm, as shown in Figure 3. This area is divided by the
number of extended fingers, which helps differentiate
between gestures 7 and 8 as well as other cases where
the area alone would be the same for two distinct
gestures. The area is then divided by our scale factor.
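A sketch of this area feature, assuming the extended fingertip
positions and palm center are 3D points already obtained from the
Leap:

import numpy as np
from itertools import combinations

def extended_finger_area(extended_tips, palm_center, scale):
    # Requires at least two extended fingertips.
    tips = np.asarray(extended_tips, dtype=float)
    palm = np.asarray(palm_center, dtype=float)
    # Pick the two fingertips that are furthest apart.
    a, b = max(combinations(tips, 2),
               key=lambda pair: np.linalg.norm(pair[0] - pair[1]))
    # Triangle area from the cross product of the two edge vectors.
    area = 0.5 * np.linalg.norm(np.cross(a - palm, b - palm))
    return area / len(tips) / scale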
2.1.4 Max X and Y Value
We use the maximum x and y values with respect to the
palm center. These values are divided by the scale
factor.
2.1.5 Length Width Ratio
Based on the work of Ding et al. [9], we calculate the
width as the distance between the two fingertips that
are furthest apart. The length is the greatest distance
between a fingertip and the center of the palm, as seen
in Figure 3. The length is divided by the width to obtain
the ratio, which is then divided by our scale factor.
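A corresponding sketch, under the same assumptions about the input
points:

import numpy as np
from itertools import combinations

def length_width_ratio(extended_tips, palm_center, scale):
    tips = np.asarray(extended_tips, dtype=float)
    palm = np.asarray(palm_center, dtype=float)
    # Width: largest fingertip-to-fingertip distance.
    width = max(np.linalg.norm(a - b) for a, b in combinations(tips, 2))
    # Length: largest fingertip-to-palm distance.
    length = max(np.linalg.norm(t - palm) for t in tips)
    return (length / width) / scale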
2.1.6 Fingertip Distances
We take the fingertip distances the Leap provides and
divide each by our scale factor, giving distances
normalized to the size of the hand.
2.1.7 Fingertip Directions
This is another feature that the Leap Motion calculates
directly: a vector of floats for each fingertip denoting
the direction it is pointing. It is provided only by the
new Leap SDK, so we use it only for the dataset that
we created.
2.2 Kinect Features
Starting with the depth image returned by the Microsoft
Kinect, we use the assumption that the hand is the
closest object to the camera. By applying a depth
threshold, we are able to cut out most of the image's
background. The wrist is then found by determining the
place where the width of the hand image decreases at
the highest rate. After removing everything below the
wrist, the image is scaled so the hand is a uniform size.
Once an image of just the hand is obtained, we calculate
some information that is used in several of our features,
among them the contours of the hand image and the
palm center, which we find by performing a Gaussian
blur on the image and then taking the brightest point
[10].
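The following Python/OpenCV sketch illustrates this preprocessing;
the depth margin and blur kernel size are illustrative choices, not
the values used in this work.

import cv2
import numpy as np

def segment_hand_and_palm(depth_img, depth_margin=40):
    # Assume the hand is the closest object: keep only pixels within a
    # small margin of the minimum non-zero depth value.
    closest = depth_img[depth_img > 0].min()
    keep = (depth_img > 0) & (depth_img <= closest + depth_margin)
    hand = np.where(keep, depth_img, 0)

    # Palm center: blur the binary hand silhouette heavily and take the
    # brightest point of the blurred image.
    mask = (hand > 0).astype(np.uint8) * 255
    blurred = cv2.GaussianBlur(mask, (51, 51), 0)
    _, _, _, palm_center = cv2.minMaxLoc(blurred)   # (x, y) of the maximum
    return hand, mask, palm_center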
Fig. 2 Dataset Gestures
Fig. 3 a) Extended Finger Area and b) Length Width Ratio
Fig. 4 a) Silhouette and b) Convex Hull
2.2.1 Silhouette
Based on the work of Kurakin et al. [8], this feature
involves dividing the image into 32 equal radial sections,
as seen in Figure 4. We calculate the distance from the
center of the image to the contour of the hand in each
radial section. The total distance in each section is then
averaged over the number of contour points that fall in
that section. This helps distinguish between fingers that
are together and fingers that are separated.
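A sketch of this feature, assuming the hand contour is available as
(x, y) points and the center is the image or palm center:

import numpy as np

def silhouette_feature(contour_pts, center, n_sections=32):
    pts = np.asarray(contour_pts, dtype=float)
    offsets = pts - np.asarray(center, dtype=float)
    dist = np.linalg.norm(offsets, axis=1)
    # Assign each contour point to one of 32 angular sections.
    angle = np.arctan2(offsets[:, 1], offsets[:, 0])        # (-pi, pi]
    section = ((angle + np.pi) / (2 * np.pi) * n_sections).astype(int)
    section = np.clip(section, 0, n_sections - 1)
    feature = np.zeros(n_sections)
    for s in range(n_sections):
        in_section = dist[section == s]
        if in_section.size:
            # Average distance over the contour points in this section.
            feature[s] = in_section.mean()
    return feature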
2.2.2 Convex Hull
This feature calculates the area of the black space
around the hand. By measuring the space in between the
fingers, as shown in Figure 4, it can indicate how many
fingers are up and distinguish between fingers being
together or apart.
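A sketch using OpenCV (assuming OpenCV 4's findContours signature),
where the black space is taken as the difference between the hull
area and the hand contour area:

import cv2

def convex_hull_feature(mask):
    # `mask` is the binary hand silhouette from the depth preprocessing.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    hand = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(hand)
    # Area between the hull and the hand, i.e. the gaps between fingers.
    return cv2.contourArea(hull) - cv2.contourArea(hand)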
2.2.3 Cell Occupancy Average Depth
Based on the work of Kurakin et al. [8], this feature
involves splitting the image into a predetermined
number of squares of equal size; we used 64 squares in
our research. We then find the average depth in each
cell using the grayscale value of each pixel in the hand.
2.2.4 Cell Occupancy Non-Zero
Based on the work of Kurakin et al. [8] and using the
same grid as the feature above, the number of occupied
pixels in each cell is also saved as a separate feature.
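Both grid features can be sketched together; an 8 x 8 grid gives the
64 cells mentioned above, and the hand image is assumed to be a
grayscale depth crop with background pixels set to zero.

import numpy as np

def cell_occupancy_features(hand, grid=8):
    h, w = hand.shape
    avg_depth = np.zeros(grid * grid)
    non_zero = np.zeros(grid * grid)
    for i in range(grid):
        for j in range(grid):
            cell = hand[i * h // grid:(i + 1) * h // grid,
                        j * w // grid:(j + 1) * w // grid]
            occupied = cell[cell > 0]
            idx = i * grid + j
            non_zero[idx] = occupied.size        # Cell Occupancy Non-Zero
            avg_depth[idx] = occupied.mean() if occupied.size else 0.0
    return avg_depth, non_zero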
2.2.5 Distance Contour
The distance from the palm center to each point on the
contour of the hand can help to find local maxima and
minima, and distinguish between gestures with different
numbers of raised fingers.
2.2.6 Fingertip Distance
Fingertips are located by examining changes in the
distance from points along the contour to the palm
center; the distance from the palm center to each
fingertip is then recorded. This feature can reveal
differences between gestures such as 6 and 10, where
the difference is a single raised finger.
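A rough sketch of locating fingertips as local maxima of the
palm-to-contour distance profile; the window size and threshold are
illustrative, and flat peaks would need merging in practice.

import numpy as np

def fingertip_distances(contour_pts, palm_center, window=15):
    pts = np.asarray(contour_pts, dtype=float)
    dist = np.linalg.norm(pts - np.asarray(palm_center, dtype=float), axis=1)
    tips = []
    for k in range(len(dist)):
        neighborhood = dist[max(0, k - window):k + window + 1]
        # A fingertip candidate is a local maximum that is clearly farther
        # from the palm than the average contour point.
        if dist[k] == neighborhood.max() and dist[k] > 1.2 * dist.mean():
            tips.append(dist[k])
    return tips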
2.2.7 Fingertip Angles
This feature calculates the angles between the palm
center and the fingertips found by the Fingertip
Distance feature.
2.3 Feature Vector
The feature extraction provides us with eight feature
vectors, one from the Leap and seven from the Kinect.
Only the fingertip distances for the Leap needed to be
stored in a vector; the other Leap features are single
values. Each of the Kinect features is stored in a
separate feature vector, as shown in Figure 5. All the
Leap features and Kinect features are then grouped into
two separate sets. These feature vectors can be tested
alone or combined with different features when running
the Random Forest test.
Fig. 5 Feature Vectors
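A sketch of the classification step with scikit-learn's
RandomForestClassifier; the arrays, sizes, and hyperparameters below
are placeholders, not the actual dataset or settings.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_leap = rng.random((140, 12))        # placeholder Leap feature vectors
X_kinect = rng.random((140, 300))     # placeholder Kinect feature vectors
y = rng.integers(0, 10, 140)          # placeholder gesture labels

# The "combined" test simply concatenates the Leap and Kinect vectors.
X = np.hstack([X_leap, X_kinect])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))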
3. Results
Testing of our features was done on the dataset of Marin
et al. [1,2], which contains ten gestures performed by
fourteen individuals, with data samples of each gesture
from each person. We used both SVM and Random
Forest classifiers to measure the accuracy of our gesture
recognition. Table 1 shows our results compared to the
state of the art, and Table 2 shows the results of the
Random Forest test.
SVM            Leap      Kinect    Combined
Marin et al.   81.5%     96.35%    96.5%
Our Results    68.00%    95.36%    95.50%
Table 1. SVM results
The highest accuracy we got with the SVM using just
the Leap features was 68%; number of fingers, max X,
and area yielded this result. While running these tests,
we were unable to use the extended fingers feature or
the fingertip directions, because the database
was made using an older version of the Leap Motion
SDK. The newer Leap Motion SDK provides that data,
and when included, this data is very useful for gesture
recognition. For the same reason, fingertip distances
were not included in this calculation either. When the
extended fingers were entered manually, fingertip
distances alone achieved 90.36% accuracy and extended
fingers alone achieved 100%. The highest accuracy we got
with the Kinect was 95.36% on the SVM test. The
features that yielded this were convex hull, cell
occupancy average depth, and fingertip angles.
The highest recognition rate using a combination of
both Kinect and Leap features was 95.50% on the SVM.
The combination of number of fingers, ratio, max X and
Y, convex hull, cell occupancy average depth, and
fingertip angles produced that result. This set once again
excludes the Leap fingertip distances, extended fingers,
and fingertip directions.
Random Forest   Leap      Kinect    Combined
Our Results     81.71%    95.21%    96.07%
Table 2. Random Forest results
Using the Random Forest test, we got significantly
improved results. Similar to the SVM tests, the same
Leap features as previously mentioned were excluded
from the calculations. The highest results using only
Leap or Kinect features gave us 81.71% and 95.21%,
respectively. Combining ratio, max X, max Y, and area
created the best combination of Leap features.
Combining convex hull, cell occupancy average depth,
and fingertip distances created the best performing
Kinect feature set. The highest combined accuracy of
96.07% was obtained with multiple feature sets. One
combination included ratio, max X and Y, area, convex
hull, cell occupancy average depth, and fingertip angles.
The other combination was similar but contained
number of fingers and fingertip distance instead of max
Y and fingertip angles.
In addition to getting a higher recognition rate with
the Random Forest test, this test has a lower
computation time. When running all the Leap and
Kinect features, the SVM took 36 seconds to run. The
Random Forest test only took 8 seconds. The significant
decrease in computation time is important for real-time
gesture recognition.
Figure 6. ASL Dataset
We also created our own dataset to test the
recognition rate of our features. This dataset contains
the entire American Sign Language alphabet, as shown
in Figure 6, excluding the letters J and Z. The dataset
has seven individuals performing each gesture ten times.
It was made using the Leap Motion Desktop V2, which
provides more information, including extended fingers
and fingertip directions. This means that the Leap
extended fingers, fingertip directions, and fingertip
distances can all be used in our accuracy calculations.
Utilizing all three of these, in addition to the rest of our
features, increased our recognition rate as well, as seen
in Table 3.
ASL Dataset Results
                Leap      Kinect    Combined
Random Forest   98.60%    87.74%    98.75%
SVM             80.83%    84.23%    91.73%
Table 3. Recognition rate with ASL dataset
Using the SVM, the combined results had a
recognition rate of 91.73% with a feature set consisting
of extended fingers, ratio, max X and Y, area, Leap
fingertip distances, and cell occupancy average depth.
For the Leap by itself, the recognition rate was 80.83%
with max X and Y along with fingertip direction. The
Kinect was 84.23% with just cell occupancy average
depth. The Random Forest results were even better. The
combined rate was 98.75% with Leap fingertip
distances, fingertip direction, convex hull, fingertip
angles, and Kinect fingertip distances. The Leap alone
reached a recognition rate of 98.6% with all the Leap
features excluding max Y. The Kinect had a recognition
rate of 87.74% with convex hull, cell occupancy average
depth, fingertip angles, and fingertip distances.
4. Discussions
The popularity of virtual reality continues to grow, and
the addition of hand gesture recognition would only
further this already burgeoning field. Advances in
technology have allowed highly accurate collection of
data from everyday human interactions. Computers can
now process motion as input, instead of static symbols.
Implementing hand gesture recognition can help
broaden fields such as virtual reality and interactive
simulations, as well as help the hearing impaired.
The results we calculated with the new dataset
demonstrate how improvements in the Leap Motion
technology available for gesture recognition improve
accuracy, as the device can now produce more precise
information. Data such as extended fingers and finger
direction helped distinguish between similar gestures
that would have been undifferentiated using older
technology. As interest in this field continues to
increase, so will the capabilities of technologies such as
the Leap.
One reason the Kinect features did not perform as
well on the ASL dataset is the similar shape of many of
the gestures. As seen in Figure 6, many of the gestures
look very much alike, which makes it more difficult to
distinguish between gestures such as M and N, as seen
in Figure 7.
Figure 7. ASL Random Forest confusion matrix
Using Random Forest instead of an SVM also gave us
a higher recognition rate. Random Forest handles a
large set of features much better than an SVM. Because
the training model deals with well over 300 features,
since some of our features are vectors with multiple
values, the SVM requires much more time to train. In
addition to the higher accuracy provided by Random
Forest, it was also generally much more time efficient
than the SVM.
5. Conclusions
In this paper, we propose new features that can be
extracted from the Leap Motion and Microsoft Kinect
for use in gesture recognition. Using the dataset of
Marin et al. [1,2], we produced promising results in
gesture recognition, especially from the Kinect. With a
more complex dataset consisting of the ASL alphabet,
we had access to the additional data the Leap Motion
provides and observed a significant increase in
accuracy. The combination of our Kinect and Leap
features, while not perfect, correctly recognizes gestures
with high accuracy.
Gestures like J and Z could not be tested with static
images because in ASL these gestures require motion.
Future work is needed on real-time hand tracking to
correctly identify gestures in motion.
References
[1] G. Marin, F. Dominio, and P. Zanuttigh, "Hand Gesture
Recognition with Jointly Calibrated Leap Motion and Depth
Sensor," in Multimedia Tools and Applications, 2014.
[2] G. Marin et al., "Hand Gesture Recognition with Jointly
Calibrated Leap Motion and Depth Sensor," Multimedia Tools
and Applications, pp. 1-25, 2014.
[3] K. Hu and L. Yin, "Multi-Scale Topological Features for
Hand Posture Representation and Analysis," in International
Conference on Computer Vision, 2013.
[4] Y. Chen, Z. Ding, Y.-L. Chen, and X. Wu, "Rapid
Recognition of Dynamic Hand Gestures using Leap Motion,"
in IEEE International Conference on Information and
Automation, 2015.
[5] C. Wang, Z. Liu, and S.-C. Chan, "Superpixel-Based Hand
Gesture Recognition with Kinect Depth Camera," IEEE
Transactions on Multimedia, vol. 17, no. 1, 2015.
[6] Z. Ren, J. Yuan, J. Meng, and Z. Zhang, "Robust Part-
Based Hand Gesture Recognition using Kinect Sensor," IEEE
Transactions on Multimedia, vol. 15, no. 5, 2013.
[7] X. Liu and K. Fujimura, "Hand Gesture Recognition using
Depth Data," in 6th IEEE International Conference on
Automatic Face and Gesture Recognition, 2004.
[8] A. Kurakin, Z. Zhang, and Z. Liu, "A Real-Time System
for Dynamic Hand Gesture Recognition with a Depth Sensor,"
in Proc. of EUSIPCO, 2012.
[9] Z. Ding, Z. Zhang, Y. Chen, Y.-L. Chen, and X. Wu, "A
Real-Time Dynamic Gesture Recognition Based on 3D
Trajectories in Distinguishing Similar Gestures," in IEEE
International Conference on Information and Automation,
2015.
[10] F. Dominio, M. Donadeo, and P. Zanuttigh, "Combining
Multiple Depth-based Descriptors for Hand Gesture
Recognition," Pattern Recognition Letters, 2013.