This paper proposes combining data from the Leap Motion and Microsoft Kinect to more accurately recognize hand gestures. New features are extracted from each device and combined into a feature vector, including extended finger detection, fingertip positions and angles, and measurements of the hand shape. These features are used to train a random forest classifier on a dataset of 10 American Sign Language gestures. The results show improved recognition accuracy over using either device alone.
The document describes a project to develop a tabletop touchscreen interface using a Kinect depth sensor. The researchers designed and built a table setup with a projected screen and mounted Kinect. They tested the Kinect's ability to track finger touches and gestures through programs that allowed for coloring, puzzles, zooming and image swiping. The Kinect was able to accurately detect and follow finger motion. This demonstrated the viability of using a depth sensor for a multi-user touch interface and suggested advantages over other touchscreen technologies.
This project develops a natural user interface for interacting with 3D environments using the Microsoft Kinect. Two Kinect devices are placed in a virtual reality space to track a user's full body movements and gestures. The Kinect data is used to create a digital avatar that represents the user's position and allows directly interacting with virtual objects by reaching out. Gesture recognition is also implemented to provide additional controls for navigation and selection. The goal is to make interacting with complex 3D data more intuitive by mirroring natural physical interactions.
This document provides information about 3D television and film production, including how to capture and display 3D images and some common issues that can arise. It discusses several methods for displaying 3D including anaglyph (using colored filters), shutter glasses, and circular polarization. It also covers basic 3D concepts like parallax and techniques for controlling depth cues and camera setup to optimize the 3D effect.
This is a basic introduction to the Kinect v1 and Processing from 2014. The practice code is not included in these slides; they cover only the concepts, to help you understand how to work with the Kinect from Processing.
Complex Weld Seam Detection Using Computer Vision - glenn_silvers
This document discusses a project to use computer vision and a Microsoft Kinect sensor to enable real-time gesture control of a welding robot. The project aims to detect and track a user's hand gestures to control robot movement, and to define the weld seam region of interest to allow for seam detection. The plan involves accessing Kinect data, detecting and tracking the hand in 3D space, recognizing gestures for robot movement commands, extracting color values from the hand for skin detection, and using the hand position to define the seam region of interest. The work so far has successfully defined the hand and fingers, tracked hand motion, and extracted the seam region. Further work is needed to finalize the gesture commands and integrate control of the robot.
Goal location prediction based on deep learning using RGB-D camera - journalBEEI
In a navigation system, the desired destination position plays an essential role, since path planning algorithms take the current location, the goal location, and a map of the surrounding environment as inputs. The generated path then guides the user to the final destination. This paper presents an algorithm based on an RGB-D camera to predict goal coordinates in a 2D occupancy grid map for a navigation system for visually impaired people. In recent years, deep learning methods have been used in many object detection tasks, so an object detector based on a convolutional neural network is adopted in the proposed algorithm. The distance between the sensor's current position and the detected object is measured from the depth data acquired by the RGB-D camera. The detected object's coordinates and the depth data are integrated to obtain an accurate goal location in the 2D map. The proposed algorithm has been tested in various real-time scenarios, and the experimental results indicate its effectiveness.
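As a rough illustration of the projection step this summary describes, the sketch below back-projects a detected object's pixel column and depth reading into 2D map coordinates using a pinhole camera model; the intrinsics and grid resolution are placeholder values, not taken from the paper.

```python
import numpy as np

def pixel_depth_to_grid(u, depth_m, fx=525.0, cx=319.5, cell_size=0.05):
    """Back-project a pixel column u and its depth (metres) into 2D map
    coordinates, then quantize to occupancy-grid cells.
    fx/cx are placeholder RGB-D intrinsics; cell_size is metres per cell."""
    x = (u - cx) * depth_m / fx   # lateral offset from the camera axis
    z = depth_m                   # forward distance along the camera axis
    return int(round(x / cell_size)), int(round(z / cell_size))

# e.g. an object detected at pixel column 400, measured 2.3 m away
print(pixel_depth_to_grid(400, 2.3))
```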
Digital image forgery can be categorized into three main types: image retouching, image splicing, and copy-move attack. Image retouching makes minor enhancements without significantly altering the image. Image splicing combines two or more images to create a composite fake image. Copy-move attack modifies an image by copying and moving a region within the same original image, such as duplicating smoke to conceal details or tamper with the image. Effective and low-cost ways to help secure images and prevent misuse include adding copyright text to images, optimizing image size and compression, slicing images, using mouseover image swaps, and setting images as table backgrounds with transparent GIFs.
Forgery in digital images manipulates the image to conceal meaningful or useful information, and in many cases it can be very difficult to distinguish the edited region from the original. To maintain the integrity and authenticity of images, forgery detection is necessary. Modern lifestyles and advanced photography equipment have made tampering with digital images easy with the help of image-editing software, so it is important to detect such tampering operations. Different methods in the literature divide the suspicious image into overlapping blocks and extract features to detect the type of forgery present. Forgery detection can be based on object removal, object addition, or unusual color modifications in the image. Many existing techniques address this problem, but most have significant limitations. Images are one of the most powerful media for communication. This paper presents a survey of different types of forgery and of digital image forgery detection.
Hand Gesture Recognition Based on Shape Parameters - Nithinkumar P
Hi guys,
I am sharing a new link for the code & project report. Hope it helps you in your academics. Contact me if you need any help.
https://drive.google.com/drive/folders/1H0p852jfoyQuFig_IoMyVVK-U5o18Mxh?usp=sharing
A real-time system for hand gesture recognition based on detecting meaningful shape-based features such as orientation, centre of mass (centroid), the raised or folded status of the fingers and thumb, and their respective locations in the image (a small illustration follows below).
The algorithm is implemented in MATLAB v7.10.
We use these hand gestures for:
1. Sign Language Recognition
2. Human Machine Interaction.
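The linked code is in MATLAB; as a language-neutral illustration of the shape features listed above, here is a minimal Python/OpenCV sketch (not the author's code) that computes the centroid and principal-axis orientation of a binary hand mask, assuming segmentation has already produced the mask.

```python
import cv2
import numpy as np

def shape_parameters(mask):
    """Centre of mass and orientation of a binary hand mask via image moments."""
    m = cv2.moments(mask, binaryImage=True)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]   # centroid
    # Principal-axis orientation from the second-order central moments.
    theta = 0.5 * np.arctan2(2 * m["mu11"], m["mu20"] - m["mu02"])
    return (cx, cy), np.degrees(theta)
```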
Vision based human computer interface using colour detection - eSAT Journals
Abstract: In this paper we present an approach to Human Computer Interaction (HCI) in which we control the actions associated with a mouse. Each mouse action is associated with a colour pointer. These colour pointers are acquired as input using a web camera and processed using a colour detection technique.
Keywords: Human Computer Interaction, Web Camera, Colour Detection, Colour Subtraction.
Detection Hand Motion on Virtual Reality Mathematics Game with Accelerometer ... - TELKOMNIKA JOURNAL
The Montessori method is a learning method that uses props, and one development of such props is to use games as a medium of learning, for example through Virtual Reality (VR) technology. Using VR, players are brought into a virtual world as if they were in the real world. The weakness of VR games is their limited interaction with the outside world: interaction uses only buttons and joysticks. In this paper we use a flex sensor and an accelerometer sensor to detect hand movements for a VR mathematics game. The result is that VR games become more interactive and engaging with hand motion.
The document describes a method for 3D gesture recognition using a Leap Motion controller. Data from over 100 users performing 12 predefined gestures was collected, totaling 1.5GB and 9,600 gesture instances. The gestures were represented as "motion images" by mapping 3D locations to pixels and projecting onto planes to create fixed-size representations. Deep belief nets and convolutional neural networks were used to extract features and classify the images. Future work includes incorporating hidden Markov models to segment continuous gestures and exploring recurrent neural networks.
Architecture for Locative Augmented Reality - Chinar Patil
This document summarizes a system for indoor augmented reality using locative technologies. The system uses GPS to locate rooms in a building and then displays augmented objects based on the device's orientation and markers in each room. Two methods are proposed - one that uses stored location data from GPS to identify rooms, and another that uses computer vision techniques to generate an indoor map and determine locations. The system is intended to allow anyone to augment rooms in a building by storing object and location data.
This document provides an overview of Kinect motion technology. It describes how Kinect uses an infrared sensor and camera to track a user's full-body motion and interpret gestures and voice commands to control applications without any additional input devices. Applications discussed include gaming, healthcare, virtual pianos, and using Kinect to control robots and provide gesture-based interactions in augmented reality. Advantages are noted as not requiring additional input devices and allowing for voice and facial recognition, while disadvantages include sensitivity to infrared light sources and not detecting certain materials well.
Final Year Project - Gesture Based Interaction and Image Processing - Sabnam Pandey, MBA
This document summarizes a student's final year project report on developing a gesture recognition system for browsing pictures. The student aims to implement algorithms for skin and contour detection of a user's hand in real-time images from a webcam. The report includes chapters on literature review of gesture recognition and image processing techniques, methodology using the waterfall model, requirements analysis and design diagrams, implementation details using OpenCV, and testing and evaluation of the project objectives and aims.
The Kinect sensor is an input device by Microsoft that uses cameras and microphones to track body movements and recognize gestures and voices. It consists of an RGB camera, depth sensor using infrared light, and 4-microphone array. The depth sensor uses structured light to measure distances by projecting a pattern and analyzing its distortion. Kinect can track up to 20 joints of the human body in real-time using skeletal tracking. It has applications in 3D scanning, sign language translation, augmented reality, robot control, and virtual fitting rooms due to its low-cost depth sensing capabilities.
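As a rough illustration of the triangulation behind the structured-light depth sensing described above: the shift (disparity) of the projected pattern is inversely proportional to distance. The focal length and baseline below are illustrative values, not the Kinect's actual calibration.

```python
def depth_from_disparity(disparity_px, focal_px=580.0, baseline_m=0.075):
    """Structured-light triangulation: depth = focal * baseline / disparity.
    Parameters are illustrative, not the Kinect's exact calibration."""
    return focal_px * baseline_m / disparity_px

print(depth_from_disparity(20.0))  # ~2.18 m for a 20-pixel pattern shift
```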
This document provides an overview of the Kinect sensor and Kinect for Windows SDK. It describes the Kinect sensor's capabilities including depth sensing, skeletal tracking, and speech recognition. It explains how the Kinect SDK allows accessing the sensor's data streams and provides APIs for tasks like skeletal tracking and speech recognition. The document also outlines the tools included in the SDK and provides code examples for initializing the sensor, accessing sensor data, and using speech recognition features.
This document is a winter project report submitted by Shantanu Bharadwaj on fingerprint recognition. The objective of the project was to verify identities by comparing white and black points extracted from input fingerprints to a database of fingerprints using edge detection and image processing in MATLAB. The report describes converting images to grayscale, introducing MATLAB, presenting code to extract points and compare fingerprints, and simulated results correctly matching and not matching fingerprints. It concludes the method effectively recognizes fingerprints and future work could improve speed, storage, and usability.
The document discusses different types of sensors used for 3D digitization, including passive and active vision techniques. It describes synchronization circuit-based dual photocells that improve measurement stability and repeatability. Position sensitive detectors are discussed that can measure the position of a light spot in one or two dimensions on a sensor surface to acquire high-resolution 3D images. A proposed sensor architecture combines color and range sensing for applications like hand-held 3D cameras.
Development of Real-World Sensor Optimal Placement Support Software (AsianCHI2... - sugiuralab
This document describes software developed to optimize the placement of real-world sensors for machine learning applications. The software allows virtually placing different numbers of sensors and calculating identification rates to determine the optimal sensor configuration. It was tested on a facial expression identification task using distance sensors on eyeglasses. The optimal 9-sensor placement identified in software achieved an 85% identification rate when tested with real-world time-of-flight sensors, demonstrating its ability to support sensor layout optimization for machine learning systems.
11.biometric data security using recursive visual cryptography - Alexander Decker
This document summarizes a research paper on using recursive visual cryptography and biometric authentication to securely store biometric data. The paper proposes a scheme where secrets can be recursively embedded within image shares created by visual cryptography. Additionally, biometric authentication is used to securely access the shares. The scheme involves creating shares of secrets, embedding those shares as additional secrets within other shares, and authenticating users through iris recognition before revealing embedded secrets. This allows for multiple secrets to be hidden and revealed securely through the visual cryptography and biometric authentication methods combined.
This document is a seminar report on 3D televisions that was submitted for a bachelor's degree. It contains an introduction to 3D TVs and their expected role as the next revolution in television history. It then covers various topics related to 3D TV technologies in six chapters, including the basics of depth perception, stereoscopic imaging, holographic displays, system architectures, acquisition methods, and 3D display technologies. Figures and references are also included at the end.
The document summarizes a voice-controlled robot called Home Butler that is designed to assist handicapped individuals. The robot takes voice commands, locates requested objects using image processing and a database, navigates to the object using SLAM and LIDAR sensors, identifies the object with camera vision and image matching, grabs the object using sensors and motors, and returns it to the user by retracing its path. The robot integrates LabVIEW for voice decoding, MATLAB for image processing, and a Raspberry Pi operating system to run the integrated software and databases.
Estimation of body mass distribution using projective image - Suhas Deshpande
This document discusses using image processing techniques to estimate body mass distribution and assist in obesity diagnosis. It presents an overview of objectives, definitions of obesity, current diagnosis methods, and how image processing and MATLAB code can be used to extract measurements from images to calculate diagnostic parameters like body volume, density, and fat percentage. Initial analysis is conducted to verify the image processing methodology. The process of acquiring, calibrating, processing images and extracting measurements is explained. Results show body measurements can be estimated from images to help with obesity diagnosis.
Sensors on 3D digitization seminar report - Vishnu Prasad
The document discusses sensors for 3D digitization. It describes two main strategies for 3D vision - passive vision which analyzes ambient light, and active vision which structures light using techniques like laser range cameras. It then discusses an auto-synchronized scanner that can provide registered 3D surface maps and color data by scanning a laser spot across a scene and detecting the reflected light with a linear sensor, producing registered images with spatial and color information.
CHI'15 - WonderLens: Optical Lenses and Mirrors for Tangible Interactions on ... - Rong-Hao Liang
This work presents WonderLens, a system of optical lenses and mirrors for enabling tangible interactions on printed paper. When users perform spatial operations on the optical components, they deform the visual content that is printed on paper, and thereby provide dynamic visual feedback on user interactions without any display devices. The magnetic unit that is embedded in each lens and mirror allows the unit to be identified and tracked using an analog Hall-sensor grid that is placed behind the paper, so the system provides additional auditory and visual feedback through different levels of embodiment, further enhancing the interactivity with the printed content on the physical paper.
Virtual Reality Training for Upper Limb Prosthesis Patients - Annette Mossel
Virtual reality training is proposed to help patients learn to use upper limb prosthetics. A system would allow training at home to improve control skills without risks. It aims to provide feedback during manufacturing to optimize fit. The system uses optical tracking of a head mounted display and arm target to control a virtual prosthetic hand in Unity. It demonstrates grasping objects. Future work includes testing with patients and developing games to enhance motivation.
Basketball was invented in the 1890s by James Naismith at a YMCA in Massachusetts as a winter indoor activity. Since then, basketball has evolved significantly from its origins and spread worldwide. Key developments included the addition of dribbling and dunking, the growth of college and professional leagues, and an increased focus on analytics and business aspects as the sport became more popular and lucrative globally. While some elements have remained the same, basketball today would be barely recognizable compared to its original form due to extensive changes over time that demonstrate how sports evolve.
The document discusses the history and process of special effects in filmmaking. It begins with a brief overview of how special effects have been used as far back as the 1700s by magicians and progressed to techniques like matte paintings and rear projection screens in early films. The document then focuses on modern special effects, highlighting CGI techniques used in films like Jurassic Park, Avatar, and Harry Potter to bring imaginary worlds and creatures to life. It also describes the multi-step post-production process that visual effects artists use to add effects like explosions and integrate computer graphics into live-action footage.
The study evaluated the speed and accuracy of three gesture input devices (multi-touch display, trackpad, and Leap Motion Controller) for maze navigation tasks. 18 participants completed 6 mazes with each device. The multi-touch display had the lowest error rate and fastest completion time, performing best. The trackpad followed closely in performance. The Leap Motion Controller performed significantly worse in both accuracy and speed, likely due to its novelty compared to the other devices. Factors like experience, maze design, and measurement methods were controlled for in the study design.
Project Seminar on Leapmotion Technology - Abhijit Dey
This slideshow contains details about the technology packed in the Leapmotion Controller, a gesture tracking device, which can detect your hand gestures and finger movements to navigate and use different desktop or laptop apps on Windows and Mac.
The document introduces Leap Motion, a new technology that allows users to control computers and applications with hand gestures and motions. Leap Motion uses infrared sensors and cameras to track finger positions with high precision, enabling users to navigate interfaces without touching a mouse or keyboard. It has applications in gaming, robotics, music, healthcare, art and design by providing a more intuitive and natural interface compared to other input devices. The Leap Motion device is small, affordable, and portable, and has the potential to change how people interact with computers.
Mouse Simulation Using Two Coloured Tapes - ijistjournal
In this paper, we present a novel approach for Human Computer Interaction (HCI) in which we control cursor movement using a real-time camera. Current methods involve changing mouse parts, such as adding more buttons or changing the position of the tracking ball. Instead, our method uses a camera and computer vision technology, such as image segmentation and gesture recognition, to control mouse tasks (left and right clicking, double-clicking, and scrolling), and we show that it can perform everything that current mouse devices can.
The software will be developed in the Java language. Recognition and pose estimation in this system are user independent and robust, as we use colour tapes on the fingers to perform actions. The software can be used as an intuitive input interface to applications that require multi-dimensional control, e.g. computer games.
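The paper's software is in Java; as an illustration of the colour-detection step it describes, here is a minimal Python/OpenCV sketch that locates one coloured tape marker by HSV thresholding. The HSV range is a placeholder for one tape colour; each tape would get its own range.

```python
import cv2
import numpy as np

def track_tape(frame_bgr, hsv_lo=(40, 80, 80), hsv_hi=(80, 255, 255)):
    """Return the centroid of the pixels inside one tape's HSV range,
    or None if the tape is not visible in this frame."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(hsv_lo), np.array(hsv_hi))
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] == 0:
        return None
    return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
```

Mapping the two tape centroids to cursor position and click events would follow the gesture rules the paper defines.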
TURKISH SIGN LANGUAGE RECOGNITION USING HIDDEN MARKOV MODEL - cscpconf
In past years, much research has been done to provide more accurate and comfortable interaction between human and machine. Developing a system that recognizes human gestures is an important step toward improving that interaction.
Sign language is a way of communication for hearing-impaired people that enables them to communicate among themselves and with other people around them. Sign language consists of hand gestures and facial expressions. During the past 20 years, research has aimed to facilitate communication between hearing-impaired people and others.
Sign language recognition systems have been designed in various countries. This paper presents a sign language recognition system that uses a Kinect camera to obtain a skeletal model. Our aim was to recognize expressions that are widely used in Turkish Sign Language (TSL). For that purpose we randomly selected 15 words/expressions from Turkish Sign Language (each repeated 4 times by 3 different signers), for 180 recordings in total. Videos were recorded using the Microsoft Kinect camera and Nui Capture. Joint angles and joint positions were used as gesture features, achieving recognition rates close to 100%.
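A minimal sketch of the joint-angle feature the abstract mentions: the angle at a skeleton joint computed from three joint positions (e.g. shoulder-elbow-wrist). The joint names and coordinates are illustrative.

```python
import numpy as np

def joint_angle(parent, joint, child):
    """Angle in degrees at `joint` between the segments toward `parent`
    and `child`, e.g. the elbow angle from shoulder/elbow/wrist positions."""
    a = np.asarray(parent, dtype=float) - joint
    b = np.asarray(child, dtype=float) - joint
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

print(joint_angle([0, 0, 0], [0.3, 0, 0], [0.3, -0.3, 0]))  # 90.0
```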
IRJET - Real Time Hand Gesture Recognition using Finger Tips - IRJET Journal
This document presents a real-time method for hand gesture recognition using finger tips. The method first extracts the hand region from images using skin color thresholding. It then segments the finger-palm region and locates the palm center and finger tips. The number of detected finger tips and angles between finger tips and palm center are used by a rule-based classifier to predict the hand gesture label in real-time. The method was tested on five static gestures and achieved over 85% accuracy on average. Future work could involve using machine learning to improve hand detection performance in complex backgrounds.
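A toy version of the rule-based classification step described above, using only the fingertip count and the fingertip angles about the palm centre; the gesture labels and thresholds are invented for illustration, not taken from the paper.

```python
import numpy as np

def classify(fingertips, palm):
    """Rule-based gesture label from the number of detected fingertips and
    their angles about the palm centre (labels/thresholds illustrative)."""
    angles = sorted(np.degrees(np.arctan2(y - palm[1], x - palm[0]))
                    for x, y in fingertips)
    if not fingertips:
        return "fist"
    if len(fingertips) == 2 and angles[-1] - angles[0] > 30:
        return "victory"
    return "open_palm" if len(fingertips) == 5 else f"{len(fingertips)}_fingers"
```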
Hand Gesture Controls for Digital TV using Mobile ARM Platform - ijsrd.com
This paper presents a new approach for controlling a digital television using a real-time camera. The proposed method uses a camera, a mobile ARM platform, and computer vision technology, such as image segmentation and gesture recognition, to control TV operations such as changing channels and increasing or decreasing the volume. We used an ARM-based mobile platform with an OMAP processor and implemented the image processing code using the OpenCV library. Hand detection is one of the important stages for applications such as gesture recognition and hand tracking; this paper proposes a new method to extract the hand region, and consequently the fingertips, from color images.
Controlling Mouse Movements Using Hand Gesture And Xbox 360 - IRJET Journal
This document describes a project to control computer mouse movements using hand gestures detected by an Xbox 360 camera and computer vision techniques. The system uses the Xbox 360 camera and OpenCV library to capture images of the user's hand, segment the hand from the background, identify finger positions including the index finger to act as the mouse cursor, and map hand and finger movements to mouse movements and clicks on the computer screen. The system was developed as an alternative to using a physical mouse to make computer interaction easier and more accessible. It achieves average recognition rates but has limitations in detecting gestures from long distances. Future work could improve hand recognition accuracy and range.
1) The document proposes a method for gesture detection using a virtual surface detected by a webcam or front-facing camera on a laptop or smartphone.
2) By tracking the number and position of pixels representing an object's shape at different distances from the camera, the computer can detect movements of the object maintaining a constant distance as gestures on a virtual surface or plane (see the sketch after this list).
3) This technique aims to enable gesture control of computer functions like mouse movement or app launching without requiring specialized cameras, as a cheaper and more portable alternative to physical input devices.
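One way to realize the pixel-tracking idea sketched in item 2, assuming the object's contour area serves as a proxy for its distance from the camera: motion at roughly constant area can be treated as a stroke on the virtual surface. The tolerance is a made-up value.

```python
import cv2

def on_virtual_surface(prev_area, contour, tol=0.1):
    """Treat contour area as a distance proxy: if the area stays within
    `tol` of the previous frame's, the object is moving at a roughly
    constant distance, i.e. gesturing on the virtual plane."""
    area = cv2.contourArea(contour)
    steady = prev_area > 0 and abs(area - prev_area) <= tol * prev_area
    return steady, area
```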
The document describes a Kinect-based drawing application that allows users to draw on a canvas using gestures captured by the Kinect sensor. It discusses implementing cursor control and brush tools using Kinect skeleton tracking and joint position data. Future work ideas include expanding the drawing features and developing a keyboard simulator and sign language translator using Kinect gesture recognition.
Sign Language Identification based on Hand Gestures - IRJET Journal
This document presents a study on sign language identification based on hand gestures. The researchers aim to develop a system that can recognize American Sign Language gestures from video sequences. They use two different models - a Convolutional Neural Network (CNN) to analyze the spatial features of video frames, and a Recurrent Neural Network (RNN) to analyze the temporal features across frames. The document discusses the methodology used, including data collection from videos, pre-processing of frames, feature extraction using CNN models, and gesture classification. It also provides a literature review on previous studies related to sign language recognition and communication systems for deaf people.
Accessing Operating System using Finger Gesture - IRJET Journal
This document describes a system for accessing an operating system using finger gestures captured by a webcam. The system aims to reduce costs compared to existing gesture recognition systems that use expensive sensors like Kinect. It uses image processing algorithms to detect hand gestures from webcam input, recognize gestures like number of fingers, and execute corresponding operating system commands. The system architecture first segments hand regions from background, then classifies skin pixels and detects colored tapes on fingers to identify gestures. It can open programs and navigate computer contents contactlessly using natural hand movements. The proposed system aims to provide an affordable alternative for human-computer interaction without external input devices like mice or keyboards.
The document discusses hand gesture recognition including hand-forearm segmentation, palm-finger segmentation, and gesture recognition. It describes algorithms for identifying the hand region, separating the palm and fingers, and recognizing gestures based on features like finger positions. Logistic regression is used to train classifiers to identify gestures belonging to different classes and subclasses based on the number and orientation of fingers. Applications mentioned include sign language, robot control, gaming, and controlling smart TVs.
This document discusses gesture phones and gesture recognition technology. It begins by explaining how gestures are recognized through optical tracking, inertial tracking, and calibration. Examples are given of how gestures could control a smartphone, such as answering calls or controlling media playback. Challenges of gesture recognition are also mentioned. The document then discusses applications of gesture technology on Android and Windows phones through various apps that enable gesture control. Benefits of gesture technology include more intuitive interaction and control when touch is not possible.
Virtual Yoga System Using Kinect Sensor - IRJET Journal
The document describes a virtual yoga system using the Microsoft Kinect sensor. The system aims to make yoga exercises more engaging and motivating for patients by tracking their poses in real-time and providing feedback. It recognizes skeleton joints and yoga postures using the Kinect's depth sensing capabilities. Voice instructions guide users through different poses. The system is intended to address issues with traditional physiotherapy being tedious and repetitive. It allows customizing exercises to individual needs and challenges. Recognizing poses accurately in real-time could help patients perform exercises correctly and consistently at home without direct supervision.
Gestures are an important form of non-verbal communication that involve visible bodily motions. The document discusses the history and development of gesture recognition technologies, describing early data gloves and videoplace systems as well as current technologies like Cepal and ADITI that help people with disabilities control devices with gestures. It also outlines the key components of a gesture recognition system including modeling, analysis, and recognition of gestures and discusses classification methods like HMMs and MLPs. Applications discussed include virtual keyboards, navigaze, and Sixth Sense technology.
Gestures are an important form of non-verbal communication that involve visible bodily motions. They can be used to control devices through gesture recognition systems. Such systems work by modeling, analyzing, and recognizing gestures based on features extracted from images of body movements. Various technologies have been developed to recognize both static and dynamic gestures through methods like contour analysis, Hidden Markov models, and others. Gesture recognition has applications in areas like human-computer interaction, rehabilitation robotics, and interactive gaming.
Hand Gesture Recognition Using Statistical and Artificial Geometric Methods :... - caijjournal
Gesture recognition is a silent language that can be used with robots just as they can use it with us; this shared language ensures that everyone can understand the meaning of a gesture and can reply and interact with it. For this reason, this silent language has been adopted by deaf people, for whom it makes communication easier among themselves as well as with other people.
In this paper we present two different gesture recognition systems. Both techniques achieve a high recognition percentage and are invariant to perturbations, especially rotation, which otherwise hinders high recognition rates. The first method recognizes hand gestures with the help of a dynamic circle template; the second uses a variable-length-chromosome genetic algorithm. Both methods were applied to different people, and the main objective was to reduce the database size used for training.
IRJET - Survey on Sign Language and Gesture Recognition System - IRJET Journal
This document summarizes several research papers on sign language and gesture recognition systems. It discusses various techniques that have been used to convert sign language and gestures into understandable formats for hearing people. Vision-based and sensor-based approaches are described. Specific papers summarized include those using 7Hu moments and KNN classification achieving 82% accuracy, a system using gloves with flex and inertial sensors recognizing Taiwanese sign language with 94% accuracy, and a vision-based system using convex hull and defects to control computer functions. The document concludes by describing a system using a sensor glove to detect gestures from British and Indian sign languages and output text and audio.
IRJET - Recognition of Theft by Gestures using Kinect Sensor in Machine Le... - IRJET Journal
This document discusses a system that uses a Kinect sensor to recognize theft gestures using machine learning. The system tracks a person's skeleton and compares their gestures to a dictionary of known theft and normal gestures. If a match for a theft gesture is found, an alarm and SMS notification are generated. The system was implemented using Processing and a logistic regression machine learning algorithm to classify poses as abnormal or normal based on joint angle features extracted from Kinect skeleton data. The system aims to automatically detect theft in environments like banks and stores to improve security.
Abstract
This paper proposes new features that can be extracted from the Leap Motion and Microsoft Kinect for recognizing hand gestures. The Leap Motion (https://www.leapmotion.com) provides 3D hand information and the Microsoft Kinect (https://developer.microsoft.com/en-us/windows/kinect) provides depth information; the combination of both allows us to extract a large amount of diverse information about the hand's shape, enabling more accurate gesture recognition than is possible with either camera individually. Using a database of 10 distinct American Sign Language gestures, provided by Marin et al. [1,2], our new features allow us to achieve a high recognition accuracy.
1. Introduction
Interest in hand gesture recognition has continued to grow in recent years as demand for it increases in areas such as video game development and sign language translation for human-computer interaction (HCI). Advancements in 3D imaging and Time-of-Flight cameras, as well as the availability of new technology such as the Leap Motion and the Microsoft Kinect, have gone a long way toward making hand gesture recognition practical for the consumer market. The Microsoft Kinect provides both an RGB and a depth image from which hand information can be extracted, while the Leap Motion, created specifically for hand gesture recognition, returns coordinate information about the hand. Many different approaches have been taken to distinguish gestures, such as using topological features of holes in the image [3] and rapid recognition that identifies a gesture before it is completed [4]. Superpixel Earth Mover's Distance, which is widely used in image retrieval and pattern recognition, is another approach that has been applied to gesture recognition [5,6]. In this paper, we utilize data from both a Microsoft Kinect and a Leap Motion to obtain a more complete image of the hand, making it possible to correctly identify the hand gesture, as shown in Figure 1.
The Leap Motion has improved significantly over the last few years. The Leap now provides more information than previously available. In our approach, we made use
of these features to improve the accuracy of our
recognition. One of the features that is now available to
the Leap is extended fingers. The Leap is able to detect
which fingers are extended with very high accuracy.
This allows us to distinguish between gestures such as
the ones seen in Figure 1. The database we have selected was created using an older version of the Leap Motion, which did not provide the extended finger feature. To see how this feature improves recognition accuracy, we manually entered the extended fingers into the data. Section 2.1
will discuss our other new features for the Leap: maximum X and Y value, average area, and X-Y ratio.
The Microsoft Kinect uses a Time-of-Flight camera
to create a grayscale depth image which makes the hand
easy to distinguish from the background. To find the
hand in the image, we assume the hand is always the
closest object to the Kinect and remove anything behind
the closest object. Other approaches use blob detection
to locate the hand with good results [7]. After finding
the hand, we extract features from the image in order to
gather data about the hand’s shape. The features we
extract are Silhouette, Convex Hull, Cell Occupancy
Average Depth, Cell Occupancy Non-Zero, Distance
Contour, Fingertip Distance and Fingertip Angles. We
then run the data through the Random Forest classifier,
which interprets the data and attempts to correctly
identify the gesture. There has been much work already done using the Microsoft Kinect for hand gesture recognition, such as by researchers at the universities of Padova and Moscow [1,8]. We have used features inspired by these as well as innovative new ones to maximize our recognition rate. The dataset of gestures that we have used includes 10 distinct hand gestures that
can be seen in Figure 2.
Fig. 1 Kinect and Leap setup
2. Hand Gesture Features
2.1 Leap Features
The Leap Motion provides information about the hand that can be accessed through the Leap SDK: we are able to extract the fingertip positions, palm center, extended fingers, and various other points of interest on the hand. Using this information, we
calculate our features.
2.1.1 Scale Factor and Number of Fingers
In order to account for different sized hands, we use a
scale factor. We average the x, y, and z values of all extended fingertips to create a new point, then find the distance between that averaged point and the center of the palm. This distance is the scale factor. The number
of fingers is the total count of extended fingers detected by the Leap.
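
As a minimal sketch of this computation (assuming fingertip and palm positions have already been read from the Leap as 3D coordinates; the names below are our own, not Leap SDK identifiers):

import numpy as np

def scale_factor(extended_tips, palm_center):
    # Average the x, y, and z values of all extended fingertips,
    # then take the distance from that point to the palm center.
    tips = np.asarray(extended_tips, dtype=float)   # shape (n, 3)
    centroid = tips.mean(axis=0)
    return float(np.linalg.norm(centroid - np.asarray(palm_center, dtype=float)))

# Example with three extended fingers; the finger count itself
# is reported directly by the Leap.
tips = [(10.0, 90.0, 5.0), (0.0, 95.0, 4.0), (-10.0, 92.0, 6.0)]
s = scale_factor(tips, (0.0, 20.0, 0.0))
num_fingers = len(tips)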
2.1.2 Extended Finger Logic Vector
We create a 5-bit vector where the most significant bit (MSB) represents the thumb and the least significant bit (LSB) represents the pinky; the other fingers follow this pattern in order. The Leap tells us which fingers are extended, and the corresponding bits are set to 1. This feature
provides a simple method to differentiate between two
gestures that both have the same number of fingers
extended.
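
A small sketch of this encoding (the per-finger extension flags come from the Leap SDK; the set-of-names interface below is a hypothetical stand-in):

# Finger order from MSB to LSB: thumb, index, middle, ring, pinky.
FINGERS = ("thumb", "index", "middle", "ring", "pinky")

def extended_finger_vector(extended_names):
    # Set the bit for every extended finger, thumb as the MSB.
    bits = 0
    for i, name in enumerate(FINGERS):
        if name in extended_names:
            bits |= 1 << (4 - i)
    return bits

# Index and middle extended -> 01100 in binary (decimal 12).
assert extended_finger_vector({"index", "middle"}) == 0b01100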
2.1.3 Extended Finger Area
We calculate the area of the triangle using the points at
the two fingertips furthest apart and the center of the
palm, as shown in Figure 3. This area is then divided by the number of extended fingers, which helps differentiate between gestures 7 and 8 as well as other cases where two distinct gestures would otherwise produce the same area. The result is then divided by our scale factor.
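
A sketch of the area feature under the same assumptions (3D fingertip positions, at least two extended fingers; the triangle area falls out of the cross product):

import numpy as np
from itertools import combinations

def extended_finger_area(tips, palm_center, scale, n_extended):
    # Pick the pair of fingertips with the greatest separation.
    tips = [np.asarray(t, dtype=float) for t in tips]
    palm = np.asarray(palm_center, dtype=float)
    a, b = max(combinations(tips, 2),
               key=lambda pair: np.linalg.norm(pair[0] - pair[1]))
    # Triangle area is half the magnitude of the cross product of
    # the two edges that meet at the palm center.
    area = 0.5 * np.linalg.norm(np.cross(a - palm, b - palm))
    # Normalize by the finger count, then by the scale factor.
    return area / n_extended / scale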
2.1.4 Max X and Y Value
We use the maximum x and y values with respect to the
palm center. These values are divided by the scale
factor.
2.1.5 Length Width Ratio
Based on the work of Ding et al. [9], we calculate the width as the distance between the two fingertips furthest apart. The length is the greatest distance between a fingertip and the center of the palm, as seen in Figure 3. The length is divided by the width to get the ratio, which is then divided by our scale factor.
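
A corresponding sketch for the ratio, with the same assumed inputs as above:

import numpy as np
from itertools import combinations

def length_width_ratio(tips, palm_center, scale):
    tips = [np.asarray(t, dtype=float) for t in tips]
    palm = np.asarray(palm_center, dtype=float)
    # Length: greatest fingertip-to-palm-center distance.
    length = max(np.linalg.norm(t - palm) for t in tips)
    # Width: distance between the two fingertips furthest apart.
    width = max(np.linalg.norm(a - b) for a, b in combinations(tips, 2))
    return (length / width) / scale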
2.1.6 Fingertip Distances
Using the fingertip distances the Leap provides, we divide each by our scale factor to create distances normalized to the size of the hand.
2.1.7 Fingertip Directions
This is another feature that the Leap Motion calculates
directly. It is a vector of floats for each fingertip which
denotes the direction it is pointing. This is a feature of
the new Leap SDK, so we only use this feature for the
dataset that we created.
2.2 Kinect Features
Starting with the depth image returned by the Microsoft Kinect, we use the assumption that the hand will be the closest object to the camera. By using a depth threshold, we are able to cut out most of the image's background. The wrist is then found by determining the place where the width of the hand image decreases at the highest rate. After everything below the wrist is removed, the image is scaled so the hand image is a uniform size. Once an image of just the hand is obtained, we are able to calculate some information that will be used in several of our features. Among those are the contours of the hand image and the palm center, which we find by performing a Gaussian blur on the image and then taking the brightest point [10].
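
A rough sketch of the segmentation and palm-center steps with OpenCV (the 100-unit depth band and the blur kernel size are our assumptions; wrist removal and rescaling to a uniform size are omitted):

import cv2
import numpy as np

def segment_hand(depth, band=100):
    # Keep only the closest object: everything within `band` depth
    # units of the nearest valid pixel is assumed to be the hand.
    d = depth.astype(np.float32)
    d[d == 0] = np.inf                 # zero means no depth reading
    near = d.min()
    return (d <= near + band).astype(np.uint8) * 255

def palm_center(mask):
    # Blur the binary silhouette; the brightest point is the interior
    # point furthest from the edges, i.e. the palm center [10].
    blurred = cv2.GaussianBlur(mask, (51, 51), 0)
    _, _, _, max_loc = cv2.minMaxLoc(blurred)
    return max_loc                     # (x, y) in pixels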
Fig. 2 Dataset Gestures
Fig. 3 a) Extended Finger Area and b) Length Width Ratio
2.2.1 Silhouette
Based on the work of Kurakin et al. [8], this feature
involves dividing the image into 32 equal radial sections
as seen in Figure 4. We calculate the distance from the
center of the image to the contour of the hand in each
radial section. The total distance in each section is then averaged over the number of contour points in that section. This helps to distinguish between
fingers that are together and fingers that are separated.
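
A sketch of the radial-section computation (assuming the hand contour comes from cv2.findContours and the palm center from the step above):

import numpy as np

def silhouette_feature(contour, center, n_sections=32):
    pts = np.asarray(contour, dtype=float).reshape(-1, 2)
    rel = pts - np.asarray(center, dtype=float)
    dists = np.linalg.norm(rel, axis=1)
    # Assign every contour point to one of 32 equal angular sections.
    angles = np.arctan2(rel[:, 1], rel[:, 0])        # range (-pi, pi]
    section = ((angles + np.pi) / (2 * np.pi) * n_sections).astype(int)
    section = np.clip(section, 0, n_sections - 1)
    feature = np.zeros(n_sections)
    for s in range(n_sections):
        in_section = dists[section == s]
        if in_section.size:
            # Average distance over the points in this section.
            feature[s] = in_section.mean()
    return feature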
2.2.2 Convex Hull
This feature calculates the area of the black space around the hand. By finding the space in between the fingers, as shown in Figure 4, it can show how many fingers are up and distinguish between fingers being together or apart.
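
A sketch of this measurement with OpenCV 4 (the area between the silhouette and its convex hull stands in for the black space between the fingers):

import cv2

def convex_hull_feature(mask):
    # Largest external contour is taken to be the hand.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    hand = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(hand)
    # Hull area minus hand area = space between the fingers.
    return cv2.contourArea(hull) - cv2.contourArea(hand)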
2.2.3 Cell Occupancy Average Depth
Based on the work of Kurakin et al. [8], this feature
involves splitting the image into a predetermined
number of squares of equal size. We used 64 squares in
our research. We then find the average depth in each cell
using the grayscale value of each pixel in the hand.
2.2.4 Cell Occupancy Non-Zero
Based on the work of Kurakin et al. [8], using the
same grid work as the above feature, the number of
occupied pixels in each cell is also saved as a unique
feature.
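
One sketch covers both cell-occupancy features, splitting the scaled grayscale hand image into an 8x8 grid (the 64 squares used in our research):

import numpy as np

def cell_occupancy(hand_img, grid=8):
    h, w = hand_img.shape
    ys = np.linspace(0, h, grid + 1, dtype=int)
    xs = np.linspace(0, w, grid + 1, dtype=int)
    avg_depth = np.zeros(grid * grid)
    nonzero = np.zeros(grid * grid)
    for i in range(grid):
        for j in range(grid):
            cell = hand_img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            occupied = cell[cell > 0]           # pixels on the hand
            k = i * grid + j
            nonzero[k] = occupied.size          # Cell Occupancy Non-Zero
            if occupied.size:
                avg_depth[k] = occupied.mean()  # Cell Occupancy Average Depth
    return avg_depth, nonzero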
2.2.5 Distance Contour
The distance from the palm center to each point on the
contour of the hand can help to find local maxima and
minima, and distinguish between gestures with different
numbers of raised fingers.
2.2.6 Fingertip Distance
After using changes in the distance from points along the contour to the palm center to find the fingertips, the distance from the palm center to each fingertip is obtained. This feature can show differences between gestures such as 6 and 10, where the only difference is a raised finger.
2.2.7 Fingertip Angles
This feature calculates the angles between the palm center and the fingertips found by the Fingertip Distance feature.
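
A crude sketch of the fingertip detection shared by these two features; a real implementation would smooth the distance profile and require a minimum prominence before accepting a peak:

import numpy as np

def fingertip_features(contour, center):
    pts = np.asarray(contour, dtype=float).reshape(-1, 2)
    rel = pts - np.asarray(center, dtype=float)
    dists = np.linalg.norm(rel, axis=1)
    n = len(dists)
    tips = []
    for i in range(n):
        # Local maximum of the palm-to-contour distance profile
        # (index -1 wraps around, since the contour is closed).
        if (dists[i] > dists[i - 1] and dists[i] >= dists[(i + 1) % n]
                and dists[i] > dists.mean()):
            tips.append(i)
    tip_dists = dists[tips]                              # Fingertip Distance
    tip_angles = np.arctan2(rel[tips, 1], rel[tips, 0])  # Fingertip Angles
    return tip_dists, tip_angles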
2.3 Feature Vector
The feature extraction provides us with eight feature
vectors, one from the Leap and seven from the Kinect.
Only the fingertip distances for the Leap needed to be stored in a vector, while the other Leap features were single values. Each of the Kinect features was stored in a separate feature vector, as shown in Figure 5. All the Leap
features and Kinect features are then grouped into two
separate sets. These feature vectors can be tested alone
or combined with different features when using the
Random Forest test.
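
A sketch of this grouping; any chosen mix of scalar features and feature vectors is flattened into one row per sample before classification:

import numpy as np

def build_feature_vector(leap_feats, kinect_feats):
    # Promote scalars to length-1 arrays, then concatenate everything.
    parts = [np.atleast_1d(np.asarray(f, dtype=float))
             for f in (*leap_feats, *kinect_feats)]
    return np.concatenate(parts)

# e.g. three scalar Leap features plus a 32-bin Kinect silhouette
row = build_feature_vector([3, 0b01100, 1.7], [np.zeros(32)])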
Fig. 5 Feature Vectors
3. Results
Testing of our features was done on the dataset of [1,2], which
contains ten gestures from fourteen individuals with
data samples for each gesture from each person. We
used the Random Forest classifier to measure the
accuracy of our gesture recognition. Table 1 shows our
results compared to the state of the art and Table 2
shows the results of the Random Forest test.
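
As a rough sketch of how such an evaluation can be set up with scikit-learn (the split strategy, forest size, and the synthetic placeholder data below are our assumptions, not details reported here):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder feature matrix: one row of combined features per sample,
# labels are the ten gesture classes. Real rows come from Section 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(1400, 300))
y = np.repeat(np.arange(10), 140)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))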
Table 1. SVM results

             Leap     Kinect    Combined
Marin et al. 81.5%    96.35%    96.5%
Our Results  68.00%   95.36%    95.50%
Fig. 4 a) Silhouette and b) Convex Hull

The highest accuracy we got with the SVM test with just the Leap features was 68%. Number of fingers, max X, and area yielded this result. While running these tests we were unable to use the extended fingers feature as well as the fingertip directions, because the database
was made using an older version of the Leap Motion
SDK. The newer Leap Motion SDK provides that data,
and when included, this data is very useful for gesture
recognition. For the same reason, fingertip distances
were not included in this calculation either. When the extended fingers were entered manually, fingertip distances alone achieved 90.36% accuracy and extended fingers alone achieved 100%. The highest accuracy we got
with the Kinect was 95.36% on the SVM test. The
features that yielded this were convex hull, cell
occupancy average depth, and fingertip angles.
The highest recognition rate using a combination of
both Kinect and Leap features was 95.50% on the SVM.
The combination of number of fingers, ratio, max X and
Y, convex hull, cell occupancy average depth, and
fingertip angles produced that result. This set once again
excludes the Leap fingertip distance, extended fingers,
and fingertip directions.
Table 2. Random Forest results

             Leap     Kinect    Combined
Our Results  81.71%   95.21%    96.07%
Using the Random Forest test, we got significantly
improved results. Similar to the SVM tests, the same
Leap features as previously mentioned were excluded
from the calculations. The highest results using only
Leap or Kinect features gave us 81.71% and 95.21%,
respectively. Combining ratio, max X, max Y, and area
created the best combination of Leap features.
Combining convex hull, cell occupancy average depth,
and fingertip distances created the best performing
Kinect feature set. The highest combined accuracy of
96.07% was obtained with multiple feature sets. One
combination included ratio, max X and Y, area, convex
hull, cell occupancy average depth, and fingertip angles.
The other combination was similar but contained
number of fingers and fingertip distance instead of max Y and fingertip angles.
In addition to getting a higher recognition rate with
the Random Forest test, this test has a lower
computation time. When running all the Leap and
Kinect features, the SVM took 36 seconds to run. The
Random Forest test took only 8 seconds. The significant
decrease in computation time is important for real-time
gesture recognition.
Fig. 6 ASL Dataset
We created our own dataset as well to test the recognition rate of our features. This dataset contains the entire American Sign Language alphabet as shown in Figure 6, excluding the letters J and Z. The dataset has seven individuals performing each gesture ten times. This dataset was made using the Leap Motion Desktop V2, which provides more information, including extended fingers and fingertip directions. This means that the Leap extended fingers, fingertip directions, and fingertip distances can be used in our accuracy calculations. Utilizing all three of these, in addition to the rest of our features, increased our recognition rate as well, as seen in Table 3.
Table 3. Recognition rate with ASL dataset

               Leap     Kinect    Combined
Random Forest  98.60%   87.74%    98.75%
SVM            80.83%   84.23%    91.73%
Using the SVM, the combined results had a
recognition rate of 91.73% with a feature set consisting
of extended fingers, ratio, max X and Y, area, Leap
fingertip distances, and cell occupancy average depth.
For the Leap by itself, the recognition was 80.83% with
max X and Y along with fingertip direction. The Kinect
was 84.23% with just cell occupancy average depth. The Random Forest results were even better. The combined rate was 98.75% with Leap fingertip distances, fingertip direction, convex hull, fingertip angles, and Kinect fingertip distances. The Leap alone got a recognition rate of 98.6% with all the Leap features excluding max Y. The Kinect had a recognition rate of 87.74% with convex hull, cell occupancy average depth, fingertip angles, and fingertip distances.
4. Discussions
The popularity of virtual reality continues to grow, and the addition of hand gesture recognition would only further this already burgeoning field. Advances in technology have allowed for highly accurate collection of data from everyday human interactions. Computers can now process motion as input, instead of static symbols. Implementing hand gesture recognition can help broaden fields such as virtual reality and interactive simulations, as well as help the hearing impaired.
The results we calculated with the new dataset demonstrate how improvements in Leap Motion technology improve the accuracy of gesture recognition, as the device can now produce more precise information. Data such as extended fingers and finger direction helped distinguish between similar gestures that would previously have been undifferentiated using older technology. As interest in this field continues to increase, so will the capabilities of technologies such as the Leap.
One reason that the Kinect features did not perform as well on the ASL dataset is the similar shapes of many of the gestures. As seen in Figure 6, many of the gestures are very similar, which makes it more difficult to distinguish between gestures like M and N, as seen in Figure 7.
Fig. 7 ASL Random Forest confusion matrix
By using Random Forest instead of an SVM, we were able to get a higher recognition rate as well. The Random Forest classifier also handles a large set of features much better than an SVM: because the training model deals with well over 300 features (some of our features are vectors with multiple values), the SVM requires much more time to train. In addition to the higher accuracy provided by Random Forest, it was also generally much more time efficient than the SVM.
5. Conclusions
In this paper, we propose new features one can extract
from the Leap Motion and Microsoft Kinect for use in
gesture recognition. Using the dataset of [1,2], we produced promising results in gesture recognition, especially from the Kinect. When creating a more complex dataset consisting of the ASL alphabet, we had access to the additional data the Leap Motion provides and observed a significant increase in accuracy. The combination of our Kinect and Leap features, while not perfect, recognizes gestures with high accuracy.
Gestures like J and Z were not possible to test with static images because these ASL gestures require motion. Future work needs to be done with real-time hand tracking to correctly identify gestures in motion.
References
[1] Giulio Marin, Fabio Dominio, and Pietro Zanuttigh, "Hand Gesture Recognition with Jointly Calibrated Leap Motion and Depth Sensor," in Multimedia Tools and Applications, 2014.
[2] G. Marin, F. Dominio, and P. Zanuttigh, "Hand Gesture Recognition with Jointly Calibrated Leap Motion and Depth Sensor," Multimedia Tools and Applications, pp. 1-25, 2014.
[3] Kaoning Hu and Lijun Yin, "Multi-Scale Topological Features for Hand Posture Representation and Analysis," in International Conference on Computer Vision, 2013.
[4] Yanmei Chen, Zeyu Ding, Yen-Lun Chen, and Xinyu Wu, "Rapid Recognition of Dynamic Hand Gestures using Leap Motion," in IEEE International Conference on Information and Automation, 2015.
[5] Chong Wang, Zhong Liu, and Shing-Chow Chan, "Superpixel-Based Hand Gesture Recognition with Kinect Depth Camera," in IEEE Transactions on Multimedia, vol. 17, no. 1, 2015.
[6] Z. Ren, J. Yuan, J. Meng, and Z. Zhang, "Robust Part-Based Hand Gesture Recognition Using Kinect Sensor," IEEE Transactions on Multimedia, vol. 15, no. 5, 2013.
[7] Xia Liu and Kikuo Fujimura, "Hand Gesture Recognition Using Depth Data," in 6th IEEE International Conference on Automatic Face and Gesture Recognition, 2004.
[8] Alexey Kurakin, Zhengyou Zhang, and Zicheng Liu, "A Real-Time System for Dynamic Hand Gesture Recognition with a Depth Sensor," in Proc. of EUSIPCO, 2012.
[9] Zeyu Ding, Zexiong Zhang, Yanmei Chen, Yen-Lun Chen, and Xinyu Wu, "A Real-Time Dynamic Gesture Recognition Based on 3D Trajectories in Distinguishing Similar Gestures," in IEEE International Conference on Information and Automation, 2015.
[10] Fabio Dominio, Mauro Donadeo, and Pietro Zanuttigh, "Combining Multiple Depth-Based Descriptors for Hand Gesture Recognition," Pattern Recognition Letters, 2013.