This document describes a method for controlling a computer using hand and finger gestures detected through a webcam, without the need for specialized hardware or gesture-recognition training. The method tracks color markers attached to the fingers to detect finger motion in real time and uses that motion to control the mouse pointer position and clicks. An application was created with a graphical user interface that allows setting the marker color and that controls the mouse based on finger movements, detected by calculating changes in the pixel values of the colored markers across video frames. The method provides a low-cost way to interact with a computer using natural hand gestures, without the lag of existing gesture-recognition methods.
Freeman and Weissman [1] allowed the user to control a television set by using a video camera and computer-vision template-matching algorithms to detect the user's hand from across a room. In this approach, the user could show an open hand, and an on-screen hand icon would appear which could be used to adjust various graphical controls, such as a volume slider. To activate a control, the user needed to hold their hand over it for a fixed amount of time. Users enjoyed this alternative to the physical remote control, and the feedback of the on-screen hand was effective in assisting them. However, to activate the different controls, users had to hold their hand up for long periods of time, which is tiring; this type of user fatigue, known as "gorilla arm", is common in gesture-based interfaces.
Other approaches work by using multiple cameras to detect and track hand motion, producing a 3D image [2][4]. Because these systems use multiple cameras, they require a careful installation process, as calibration parameters such as the distance between the cameras are important to the triangulation algorithms used. Since a large amount of video data must be processed in real time, these algorithms prove computationally expensive, and stereo matching typically fails on scenes with little or no texture. Ultimately, it is not practical to use such systems outside of their special lab environments. In [3], Pranav Mistry presented the SixthSense wearable gestural interface, which used a camera and projector worn on the user's chest to allow the user to zoom in on projected maps (among other activities) through two-handed gestures. In order for the camera to detect the user's hands, the user had to wear brightly colored markers on their index fingers and thumbs. The ordinary webcam worn by the user was also sensitive to environmental conditions such as bright sunlight or darkness, which made it difficult to recognize the color markers. Wilson and Oliver [5] aimed to create GWindows, a Minority Report-like environment. By pointing with their hand and using voice commands, the user was able to move an on-screen cursor on a Microsoft Windows desktop and trigger actions like "close" and "scroll" to affect the underlying application windows. They concluded that users preferred interacting with hand gestures over voice commands, and that desktop workspaces designed specifically for gesture interaction would be even more effective. When considering online workspaces, several commercial and academic web-based collaboration solutions have existed for some time. However, interaction with other users in these environments is usually limited to basic sharing of media files, rather than full real-time collaboration on entire web-based applications and their data between users on distinctly deployed domains, as proposed in [6].
Cristian Gadea and Bogdan Ionescu [6] aimed to create finger-based gesture control of a collaborative online workspace, but their system needs continuous Internet connectivity, which is not always available in India. It relies on an online workspace called UC-IC, where an application running within the web browser determines the latest hand gesture; however, it is not always possible to provide high-speed connectivity everywhere and at all times. Besides this, the system needs training to recognize gestures, which slows it down. The methods in [7][8][9] are based on gesture-recognition algorithms that need ANN training, which makes the whole process slow and reduces accuracy: every attempt to recognize a gesture requires the trained ANN to be evaluated, which takes considerable time, so the system cannot match its output speed to the exact motion of the mouse pointer.
V. SYSTEM ARCHITECTURE
In this system we have used different preprocessing techniques and feature extraction as a tool for recognizing the pixel values and coordinates of RGB colors, by tracking the change in pixel position of the different color stickers attached to the user's fingers in real time. The updated values are then sent to the PC to drive the motion of the mouse pointer.
Figure 1: Block diagram of the different phases of the system.
A. Video Capturing: Continuous video is given as input to our system by the laptop's camera.
B. Image Processing: Image segmentation is done in two phases:
1. Skin Detection Model: to detect the hand and fingers in the image.
2. Approximate Median Model: for background subtraction.
It has been observed that using both methods together gives much better segmentation for further processing.
C. Pixel Extraction: In this phase we obtain the pixel sequence from the image without using any ANN training, so as to get the exact sequence of motion of the hands and fingers.
D. Color Detection: In this phase we extract the positions of the RGB colors from the pixel sequence and detect the motion of the hand and fingers by calculating the change in pixel values of those colors.
E. Controlling the Position of the Mouse Pointer: Signals are sent to the system to control mouse pointer motion and mouse events, giving the PC an appropriate command to move the mouse pointer according to the motion of the user's fingers or hand.
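As an illustration only, the phases above can be summarized in the following C++ skeleton; the function and type names are placeholders for the phases of Figure 1, not APIs taken from the paper:

#include <vector>

struct Frame  { std::vector<unsigned char> bgr; int w, h; };  // one video frame
struct Marker { int x, y; bool found; };                      // detected marker position

Frame  captureFrame();              // A. video capturing
Frame  segmentHand(const Frame&);   // B. skin detection + approximate median background subtraction
Marker detectMarker(const Frame&);  // C + D. pixel extraction and color detection
void   moveMouse(const Marker&);    // E. control the mouse pointer position

void pipeline() {
    for (;;) {                      // process frames continuously in real time
        Frame f  = captureFrame();
        Frame h  = segmentHand(f);
        Marker m = detectMarker(h);
        if (m.found) moveMouse(m);  // update the cursor only when the marker is visible
    }
}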
VI. TECHNIQUES FOR PIXEL AND COLOR DETECTION
A. Video Capturing
1) Loading Drivers
A system may have multiple web cameras, and each needs a camera driver with a unique ID. The capGetDriverDescription function, given a driver ID, returns the driver's name and version.
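As a minimal sketch (assuming a C++ implementation against the Video for Windows API in vfw.h; the paper does not state its language), driver enumeration might look like this:

// Enumerate the installed capture drivers; indices range from 0 to 9.
#include <windows.h>
#include <vfw.h>     // link against vfw32.lib
#include <cstdio>

int main() {
    char name[80], version[80];
    for (int id = 0; id < 10; id++) {
        if (capGetDriverDescription(id, name, sizeof(name),
                                    version, sizeof(version))) {
            printf("Driver %d: %s (%s)\n", id, name, version);
        }
    }
    return 0;
}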
2) Capturing
To capture the camera view:
obj = capCreateCaptureWindow();
To connect the driver and start showing the camera view in a picture box in our software:
SendMessage(obj, WM_CAP_DRIVER_CONNECT, 0, 0);
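Expanding the two calls above into a hedged C++ sketch (hwndParent is assumed to be the handle of the picture box in the application's GUI):

// Create a child capture window inside the picture box and connect driver 0.
HWND hCap = capCreateCaptureWindow("capture", WS_CHILD | WS_VISIBLE,
                                   0, 0, 640, 480, hwndParent, 0);
// capDriverConnect expands to SendMessage(hCap, WM_CAP_DRIVER_CONNECT, 0, 0).
if (capDriverConnect(hCap, 0)) {
    capPreviewRate(hCap, 66);   // request a preview frame every 66 ms (~15 fps)
    capPreview(hCap, TRUE);     // start showing the live camera view
}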
B. Processing Frames of Video
We cannot process the video directly, so we convert each frame into an image with the function: picture = hdcToPicture(obj);
Suppose the camera is 16 MP with fps = 45 (frames per second); we then need to process 45 images per second. To get the detailed RGB (red, green, blue) pixel values, use the GetBitmapBits() function.
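One possible C++ realization of this step (a sketch, assuming hCap from the previous step; capGrabFrame and capEditCopy are standard Video for Windows macros):

// Grab one frame, copy it to the clipboard as a bitmap, and read its bytes.
capGrabFrame(hCap);
capEditCopy(hCap);
OpenClipboard(NULL);
HBITMAP hBmp = (HBITMAP)GetClipboardData(CF_BITMAP);
CloseClipboard();

BITMAP bm;
GetObject(hBmp, sizeof(bm), &bm);          // width, height, bytes per row
LONG cb = bm.bmWidthBytes * bm.bmHeight;
BYTE *bits = new BYTE[cb];
GetBitmapBits(hBmp, cb, bits);             // the GetBitmapBits() step above
// 'bits' now holds one frame's pixel values; at 45 fps this whole
// sequence must finish 45 times per second.
delete[] bits;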
C. Getting Pixel Color:
Figure 2. Getting Pixel Color
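A minimal C++ sketch of reading one pixel's color with the Win32 GDI (hdc is assumed to be a device context holding the captured frame):

COLORREF c = GetPixel(hdc, x, y);  // color of the pixel at (x, y)
int r = GetRValue(c);              // red   component, 0-255
int g = GetGValue(c);              // green component, 0-255
int b = GetBValue(c);              // blue  component, 0-255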
D. Scanning
Figure 3. Scanning pixel-wise: move horizontally along x across each row, then step down in the y direction to the next row.
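The scan order of Figure 3 corresponds to the following C++ loop sketch (width, height, and hdc are assumed from the previous steps):

for (int y = 0; y < height; y++) {         // step down the image row by row
    for (int x = 0; x < width; x++) {      // scan each row horizontally in x
        COLORREF c = GetPixel(hdc, x, y);  // pixel value at (x, y)
        // ... compare c against the marker color here ...
    }
}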
E. Algorithm for Pixel and Color Detection
Figure 4: Algorithm for pixel and color detection.
X: x-coordinate of the pixel in the image.
Y: y-coordinate of the pixel in the image.
R: red; G: green; B: blue.
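Since Figure 4 is not reproduced here, the following C++ sketch is only a plausible reading of the algorithm: it scans every pixel, matches the marker by per-channel thresholding, and reports the centroid of the matching pixels. The tolerance value and the centroid step are assumptions, not taken from the paper.

const int TOL = 40;                   // assumed per-channel tolerance
long sumX = 0, sumY = 0, count = 0;
for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        COLORREF c = GetPixel(hdc, x, y);
        if (abs(GetRValue(c) - markerR) < TOL &&
            abs(GetGValue(c) - markerG) < TOL &&
            abs(GetBValue(c) - markerB) < TOL) {
            sumX += x; sumY += y; count++;   // pixel matches the marker color
        }
    }
}
if (count > 0) {                      // marker found: take the centroid
    int markerX = (int)(sumX / count);
    int markerY = (int)(sumY / count);
}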
VII. METHODOLOGY
A. Hand Position Tracking and Mouse Control
Figure 5. Hand position tracking and mouse control.
Getting user input virtually is the main aim of this module: the user moves a finger in front of the camera's capture area. The motion is captured by the camera and processed by the system frame by frame. After processing, the system obtains the finger coordinates, and once the coordinates are calculated it updates the cursor position accordingly.
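A hedged C++ sketch of this step: scaling the detected marker coordinates from the camera frame to the screen and driving the cursor with standard Win32 calls (frameW and frameH are the assumed capture resolution):

int screenW = GetSystemMetrics(SM_CXSCREEN);
int screenH = GetSystemMetrics(SM_CYSCREEN);
SetCursorPos(markerX * screenW / frameW,    // scale camera x to screen x
             markerY * screenH / frameH);   // scale camera y to screen y

// A detected click gesture can then be reported with mouse_event:
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0);  // press the left button
mouse_event(MOUSEEVENTF_LEFTUP,   0, 0, 0, 0);  // release it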
B. Laser Pointer Detection
Figure 6. Laser Pointer Detection
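The paper gives no detail for this phase; a common approach, assumed here, is to treat the brightest spot in the frame as the laser dot, since a laser typically saturates the sensor. A C++ sketch:

int bestX = 0, bestY = 0, best = -1;
for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        COLORREF c = GetPixel(hdc, x, y);
        int lum = GetRValue(c) + GetGValue(c) + GetBValue(c);
        if (lum > best) { best = lum; bestX = x; bestY = y; }  // track brightest pixel
    }
}
// Treat (bestX, bestY) as the laser dot only if 'best' exceeds a
// near-saturation threshold, e.g. 3 * 240 (an assumed value).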
C. Hand-Gesture-Based Auto Image Grabbing (Virtual Zoom In/Out)
Figure 7. Virtual zoom in/out.
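One way the zoom factor could be derived (an assumption, not from the paper) is from the distance between two tracked markers, one on each hand; a C++ sketch:

#include <cmath>

// Returns >1 when the hands move apart (zoom in), <1 when they close (zoom out).
double zoomFactor(int x1, int y1, int x2, int y2, double restingDist) {
    double dx = x2 - x1, dy = y2 - y1;
    return std::sqrt(dx * dx + dy * dy) / restingDist;
}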
D. Camera Processing and Image Capturing
Figure 8. Camera processing and image capturing.
E. Virtual Sense for File Handling
This system makes use of virtual-sense technology to copy a file from one system to another within a local area network (LAN/Wi-Fi). The user makes a picking-up action on the file that needs to be copied, moves it toward the system where the file should be copied, and then releases it over that system.
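The pick/move/release interaction could be driven by a simple state machine; the sketch below is an assumption, and fileUnderCursor, targetHost, and sendOverLan are hypothetical helpers, not APIs from the paper:

#include <string>

enum class GrabState { Idle, Holding };
GrabState state = GrabState::Idle;
std::string heldFile;

void onGesture(bool pinchClosed, int x, int y) {
    if (state == GrabState::Idle && pinchClosed) {
        heldFile = fileUnderCursor(x, y);        // hypothetical: file being "picked up"
        state = GrabState::Holding;
    } else if (state == GrabState::Holding && !pinchClosed) {
        sendOverLan(heldFile, targetHost(x, y)); // hypothetical: copy over LAN/Wi-Fi
        state = GrabState::Idle;                 // the "release" completes the copy
    }
}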
VIII. RESULTS AND DISCUSSION
The software provides control of all mouse clicking events by using a color marker. After several experiments, it was observed that a red marker is more effective than markers of other colors.
Figure 9. Graphical user interface of the application.
Figure 10. Starting the camera.
Figure 11. Setting the marker color.
Figure 12. Controlling motion and clicking events of the mouse with the color marker set earlier.
IX. CONCLUSION
This project can be very useful for people who want to control a computer without actually touching the system, or without a wireless mouse, which always needs a surface to operate on. Accuracy is higher when a red marker is used than when markers of other colors are used individually. The problem that changing lighting conditions pose for color-based recognition has been addressed in this work by providing a button to set the marker color at the start of the application. Some problems remain with recognition speed: the speed of controlling the mouse motion is not yet 100% for some of the gestures and needs to be improved. All mouse movements and key actions have already been mapped and work well under the given circumstances. As future scope, the application can be extended to work with mobile phones and gaming consoles. Other modes of human-computer interaction, such as voice recognition, facial expression, and eye gaze, can also be combined to make the system more robust and flexible.
ACKNOWLEDGMENT
I want to thank all subjects who participated in our experiments, and my guide for her valuable guidance, advice, and help during this project. Finally, I thank my parents for their encouragement.
REFERENCES
[1] W. T. Freeman and C. D. Weissman, "Television Control by Hand Gestures", in Proc. of Int. Workshop on Automatic Face and Gesture Recognition, IEEE Computer Society, 1995, pp. 179-183.
[2] Z. Jun, Z. Fangwen, W. Jiaqi, Y. Zhengpeng, and C. Jinbo, "3D Hand-Gesture Analysis Based on Multi-Criterion in Multi-Camera Systems", in ICAL 2008: IEEE Int. Conf. on Automation and Logistics, IEEE Computer Society, September 2008, pp. 2342-2346.
[3] P. Mistry and P. Maes, "SixthSense: A Wearable Gestural Interface", in ACM SIGGRAPH ASIA 2009 Sketches, New York, NY, USA: ACM, 2009.
[4] A. Utsumi, T. Miyasato, and F. Kishino, "Multi-Camera Hand Pose Recognition System Using Skeleton Image", in RO-MAN'95: Proc. of 4th IEEE Int. Workshop on Robot and Human Communication, IEEE Computer Society, July 1995, pp. 219-224.
[5] A. Wilson and N. Oliver, "GWindows: Robust Stereo Vision for Gesture-Based Control of Windows", in ICMI'03: Proc. of 5th Int. Conf. on Multimodal Interfaces, New York, NY, USA: ACM, 2003, pp. 211-218.
[6] C. Gadea, B. Ionescu, D. Ionescu, S. Islam, and B. Solomon, "Finger-Based Gesture Control of a Collaborative Online Workspace", in Proc. of 7th IEEE Int. Symposium on Applied Computational Intelligence and Informatics (SACI), Timisoara, Romania, May 24-26, 2012.
[7] M. Ganasekera, "Computer Vision Based Hand Movement Capturing System", in Proc. of 8th Int. Conf. on Computer Science & Education (ICCSE 2013), Colombo, Sri Lanka, April 26-28, 2013.
[8] F. Lamberti, "Endowing Existing Desktop Applications with Customizable Body Gesture-Based Interfaces", in IEEE Int. Conf. on Consumer Electronics (ICCE), 2013.
[9] A. Agrawal, R. Raj, and S. Porwal, "Vision-Based Multimodal Human-Computer Interaction Using Hand and Head Gestures", in Proc. of 2013 IEEE Conf. on Information and Communication Technologies (ICT 2013).
[10] M. Turk and G. Robertson, "Perceptual User Interfaces", Communications of the ACM, vol. 43, no. 3, March 2000.
[11] Y. Wu and T. S. Huang, "Vision-Based Gesture Recognition: A Review", Lecture Notes in Computer Science, vol. 1739, pp. 103-115, 1999.