Project Mentor:
     Mr. Aleem Khalid Alvi             [aleem_alvi@yahoo.com]   [akalvi@ssuet.edu.pk]
             Assistant Professor


Team Members:
     Mr. Ali Muzzaffar               [ali_muzzafar@yahoo.com]   [alim@ssuet.edu.pk]
     Mr. Mehmood Usman             [apnamehmood@yahoo.co.uk]    [mgazdhar@ssuet.edu.pk]
     Mr. Suleman Mumtaz                [smkhowaja@yahoo.com]    [smumtaz@ssuet.edu.pk]
     Mr. Yousuf Bin Azhar                   [musuf@yahoo.com]   [muhybina@ssuet.edu.pk]




                            http://www.boltayhaath.cjb.net

                        Sir Syed University of Engineering & Technology

1. ABSTRACT

Humans come to know one another by conveying their ideas, thoughts, and experiences to the
people around them. There are numerous ways to achieve this, and the best among them is the
gift of “Speech”. Through speech everyone can convincingly convey their thoughts and
understand each other. It would be an injustice if we ignored those who are deprived of this
invaluable gift.

The only means of communication available to the vocally disabled is the use of “Sign
Language”. Sign language, however, limits them to their own world; this limitation prevents
them from interacting with the outside world to share their feelings, creative ideas and
potential. Another problem is that very few hearing people ever learn to sign, which further
increases the isolation of the deaf community.

Technology is one way to remove this hindrance and benefit these people, and the project
Boltay Haath is one such attempt to solve this problem by computerized recognition of sign
language. Boltay Haath is an Urdu phrase which means ‘Talking Hands’. The basic concept
involves the use of special data gloves connected to a computer while a vocally disabled
person (who is wearing the gloves) makes the signs. The computer analyzes these gestures
and synthesizes the sound for the corresponding word or letter for ordinary people to
understand.

Several researchers have explored these possibilities and have successfully achieved finger-
spelling recognition with high levels of accuracy, but progress in the recognition of sign
language as a whole has been limited.

This project is an attempt to recognize Pakistan Sign Language (PSL), which has not been
done in any other system. Furthermore, the Boltay Haath project aims to produce sound
matching the accent and pronunciation of the people of the region in which PSL is used.
Since only single-handed gestures are considered in this project, it was necessary to select a
subset of PSL for the implementation of Boltay Haath, as sampling most or all of the 4000
signs in PSL would take a vast amount of time.





2. SYSTEM OVERVIEW
The objective was to develop a computerized Pakistan Sign Language (PSL) recognition
system as an application of Human Computer Interface (HCI). The system considers only
single-handed gestures; therefore a subset of PSL has been selected for the implementation of
Boltay Haath. The basic concept involves the use of computer-interfaced data gloves worn by
a disabled person who makes the signs. The computer analyzes these gestures, minimizes
variations between signers, and synthesizes the sound for the corresponding word or letter for
hearing people to understand. The basic working of the project is depicted in the following
figure.




                              Figure 2.1 - System Diagram

The above diagram clearly explains the scope and use of the Boltay Haath system. The
system aims at bridging communication gaps between the deaf community and other people.
When fully operational, the system will help minimize communication gaps, ease
collaboration, and enable the sharing of ideas and experiences.

2.1 PERFORMANCE MEASURES
The following performance parameters were kept in mind during the design of the project:
    • Recognition time: A gesture should take approximately 0.25 to 0.5 second in the
       recognition process in order to respond in real time.
    • Synchronized speech synthesis: The speech output corresponding to a gesture should
       not lag behind the gesture output by more than 0.25 seconds.
    • Continuous and automatic recognition: To be more natural the system must be
       capable of recognizing the gestures continuously without any manual indication or
       help for demarcating the consecutive gestures.
    • Recognition accuracy: The system must recognize gestures with an accuracy of
       80 to 90 percent.





2.2 DESIGN METHODOLOGY
A waterfall-plus-iterative model was followed for the development of Boltay Haath. This
model was selected because a thorough design of the system was needed before development
began: all the specifications had to be outlined in detail and all issues worked out so that the
project could be carried out within its time and cost constraints. In other words,
architecture-first development was attempted. After this stage the team had developed a broad
understanding of the system, and trouble spots in the design could easily be sensed, so the
natural next step was to repeat the critical stages of the process to iron out any problems
along the way and to evaluate design alternatives and tradeoffs. An object-oriented approach,
being the most practical way of developing this kind of project, was the obvious choice.
Test plans were also designed to test the system systematically; the sub systems were tested
separately as well as in combination.

                                                     Five Improvements for the Waterfall
                                                     Model to Work:
                                                     - Complete program design before
                                                       analysis and coding begins.
                                                     - Maintain current and complete
                                                       documentation.
                                                     - Do the job twice, if possible.
                                                     - Plan, control, and monitor testing.
                                                     - Involve the user.

                        Figure 2.2 - Waterfall plus Iterative Model

2.3 UNIQUE AND INNOVATIVE IDEAS
Researchers in different regions of the world have contributed towards the recognition of
their regional sign languages, but so far no work has been done on the recognition of the sign
language of our region (PSL). Boltay Haath is therefore the first system to contribute to this
noble cause. Furthermore, the system aims to produce sound matching the accent and
pronunciation of the people of the region in which PSL is used.

The recognition systems developed to date usually solve the problem of gesture demarcation
through various manual techniques and operations. To make the system more natural and
interactive, Boltay Haath performs real-time continuous recognition of gestures, so no
manual indication or signal is needed.

Although the primary objective of Boltay Haath is to recognize Pakistan Sign Language, the
system is capable of recognizing any other sign language of the world by learning its
respective gestures.

The Boltay Haath system can be modified for use on handheld devices, making the system
more portable and easier to use in daily life. For this purpose the Microsoft .NET Compact
Framework is the best candidate, since the system is already being developed using current
.NET technologies.


3. IMPLEMENTATION AND ENGINEERING CONSIDERATIONS

3.1 PSL SIGNS USED IN BOLTAY HAATH
The sign language has been divided into two sub-domains, English and Urdu, because of the
similarity of some gestures. Both English and Urdu contain gestures for words as well as
letters, and gestures have been categorized as dynamic or static. Urdu has 38 letters, of which
a few are dynamic; Urdu words are of both types, one-handed and two-handed. English has
26 letters, of which two are dynamic; English words are likewise both one-handed and
two-handed. PSL also contains domain-specific signs, for example computer terms,
environmental terms and traffic terms.




                  Figure 3.1 - English and Urdu Alphabet Signs in PSL

3.2 SYSTEM ARCHITECTURE
The Boltay Haath system is divided into the following sub systems:

   •    Gesture Database – Contains all the persistent data related to the system.
   •    Gesture Acquisition – Gets the state of the hand (position of fingers, roll and pitch)
        from the glove and conveys it to the main software.
   •    Training – Uses the collected data to train the system.
   •    Gesture Recognition Engine – Analyzes the input to recognize the gesture made by
        the user. Two different techniques have been implemented for this purpose, namely
        Artificial Neural Network (ANN) and Statistical Template Matching (STM).
   •    Gesture Output – Produces gesture and textual output, converting the words/letters
        obtained after gesture recognition into the corresponding sound.



   •    Accelerometer – Detects motion of the hand, in order to demarcate the start and end
        of gestures for continuous gesture recognition.

The following figure illustrates the architecture of the Boltay Haath system:




                                   Figure 3.2 - System Architecture [1]

A detailed description of the architecture, the implementation techniques and the algorithms
of the system is given below.

3.3 GESTURE DATABASE
A particular input sample in this system is defined by the combination of five sensors for
fingers and one tilt sensor for roll and pitch which is stored in the Gesture Database during
the data acquisition phase. The gestures in the database are organized with respect to there
area of use i.e. its domain. For example, the alphabet domains contain the Urdu and English
alphabet gestures. Word domains may contain the list of emergency gestures, daily routine
gestures and other special gestures. The database also stores relevant data like gesture’s
phoneme†, training results of STM and ANN and Registered Users‡ information.

3.4 DATA ACQUISITION
This sub system captures the state of the hand (flexure of fingers, roll and pitch) from the
glove and stores it in the Gesture Database for further processing. It handles all the data
coming to and from the Data Glove. The driver software provided by the vendor had to be
adapted for use in the .NET managed environment, and hence a wrapper class for the Glove
Driver was written in C# for use in the system. During acquisition the training data are
identified by the Gesture ID of their corresponding signs and stored in a table for later use by
the training algorithms.

† Phoneme is the smallest phonetic unit in a language that is capable of conveying a distinction in meaning.
‡ Registered Users are those users who participated in the training of the system.
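To make the idea of the wrapper concrete, the following is a minimal sketch of how an
unmanaged driver can be exposed to managed C# code through P/Invoke. The DLL name and
entry points below are hypothetical placeholders, not the actual 5DT SDK signatures:

```csharp
using System.Runtime.InteropServices;

// Hedged sketch of a managed wrapper over an unmanaged glove driver.
// "glovedriver.dll", OpenGlove and ReadSensor are hypothetical names.
public class GloveWrapper
{
    [DllImport("glovedriver.dll")]
    private static extern int OpenGlove(string port);

    [DllImport("glovedriver.dll")]
    private static extern int ReadSensor(int handle, int sensorIndex);

    private readonly int handle;

    public GloveWrapper(string port)
    {
        handle = OpenGlove(port);   // e.g. "COM1" for the RS-232 interface
    }

    // Returns the raw flexure value (0..255) for one finger sensor.
    public int GetFinger(int index)
    {
        return ReadSensor(handle, index);
    }
}
```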

An input sample consists of five values ranging from 0 to 255, each representing the state of
the sensor on one of the five fingers of the glove. The sensors for roll and pitch have been
ignored in the case of non-moving gestures, since their values do not uniquely identify an
alphabet sign [2].
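A minimal sketch of how one such input sample might be represented in code, under the
assumptions above (the type and field names are illustrative, not the project's actual classes):

```csharp
// One glove sample: five finger flexure values in 0..255 plus the
// tilt sensor's roll and pitch. Roll and pitch are ignored for
// non-moving gestures, as noted above.
public struct GloveSample
{
    public byte[] Fingers;   // five flexure values, 0..255
    public byte Roll;
    public byte Pitch;

    public GloveSample(byte[] fingers, byte roll, byte pitch)
    {
        Fingers = fingers;
        Roll = roll;
        Pitch = pitch;
    }
}
```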
The sequence diagram showing the use of the data acquisition interface in gesture recognition
is shown below.




                         Figure 3.3 – Data Acquisition Sequence Diagram


3.5 DATA GLOVE
The input device used in Boltay Haath is the
5DT Data Glove 5. It is equipped with sensors
that sense the movements of the hand and
interfaces those movements with a computer.
The 5DT Data Glove 5 measures finger flexure
and the orientation (pitch and roll) of the user’s
hand. It offers 8-bit flexure resolution, a
platform-independent serial port interface
(RS-232), and a built-in 2-axis tilt sensor.


                                                       Figure 3.4 - Components of 5DT Data Glove 5

3.6 TRAINING
The Training sub system trains the system so that it can perform gesture recognition
afterwards. The training process is different for the two different modes of operation of
Gesture Recognition Engine (GRE) i.e., Statistical Template Matching (STM) and Artificial
Neural Network (ANN). In both cases training is a batch process.

For the generalized recognition of gestures it was necessary to collect data from different
users; the system was trained using data obtained from six different signers [3]. Initially,
training data was collected for the non-moving gestures, as in [4], of English as well as Urdu,
since PSL contains both types of signs [5]. This was done because of the limitations of the
input device: the Data Glove 5 does not provide abduction status, and there is no input of any
kind about the location of the glove in space.

The separate training processes for STM and ANN are discussed below.

3.6.1 STATISTICAL TEMPLATE SPECIFICATION (STS)
The idea is to demarcate different gestures by calculating the mean (µ) and standard
deviation (σ) of all the sensors for each gesture in the training set. The resultant (µ, σ) pairs,
called templates, are stored in the gesture database for later use in gesture recognition;
hence the process is named “Template Specification”. The mean and standard deviation are
calculated for each sensor of each gesture as follows:

    \mu_{(l,m)} = \frac{1}{n} \sum_{i=1}^{n} x_i                                        (3.1)

    \sigma_{(l,m)} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \mu_{(l,m)} \right)^2 }    (3.2)

Here, x_i is the ith sensor value, n is the number of samples, µ(l,m) is the mean of the lth
sensor of the mth gesture, and σ(l,m) is the standard deviation of the lth sensor of the mth
gesture.
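As an illustration, a straightforward implementation of equations (3.1) and (3.2) for one
sensor of one gesture might look as follows (method and parameter names are illustrative,
not the project's actual code):

```csharp
using System;

public static class TemplateSpecification
{
    // Computes the (mu, sigma) template for one sensor of one gesture
    // from its n training samples, per equations (3.1) and (3.2).
    public static void ComputeTemplate(double[] samples,
                                       out double mu, out double sigma)
    {
        int n = samples.Length;
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += samples[i];
        mu = sum / n;                           // equation (3.1)

        double sumSq = 0.0;
        for (int i = 0; i < n; i++)
            sumSq += (samples[i] - mu) * (samples[i] - mu);
        sigma = Math.Sqrt(sumSq / n);           // equation (3.2)
    }
}
```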




                             Figure 3.5 – STS Sequence Diagram



3.6.2 ARTIFICIAL NEURAL NETWORK TRAINING (COMMITTEE SYSTEM)
The Artificial Neural Network Training (ANN) sub system allows the training of the various
neural networks in the system. It collects data from the Gesture Database, applies a
supervised learning algorithm (Backpropagation [6]) to train the neural networks, and
finally saves the networks in the database.

A single network could not converge on the available data and did not perform well, so it
was decided to tackle the problem with a divide-and-conquer approach. This technique,
labelled a committee system [7], combines the outputs of multiple networks, called experts,
to produce a classification rather than using only a single network. The rationale is the
realization that there is not one global minimum into which every net trains, but many
minima in which adequate classification of the training examples can be obtained. By
combining the output of several networks it may therefore be possible to gain better
generalization than that of any single network.

Each small network for a particular gesture is called an ‘expert’. The training data for each
expert contains an equal number of samples of the two classes that it classifies. For example,
the training set for the expert for ‘A’ contains half samples of ‘A’, while the remaining half
comprises the rest of the signs. Backpropagation was used to train the experts, with a
learning rate of 0.1 and a training set of more than 2000 samples for each expert. The input
was scaled further, from the range 0 to 255 down to –1.28 to 1.27; this further scaling
reduced the error and provided better results after training. This is sometimes called
pre-processing.
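The scaling step can be expressed as a simple linear mapping; the sketch below reproduces
the stated ranges (0 maps to -1.28 and 255 maps to 1.27), though the exact formula used in
the project is not spelled out in the text:

```csharp
public static class PreProcessing
{
    // Scales a raw sensor value from 0..255 into -1.28..1.27
    // before it is fed to the expert networks.
    public static double ScaleInput(byte raw)
    {
        return raw / 100.0 - 1.28;   // 0 -> -1.28, 255 -> 1.27
    }
}
```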




                          Figure 3.6 – ANN Training Sequence Diagram



3.7 GESTURE RECOGNITION ENGINE
The Gesture Recognition Engine is the core of the system. It performs gesture recognition
using the two techniques (STM and ANN). It interacts with most of the subsystems. It takes
gesture input from the Gesture Input subsystem, identifies it, and gives output through the
Output subsystem in text and speech. The separate recognition processes for STM and ANN
are discussed below.

3.7.1 STATISTICAL TEMPLATE MATCHING
The statistical model used in Boltay Haath is the simplest approach to recognize postures
(static gestures) [8], [9]. The model used is known as “Template Matching” or “Prototype
Matching” [10]. The idea is to demarcate different gestures by calculating the mean (µ) and
standard deviations (σ) of all the sensors for a gesture and then those input samples that are
within limits bounded by an integral multiple of standard deviation are recognized to be
correct. Gesture boundary [11] for each sensor is defined as,


                                                                                          (3.3)
Here, µ is the mean and σ is the standard deviation of that sensor whose gesture boundary is
to be defined. Similarly gesture boundaries for each sensor of all the gestures are defined and
used in Pattern Matching.

3.7.1.1 ALGORITHMIC MODEL

        a) PATTERN RECOGNITION
        After Statistical Template Specification (STS), test samples are provided to the pattern
        recognition module, which analyzes them using the statistical model [12]. The upper
        and lower limits for the value of a sensor for a particular gesture are defined using the
        previously calculated standard deviation for that sensor. To enhance the accuracy of
        gesture recognition, various integral multiples of σ are used, denoted by k in (3.3). The
        limits for any given gesture are defined as:
            \text{upper limit} = \mu + k\sigma                                        (3.4)

            \text{lower limit} = \mu - k\sigma                                        (3.5)
        Given the above-mentioned criteria, any given input can be classified as a particular
        gesture if all the sensor values of the test sample lie within these limits (i.e. the gesture
        boundary). These values are retrieved from the gesture database. The values of k used
        for gesture recognition in Boltay Haath range from 1 to 3, providing tolerances ranging
        from 2σ to 6σ. The performance achieved by varying the values of k is discussed later
        in Testing and Verification.
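As an illustration, the boundary test of (3.4) and (3.5) can be sketched as follows: a sample
is classified as a particular gesture only if every sensor value lies within µ ± kσ for that
sensor (all names are illustrative, not the project's actual code):

```csharp
public static class TemplateMatching
{
    // Returns true if every sensor value of the test sample lies
    // within the gesture boundary mu +/- k*sigma, per (3.4)/(3.5).
    public static bool MatchesGesture(double[] sample,
                                      double[] mu, double[] sigma, int k)
    {
        for (int i = 0; i < sample.Length; i++)
        {
            double upper = mu[i] + k * sigma[i];   // equation (3.4)
            double lower = mu[i] - k * sigma[i];   // equation (3.5)
            if (sample[i] < lower || sample[i] > upper)
                return false;   // outside this gesture's boundary
        }
        return true;
    }
}
```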

     b) AMBIGUITY AMONG OUTPUTS
     Sometimes, due to ambiguity between two or more gestures, STM may produce
     multiple outputs. The ambiguity is created by the overlapping of different gesture
     boundaries, and the overlap increases as the value of k is increased from 1 to 3. To
     cater to this problem the method of Least Mean Squares (LMS) is used. Figure 3.7
     shows two ambiguous signs, ‘R’ and ‘H’.

                                              Figure 3.7 - Ambiguous Signs

     c) LMS FOR REMOVING AMBIGUITY
     There are cases where more than one gesture is a candidate for output. To overcome
     this type of situation the system calculates the Least Mean Squares (LMS) [13] value
     of each candidate gesture and then selects the one with the minimum LMS value. It is
     calculated as,
         \text{LMS} = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \mu_i \right)^2            (3.6)

     Here, x_i denotes the value of the ith sensor from the test sample, µ_i denotes the
     template mean for the ith sensor, and n is the number of sensors.

     The LMS value for each candidate gesture is calculated, and the gesture with the least
     LMS value is selected as the final output. The use of LMS is justified by the results:
     analysis of the system’s performance shows that LMS provides accurate results 60%
     of the time.
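A sketch of this disambiguation step, assuming equation (3.6) as reconstructed above (names
and array layout are illustrative):

```csharp
public static class LmsDisambiguation
{
    // Among the candidate gestures, returns the index of the one
    // whose template means are closest to the test sample in the
    // least-mean-squares sense, per equation (3.6).
    public static int SelectByLms(double[] sample, double[][] candidateMeans)
    {
        int best = -1;
        double bestLms = double.MaxValue;
        for (int c = 0; c < candidateMeans.Length; c++)
        {
            double sum = 0.0;
            for (int i = 0; i < sample.Length; i++)
            {
                double d = sample[i] - candidateMeans[c][i];
                sum += d * d;
            }
            double lms = sum / sample.Length;   // equation (3.6)
            if (lms < bestLms)
            {
                bestLms = lms;
                best = c;
            }
        }
        return best;
    }
}
```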




              Figure 3.8 – STM Gesture Recognition Sequence Diagram


3.7.2 ARTIFICIAL NEURAL NETWORK (ANN)
In this mode, the GRE takes input data and feeds it to multiple artificial neural networks in
parallel. The approach taken is to initially process the input data so as to produce a
description in terms of the various features (handshape, orientation and motion) of a sign. The
sign can then be classified on the basis of the feature vector thus produced. This mode uses
our Artificial Neural Network Library (ANNLIB) to run Multi Layer Perceptrons (MLPs) for
recognition.

3.7.2.1 COMMITTEE SYSTEM FOR RECOGNITION
        The various experts (neural networks for each gesture) were trained using a “one
        against all” technique in which each network is trained for a particular sign to give a
        positive response for that sign and a negative one for all the others. So in the final
        system all the experts have the same architecture and are given the same input.
        Figure 3.9 shows the committee system used.




                 (Input → expert networks in parallel → Voting Mechanism → Output)

                                    Figure 3.9 - Committee System




         a) ARCHITECTURE OF EXPERTS
         The architecture of the experts used in the committee system is 5:8:1, i.e., 5 inputs, 8
         hidden nodes and 1 output node. The activation function was sigmoid logistic for
         nodes in the hidden layer and hyperbolic for output nodes.

         b) VOTING MECHANISM
         The voting mechanism takes the outputs of all the experts as its input. It identifies the
         resultant gesture by examining the outputs of all the experts and selecting the one
         with a positive result.




        c) FINAL CLASSIFICATION
        Since the experts could not be optimally trained, multiple experts can give a positive
        result. To solve this problem, the voting mechanism multiplies the result of each
        expert by that expert’s accuracy, giving more weight to the results of more accurate
        experts over less accurate ones. Finally, the output with the largest positive value is
        selected as the recognized gesture.
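A minimal sketch of this accuracy-weighted vote (the array layout and names are
assumptions, not the project's actual code):

```csharp
public static class VotingMechanism
{
    // Each expert's output is multiplied by that expert's measured
    // accuracy; the gesture whose expert has the largest positive
    // weighted value is selected. Returns -1 if no expert is positive.
    public static int Vote(double[] expertOutputs, double[] expertAccuracy)
    {
        int winner = -1;
        double best = 0.0;   // only positive results count
        for (int e = 0; e < expertOutputs.Length; e++)
        {
            double weighted = expertOutputs[e] * expertAccuracy[e];
            if (weighted > best)
            {
                best = weighted;
                winner = e;
            }
        }
        return winner;
    }
}
```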




                    Figure 3.10– ANN Gesture Recognition Sequence Diagram


3.8 OUTPUT
The output of the system has two forms: formatted text and synthesized speech. The more
important is the speech output, as it accomplishes the objective of the system.



3.8.1 TEXT
This subsystem renders the recognized sign as formatted text; Urdu signs are output in
Roman Urdu†. This text output is then used by the text-to-speech module for its processing.

3.8.2 SPEECH
The text-to-speech subsystem converts text into synthesized Urdu/English speech using the
Microsoft Speech SDK version 5.1 [14]. The lexicon and phoneme sets have been modified
so that words are pronounced correctly in the local accent.
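For illustration, producing speech through the SAPI 5.1 COM automation interface
(Interop.SpeechLib) from C# might look like the sketch below; the exact project setup and
voice configuration may differ:

```csharp
using SpeechLib;   // COM interop assembly generated from SAPI 5.1

public class Speaker
{
    private readonly SpVoice voice = new SpVoice();

    // Speaks the recognized word or letter asynchronously so that
    // gesture recognition is not blocked while audio plays.
    public void Say(string text)
    {
        voice.Speak(text, SpeechVoiceSpeakFlags.SVSFlagsAsync);
    }
}
```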




                           Figure 3.11– Speech Output Sequence Diagram

3.9 ACCELEROMETER
The accelerometer component is used to detect signs continuously, without any manual aid
indicating the start or end of a gesture. It automatically identifies the end of one gesture and
the start of the next; in this way gestures are recognized in a continuous fashion.

3.9.1 ALGORITHMIC MODEL
Acceleration is calculated for each sensor by averaging the differences between the last n
inputs (in a sliding window fashion). When the acceleration of all the sensors is below a
certain threshold value, the system identifies the state of hand as stationary and sends the
sensor values for recognition to the engine. As soon as the acceleration exceeds the threshold
value the system marks the hand as in motion and stops recognition. The sliding window size
and the threshold value are adjusted so that the user need not make a deliberate effort to stop
for some time in order to get the sign recognized.

†
    Roman Urdu - Urdu written with the use of English alphabets.

The user can set the threshold value and sliding window size through an accelerometer
interface, according to his or her needs.
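A sketch of the per-sensor logic described above, assuming a simple queue-based sliding
window (the class shape and defaults are illustrative, not the project's actual code):

```csharp
using System;
using System.Collections.Generic;

public class SensorAccelerometer
{
    private readonly Queue<int> window = new Queue<int>();
    private readonly int windowSize;     // e.g. 8, as in Figure 3.12
    private readonly double threshold;   // e.g. 1.8

    public SensorAccelerometer(int windowSize, double threshold)
    {
        this.windowSize = windowSize;
        this.threshold = threshold;
    }

    // Feeds one new sensor reading; returns true if the sensor is
    // currently judged to be in motion (average difference between
    // consecutive readings in the window exceeds the threshold).
    public bool Update(int sensorValue)
    {
        window.Enqueue(sensorValue);
        if (window.Count > windowSize)
            window.Dequeue();
        if (window.Count < 2)
            return false;   // not enough data yet; treat as stationary

        int prev = int.MinValue, sum = 0, count = 0;
        foreach (int v in window)
        {
            if (prev != int.MinValue)
            {
                sum += Math.Abs(v - prev);
                count++;
            }
            prev = v;
        }
        double a = (double)sum / count;
        return a > threshold;
    }
}
```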


          Window size = 8
          Sensor value stream: 150 151 152 153 154 156 159 161 165 168 172 178 182 185 186 190
          Successive differences within the current window: 2, 3, 2, 4, 3, 4, 6
          Threshold = 1.8
          A = (2+3+2+4+3+4+6)/7 = 24/7 = 3.43
          A > threshold, therefore the hand is in motion

                            Figure 3.12(a) - Motion Detection (Hand in motion)

The above figure shows how the accelerometer determines that the hand is in motion. It
shows the state of a sliding window for a single sensor; the average of the successive
differences within the window is above the threshold value, so the system identifies the
sensor as being in motion.

          Window size = 8
          Sensor value stream: 150 151 152 153 154 155 155 156 157 158 158 159 160 161 161 161
          Successive differences within the current window: 1, 0, 1, 1, 1, 0, 1
          Threshold = 1.8
          A = (1+0+1+1+1+0+1)/7 = 5/7 = 0.714
          A < threshold, therefore the hand is stationary

                           Figure 3.12(b) - Motion Detection (Hand is stationary)

The above figure shows how the accelerometer determines that the hand is stationary. It
shows the state of the sliding window for a single sensor; the average of the successive
differences within the window is below the threshold value, so the system identifies the
sensor as stationary.

The accelerometer is used on a per-sensor basis, so for five sensors, five accelerometer
objects are used, each continuously provided with its corresponding sensor value. The
accelerometer design used so far is limited to either static gestures or dynamic gestures.




3.10 DEVELOPED TOOLS
In the course of developing the system, various tools and libraries were created. These
include a C# wrapper class for the 5DT Data Glove 5 driver (originally written in unmanaged
VC++ and converted to managed C# code), an artificial neural network library, ANNLib,
written in C#, and a performance evaluation tool for measuring the efficiency of the STM
and ANN recognition systems.

3.10.1 PERFORMANCE EVALUATION TOOL FOR GRE (PETool)
The PETool evaluates the results obtained by applying test data to the STM and ANN
gesture recognition engines, and generates reports and graphical views of the data for
performance evaluation purposes. The simulation data is used to evaluate whether the current
configurations of STM and ANN provide acceptable results.

3.11 TRADEOFFS
Many tradeoffs regarding accuracy and efficiency were made during the design and
implementation of the system. A major issue was the training of the neural networks: the
amount of training data, the optimal architecture of the networks and the classification
mechanism were a few of the considerations. Quick training called for a small training data
set, while greater accuracy called for more data and hence more training time.

The quality of training data was of major concern for STM as well as ANN. The greater the
number of registered users, the better the generalization. But more data does not come
without its share of bad samples.

In STM recognition, gesture boundaries of sensors are defined as µ ± kσ; the system uses
k = 3 after trying the values 1, 2 and 3. The model µ ± 3σ covers a large variation in the data
(up to 6σ) but at the same time increases the overlap between different gestures. This overlap
creates ambiguity among outputs, which has to be removed with the use of LMS.

A similar case can be made for speech output. Text-to-speech provides an efficient way of
producing speech output, but the quality of the sound produced is not on par with
pre-recorded human voice. However, recorded voices incur a heavy processing cost on the
system when it comes to real-time recognition.

The accelerometer is used to filter the data stream coming from the Data Glove in a thread.
The performance of the thread obviously degrades if a decision-making block is executed on
every cycle, and the same is the case with the accelerometer component [15].

3.12 IMPLEMENTATION TOOLS
The Boltay Haath system has been developed in C# using Visual Studio .NET 2002. The
gesture database is maintained in an MS Access database file. Windows being the platform
for the project, all the user interfaces and input components are standard Windows objects.
The Microsoft Speech SDK 5.1 was used for speech output.






3.13 COST
The cost of the components used in this project is given below.

                                  Item                     Cost
                                  5DT Data Glove 5         $300

3.14 TESTING AND VERIFICATION
The sub systems were tested separately to check their performance in various scenarios.
Because Boltay Haath has a highly modular design, top-down and bottom-up integration
occurred simultaneously. However, the system was integrated incrementally in order to
control the number of bugs that needed to be fixed at any given time. Tests were conducted
in a black-box fashion.

Finally, the software was tested against the performance criteria set during system design
specification. These tests were performed with the signers, since it was deemed necessary to
determine whether the available hardware meets the performance criteria.


3.14.1 TEST FOR RECOGNITION ACCURACY
The results were obtained using the PETool, which was specially developed for measuring
the performance of the system.


a) STATISTICAL TEMPLATE MATCHING TEST RESULTS
                  Domain                             Accuracy (%)
                                         k=1            k=2                k=3
            English Alphabets             24             73                 80
            Urdu Alphabets                24             84                 88

                Table 3.1 – Performance Result (Statistical Template Matching)




                   Figure 3.13(a) – Alphabet wise recognition accuracy for STM - English







                   Figure 3.13 (b) – Alphabet wise recognition accuracy for STM - Urdu


b) ANN COMMITTEE SYSTEM TEST RESULTS

                        Domain                            Accuracy (%)
                     24 Handshapes                             84

            Table 3.2 – Performance Result (ANN classification with committee system)




                Figure 3.14 – Alphabet wise accuracy for ANN Committee System - English



3.14.2 TEST FOR RECOGNITION TIME
Multiple gestures were provided to the system in sequence and the average time was
calculated using the system clock. Under normal conditions the average recognition time was
0.4 seconds.


3.14.3 TEST FOR SYNCHRONIZED SPEECH SYNTHESIS
This performance parameter was measured using an external timing device and was found to
be within the prescribed limits.

3.14.4 TEST FOR CONTINUOUS RECOGNITION
The system is able to distinguish between consecutive gestures using the accelerometer
component.

4. SUMMARY
Deaf and dumb people rely on sign language interpreters for communication. However, they
cannot depend on interpreters in everyday life, mainly due to the high costs involved and the
difficulty of finding qualified interpreters. This system will help disabled persons improve
their quality of life significantly.

The automatic recognition of sign language is an attractive prospect; the technology exists to
make it possible, while the potential applications are exciting and worthwhile. To date the
research emphasis has been on the capture and classification of the gestures of sign language.
This project will be a valuable addition to the ongoing research in the field of Human
Computer Interface (HCI).

The Boltay Haath system has been shown to work for Pakistan Sign Language (PSL)
without invoking complex hand models. The results obtained indicate that the system is able
to recognize signs efficiently with a good rate of success.

Future research regarding Boltay Haath will address more complex gestures, such as those
involving two hands. Other ways of modelling the gesture dynamics, such as HMMs that
achieve minimal classification error, will also be investigated. Dynamic gestures and online
training are the two most attractive features left for the future.

Several new directions have been identified through which this work could be expanded in
the near future. The techniques developed are not specific to PSL, and so the system could
easily be adapted to other sign languages or for other gesture recognition systems (for
example, as part of a VR interface, telemetry or robotic control). It can be considered a step
towards applications that provide a user interface based on hand gestures.

One aspect of communication which could not be handled in Boltay Haath is two-way
communication. Currently Boltay Haath can convey words from the signer to the listener,
but not the other way around. A future enhancement would be to enable two-way
communication.

The Boltay Haath system is now almost complete, though many enhancements and
optimizations can still be made. In all, 83 gestures have been recognized, and this number
can be increased as and when required by the user.





5. REFERENCES
[1] R.S. Pressman, Software Engineering: A Practitioner’s Approach, Fourth Edition, McGraw-Hill
International, 1997.

[2] Vesa-Matti Mantyla, Jani Mantyjarvi, Tapio Seppanen and Esa Tuulari, “Hand Gesture Recognition of a
Mobile Device User”, IEEE, 2000, pp. 281-284.

[3] Waleed Kadous, “GRASP: Recognition of Australian Sign Language Using Instrumented Gloves”,
Australia, October 1995, pp. 1-2, 4-8.

[4] Murakami and Taguchi, “Gesture Recognition Using Recurrent Neural Networks”, CHI '91 Conference
Proceedings, pp. 237-242, Human Interface Laboratory, Fujitsu Laboratories, ACM, 1991.

[5] Sulman Nasir and Sadaf Zuberi, “Pakistan Sign Language – A Synopsis”, Pakistan, June 2000.

[6] Simon Haykin, Neural Networks: A Comprehensive Foundation, Second Edition, McMaster University,
pp. 142.

[7] Peter W. Vamplew, Recognition of Sign Language Using Neural Networks, University of Tasmania, May
1996, pp. 98.

[8] Andrea Corradini and Horst-Michael Gross, “A Hybrid Stochastic-Connectionist Architecture for
Gesture Recognition”, IEEE, 2000, pp. 336-341.

[9] K.S. Fu, Syntactic Pattern Recognition, Prentice-Hall, 1981, pp. 75-80.

[10] Andrea Corradini and Horst-Michael Gross, “Camera-based Gesture Recognition for Robot Control”,
IEEE, 2000, pp. 133-138.

[11] I. Sommerville, Software Engineering (6th Ed.), Addison Wesley, chap. 1, pp. 8.

[12] I. Wachsmuth and T. Sowa (Eds.), “Towards an Automatic Sign Language Recognition System Using
Subunits”, London, April 2001, pp. 1-2.

[13] Barbara Liskov, Program Development in Java, chap. 11, pp. 356.

[14] The Microsoft Speech Website, www.microsoft.com/speech

[15] Richard Watson, “A Survey of Gesture Recognition Techniques”, Technical Report, Trinity College
Dublin, July 1993, pp. 6.




Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

Boltay Haath

  • 1. Project Mentor: Mr. Aleem Khalid Alvi [aleem_alvi@yahoo.com] [akalvi@ssuet.edu.pk] Assistant Professor Team Members: Mr. Ali Muzzaffar [ali_muzzafar@yahoo.com] [alim@ssuet.edu.pk] Mr. Mehmood Usman [apnamehmood@yahoo.co.uk] [mgazdhar@ssuet.edu.pk] Mr. Suleman Mumtaz [smkhowaja@yahoo.com] [smumtaz@ssuet.edu.pk] Mr. Yousuf Bin Azhar [musuf@yahoo.com] [muhybina@ssuet.edu.pk] http://www.boltayhaath.cjb.net ! " !
  • 2. #" $%%& Sir Syed University of Engineering & Technology 1. ABSTRACT Humans know each other by conveying their ideas, thoughts, and experiences to the people around them. There are numerous ways to achieve this and the best one among the rest is the gift of “Speech”. Through speech everyone can very convincingly transfer their thoughts and understand each other. It will be injustice if we ignore those who are deprived of this invaluable gift. The only means of communication available to the vocally disabled is the use of “Sign Language”. Using sign language they are limited to their own world. This limitation prevents them from interacting with the outer world to share their feelings, creative ideas and potentials. Another problem is that very few people who are not themselves deaf ever learn to sign. This therefore increases the isolation of deaf and dumb people. Technology is one way to remove this hindrance and benefit these people, and the project Boltay Haath is one such attempt to solve this problem by computerized recognition of sign language. Boltay Haath is an Urdu phrase which means ‘Talking Hands’. The basic concept involves the use of special data gloves connected to a computer while a vocally disabled person (who is wearing the gloves) makes the signs. The computer analyzes these gestures and synthesizes the sound for the corresponding word or letter for ordinary people to understand. Several researchers have explored these possibilities and have successfully achieved finger- spelling recognition with high levels of accuracy, but progress in the recognition of sign language, as a whole has been limited. This project is an attempt to recognize Pakistan Sign Language (PSL), which has not been done in any other system. Furthermore, the Boltay Haath project aims to produce sound matching the accent and pronunciation of the people of the region in which PSL is used. Since only single-handed gestures have been considered in this project it is obviously necessary to select a subset of PSL to be considered for implementation of Boltay Haath as it would take vast amounts of time to sample most or all of the 4000 signs in PSL. $
letter for ordinary people to understand. The basic working of the project is depicted in the following figure.

Figure 2.1 - System Diagram

The diagram shows the scope and use of the Boltay Haath system: it aims at bridging the communication gap between the deaf community and other people. When fully operational, the system will minimize communication gaps, ease collaboration and enable the sharing of ideas and experiences.

2.1 PERFORMANCE MEASURES

The following performance parameters were kept in mind during the design of the project:

• Recognition time: A gesture should take approximately 0.25 to 0.5 seconds to recognize in order to respond in real time.
• Synchronized speech synthesis: The speech output corresponding to a gesture should not lag behind the gesture output by more than 0.25 seconds.
• Continuous and automatic recognition: To be natural, the system must be capable of recognizing gestures continuously, without any manual indication demarcating consecutive gestures.
• Recognition accuracy: The system must recognize gestures with an accuracy of 80 to 90 percent.
2.2 DESIGN METHODOLOGY

A Waterfall-plus-Iterative model was followed for the development of Boltay Haath. This model was selected because a thorough design of the system was needed before development began: all the specifications had to be outlined in detail and all issues worked out so that the project could be carried out within its time and cost constraints. In other words, architecture-first development was attempted. After this stage the team had developed a broad understanding of the system, and trouble spots could easily be sensed in the design. The next logical step was therefore to repeat the critical stages of the process to iron out any remaining problems and to evaluate design alternatives and tradeoffs.

An object-oriented approach, being the most practical way of developing this kind of project, was the obvious choice. Test plans were also designed to test the system systematically; the subsystems were tested separately as well as together.

Five improvements for the Waterfall model to work:
- Complete program design before analysis and coding begin.
- Maintain current and complete documentation.
- Do the job twice, if possible.
- Plan, control, and monitor testing.
- Involve the user.

Figure 2.2 - Waterfall plus Iterative Model

2.3 UNIQUE AND INNOVATIVE IDEAS

People in different regions of the world have contributed towards the recognition of the sign languages of their regions, but so far no work has been done on the recognition of the sign language of our region (PSL). Boltay Haath is thus the first system to contribute to this cause. Furthermore, the system aims to produce sound matching the accent and pronunciation of the people of the region in which PSL is used.

The recognition systems developed to date usually solve the problem of gesture demarcation through various manual techniques and operations. To make the system more natural and interactive, Boltay Haath recognizes gestures continuously in real time, so no manual indication or signal is needed.

Although the primary objective of Boltay Haath is to recognize Pakistan Sign Language, the system is capable of recognizing any other sign language of the world by learning its gestures. The Boltay Haath system can also be modified for use on handheld devices, making it more portable and easier to use in daily life. For this purpose the Microsoft .Net Compact Framework is the best candidate, since the system is being developed using current .Net technologies.
3. IMPLEMENTATION AND ENGINEERING CONSIDERATIONS

3.1 PSL SIGNS USED IN BOLTAY HAATH

The sign language has been divided into two sub-domains, English and Urdu, because of the similarity of some gestures. Both English and Urdu contain gestures for words as well as letters, and gestures have been categorized into dynamic and static. Urdu has 38 letters, of which a few are dynamic; words come in both one-handed and two-handed forms. English has 26 letters, of which two are dynamic; words are likewise one-handed or two-handed. PSL also contains domain-specific signs, for example computer terms, environmental terms and traffic terms.

Figure 3.1 - English and Urdu Alphabet Signs in PSL
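To make this categorization concrete, the sketch below shows one possible C# data model for a PSL sign. It is illustrative only; the type and member names are assumptions, not the project's actual code.

    enum SignDomain { UrduAlphabet, EnglishAlphabet, ComputerTerms, EnvironmentalTerms, TrafficTerms }
    enum SignMotion { Static, Dynamic }       // Boltay Haath implements static signs only
    enum Handedness { OneHanded, TwoHanded }  // only one-handed signs are considered

    // Hypothetical record for one PSL sign in the gesture database.
    class PslSign
    {
        public int GestureId;       // key used to label training samples
        public string Name;         // e.g. "A", or an Urdu letter in Roman Urdu
        public SignDomain Domain;
        public SignMotion Motion;
        public Handedness Hands;
    }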
3.2 SYSTEM ARCHITECTURE

The Boltay Haath system is divided into the following subsystems:

• Gesture Database - contains all the persistent data related to the system.
• Gesture Acquisition - gets the state of the hand (position of fingers, roll and pitch) from the glove and conveys it to the main software.
• Training - uses the collected data to train the system.
• Gesture Recognition Engine - analyzes the input to recognize the gesture made by the user. Two different techniques have been implemented for this purpose, namely Artificial Neural Networks (ANN) and Statistical Template Matching (STM).
• Gesture Output - converts the words/letters obtained after gesture recognition into corresponding sound and textual output.
• Accelerometer - detects motion of the hand in order to demarcate the start and end of gestures for continuous gesture recognition.

The following figure illustrates the architecture of the Boltay Haath system:

Figure 3.2 - System Architecture [1]

A detailed description of the architecture, the implementation techniques and the algorithms of the system is given below.

3.3 GESTURE DATABASE

A particular input sample in this system is defined by the combination of five finger sensors and one tilt sensor for roll and pitch, and is stored in the Gesture Database during the data acquisition phase. The gestures in the database are organized with respect to their area of use, i.e. their domain. For example, the alphabet domains contain the Urdu and English alphabet gestures, while word domains may contain lists of emergency gestures, daily routine gestures and other special gestures. The database also stores related data such as each gesture's phoneme†, the training results of STM and ANN, and Registered User‡ information.

† A phoneme is the smallest phonetic unit in a language that is capable of conveying a distinction in meaning.
‡ Registered Users are those users who participated in the training of the system.
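A minimal sketch of the input sample described above, assuming one record per glove reading; the field names are hypothetical:

    // One training sample: five finger sensors plus the tilt sensor,
    // tagged with the Gesture ID of the sign being recorded.
    class GestureSample
    {
        public int GestureId;                 // sign this sample belongs to
        public byte[] Fingers = new byte[5];  // finger flexure, 0..255 each
        public byte Roll, Pitch;              // tilt sensor; ignored for static alphabet signs [2]
    }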
3.4 DATA ACQUISITION

This subsystem captures the state of the hand (flexure of the fingers, roll and pitch) from the glove and stores it in the Gesture Database for further processing. It handles all the data coming to and from the Data Glove. The driver software provided by the vendor had to be adapted for use in the .Net managed environment, so a wrapper class for the glove driver was written in C#. During acquisition the training data are identified by the Gesture ID of their corresponding signs and stored in a table for later use by the training algorithms. An input sample consists of five values ranging from 0 to 255, each representing the state of the sensor on one of the five fingers of the glove. The sensors for roll and pitch have been ignored in the case of non-moving gestures, since their values do not uniquely identify an alphabet sign [2]. The sequence diagram showing the use of the data acquisition interface in gesture recognition is shown below.

Figure 3.3 - Data Acquisition Sequence Diagram

3.5 DATA GLOVE

The input device used in Boltay Haath is the 5DT Data Glove 5. It is equipped with sensors that sense the movements of the hand and interface those movements with a computer. The 5DT Data Glove 5 measures finger flexure and the orientation (pitch and roll) of the user's hand. It offers 8-bit flexure resolution, a platform-independent serial interface (RS-232) and a built-in 2-axis tilt sensor.

Figure 3.4 - Components of 5DT Data Glove 5
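The wrapper class mentioned above is not reproduced in this report. A minimal sketch of how such a managed wrapper could call into an unmanaged vendor driver is shown below; the DLL name and entry points are assumptions modelled loosely on the 5DT SDK, not verified signatures.

    using System;
    using System.Runtime.InteropServices;

    // Hypothetical managed wrapper over the unmanaged glove driver.
    class GloveDriver
    {
        [DllImport("fglove.dll")] static extern IntPtr fdOpen(string port);
        [DllImport("fglove.dll")] static extern int fdClose(IntPtr glove);
        [DllImport("fglove.dll")] static extern void fdGetSensorRawAll(IntPtr glove, ushort[] data);

        private IntPtr handle;

        public GloveDriver(string port)     // e.g. "COM1" for the RS-232 interface
        {
            handle = fdOpen(port);
        }

        public ushort[] ReadFingers()
        {
            ushort[] raw = new ushort[5];   // one value per finger sensor
            fdGetSensorRawAll(handle, raw);
            return raw;
        }

        public void Close() { fdClose(handle); }
    }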
3.6 TRAINING

The Training subsystem trains the system so that it can perform gesture recognition afterwards. The training process is different for the two modes of operation of the Gesture Recognition Engine (GRE), i.e. Statistical Template Matching (STM) and Artificial Neural Networks (ANN). In both cases training is a batch process. For generalized recognition of gestures it was necessary to collect data from different users, so the system was trained using data obtained from six different signers [3]. Initially, training data was collected for the non-moving gestures, as in [4], of English as well as Urdu, since PSL contains both types of signs [5]. This was done because of the limitations of the input device: the Data Glove 5 does not provide abduction status, and there is no input about the location of the glove in space. The separate training processes for STM and ANN are discussed below.

3.6.1 STATISTICAL TEMPLATE SPECIFICATION (STS)

The idea is to demarcate different gestures by calculating the mean (µ) and standard deviation (σ) of each sensor for each gesture in the training set. The resultant (µ, σ) pairs, called templates, are stored in the gesture database for later use in gesture recognition; hence the name "Template Specification". The mean and standard deviation are calculated for each sensor of each gesture as follows:

µ(l,m) = (1/n) Σ xi                             (3.1)

σ(l,m) = √[ (1/n) Σ (xi − µ(l,m))² ]            (3.2)

Here, xi is the ith sensor value, n is the number of samples, µ(l,m) is the mean of the lth sensor of the mth gesture and σ(l,m) is the standard deviation of the lth sensor of the mth gesture.

Figure 3.5 - STS Sequence Diagram
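As a sketch of how equations (3.1) and (3.2) translate into code (the types here are illustrative; the actual STS implementation is not shown in this report):

    using System;
    using System.Collections;

    class TemplateSpecification
    {
        // Computes the (mu, sigma) template for one gesture from its
        // training samples, one pair per finger sensor (equations 3.1, 3.2).
        // samples holds byte[5] arrays of raw sensor values for this gesture.
        public static void Build(ArrayList samples, double[] mu, double[] sigma)
        {
            int n = samples.Count;
            for (int l = 0; l < 5; l++)
            {
                double sum = 0.0;
                foreach (byte[] s in samples) sum += s[l];
                mu[l] = sum / n;                                  // (3.1)

                double sq = 0.0;
                foreach (byte[] s in samples) sq += (s[l] - mu[l]) * (s[l] - mu[l]);
                sigma[l] = Math.Sqrt(sq / n);                     // (3.2)
            }
        }
    }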
3.6.2 ARTIFICIAL NEURAL NETWORK TRAINING (COMMITTEE SYSTEM)

The Artificial Neural Network (ANN) Training subsystem allows the various neural networks in the system to be trained. It collects data from the Gesture Database, applies a supervised learning algorithm (backpropagation [6]) to train the neural networks, and finally saves the networks in the database.

A single network could not converge on the available data and did not perform well, so it was decided to tackle the problem with a divide-and-conquer approach. This technique, called a committee system [7], combines the outputs of multiple networks, called experts, to produce a classification, rather than using only a single network. The rationale is the realization that there is not one global minimum into which every net trains, but many minima in which adequate classification of the training examples can be obtained; by combining the outputs of several networks it may be possible to gain better generalization than from any single network.

Each small network for a particular gesture is called an 'expert'. The training data for each expert contains an equal number of samples of the two classes it separates. For example, the training set for the expert for 'A' contains half samples of 'A', while the remaining half comprises the rest of the signs. Backpropagation was used to train the experts, with a learning rate of 0.1 and a training set of more than 2000 samples per expert. The input was scaled from the range 0 to 255 down to the range -1.28 to 1.27. This scaling, sometimes called pre-processing, reduced the error and provided better results after training.

Figure 3.6 - ANN Training Sequence Diagram
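The stated endpoints (0 maps to -1.28 and 255 to 1.27) are reproduced by the linear mapping x/100 - 1.28. A minimal sketch of this pre-processing step:

    // Scales a raw glove sample from 0..255 into -1.28..1.27 before it is
    // fed to the experts, as described above.
    class PreProcessing
    {
        public static double[] Scale(byte[] raw)
        {
            double[] x = new double[raw.Length];
            for (int i = 0; i < raw.Length; i++)
                x[i] = raw[i] / 100.0 - 1.28;    // 0 -> -1.28, 255 -> 1.27
            return x;
        }
    }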
3.7 GESTURE RECOGNITION ENGINE

The Gesture Recognition Engine is the core of the system. It performs gesture recognition using the two techniques (STM and ANN) and interacts with most of the subsystems: it takes gesture input from the Gesture Input subsystem, identifies the gestures, and gives output through the Output subsystem as text and speech. The separate recognition processes for STM and ANN are discussed below.

3.7.1 STATISTICAL TEMPLATE MATCHING

The statistical model used in Boltay Haath is the simplest approach to recognizing postures (static gestures) [8], [9]. The model used is known as "Template Matching" or "Prototype Matching" [10]. The idea is to demarcate different gestures by calculating the mean (µ) and standard deviation (σ) of each sensor for a gesture; input samples that fall within limits bounded by an integral multiple of the standard deviation are then recognized as correct. The gesture boundary [11] for each sensor is defined as

µ ± kσ                                          (3.3)

Here, µ is the mean and σ is the standard deviation of the sensor whose gesture boundary is to be defined. Gesture boundaries for each sensor of all the gestures are defined in the same way and used in pattern matching.

3.7.1.1 ALGORITHMIC MODEL

a) PATTERN RECOGNITION

After Statistical Template Specification (STS), test samples are provided to the pattern recognition module, which analyzes them using the statistical model [12]. The upper and lower limits for the value of a sensor for a particular gesture are defined using the standard deviation previously calculated for that sensor. To tune the accuracy of gesture recognition, various integral multiples of σ are used, denoted by k in (3.3). The limits for any given gesture are defined as:

Upper limit = µ + kσ                            (3.4)
Lower limit = µ − kσ                            (3.5)

Given these criteria, an input is classified as a particular gesture if all the sensor values of the test sample lie within these limits (i.e. within the gesture boundary). The limit values are retrieved from the gesture database. The values of k used for gesture recognition in Boltay Haath range from 1 to 3, providing tolerances ranging from 2σ to 6σ. The performance achieved by varying the value of k is discussed later in Testing and Verification.

b) AMBIGUITY AMONG OUTPUTS

Sometimes, due to ambiguity between two or more gestures, STM may produce multiple outputs. The ambiguity is created by the overlapping of different gesture boundaries, and the overlapping increases as the value of k is increased from 1 to 3. To cater to this problem, the method of Least Mean Squares (LMS) is used. Figure 3.7 shows two ambiguous signs, 'R' and 'H'.

Figure 3.7 - Ambiguous Signs
c) LMS FOR REMOVING AMBIGUITY

There are cases where more than one gesture is a candidate for output. To resolve this situation the system calculates the Least Mean Square (LMS) [13] value of every candidate gesture and selects the one with the minimum value. It is calculated as

LMS = Σ (xi − µi)²                              (3.6)

Here, xi denotes the value of the ith sensor in the test sample and µi denotes the mean value for the ith sensor. The LMS value for each candidate gesture is calculated, and the gesture with the least value is selected as the final output. The use of LMS is justified by the results: analysis of the system's performance shows that LMS resolves the ambiguity correctly 60% of the time.

Figure 3.8 - STM Gesture Recognition Sequence Diagram
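Putting the pieces together, the sketch below shows one way the STM classifier could combine the boundary test of (3.4)-(3.5) with LMS disambiguation (3.6). The data layout is illustrative: mu[m][l] and sigma[m][l] hold the template of sensor l for gesture m.

    using System;

    class StmRecognizer
    {
        // Returns the index of the recognized gesture, or -1 if no template
        // matches. Among all candidates whose boundary test passes, the one
        // with the smallest sum of squared deviations (LMS) wins.
        public static int Recognize(byte[] input, double[][] mu, double[][] sigma, double k)
        {
            int best = -1;
            double bestLms = double.MaxValue;
            for (int m = 0; m < mu.Length; m++)
            {
                bool inside = true;
                double lms = 0.0;
                for (int l = 0; l < 5; l++)
                {
                    double d = input[l] - mu[m][l];
                    if (Math.Abs(d) > k * sigma[m][l]) { inside = false; break; }  // (3.4)-(3.5)
                    lms += d * d;                                                  // (3.6)
                }
                if (inside && lms < bestLms) { bestLms = lms; best = m; }
            }
            return best;
        }
    }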
3.7.2 ARTIFICIAL NEURAL NETWORK (ANN)

In this mode, the GRE takes input data and feeds it to multiple artificial neural networks in parallel. The approach is to first process the input data so as to produce a description in terms of the various features (handshape, orientation and motion) of a sign; the sign can then be classified on the basis of the feature vector thus produced. This mode uses our Artificial Neural Network Library (ANNLIB) to run Multi-Layer Perceptrons (MLPs) for recognition.

3.7.2.1 COMMITTEE SYSTEM FOR RECOGNITION

The various experts (one neural network per gesture) were trained using a "one against all" technique, in which each network is trained to give a positive response for its particular sign and a negative one for all the others. In the final system all the experts therefore have the same architecture and are given the same input. Figure 3.9 shows the committee system: the input is fed to all the experts, whose outputs feed a voting mechanism that produces the final output.

Figure 3.9 - Committee System

a) ARCHITECTURE OF EXPERTS

The architecture of the experts used in the committee system is 5:8:1, i.e. 5 inputs, 8 hidden nodes and 1 output node. The activation function is sigmoid logistic for the nodes in the hidden layer and hyperbolic for the output node.

b) VOTING MECHANISM

The voting mechanism takes the outputs of all the experts as its input. It identifies the resultant gesture by examining the outputs of all the experts and selecting the one with a positive result.
c) FINAL CLASSIFICATION

Since the experts could not be optimally trained, multiple experts can give a positive result. To solve this problem, the voting mechanism multiplies the result of each expert by that expert's accuracy, giving more weight to the results of more accurate experts. The output with the largest positive value is then selected as the recognized gesture.

Figure 3.10 - ANN Gesture Recognition Sequence Diagram
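A sketch of this weighted voting step follows. The Expert type is a stand-in for a trained ANNLIB network; its interface here is an assumption, not the actual ANNLIB API.

    // Stand-in for a trained 5:8:1 expert network from ANNLIB.
    class Expert
    {
        public double Accuracy;                 // measured on validation data
        public double Output(double[] x)
        {
            // Placeholder for the MLP forward pass; the real network returns
            // a positive value when it recognizes its own sign.
            return 0.0;
        }
    }

    class VotingMechanism
    {
        // Weighs each expert's vote by its accuracy and picks the largest
        // positive result; returns -1 when no expert fires.
        public static int Classify(double[] scaledInput, Expert[] experts)
        {
            int winner = -1;
            double best = 0.0;
            for (int i = 0; i < experts.Length; i++)
            {
                double vote = experts[i].Output(scaledInput) * experts[i].Accuracy;
                if (vote > best) { best = vote; winner = i; }
            }
            return winner;
        }
    }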
3.8 OUTPUT

The output of the system has two forms: formatted text and speech. The more important of the two is the speech output, as it accomplishes the objective of the system.

3.8.1 TEXT

This subsystem outputs the recognized sign as formatted text. Urdu signs are output in Roman Urdu†. The text output is then used by the text-to-speech module for its processing.

† Roman Urdu - Urdu written using the English alphabet.

3.8.2 SPEECH

The text-to-speech subsystem converts text into synthesized Urdu/English speech using Microsoft Speech SDK version 5.1 [14]. The lexicon and phoneme sets have been modified so that words are pronounced correctly in the local accent.

Figure 3.11 - Speech Output Sequence Diagram
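A minimal sketch of the speech step, assuming the standard SpeechLib COM interop generated from SAPI 5.1; the asynchronous flag keeps speaking from blocking recognition, which helps meet the 0.25-second lag budget from section 2.1.

    using SpeechLib;   // COM interop assembly for Microsoft Speech SDK 5.1

    class SpeechOutput
    {
        private SpVoice voice = new SpVoice();

        // Speaks the recognized text (Roman Urdu or English) without blocking.
        public void Say(string recognizedText)
        {
            voice.Speak(recognizedText, SpeechVoiceSpeakFlags.SVSFlagsAsync);
        }
    }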
3.9 ACCELEROMETER

The accelerometer is used to detect signs continuously without any manual aid indicating the start or end of a gesture. It automatically identifies the end of one gesture and the start of the next, so that gestures are recognized in a continuous fashion.

3.9.1 ALGORITHMIC MODEL

Acceleration is calculated for each sensor by averaging the differences between the last n inputs (in a sliding-window fashion). When the acceleration of all the sensors is below a certain threshold value, the system identifies the state of the hand as stationary and sends the sensor values to the recognition engine. As soon as the acceleration exceeds the threshold value, the system marks the hand as in motion and stops recognition. The sliding window size and the threshold value are adjusted so that the user need not make a deliberate effort to stop for some time in order to get a sign recognized. The accelerometer is accessed by the user through an accelerometer interface, and the user can set the threshold value and sliding window size according to his or her needs.

Figure 3.12(a) - Motion Detection (hand in motion)
Window size = 8, threshold = 1.8. Sensor stream: 150 151 152 153 154 156 159 161 165 168 172 178 182 185 186 190. The differences within the current window are 2, 3, 2, 4, 3, 4, 6, so A = (2+3+2+4+3+4+6)/7 = 24/7 = 3.43. Since A > 1.8, the hand is in motion.

The figure above shows how the accelerometer determines that the hand is in motion: the average of the differences within the sliding window for a single sensor is above the threshold value, so the system identifies the sensor to be in motion.

Figure 3.12(b) - Motion Detection (hand stationary)
Window size = 8, threshold = 1.8. Sensor stream: 150 151 152 153 154 155 155 156 157 158 158 159 160 161 161 161. The differences within the current window are 1, 0, 1, 1, 1, 0, 1, so A = (1+0+1+1+1+0+1)/7 = 5/7 = 0.714. Since A < 1.8, the hand is stationary.

The figure above shows how the accelerometer determines that the hand is stationary: the average of the differences within the sliding window is below the threshold value, so the system identifies the sensor to be stationary.

The accelerometer is used on a per-sensor basis, so for five sensors five accelerometer objects are used, each continuously fed its corresponding sensor value. The accelerometer design used so far is limited to either static gestures or dynamic gestures.
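A sketch of the per-sensor motion detector described above; the window of raw values and the average of absolute successive differences match the worked examples in Figure 3.12, but the class itself is illustrative.

    using System;
    using System.Collections.Generic;

    class SensorAccelerometer
    {
        private readonly Queue<int> window = new Queue<int>();
        private readonly int size;
        private readonly double threshold;

        public SensorAccelerometer(int size, double threshold)   // e.g. (8, 1.8)
        {
            this.size = size;
            this.threshold = threshold;
        }

        // Feed one raw sensor reading; returns true while the sensor is in motion.
        public bool Update(byte value)
        {
            window.Enqueue(value);
            if (window.Count > size) window.Dequeue();

            double sum = 0.0; int n = 0; int prev = 0; bool first = true;
            foreach (int v in window)
            {
                if (!first) { sum += Math.Abs(v - prev); n++; }
                prev = v; first = false;
            }
            // Fig 3.12(a): 24/7 = 3.43 > 1.8 -> motion; Fig 3.12(b): 5/7 = 0.714 -> stationary
            return n > 0 && sum / n > threshold;
        }
    }

Five such objects run in parallel, one per finger sensor; the hand is treated as stationary, and a sample is sent to the recognition engine, only when all five report no motion.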
3.10 DEVELOPED TOOLS

Various tools and libraries were developed in the course of building the system. These include a C# wrapper class for the glove driver (the 5DT Data Glove 5 driver was originally written in unmanaged VC++ and was wrapped for use from managed code), an Artificial Neural Network library (ANNLIB) written in C#, and a performance evaluation tool for measuring the recognition accuracy and efficiency of the STM and ANN engines.

3.10.1 PERFORMANCE EVALUATION TOOL FOR GRE (PETool)

PETool evaluates the results obtained by applying the test data to the STM and ANN gesture recognition engines and generates reports and graphical views of the data for performance evaluation purposes. The simulation data is used to evaluate whether the current configurations of STM and ANN provide acceptable results.
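The essence of what PETool measures can be sketched as follows (a hypothetical harness reusing the StmRecognizer sketch above; the accuracy figures it yields are the kind reported later in Tables 3.1 and 3.2):

    class PETool
    {
        // Runs a labeled test set through the recognizer and reports the
        // percentage of correctly classified samples.
        public static double AccuracyPercent(byte[][] samples, int[] expected,
                                             double[][] mu, double[][] sigma, double k)
        {
            int correct = 0;
            for (int i = 0; i < samples.Length; i++)
                if (StmRecognizer.Recognize(samples[i], mu, sigma, k) == expected[i])
                    correct++;
            return 100.0 * correct / samples.Length;
        }
    }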
3.11 TRADEOFFS

Many tradeoffs regarding accuracy and efficiency were made during the design and implementation of the system. A major issue was the training of the neural networks: the amount of training data, the optimal architecture of the networks and the classification mechanism were all considerations. For quick training the training data set needed to be small; for greater accuracy more data was needed, but more time was then required for training. The quality of the training data was a major concern for STM as well as ANN. The greater the number of registered users, the better the generalization, but more data does not come without its share of bad samples.

In STM recognition, the gesture boundary of a sensor is defined as µ ± kσ; the system uses k = 3 after trying the values 1, 2 and 3. The µ ± 3σ model covers a large variation of the data (up to 6-sigma) but at the same time increases the overlapping of different gestures. This overlapping creates ambiguity among outputs, which has to be removed with LMS.

A similar case can be made for speech output. Text-to-speech provides an efficient way of producing speech output, but the quality of the sound produced is not on a par with pre-recorded human voice; recorded voices, however, incur a heavy processing cost when it comes to real-time recognition.

The accelerometer is used to filter the data stream coming from the Data Glove in a thread, so the performance of the thread will degrade if a decision-making block is executed at each cycle. The same applies to the accelerometer component [15].

3.12 IMPLEMENTATION TOOLS

The Boltay Haath system has been developed in C# using Visual Studio .Net 2002. The gesture database was maintained in an MS Access database file. Windows being the platform for the project, all the user interfaces and input components are standard Windows objects. Microsoft Speech SDK 5.1 was used for speech output.

3.13 COST

The cost of the components used in this project is given below.

Item              Cost
5DT Data Glove 5  $300

3.14 TESTING AND VERIFICATION

The subsystems were tested separately to check their performance in various scenarios. Because Boltay Haath has a highly modular design, top-down and bottom-up integration occurred simultaneously; however, the system was integrated incrementally, to control the number of bugs that needed to be fixed at any given time. Tests were conducted in a black-box fashion. Finally, it was verified that the software meets the performance criteria set during the design of the system specification. These tests were performed with the signers, since it was deemed necessary to know whether the available hardware meets the performance criteria.

3.14.1 TEST FOR RECOGNITION ACCURACY

The results were obtained using PETool, which was developed specifically for measuring the performance of the system.

a) STATISTICAL TEMPLATE MATCHING TEST RESULTS

                    Accuracy (%)
Domain              k=1   k=2   k=3
English Alphabets    24    73    80
Urdu Alphabets       24    84    88

Table 3.1 - Performance Result (Statistical Template Matching)

Figure 3.13(a) - Alphabet-wise recognition accuracy for STM - English
Figure 3.13(b) - Alphabet-wise recognition accuracy for STM - Urdu

b) ANN COMMITTEE SYSTEM TEST RESULTS

Domain          Accuracy (%)
24 Handshapes   84

Table 3.2 - Performance Result (ANN classification with committee system)

Figure 3.14 - Alphabet-wise accuracy for ANN Committee System - English

3.14.2 TEST FOR RECOGNITION TIME

Multiple gestures were provided to the system in sequence and the average time was calculated using the system clock. Under normal conditions the average recognition time was 0.4 seconds.
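A sketch of how such a timing measurement could be taken with the system clock (Environment.TickCount used here as an illustration; the actual test harness is not shown in this report):

    using System;

    class TimingTest
    {
        // Average per-gesture recognition time in milliseconds over a batch.
        public static double AverageRecognitionMs(byte[][] samples,
                                                  double[][] mu, double[][] sigma, double k)
        {
            int start = Environment.TickCount;
            for (int i = 0; i < samples.Length; i++)
                StmRecognizer.Recognize(samples[i], mu, sigma, k);
            return (Environment.TickCount - start) / (double)samples.Length;
        }
    }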
3.14.3 TEST FOR SYNCHRONIZED SPEECH SYNTHESIS

This performance parameter was measured using an external timing device and was found to be within the prescribed limits.

3.14.4 TEST FOR CONTINUOUS RECOGNITION

The system is able to distinguish between consecutive gestures using the accelerometer component.

4. SUMMARY

Deaf and dumb people rely on sign language interpreters for communication. However, they cannot depend on interpreters in everyday life, mainly because of the high costs involved and the difficulty of finding qualified interpreters. This system will help disabled persons improve their quality of life significantly.

The automatic recognition of sign language is an attractive prospect: the technology exists to make it possible, while the potential applications are exciting and worthwhile. To date the research emphasis has been on the capture and classification of the gestures of sign language, and this project will be a valuable addition to the ongoing research in the field of Human Computer Interface (HCI). The Boltay Haath system has been shown to work for Pakistan Sign Language (PSL) without invoking complex hand models, and the results obtained indicate that the system is able to recognize signs efficiently with a good rate of success.

Future research on Boltay Haath will address more complex gestures, such as those involving two hands. Other ways of modelling gesture dynamics, such as HMMs that achieve minimal classification error, will also be investigated. Dynamic gestures and online training are the two most attractive features left for future work.

Several new directions have been identified through which this work could be expanded in the near future. The techniques developed are not specific to PSL, so the system could easily be adapted to other sign languages or to other gesture recognition systems (for example, as part of a VR interface, telemetry or robotic control). It can be considered a step towards applications that provide user interfaces based on hand gestures.

One aspect of communication that could not be handled in Boltay Haath is two-way communication: currently Boltay Haath can convey words from the signer to the listener but not the other way around. One future enhancement would be to enable two-way communication.

The Boltay Haath system is now almost complete, though many enhancements and optimizations can still be made. In all, 83 gestures have been recognized, and this number can be increased as and when required by the user.
5. REFERENCES

[1] R. S. Pressman, Software Engineering: A Practitioner's Approach, Fourth Edition, McGraw-Hill International, 1997.
[2] Vesa-Matti Mantyla, Jani Mantyjarvi, Tapio Seppanen, Esa Tuulari, "Hand Gesture Recognition of a Mobile Device User", IEEE, 2000, pp. 281-284.
[3] Waleed Kadous, "GRASP: Recognition of Australian Sign Language Using Instrumented Gloves", Australia, October 1995, pp. 1-2, 4-8.
[4] Murakami and Taguchi, "Gesture Recognition Using Recurrent Neural Networks", CHI '91 Conference Proceedings, pp. 237-242, Human Interface Laboratory, Fujitsu Laboratories, ACM, 1991.
[5] Sulman Nasir, Sadaf Zuberi, "Pakistan Sign Language - A Synopsis", Pakistan, June 2000.
[6] Simon Haykin, Neural Networks: A Comprehensive Foundation, Second Edition, McMaster University, pp. 142.
[7] Peter W. Vamplew, Recognition of Sign Language Using Neural Networks, University of Tasmania, May 1996, pp. 98.
[8] Andrea Corradini, Horst-Michael Gross, "A Hybrid Stochastic-Connectionist Architecture for Gesture Recognition", IEEE, 2000, pp. 336-341.
[9] K. S. Fu, Syntactic Pattern Recognition, Prentice-Hall, 1981, pp. 75-80.
[10] Andrea Corradini, Horst-Michael Gross, "Camera-based Gesture Recognition for Robot Control", IEEE, 2000, pp. 133-138.
[11] I. Sommerville, Software Engineering (6th Ed.), Addison Wesley, chap. 1, pp. 8.
[12] I. Wachsmuth, T. Sowa (Eds.), "Towards an Automatic Sign Language Recognition System Using Subunits", London, April 2001, pp. 1-2.
[13] Barbara Liskov, Program Development in Java, chap. 11, pp. 356.
[14] The Microsoft Speech Website, www.microsoft.com/speech
[15] Richard Watson, "A Survey of Gesture Recognition Techniques, Technical Report", Trinity College, Dublin, July 1993, pp. 6.