What is SST?
It is a technology that helps to transmit information
without using our vocal cords.
It aims to observe our silent speech and transform it into speech.
The software can be installed on a wrist tag or display,
a mobile phone, or a PC.
“What happens if we don’t communicate? Suddenly we
lose our voice during an accident…”
Helps those who have lost their voice but wish to speak.
Output can be routed to communication networks.
People can speak over the phone without disturbing others.
They can also speak in noisy environments.
The idea was popularized in Stanley Kubrick’s 1968 science
fiction film “2001: A Space Odyssey” (using electronic signals).
The US space agency NASA has investigated the technique for
communicating in noisy environments such as the Space Station.
SST was demonstrated in 2010 at the “future park” of CeBIT,
one of the largest trade fairs.
This technology is being developed at the Karlsruhe Institute of
Technology (KIT), Germany.
Wand and Tanja Schultz
Electromyography (EMG) is a technique for evaluating and recording the electrical
activity produced by skeletal muscles.
It detects the electrical potential generated by muscle
cells when these cells are electrically or neurologically activated.
It is performed using an instrument called an electromyograph,
to produce a record called an electromyogram.
The signals can be analyzed to detect medical abnormalities.
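As a rough illustration of how a recorded muscle signal might be analyzed, the sketch below rectifies a sampled trace and smooths it with a moving average to obtain an activity envelope. This is a common textbook treatment of EMG data, not necessarily the processing used in SST; the function names, the window length and the threshold are assumptions made for the example.

```python
def emg_envelope(samples, window=5):
    """Full-wave rectify a sampled EMG trace and smooth it with a moving average.

    `window` (in samples) is an illustrative choice, not a value from the source.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if window < 1:
        raise ValueError("window must be at least 1")
    # Remove the DC offset so the trace is centred on zero.
    mean = sum(samples) / len(samples)
    rectified = [abs(s - mean) for s in samples]
    envelope = []
    for i in range(len(rectified)):
        # Trailing window: the current sample and up to window-1 earlier ones.
        start = max(0, i - window + 1)
        chunk = rectified[start:i + 1]
        envelope.append(sum(chunk) / len(chunk))
    return envelope


def active_samples(envelope, threshold):
    """Indices where the envelope exceeds a (hypothetical) activity threshold."""
    return [i for i, value in enumerate(envelope) if value > threshold]


# Usage: a quiet trace with a short burst of muscle activity in the middle.
trace = [0, 0, 0, 0, 1, -1, 1, -1, 0, 0, 0, 0]
burst = active_samples(emg_envelope(trace, window=2), threshold=0.75)
```

A longer window gives a smoother envelope but reacts more slowly to the start and end of a burst.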
How Can We Speak?
When we speak aloud, air passes through the
larynx (vocal cords) and over the tongue.
Words are produced using the articulator muscles in the
mouth and jaw region.
Sensors attached to the face monitor the tiny muscular movements that occur when we speak.
The monitored signals are converted into electrical pulses that can
then be turned into speech, without a sound being uttered.
Fig: Electromyography activity
The device presently needs nine leads to be attached to the
face, which makes it quite impractical to use.
It is a little painful.
Translation to the Chinese language is a bit difficult.
Image Processing in SST
A device-oriented package designed and implemented for the
purpose of lip reading.
It works based on our silent speech.
It can recognize words, single sentences or even continuous
sentences from people of different regions.
The device considers our non-speech accent and pronunciation
by observing every movement of our lips and facial expression.
Region of Interest (ROI) with key points
Perform lighting compensation on the image.
Extract the skin region and remove all the noisy data.
Check for face criteria.
Skin colour blocks are identified.
The height and width ratio (1.5 and 0.8) is computed and
a minimal face dimension constraint is applied.
Crop the current region.
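The face criteria check above can be sketched as a simple filter over candidate skin colour blocks. This is only an illustration: it assumes that 0.8 and 1.5 are the lower and upper bounds of the height divided by width ratio, and the minimum side length of 20 pixels is a made-up placeholder. The function names are hypothetical.

```python
def is_face_candidate(width, height, min_ratio=0.8, max_ratio=1.5, min_side=20):
    """Decide whether a skin colour block could be a face.

    Assumptions (not stated in the source): the ratio is height / width,
    0.8 and 1.5 are its bounds, and `min_side` is an arbitrary minimum
    face dimension in pixels.
    """
    if width <= 0 or height <= 0:
        return False
    # Minimal face dimension constraint.
    if min(width, height) < min_side:
        return False
    # Height and width ratio constraint.
    ratio = height / width
    return min_ratio <= ratio <= max_ratio


def filter_face_blocks(blocks, **criteria):
    """Keep only the (x, y, width, height) blocks that pass the face criteria."""
    return [b for b in blocks if is_face_candidate(b[2], b[3], **criteria)]


# Usage: only the first block has a plausible size and shape.
blocks = [(0, 0, 100, 120), (5, 5, 10, 12), (0, 0, 100, 300), (0, 0, 100, 60)]
faces = filter_face_blocks(blocks)
```

The surviving blocks are the regions that would then be cropped for further processing.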
Skin colour segmentation is one of the important steps in face feature extraction.
Colour segmentation of the human face depends on the
colour space that is selected.
Skin colours of different people are closely grouped in the
normalized RG colour plane (Yang and Waibel).
Search for the pixels which are close enough to this cluster.
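A minimal sketch of the search described above: each pixel is mapped to the normalized RG plane, r = R / (R + G + B) and g = G / (R + G + B), and is marked as skin when it lies within a chosen distance of a skin colour centre. The centre (0.45, 0.31) and the distance 0.06 are placeholder values chosen for the example; in practice they would be estimated from labelled skin samples.

```python
def normalized_rg(r, g, b):
    """Map an RGB pixel to the normalized RG plane (chromaticity)."""
    total = r + g + b
    if total == 0:
        # Black pixel: chromaticity is undefined, (0, 0) is used here by convention.
        return (0.0, 0.0)
    return (r / total, g / total)


def skin_mask(image, centre=(0.45, 0.31), max_distance=0.06):
    """Mark pixels whose normalized (r, g) lies close to an assumed skin centre.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    `centre` and `max_distance` are hypothetical values for illustration.
    """
    centre_r, centre_g = centre
    mask = []
    for row in image:
        mask_row = []
        for (r, g, b) in row:
            nr, ng = normalized_rg(r, g, b)
            distance = ((nr - centre_r) ** 2 + (ng - centre_g) ** 2) ** 0.5
            mask_row.append(distance <= max_distance)
        mask.append(mask_row)
    return mask


# Usage: one skin-like pixel among saturated blue, green and black pixels.
image = [[(200, 140, 110), (0, 0, 255)],
         [(0, 255, 0), (0, 0, 0)]]
mask = skin_mask(image)
```

Because the normalization divides out overall brightness, the same threshold can cover lighter and darker versions of the same hue.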
Active Shape Models
d) Active shape of face
Used to detect the face in the captured video.
The shape model is formed from a set of
manually annotated face shapes:
•Align all shapes of the learning data to an
arbitrary reference by geometric transformation.
•Calculate the average shape.
The model is positioned on the face.
It is iteratively deformed until it sticks to the
face in the respective bounding box.
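The two model-building steps above (alignment to a reference, then averaging) can be sketched as follows. Landmarks are stored as complex numbers x + yj so that one complex factor expresses rotation and uniform scaling together. This is a single-pass simplification under the assumption that the first shape serves as the arbitrary reference; the function names are hypothetical, and the iterative deformation step is not covered.

```python
def centre(shape):
    """Translate a shape (list of complex landmarks) so its centroid is the origin."""
    centroid = sum(shape) / len(shape)
    return [p - centroid for p in shape]


def align_to(shape, reference):
    """Least-squares similarity alignment of `shape` onto `reference`.

    Both shapes are centred first; the complex factor `a` minimizes
    sum(|a * z_i - w_i|^2) and encodes rotation and scale.
    """
    z = centre(shape)
    w = centre(reference)
    energy = sum(abs(p) ** 2 for p in z)
    if energy == 0:
        raise ValueError("degenerate shape: all landmarks coincide")
    a = sum(p.conjugate() * q for p, q in zip(z, w)) / energy
    return [a * p for p in z]


def mean_shape(shapes, reference_index=0):
    """Align every shape to one reference shape and average landmark by landmark."""
    reference = shapes[reference_index]
    aligned = [align_to(s, reference) for s in shapes]
    return [sum(points) / len(aligned) for points in zip(*aligned)]


# Usage: a square and a rotated, scaled, shifted copy of the same square.
square = [0 + 0j, 2 + 0j, 2 + 2j, 0 + 2j]
moved = [p * 3j + (5 + 5j) for p in square]
average = mean_shape([square, moved])
```

The average is expressed in the centred frame of the reference shape, so it still has to be positioned on the face before fitting.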
Mouth region localization.
Read the video frame by frame.
Create a detector for the face.
Detect the active contour inside the face region. Here the active contour is the lip (i.e.,
the region of major difference).
centroidColumn (X), centroidRow (Y): centroid point
Middlerow, middlecolumn: minor and major axis lines of the lip contour
Contour fitting point location
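To make the centroid and axis quantities above concrete, the sketch below computes them from a binary lip mask (1 inside the lip region, 0 outside). It is an illustrative assumption that the middle row and middle column are the ones passing through the centroid; the function names are hypothetical and only echo the labels used above.

```python
def lip_centroid(mask):
    """Return (centroid_column, centroid_row) of the set pixels, or None if empty."""
    row_total = col_total = count = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                row_total += r
                col_total += c
                count += 1
    if count == 0:
        return None
    return (col_total / count, row_total / count)


def lip_axes(mask):
    """Return (middle_row, middle_column, row_extent, column_extent).

    The extents are the spans of set pixels along the row and the column
    that pass through the centroid (assumed to be the two axis lines).
    """
    centroid = lip_centroid(mask)
    if centroid is None:
        return None
    centroid_column, centroid_row = centroid
    middle_row = int(round(centroid_row))
    middle_column = int(round(centroid_column))
    cols = [c for c, value in enumerate(mask[middle_row]) if value]
    rows = [r for r, row in enumerate(mask) if row[middle_column]]
    row_extent = cols[-1] - cols[0] + 1 if cols else 0
    column_extent = rows[-1] - rows[0] + 1 if rows else 0
    return (middle_row, middle_column, row_extent, column_extent)


# Usage: a small lip-like blob that is wider than it is tall.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
centroid = lip_centroid(mask)
axes = lip_axes(mask)
```

Tracking how the two extents change from frame to frame gives a simple measure of mouth opening and stretching.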
Fig: 1. Live video; 2. ROI video; 3. detected live video; 4. Lip during motion with perimeter contour and key points; 5. Multi-image montage (28 frames); 6. Threshold analysis
People can communicate in different languages by translating
the output of SST.
Helps to analyse and understand people who have lost their voice
or have a stuttering problem.
Silent Sound Technology is applied in the military for communicating
secret or confidential matters to others.
Helps people to make silent calls during meetings/ in mass
Users can say a PIN, credit card number, password and other
personal details without worrying about eavesdroppers.
The software can be installed on a wrist watch, wrist tag or
display, mobile phone, PC, etc.
The software is being trained based on the lip structure, complexion
and features of the lip area.
Provides an easier mode of communication for people with speech
disabilities by converting the identified lip movements directly to
The software can be integrated into mobile-oriented or hand-held devices.
Lip reading for Mandarin Chinese is highly personalized.
Systems are still preliminary and need improvement.