Learn how Siri works, without any jargon. You will understand, at a high level, the key concepts involved when you use Siri. These concepts also apply to Alexa and OK Google.
2. On AI Use Cases
Chief Data Scientist,
LinkedIn Published Author
I am not a data scientist, so please, no jargon. I have always wondered: how does Siri work?
3. On AI Use Cases
Did you ever see a dog making things complicated?
Let me explain...
4. Concept # 1
Let’s find out how Siri records your voice on an iPhone and converts it to a digital format.
Neuron Learning
5. Try this …
Hold your hand close to your mouth and say the word “Boom”. What did you experience?
When you speak, pressure waves are created, and a microphone uses these changes in air pressure to record sound. Your eardrum works on a similar principle.
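To make "convert to digital format" a bit more concrete, here is a minimal sketch of how a pressure wave can be turned into a list of numbers. It is not how the iPhone actually does it: the sample rate, the tone, and the function name `record` are assumptions chosen for illustration, and a pure sine tone stands in for a real voice.

```python
import math

SAMPLE_RATE = 16_000  # measurements per second (assumed value)
TONE_HZ = 440         # pitch of the stand-in "voice" (assumed value)
DURATION_S = 0.01     # length of the recording in seconds


def record(duration_s=DURATION_S, sample_rate=SAMPLE_RATE, tone_hz=TONE_HZ):
    """Measure a simulated pressure wave at regular intervals.

    Each measurement is scaled to a 16-bit integer, so the continuous
    wave becomes a plain list of whole numbers.
    """
    count = round(duration_s * sample_rate)
    samples = []
    for i in range(count):
        t = i / sample_rate                              # time of this measurement
        pressure = math.sin(2 * math.pi * tone_hz * t)   # value between -1 and 1
        samples.append(round(pressure * 32767))          # scale to 16-bit range
    return samples


samples = record()
print(len(samples))   # number of measurements taken
print(samples[:5])    # the first few digital values
```

The list of integers is the "digital format": the same wave, written down as numbers a computer can store and send.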
6. Show me the waves..
Below is a typical sound wave. In this case, I was asking Siri for directions.
7. After this step, Siri converts your speech to text. This requires a lot of compute power, so your voice is sent to the cloud. That is why Siri doesn’t work without an internet connection.
9. Concept # 2
Machine Learning (ML) algorithms are trained using lots of voice samples, with the accurate “text” provided for each sample. This is very similar to how we teach kids by showing them pictures.
This is a “dog”
10. Over 100,000 hours of audio are used for training
“my name”
“how long is”
“what”
“Columbus”
…equal to a full-time employee working for over 50 years
This voice labelling is done by humans and is then provided to the algorithms as ground truth.
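The "50 years" comparison can be checked with quick arithmetic. The sketch below assumes a 40-hour work week and 50 working weeks per year; those two figures are assumptions, not numbers from the slides.

```python
AUDIO_HOURS = 100_000   # training audio, from the slide
HOURS_PER_WEEK = 40     # assumed full-time work week
WEEKS_PER_YEAR = 50     # assumed working weeks per year

hours_per_year = HOURS_PER_WEEK * WEEKS_PER_YEAR   # 2,000 hours
years_of_work = AUDIO_HOURS / hours_per_year

print(years_of_work)  # 50.0
```

Since the slide says "over" 100,000 hours, the result comes out as over 50 years of full-time listening.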
11. Apple has been in the news for privacy concerns.
Source: TheGuardian.com
12. This is very important..
To train an ML algorithm, you need data that has been correctly labelled by humans. I highly recommend re-reading the last 3 pages if this is not clear.
13. On AI Use Cases
Are you saying that, during training, the algorithm predicts the text from voice samples, and these predictions are compared against the actual labels provided by human beings?
16. That is similar to how I learn. When I do a good job, you give me feedback, and then I do more of that.
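The compare-and-give-feedback idea can be sketched in a few lines. This is only the "compare" half of training, under simple assumptions: predictions and human labels are plain strings, and a prediction counts as right only if it matches exactly. The function name `score` and the sample predictions are made up for illustration.

```python
def score(predictions, labels):
    """Compare each predicted text with the human label.

    Returns a list of True/False feedback plus the share that were correct.
    """
    if not labels or len(predictions) != len(labels):
        raise ValueError("need one prediction per label, and at least one label")
    feedback = [predicted == actual for predicted, actual in zip(predictions, labels)]
    accuracy = sum(feedback) / len(feedback)
    return feedback, accuracy


# Human-provided ground truth (phrases from the earlier slide).
human_labels = ["my name", "how long is", "what", "Columbus"]
# Hypothetical output of an algorithm that is still learning.
predictions = ["my name", "how long is", "watt", "Columbus"]

feedback, accuracy = score(predictions, human_labels)
print(feedback)   # [True, True, False, True]
print(accuracy)   # 0.75
```

A real training process would use this kind of feedback to adjust the algorithm so that it makes fewer mistakes next time.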
17. Character level predictions..
The latest algorithms predict the characters in an audio file. The prediction is similar to how a non-English-speaking person would spell the word “Wife” if it were said very slowly: “WWIIIFFEE”. After that, rules such as removing duplicate characters are applied to get the most probable word.
WW_III_FF_EE → WIFE
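The "remove duplicate characters" rule can be sketched as follows. This is a minimal illustration in the style of what speech researchers call CTC decoding, assuming "_" marks a gap between characters; the function name `collapse` is made up, and real systems add more steps (such as checking against a dictionary) to pick the most probable word.

```python
from itertools import groupby

GAP = "_"  # assumed marker meaning "no character here"


def collapse(characters, gap=GAP):
    """Merge repeated characters, then drop the gap markers."""
    merged = [char for char, _run in groupby(characters)]
    return "".join(char for char in merged if char != gap)


print(collapse("WW_III_FF_EE"))  # WIFE
print(collapse("HHEELL_LLOO"))   # HELLO
```

The second example shows why the gap marker matters: without the "_" between the two runs of "L", the word would collapse to "HELO".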
18. Intent behind the text
Let’s assume a user tells Siri “Call Wife”. Based on our insights so far, we know Siri will convert it to the text “Call Wife”. How does Siri know what these words mean? That is covered in our last concept. You are almost there. Sit tight!
19. Concept # 3
Siri doesn’t try to figure out exactly what you said. Instead, it connects you to the most relevant services / apps based on “trigger words”.
20. “Call Wife”
After Siri has converted your speech to text, it identifies “Call” as the trigger word in “Call Wife” and, based on that, invokes the call-placing application.
21. Some trigger words..
Tell
Message
Text
Set Appointment
Book Meeting
Alarm
Wake me
Temperature
Umbrella
Weather
Given the large volume of voice samples, Siri has a good sense of the many different ways people ask for, say, “Weather”.
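Here is a minimal sketch of the trigger-word idea, not Siri's actual logic. The trigger words come from the slides; the service names they map to, the grouping, and the function name `find_service` are assumptions for illustration.

```python
import re

# Trigger words from the slides, mapped to hypothetical service names.
TRIGGERS = {
    "call": "phone",
    "tell": "messages",
    "message": "messages",
    "text": "messages",
    "set appointment": "calendar",
    "book meeting": "calendar",
    "alarm": "alarm",
    "wake me": "alarm",
    "temperature": "weather",
    "umbrella": "weather",
    "weather": "weather",
}


def find_service(text):
    """Return the service for the first trigger word found, or None."""
    lowered = text.lower()
    for trigger, service in TRIGGERS.items():
        # \b makes sure we match whole words ("text", not "context").
        if re.search(rf"\b{re.escape(trigger)}\b", lowered):
            return service
    return None


print(find_service("Do I need an umbrella today?"))  # weather
print(find_service("Wake me at 6"))                  # alarm
print(find_service("Draw a lion"))                   # None
```

A fixed list like this simply picks the first trigger it finds, so a request containing two triggers may be routed to the wrong place. Learning from many voice samples is what lets a real assistant handle the different ways people phrase the same request.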
22. Back to our “Call Wife” example
Once Siri realizes it has to place a “call”, it gets the name of the person to be called from the text. In our case, if a contact named “Wife” is saved, Siri places the call and completes your request.
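The two steps in the “Call Wife” example (spot the trigger word, then take the rest of the text as the contact name) can be sketched like this. The contact list, the phone number, and the function name `handle_call` are all hypothetical.

```python
# Hypothetical saved contacts; the number is a placeholder.
CONTACTS = {"wife": "+1-555-0100"}


def handle_call(text, contacts=CONTACTS):
    """Handle a request such as "Call Wife".

    Returns a description of what would happen, or None when the
    request does not start with the trigger word "call".
    """
    words = text.strip().split()
    if not words or words[0].lower() != "call":
        return None
    name = " ".join(words[1:])             # everything after the trigger word
    number = contacts.get(name.lower())    # look the name up in saved contacts
    if number is None:
        return f"No contact named '{name}'"
    return f"Calling {name} at {number}"


print(handle_call("Call Wife"))  # Calling Wife at +1-555-0100
print(handle_call("Call Boss"))  # No contact named 'Boss'
```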
24. You have seen this default answer…
When Siri stumbles upon a request beyond its capability, it mostly invokes a web search. For example, when you ask it to draw a lion.
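The default answer can be sketched as a fallback: try the known trigger words first, and hand everything else to a web search. The handler table, the reply texts, and the function name `respond` are made up for illustration.

```python
def respond(text, handlers):
    """Use a matching handler if there is one, else fall back to web search."""
    lowered = text.lower()
    for trigger, handler in handlers.items():
        if lowered.startswith(trigger):
            return handler(text)
    # Nothing matched: this is the "default answer".
    return f"Here is what I found on the web for '{text}'"


# Hypothetical handlers for requests the assistant knows how to do.
handlers = {
    "call": lambda text: "Placing a call",
    "wake me": lambda text: "Setting an alarm",
}

print(respond("Call Wife", handlers))    # Placing a call
print(respond("Draw a lion", handlers))  # Here is what I found on the web for 'Draw a lion'
```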
OMG. I cannot believe how easy it was to understand. How can I continue to learn more?
25. Follow Amol Palekar on LinkedIn to get the next edition of Neuron Learning.
For email notifications, subscribe at NeuronTimes.com
26. Resources
Natural Language Processing by National Research University Higher School of Economics on Coursera
Sequence Models by deeplearning.ai on Coursera
Adam Coates – Deep Speech at BAMMF on YouTube: https://www.youtube.com/watch?v=hyZCH3xU42E&list=PLes5o2b5ie2XMM4SLb7drQjQEPnalgJLl
Adam Coates – Deep Learning for Speech Recognition on YouTube: https://www.youtube.com/watch?v=g-sndkf7mCs&t=1980s
TheGuardian.com
These are my personal opinions and are for educational purposes only. Content has been simplified to convey core ideas to non-data scientists.