SlideShare a Scribd company logo
1 of 25
Download to read offline
Making an on-device
personal assistant a reality
Qualcomm Technologies, Inc.
@qualcomm_techJune 2018
2
Reasoning
Learn, infer context,
and anticipate
Perception
Hear, see, and
observe
Action
Act intuitively, interact
naturally, and protect
privacy
AI brings human-like
understanding and
behaviors to the
machines
3
Advancing AI research to make on-device AI ubiquitous
A common platform is fundamental to scaling AI internally and across the industry
Power efficiency
Model design, compression, quantization,
activation, algorithms, and efficient hardware
Efficient learning
Robust learning through minimal data,
unsupervised learning, and on-device learning
Personalization
Continuous learning, model adaptation,
and privacy-preserved distributed learning
System architecture
Multi-task and multi-modal learning, sensor fusion, and cloud-edge systems
Action
Reinforcement learning
for decision making
Reasoning
Scene understanding, language
understanding, behavior prediction
Perception
Object detection, speech
recognition, contextual fusion
AutomotiveIoT Mobile
4
A true personal assistant
One of many use cases requiring a broad set of AI capabilities
Power efficiency
Model design, compression, quantization,
activation, algorithms, and efficient hardware
Efficient learning
Robust learning through minimal data,
unsupervised learning, and on-device learning
Personalization
Continuous learning, model adaptation,
and privacy-preserved distributed learning
AutomotiveIoT Mobile
Action
Reinforcement learning
for decision making
Reasoning
Scene understanding, language
understanding, behavior prediction
Perception
Object detection, speech
recognition, contextual fusion
System architecture
Multi-task and multi-modal learning, sensor fusion, and cloud-edge systems
5
v
Designed to be:
Always-on
Conversational
Personal
Private
Critical to create a
true virtual assistant
Voice is the
transformative user
interface (UI) we’ve
been waiting for
6
Voice UI components required for an end-to-end solution
Text-to-speech
Speech
synthesis
Natural language generation
Dialog management
Natural language understanding
Natural
language
processing
Speech-to-text
Speech
recognition
“Alexa,” “Hey Snapdragon”
Always-on keyword detection
Voice
activation
Echo cancellation
Speech denoising
Speech
pre-processing
Signal acquisition and playback
Front-end
processing
Machine speech chain: listener and speaker
Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
77
Machine learning has ignited the voice UI revolution
“As speech recognition accuracy goes from say 95% to 99%, all of us in the room will go from barely using it today to using it all
the time. Most people underestimate the difference between 95% and 99% accuracy— 99% is a gamechanger. No one wants
to wait 10 seconds for a response. Accuracy, followed by latency, are the two key metrics for a production speech system.”
— Andrew Ng
GMM: Gaussian Mixture Model, CNN: Convolutional Neural Network, RNN: Recurrent Neural Network
Human accuracy
GMM RNN + CNN
Machine automatic speech recognition accuracy
202020102000199019801970
50%
CNN
55%
60% 62%
70%
95%
8
TV
Smart
speakers
Headphones
and headsets
IoT
Smartphones
and tablets
In-car
entertainment
systems
PCs and
laptops
Wearables
XR
Voice UI is proliferating
across product categories
9
Moving voice UI functionality to the end device
An end-to-end solution powered by machine learning
Automatic speech
recognition
(ASR)
Text-to-speech
(TTS)
News
SMS
Music
Maps Wikipedia
Weather Stocks
Voice
activation
Service manager
Multi-mic echo
cancellation,
beamforming,
and speech
denoising
Natural language
understanding
(NLU)
On-device processing
(always-on and real-time)
Cloud processing
(services)
Cloud centric (today)
10
Moving voice UI functionality to the end device
An end-to-end solution powered by machine learning
Automatic speech
recognition
(ASR)
Text-to-speech
(TTS)
News
SMS
Music
Maps Wikipedia
Weather Stocks
Voice
activation
Service manager
Multi-mic echo
cancellation,
beamforming,
and speech
denoising
Natural language
understanding
(NLU)
On-device processing (always-on and real-time) Cloud processing (services)
On-device centric (future)
11
Cloud tasks
Complex voice fallback
Training and model update
Knowledge base
Services
On-device
processing
of voice UI
Provides unique benefits
complementing the cloud
Challenge
Providing the voice UI functionality
within the power/thermal envelope
Machine learning models
Offline raw data
Queries
On-device tasks
Automatic speech recognition
Natural language processing
Always-on audio cognition
On-device training
Benefits
Privacy
Instant response
Always-on
Device context
12
Noisy speech spectrogram
Clean speech spectrogram
“If people were more generous,
there would be no need for welfare”
0.5 1 1.5 2 2.5 3
4000
3000
2000
1000
0
Frequency
Time
0.5 1 1.5 2 2.5 3
4000
3000
2000
1000
0
Frequency
Time
DL-based denoising model
trained with extensive speech
noise databases
DL-based
denoising
Speech
denoising
• Single or multiple mics
• Applicable for
◦ Two-way conversation
◦ Voice/speaker recognition
◦ Keyword spotting
• Deep learning (DL)
significantly improves the
performance over traditional
methods
• Robust in challenging
interference and noise
scenarios
“If people were more generous,
there would be no need for welfare”
13
Qualcomm
Voice
Activation
supports:
Qualcomm® Voice
Activation (VA)
High accuracy, robust to background
noise, and supports multiple languages
Deep learning is improving performance
Among state-of-the-art in terms of
performance vs. power consumption
-47%
-11%
2014 2015 2016 2017
QualcommVApowerconsumption
WCD9330
WCD9335
WCD9340
Amazon Alexa
Baidu DUEROS
Microsoft Cortana
Google Assistant
Qualcomm Voice Activation, Qualcomm WCD9330, Qualcomm WCD9335, and Qualcomm WCD9340 are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
14
Transcribe the
audio to text
Deep learning gives
state-of-the-art accuracy
on a mobile device
Personalization—adaptation
to individual accent and
acoustic environment
Automatic
speech
recognition
Natural language
understanding (NLU)
Acoustic features Acoustic model
Reduce input audio to
essential information
Language model
Deep learning converts
input into linguistic units
Adapted to each user’s
accent and environment
Uses context and language
statistics for best utterance
estimation
Adapted to each user’s
speaking tendencies
Allows the same intention
to be expressed in multiple
ways
Adapted to each user’s
intent expressions
“Turn  on  the  light”
On-device automatic speech recognition (ASR)
User intention
1515
An end-to-end on-device voice UI example for smart homes
99% on-device intent accuracy
is achieved for domain specific command sets when adapted to accent and environmental condition
Demo of automatic speech recognition and natural language understanding
Large command set
Turn on the living room lights
Click the kitchen lights off
Turn off all lights
Switch on the ceiling fan
Shut off the sprinklers
Start music
Pause song
Next track
Go back one
Play previous song
Turn speaker off
Increase temperature
Intent understanding
Turn on the kitchen light
Click kitchen light on
Switch on light in the kitchen
Turn the light on in the kitchen
NLU: These four phrases
map to the same intent
16
A true virtual
assistant
A “digital me” sitting on the device:
context aware and personalized
1717
Calendar
Messaging
Apps
On-device data
Contextual intelligence is required for personalization
The fusion of many types of sensors and personal information
17
Low power sensing, processing, and connectivity
Efficient, heterogeneous
architectures
Sensor fusion and
machine learning
Integrated, always-on
data capturing
Low-energy wireless technologies
(e.g. BT-LE, 5G NR IoT)
Cloud data
Off-device data
IoT data
Sensor fusion
Gyroscope
Compass
Camera
Ambient light
Temperature
Iris scanEnvironment
Pulse
HumidityMicrophone
Sensor data
C-V2X
1818
Creating personalized memories
Essential for a true virtual assistant
History, number
of people, identity
After the party, strolling on the
beach at sunset in La Jolla talking
with my son and laughing
Live sentiment
analysis
Strolling on the beach at sunset
in La Jolla talking with my son
and laughing
Activity analysis
Strolling on the beach at sunset
in La Jolla talking with my son
GPS location
La Jolla, California
Visual analysis
A sunset over the ocean
in La Jolla
Sound analysis
Talking with my son at sunset in
La Jolla
19
A true personal assistant is responsive and proactive
“Remember the time I was strolling with
my son after the party at La Jolla beach?”
“Yes I do, here is a picture you took of the sunset.
Should I share it with your family group on WeChat?”
Responsive
Decision-making and conversation based
on contextual analysis and prompting
(e.g. finding memories)
Proactive
Decision-making and conversation based
on contextual analysis without prompting
(e.g. automatically sharing memories)
“I noticed that you are tired and stressed, I’m turning
on the Rocky III soundtrack and navigating you
to the gym for a workout and sauna.”
“This music gets my blood going and a
workout and sauna will help me relieve stress.”
20
Multi-mic echo
cancellation,
beamforming,
and speech
denoising
The first step to an on-device virtual assistant
Enabling on-device voice UI
News
SMS
Music
Maps Wikipedia
Weather Stocks
Voice
activation
Service manager
Automatic speech
recognition (ASR)
Natural language
understanding (NLU)
On-device processing Cloud processing
Text-to-speech
(TTS)
21
Adding an “AI agent” to create a true virtual assistant
The on-device AI agent continuously learns personal knowledge and acts intuitively
On-device processing Cloud processing (services)
Cloud knowledge graph
Automatic speech
recognition (ASR)
Multi-mic echo
cancellation and
beamforming
Sensors
Text-to-speech
(TTS)
Natural language
understanding (NLU)
Voice
activation
AI agent
22
Adding an “AI agent” to create a true virtual assistant
Contextualization allows personalization at acoustic, intent, and behavior levels
Cloud processing (services)
Automatic speech
recognition (ASR)
Voice
activation
Multi-mic echo
cancellation and
beamforming
Cloud knowledge
graph
On-device processing
Text-to-speech
(TTS)
Sensors
Natural language
understanding (NLU)
AI agent
Dialog
management
Speaker identification
Acoustic event detection
Gender and age detection
Voice activity detection
Emotion classification
Contextual fusion
and learning
Local knowledge
graph
23
Various kitchen
noise samples
Posterior-gram
ML-based
acoustic event
classification
Object rustling
Object snapping
Cupboard
Cutlery
Dishes
Drawer
Glass jingling
Object impact
Walking
Washing dishes
Water tap running
2 4 6 8 10 12
50
100
150
200
250
300
350
400
450
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
Acoustic
event
detection
• ML techniques
are used to
◦ Classify acoustic
signals into a set of
predefined events
◦ Infer acoustic environment
• Low power,
always-on
2424
We are advancing AI research
to make on-device AI ubiquitous
We are creating AI platform
innovations that are fundamental
to scaling AI across the industry
We provide the low-power
end-to-end on-device solution
for a true personal assistant
Follow us on:
For more information, visit us at:
www.qualcomm.com & www.qualcomm.com/blog
Thank you!
Nothing in these materials is an offer to sell any of the
components or devices referenced herein.
©2018 Qualcomm Technologies, Inc. and/or its affiliated
companies. All Rights Reserved.
Qualcomm and Snapdragon are trademarks of Qualcomm
Incorporated, registered in the United States and other
countries. Other products and brand names may be
trademarks or registered trademarks of their respective
owners.
References in this presentation to “Qualcomm” may mean Qualcomm
Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries
or business units within the Qualcomm corporate structure, as
applicable. Qualcomm Incorporated includes Qualcomm’s licensing
business, QTL, and the vast majority of its patent portfolio. Qualcomm
Technologies, Inc., a wholly-owned subsidiary of Qualcomm
Incorporated, operates, along with its subsidiaries, substantially all of
Qualcomm’s engineering, research and development functions, and
substantially all of its product and services businesses, including its
semiconductor business, QCT.

More Related Content

What's hot

Emerging vision technologies
Emerging vision technologiesEmerging vision technologies
Emerging vision technologiesQualcomm Research
 
On-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VROn-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VRQualcomm Research
 
Connected Lighting: The Next Frontier in the Internet of Everything
Connected Lighting: The Next Frontier in the Internet of EverythingConnected Lighting: The Next Frontier in the Internet of Everything
Connected Lighting: The Next Frontier in the Internet of EverythingQualcomm Developer Network
 
"The Vision AI Start-ups That Matter Most," a Presentation from Cognite Ventures
"The Vision AI Start-ups That Matter Most," a Presentation from Cognite Ventures"The Vision AI Start-ups That Matter Most," a Presentation from Cognite Ventures
"The Vision AI Start-ups That Matter Most," a Presentation from Cognite VenturesEdge AI and Vision Alliance
 
Factors effecting positional accuracy of iBeacons
Factors effecting positional accuracy of iBeacons Factors effecting positional accuracy of iBeacons
Factors effecting positional accuracy of iBeacons Chris Thomson
 
Intel APJ Enterprise Day - Synopses of Demos at Intel Collaboration Center
Intel APJ Enterprise Day - Synopses of Demos at Intel Collaboration CenterIntel APJ Enterprise Day - Synopses of Demos at Intel Collaboration Center
Intel APJ Enterprise Day - Synopses of Demos at Intel Collaboration CenterIntelAPAC
 
"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot Components
"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot Components"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot Components
"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot ComponentsEdge AI and Vision Alliance
 
LocalSocial - Indoor Location Positioning Overview
LocalSocial - Indoor Location Positioning OverviewLocalSocial - Indoor Location Positioning Overview
LocalSocial - Indoor Location Positioning OverviewSean O'Sullivan
 
"Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ...
"Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ..."Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ...
"Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ...Edge AI and Vision Alliance
 
“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...
“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...
“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...Edge AI and Vision Alliance
 
Indoor gps via QR codes
Indoor gps via QR codesIndoor gps via QR codes
Indoor gps via QR codesVaibhav Sharma
 
“Video Activity Recognition with Limited Data for Smart Home Applications,” a...
“Video Activity Recognition with Limited Data for Smart Home Applications,” a...“Video Activity Recognition with Limited Data for Smart Home Applications,” a...
“Video Activity Recognition with Limited Data for Smart Home Applications,” a...Edge AI and Vision Alliance
 
Bluetooth Smart (Low Energy) for Android
Bluetooth Smart (Low Energy) for AndroidBluetooth Smart (Low Energy) for Android
Bluetooth Smart (Low Energy) for AndroidLocalz
 
"Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon...
"Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon..."Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon...
"Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon...Edge AI and Vision Alliance
 
LCU13: Networking Summit Keynote
LCU13: Networking Summit KeynoteLCU13: Networking Summit Keynote
LCU13: Networking Summit KeynoteLinaro
 
"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from Qualcomm"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from QualcommEdge AI and Vision Alliance
 
Microsoft Azure-powered IoT & AI Solution To Help Farmer
Microsoft Azure-powered IoT & AI Solution To Help FarmerMicrosoft Azure-powered IoT & AI Solution To Help Farmer
Microsoft Azure-powered IoT & AI Solution To Help FarmerAndri Yadi
 
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles NgSE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles NgAMD Developer Central
 
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ..."2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...Edge AI and Vision Alliance
 

What's hot (20)

Emerging vision technologies
Emerging vision technologiesEmerging vision technologies
Emerging vision technologies
 
On-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VROn-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VR
 
Connected Lighting: The Next Frontier in the Internet of Everything
Connected Lighting: The Next Frontier in the Internet of EverythingConnected Lighting: The Next Frontier in the Internet of Everything
Connected Lighting: The Next Frontier in the Internet of Everything
 
"The Vision AI Start-ups That Matter Most," a Presentation from Cognite Ventures
"The Vision AI Start-ups That Matter Most," a Presentation from Cognite Ventures"The Vision AI Start-ups That Matter Most," a Presentation from Cognite Ventures
"The Vision AI Start-ups That Matter Most," a Presentation from Cognite Ventures
 
Factors effecting positional accuracy of iBeacons
Factors effecting positional accuracy of iBeacons Factors effecting positional accuracy of iBeacons
Factors effecting positional accuracy of iBeacons
 
Intel APJ Enterprise Day - Synopses of Demos at Intel Collaboration Center
Intel APJ Enterprise Day - Synopses of Demos at Intel Collaboration CenterIntel APJ Enterprise Day - Synopses of Demos at Intel Collaboration Center
Intel APJ Enterprise Day - Synopses of Demos at Intel Collaboration Center
 
Harvard presentation
Harvard presentationHarvard presentation
Harvard presentation
 
"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot Components
"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot Components"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot Components
"How to Choose a 3D Vision Sensor," a Presentation from Capable Robot Components
 
LocalSocial - Indoor Location Positioning Overview
LocalSocial - Indoor Location Positioning OverviewLocalSocial - Indoor Location Positioning Overview
LocalSocial - Indoor Location Positioning Overview
 
"Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ...
"Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ..."Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ...
"Imaging + AI: Opportunities Inside the Car and Beyond," a Presentation from ...
 
“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...
“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...
“Alternative Image Sensors for Intelligent In-Cabin Monitoring, Home Security...
 
Indoor gps via QR codes
Indoor gps via QR codesIndoor gps via QR codes
Indoor gps via QR codes
 
“Video Activity Recognition with Limited Data for Smart Home Applications,” a...
“Video Activity Recognition with Limited Data for Smart Home Applications,” a...“Video Activity Recognition with Limited Data for Smart Home Applications,” a...
“Video Activity Recognition with Limited Data for Smart Home Applications,” a...
 
Bluetooth Smart (Low Energy) for Android
Bluetooth Smart (Low Energy) for AndroidBluetooth Smart (Low Energy) for Android
Bluetooth Smart (Low Energy) for Android
 
"Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon...
"Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon..."Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon...
"Combining Cloud and Edge Machine Learning to Deliver the Future of Video Mon...
 
LCU13: Networking Summit Keynote
LCU13: Networking Summit KeynoteLCU13: Networking Summit Keynote
LCU13: Networking Summit Keynote
 
"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from Qualcomm"Is Vision the New Wireless?," a Presentation from Qualcomm
"Is Vision the New Wireless?," a Presentation from Qualcomm
 
Microsoft Azure-powered IoT & AI Solution To Help Farmer
Microsoft Azure-powered IoT & AI Solution To Help FarmerMicrosoft Azure-powered IoT & AI Solution To Help Farmer
Microsoft Azure-powered IoT & AI Solution To Help Farmer
 
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles NgSE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
SE-4061, Low Power Yet Robust Biometric Fingerprint Technology, by Charles Ng
 
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ..."2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
"2D and 3D Sensing: Markets, Applications, and Technologies," a Presentation ...
 

Similar to Making an on-device personal assistant a reality

Wearable Computing and Human Computer Interfaces
Wearable Computing and Human Computer InterfacesWearable Computing and Human Computer Interfaces
Wearable Computing and Human Computer InterfacesJeffrey Funk
 
Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first
Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first
Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first Patrick Van Renterghem
 
Artificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionArtificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionRHIMRJ Journal
 
Unlocking New Todays: Artificial Intelligence and Data Platforms on AWS
Unlocking New Todays: Artificial Intelligence and Data Platforms on AWSUnlocking New Todays: Artificial Intelligence and Data Platforms on AWS
Unlocking New Todays: Artificial Intelligence and Data Platforms on AWSAmazon Web Services
 
Mobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile contextMobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile contextFlorent Stroppa
 
2009 Mux Florentstroppa Mobilecontext Small
2009 Mux Florentstroppa Mobilecontext Small2009 Mux Florentstroppa Mobilecontext Small
2009 Mux Florentstroppa Mobilecontext SmallFlorent Stroppa
 
Artificial intelligence Overview by Ramya Mopidevi
Artificial intelligence Overview by Ramya MopideviArtificial intelligence Overview by Ramya Mopidevi
Artificial intelligence Overview by Ramya MopideviRamya Mopidevi
 
AI for voice recognition.pptx
AI for voice recognition.pptxAI for voice recognition.pptx
AI for voice recognition.pptxJhalakDashora
 
Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...
Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...
Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...BAQMaR
 
Ad:Tech Conference 2014
Ad:Tech Conference 2014Ad:Tech Conference 2014
Ad:Tech Conference 2014Filip Maertens
 
Ad:Tech Conference 2014
Ad:Tech Conference 2014Ad:Tech Conference 2014
Ad:Tech Conference 2014Argus Labs
 
Wake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phoneWake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phoneIJERA Editor
 
Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...
Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...
Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...Argus Labs
 
Earables for Personal-scale Behaviour Analytics
Earables for Personal-scale Behaviour AnalyticsEarables for Personal-scale Behaviour Analytics
Earables for Personal-scale Behaviour AnalyticsFahim Kawsar
 
Human Attention (HA) for an AI augmented world #DevFest19
Human Attention (HA) for an AI augmented world #DevFest19Human Attention (HA) for an AI augmented world #DevFest19
Human Attention (HA) for an AI augmented world #DevFest19Alexandra Petruș
 
Reveal the Car's Potential with Cognitive IoT
Reveal the Car's Potential with Cognitive IoTReveal the Car's Potential with Cognitive IoT
Reveal the Car's Potential with Cognitive IoTSebastian Wedeniwski
 

Similar to Making an on-device personal assistant a reality (20)

Wearable Computing and Human Computer Interfaces
Wearable Computing and Human Computer InterfacesWearable Computing and Human Computer Interfaces
Wearable Computing and Human Computer Interfaces
 
Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first
Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first
Filip Maertens - AI, Machine Learning and Chatbots: Think AI-first
 
Artificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionArtificial Intelligence for Speech Recognition
Artificial Intelligence for Speech Recognition
 
Unlocking New Todays: Artificial Intelligence and Data Platforms on AWS
Unlocking New Todays: Artificial Intelligence and Data Platforms on AWSUnlocking New Todays: Artificial Intelligence and Data Platforms on AWS
Unlocking New Todays: Artificial Intelligence and Data Platforms on AWS
 
Mobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile contextMobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile context
 
2009 Mux Florentstroppa Mobilecontext Small
2009 Mux Florentstroppa Mobilecontext Small2009 Mux Florentstroppa Mobilecontext Small
2009 Mux Florentstroppa Mobilecontext Small
 
Artificial intelligence Overview by Ramya Mopidevi
Artificial intelligence Overview by Ramya MopideviArtificial intelligence Overview by Ramya Mopidevi
Artificial intelligence Overview by Ramya Mopidevi
 
AI for voice recognition.pptx
AI for voice recognition.pptxAI for voice recognition.pptx
AI for voice recognition.pptx
 
Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...
Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...
Filip Maertens - Artificial Intelligence: Building Emotion & Context aware Re...
 
Ad:Tech Conference 2014
Ad:Tech Conference 2014Ad:Tech Conference 2014
Ad:Tech Conference 2014
 
Ad:Tech Conference 2014
Ad:Tech Conference 2014Ad:Tech Conference 2014
Ad:Tech Conference 2014
 
Wake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phoneWake-up-word speech recognition using GPS on smart phone
Wake-up-word speech recognition using GPS on smart phone
 
Pervasive Computing
Pervasive ComputingPervasive Computing
Pervasive Computing
 
Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...
Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...
Digital Marketing First 2014 - Context Aware Computing and Cross Channel Pers...
 
emerging trends.pdf
emerging trends.pdfemerging trends.pdf
emerging trends.pdf
 
Presentation.ai
Presentation.aiPresentation.ai
Presentation.ai
 
Earables for Personal-scale Behaviour Analytics
Earables for Personal-scale Behaviour AnalyticsEarables for Personal-scale Behaviour Analytics
Earables for Personal-scale Behaviour Analytics
 
Human Attention (HA) for an AI augmented world #DevFest19
Human Attention (HA) for an AI augmented world #DevFest19Human Attention (HA) for an AI augmented world #DevFest19
Human Attention (HA) for an AI augmented world #DevFest19
 
Pervasive Social Networking
Pervasive Social NetworkingPervasive Social Networking
Pervasive Social Networking
 
Reveal the Car's Potential with Cognitive IoT
Reveal the Car's Potential with Cognitive IoTReveal the Car's Potential with Cognitive IoT
Reveal the Car's Potential with Cognitive IoT
 

More from Qualcomm Research

Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdfQualcomm Research
 
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Research
 
Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Qualcomm Research
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfQualcomm Research
 
Enabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfEnabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfQualcomm Research
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIQualcomm Research
 
Bringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingBringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingQualcomm Research
 
How will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfHow will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfQualcomm Research
 
Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)Qualcomm Research
 
Realizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GRealizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GQualcomm Research
 
3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolution3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolutionQualcomm Research
 
AI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptAI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptQualcomm Research
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Qualcomm Research
 
5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edgeQualcomm Research
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scaleQualcomm Research
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G futureQualcomm Research
 
How AI research is enabling next-gen codecs
How AI research is enabling next-gen codecsHow AI research is enabling next-gen codecs
How AI research is enabling next-gen codecsQualcomm Research
 
Role of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingRole of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingQualcomm Research
 

More from Qualcomm Research (20)

Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdf
 
The future of AI is hybrid
The future of AI is hybridThe future of AI is hybrid
The future of AI is hybrid
 
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
 
Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdf
 
Enabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfEnabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdf
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AI
 
Bringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingBringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensing
 
How will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfHow will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdf
 
Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)
 
Realizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GRealizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5G
 
3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolution3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolution
 
AI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptAI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-concept
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18
 
5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scale
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G future
 
How AI research is enabling next-gen codecs
How AI research is enabling next-gen codecsHow AI research is enabling next-gen codecs
How AI research is enabling next-gen codecs
 
Role of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingRole of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous driving
 
Pioneering 5G broadcast
Pioneering 5G broadcastPioneering 5G broadcast
Pioneering 5G broadcast
 

Recently uploaded

Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Making an on-device personal assistant a reality

  • 1. Making an on-device personal assistant a reality Qualcomm Technologies, Inc. @qualcomm_techJune 2018
  • 2. 2 Reasoning Learn, infer context, and anticipate Perception Hear, see, and observe Action Act intuitively, interact naturally, and protect privacy AI brings human-like understanding and behaviors to the machines
  • 3. 3 Advancing AI research to make on-device AI ubiquitous A common platform is fundamental to scaling AI internally and across the industry Power efficiency Model design, compression, quantization, activation, algorithms, and efficient hardware Efficient learning Robust learning through minimal data, unsupervised learning, and on-device learning Personalization Continuous learning, model adaptation, and privacy-preserved distributed learning System architecture Multi-task and multi-modal learning, sensor fusion, and cloud-edge systems Action Reinforcement learning for decision making Reasoning Scene understanding, language understanding, behavior prediction Perception Object detection, speech recognition, contextual fusion AutomotiveIoT Mobile
  • 4. 4 A true personal assistant One of many use cases requiring a broad set of AI capabilities Power efficiency Model design, compression, quantization, activation, algorithms, and efficient hardware Efficient learning Robust learning through minimal data, unsupervised learning, and on-device learning Personalization Continuous learning, model adaptation, and privacy-preserved distributed learning AutomotiveIoT Mobile Action Reinforcement learning for decision making Reasoning Scene understanding, language understanding, behavior prediction Perception Object detection, speech recognition, contextual fusion System architecture Multi-task and multi-modal learning, sensor fusion, and cloud-edge systems
  • 5. 5 v Designed to be: Always-on Conversational Personal Private Critical to create a true virtual assistant Voice is the transformative user interface (UI) we’ve been waiting for
  • 6. 6 Voice UI components required for an end-to-end solution Text-to-speech Speech synthesis Natural language generation Dialog management Natural language understanding Natural language processing Speech-to-text Speech recognition “Alexa,” “Hey Snapdragon” Always-on keyword detection Voice activation Echo cancellation Speech denoising Speech pre-processing Signal acquisition and playback Front-end processing Machine speech chain: listener and speaker Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
  • 7. 77 Machine learning has ignited the voice UI revolution “As speech recognition accuracy goes from say 95% to 99%, all of us in the room will go from barely using it today to using it all the time. Most people underestimate the difference between 95% and 99% accuracy— 99% is a gamechanger. No one wants to wait 10 seconds for a response. Accuracy, followed by latency, are the two key metrics for a production speech system.” — Andrew Ng GMM: Gaussian Mixture Model, CNN: Convolutional Neural Network, RNN: Recurrent Neural Network Human accuracy GMM RNN + CNN Machine automatic speech recognition accuracy 202020102000199019801970 50% CNN 55% 60% 62% 70% 95%
  • 8. 8 TV Smart speakers Headphones and headsets IoT Smartphones and tablets In-car entertainment systems PCs and laptops Wearables XR Voice UI is proliferating across product categories
  • 9. 9 Moving voice UI functionality to the end device An end-to-end solution powered by machine learning Automatic speech recognition (ASR) Text-to-speech (TTS) News SMS Music Maps Wikipedia Weather Stocks Voice activation Service manager Multi-mic echo cancellation, beamforming, and speech denoising Natural language understanding (NLU) On-device processing (always-on and real-time) Cloud processing (services) Cloud centric (today)
  • 10. 10 Moving voice UI functionality to the end device An end-to-end solution powered by machine learning Automatic speech recognition (ASR) Text-to-speech (TTS) News SMS Music Maps Wikipedia Weather Stocks Voice activation Service manager Multi-mic echo cancellation, beamforming, and speech denoising Natural language understanding (NLU) On-device processing (always-on and real-time) Cloud processing (services) On-device centric (future)
  • 11. 11 Cloud tasks Complex voice fallback Training and model update Knowledge base Services On-device processing of voice UI Provides unique benefits complementing the cloud Challenge Providing the voice UI functionality within the power/thermal envelope Machine learning models Offline raw data Queries On-device tasks Automatic speech recognition Natural language processing Always-on audio cognition On-device training Benefits Privacy Instant response Always-on Device context
  • 12. 12 Noisy speech spectrogram Clean speech spectrogram “If people were more generous, there would be no need for welfare” 0.5 1 1.5 2 2.5 3 4000 3000 2000 1000 0 Frequency Time 0.5 1 1.5 2 2.5 3 4000 3000 2000 1000 0 Frequency Time DL-based denoising model trained with extensive speech noise databases DL-based denoising Speech denoising • Single or multiple mics • Applicable for ◦ Two-way conversation ◦ Voice/speaker recognition ◦ Keyword spotting • Deep learning (DL) significantly improves the performance over traditional methods • Robust in challenging interference and noise scenarios “If people were more generous, there would be no need for welfare”
  • 13. 13 Qualcomm Voice Activation supports: Qualcomm® Voice Activation (VA) High accuracy, robust to background noise, and supports multiple languages Deep learning is improving performance Among state-of-the-art in terms of performance vs. power consumption -47% -11% 2014 2015 2016 2017 QualcommVApowerconsumption WCD9330 WCD9335 WCD9340 Amazon Alexa Baidu DUEROS Microsoft Cortana Google Assistant Qualcomm Voice Activation, Qualcomm WCD9330, Qualcomm WCD9335, and Qualcomm WCD9340 are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
  • 14. 14 Transcribe the audio to text Deep learning gives state-of-the-art accuracy on a mobile device Personalization—adaptation to individual accent and acoustic environment Automatic speech recognition Natural language understanding (NLU) Acoustic features Acoustic model Reduce input audio to essential information Language model Deep learning converts input into linguistic units Adapted to each user’s accent and environment Uses context and language statistics for best utterance estimation Adapted to each user’s speaking tendencies Allows the same intention to be expressed in multiple ways Adapted to each user’s intent expressions “Turn  on  the  light” On-device automatic speech recognition (ASR) User intention
  • 15. 1515 An end-to-end on-device voice UI example for smart homes 99% on-device intent accuracy is achieved for domain specific command sets when adapted to accent and environmental condition Demo of automatic speech recognition and natural language understanding Large command set Turn on the living room lights Click the kitchen lights off Turn off all lights Switch on the ceiling fan Shut off the sprinklers Start music Pause song Next track Go back one Play previous song Turn speaker off Increase temperature Intent understanding Turn on the kitchen light Click kitchen light on Switch on light in the kitchen Turn the light on in the kitchen NLU: These four phrases map to the same intent
  • 16. 16 A true virtual assistant A “digital me” sitting on the device: context aware and personalized
  • 17. 1717 Calendar Messaging Apps On-device data Contextual intelligence is required for personalization The fusion of many types of sensors and personal information 17 Low power sensing, processing, and connectivity Efficient, heterogeneous architectures Sensor fusion and machine learning Integrated, always-on data capturing Low-energy wireless technologies (e.g. BT-LE, 5G NR IoT) Cloud data Off-device data IoT data Sensor fusion Gyroscope Compass Camera Ambient light Temperature Iris scanEnvironment Pulse HumidityMicrophone Sensor data C-V2X
  • 18. 1818 Creating personalized memories Essential for a true virtual assistant History, number of people, identity After the party, strolling on the beach at sunset in La Jolla talking with my son and laughing Live sentiment analysis Strolling on the beach at sunset in La Jolla talking with my son and laughing Activity analysis Strolling on the beach at sunset in La Jolla talking with my son GPS location La Jolla, California Visual analysis A sunset over the ocean in La Jolla Sound analysis Talking with my son at sunset in La Jolla
  • 19. 19 A true personal assistant is responsive and proactive “Remember the time I was strolling with my son after the party at La Jolla beach?” “Yes I do, here is a picture you took of the sunset. Should I share it with your family group on WeChat?” Responsive Decision-making and conversation based on contextual analysis and prompting (e.g. finding memories) Proactive Decision-making and conversation based on contextual analysis without prompting (e.g. automatically sharing memories) “I noticed that you are tired and stressed, I’m turning on the Rocky III soundtrack and navigating you to the gym for a workout and sauna.” “This music gets my blood going and a workout and sauna will help me relieve stress.”
  • 20. 20 Multi-mic echo cancellation, beamforming, and speech denoising The first step to an on-device virtual assistant Enabling on-device voice UI News SMS Music Maps Wikipedia Weather Stocks Voice activation Service manager Automatic speech recognition (ASR) Natural language understanding (NLU) On-device processing Cloud processing Text-to-speech (TTS)
  • 21. 21 Adding an “AI agent” to create a true virtual assistant The on-device AI agent continuously learns personal knowledge and acts intuitively On-device processing Cloud processing (services) Cloud knowledge graph Automatic speech recognition (ASR) Multi-mic echo cancellation and beamforming Sensors Text-to-speech (TTS) Natural language understanding (NLU) Voice activation AI agent
  • 22. 22 Adding an “AI agent” to create a true virtual assistant Contextualization allows personalization at acoustic, intent, and behavior levels Cloud processing (services) Automatic speech recognition (ASR) Voice activation Multi-mic echo cancellation and beamforming Cloud knowledge graph On-device processing Text-to-speech (TTS) Sensors Natural language understanding (NLU) AI agent Dialog management Speaker identification Acoustic event detection Gender and age detection Voice activity detection Emotion classification Contextual fusion and learning Local knowledge graph
  • 23. 23 Various kitchen noise samples Posterior-gram ML-based acoustic event classification Object rustling Object snapping Cupboard Cutlery Dishes Drawer Glass jingling Object impact Walking Washing dishes Water tap running 2 4 6 8 10 12 50 100 150 200 250 300 350 400 450 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Acoustic event detection • ML techniques are used to ◦ Classify acoustic signals into a set of predefined events ◦ Infer acoustic environment • Low power, always-on
  • 24. 2424 We are advancing AI research to make on-device AI ubiquitous We are creating AI platform innovations that are fundamental to scaling AI across the industry We provide the low-power end-to-end on-device solution for a true personal assistant
  • 25. Follow us on: For more information, visit us at: www.qualcomm.com & www.qualcomm.com/blog Thank you! Nothing in these materials is an offer to sell any of the components or devices referenced herein. ©2018 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm and Snapdragon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.