PAGE1
© 2015 Apio Systems, Inc. Confidential 1
Jared Sheehan @ Driversiti
Speech Recognition as a User Interface
PAGE2
Who am I
Glass Explorer, speech recognition enthusiast and big Android nerd
Android Lead @Driversiti - driving safety for the mobile generation
Speech Recognition application for the Amazon Fire Phone
Suite of applications - AIM Android, Engadget Android, Distro Android, TechCrunch
Android, AOL HD, AIM BlackBerry
Meetup evangelist – “DC Android Meetup Group” – Join today!
PAGE3
Overview
What is voice/speech recognition?
What awesome stuff can you do with it?
How it works…
Demo!
Question and Answer
PAGE4
Hello Computer…
PAGE5
Definition
PAGE6
What can you do with SR?
Technology that allows spoken input into software systems.
You speak to your computer, tablet, phone or device and it uses what you said as input to
trigger some sort of action.
It can replace other input methods such as clicking, swiping, typing or selecting.
It is a means to make devices and software more user-friendly and to increase productivity.
It is used extensively as a form of accessibility assistance.
PAGE7
ASR - Dictation
Automatic speech recognition (ASR), also called dictation
Translates speech input into words, sentences and punctuation.
Audio is captured through a microphone and streamed to a recognizer (on the device or on a server)
The result is usually returned as a string with a confidence level
Very easy integration with Android – 2 ways to do it.
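To make the "string with a confidence level" point concrete, here is a minimal sketch that picks the most confident hypothesis from a recognizer's n-best list. On Android, `RecognizerIntent.EXTRA_RESULTS` and `RecognizerIntent.EXTRA_CONFIDENCE_SCORES` carry this kind of data; the class name `BestHypothesis`, the parallel-array shape and the `minConfidence` threshold are assumptions made for this example, not part of any SDK.

```java
/** Illustrative helper (hypothetical, not an Android API): choose the top ASR hypothesis. */
final class BestHypothesis {

    /**
     * Returns the hypothesis with the highest confidence, or null when the
     * inputs are empty, mismatched, or the best score is below minConfidence.
     */
    static String pick(String[] hypotheses, float[] confidences, float minConfidence) {
        if (hypotheses.length == 0 || hypotheses.length != confidences.length) {
            return null;
        }
        int best = 0;
        for (int i = 1; i < confidences.length; i++) {
            if (confidences[i] > confidences[best]) {
                best = i;
            }
        }
        return confidences[best] >= minConfidence ? hypotheses[best] : null;
    }

    public static void main(String[] args) {
        // Made-up hypotheses and scores, for illustration only.
        String[] hypotheses = {"recognize speech", "wreck a nice beach"};
        float[] confidences = {0.92f, 0.41f};
        System.out.println(pick(hypotheses, confidences, 0.5f));
    }
}
```

A threshold like this is one simple way to decide when to ask the user to repeat instead of acting on a shaky result.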
PAGE8
How does it work?
A user speaks into a recording device of some sort
Speech recognition begins with the digital sampling of speech and then acoustic signal
processing of the audio.
Several techniques, including DTW (dynamic time warping), HMMs (hidden Markov models)
and NNs (neural networks), can achieve the desired results
Most systems use language-specific knowledge to tune the models.
Next is the actual recognition of phonemes, groups of phonemes and words
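Of the techniques named above, dynamic time warping is the easiest to show in a few lines. The sketch below aligns two one-dimensional feature sequences and returns the alignment cost; real recognizers work on multi-dimensional acoustic features, so the scalar samples, the absolute-difference cost and the class name `DtwSketch` are simplifying assumptions for illustration only.

```java
/** Minimal dynamic time warping (DTW) sketch over 1-D feature sequences. */
final class DtwSketch {

    /** Returns the DTW alignment cost between a and b, using |x - y| as the local cost. */
    static double distance(double[] a, double[] b) {
        int n = a.length;
        int m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double local = Math.abs(a[i - 1] - b[j - 1]);
                // Extend the cheapest of: diagonal match, stretch a, stretch b.
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = local + best;
            }
        }
        return cost[n][m];
    }

    public static void main(String[] args) {
        double[] template = {1, 2, 3, 2, 1};
        double[] slowUtterance = {1, 1, 2, 2, 3, 3, 2, 2, 1, 1}; // same shape, spoken slower
        double[] otherWord = {3, 1, 3, 1, 3};
        System.out.println(distance(template, slowUtterance));
        System.out.println(distance(template, otherWord));
    }
}
```

The slowed-down copy aligns with the template at zero cost, which is the property that made DTW useful for matching words spoken at different speeds.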
PAGE9
Speech Recognition system architecture
PAGE10
Into the weeds
Speaker dependence
Speaker independence
Continuous Speech
How good is your system? Hint: Word Error Rate
Isolated word
Is that all it does?
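Word error rate, hinted at on this slide, is commonly computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below does that with a word-level edit distance; the class name `WordErrorRate`, the whitespace tokenization and the lowercasing are assumptions for this example, and evaluation toolkits usually normalize text more carefully.

```java
/** Word error rate (WER) sketch based on word-level edit distance. */
final class WordErrorRate {

    /** Returns (substitutions + deletions + insertions) / number of reference words. */
    static double wer(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) {
            d[i][0] = i; // delete every remaining reference word
        }
        for (int j = 0; j <= hyp.length; j++) {
            d[0][j] = j; // insert every remaining hypothesis word
        }
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int substitution = d[i - 1][j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int deletion = d[i - 1][j] + 1;
                int insertion = d[i][j - 1] + 1;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return (double) d[ref.length][hyp.length] / ref.length;
    }

    private static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(java.util.Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // One deletion ("a") and one substitution ("mom" -> "tom") over 5 reference words.
        System.out.println(wer("send a text to mom", "send text to tom"));
    }
}
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.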
PAGE11
Dictation is cool, but not that cool
Next step is understanding what the user wants to do
Then act on it
Generally, the ASR results are passed into an intent recognition system along with additional
contextual information
Contextual information can include where the utterance is coming from (mobile phone,
computer), what app the user is in, location, etc.
That information is used to determine the user’s intent and execute the request.
PAGE12
Intent recognition
Recognizing speech is only part of the process. How does Google Now know that I want to
send an SMS message to a friend? How does Siri know when I want to know how tall
Kobe Bryant is?
ASR is only the first step in true speech as a user interface. To successfully help users
perform useful actions we must understand their intent. How do we do this?
Three systems: ASR, intent recognition (IR) and a dialog engine
The dialog engine takes the output from the IR system and sends responses and
actionable information to the caller.
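The hand-off from ASR to intent recognition to a dialog engine can be sketched with a toy, rule-based example. Production assistants use statistical models, not string prefixes; the intent names, the `activeApp` context value and every rule below are hypothetical and only illustrate how the pieces connect.

```java
/** Toy intent recognizer and dialog engine; all names and rules are hypothetical. */
final class IntentSketch {

    enum Intent { SEND_SMS, ASK_HEIGHT, UNKNOWN }

    /** Intent recognition: maps ASR text plus one piece of context (the active app) to an intent. */
    static Intent recognize(String asrText, String activeApp) {
        String text = asrText.trim().toLowerCase(java.util.Locale.ROOT);
        if (text.startsWith("send a text to ") || text.startsWith("send an sms to ")) {
            return Intent.SEND_SMS;
        }
        // Context disambiguates: "reply ..." only means SMS inside a messaging app.
        if (text.startsWith("reply ") && "messaging".equals(activeApp)) {
            return Intent.SEND_SMS;
        }
        if (text.startsWith("how tall is ")) {
            return Intent.ASK_HEIGHT;
        }
        return Intent.UNKNOWN;
    }

    /** Dialog engine: turns the recognized intent into a response for the caller. */
    static String respond(Intent intent) {
        switch (intent) {
            case SEND_SMS:
                return "What should the message say?";
            case ASK_HEIGHT:
                return "Looking that up...";
            default:
                return "Sorry, I did not understand that.";
        }
    }

    public static void main(String[] args) {
        Intent intent = recognize("How tall is Kobe Bryant", "browser");
        System.out.println(intent + " -> " + respond(intent));
    }
}
```

The same words ("reply ...") map to different intents depending on the active app, which is the role contextual information plays in the pipeline.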
PAGE13
Android Speech APIs
PAGE14
Android Speech APIs
http://developer.android.com/reference/android/speech/package-summary.html
Relatively easy implementation
<uses-permission android:name="android.permission.RECORD_AUDIO" />
Two APIs: one with a UI and one without
InputMethodService implementations (keyboards) use the no-UI version
PAGE15
RecognizerIntent
UI is supplied for you
Fire the intent and get a result
Again very easy to use
PAGE16
SpeechRecognizer
UI is not supplied for you
Results are streamed directly to the EditText
Still “fairly” easy to use
PAGE17
Google Now – On to intent recognition systems…
PAGE18
Google Now – On tap
PAGE19
Apple – Siri
PAGE20
Amazon – Fire Phone, Fire TV and Echo
PAGE21
Microsoft – Cortana
PAGE22
Speech providers – Google, Nuance, IBM Watson
PAGE23
Google Voice Interaction API
PAGE24
Nuance Speech SDK
Dragon Mobile SDK – free up to 20k transactions per month
Upload custom vocabularies
Developer: uploads a new song and music vocabulary
Utterance: “Eminem” gets a higher probability than “M&M”
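One way to picture the effect of a custom vocabulary is as a score boost applied to hypotheses that appear in the uploaded word list. The sketch below is not the Nuance SDK and does not reflect how Nuance implements vocabularies internally; the class `VocabularyBoost`, the additive `boost` value and the example scores are all assumptions chosen to reproduce the "Eminem" versus "M&M" case.

```java
/** Hypothetical rescoring sketch; this is not the Nuance SDK API. */
final class VocabularyBoost {

    /**
     * Adds boost to the score of every hypothesis found in the custom
     * vocabulary (case-insensitive) and returns the best-scoring hypothesis.
     */
    static String rescore(String[] hypotheses, double[] scores, String[] vocabulary, double boost) {
        if (hypotheses.length == 0 || hypotheses.length != scores.length) {
            throw new IllegalArgumentException("need one score per hypothesis");
        }
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < hypotheses.length; i++) {
            double score = scores[i] + (contains(vocabulary, hypotheses[i]) ? boost : 0.0);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return hypotheses[best];
    }

    private static boolean contains(String[] vocabulary, String word) {
        for (String entry : vocabulary) {
            if (entry.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up scores: without a music vocabulary the recognizer prefers "M&M".
        String[] hypotheses = {"M&M", "Eminem"};
        double[] scores = {0.55, 0.45};
        System.out.println(rescore(hypotheses, scores, new String[] {"Eminem"}, 0.2));
    }
}
```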
PAGE25
User Interface examples - Google Glass
PAGE26
User Interface examples - Google Glass continued…
PAGE27
User Interface examples - Google Glass continued…
PAGE28
Enough talk!
PAGE29
Show me code!
PAGE30
jared.sheehan@driversiti.com
http://www.meetup.com/DCAndroid/
Tweet: @jayroo5245
THANK YOU


Editor's Notes

  1. What else is there?