Move as I speak
Understanding the development of a game that is controlled through voice commands
Agenda
• Voice recognition in Android
• Implementing voice recognition in Android
    • Using the Android speech input API
    • Using IME
• Integrating voice recognition in the game
• Controlling the game through voice commands
• Limitations
Voice Recognition in Android

• Available in Android since Android 1.5.

• Until Android 4.0.x (ICS), voice recognition can be used only when connected to the internet.

• The voice input is sent to the cloud, which returns an array of results.

• Offline voice recognition is available from Android 4.1 (Jelly Bean).

• English (US) is the default language, and many other languages are available for download.
Google’s Speech Recognizer

[Diagram: Google speech server]
• Japanese: acoustic model, dictionary, search language model, dictation language model
• US English: acoustic model, dictionary, search language model, dictation language model
• … (further languages)
Implementing Voice Recognition in Android
Android Speech Input API
• Android’s open platform makes it simple to access Google’s speech recognizer programmatically from your application (or any other recognizer that registers for RecognizerIntent).

• The API makes it simple to:
    • Prompt the user to start speaking
    • Stream the audio to Google’s servers
    • Retrieve the recognition hypotheses
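Example Code
// Needs: android.content.Intent, android.speech.RecognizerIntent,
//        android.view.View, java.util.ArrayList

// Called when someone clicks a button in the app
public void onClick(View v) {
    // Create a recognition request
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    // Set the language model
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    // Set the text shown in the prompt
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Speech recognition demo");
    // Send the request to display the prompt, record audio, and return a result
    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);
}

// Called when speech recognition is finished
@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    if (requestCode == VOICE_RECOGNITION_REQUEST_CODE && resultCode == RESULT_OK) {
        // Get the n-best list
        ArrayList<String> matches = data.getStringArrayListExtra(
                RecognizerIntent.EXTRA_RESULTS);
        // Do something with the best result. The list is generally ordered
        // by recognizer confidence, so the first entry is the top hypothesis.
        if (matches != null && !matches.isEmpty()) {
            doSomething(matches.get(0));
        }
    }
}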
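
The n-best list returned in EXTRA_RESULTS often contains near misses next to the intended word. As a minimal sketch in plain Java (no Android classes), the helper below walks the list in order and returns the first hypothesis that is a command the game knows. The class name NBestPicker and the command set "left"/"right" are assumptions made for illustration; they are not part of the Android API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Android API.
final class NBestPicker {
    // Commands the game understands (an assumption for this sketch).
    private static final Set<String> KNOWN_COMMANDS =
            new HashSet<>(Arrays.asList("left", "right"));

    private NBestPicker() {}

    /**
     * Returns the first hypothesis in the n-best list that is a known
     * command, normalized to lower case, or null if none matches.
     */
    static String pickCommand(List<String> matches) {
        if (matches == null) {
            return null;
        }
        for (String hypothesis : matches) {
            if (hypothesis == null) {
                continue;
            }
            String normalized = hypothesis.trim().toLowerCase(Locale.ROOT);
            if (KNOWN_COMMANDS.contains(normalized)) {
                return normalized;
            }
        }
        return null;
    }
}
```

Scanning the whole list, rather than taking only the first entry, lets the game still react when the top hypothesis is a misrecognition and the intended command appears further down.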
Parameters
• Language (EXTRA_LANGUAGE), given as an IETF language tag, e.g.
    • ja-JP (Japanese)
    • en-US (US English)
  If not set, the phone’s default language is used.

• Language model hints (EXTRA_LANGUAGE_MODEL)
    • Search (LANGUAGE_MODEL_WEB_SEARCH) - Good for short queries, business names, cities: the types of things people search for on Google.
    • Free Form (LANGUAGE_MODEL_FREE_FORM) - For dictation: sending e-mail, SMS, etc.
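
The Android documentation describes EXTRA_LANGUAGE as an IETF (BCP 47) language tag such as "en-US". If an application stores language names in the underscore style, a small conversion step may help. The following is a sketch in plain Java using java.util.Locale; the class and method names (LanguageTags, toLanguageTag) are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Android API.
final class LanguageTags {
    private LanguageTags() {}

    /**
     * Converts an underscore- or hyphen-separated name such as "en_us"
     * into a BCP 47 language tag such as "en-US".
     */
    static String toLanguageTag(String name) {
        String[] parts = name.trim().split("[_-]");
        // Normalize case explicitly: language lower case, region upper case.
        String language = parts[0].toLowerCase(Locale.ROOT);
        String region = parts.length > 1 ? parts[1].toUpperCase(Locale.ROOT) : "";
        return new Locale.Builder()
                .setLanguage(language)
                .setRegion(region)
                .build()
                .toLanguageTag();
    }
}
```

The resulting string could then be passed as the EXTRA_LANGUAGE value; Locale.Builder throws an exception for malformed input, which surfaces typos early.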
Using IME
@Override
public void onCreate() {
    super.onCreate();
    // Create the voice recognition trigger.
    // The trigger has to be unregistered when the IME is destroyed.
    mVoiceRecognitionTrigger = new VoiceRecognitionTrigger(this);
    // Register the listener.
    mVoiceRecognitionTrigger.register(new VoiceRecognitionTrigger.Listener() {
        @Override
        public void onVoiceImeEnabledStatusChange() {
            // The callback is done on the main thread.
            updateVoiceImeStatus();
        }
    });
}
Using IME (continued)
// Use this method to start voice recognition.
mVoiceRecognitionTrigger.startVoiceRecognition("en_us");

@Override
public void onDestroy() {
    if (mVoiceRecognitionTrigger != null) {
        // To avoid a service leak, the trigger has to be unregistered
        // when the IME is destroyed.
        mVoiceRecognitionTrigger.unregister(this);
    }
    super.onDestroy();
}
Integrating Voice Recognition in the Game
• The intent-based Google voice API is not practical for a game, because an intent has to be fired for every voice command to fetch its results.

• The voice IME has to be used instead; it provides continuous feedback of the voice commands to the application.
Controlling the Game through Voice Commands

// Called when speech recognition is finished
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    // Get the n-best list
    ArrayList<String> matches = data.getStringArrayListExtra(
            RecognizerIntent.EXTRA_RESULTS);
    // Store the best result (the first entry) in a variable
    // that is used as a command in the game
    if (matches != null && !matches.isEmpty()) {
        voiceRecognizer.command = matches.get(0);
    }
}
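
The recognition callback writes the command while the game loop reads it, possibly from a different thread. One way to hand the value over safely is a one-slot mailbox. The sketch below is plain Java built on AtomicReference; the class name CommandMailbox and its methods are hypothetical and only illustrate the idea of storing the latest command and consuming it once.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical one-slot mailbox between the recognition callback and the game loop.
final class CommandMailbox {
    private final AtomicReference<String> latest = new AtomicReference<>();

    /** Called from the recognition callback whenever a command is recognized. */
    void post(String command) {
        latest.set(command);
    }

    /**
     * Called from the game loop. Returns the pending command and clears the
     * slot, so each spoken command is acted on only once; returns null when
     * nothing new has arrived.
     */
    String poll() {
        return latest.getAndSet(null);
    }
}
```

Clearing the slot on read is a design choice: without it, a single "left" would keep moving the character on every rendered frame until the next command arrives.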
Controlling the Game through Voice Commands (continued)
// Method used to render. It is called continuously in the game.
public void animRender(Canvas c) {
    String voiceCommand = voiceRecognizer.command;

    // Constant-first comparison avoids a NullPointerException
    // before the first command has been recognized
    if ("left".equalsIgnoreCase(voiceCommand)) {
        // do some action
    } else if ("right".equalsIgnoreCase(voiceCommand)) {
        // do some action
    }
}
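
The "do some action" placeholders could, for example, move a character horizontally. The following plain-Java sketch shows one possible shape for that logic, kept free of Android classes so it can be tested on its own. The Player class, its fields, and the step size are assumptions for illustration only.

```java
import java.util.Locale;

// Hypothetical game object that reacts to voice commands.
final class Player {
    private int x;          // current horizontal position
    private final int step; // distance moved per command

    Player(int startX, int step) {
        this.x = startX;
        this.step = step;
    }

    int getX() {
        return x;
    }

    /**
     * Applies one voice command to the player.
     * Returns true if the command was recognized, false otherwise.
     */
    boolean apply(String command) {
        if (command == null) {
            return false;
        }
        switch (command.trim().toLowerCase(Locale.ROOT)) {
            case "left":
                x -= step;
                return true;
            case "right":
                x += step;
                return true;
            default:
                return false; // unknown words are ignored
        }
    }
}
```

A render method such as animRender could call apply with the current command and then draw the player at getX().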
Limitations

• Like other assistive technologies, voice recognition systems have their own limitations. The most significant is that they can be inaccurate.

• You always have to speak clearly so that your voice is easy to understand.

• Lastly, not all persons with motor disabilities can use voice recognition systems.

Droidcon ppt

Editor's Notes

  1. Explain the 4 models – their definitions.