GLOBAL AZURE
BOOTCAMP 2019
Turin, 27 April 2019
#GLOBALAZURE2019
SPEECH PROCESSING
WITH
AZURE COGNITIVE SERVICES
CLEMENTE GIORIO
R&D SENIOR SOFTWARE ENGINEER @DELTATRE
@TINUX80
"Machine learning gives computers the ability to learn without being explicitly programmed."
Arthur Samuel, 1959
AI DEVELOPER
DATA SCIENTIST
Train/Create New Models
Leverage AI APIs
AI APIs
AI
AI Apps
Data + AI: doing AI where the data is
TRAIN MODEL
New Model
SCORE MODEL
@sarah_edo
Microsoft Cognitive Services
Vision: from faces to feelings, allow your apps to understand images and video
Speech: hear and speak to your users by filtering noise, identifying speakers, and understanding intent
Language: process text and learn how to recognize what users want
Knowledge: tap into rich knowledge amassed from the web, academia, or your own data
Search: access billions of web pages, images, videos, and news with the power of Bing APIs
COGNITIVE SERVICES
WHY COGNITIVE SERVICES?
Accessing the APIs
GET https://westus.api.cognitive.microsoft.com/spid/v1.0/operations/{operationId} HTTP/1.1
OCP-Apim-Subscription-Key: <API KEY>
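A minimal Python sketch of the same request, assuming only the standard library. The `build_operation_request` helper and the sample operation ID are hypothetical names introduced for illustration. The request is built but not sent, so no real key is needed to try it.

```python
import urllib.request

# Base URL taken from the slide; the region (westus) depends on your subscription.
BASE_URL = "https://westus.api.cognitive.microsoft.com/spid/v1.0/operations"


def build_operation_request(operation_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for an operation's status."""
    request = urllib.request.Request(f"{BASE_URL}/{operation_id}", method="GET")
    # The subscription key is passed in a request header.
    request.add_header("Ocp-Apim-Subscription-Key", api_key)
    return request


# Hypothetical values for illustration only.
request = build_operation_request("my-operation-id", "<API KEY>")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it. Note that `urllib` normalizes the capitalization of header names it stores, which is harmless because HTTP header names are case-insensitive.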
SPEECH @ MICROSOFT
MSFT product portfolio: Cortana, Skype, Xbox
Technology: Speech-to-Text, Text-to-Speech, Speaker Identification
Delivered as part of the Azure Cognitive Services library:
Speech API (a.k.a. Bing Speech API)
Custom Speech Service
Speaker Recognition API
SPEECH CLIENT SDK
// Create a microphone client for short-phrase recognition in US English.
this.client = SpeechRecognitionServiceFactory.CreateMicrophoneClient(
    SpeechRecognitionMode.ShortPhrase,
    "en-US",
    "<SubscriptionKey>"); // placeholder: replace with your subscription key

// Event handlers for speech recognition results
this.client.OnMicrophoneStatus += this.OnMicrophoneStatusHandler;
this.client.OnPartialResponseReceived += this.OnPartialResponseReceivedHandler;
this.client.OnResponseReceived += this.OnResponseReceivedHandler;
this.client.OnConversationError += this.OnConversationErrorHandler;

// Called when the final recognition result arrives.
private void OnResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
{
    Console.WriteLine("--- OnResponseReceivedHandler ---");
    // Dumps the event arguments as-is.
    Console.WriteLine(e);
}
CUSTOM SPEECH
Consider the case where your app users may have very specific needs:
Acoustic: noise conditions, accents, age
Language: vocabulary, terminology
Same SDK as Speech
Customized dictation and conversational models hosted in Azure
Support for 3 languages
HOW DO YOU ADAPT A SPEECH MODEL?
Create custom language models
Customize the language model of the speech recognizer by tailoring it to the vocabulary of the application and the speaking style of your users.
Create custom acoustic models
Customize the acoustic model of the speech recognizer to better match the expected environment and user population of your application.
Deploy your custom models
Deploy your models to create a speech recognition endpoint that’s customized to your application.
Custom Speech: overcome speech recognition barriers such as speaking style, vocabulary, and background noise.
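To see why a language model tailored to the application's vocabulary helps, consider a toy sketch. It is not how Custom Speech works internally: it assumes a simple add-one-smoothed unigram model, and the corpus, the candidate transcripts, and the function names are all made up for illustration. Given two candidate transcripts that might sound alike, the model trained on domain text prefers the one built from in-domain words.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count word frequencies in a small domain corpus."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    return counts, sum(counts.values())


def log_prob(sentence, counts, total):
    """Add-one-smoothed unigram log-probability of a sentence."""
    vocab_size = len(counts) + 1  # one extra slot for unseen words
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in sentence.lower().split()
    )


# Made-up sports-commentary corpus standing in for the application's vocabulary.
corpus = [
    "the striker scored a goal",
    "the keeper saved a penalty",
    "a goal from the striker",
]
counts, total = train_unigram(corpus)

# Two made-up candidates of equal length, as a recognizer might propose them.
candidates = ["the stryker scored a gull", "the striker scored a goal"]
best = max(candidates, key=lambda c: log_prob(c, counts, total))
print(best)
```

A production recognizer combines far richer acoustic and language models, but the principle is the same: adapting the language model shifts probability toward the words your users actually say.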
HOW TO USE IT?
// Same client as before, now pointed at the custom model.
this.client = SpeechRecognitionServiceFactory.CreateMicrophoneClient(
    SpeechRecognitionMode.ShortPhrase, "en-US", "<NewSubscriptionKey>",
    "<model_Uri>"); // placeholders: replace with your own key and model URI
this.client.AuthenticationUri =
    "https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken";

// Event handlers for speech recognition results
this.client.OnMicrophoneStatus += this.OnMicrophoneStatusHandler;
this.client.OnPartialResponseReceived += this.OnPartialResponseReceivedHandler;
this.client.OnResponseReceived += this.OnResponseReceivedHandler;
this.client.OnConversationError += this.OnConversationErrorHandler;

Access your endpoint from any device
Send requests to your custom endpoint using the REST API or the Cognitive Services speech client library.
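A hedged Python sketch of the REST route, using only the standard library. It builds, without sending, a token request against the `issuetoken` URL from the slide, and then the `Authorization` header that a follow-up call would carry. Exchanging the subscription key for a short-lived bearer token with a POST to this endpoint is how the Cognitive Services token service is generally documented to work, but the helper names and the sample token below are hypothetical.

```python
import urllib.request

# Token endpoint from the slide; the region (westus) depends on your subscription.
TOKEN_URL = "https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken"


def build_token_request(subscription_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST that exchanges a key for an access token."""
    request = urllib.request.Request(TOKEN_URL, data=b"", method="POST")
    request.add_header("Ocp-Apim-Subscription-Key", subscription_key)
    return request


def bearer_header(token: str) -> dict:
    """Header that a later request to the custom endpoint would carry."""
    return {"Authorization": f"Bearer {token}"}


token_request = build_token_request("<NewSubscriptionKey>")
print(token_request.get_method(), token_request.full_url)
print(bearer_header("sample-token"))
```

The URL of the custom endpoint itself is deliberately left out, since it is specific to each deployment.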
@sarah_edo
RESOURCES
Get your own free key
Try out one of the APIs
There are so many to choose from
Use your new key
Pass your key in via a header
Check us out on the web
microsoft.com/cognitive
Microsoft Cognitive Services
https://www.microsoft.com/cognitive-services
Speech (Bing Speech API)
https://www.microsoft.com/cognitive-services/en-us/speech-api
Custom Speech
https://www.microsoft.com/cognitive-services/en-us/custom-recognition-intelligent-service-cris
Speaker Recognition
https://www.microsoft.com/cognitive-services/en-us/speaker-recognition-api
organized by
GLOBAL TECHNICAL SPONSOR
