Want to take advantage of machine learning without building and training your own models? The Google Cloud Vision and Speech APIs expose the machine learning functionality behind Google Photos, Google Images, and the speech recognition in “Ok, Google.” Developers can now build these features into their apps with a simple REST API call.
This deck gives an overview of the Google Cloud Vision API, the Cloud Speech API, and the Cloud Natural Language API.
Google Machine Learning APIs - puppies or muffins?
1. Puppies or muffins?
Easily leverage machine learning in your apps
Sara Robinson
@SRobTweets
Bret McGowen
@bretmcg
2. Who are we?
Developer Advocate, Google Cloud Platform
Sara Robinson / @SRobTweets
● New York, NY
● Swift fan (Taylor and language)
● Harry Potter aficionado
Developer Advocate, Google Cloud Platform
Bret McGowen / @bretmcg
● New York, NY
● U2 fan (band and plane)
● Lord of the Rings aficionado
3. What we’ll cover
01 A (very) brief overview of machine learning
02 Machine learning at Google
03 Vision API
04 Speech API
05 Natural Language API
28.
...
"itemListElement": [
  {
    "@type": "EntitySearchResult",
    "result": {
      "@id": "kg:/m/0c7ln",
      "name": "Navy Pier",
      "@type": [
        "Thing", "Place", "LandmarksOrHistoricalBuildings",
        "TouristAttraction"
      ],
      ...
      "detailedDescription": {
        "articleBody": "Navy Pier is a 3,300-foot-long pier on the Chicago shoreline of Lake Michigan. It is located in the Streeterville neighborhood of the Near North Side community area.",
        "url": "http://en.wikipedia.org/wiki/Navy_Pier"
        ...
Knowledge Graph sidebar
GET https://kgsearch.googleapis.com/v1/entities:search?ids=%2Fm%2F0b__kbm&key={API_KEY}
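The GET request above can be rebuilt in a few lines of Python. This is a minimal sketch using only the standard library: the endpoint, the `ids` and `key` parameters, and the entity ID come from the slide, while the function names `build_kg_search_url` and `fetch_entity` are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as shown on the slide.
KG_SEARCH_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_search_url(entity_id: str, api_key: str) -> str:
    """Return the GET URL that looks up one entity by its machine ID."""
    # urlencode percent-encodes the slashes in the ID ("/m/..." becomes "%2Fm%2F...").
    query = urlencode({"ids": entity_id, "key": api_key})
    return f"{KG_SEARCH_ENDPOINT}?{query}"


def fetch_entity(entity_id: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response (needs a valid key)."""
    with urlopen(build_kg_search_url(entity_id, api_key)) as response:
        return json.load(response)
```

Calling `build_kg_search_url("/m/0b__kbm", "API_KEY")` reproduces the URL on the slide; `fetch_entity` only works with a real API key in place of the placeholder.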
35. 03 The Speech API
Speech-to-text transcription in over 80 languages
36. What can I do with the Speech API?
● Speech-to-text transcription in over 80 languages
● Supports streaming and non-streaming recognition
● Filters inappropriate content
38. Let’s make a recording!
1. Make a recording using SoX, a command-line utility for audio files
2. Base64 encode the recording
3. Build our API request in a JSON file
4. Send the JSON request to the Speech API
Bash script for this: bit.ly/speech-request-script
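Steps 2 to 4 can also be sketched in Python instead of Bash. This is not the script behind the link, just a rough equivalent under stated assumptions: the recording from step 1 already exists as a FLAC file sampled at 16 kHz, the request goes to the v1 `speech:recognize` REST endpoint (the deck does not show which URL it used), and the function names are hypothetical.

```python
import base64
import json
from urllib.request import Request, urlopen

# Assumption: the v1 REST endpoint; the deck does not show the URL it used.
SPEECH_ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_speech_request(audio_bytes: bytes, language_code: str = "en-US") -> dict:
    """Steps 2 and 3: Base64 encode the audio and wrap it in a JSON request body."""
    return {
        "config": {
            "encoding": "FLAC",        # assumption: SoX saved the recording as FLAC
            "sampleRateHertz": 16000,  # assumption: recorded at 16 kHz
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def transcribe(path: str, api_key: str) -> dict:
    """Step 4: POST the JSON request to the Speech API and return the parsed reply."""
    with open(path, "rb") as audio_file:
        body = build_speech_request(audio_file.read())
    request = Request(
        f"{SPEECH_ENDPOINT}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)
```

With a real key, `transcribe("audio.flac", api_key)` would return the API's JSON response; the file name here is only a placeholder.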
39. 04 Cloud Natural Language API
Perform sentiment analysis and entity recognition on text
40. What can I do with the Natural Language API?
Three methods:
1. Analyze entities - The Cubs are an MLB team from Chicago
2. Analyze sentiment - I love Chicago
3. Analyze syntax - Michelle Obama is married to Barack Obama
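The three methods above share one request shape, so a single helper can cover them. The sketch below assumes the v1 REST endpoints (`documents:analyzeEntities`, `documents:analyzeSentiment`, `documents:analyzeSyntax`) and API-key authentication; the helper names are invented for this example and are not part of any Google client library.

```python
import json
from urllib.request import Request, urlopen

# Assumption: the v1 REST endpoint; the deck does not show a URL.
LANGUAGE_ENDPOINT = "https://language.googleapis.com/v1/documents"
METHODS = ("analyzeEntities", "analyzeSentiment", "analyzeSyntax")


def build_language_request(method: str, text: str) -> tuple:
    """Return the (url, body) pair for one of the three analysis methods."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return f"{LANGUAGE_ENDPOINT}:{method}", body


def analyze(method: str, text: str, api_key: str) -> dict:
    """POST the request and return the parsed JSON response (needs a valid key)."""
    url, body = build_language_request(method, text)
    request = Request(
        f"{url}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)
```

For example, `analyze("analyzeSentiment", "I love Chicago", api_key)` would send the second sample sentence from the slide for sentiment analysis.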