Introduction to Google Cloud platform technologies
This is a presentation given by Google Developer Advocate Chris Schalk at Spring One 2GX on Oct 21st, 2010. It introduces Google Storage for Developers, Prediction API, and BigQuery.


Transcript

  • 1. Chicago, October 19 - 22, 2010 How to analyze your data and take advantage of machine learning in your application Chris Schalk - Google
  • 2. Google’s new Cloud Technologies Topics covered •  Google Storage for Developers •  Prediction API (machine learning) •  BigQuery
  • 3. Google Storage for Developers Store your data in Google's cloud
  • 4. What Is Google Storage? •  Store your data in Google's cloud o  any format, any amount, any time •  You control access to your data o  private, shared, or public •  Access via Google APIs or 3rd party tools/libraries
  • 5. Sample Use Cases •  Static content hosting, e.g. static HTML, images, music, video •  Backup and recovery, e.g. personal data, business records •  Sharing, e.g. share data with your customers •  Data storage for applications, e.g. used as storage backend for Android, App Engine, cloud-based apps •  Storage for computation, e.g. BigQuery, Prediction API
  • 6. Google Storage Benefits •  High Performance and Scalability: backed by Google infrastructure •  Strong Security and Privacy: control access to your data •  Easy to Use: get started fast with Google & 3rd party tools
  • 7. Google Storage Technical Details •  RESTful API  o  Verbs: GET, PUT, POST, HEAD, DELETE  o  Resources: identified by URI o  Compatible with S3  •  Buckets  o  Flat containers  •  Objects  o  Any type o  Size: 100 GB / object •  Access Control for Google Accounts  o  For individuals and groups •  Two Ways to Authenticate Requests  o  Sign request using access keys  o  Web browser login
  • 8. Performance and Scalability •  Objects of any type and 100 GB / Object •  Unlimited numbers of objects, 1000s of buckets •  All data replicated to multiple US data centers •  Leveraging Google's worldwide network for data delivery •  Only you can use bucket names with your domain names •  Read-your-writes data consistency •  Range Get
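The "Range Get" item above refers to fetching only part of an object. As a minimal sketch, assuming the standard HTTP `Range: bytes=start-end` request header, the helper below splits an object of known size into header values that a client could send one request at a time. The name `byte_ranges` is hypothetical and not part of any Google tool.

```python
def byte_ranges(total_size, chunk_size):
    """Yield HTTP Range header values that cover total_size bytes in order.

    Hypothetical helper for illustration; byte positions in a Range
    header are zero-based and the end position is inclusive.
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1  # inclusive end
        yield f"bytes={start}-{end}"
        start = end + 1


if __name__ == "__main__":
    # A 10-byte object fetched in 4-byte pieces.
    for value in byte_ranges(10, 4):
        print("Range:", value)
```

Each value would go into the `Range` header of a separate GET request for the same object, which is one way to resume or parallelize a large download.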
  • 9. Security and Privacy Features •  Key-based authentication •  Authenticated downloads from a web browser •  Sharing with individuals •  Group sharing via Google Groups •  Access control for buckets and objects •  Set Read/Write/List permissions
  • 10. Demo
  • 11. Demo •  Tools: o  GS Manager o  GSUtil •  Upload / Download
  • 12. Google Storage usage within Google •  Haiti Relief Imagery •  USPTO data •  Partner Reporting •  Google BigQuery •  Google Prediction API
  • 13. Some Early Google Storage Adopters
  • 14. Google Storage - Pricing •  Storage: $0.17/GB/month •  Network: upload $0.10/GB; download $0.15/GB (Americas / EMEA), $0.30/GB (APAC) •  Requests: PUT, POST, LIST $0.01 per 1,000 requests; GET, HEAD $0.01 per 10,000 requests
  • 15. Google Storage - Availability •  Limited preview in US currently o  100GB free storage and network from Google per account o  Sign up for waitlist at http://code.google.com/apis/storage/ •  Note: Non US preview available on case-by-case basis
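To make the price list above concrete, here is a small worked calculation. It is a sketch that uses only the 2010 preview prices quoted on the slide; the function name, argument names, and region keys are hypothetical, and the free preview allowance is deliberately ignored.

```python
from decimal import Decimal

# Prices in USD as quoted on the slide (2010 preview pricing).
STORAGE_PER_GB_MONTH = Decimal("0.17")
UPLOAD_PER_GB = Decimal("0.10")
DOWNLOAD_PER_GB = {
    "americas_emea": Decimal("0.15"),
    "apac": Decimal("0.30"),
}
WRITE_PER_1000 = Decimal("0.01")   # PUT, POST, LIST requests
READ_PER_10000 = Decimal("0.01")   # GET, HEAD requests


def monthly_cost(stored_gb, upload_gb, download_gb, region,
                 write_requests, read_requests):
    """Return the estimated monthly bill in USD as a Decimal."""
    def d(x):
        return Decimal(str(x))  # avoid binary float rounding

    return (
        d(stored_gb) * STORAGE_PER_GB_MONTH
        + d(upload_gb) * UPLOAD_PER_GB
        + d(download_gb) * DOWNLOAD_PER_GB[region]
        + d(write_requests) / 1000 * WRITE_PER_1000
        + d(read_requests) / 10000 * READ_PER_10000
    )


if __name__ == "__main__":
    # 100 GB stored, 20 GB uploaded, 50 GB downloaded in the Americas,
    # 10,000 write requests and 1,000,000 read requests.
    print(monthly_cost(100, 20, 50, "americas_emea", 10_000, 1_000_000))
```

For that usage the terms are 17.00 (storage), 2.00 (upload), 7.50 (download), 0.10 (writes), and 1.00 (reads), so the total is $27.60 for the month.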
  • 16. Google Storage Summary •  Store any kind of data using Google's cloud infrastructure •  Easy to Use APIs •  Many available tools and libraries o  gsutil, GS Manager o  3rd party:  Boto, CloudBerry, CyberDuck, JetS3t, and more
  • 17. Google Prediction API Google's prediction engine in the cloud
  • 18. Introducing the Google Prediction API •  Google's sophisticated machine learning technology •  Available as an on-demand RESTful HTTP web service
  • 19. How does it work? "english" The quick brown fox jumped over the lazy dog. "english" To err is human, but to really foul things up you need a computer. "spanish" No hay mal que por bien no venga. "spanish" La tercera es la vencida. ? To be or not to be, that is the question. ? La fe mueve montañas. The Prediction API finds relevant features in the sample data during training. The Prediction API later searches for those features during prediction.
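The train-then-predict idea on this slide can be illustrated with a toy word-overlap classifier. This is only a sketch for intuition: it is not the Prediction API's algorithm (the slides do not describe it), and the function names `train` and `predict` are hypothetical. It uses the slide's own four labeled sentences as training data.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def train(examples):
    """Build {label: Counter of words} from (label, text) pairs."""
    model = defaultdict(Counter)
    for label, text in examples:
        model[label].update(tokenize(text))
    return model


def predict(model, text):
    """Return the label whose training words overlap most with text.

    Returns None when the model is empty or no word overlaps at all.
    """
    words = tokenize(text)
    scores = {label: sum(counts[w] for w in words)
              for label, counts in model.items()}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


TRAINING = [
    ("english", "The quick brown fox jumped over the lazy dog."),
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
    ("spanish", "La tercera es la vencida."),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(predict(model, "To be or not to be, that is the question."))
    print(predict(model, "La fe mueve montañas."))
```

Here the "features" are simply word counts per label. With so little training data the result depends on a handful of shared words such as "to", "the", and "la", which is why a real service needs far more examples.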
  • 20. A virtually endless number of applications... •  Customer Sentiment •  Transaction Risk •  Species Identification •  Message Routing •  Legal Docket Classification •  Suspicious Activity •  Work Roster Assignment •  Recommend Products •  Political Bias •  Uplift Marketing •  Email Filtering •  Diagnostics •  Inappropriate Content •  Career Counselling •  Churn Prediction ... and many more ...
  • 21. A Prediction API Example Automatically categorize and respond to emails by language •  Customer: ACME Corp, a multinational organization •  Goal: Respond to customer emails in their language •  Data: Many emails, tagged with their languages •  Outcome: Predict language and respond accordingly
  • 22. Using the Prediction API A simple three step process... 1. Upload: Upload your training data to Google Storage 2. Train: Build a model from your data 3. Predict: Make new predictions
  • 23. Step 1: Upload Upload your training data to Google Storage •  Training data: outputs and input features •  Data format: comma separated value format (CSV) "english","To err is human, but to really ..." "spanish","No hay mal que por bien no venga." ... Upload to Google Storage gsutil cp ${data} gs://yourbucket/${data}
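The training file described above is plain CSV with the output label first and the input text second. A minimal sketch of producing that layout with Python's standard `csv` module follows; the helper name `to_training_csv` is hypothetical, and quoting every field is an assumption based on the slide's sample rows.

```python
import csv
import io


def to_training_csv(examples):
    """Serialize (label, text) pairs as CSV with every field quoted."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for label, text in examples:
        writer.writerow([label, text])  # label first, then the input text
    return buf.getvalue()


if __name__ == "__main__":
    rows = [
        ("english", "To err is human, but to really foul things up you need a computer."),
        ("spanish", "No hay mal que por bien no venga."),
    ]
    print(to_training_csv(rows), end="")
```

Letting the `csv` module do the quoting means commas and double quotes inside the text are escaped correctly. The resulting file is what the `gsutil cp` command on the slide would then copy into a bucket.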
  • 24. Step 2: Train Create a new model by training on data To train a model: POST prediction/v1.1/training?data=mybucket%2Fmydata Training runs asynchronously. To see if it has finished: GET prediction/v1.1/training/mybucket%2Fmydata {"data":{ "data":"mybucket/mydata", "modelinfo":"estimated accuracy: 0.xx"}}
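Note that the bucket and object name travel as a single URL component, so the slash between them is percent-encoded as `%2F`. A short sketch of building the two paths shown on the slide, using `urllib.parse.quote` from the standard library; the helper names are hypothetical.

```python
from urllib.parse import quote


def encode_data_id(bucket, obj):
    """Percent-encode "bucket/object", including the slash."""
    return quote(f"{bucket}/{obj}", safe="")


def start_training_path(bucket, obj, version="v1.1"):
    """Path for the POST that starts training, as shown on the slide."""
    return f"prediction/{version}/training?data={encode_data_id(bucket, obj)}"


def training_status_path(bucket, obj, version="v1.1"):
    """Path for the GET that checks whether training has finished."""
    return f"prediction/{version}/training/{encode_data_id(bucket, obj)}"


if __name__ == "__main__":
    print(start_training_path("mybucket", "mydata"))
    print(training_status_path("mybucket", "mydata"))
```

Passing `safe=""` matters here: by default `quote` leaves `/` unencoded, which would split the identifier into two path segments.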
  • 25. Step 3: Predict Apply the trained model to make predictions on new data POST prediction/v1.1/query/mybucket%2Fmydata/predict { "data":{ "input": { "text" : [ "J'aime X! C'est le meilleur" ]}}}
  • 26. Step 3: Predict Apply the trained model to make predictions on new data POST prediction/v1.1/query/mybucket%2Fmydata/predict { "data":{ "input": { "text" : [ "J'aime X! C'est le meilleur" ]}}} Response: { "data" : { "kind" : "prediction#output", "outputLabel":"French", "outputMulti" :[ {"label":"French", "score": x.xx}, {"label":"English", "score": x.xx}, {"label":"Spanish", "score": x.xx}]}}
  • 27. Step 3: Predict Apply the trained model to make predictions on new data. An example using Python:
        import http.client  # this module was named httplib in Python 2
        header = {"Content-Type": "application/json"}
        # ...put new data in JSON format in params variable
        conn = http.client.HTTPConnection("www.googleapis.com")
        conn.request("POST", "/prediction/v1.1/query/mybucket%2Fmydata/predict", params, header)
        print(conn.getresponse().read())  # print the response body
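The request and response bodies above are ordinary JSON, so they can be built and read with the standard `json` module. The sketch below assumes the response shape shown on the slide; the scores in the sample reply are invented placeholders (the slide only shows `x.xx`), and the helper names are hypothetical.

```python
import json


def build_query(text):
    """Build the JSON request body for a single text input."""
    return json.dumps({"data": {"input": {"text": [text]}}})


def read_prediction(response_text):
    """Return (winning label, [(label, score), ...] sorted by score)."""
    data = json.loads(response_text)["data"]
    ranked = sorted(data["outputMulti"],
                    key=lambda entry: entry["score"], reverse=True)
    return data["outputLabel"], [(e["label"], e["score"]) for e in ranked]


# Sample reply in the slide's format; the scores are made up.
SAMPLE_REPLY = json.dumps({
    "data": {
        "kind": "prediction#output",
        "outputLabel": "French",
        "outputMulti": [
            {"label": "English", "score": 0.06},
            {"label": "French", "score": 0.90},
            {"label": "Spanish", "score": 0.04},
        ],
    }
})

if __name__ == "__main__":
    print(build_query("J'aime X! C'est le meilleur"))
    label, ranking = read_prediction(SAMPLE_REPLY)
    print(label, ranking)
```

Sorting `outputMulti` by score is a convenience for callers that want the runner-up labels too; `outputLabel` alone is enough when only the top category matters.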
  • 28. Prediction API Capabilities Data •  Input Features: numeric or unstructured text •  Output: up to hundreds of discrete categories Training •  Many machine learning techniques •  Automatically selected •  Performed asynchronously Access from many platforms: •  Web app from Google App Engine •  Apps Script (e.g. from Google Spreadsheet) •  Desktop app
  • 29. Prediction API v1.1 - new features •  Updated Syntax •  Multi-category prediction o  Tag entry with multiple labels •  Continuous Output o  Finer grained prediction rankings based on multiple labels •  Mixed Inputs o  Both numeric and text inputs are now supported Can combine continuous output with mixed inputs
  • 30. Demo Using the Prediction API to Predict a Cuisine
  • 31. Google BigQuery Interactive analysis of large datasets in Google's cloud
  • 32. Introducing Google BigQuery •  Google's large data ad hoc analysis technology o  Analyze massive amounts of data in seconds •  Simple SQL-like query language •  Flexible access o  REST APIs, JSON-RPC, Google Apps Script
  • 33. Why BigQuery? Working with large data is a challenge
  • 34. Many Use Cases ... Spam Trends Detection Web Dashboards Network Optimization Interactive Tools
  • 35. Key Capabilities of BigQuery •  Scalable: Billions of rows •  Fast: Response in seconds •  Simple: Queries in SQL •  Web Service o  REST o  JSON-RPC o  Google Apps Script
  • 36. Using BigQuery Another simple three step process... 1. Upload: Upload your raw data to Google Storage 2. Import: Import raw data into BigQuery table 3. Query: Perform SQL queries on table
  • 37. Writing Queries Compact subset of SQL o  SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT ...; Common functions o  Math, String, Time, ... Statistical approximations o  TOP o  COUNT DISTINCT
  • 38. BigQuery via REST GET /bigquery/v1/tables/{table name} GET /bigquery/v1/query?q={query} Sample JSON Reply: { "results": { "fields": { [ {"id":"COUNT(*)","type":"uint64"}, ... ] }, "rows": [ {"f":[{"v":"2949"}, ...]}, {"f":[{"v":"5387"}, ...]}, ... ] } } Also supports JSON-RPC
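As a rough sketch of working with the REST calls above, the code below builds the query path and turns a reply into a list of row dictionaries. It makes two assumptions that the slide does not confirm: that the query text is percent-encoded in the `q` parameter, and that `fields` is a plain JSON list of `{"id", "type"}` objects. The helper names and the sample reply are hypothetical.

```python
import json
from urllib.parse import quote


def query_path(sql):
    """Build the GET path for a query, percent-encoding the SQL text."""
    return "/bigquery/v1/query?q=" + quote(sql, safe="")


def rows_as_dicts(reply_text):
    """Pair each row's cell values with the field ids from the reply."""
    results = json.loads(reply_text)["results"]
    names = [field["id"] for field in results["fields"]]
    return [
        dict(zip(names, (cell["v"] for cell in row["f"])))
        for row in results["rows"]
    ]


# Hypothetical reply modeled on the slide's sample, with one column.
SAMPLE_REPLY = json.dumps({
    "results": {
        "fields": [{"id": "COUNT(*)", "type": "uint64"}],
        "rows": [{"f": [{"v": "2949"}]}, {"f": [{"v": "5387"}]}],
    }
})

if __name__ == "__main__":
    print(query_path("SELECT 1"))
    print(rows_as_dicts(SAMPLE_REPLY))
```

Values arrive as strings even for numeric types such as `uint64`, so a caller would convert them using the `type` reported for each field.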
  • 39. Security and Privacy Standard Google Authentication •  Client Login •  OAuth •  AuthSub HTTPS support •  protects your credentials •  protects your data Relies on Google Storage to manage access
  • 40. Large Data Analysis Example Wikimedia Revision History data from: http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z
  • 41. Using BigQuery Shell Python DB API 2.0 + B. Clapper's sqlcmd http://www.clapper.org/software/python/sqlcmd/
  • 42. BigQuery from a Spreadsheet
  • 43. BigQuery from a Spreadsheet
  • 44. Recap •  Google Storage o  High speed data storage on Google Cloud •  Prediction API o  Google's machine learning technology able to predict outcomes based on sample data •  BigQuery o  Interactive analysis of very large data sets o  Simple SQL query language access
  • 45. Further info available at: •  Google Storage for Developers o  http://code.google.com/apis/storage •  Prediction API o  http://code.google.com/apis/predict •  BigQuery o  http://code.google.com/apis/bigquery
  • 46. SpringOne 2GX 2010. All rights reserved. Do not distribute without permission. Demo
  • 47. Q&A
  • 48. Thank You! Chris Schalk Google Developer Advocate http://twitter.com/cschalk
