CloudOps evening presentation from Google


Presentation from Patrick Chanezon of Savvis at the CloudOps cloud evening

  • Does CLOUD COMPUTING just mean your servers are SOMEWHERE ELSE? Or is it SOMETHING MORE?
    WHY put your servers in the cloud?
    - Don’t want to MANAGE servers?
    - Or is it the ELASTICITY and SCALABILITY of the cloud?
    - If so, you NEED: DISTRIBUTED cloud computing
      * TODAY we’ll talk about why
  • EXISTING GOOGLE SERVICES made available to your App Engine apps
    SPECIALIZATION: Do ONE THING. Do it WELL.
    - Doing one thing well is EASIER than doing a lot of different things
    - Less complexity: fewer corner cases, fewer bugs
    - Offload App Servers - they SPECIALIZE in serving web requests
  • CloudSherpas (Google Apps management tools) porting to Google Storage
    MediaBeacon publishes US Navy Image Services Media files to media outlets
    SocialWork is "Facebook for the enterprise". Share image including Phone in demo
  • Transcript of "CloudOps evening presentation from Google"

    1. 1. Google Cloud for Data Crunchers. Patrick Chanezon, Developer Advocate, Cloud (@chanezon, chanezon@google.com); Rajdeep Dua, Developer Advocate, Cloud and Android (@rajdeepdua, rajdeep@google.com). Google Developer Day 2010
    2. 2. Agenda• Google App Engine• Google Storage for Developers• Prediction API• BigQuery• Google SQL Service• Google Fusion Tables• Google Refine Google Developer Day 2010
    3. 3. Google App Engine Google Developer Day 2010
    4. 4. What is cloud computing?
    5. 5. Cloud Computing Defined SaaS PaaS IaaS Source: Gartner AADI Summit Dec 2009 Google Developer Day 2010
    6. 6. Cloud Computing Defined SaaS PaaS IaaS Source: Gartner AADI Summit Dec 2009 Google Developer Day 2010
    7. 7. Cloud Computing Defined SaaS PaaS IaaS Source: Gartner AADI Summit Dec 2009 Google Developer Day 2010
    8. 8. Cloud Computing Defined SaaS PaaS IaaS Source: Gartner AADI Summit Dec 2009 Google Developer Day 2010
    9. 9. Google's Cloud Offerings 1. Google Apps 2. Third party Apps: Google Apps Marketplace SaaS 3. ________ Google App Engine PaaS Google Storage IaaS Prediction API BigQuery
    10. 10. Google's Cloud Offerings Your Apps 1. Google Apps 2. Third party Apps: Google Apps Marketplace SaaS 3. ________ Google App Engine PaaS Google Storage IaaS Prediction API BigQuery
    11. 11. Google App Engine - Easy to build - Easy to maintain - Easy to scale 7
    12. 12. Cloud development in a box• SDK & “The Cloud”• Hardware• Networking• Operating system• Application runtime o Java, Python• Static file serving• Services• Fault tolerance• Load balancing 8
    13. 13. App Engine Services Memcache Datastore URL Fetch Mail XMPP Task Queue Images Blobstore User Service 9
    14. 14. Always free to get started ~5M pageviews/month • 6.5 CPU hrs/day • 1 GB storage • 650K URL Fetch calls/day • 2,000 recipients emailed • 1 GB/day bandwidth • 100,000 tasks enqueued • 650K XMPP messages/day 10
    15. 15. Purchase additional resources * * free monthly quota of ~5 million page views still in full effect 11
    16. 16. Google App Engine for Business Same scalable cloud hosting platform. Designed for the enterprise. • Enterprise application management – Centralized domain console • Enterprise reliability and support – 99.9% Service Level Agreement – Premium Developer Support • Hosted SQL – Managed relational SQL database in the cloud • SSL on your domain – Including "naked" domain support • Secure by default – Integrated Single Sign On (SSO) • Pricing that makes sense Google App Engine for Business – Pay only for what you use* Hosted SQL and SSL on your domain available later this year Google Developer Day 2010
    17. 17. App Engine for Data Crunchers• High Performance Image Serving• OpenId/Oauth integration• Increased quotas • > 1k entities per query • 10’’ task queues• Async UrlFetch• Mapper API (Reduce coming soon)• Channel API• Matcher API Google Developer Day 2010
    18. 18. Mapper API
        • First component of App Engine’s MapReduce toolkit
        • Large scale data manipulation
        • Examples include:
            • Report generation
            • Computing statistics and metrics …
        • Python Example: http://blog.notdot.net/2010/05/Exploring-the-new-mapper-API
        • Java Example: http://ikaisays.com/2010/07/09/using-the-java-mapper-framework-for-app-engine/
    19. 19. Channel API
        • Allows for Server Push (Comet) to browser
        • Blog post announcement: http://googleappengine.blogspot.com/2010/05/app-engine-at-google-io-2010.html
        • External coverage: Sneak Peak from an early trusted tester: http://bitshaq.com/2010/09/01/sneak-peak-gae-channel-api/
        • Demo code for Dance Dance Robot available here: http://code.google.com/p/dance-dance-robot/
        • Also see: https://groups.google.com/group/google-appengine-java/browse_thread/thread/6fa09953ffae2cd3/c1db7de5fdb82b65?pli=1#
    20. 20. Matcher API
        • Allows an app to register a set of queries to match against a stream of documents
        • Trusted Testers, Python only
        • Group post announcement: http://groups.google.com/group/google-appengine/msg/40021537e2e58962
        • Docs: http://code.google.com/p/google-app-engine-samples/wiki/AppEngineMatcherService
        • Demo code: http://code.google.com/p/google-app-engine-samples/source/browse/#svn/trunk/matcher-sample
    21. 21. Google Storage for Developers: Store your data in Google's cloud
    22. 22. What Is Google Storage?
        • Store your data in Google's cloud
            o any format, any amount, any time
        • You control access to your data
            o private, shared, or public
        • Access via Google APIs or 3rd party tools/libraries
    23. 23. Google Storage Technical Details
        • RESTful API
            • Verbs: GET, PUT, POST, HEAD, DELETE
            • Resources: identified by URI, like: http://commondatastorage.googleapis.com/bucket/object
        • Compatible with S3
        • Buckets: flat containers (no bucket hierarchy)
    24. 24. Google Storage: Concepts
        • Buckets: basic containers that hold your data; buckets cannot nest
        • Objects: individual pieces of data, made up of object data and object metadata
        • Namespace: single namespace across Google Storage
        • Hierarchy: flat
        • Object Names: any combination of Unicode characters (UTF-8 encoded), less than 1024 bytes in length
        • Bucket Names: more restrictive than object names; unique; conform to DNS naming conventions
    25. 25. Google Storage Use Cases (use case: HTTP verb)
        • Create a bucket; change ACLs of a bucket; upload an object; change ACLs of an object: PUT
        • List contents of a bucket or its ACLs; download an object or its ACLs: GET
        • Delete an object; delete an empty bucket: DELETE
        • Upload an object using an HTML form: POST
        • List the metadata of an object: HEAD
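The verb table above can be sketched as a small lookup that turns a use case into an HTTP verb and a resource URI. This is only an illustration: the host comes from the "Technical Details" slide, while the operation names and the `storage_request` helper are hypothetical and not part of any Google library. No request is sent and no authentication is shown.

```python
from urllib.parse import quote

# Host as shown on the "Technical Details" slide.
HOST = "http://commondatastorage.googleapis.com"

# Use case -> HTTP verb, following the slide. The keys are illustrative names.
VERBS = {
    "create_bucket": "PUT",
    "upload_object": "PUT",
    "list_bucket": "GET",
    "download_object": "GET",
    "delete_object": "DELETE",
    "delete_bucket": "DELETE",
    "object_metadata": "HEAD",
}


def storage_request(operation, bucket, obj=None):
    """Return (verb, url) for a use case; raises KeyError for unknown operations."""
    verb = VERBS[operation]
    url = f"{HOST}/{bucket}"
    if obj is not None:
        # Percent-encode the object name but keep "/" so path-like names stay readable.
        url += "/" + quote(obj)
    return verb, url
```

For example, `storage_request("upload_object", "bucket1", "tables/babynames/2009.csv")` yields a `PUT` against the object's URI.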
    26. 26. Performance and Scalability
        Object types and size
        • Objects of any type and 100GB+ / object
        • Unlimited numbers of objects, 1000s of buckets
        • Range-get support for data retrieval
        Replication
        • All data replicated to multiple US data centers
        • Leveraging Google's worldwide network for data delivery
        Consistency
        • “Read-your-writes” data consistency
    27. 27. Security and Privacy Features
        Authenticated downloads from a web browser
        • Sharing with individuals
        • Group sharing via Google Groups
        • Sharing with Google Apps domains
        Permissions set on Buckets or Objects
        • READ (an object, or list a bucket’s contents)
        • WRITE (applicable to buckets, allows upload/delete/etc)
        • FULL_CONTROL (read/write ACLs on objects or buckets)
    28. 28. Tools: Google Storage Manager, gsutil
    29. 29. Google Storage Benefits
        • High Performance and Scalability: backed by Google infrastructure
        • Strong Security and Privacy: control access to your data
        • Easy to Use: get started fast with Google & 3rd party tools
    30. 30. Some Early Google Storage Adopters Google Developer Day 2010
    31. 31. Google Storage usage within Google Google Google BigQuery Prediction API Haiti Relief Imagery USPTO data Partner Reporting Partner Reporting Google Developer Day 2010
    32. 32. Google Storage - Availability
        Limited preview in US* currently
        • 100GB free storage and network per account
        • Sign up for wait list at http://code.google.com/apis/storage/
        * Non-US preview available on case-by-case basis
    33. 33. Google Prediction API: Google's prediction engine in the cloud
    34. 34. Introducing the Google Prediction API
        • Google's sophisticated machine learning technology
        • Available as an on-demand RESTful HTTP web service
    35. 35. A virtually endless number of applications...
        • Customer Sentiment • Transaction Risk • Species Identification • Message Routing • Diagnostics
        • Churn Prediction • Legal Docket Classification • Suspicious Activity • Work Roster Assignment • Inappropriate Content
        • Recommend Products • Political Bias • Uplift Marketing • Email Filtering • Career Counseling
        ... and many more ...
    36. 36. How does it work?
        1. TRAIN: The Prediction API finds relevant features in the sample data during training.
            "english"  The quick brown fox jumped over the lazy dog.
            "english"  To err is human, but to really foul things up you need a computer.
            "spanish"  No hay mal que por bien no venga.
            "spanish"  La tercera es la vencida.
        2. PREDICT: The Prediction API later searches for those features during prediction.
            ?  To be or not to be, that is the question.
            ?  La fe mueve montañas.
    37. 37. Introducing the Google Prediction API Google Developer Day 2010
    38. 38. A Prediction API Example: Automatically determine application recommendations
        • Goal: Increase relevancy on the Apps Marketplace via recommendations
        • Customers: Businesses of various sizes and industries using Google Apps around the world
        • Data: Sampling of previous installs of applications
        • Outcome: Predict applications which would be appropriate for a new customer visiting the site
    39. 39. Using the Prediction API: A simple three step process...
        1. Upload: Upload your training data to Google Storage
        2. Train: Build a model from your data
        3. Predict: Make new predictions
    40. 40. Step 1: Upload. Upload your training data to Google Storage
        • Training data: outputs and input features
        • Data format: comma-separated values (CSV), result in first column
            "SlideRocket","EDUCATION","us","en","10","5"
            "MailChimp","BUSINESS","us","en","7","0"
            "MailChimp","STANDARD","se","sv","1","0"
            "Smartsheet","BUSINESS","us","en","13","4"
        • Upload to Google Storage:
            gsutil cp installs gs://appdata/
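A minimal sketch of producing a training file in the layout shown on this slide (result in the first column, every field quoted), using only Python's standard `csv` module. The `training_csv` helper is a hypothetical name chosen for illustration; uploading the resulting file would still be done with `gsutil cp` as on the slide.

```python
import csv
import io


def training_csv(rows):
    """Serialize (label, features) pairs as CSV with the label in the first column."""
    buf = io.StringIO()
    # QUOTE_ALL matches the fully quoted rows shown on the slide.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for label, features in rows:
        writer.writerow([label, *features])
    return buf.getvalue()


sample = training_csv([
    ("SlideRocket", ["EDUCATION", "us", "en", "10", "5"]),
    ("MailChimp", ["BUSINESS", "us", "en", "7", "0"]),
])
```

Writing `sample` to a file named `installs` would give the input for the upload command above.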
    41. 41. Step 2: Train. Create a new model by training on data
        To train a model:
            POST prediction/v1.1/training?data=appdata%2Finstalls
        Training runs asynchronously. To see if it has finished:
            GET prediction/v1.1/training/appdata%2Finstalls
            {"data":{ "data":"appdata/installs", "modelinfo":"estimated accuracy: 0.xx"}}
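The `%2F` in the paths above is the URL-encoded `/` between bucket and object name. A small sketch of building both request paths follows; the helper names are hypothetical, the paths are copied from the slide, and nothing is sent over the network.

```python
from urllib.parse import quote


def model_id(bucket, obj):
    """URL-encode "bucket/object" so the slash becomes %2F, as on the slide."""
    return quote(f"{bucket}/{obj}", safe="")


def training_requests(bucket, obj):
    """Return the (verb, path) pairs for starting training and checking its status."""
    mid = model_id(bucket, obj)
    return {
        "start": ("POST", f"prediction/v1.1/training?data={mid}"),
        "status": ("GET", f"prediction/v1.1/training/{mid}"),
    }
```

Because training is asynchronous, a client would issue the `start` request once and then repeat the `status` request until the reply reports the model information.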
    42. 42. Step 3: Predict. Apply the trained model to make predictions on new data
        Request:
            POST prediction/v1.1/query/appdata%2Finstalls/predict
            { "data":{ "input": { "mixture" : [ "EDUCATION","us","en","10","0" ]}}}
        Reply:
            { "data" : {
                "kind" : "prediction#output",
                "outputLabel":"Manymoon",
                "outputMulti" :[
                    {"label":"OffiSync", "score": x.xx},
                    {"label":"Zoho CRM", "score": x.xx},
                    {"label":"MailChimp", "score": x.xx}]}}
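As a rough sketch, the request body and the reply shown above can be handled with the standard `json` module. The field names (`data`, `input`, `mixture`, `outputMulti`, `label`, `score`) come from the slide; the helper names and the numeric scores used in the example are made up for illustration, since the slide elides the real values.

```python
import json


def predict_body(features):
    """Build the JSON request body for a prediction, in the shape shown on the slide."""
    return json.dumps({"data": {"input": {"mixture": list(features)}}})


def ranked_labels(reply):
    """Return the labels from "outputMulti", highest score first."""
    scores = reply["data"]["outputMulti"]
    ordered = sorted(scores, key=lambda item: item["score"], reverse=True)
    return [item["label"] for item in ordered]


# Hypothetical reply: the scores are invented placeholders, not real API output.
example_reply = {
    "data": {
        "kind": "prediction#output",
        "outputLabel": "Manymoon",
        "outputMulti": [
            {"label": "OffiSync", "score": 0.2},
            {"label": "Zoho CRM", "score": 0.5},
            {"label": "MailChimp", "score": 0.3},
        ],
    }
}
```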
    43. 43. Demo! Google Developer Day 2010
    44. 44. Demo ScreenshotsPredicting apps for a 501-1,000 seat educational institution Google Developer Day 2010
    45. 45. Demo ScreenshotsPredicting apps for a 501-1,000 seat educational institution Google Developer Day 2010
    46. 46. Demo Screenshots Predicting apps for a small business Google Developer Day 2010
    47. 47. Demo Screenshots Predicting apps for a small business Google Developer Day 2010
    48. 48. Prediction API Capabilities
        Data
        • Input Features: numeric or unstructured text
        • Output: up to hundreds of discrete categories, or continuous values
        Training
        • Many machine learning techniques
        • Automatically selected
        • Performed asynchronously
        Access from many platforms:
        • Web app from Google App Engine
        • Apps Script (e.g. from Google Spreadsheet)
        • Desktop app
    49. 49. Prediction API - Pricing
        Free Quota in trial/development
        • 100 predictions/day, 5MB trained/day
        • Available for 6 months
        Paid Usage
        • $10/month per project includes 10,000 predictions
        • Additional predictions are $0.50 per 1,000
        • Absolute limit of 60,000 predictions per day
        • $0.002 per MB trained (max size per dataset is 100MB)
    50. 50. Prediction API - Availability
        Limited preview in US* currently
        • Sign up for wait list at http://code.google.com/apis/predict/
        * Non-US preview available on case-by-case basis
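To make the paid-usage numbers concrete, here is a small worked calculation. It assumes that additional predictions are prorated (that is, $0.0005 each rather than billed in whole blocks of 1,000), which the slide does not specify; the function name is hypothetical and the daily 60,000-prediction cap is not modeled.

```python
from decimal import Decimal

BASE_FEE = Decimal("10")              # $10/month per project
INCLUDED_PREDICTIONS = 10_000         # covered by the base fee
PER_EXTRA_PREDICTION = Decimal("0.50") / 1000   # assumption: prorated per prediction
PER_MB_TRAINED = Decimal("0.002")


def monthly_cost(predictions, mb_trained):
    """Estimated monthly charge in dollars under the assumptions stated above."""
    extra = max(0, predictions - INCLUDED_PREDICTIONS)
    return BASE_FEE + extra * PER_EXTRA_PREDICTION + Decimal(str(mb_trained)) * PER_MB_TRAINED
```

Under these assumptions, 30,000 predictions and 100 MB of training data in a month come to $10 + $10 + $0.20 = $20.20.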
    51. 51. Google BigQuery: Interactive analysis of large datasets in Google's cloud
    52. 52. Introducing Google BigQuery
        • Google's large data ad hoc analysis technology
            • Analyze massive amounts of data in seconds
            • Simple SQL-like query language
        • Flexible access
            • REST APIs, JSON-RPC, Google Apps Script
    53. 53. Why BigQuery?Working with large data is a challenge Google Developer Day 2010
    54. 54. Many Use Cases ... Trends Interactive Spam Detection Tools Web Network Dashboards Optimization Google Developer Day 2010
    55. 55. Key Capabilities of BigQuery• Scalable: Billions of rows• Fast: Response in seconds• Simple: Queries in SQL• Web Service o REST o JSON-RPC o Google App Scripts Google Developer Day 2010
    56. 56. Components of BigQuery: java, python, php, bq tool (client libraries) • REST, JSON RPC • BigQuery Service • Big Storage
    57. 57. Using BigQuery: Another simple three step process...
        1. Upload: Upload your raw data to Google Storage
        2. Import: Import raw data into BigQuery table
        3. Query: Perform SQL queries on table
    58. 58. BigQuery: Create Data File
        • Data file is in the CSV format, for example:
            Isabella,F,22067
            Emma,F,17716
            Olivia,F,17246
            Sophia,F,16743
            Ava,F,15730
            Emily,F,15204
        • Format: CSV [http://tools.ietf.org/html/rfc4180]
        • Encoding: UTF-8
        • No header row allowed
        • Newlines not supported in quoted strings
        • Max row size: 64K
        • Max cell size: 64K
        • Max file size: 1GB
        • Supported cell data formats:
            ◦ string: UTF-8 encoded string up to 64K of data (as opposed to 64K characters)
            ◦ integer: IEEE 64-bit signed integers (-2^63 to 2^63-1)
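The size and integer limits above can be checked locally before uploading. The sketch below is an assumption-laden illustration: it reads "64K" as 65,536 bytes of UTF-8 data, and `check_row` and `fits_int64` are hypothetical helper names, not part of any BigQuery tool.

```python
import csv

MAX_BYTES = 64 * 1024                      # assumption: "64K" means 65,536 bytes
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1   # range of a 64-bit signed integer


def check_row(line):
    """Return a list of problems found in one CSV line (an empty list means OK)."""
    problems = []
    # Sizes are measured in encoded bytes, not characters, as the slide notes.
    if len(line.encode("utf-8")) > MAX_BYTES:
        problems.append("row exceeds 64K")
    for row in csv.reader([line]):
        for cell in row:
            if len(cell.encode("utf-8")) > MAX_BYTES:
                problems.append("cell exceeds 64K")
    return problems


def fits_int64(text):
    """True if text parses as an integer within the 64-bit signed range."""
    try:
        value = int(text)
    except ValueError:
        return False
    return INT64_MIN <= value <= INT64_MAX
```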
    59. 59. BigQuery: Upload your Data
            $ ./gsutil cp yob2009.txt gs://bucket1/tables/babynames/2009.csv
        • gsutil: tool compatible with Google Storage
        • yob2009.txt: file containing data to be uploaded
        • gs://bucket1/tables/babynames/2009.csv: destination bucket for the data
        • Data can be uploaded into one or more Big Storage buckets
        • Use REST endpoints directly or the tool shipped with Google Storage
    60. 60. BigQuery: Table Creation
            $ cat baby_schema
            [
              { "id": "name", "type": "string", "mode": "REQUIRED" },
              { "id": "gender", "type": "string", "mode": "REQUIRED" },
              { "id": "count", "type": "integer", "mode": "REQUIRED" }
            ]
        • Define a schema
            • id: The string name of the field. Field names are any combination of uppercase and/or lowercase letters (A-Z, a-z), digits (0-9) and underscores. The first character must be a letter.
            • type: The data type of this field. Supported values: string, integer, float, or boolean
            • mode: Optional property, specifying whether the cell can be null or not. Supported values: NULLABLE or REQUIRED. Default value is NULLABLE.
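The schema rules on this slide translate directly into a small validator. This is a sketch written from the slide text only; `normalize_field` is a hypothetical helper, not something the `bq` tool provides.

```python
import re

# Letters, digits and underscores; the first character must be a letter.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
TYPES = {"string", "integer", "float", "boolean"}
MODES = {"NULLABLE", "REQUIRED"}


def normalize_field(field):
    """Validate one schema entry and fill in the default mode (NULLABLE)."""
    name, ftype = field["id"], field["type"]
    mode = field.get("mode", "NULLABLE")
    if not FIELD_NAME.fullmatch(name):
        raise ValueError(f"bad field name: {name!r}")
    if ftype not in TYPES:
        raise ValueError(f"unsupported type: {ftype!r}")
    if mode not in MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    return {"id": name, "type": ftype, "mode": mode}
```

Running every entry of `baby_schema` through such a check before calling `bq create` would catch naming or type mistakes early.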
    61. 61. BigQuery: Table Creation
        Create the table in BigQuery (arguments: table name, schema file):
            $ bq create bucket1/tables/babynames/tblNames baby_schema
            {
              "kind": "bigquery#table",
              "name": "bucket1/tables/babynames/tblNames"
            }
        Import the data into the table (arguments: table name, data):
            $ bq import bucket1/tables/babynames/tblNames bucket1/tables/babynames/2009.csv
            {
              "table": "bucket1/tables/babynames/tblNames",
              "kind": "bigquery#import_id",
              "import": "d0cf328ed7d9bb46"
            }
    62. 62. BigQuery: Query the table
        Query:
            $ bq query "SELECT name,count FROM [bucket1/tables/babynames/tblNames] WHERE gender = 'F' ORDER BY count DESC LIMIT 5";
        Result:
            --------------
            name      COUNT
            --------  -----
            Isabella  22067
            Emma      17716
            Olivia    17246
            Sophia    16743
            Ava       15730
            --------------
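To try the shape of this query without BigQuery access, the same statement can be run against the six sample rows from the data-file slide using Python's built-in `sqlite3`. SQLite is only a local stand-in here: the table name is shortened, `count` is quoted as an identifier, and none of this reflects how BigQuery executes the query.

```python
import sqlite3

# Sample rows from the "Create Data File" slide.
ROWS = [
    ("Isabella", "F", 22067),
    ("Emma", "F", 17716),
    ("Olivia", "F", 17246),
    ("Sophia", "F", 16743),
    ("Ava", "F", 15730),
    ("Emily", "F", 15204),
]


def top_names(rows, gender, limit=5):
    """Return the most frequent names for a gender, highest count first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute('CREATE TABLE tblNames (name TEXT, gender TEXT, "count" INTEGER)')
        conn.executemany("INSERT INTO tblNames VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            'SELECT name, "count" FROM tblNames WHERE gender = ? '
            'ORDER BY "count" DESC LIMIT ?',
            (gender, limit),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling `top_names(ROWS, "F")` returns the same five names, in the same order, as the result table on the slide.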
    63. 63. Writing Queries
        Compact subset of SQL
            o SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT ...;
        Common functions
            o Math, String, Time, ...
        Additional statistical approximations
            o TOP
            o COUNT DISTINCT
    64. 64. BigQuery via REST
            GET /bigquery/v1/tables/{table name}
            GET /bigquery/v1/query?q={query}
        Sample JSON Reply:
            {
              "results": {
                "fields": { [ {"id":"COUNT(*)","type":"uint64"}, ... ] },
                "rows": [
                  {"f":[{"v":"2949"}, ...]},
                  {"f":[{"v":"5387"}, ...]},
                  ...
                ]
              }
            }
        Also supports JSON-RPC
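A short sketch of the two client-side pieces implied by this slide: URL-encoding the query into the `q` parameter and pulling cell values out of the `rows` part of a reply. The path and the `results`/`rows`/`f`/`v` keys are taken from the slide; the helper names and the sample reply are hypothetical, and the elided parts of the real reply are simply left out.

```python
from urllib.parse import urlencode


def query_path(sql):
    """Build the GET path for a query, with the SQL URL-encoded into "q"."""
    return "/bigquery/v1/query?" + urlencode({"q": sql})


def rows_as_lists(reply):
    """Flatten the "rows" section of a reply into plain lists of cell values."""
    return [[cell["v"] for cell in row["f"]] for row in reply["results"]["rows"]]


# Hypothetical reply containing only the parts the slide shows in full.
example_reply = {
    "results": {
        "rows": [
            {"f": [{"v": "2949"}]},
            {"f": [{"v": "5387"}]},
        ]
    }
}
```

Note that the values arrive as strings in the sample reply, so a caller would convert them to numbers where needed.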
    65. 65. Security and Privacy
        Standard Google Authentication
        • Client Login
        • OAuth
        • AuthSub
        HTTPS support
        • protects your credentials
        • protects your data
        Relies on Google Storage to manage access
    66. 66. Large Data Analysis ExampleWikimedia Revision HistoryWikimedia Revision history data from:http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z Google Developer Day 2010
    67. 67. Large Data Analysis ExampleWikimedia Revision HistoryWikimedia Revision history data from:http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z Google Developer Day 2010
    68. 68. Using BigQuery Shell: Python DB API 2.0 + B. Clapper's sqlcmd, http://www.clapper.org/software/python/sqlcmd/
    69. 69. BigQuery from a Spreadsheet Google Developer Day 2010
    70. 70. BigQuery from a Spreadsheet Google Developer Day 2010
    71. 71. Google Fusion Tables Google Developer Day 2010
    72. 72. Google Fusion Tables
        • Manage large collections of tabular data in the cloud
            • 100 MB tables
            • Filters, Aggregation, Merge
            • ACL, Collaboration, Discuss Data
            • Visualizations
        • REST API
            • Geo queries
        • Maps Integration
            • FusionTablesLayer
    73. 73. Google Fusion Tables Google Developer Day 2010
    74. 74. Google Visualization API Google Developer Day 2010
    75. 75. Google Visualization API
        • Collection of JavaScript Visualization components
            • Some from Google (Chart Tools)
            • Some from other developers
            • Share the same wire protocol for Data Sources
    76. 76. Example: Weather data
        • US National Climatic Data Center
            • Weather data at stations around the globe since 1929
            • Stored in Google Storage
            • Created a table for BigQuery
            • Upload weather station coordinates in Fusion Tables
            • App Engine app
            • Maps API to display weather station maps
            • BigQuery to query average temperature in January
            • A bit of Python to create a JSON Data Source
            • Visualization API
        • Just an example: rinse, repeat, enhance!
    77. 77. Example: Weather data Google Developer Day 2010
    78. 78. Google Refine Google Developer Day 2010
    79. 79. Google Refine
        • Power tool for working with messy data
            • Cleanup
            • Transform
            • Augment
            • (Link with FreeBase)
        • Desktop software for now
        • http://code.google.com/p/google-refine/
    80. 80. Google Refine Google Developer Day 2010
    81. 81. Recap
        • Google App Engine
            o Easy to build, deploy and manage web apps
        • Google Storage
            o High speed data storage on Google Cloud
        • Prediction API
            o Google's machine learning technology
        • BigQuery
            o Interactive analysis of very large data sets
        • Google Fusion Tables
            o Manage collections of tabular data in the cloud
        • Google Refine
            o Power tool for working with messy data
        • Google Visualization
            o Collection of JavaScript Visualization components
    82. 82. More information: http://code.google.com/apis/ and http://code.google.com/more/table/