How we are using BigQuery and Apps Scripts at teowaki
I was invited to speak at the Google Startup Launch Summit in London about how we are using Google Cloud to power our startup.

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

How we are using BigQuery and Apps Scripts at teowaki Presentation Transcript

  • 1. javier ramirez @supercoco9 How we are using BigQuery and Apps Scripts at teowaki
  • 2. Set a distance. Set an expiration time. Bye bye noise.
  • 3. Analytics flow
  • 4. Analytics flow, by segment
  • 5. Automatic Alerts
  • 6. javier ramirez @supercoco9 https://teowaki.com startup launch summit london 14 REST API (Ruby on Rails) + Web on top (AngularJS)
  • 8. “data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the structures of your database architectures.” Edd Dumbill, program chair for the O’Reilly Strata Conference
  • 9. (1) non-intrusive metrics (2) keep the history (3) avoid vendor lock-in (4) interactive queries (5) cheap (6) extra ball: real time
  • 11. Cloud Storage: Cost-efficient storage of files
  • 13. tools we considered: Hadoop, Cassandra, Amazon Redshift, ...
  • 14. Our choice: Google BigQuery. Data analysis as a service. http://developers.google.com/bigquery
  • 15. Based on “Dremel”. Specifically designed for interactive queries over petabytes of real-time data
  • 16. loading data: You just send the data in text (or JSON) format
  • 17. SQL
    select name from USERS order by date;
    select count(*) from users;
    select max(date) from USERS;
    select sum(total) from ORDERS group by user;
  • 18. specific extensions for analytics: within, flatten, nest, stddev, top, first, last, nth, variance, var_pop, var_samp, covar_pop, covar_samp, quantiles
  • 19. web console screenshot
  • 20. our most active user
  • 21. country-segmented traffic
  • 22. 10 requests we should be caching
  • 23. 5 most created resources
  • 24. new users per month
  • 25. SELECT repository_name, repository_language, repository_description,
      COUNT(repository_name) as cnt, repository_url
    FROM github.timeline
    WHERE type="WatchEvent"
      AND PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC("#{yesterday} 20:00:00")
      AND repository_url IN (
        SELECT repository_url
        FROM github.timeline
        WHERE type="CreateEvent"
          AND PARSE_UTC_USEC(repository_created_at) >= PARSE_UTC_USEC('#{yesterday} 20:00:00')
          AND repository_fork = "false"
          AND payload_ref_type = "repository"
        GROUP BY repository_url
      )
    GROUP BY repository_name, repository_language, repository_description, repository_url
    HAVING cnt >= 5
    ORDER BY cnt DESC
    LIMIT 25
  • 26. Automation with Apps Script: Read from BigQuery. Create a spreadsheet on Drive. E-mail it every day as a PDF
  • 27. cloud storage pricing: $0.032 per GB. A gzipped 4.8 MB file stores 1MM rows. $0.000092 / month per 1MM rows
  • 28. bigquery pricing
    $26 per stored TB: 1000000 rows => $0.00416 / month (£0.00243 / month)
    $5 per processed TB: 1 full scan = 160 MB; 1 count = 0 MB; 1 full scan over 1 column = 5.4 MB; 100 GB => $0.05 / month (£0.03)
  • 29. £0.054307 / month* per 1MM rows. *the 1st 100GB every month are free of charge
  • 30. (1) non-intrusive metrics (2) keep the history (3) avoid vendor lock-in (4) interactive queries (5) cheap (6) extra ball: real time
  • 32. Find related links at https://teowaki.com/teams/javier-community/link-categories/bigquery-talk Thanks! Javier Ramírez @supercoco9 startup launch summit london 14
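
Slide 16 says data is loaded by sending it in text or JSON format. For JSON, BigQuery load jobs expect newline-delimited JSON, that is, one object per line. The sketch below shows that serialization in JavaScript; the row fields (`ts`, `path`, `status`) and the helper name `toNdjson` are hypothetical and are not taken from the talk.

```javascript
// Hypothetical request-log rows; the field names are illustrative only,
// not teowaki's actual schema.
const rows = [
  { ts: "2014-01-01 10:00:00", path: "/api/links", status: 200 },
  { ts: "2014-01-01 10:00:01", path: "/api/teams", status: 404 },
];

// Newline-delimited JSON: one complete JSON object per line,
// with no enclosing array and no commas between records.
function toNdjson(records) {
  return records.map((record) => JSON.stringify(record)).join("\n") + "\n";
}

const ndjson = toNdjson(rows);
console.log(ndjson);
```

A file in this shape can be gzipped and kept in Cloud Storage, which fits the storage approach described on slides 11 and 27.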
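
Slide 26 lists three automation steps: read from BigQuery, create a spreadsheet on Drive, and e-mail it as a PDF. The following is a minimal Apps Script sketch of that flow, not the script used at teowaki. It assumes the BigQuery advanced service is enabled in the script project; the project id, query, table name, report title and recipient are all placeholders, and `rowsToValues` and `mailDailyReport` are hypothetical names.

```javascript
// Pure helper: BigQuery's REST response encodes each row as {f: [{v: ...}, ...]}.
// Convert it into the 2D array that Range.setValues() expects, header row first.
function rowsToValues(schemaFields, rows) {
  const header = schemaFields.map((field) => field.name);
  const body = (rows || []).map((row) => row.f.map((cell) => cell.v));
  return [header].concat(body);
}

// Entry point; runs only inside Apps Script. All literals below are placeholders.
function mailDailyReport() {
  const PROJECT_ID = "your-project-id";
  const request = { query: "SELECT COUNT(*) AS total FROM [your_dataset.your_table]" };

  // Synchronous query; a long-running query may need polling, omitted here.
  const result = BigQuery.Jobs.query(request, PROJECT_ID);
  const values = rowsToValues(result.schema.fields, result.rows);

  // Write the result into a new spreadsheet on Drive.
  const spreadsheet = SpreadsheetApp.create("Daily report");
  spreadsheet
    .getActiveSheet()
    .getRange(1, 1, values.length, values[0].length)
    .setValues(values);
  SpreadsheetApp.flush();

  // Export the spreadsheet as a PDF and mail it.
  const pdf = DriveApp.getFileById(spreadsheet.getId()).getAs("application/pdf");
  MailApp.sendEmail({
    to: "you@example.com",
    subject: "Daily report",
    body: "Report attached.",
    attachments: [pdf],
  });
}
```

To send the report every day, `mailDailyReport` can be attached to a time-driven trigger from the Apps Script editor.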
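
The storage figure on slide 28 can be reproduced with a few lines of arithmetic. This sketch uses the slide's own numbers ($26 per stored TB, $5 per processed TB, 160 MB for 1MM rows) and assumes decimal units (1 TB = 1,000,000 MB), which is the reading that matches the slide's $0.00416. The function names are hypothetical, and the prices are those quoted in the talk, not current ones.

```javascript
// Prices as quoted on slide 28 (historical, from the talk).
const STORAGE_USD_PER_TB_MONTH = 26;
const QUERY_USD_PER_TB_PROCESSED = 5;

// Assumption: decimal units, 1 TB = 1,000,000 MB.
const MB_PER_TB = 1e6;

// Monthly cost of keeping a table of the given size in BigQuery.
function monthlyStorageCostUsd(tableSizeMb) {
  return (tableSizeMb / MB_PER_TB) * STORAGE_USD_PER_TB_MONTH;
}

// Cost of one query that processes the given amount of data.
function queryCostUsd(processedMb) {
  return (processedMb / MB_PER_TB) * QUERY_USD_PER_TB_PROCESSED;
}

// 1MM rows occupy 160 MB according to the slide.
console.log(monthlyStorageCostUsd(160).toFixed(5)); // 0.00416
```

Since queries are billed by data processed, the slide's 5.4 MB single-column scan would cost far less than the 160 MB full scan under the same formula.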