Using Task Queues and D3.js to build an analytics product on App Engine

"Using Task Queues and D3.js to build an analytics product on App Engine" by Warren Edwards, Founder, Waizee

Published in: Technology

  1. Using Task Queues and D3.js to build an analytics product on App Engine. Warren Edwards, Founder, Waizee
  2. TODAY
       ● How Do We Handle All The Data?!?
       ● Look at the Product
       ● Why App Engine?
       ● Why D3.js?
       ● Task Queues in App Engine
       ● Code Samples
       ● A Quiz!
       ● Wrap Up
  3. All of the data stored in the world is currently about 4 zettabytes, according to EMC. At current growth rates, that data will surpass a yottabyte in 2030. If each bit were represented by a grain of sand, a yottabyte would be about five percent of the mass of the Moon. Data surpasses human readability.
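
The growth claim on this slide can be checked with a little arithmetic. The sketch below derives the annual growth rate implied by going from about 4 zettabytes to 1 yottabyte by 2030. The base year is not stated on the slide; 2013 is an assumption made here for illustration, and the result shifts if a different base year is used.

```python
import math

# Figures quoted on the slide.
current_zettabytes = 4.0       # "about 4 zettabytes"
target_zettabytes = 1000.0     # 1 yottabyte = 1000 zettabytes (SI prefixes)
target_year = 2030

# ASSUMPTION: the slide gives no base year; 2013 is used for illustration only.
base_year = 2013

years = target_year - base_year
growth_factor = target_zettabytes / current_zettabytes     # overall growth needed
annual_growth = growth_factor ** (1.0 / years)             # compound annual factor
doubling_time = math.log(2) / math.log(annual_growth)      # years per doubling

print("Implied annual growth: %.1f%%" % ((annual_growth - 1) * 100))
print("Implied doubling time: %.2f years" % doubling_time)
```

Under that assumption the data volume has to grow by a bit under 40 percent per year, which means doubling roughly every two years.
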
  4. Software analyzes the data so humans can focus on [ interpretation, judgement, action ];
  5. [ Show Product Demo ]
  6. Why App Engine?
       Platforms compared: AWS, AE, Others
       Criteria:
       ● Team experience
       ● Automated scaling / spin-up
       ● Automated load-balancing
       ● Automated security
       ● Task queue functionality
       Honorable mention: AppScale path from AE
  7. Google App Engine builds on Bigtable
  8. Why D3.js?
       Options compared: Canvas directly, D3, PolyChart, Matplotlib, Superconductor
       Criteria:
       ● Team experience ( NONE )
       ● Development effort
       ● Maturity of library
       ● Developer traction
       Honorable mention: xCharts builds on D3.js with prepackaged charts
       PHOTO CREDIT: http://www.flickr.com/photos/cusegoyle/
  9. Introducing Task Queues (TQ)

                                      HTTP Request                      Task Queue
       Full access to data store?     YES                               YES
       Atomic transactions?           YES                               YES
       Write in Python / Java / Go?   YES                               YES
       Maximum request lifetime       30-60 sec                         Up to 10 min
       Timing of execution            Immediate                         Up to 30 days
       Retry                          Can be done manually, no policy   Automatic by policy
       Concurrent requests            No set limit                      Limited by policy

       Task Queues build on App Engine's HTTP Request but liberate server use from user interaction.
  10. Appeal of Task Queues
        Well suited to number crunching:
        ● Long-lived jobs for running analytics
        ● Full access to data store and messaging to build on your existing App Engine knowledge
        ● Save state back to Data Store and call the next task
        Task Queues allow you to crunch data "while you wait"
  11. Cascade Tasks in Queue for Multipass Processing
        [Diagram] USER UPLOADS DATA → Task #1 → Task #2 → . . . → Task #N → RESULT
        Arrow labels: Save to Data Store, Pass Parameters
        Legend: Program Flow, Data Flow
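
The cascade pattern on this slide can be tried out locally before touching App Engine. The following is a minimal simulation in plain Python, not App Engine code: the `datastore` dict, the `queue` deque, and the helper names `enqueue`, `first_pass`, `second_pass`, and `run_queue` are all hypothetical stand-ins chosen for illustration. Each task loads its record by key, does one pass of work, saves state back, and enqueues the next task.

```python
from collections import deque

datastore = {}     # stand-in for the App Engine Data Store (hypothetical)
queue = deque()    # stand-in for a push queue; holds (handler, params) pairs


def enqueue(handler, params):
    """Rough local analogue of adding a task with parameters to a queue."""
    queue.append((handler, params))


def first_pass(params):
    key = params['key']
    record = datastore[key]
    record['total'] = sum(record['raw'])     # first pass of work
    datastore[key] = record                  # save state back to the store
    enqueue(second_pass, {'key': key})       # hand off to the next task


def second_pass(params):
    key = params['key']
    record = datastore[key]
    record['mean'] = record['total'] / float(len(record['raw']))  # second pass
    datastore[key] = record                  # final result lives in the store


def run_queue():
    """Drain the queue in order, the way a queue worker would."""
    while queue:
        handler, params = queue.popleft()
        handler(params)


# The "user upload" step: store the raw data, then kick off the cascade.
datastore['upload-1'] = {'raw': [3, 5, 10]}
enqueue(first_pass, {'key': 'upload-1'})
run_queue()
print(datastore['upload-1'])
```

Only the key travels between tasks; the data itself stays in the store. That keeps task parameters small and lets any pass be retried from saved state.
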
  12. Process Tasks in Parallel Queues
        [Diagram] USER UPLOADS DATA → Task #1 | Task #2 | . . . | Task #N (in parallel, connected to the DATASTORE) → RESULT
        Legend: Program Flow, Data Flow
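
A parallel fan-out like the one in this diagram can be sketched locally with a thread pool. This is an illustration in plain Python rather than App Engine code: the `uploads` dict stands in for the data store, and `split`, `worker`, and `analyze` are hypothetical names. The data is split into chunks, each chunk is processed independently, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for uploaded data held in the data store (hypothetical).
uploads = {'upload-1': list(range(1, 101))}   # the integers 1..100


def split(values, parts):
    """Deal the values into `parts` interleaved chunks."""
    return [values[i::parts] for i in range(parts)]


def worker(chunk):
    """One independent task: sum of squares over its own chunk."""
    return sum(x * x for x in chunk)


def analyze(key, parts=4):
    """Fan the work out to parallel workers, then combine the partial results."""
    chunks = split(uploads[key], parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(worker, chunks))
    return sum(partials)


print(analyze('upload-1'))
```

This only works cleanly when the per-chunk work is independent and the combine step is cheap. Multipass analyses where each pass depends on the previous one fit the cascade pattern better.
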
  13. Let's Look at a Code Sample

      Add Task into Queue:

          # Imports needed by the snippets on this slide (Python 2.7 runtime)
          import webapp2
          from google.appengine.api import taskqueue

          taskqueue.add(queue_name='analyze', url='/work1', params={'key': keyID})

      Define Task with Retrieval of Parameters:

          class WorkerTheFirst(webapp2.RequestHandler):
              def post(self):
                  keyID = self.request.get('key')

      Associate Task with Handler:

          app = webapp2.WSGIApplication([('/', StartPage),
                                         ('/work1', WorkerTheFirst),
                                         ('.*', ErrPage)])
  14. One Task Calls Another

          class WorkerTheFirst(webapp2.RequestHandler):
              def post(self):
                  keyID = self.request.get('key')
                  #
                  # First pass of work here
                  #
                  taskqueue.add(queue_name='analyze', url='/work2', params={'key': keyID})

          class WorkerTheSecond(webapp2.RequestHandler):
              def post(self):
                  keyID = self.request.get('key')
                  #
                  # Second pass of work here
                  #

          app = webapp2.WSGIApplication([('/', StartPage),
                                         ('/work1', WorkerTheFirst),
                                         ('/work2', WorkerTheSecond),
                                         ('.*', ErrPage)])
  15. Setting up Task Queues in queue.yaml

          total_storage_limit: 120G   # Max for free apps is 500M

          queue:
          # Queue for analyzing the incoming data
          - name: analyze
            rate: 35/s
            retry_parameters:
              task_retry_limit: 5
              task_age_limit: 2h

          # Queue for user behavior heuristics
          - name: heuristic
            rate: 5/s

          # Queue for doing maintenance on the data store or site
          - name: kickoff
            rate: 5/s
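
The retry settings on the `analyze` queue interact in a way that is easy to misread. As I understand the App Engine queue configuration documentation, when `task_retry_limit` and `task_age_limit` are both set, a failing task keeps being retried until both limits have been reached. The sketch below expresses that rule in plain Python. The function name `should_retry` and the exact boundary comparisons are assumptions made for illustration; this is not App Engine code.

```python
# Values taken from the queue.yaml on the slide.
TASK_RETRY_LIMIT = 5
TASK_AGE_LIMIT_SECONDS = 2 * 60 * 60   # 2h


def should_retry(retries_so_far, age_seconds):
    """Return True while a failing task is still eligible for another retry.

    ASSUMPTION: the task is dropped only once BOTH limits have been reached;
    the use of >= at the boundaries is a guess for illustration.
    """
    retries_exhausted = retries_so_far >= TASK_RETRY_LIMIT
    too_old = age_seconds >= TASK_AGE_LIMIT_SECONDS
    return not (retries_exhausted and too_old)


print(should_retry(retries_so_far=5, age_seconds=60))        # retries used up, task still young
print(should_retry(retries_so_far=2, age_seconds=3 * 3600))  # task old, retries remaining
print(should_retry(retries_so_far=5, age_seconds=3 * 3600))  # both limits reached
```

If that reading is right, a task in this queue that fails quickly five times is still retried until it is two hours old. That is worth knowing when estimating how much work a bad input can generate.
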
  16. Core of the D3 Code

          var svg = d3.select('body')
              .append('svg')
              .attr('class', 'circles')
              .attr('width', width)
              .attr('height', height);

          svg.append('g').selectAll('circle')
              .data(data)
              .enter()
              .append('circle')
              // pan allows moving the whole graph slightly
              // to account for long text labels
              .attr('transform', 'translate(' + pan + ', 0)');

          svg.selectAll('circle')
              .attr('cx', function(d) { return x(d.v2); })
              .attr('cy', function(d) { return y(d.v1); })
              .attr('r', dot_out)   // dot_out scales the size
              .attr('fill', function(d) { return d.v4; });
  17. Program Writes Its Own Code! (This slide repeats the D3 code from slide 16.) Axes are set heuristically by server software.
  18. Titles Are Set By Heuristic Text Analysis

      In Django template:

          var title = '{{ pagetitle }}'
          var subtitle = '{{ pagesubtitle }}'

      Rendered in JavaScript to the browser:

          var title = 'Google+ Rating More Important Metric Than Star Rating'
          var subtitle = 'Survey of Productivity Apps in the Chrome Web Store, Nov 2012'

      Labels were pulled heuristically from input - not hard coded!
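
Because the titles come from heuristic analysis of user input, they may contain quotes or other characters that would break a hand-quoted JavaScript string. One option is to have the server encode each value as a string literal before it reaches the page. The sketch below uses `json.dumps` for that. The function name `render_titles` and the sample title are hypothetical, and this is not necessarily how the product does it.

```python
import json


def render_titles(pagetitle, pagesubtitle):
    """Return the two JavaScript lines with each value encoded as a string literal.

    json.dumps adds the surrounding double quotes and escapes embedded
    quotes, backslashes, and newlines.
    """
    return "var title = %s;\nvar subtitle = %s;" % (
        json.dumps(pagetitle), json.dumps(pagesubtitle))


# Hypothetical title containing an apostrophe, which would end a
# single-quoted JavaScript string early.
print(render_titles("Users' Ratings Beat Star Ratings",
                    "Survey of Productivity Apps in the Chrome Web Store, Nov 2012"))
```

A value embedded inside an HTML script block can still need further escaping, for example if the text contains a closing script tag, so treat this as a starting point only.
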
  19. QUIZ!
  20. Choose the Correct Task Queue Call

      A. taskqueue.add(queue_name='analyze', url='/work2', param={key: keyID})
      B. queue.add(queue='analyze', url='/work2', params={'key': keyID})
      C. taskqueue.add(queue='analyze', url='/work2', param={'key': keyID})
      D. taskqueue.add(queue_name='analyze', url='/work2', params={'key': keyID})
      E. taskqueue.add(queue='analyze', url='/work2', params={key: keyID})
  21. ANSWER IS ...
  22. Choose the Correct Task Queue Call (options A to E repeated from slide 20). Correct Answer is D:

          taskqueue.add(queue_name='analyze', url='/work2', params={'key': keyID})
  23. Task Queues Open a World of Possibilities
        ● You can send tasks to different versions of your app → Automated test of a new version of the app before Go Live
        ● You can access the queue's usage data → Your app can monitor its own consumption of tasks through the QueueStatistics class
        ● Task Queue + Crowdsource = ??? → Software application instructing humans!
  24. Task Queues allow "while you wait" processing
        ● Allow server tasks to run autonomously
        ● Cascade tasks for multistep processing
        ● Flexible functionality to create great products
        Task Queues provide a great tool for automating the understanding of data.

      D3.js offers a flexible, stable tool for visualizing data
        ● Works nicely with automated scripting
        ● Lush visualizations but not pre-packaged
        ● Leverage huge traction in San Francisco
        D3.js is the best platform for visualization using automated processing of data.
  25. Questions? Do you have a passion for analytics? Let's talk! warren@waizee.com @campbellwarren
  26. We are your number cruncher in the cloud that understands your data and shows you only what is most important.
  27. Data Sources
        ● The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East. IDC, sponsored by EMC. December 2012
        ● Diameter and volume of the Moon: Wolfram Alpha
        ● Yottabyte representation: Waizee calculations
        ● 2013 Gartner Magic Quadrant for Business Intelligence and Analytics Platforms
  28. Photo Credits
        ● http://www.flickr.com/photos/pcoin/2066230078/sizes/m/in/photostream/
        ● http://en.wikipedia.org/wiki/File:Visicalc.png
        ● http://www.todayscampus.com/rememberthis/load.aspx?art=348
        ● http://commons.wikimedia.org/wiki/File:Chocolate_chip_cookie.jpg
        ● http://www.wpclipart.com/signs_symbol/checkmarks/checkmarks_3/checkmark_Bold_Brush_Green.png.html
        ● http://www.flickr.com/photos/cusegoyle/
