Real-Time Django


The web is live. APIs give us access to continuously changing data. We discuss ways to get real-time data into your app, how to handle data processing and what to do when you get thousands of updates per second.

Published in: Technology

Transcript

  • 1. Presented for your enjoyment at DjangoCon US 2011: Real-Time Django, with Ben Slavin (@benslavin) and Adam Miskiewicz (@skevy)
  • 2. The web is Read / Write
  • 3. The web is Read / Write [with Read emphasized]
  • 4. [Slide filled edge to edge with HTTP methods: overwhelmingly GET, with only an occasional POST.]
  • 5. 1 / second
  • 6. Django Just Works (with intelligent application design and proper caching)
  • 7. 50 / second
  • 8. 500 / second
  • 9. 5,000 / second
  • 10. Beyonce!!! 8,868 / second http://twitter.com/#!/twitterglobalpr/status/108285017792331776
  • 11. Super Bowl XLV http://blog.twitter.com/2011/02/superbowl.html
  • 12. 4,064 at peak
  • 13. >2,000 sustained
  • 14. Django wasn’t built for this.
  • 15. ... but that doesn’t mean we need to use J2EE or Erlang.
  • 16. Using the techniques discussed today, we have: processed >4k pieces of data/second; tracked >50k live data points; run live events; served award-show-sized audiences.
  • 17. You may not deal with this scale, but hopefully you can learn from our techniques.
  • 18. Under-documented
  • 19. A lot to cover
  • 20. A play. In three parts: Retrieval, Processing, Presentation.
  • 21. Retrieval: Polling
  • 22. Retrieval / Polling: Widely used. Twitter, Facebook, Foursquare, etc.
  • 23. Retrieval: Continuous Polling. The naïve approach.
  • 24. Retrieval / Continuous Polling: Slow. Synchronously blocks the request/response cycle.
  • 25. Retrieval / Continuous Polling: Not neighborly. Adds undue burden on the upstream service.
  • 26. Retrieval / Continuous Polling: Failure model sucks. If the upstream service goes down, so do you.
  • 27. Retrieval: Cached Polling. A slightly less-awful approach.
  • 28. Retrieval / Cached Polling: Dog pile. Same as ‘continuous polling’ in the degenerate case.
  • 29. Retrieval / Cached Polling: Failure model sucks. If the upstream service goes down, so do you.
  • 30. Retrieval / Polling: DON’T BREAK THE CYCLE. Don’t do this in the request/response cycle.
  • 31. Retrieval / Polling: manage.py poll_stuff + crontab -e
  • 32. Retrieval / Polling: Still not enough.
  • 33. Retrieval / Polling: Rate limits, e.g. 500 requests / hour.
  • 34. Retrieval / Rate Limits: Batched requests. http://api.twitter.com/1/users/lookup.json?screen_name=bolsterlabs,benslavin,skevy
  • 35. Retrieval / Rate Limits: Multiple clients. Use a pool of workers with different IPs and API keys.
  • 36. Retrieval / Rate Limits: Special access. Ask the upstream provider.
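The batching idea from slide 34 can be sketched as follows. This is a minimal illustration, not the presenters' code: the endpoint is the one shown on the slide, while `BATCH_SIZE` and the helper name `batched_lookup_urls` are assumptions made for the example. The point is only that one request per batch uses far less of an hourly quota than one request per user.

```python
from urllib.parse import urlencode

LOOKUP_URL = "http://api.twitter.com/1/users/lookup.json"  # endpoint from slide 34
BATCH_SIZE = 100  # assumed per-request cap; check the provider's documentation


def batched_lookup_urls(screen_names, batch_size=BATCH_SIZE):
    """Return one lookup URL per batch of screen names (hypothetical helper)."""
    urls = []
    for start in range(0, len(screen_names), batch_size):
        batch = screen_names[start:start + batch_size]
        # Keep commas unescaped so the URL matches the form shown on the slide.
        query = urlencode({"screen_name": ",".join(batch)}, safe=",")
        urls.append(f"{LOOKUP_URL}?{query}")
    return urls
```

With a limit of 500 requests per hour, 250 names cost three requests under these assumptions rather than 250.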
  • 37. Retrieval: Web hooks. No, you come to me.
  • 38. Retrieval / Web Hooks: Out of band. Asynchronous from the user’s perspective.
  • 39. Retrieval / Web Hooks: PubSubHubbub. Used by Gowalla, Myspace, Google.
  • 40. Retrieval / Web Hooks: The data comes to you. True ‘push’.
  • 41. Retrieval / Web Hooks: Just handle it. Class-based views or plain-old methods. It’s just Django w/ different auth.
  • 42. Retrieval / Web Hooks: Setup can be complex. Or worse, completely manual.
  • 43. Retrieval: Streaming. Long-lived, open-socket communication.
  • 44. Retrieval / Streaming: Live updates, but only when you’re connected.
  • 45. Retrieval / Streaming: Single client can be a significant bottleneck.
  • 46. Retrieval / Streaming: “Site Streams may deliver hundreds of messages per second to a client, and each stream may consume significant (> 1 Mbit/sec) bandwidth. Your processing of tweets should be asynchronous, with appropriate buffers in place to handle spikes of 3x normal throughput. Note that slow reading clients are automatically terminated.” https://dev.twitter.com/docs/streaming-api/site-streams
  • 47. Retrieval / Streaming: Hot potato. Pass data off as quickly as possible.
  • 48. Retrieval: STORE IT. LOG IT. SAVE IT. This data is ephemeral, and there may be no good way to recreate it once it’s gone.
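Slides 47 and 48 together suggest a reader that does almost nothing: it writes the raw message somewhere durable, hands it to a queue, and goes back to the socket. The sketch below shows that shape using only the standard library. The names `receive` and `drain` are hypothetical, and an in-process `queue.Queue` stands in for whatever broker or worker system a real deployment would use.

```python
import queue


def receive(stream, raw_log, work_queue):
    """Reader loop: persist each raw message, enqueue it, and do nothing else."""
    for raw in stream:
        raw_log.write(raw + "\n")  # store it first; it may be impossible to refetch
        work_queue.put(raw)        # hand off; parsing and processing happen elsewhere


def drain(work_queue, handle):
    """Worker side: process everything queued so far and return how many were handled."""
    handled = 0
    while True:
        try:
            raw = work_queue.get_nowait()
        except queue.Empty:
            return handled
        handle(raw)
        handled += 1
```

In practice the reader and the workers would run in separate threads or processes, so that slow processing never causes the upstream service to drop a slow-reading connection.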
  • 49. Processing
  • 50. Processing: Denormalization
  • 51. Processing / Denormalization: Your DB is slow.* (* Unless you know Frank Wiles.)
  • 52. Processing / Denormalization: db_index=True is not the answer.
  • 53. Processing / Denormalization: Tweet.objects.filter(screenname="aplusk").count()
  • 54. Processing / Denormalization: TweetCount.objects.get(screenname="aplusk").tweet_count. Also consider memcached, Redis, etc.
  • 55. Processing / Denormalization: pre_save, post_save, post_delete and F objects. Use these.
  • 56. Processing / Denormalization: Be careful. These only work in Django.
  • 57. Processing: Workers
  • 58. Processing / Workers: Deconstruct the problem.
  • 59. Check for profanity, then retrieve an avatar, then geo-locate the author, then add as input for trending terms, then retrieve author’s social graph, then adjust the leaderboard.
  • 60. Retrieve an avatar. Check for profanity. Geo-locate the author. Add as input for trending terms. Then: retrieve author’s social graph. Adjust the leaderboard.
  • 61. Processing / Workers: django-celery. Or manage yourself with any queue.
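One reading of slides 59 and 60 is that the first four steps do not depend on each other and can run side by side, with only the last two needing to wait. The sketch below takes that reading as an assumption. Every task function is a hypothetical stub, the dependency of the leaderboard on the social graph is also assumed, and a thread pool stands in for a real task queue such as the django-celery setup named on slide 61.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins; real tasks would call out to services or a database.
def check_profanity(tweet):
    return "darn" in tweet["text"].lower()


def retrieve_avatar(tweet):
    return "https://example.invalid/avatars/%s.png" % tweet["author"]


def geolocate_author(tweet):
    return tweet.get("location", "unknown")


def trending_terms(tweet):
    return [word for word in tweet["text"].lower().split() if word.startswith("#")]


def retrieve_social_graph(tweet):
    return {"followers": []}


def adjust_leaderboard(tweet, graph):
    return len(graph["followers"])


def process(tweet):
    """Run the independent steps concurrently, then the dependent ones in order."""
    independent = {
        "profane": check_profanity,
        "avatar": retrieve_avatar,
        "location": geolocate_author,
        "trending": trending_terms,
    }
    with ThreadPoolExecutor(max_workers=len(independent)) as pool:
        futures = {name: pool.submit(fn, tweet) for name, fn in independent.items()}
        results = {name: future.result() for name, future in futures.items()}
    # Assumed dependency: the leaderboard update needs the social graph.
    results["social_graph"] = retrieve_social_graph(tweet)
    results["leaderboard"] = adjust_leaderboard(tweet, results["social_graph"])
    return results
```

Once the steps are separate functions, moving them from a thread pool to queue-backed workers is mostly a matter of how they are dispatched.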
  • 62. Processing: map + reduce. It’s not that scary.
  • 63. Processing / map + reduce: Generational. Get data. Process. Cache results.
  • 64. Processing / map + reduce / Generational: Good for many problems. Especially where the intermediate working set is large.
  • 65. Processing / map + reduce / Generational: Solutions exist. CouchDB, Mongo, Hadoop.
  • 66. Processing / map + reduce: Sometimes we can be smarter.
  • 67. Processing / map + reduce / incremental: Consider averages. I mean the mean.
  • 68. Processing / map + reduce / incremental: $\frac{1}{n}\sum_{i=1}^{n} a_i$
  • 69. Processing / map + reduce / incremental: $\frac{1}{n}\sum_{i=1}^{n} a_i = \frac{1}{n}\left(\sum_{i=1}^{n-1} a_i + a_n\right)$. From O(n) to O(1).
  • 70. Processing / map + reduce / incremental: This example was trivial, but you can often store a partial solution.
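The incremental mean from slides 68 to 70 can be sketched in a few lines. The stored partial solution is the pair (count, running sum); each new value updates both in constant time instead of re-summing all n values. The class name `RunningMean` is chosen for this example only.

```python
class RunningMean:
    """Keep a partial solution (count and sum) so each update is O(1)."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        """Fold one new value into the partial solution and return the new mean."""
        self.count += 1
        self.total += value
        return self.mean

    @property
    def mean(self):
        if self.count == 0:
            raise ValueError("mean of an empty stream is undefined")
        return self.total / self.count
```

The same two numbers could be kept in memcached or Redis rather than in process memory, so that any worker can apply an update.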
  • 71. Presentation (of the data, not the thing we’re doing now).
  • 72. Presentation: Partial Caching
  • 73. Presentation / Partial Caching: {% cache 500 my_stuff %} Template fragment caching.
  • 74. Presentation / Partial Caching: class MyModel(models.Model): as_html = models.TextField()
  • 75. Presentation / Partial Caching: serialized = json.dumps(my_stuff); cache.set('my_stuff', serialized). Don’t be afraid of low-level caching.
  • 76. Presentation: Continuous Caching
  • 77. Presentation / Continuous Caching: while True: cache_page()
  • 78. Presentation / Continuous Caching: Out-of-band caching. Works when the number of pages is relatively small. Similar to proxy_cache, but more resilient.
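The `while True: cache_page()` loop on slide 77 can be fleshed out roughly as below. This is a sketch under stated assumptions: a plain dict stands in for the cache backend, `render` is a hypothetical callable that produces a page body, and the refresh interval is arbitrary. One plausible source of the resilience claimed on slide 78 is shown here: a failed render leaves the previous copy in place rather than evicting it.

```python
import time


def refresh_cache(cache, pages, render, log=print):
    """Re-render every page once; keep the previous copy if rendering fails."""
    for path in pages:
        try:
            cache[path] = render(path)
        except Exception as exc:
            # Leave the stale entry in place; stale content beats an error page.
            log("render failed for %s: %s; keeping stale copy" % (path, exc))


def run_forever(cache, pages, render, interval=5.0):
    """Out-of-band loop: requests only ever read the cache, never render."""
    while True:
        refresh_cache(cache, pages, render)
        time.sleep(interval)
```

Because every pass renders every page, the approach only scales while the page list stays short, which matches the caveat on slide 78.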
  • 79. Presentation / Continuous Caching: [Watch this space.]
  • 80. Presentation: Real-Time Updates
  • 81. Presentation / Real-Time Updates: gevent, eventlet, tornado, twisted
  • 82. Presentation / Real-Time Updates: Django plays well with others.
  • 83. Presentation / Real-Time Updates: Django + RabbitMQ + node.js + socket.io
  • 84. Presentation / Real-Time Updates: [Watch this space.]
  • 85. Presentation: Failure Models
  • 86. Presentation / Failure Models: This isn’t good for anyone.
  • 87. Presentation / Failure Models: proxy_cache_use_stale and proxy_next_upstream. For use with nginx.
  • 88. Presentation / Failure Models: Build a small backup app. It can serve pre-cached content. Anything is better than a 404, 500 or 502 (usually).
  • 89. Follow @bolsterlabs for slides and lively discussion.
  • 90. Thank you.
  • 91. Don’t be a stranger. Ben Slavin (@benslavin, ben@bolsterlabs.com) and Adam Miskiewicz (@skevy, adam@bolsterlabs.com). @bolsterlabs