Real-Time Django
The web is live. APIs give us access to continuously changing data. We discuss ways to get real-time data into your app, how to handle data processing and what to do when you get thousands of updates per second.

Published in: Technology


Transcript

  • 1. Presented for your enjoyment at DjangoCon US 2011: Real-Time Django, with Ben Slavin (@benslavin) and Adam Miskiewicz (@skevy)
  • 2. The web is Read / Write
  • 3. Read / The web is Write
  •
  • 5. 1 / second
  • 6. Django Just Works (with intelligent application design and proper caching)
  • 7. 50 / second
  • 8. 500 / second
  • 9. 5,000 / second
  • 10. Beyonce!!! 8,868 / second. http://twitter.com/#!/twitterglobalpr/status/108285017792331776
  • 11. Super Bowl XLV. http://blog.twitter.com/2011/02/superbowl.html
  • 12. 4,064 at peak
  • 13. >2,000 sustained
  • 14. Django wasn’t built for this.
  • 15. ... but that doesn’t mean we need to use J2EE or Erlang.
  • 16. Using the techniques discussed today, we have: processed > 4k pieces of data/second; tracked > 50k live datapoints; run live events; served award-show sized audiences.
  • 17. You may not deal with this scale, but hopefully you can learn from our techniques.
  • 18. Under-documented
  • 19. A lot to cover
  • 20. A play. In three parts: Retrieval, Processing, Presentation.
  • 21. Retrieval: Polling
  • 22. Retrieval / Polling: Widely used. Twitter, Facebook, Foursquare, etc.
  • 23. Retrieval: Continuous Polling, the naïve approach
  • 24. Retrieval / Continuous Polling: Slow. Synchronously blocks the request/response cycle.
  • 25. Retrieval / Continuous Polling: Not neighborly. Adds undue burden on the upstream service.
  • 26. Retrieval / Continuous Polling: Failure model sucks. If the upstream service goes down, so do you.
  • 27. Retrieval: Cached Polling, a slightly less-awful approach
  • 28. Retrieval / Cached Polling: Dog pile. Same as ‘continuous polling’ in the degenerate case.
  • 29. Retrieval / Cached Polling: Failure model sucks. If the upstream service goes down, so do you.
  • 30. Retrieval / Polling: DON’T BREAK THE CYCLE. Don’t do this in the request/response cycle.
  • 31. Retrieval / Polling: manage.py poll_stuff + crontab -e
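The out-of-band polling idea from slides 30 and 31 can be sketched without Django. This is a minimal illustration, not the presenters' code: `poll_once`, `read_for_request`, and the dict-based `store` are hypothetical names. In a real project the body of `poll_once` would sit in a `manage.py poll_stuff` command run from cron, and the store would be a cache or database.

```python
import time
from typing import Any, Callable, Dict


def poll_once(fetch: Callable[[], Any], store: Dict[str, Any], key: str = "latest") -> bool:
    """Fetch upstream data once and save it; keep the old value on failure."""
    try:
        data = fetch()
    except Exception:
        # Upstream is down or slow: leave the last good value in place.
        return False
    store[key] = {"data": data, "fetched_at": time.time()}
    return True


def read_for_request(store: Dict[str, Any], key: str = "latest") -> Any:
    """What a view would do: read the stored value and never call upstream."""
    entry = store.get(key)
    return entry["data"] if entry else None
```

Because the request path only reads from the store, an upstream outage leaves visitors with stale data rather than an error.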
  • 32. Retrieval / Polling: Still not enough
  • 33. Retrieval / Polling: Rate limits, e.g. 500 requests / hour
  • 34. Retrieval / Rate Limits: Batched requests. http://api.twitter.com/1/users/lookup.json?screen_name=bolsterlabs,benslavin,skevy
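Batching as on slide 34 amounts to packing many identifiers into each request so that fewer requests count against the rate limit. A rough sketch follows. The URL is the one shown on the slide and is kept only for illustration; the helper name `batched_lookup_urls` and the default `batch_size` of 100 are assumptions, so check the upstream API's documented per-request maximum.

```python
from urllib.parse import urlencode

LOOKUP_URL = "http://api.twitter.com/1/users/lookup.json"  # URL as shown on the slide


def batched_lookup_urls(screen_names, batch_size=100):
    """Build one lookup URL per batch of screen names instead of one per name."""
    names = list(screen_names)
    urls = []
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        # Keep commas unescaped so the URL matches the comma-separated form on the slide.
        query = urlencode({"screen_name": ",".join(batch)}, safe=",")
        urls.append(f"{LOOKUP_URL}?{query}")
    return urls
```

With 250 names and a batch size of 100, this produces 3 requests instead of 250.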
  • 35. Retrieval / Rate Limits: Multiple clients. Use a pool of workers with different IPs and API keys.
  • 36. Retrieval / Rate Limits: Special access. Ask the upstream provider.
  • 37. Retrieval: Web hooks. No, you come to me.
  • 38. Retrieval / Web Hooks: Out of band. Asynchronous from the user’s perspective.
  • 39. Retrieval / Web Hooks: PubSubHubbub. Used by Gowalla, Myspace, Google.
  • 40. Retrieval / Web Hooks: The data comes to you. True ‘push’.
  • 41. Retrieval / Web Hooks: Just handle it. Class-based views or plain-old methods. It’s just Django w/ different auth.
  • 42. Retrieval / Web Hooks: Setup can be complex. Or worse, completely manual.
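Slide 41 does not say what the "different auth" looks like. One common scheme, used by PubSubHubbub's authenticated content distribution, is an `X-Hub-Signature` header of the form `sha1=<hexdigest>`, holding an HMAC of the raw request body keyed with a shared secret. The sketch below assumes that scheme. `signature_is_valid` is a hypothetical helper that a Django view could call with `request.body` before trusting the payload.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check a 'sha1=<hexdigest>' signature header against the raw request body."""
    scheme, _, received = header_value.partition("=")
    if scheme != "sha1" or not received:
        return False
    expected = hmac.new(secret, body, hashlib.sha1).hexdigest()
    # Compare as bytes in constant time to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```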
  • 43. Retrieval: Streaming. Long-lived, open-socket communication.
  • 44. Retrieval / Streaming: Live updates, but only when you’re connected.
  • 45. Retrieval / Streaming: Single client can be a significant bottleneck.
  • 46. Retrieval / Streaming: “Site Streams may deliver hundreds of messages per second to a client, and each stream may consume significant (> 1 Mbit/sec) bandwidth. Your processing of tweets should be asynchronous, with appropriate buffers in place to handle spikes of 3x normal throughput. Note that slow reading clients are automatically terminated.” https://dev.twitter.com/docs/streaming-api/site-streams
  • 47. Retrieval / Streaming: Hot potato. Pass data off as quickly as possible.
  • 48. Retrieval: STORE IT. LOG IT. SAVE IT. This data is ephemeral, and there may be no good way to recreate it once it’s gone.
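The "hot potato" advice on slide 47 can be illustrated with a reader that does nothing but move messages into a buffer, while a separate worker does the slow part. This is a simplified sketch using only the standard library. The function names and the `max_buffered` default are assumptions, and a production setup would more likely hand off to an external queue.

```python
import queue
import threading


def consume_stream(stream, buffer):
    """Reader: do no processing here, just hand each message to the buffer."""
    for message in stream:
        # put() blocks if the buffer is full, so size it for traffic spikes.
        buffer.put(message)
    buffer.put(None)  # sentinel: the stream has closed


def process_buffer(buffer, handle):
    """Worker: take messages off the buffer and do the slow work."""
    while True:
        message = buffer.get()
        if message is None:
            break
        handle(message)


def run(stream, handle, max_buffered=10000):
    """Read `stream` on the calling thread while a worker thread handles messages."""
    buffer = queue.Queue(maxsize=max_buffered)
    worker = threading.Thread(target=process_buffer, args=(buffer, handle))
    worker.start()
    consume_stream(stream, buffer)
    worker.join()
```

In line with slide 48, `handle` could begin by appending the raw message to a log before any further processing.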
  • 49. Processing
  • 50. Processing: Denormalization
  • 51. Processing / Denormalization: Your DB is slow.* (* Unless you know Frank Wiles.)
  • 52. Processing / Denormalization: db_index=True is not the answer
  • 53. Processing / Denormalization: Tweet.objects.filter(screenname="aplusk").count()
  • 54. Processing / Denormalization: TweetCount.objects.get(screenname="aplusk").tweet_count. Also consider memcached, Redis, etc.
  • 55. Processing / Denormalization: pre_save, post_save, post_delete and F objects. Use these.
  • 56. Processing / Denormalization: Be careful. These only work in Django.
  • 57. Processing: Workers
  • 58. Processing / Workers: Deconstruct the problem
  • 59. Check for profanity, then retrieve an avatar, then geo-locate the author, then add as input for trending terms, then retrieve author’s social graph, then adjust the leaderboard.
  • 60. Retrieve an avatar + Check for profanity + Geo-locate the author + Add as input for trending terms, then Retrieve author’s social graph + Adjust the leaderboard
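Slides 59 and 60 contrast a strictly sequential pipeline with one where independent steps run side by side. The exact grouping on slide 60 is hard to recover from the transcript, so the sketch below only shows the general shape: a set of steps that do not depend on each other run concurrently, followed by steps that need their results. `process_item` and the step dictionaries are hypothetical. Threads stand in for real workers here.

```python
from concurrent.futures import ThreadPoolExecutor


def process_item(item, independent_steps, dependent_steps, max_workers=4):
    """Run independent steps concurrently, then the steps that need their results."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every independent step at once; none of them waits on another.
        futures = {name: pool.submit(step, item) for name, step in independent_steps.items()}
        for name, future in futures.items():
            results[name] = future.result()
    # Dependent steps run afterwards and can read the earlier results.
    for name, step in dependent_steps.items():
        results[name] = step(item, results)
    return results
```

The first stage then takes roughly as long as its slowest step rather than the sum of all of them.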
  • 61. Processing / Workers: django-celery. Or manage yourself with any queue.
  • 62. Processing: map + reduce. It’s not that scary.
  • 63. Processing / map + reduce: Generational. Get data. Process. Cache results.
  • 64. Processing / map + reduce / Generational: Good for many problems. Especially where the intermediate working set is large.
  • 65. Processing / map + reduce / Generational: Solutions exist. CouchDB, Mongo, Hadoop.
  • 66. Processing / map + reduce: Sometimes we can be smarter
  • 67. Processing / map + reduce / incremental: Consider averages. I mean the mean.
  • 68. Processing / map + reduce / incremental: $\frac{1}{n} \sum_{i=1}^{n} a_i$
  • 69. Processing / map + reduce / incremental: $\frac{1}{n} \sum_{i=1}^{n} a_i = \frac{1}{n} \left( \sum_{i=1}^{n-1} a_i + a_n \right)$. From O(n) to O(1).
  • 70. Processing / map + reduce / incremental: This example was trivial, but you can often store a partial solution.
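The incremental mean from slides 68 to 70 can be written as a small class that stores the partial solution (the running sum and the count) and updates it in constant time per new value. This is an illustrative sketch; `RunningMean` is a hypothetical name, and in practice the two stored numbers could live in a model row or a cache entry.

```python
class RunningMean:
    """Keep a running sum and count so each new value updates the mean in O(1)."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        """Fold one new value into the partial solution and return the new mean."""
        self.count += 1
        self.total += value
        return self.mean

    @property
    def mean(self):
        if self.count == 0:
            raise ValueError("mean of no values")
        return self.total / self.count
```

Recomputing the mean from scratch would touch every stored value; here each update touches only two numbers.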
  • 71. Presentation (Of the data, not the thing we’re doing now.)
  • 72. Presentation: Partial Caching
  • 73. Presentation / Partial Caching: {% cache 500 my_stuff %} Template fragment caching
  • 74. Presentation / Partial Caching: class MyModel(models.Model): as_html = models.TextField()
  • 75. Presentation / Partial Caching: serialized = json.dumps(my_stuff); cache.set('my_stuff', serialized). Don’t be afraid of low-level caching.
  • 76. Presentation: Continuous Caching
  • 77. Presentation / Continuous Caching: while True: cache_page()
  • 78. Presentation / Continuous Caching: Out-of-band caching. Works when the number of pages is relatively small. Similar to proxy_cache, but more resilient.
  • 79. Presentation / Continuous Caching: [Watch this space.]
  • 80. Presentation: Real-Time Updates
  • 81. Presentation / Real-Time Updates: gevent, eventlet, tornado, twisted
  • 82. Presentation / Real-Time Updates: Django plays well with others
  • 83. Presentation / Real-Time Updates: Django + RabbitMQ + node.js + socket.io
  • 84. Presentation / Real-Time Updates: [Watch this space.]
  • 85. Presentation: Failure Models
  • 86. Presentation / Failure Models: This isn’t good for anyone.
  • 87. Presentation / Failure Models: proxy_cache_use_stale and proxy_next_upstream. For use with nginx.
  • 88. Presentation / Failure Models: Build a small backup app. It can serve pre-cached content. Anything is better than a 404, 500 or 502 (usually).
  • 89. Follow @bolsterlabs for slides and lively discussion.
  • 90. Thank you.
  • 91. Don’t be a stranger. Ben Slavin (@benslavin, ben@bolsterlabs.com), Adam Miskiewicz (@skevy, adam@bolsterlabs.com), @bolsterlabs