Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Building Event-Based Systems for the Real-Time Web

2,565 views

Published on

Slides from the talk I gave at Web 2.0 Expo SF 2010.

Published in: Technology

Building Event-Based Systems for the Real-Time Web

  1. 1. Building Event Based Systems for the Real- Time Web Paul Dix http://pauldix.net @pauldix
  2. 2. joiseyshowaa (http://www.flickr.com/photos/30201239@N00/3916914747/)
  3. 3. co-founder and CTO
  4. 4. The Talk
  5. 5. Event Based: data update => do something
  6. 6. Batch: need data => calculate
  7. 7. Real-Time: data update => notification
  8. 8. Scheduled: occasionally => do stuff
  9. 9. Event Based == Real Time
  10. 10. Monolithic American Backroom (http://www.flickr.com/photos/41922098@N03/4247207167/)
  11. 11. Loosely Coupled alles-schlumpf (http://www.flickr.com/photos/29487767@N02/2855271953/)
  12. 12. Scale 'Playingwithbrushes' (http://www.flickr.com/photos/82518118@N00/2280744328/)
  13. 13. Scale with Complexity PhOtOnQuAnTiQuE (http://www.flickr.com/photos/67968452@N00/1876685709/)
  14. 14. Scale with Team Size alexkess (http://www.flickr.com/photos/34838158@N00/3370167184/)
  15. 15. Definition by Example
  16. 16. SQL Databases
  17. 17. events: insert, update, delete execution: before, after, instead of Event Based: triggers
  18. 18. Batch: count, sum, min, max
  19. 19. Real-Time? counter cache
  20. 20. Scheduled? cron, at, schtasks
  21. 21. Broken nickwheeleroz (http://www.flickr.com/photos/nickwheeleroz/2474196275/in/photostream/)
  22. 22. Self Contained
  23. 23. Monolithic
  24. 24. Bad! Road Fun (http://www.flickr.com/photos/21849473@N06/2550339131/)
  25. 25. Doesn’t Scale Darwin Bell (http://www.flickr.com/photos/53611153@N00/465459020/)
  26. 26. Distributed Systems cygnoir (http://www.flickr.com/photos/35034356212@N01/163671482/)
  27. 27. Event Based: messaging, publish subscribe, amqp
  28. 28. Batch: MapReduce, Hadoop
  29. 29. Real-Time: messaging, publish/ subscribe
  30. 30. Scheduled: delayed replication, data backups
  31. 31. The Web cygnoir (http://www.flickr.com/photos/35034356212@N01/163671482/)
  32. 32. Event Based: web hooks, pubsubhubub, rsscloud
  33. 33. Batch: polling
  34. 34. Real-Time: web hooks, web sockets, pubsubhubbub, rsscloud
  35. 35. Scheduled: check every 1 hour
  36. 36. Distributed (internal) Systems
  37. 37. Service Calls
  38. 38. Feed Reader (on fetch)
  39. 39. Feed Reader (on fetch) • Update users reading list
  40. 40. Feed Reader (on fetch) • Update users reading list • Language Identification
  41. 41. Feed Reader (on fetch) • Update users reading list • Language Identification • Named Entity Extraction
  42. 42. Feed Reader (on fetch) • Update users reading list • Language Identification • Named Entity Extraction • Cluster with other Articles
  43. 43. Feed Reader (on fetch) • Update users reading list • Language Identification • Named Entity Extraction • Cluster with other Articles • Identify Trending Entities
  44. 44. Tightly Coupled Steve aka Crispin Swan (http://www.flickr.com/photos/26811962@N05/3704553536/)
  45. 45. Alejandro Groenewold (http://www.flickr.com/photos/17618485@N07/3294535473/)
  46. 46. Publish/Subscribe (pubsub)
  47. 47. RabbitMQ
  48. 48. AMQP
  49. 49. Exchanges
  50. 50. Queues
  51. 51. Bindings
  52. 52. Routing Key
  53. 53. Exchange Type
  54. 54. Topic Exchange
  55. 55. <token>.<token>...
  56. 56. # - 0 or more wildcard
  57. 57. * - 1 wildcard
  58. 58. Include Data in the Message
  59. 59. Data is the API
  60. 60. Another Example: error logging
  61. 61. routing key: domU-12-31-39-07.feed_fetcher
  62. 62. binding: *.feed_fetcher
  63. 63. binding: #
  64. 64. Loosely Coupled ° ρЯίтΛм ° (http://www.flickr.com/photos/39046851@N08/3829500850/)
  65. 65. Programmable Web 2.0
  66. 66. Event-Based
  67. 67. Real-Time
  68. 68. Web Hooks
  69. 69. GitHub Post-Receive Hooks
  70. 70. git push triggers hooks
  71. 71. POST to Post-Recieve URLs
  72. 72. { "before": "5aef35982fb2d34e9d9d4502f6ede1072793222d", "repository": { "url": "http://github.com/defunkt/github", "name": "github", "description": "You're lookin' at it.", "watchers": 5, "forks": 2, "private": 1, "owner": { "email": "chris@ozmm.org", "name": "defunkt" } }, "commits": [ { "id": "41a212ee83ca127e3c8cf465891ab7216a705f59", "url": "http://github.com/defunkt/github/commit/41a212ee83ca127e3 "author": { "email": "chris@ozmm.org", "name": "Chris Wanstrath" }, "message": "okay i give in", "timestamp": "2008-02-15T14:57:17-08:00",
  73. 73. Data is the API
  74. 74. Continuous Integration
  75. 75. Bug Tracking
  76. 76. Campfire
  77. 77. Continuous Deployment
  78. 78. PubSubHubbub
  79. 79. RSS or Atom Feed <link rel=”hub” href=”http://pubsubhubbub.appspot.com/”>
  80. 80. Subscriber tells Hub
  81. 81. Hub verifies with Subscriber
  82. 82. Publisher notifies Hub
  83. 83. Hub gets from Publisher
  84. 84. Hub sends to Subscribers
  85. 85. Data is the API
  86. 86. OAuth + Web Hooks
  87. 87. Concerns
  88. 88. don’t make web-hook callbacks while a user is waiting Responsive
  89. 89. what if the subscriber doesn’t respond? retry logic notifications Retries
  90. 90. use OAuth. SSL for sensitive data Security
  91. 91. Possibilities
  92. 92. Real-Time
  93. 93. Event Based
  94. 94. Communication
  95. 95. Are these really real- time?
  96. 96. Data Consistency
  97. 97. in brewer’s CAP theorem he talked about the relationship bet ween three requirements when building distributed systems. consistency, availability, and partition tolerance. Eric Brewer’s CAP theorem
  98. 98. consistency means that an operation either works completely or fails. this is also referred to as atomic. Bug tracker example. consistency
  99. 99. availability is pretty self explanatory. a service is available to ser ve requests. Twitter example availability
  100. 100. when you replicate data across multiple systems, you create the possibility of forming a partition. this happens when one or more systems lose connectivity to other systems. partition tolerance is defined formally as “no set of failures less than total net work failure is allowed to cause the system to respond incorrectly” partition tolerance
  101. 101. pick two
  102. 102. Can have all three “is a special form of weak consistency. if no new updates are made to an object, eventually all accesses will return the last updated value.” Werner Vogels’ eventual consistency
  103. 103. Design Around Eventual Consistency
  104. 104. In Closing...
  105. 105. Event-Based == Real Time
  106. 106. Pubsub
  107. 107. Loosely Coupled
  108. 108. Expose Data
  109. 109. Third Party Programmers
  110. 110. Extend and Improve
  111. 111. many heads are better... shoothead (http://www.flickr.com/photos/66621443@N00/519240547/)
  112. 112. than one. John Spooner (http://www.flickr.com/photos/29809546@N00/315157691/)
  113. 113. Questions? Paul Dix http://pauldix.net @pauldix

×