Building Event Based
Systems for the Real-
     Time Web
          Paul Dix
      http://pauldix.net
          @pauldix
joiseyshowaa (http://www.flickr.com/photos/30201239@N00/3916914747/)
co-founder and CTO
The Talk
Event Based:
data update => do something
Batch:
need data => calculate
Real-Time:
data update => notification
Scheduled:
occasionally => do stuff
Event Based == Real Time
Monolithic




  American Backroom (http://www.flickr.com/photos/41922098@N03/4247207167/)
Loosely
Coupled




 alles-schlumpf (http://www.flickr.com/photos/29487767@N02/2855271953/)
Scale




 'Playingwithbrushes' (http://www.flickr.com/photos/82518118@N00/2280744328/)
Scale with Complexity

PhOtOnQuAnTiQuE (http://www.flickr.com/photos/67968452@N00/1876685709/)
Scale with Team Size
alexkess (http://www.flickr.com/photos/34838158@N00/3370167184/)
Definition by Example
SQL Databases
events:
insert, update, delete

execution:
before, after, instead of




                            Event Based:
        ...
Batch:
count, sum, min, max
Real-Time?
counter cache
Scheduled?
cron, at, schtasks
Broken




nickwheeleroz (http://www.flickr.com/photos/nickwheeleroz/2474196275/in/photostream/)
Self Contained
Monolithic
Bad!




Road Fun (http://www.flickr.com/photos/21849473@N06/2550339131/)
Doesn’t
 Scale




Darwin Bell (http://www.flickr.com/photos/53611153@N00/465459020/)
Distributed Systems
cygnoir (http://www.flickr.com/photos/35034356212@N01/163671482/)
Event Based:
messaging, publish
 subscribe, amqp
Batch:
MapReduce, Hadoop
Real-Time:
messaging, publish/
    subscribe
Scheduled:
delayed replication, data
        backups
The Web
cygnoir (http://www.flickr.com/photos/35034356212@N01/163671482/)
Event Based:
     web hooks,
pubsubhubub, rsscloud
Batch:
polling
Real-Time:
   web hooks, web
sockets, pubsubhubbub,
       rsscloud
Scheduled:
check every 1 hour
Distributed (internal)
       Systems
Service Calls
Feed Reader (on fetch)
Feed Reader (on fetch)


• Update users reading list
Feed Reader (on fetch)


• Update users reading list
• Language Identification
Feed Reader (on fetch)

• Update users reading list
• Language Identification
• Named Entity Extraction
Feed Reader (on fetch)

• Update users reading list
• Language Identification
• Named Entity Extraction
• Cluster with othe...
Feed Reader (on fetch)

• Update users reading list
• Language Identification
• Named Entity Extraction
• Cluster with othe...
Tightly
Coupled




Steve aka Crispin Swan (http://www.flickr.com/photos/26811962@N05/3704553536/)
Alejandro Groenewold (http://www.flickr.com/photos/17618485@N07/3294535473/)
Publish/Subscribe
    (pubsub)
RabbitMQ
AMQP
Exchanges
Queues
Bindings
Routing Key
Exchange Type
Topic Exchange
<token>.<token>...
# - 0 or more wildcard
* - 1 wildcard
Include Data in the
      Message
Data is the API
Another Example:
  error logging
routing key:
domU-12-31-39-07.feed_fetcher
binding:
*.feed_fetcher
binding:
   #
Loosely
                                     Coupled




° ρЯίтΛм ° (http://www.flickr.com/photos/39046851@N08/3829500850/)
Programmable Web 2.0
Event-Based
Real-Time
Web Hooks
GitHub
Post-Receive Hooks
git push triggers hooks
POST to Post-Recieve
       URLs
{
    "before": "5aef35982fb2d34e9d9d4502f6ede1072793222d",
    "repository": {
       "url": "http://github.com/defunkt/g...
Data is the API
Continuous Integration
Bug Tracking
Campfire
Continuous
Deployment
PubSubHubbub
RSS or Atom Feed
<link rel=”hub” href=”http://pubsubhubbub.appspot.com/”>
Subscriber tells Hub
Hub verifies with
  Subscriber
Publisher notifies Hub
Hub gets from
  Publisher
Hub sends to
 Subscribers
Data is the API
OAuth + Web Hooks
Concerns
don’t make web-hook
callbacks while a user is
waiting




                            Responsive
what if the subscriber
doesn’t respond?

retry logic
notifications




                         Retries
use OAuth.
SSL for sensitive data




                         Security
Possibilities
Real-Time
Event Based
Communication
Are these really real-
       time?
Data Consistency
in brewer’s CAP theorem he talked
about the relationship bet ween three
requirements when building
distributed systems. co...
consistency means that an
operation either works completely
or fails. this is also referred to as
atomic.

Bug tracker exa...
availability is pretty self
explanatory. a service is
available to ser ve requests.

Twitter example




                 ...
when you replicate data across multiple
systems, you create the possibility of forming
a partition. this happens when one ...
pick two
Can have all three

“is a special form of weak
consistency. if no new updates are
made to an object, eventually all
access...
Design Around Eventual
     Consistency
In Closing...
Event-Based == Real Time
Pubsub
Loosely Coupled
Expose Data
Third Party
Programmers
Extend and Improve
many heads are better...




         shoothead (http://www.flickr.com/photos/66621443@N00/519240547/)
than one.




            John Spooner (http://www.flickr.com/photos/29809546@N00/315157691/)
Questions?
     Paul Dix
 http://pauldix.net
     @pauldix
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Building Event-Based Systems for the Real-Time Web
Upcoming SlideShare
Loading in...5
×

Building Event-Based Systems for the Real-Time Web

2,265
-1

Published on

Slides from the talk I gave at Web 2.0 Expo SF 2010.

Published in: Technology
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
2,265
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
53
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide


























































































































  • Building Event-Based Systems for the Real-Time Web

    1. 1. Building Event Based Systems for the Real- Time Web Paul Dix http://pauldix.net @pauldix
    2. 2. joiseyshowaa (http://www.flickr.com/photos/30201239@N00/3916914747/)
    3. 3. co-founder and CTO
    4. 4. The Talk
    5. 5. Event Based: data update => do something
    6. 6. Batch: need data => calculate
    7. 7. Real-Time: data update => notification
    8. 8. Scheduled: occasionally => do stuff
    9. 9. Event Based == Real Time
    10. 10. Monolithic American Backroom (http://www.flickr.com/photos/41922098@N03/4247207167/)
    11. 11. Loosely Coupled alles-schlumpf (http://www.flickr.com/photos/29487767@N02/2855271953/)
    12. 12. Scale 'Playingwithbrushes' (http://www.flickr.com/photos/82518118@N00/2280744328/)
    13. 13. Scale with Complexity PhOtOnQuAnTiQuE (http://www.flickr.com/photos/67968452@N00/1876685709/)
    14. 14. Scale with Team Size alexkess (http://www.flickr.com/photos/34838158@N00/3370167184/)
    15. 15. Definition by Example
    16. 16. SQL Databases
    17. 17. events: insert, update, delete execution: before, after, instead of Event Based: triggers
    18. 18. Batch: count, sum, min, max
    19. 19. Real-Time? counter cache
    20. 20. Scheduled? cron, at, schtasks
    21. 21. Broken nickwheeleroz (http://www.flickr.com/photos/nickwheeleroz/2474196275/in/photostream/)
    22. 22. Self Contained
    23. 23. Monolithic
    24. 24. Bad! Road Fun (http://www.flickr.com/photos/21849473@N06/2550339131/)
    25. 25. Doesn’t Scale Darwin Bell (http://www.flickr.com/photos/53611153@N00/465459020/)
    26. 26. Distributed Systems cygnoir (http://www.flickr.com/photos/35034356212@N01/163671482/)
    27. 27. Event Based: messaging, publish subscribe, amqp
    28. 28. Batch: MapReduce, Hadoop
    29. 29. Real-Time: messaging, publish/ subscribe
    30. 30. Scheduled: delayed replication, data backups
    31. 31. The Web cygnoir (http://www.flickr.com/photos/35034356212@N01/163671482/)
    32. 32. Event Based: web hooks, pubsubhubub, rsscloud
    33. 33. Batch: polling
    34. 34. Real-Time: web hooks, web sockets, pubsubhubbub, rsscloud
    35. 35. Scheduled: check every 1 hour
    36. 36. Distributed (internal) Systems
    37. 37. Service Calls
    38. 38. Feed Reader (on fetch)
    39. 39. Feed Reader (on fetch) • Update users reading list
    40. 40. Feed Reader (on fetch) • Update users reading list • Language Identification
    41. 41. Feed Reader (on fetch) • Update users reading list • Language Identification • Named Entity Extraction
    42. 42. Feed Reader (on fetch) • Update users reading list • Language Identification • Named Entity Extraction • Cluster with other Articles
    43. 43. Feed Reader (on fetch) • Update users reading list • Language Identification • Named Entity Extraction • Cluster with other Articles • Identify Trending Entities
    44. 44. Tightly Coupled Steve aka Crispin Swan (http://www.flickr.com/photos/26811962@N05/3704553536/)
    45. 45. Alejandro Groenewold (http://www.flickr.com/photos/17618485@N07/3294535473/)
    46. 46. Publish/Subscribe (pubsub)
    47. 47. RabbitMQ
    48. 48. AMQP
    49. 49. Exchanges
    50. 50. Queues
    51. 51. Bindings
    52. 52. Routing Key
    53. 53. Exchange Type
    54. 54. Topic Exchange
    55. 55. <token>.<token>...
    56. 56. # - 0 or more wildcard
    57. 57. * - 1 wildcard
    58. 58. Include Data in the Message
    59. 59. Data is the API
    60. 60. Another Example: error logging
    61. 61. routing key: domU-12-31-39-07.feed_fetcher
    62. 62. binding: *.feed_fetcher
    63. 63. binding: #
    64. 64. Loosely Coupled ° ρЯίтΛм ° (http://www.flickr.com/photos/39046851@N08/3829500850/)
    65. 65. Programmable Web 2.0
    66. 66. Event-Based
    67. 67. Real-Time
    68. 68. Web Hooks
    69. 69. GitHub Post-Receive Hooks
    70. 70. git push triggers hooks
    71. 71. POST to Post-Recieve URLs
    72. 72. { "before": "5aef35982fb2d34e9d9d4502f6ede1072793222d", "repository": { "url": "http://github.com/defunkt/github", "name": "github", "description": "You're lookin' at it.", "watchers": 5, "forks": 2, "private": 1, "owner": { "email": "chris@ozmm.org", "name": "defunkt" } }, "commits": [ { "id": "41a212ee83ca127e3c8cf465891ab7216a705f59", "url": "http://github.com/defunkt/github/commit/41a212ee83ca127e3 "author": { "email": "chris@ozmm.org", "name": "Chris Wanstrath" }, "message": "okay i give in", "timestamp": "2008-02-15T14:57:17-08:00",
    73. 73. Data is the API
    74. 74. Continuous Integration
    75. 75. Bug Tracking
    76. 76. Campfire
    77. 77. Continuous Deployment
    78. 78. PubSubHubbub
    79. 79. RSS or Atom Feed <link rel=”hub” href=”http://pubsubhubbub.appspot.com/”>
    80. 80. Subscriber tells Hub
    81. 81. Hub verifies with Subscriber
    82. 82. Publisher notifies Hub
    83. 83. Hub gets from Publisher
    84. 84. Hub sends to Subscribers
    85. 85. Data is the API
    86. 86. OAuth + Web Hooks
    87. 87. Concerns
    88. 88. don’t make web-hook callbacks while a user is waiting Responsive
    89. 89. what if the subscriber doesn’t respond? retry logic notifications Retries
    90. 90. use OAuth. SSL for sensitive data Security
    91. 91. Possibilities
    92. 92. Real-Time
    93. 93. Event Based
    94. 94. Communication
    95. 95. Are these really real- time?
    96. 96. Data Consistency
    97. 97. in brewer’s CAP theorem he talked about the relationship bet ween three requirements when building distributed systems. consistency, availability, and partition tolerance. Eric Brewer’s CAP theorem
    98. 98. consistency means that an operation either works completely or fails. this is also referred to as atomic. Bug tracker example. consistency
    99. 99. availability is pretty self explanatory. a service is available to ser ve requests. Twitter example availability
    100. 100. when you replicate data across multiple systems, you create the possibility of forming a partition. this happens when one or more systems lose connectivity to other systems. partition tolerance is defined formally as “no set of failures less than total net work failure is allowed to cause the system to respond incorrectly” partition tolerance
    101. 101. pick two
    102. 102. Can have all three “is a special form of weak consistency. if no new updates are made to an object, eventually all accesses will return the last updated value.” Werner Vogels’ eventual consistency
    103. 103. Design Around Eventual Consistency
    104. 104. In Closing...
    105. 105. Event-Based == Real Time
    106. 106. Pubsub
    107. 107. Loosely Coupled
    108. 108. Expose Data
    109. 109. Third Party Programmers
    110. 110. Extend and Improve
    111. 111. many heads are better... shoothead (http://www.flickr.com/photos/66621443@N00/519240547/)
    112. 112. than one. John Spooner (http://www.flickr.com/photos/29809546@N00/315157691/)
    113. 113. Questions? Paul Dix http://pauldix.net @pauldix
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×