Webhooks, Asynchronous Web Applications and Push Notifications


Milwaukee ALT.NET Presentation on Webhooks, Asynchronous Web Applications and Push Notifications utilizing PubSubHubbub

Presented on 02/20/2012

Published in: Technology, Business

  2. 2. I am the Application Architect at Montage Talent. My responsibilities at Montage primarily include designing the next generation of the Montage video interviewing solution. When I began with Montage, the service was a simple web application with a single user interface. Since joining Montage I have led the effort to create a service-oriented application that allows clients to integrate our services using a web API and allows Montage to create additional web-based user interfaces, including interfaces intended for PCs and mobile devices. I have 10 years of professional development experience, 6 of which were in consulting. Much of my experience has been in enterprise environments, and that experience is what I have taken with me to Montage as I continue to help us build out our service-oriented architecture. Lastly, I love my job because I get to help people solve problems, and solving a specific problem is the basis for this presentation.
  3. 3. Montage Talent is The Leader in Video Interviewing Technology. The service that Montage provides is the Montage Network, a video interviewing solution sold as two applications: Montage Interview and Montage View. Montage Interview is the application that we are using to live stream and record this presentation. It is a live interviewing platform that allows up to 16 live cameras to be connected in one virtual interview room. Montage View is our recorded interview application. If you are a candidate, a recruiter may respond to your job inquiry by asking you to complete a Montage View. During this process you are prompted to answer a series of questions using your webcam, and your answers are then available for the hiring company to review. Montage does more than just provide the technology to power video interviews. Our solution allows our clients to elevate their brands by demonstrating to their candidates that they understand the needs of today’s candidate and use the latest technology to make the recruiting process as quick and seamless for the candidate as possible. Our solution also enhances the candidate experience by allowing our clients to reach out to candidates and by making it easy for candidates to apply for a job in any location around the globe. It is also highly beneficial to candidates who can’t express all of their qualifications in a traditional resume. By using video, candidates are able to demonstrate soft skills that may not translate well to a written document.
  5. 5. Here is an overview of what tonight’s presentation will entail. First I want to present my intent for speaking with you tonight and why I thought this would be a useful topic to share. Then I will present the material in a problem/solution format, discussing what caused me to stumble upon this solution and how I implemented it. Lastly, there should be time at the end for any questions or for discussion of how the demonstrated patterns and protocols might be useful in your environment.
  6. 6. I am going to work through this presentation by first introducing the problem space that we faced in our business model, which led us to implement the technical solution that my presentation is based on. The intent of my presentation is not to tell you “this is how you should solve this problem,” but rather to use the problem space and my solution as a vehicle for a larger discussion around “the real-time web,” push notifications/events and webhooks. I have found the problem to be very common, but I had difficulty finding a common pattern to solve it. By the end of my presentation I hope that you are able to identify this problem in your environment and use the concepts presented and discussed during this session to solve it.
  7. 7. While I was designing the new API services to be implemented at Montage, I quickly realized that Montage was unlike most other services I had worked with before. The workflow that we needed to support through our services is one that may take days to complete. Unlike the services I was used to working with, we are unable to give our API consumers immediate results in some cases. The ultimate result of a request may come back to the consumer days later. During that time we can keep our consumers up to date on the status of the request, but it’s a far cry from immediate results. Thus, we needed to develop a way to send status updates and the result to our consumers over this extended period of time, and herein lies the problem. Our services are exclusively RESTful, so implementing any sort of proprietary protocol or messaging was out of the question, especially since we don’t control our consumers. If this were just an internal service-to-service problem, any off-the-shelf messaging solution could have worked. But our messaging needed to support HTTP/S at a bare minimum. As a nice-to-have, if the messaging solution could also support other delivery protocols like SMTP, SMS, etc., that would be a plus. Lastly, we needed to be sure that the solution we developed was reliable, or else our consumers would never know the status of their requests. This meant we had to either a) guarantee delivery or b) provide a way to query for status updates.
  8. 8. I had in my mind that the solution I was looking for was essentially a post office. I wanted a service where I could drop off a message indicating the delivery instructions, the payload and the receiver. This service would then do its best to deliver the message and, if the receiver could not be reached, would fail and notify me of the failure for reconciliation. Some of the solutions I looked into were various ESBs and message queues, including BizTalk and RabbitMQ. However, at the time I wasn’t able to find a solution that could deliver messages reliably over HTTP/S, so I started poking around looking for anything that would point me to a common solution to this problem. I even asked a question about this on StackOverflow, thinking that the thousands of developers out there would be able to point me toward a solution, but all I got were crickets. As you can see from the screenshot, I ultimately answered my own question.
  9. 9. What I eventually found were WebHooks. Unfortunately, the term WebHook is only barely widespread enough to be discoverable with the search terms I was using when I didn’t yet know what I was looking for. The term WebHook was coined by Jeff Lindsay, and he has a number of presentations available discussing webhooks. Essentially, a WebHook is an HTTP POST used to notify a subscriber that an event has occurred. It represents one way to begin enabling the real-time web. If a service does not offer WebHooks or some other type of notification service, consumers are forced to constantly poll the service looking for updates, which is inefficient for both the publisher and the subscriber. Not only is it inefficient, it also isn’t real-time: it’s only as timely as your polling interval. Since WebHooks represent data that is pushed from a publisher to a subscriber, the subscriber receives data efficiently and in real time. But what about RSS/ATOM, some might say? With a feed I can subscribe to and consume data from a publisher, and those specifications are widely implemented. The problem with feeds is that the publisher doesn’t send the feed data to a subscriber. The publisher simply publishes the data to a well-known location, and anyone interested in consuming it has to come fetch it. Thus, feed readers have to implement polling to fetch the data. Again, this means that the reader is inefficient and never gets the data immediately after it has been made available.
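To make "a WebHook is an HTTP POST" concrete, here is a minimal sketch of a publisher building and sending one. The callback URL, the JSON payload shape and the function names are all assumptions made for illustration; they are not taken from any particular service.

```python
import json
import urllib.request


def build_webhook_request(callback_url: str, event: dict) -> urllib.request.Request:
    """Build the HTTP POST that notifies a subscriber of an event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        callback_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def deliver(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the notification and return the subscriber's HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical subscriber URL and event payload.
request = build_webhook_request(
    "https://subscriber.example.com/hooks/interviews",
    {"type": "interview.created", "id": "42"},
)
```

The subscriber does nothing until the POST arrives, which is the difference from polling a feed on a timer.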
  10. 10. Facebook’s real-time API uses webhooks and is loosely based on PubSubHubbub, the protocol that I will dive into in a bit. Stripe has just released a newly updated events API. Since their API is strictly JSON, they have a different implementation than the one I am presenting today, but the concept is the same. Twilio uses WebHooks to notify applications of incoming SMS messages. GitHub uses WebHooks to notify other services when a git push occurs.
  11. 11. Just recently the term “WebHook” seems to be starting to catch on. While the concept isn’t really new, the usage of the term is. Also, there is no standard implementation of WebHooks. There are quite a few implementations out there, but there is no standard pattern. There have been a few attempts to create a standard. The webhooks.org wiki makes an effort at creating a specification for a RESTful implementation; however, it simply shows examples of how this can be done and is hardly a specification. One specification comes from the XMPP Standards Foundation, which maintains standards for use in IM/real-time messaging scenarios. One of these specs is XMPP PubSub. XMPP maintains “pure” standards. Implementing XMPP PubSub yourself wouldn’t be a good use of most developers’ time, since you would spend a lot of it just implementing the protocol. Yes, finding a library would ease the pain, but then you have the other side of the fence: your consumers. Not only would you have to implement the protocol, but so would your consumers. That’s where PubSubHubbub comes into play. PuSH is designed to be a “pragmatic” protocol, one that is much lighter weight and easier to implement than XMPP but, as a result, isn’t as well defined on all fronts, which we will see shortly.
  13. 13. PuSH was developed by two Google engineers, Brett Slatkin and Brad Fitzpatrick. Brad is best known for creating both LiveJournal and memcached. PuSH is a simple-to-use, simple-to-implement, topic-based publish/subscribe protocol based on ATOM/RSS. The goal of PuSH is to convert ATOM/RSS feeds into real-time data by eliminating the traditional polling that occurs to consume most feeds. While the protocol itself isn’t RESTful in terms of its self-subscribe/unsubscribe specification, publishing is HTTP based. PuSH is simple. It consists of 3 participants: Publisher, Subscriber and Hub. Together, these 3 participants can be combined to create a real-time messaging system that communicates over nothing but HTTP/S.
  14. 14. The Subscription Flow link in this slide points to the PubSubHubbub project site, which has a slide deck describing how the subscription process works. The XML snippet shows an example of what the <link /> node looks like in an ATOM feed that supports PuSH. To subscribe, a subscriber makes an HTTP POST request to the URL provided in the hub link node. The hub then verifies the subscriber and returns an HTTP status code that indicates whether the subscription was successful. Hubs can verify subscriptions either synchronously or asynchronously. If the hub is using synchronous verification, it will return a 204 “No Content.” If asynchronous, it will return a 202 “Accepted.” In this case, the hub will do what it needs to verify the subscription and will send an HTTP GET request to the subscriber to confirm that the subscription has been accepted. If a subscription request is denied, an appropriate code in the 4xx-5xx range is returned. Callback authorization uses a challenge key to authorize the subscription. Subscribers can renew their subscription at any time by re-subscribing.
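The subscription flow can be sketched from the subscriber's side as three small steps: find the hub link in the feed, build the form body for the subscription POST, and answer the hub's verification GET by echoing the challenge. The `hub.*` parameter names follow the PubSubHubbub 0.3 spec as I understand it. The sample feed, URLs and function names are hypothetical, and no network calls are made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlencode

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical ATOM feed that advertises its hub.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example notifications</title>
  <link rel="hub" href="https://hub.example.com/"/>
  <link rel="self" href="https://api.example.com/feeds/interviews"/>
</feed>"""


def discover(feed_xml: str) -> tuple:
    """Return (hub_url, topic_url) taken from the feed's <link> nodes."""
    root = ET.fromstring(feed_xml)
    links = {link.get("rel"): link.get("href") for link in root.findall(ATOM_NS + "link")}
    return links["hub"], links["self"]


def subscription_form(callback: str, topic: str, verify: str = "sync") -> str:
    """Form-encoded body the subscriber POSTs to the hub URL."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.callback": callback,
        "hub.topic": topic,
        "hub.verify": verify,  # "sync" or "async"
    })


def handle_verification(query_string: str, expected_topic: str) -> tuple:
    """Answer the hub's verification GET: echo the challenge or refuse."""
    params = parse_qs(query_string)
    mode = params.get("hub.mode", [""])[0]
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]
    if mode == "subscribe" and topic == expected_topic and challenge:
        return 200, challenge  # body must be exactly the challenge
    return 404, ""  # we never asked for this subscription
```

Echoing the challenge only for topics the subscriber actually requested is what stops a third party from subscribing someone else's callback.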
  15. 15. PuSH is designed so that all complexity exists in the hub. All a publisher has to provide is a link in their feed to their hub, and then they need to negotiate with their hub on how they will notify it that they have updated content. All the rest of the work is put on the hub. The hub is responsible for:
       • accepting and verifying subscriptions
       • managing the subscriptions
       • handling update pings from the publishers
       • extracting new and updated data from the publisher’s feed
       • sending subscribers their new content
       • DoS protections
     Publishers may be their own hub, or they can use a commercially available hub. The protocol allows publishers and hubs to negotiate how they are going to communicate. This means they get to choose the protocols and data formats they will use internally. As a matter of fact, Montage has taken some liberties with this flexibility in the protocol, which we will get into in a bit.
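A toy in-memory hub shows how several of the responsibilities listed above fit together: keeping subscriptions, reacting to an update ping, extracting only the new entries and sending them on. This is a sketch under stated assumptions, not a real hub. The `fetch` and `deliver` callables, the entry shape and the class name are hypothetical, and verification, persistence and DoS protection are left out.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Entry = Dict[str, str]  # assumed shape: at least an "id" key


class ToyHub:
    def __init__(self, fetch: Callable[[str], List[Entry]],
                 deliver: Callable[[str, Entry], None]) -> None:
        self._fetch = fetch        # pulls the publisher's feed for a topic
        self._deliver = deliver    # POSTs one entry to one callback
        self._subscribers = defaultdict(set)  # topic -> callback URLs
        self._seen = defaultdict(set)         # topic -> delivered entry ids

    def subscribe(self, topic: str, callback: str) -> None:
        self._subscribers[topic].add(callback)

    def handle_ping(self, topic: str) -> int:
        """Fetch the feed, forward unseen entries, return how many were new."""
        new_entries = [e for e in self._fetch(topic) if e["id"] not in self._seen[topic]]
        for entry in new_entries:
            self._seen[topic].add(entry["id"])
            for callback in sorted(self._subscribers[topic]):
                self._deliver(callback, entry)
        return len(new_entries)
```

Because already-seen entries are skipped, calling `handle_ping` twice for the same update is harmless, which keeps the publisher's side of the contract simple.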
  16. 16. Thrift is a cross-language services development framework open sourced by Facebook.
  17. 17. Superfeedr is the largest public PuSH hub. Since PuSH is a pragmatic protocol, it doesn’t define the implementation to a T. This allows hubs to take some liberties with their implementations, and Superfeedr has taken advantage of this.
       • Digest notifications give subscribers a digest of their subscription; Superfeedr suggests they can be used as a heartbeat to ensure your subscription is still active.
       • Feed status provides information on a feed, such as how much data was fetched, how many new entries there were and when the next fetch will occur.
       • Virtual feeds allow subscribers to filter their feed at the hub.
     http://boxcar.io/ - Instant Twitter, Facebook & Feed notifications to mobile devices
  18. 18. This diagram demonstrates the messaging infrastructure within the Montage Network. At Montage we currently maintain two APIs: our Core API and our Notification API. The Notification API is nothing more than endpoints that respond to GET requests with an ATOM feed. Since the PuSH spec leaves the communication between the publisher and the hub open for negotiation, we have taken some liberties that allow us to use push notifications even though our entire infrastructure has not yet been converted to use real-time events. At this time, Montage’s main user interface has not yet been converted to use the API, so any interaction with the system through that interface is not set up to raise real-time notifications to the hub. To compensate, we have set up a polling service to pull data from the notification feeds. This polling actually meets 2 needs. One is the aforementioned issue of supporting a legacy application. The second is that we use the polling service as a catch-all for any missed notifications. We have made a decision to centralize all complexity in the hub, which means the update pings from the Montage API are not reliable. If one fails, we don’t care and we won’t retry. The polling service will eventually pick up the missed event.
  20. 20. You may be wondering what this fetch update is for. If we have polling in place that pulls event data from the Montage API, and we also support an update ping from the Montage API, what is this doing there? There are 2 ways we could have implemented our update pings. The way we are doing it is to use a ping that just tells the hub that new or updated data for a given client is available on a feed. The hub then takes this information and fetches the updated data. This allows the hub to control how much data it fetches and how it gets that data. It could fetch the data immediately or it could wait, but it has that flexibility. The other way would be a fat ping: the Montage API could be configured to send a ping that includes the updated data to the hub. This eliminates the callback step and could help prevent an accidental DoS on the Notification API if the hub were overloaded with pings that all required callbacks to the same feed endpoint. These are real benefits that we weighed and will continue to look at. However, for now our pattern is to yield full control to the hub, which is one reason why we have chosen the former type of ping.
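The difference between the two ping styles can be sketched as a single hub-side handler. The payload field names (`topic`, `entries`) and the `fetch_feed` callable are assumptions made for illustration; they are not Montage's actual message format.

```python
from typing import Callable, Dict, List


def handle_update_ping(ping: Dict, fetch_feed: Callable[[str], List[Dict]]) -> List[Dict]:
    """Return the entries the hub should distribute for this ping."""
    if "entries" in ping:
        # Fat ping: the publisher already included the updated data,
        # so no callback to the feed endpoint is needed.
        return ping["entries"]
    # Light ping: only the topic is named, and the hub decides when
    # and how to fetch the data from the feed.
    return fetch_feed(ping["topic"])


# Hypothetical payloads for the two styles.
light_ping = {"topic": "https://api.example.com/feeds/interviews?client=7"}
fat_ping = {
    "topic": "https://api.example.com/feeds/interviews?client=7",
    "entries": [{"id": "42", "title": "Interview created"}],
}
```

With the light ping the hub can batch or delay fetches. With the fat ping it trades that control for one fewer round trip.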
  21. 21. As I mentioned at the beginning, I was initially looking for a solution that would allow us to send notifications over HTTP/S, SMTP, SMS, etc. Although I was unable to find a tool or service for this, I realized that PuSH could actually enable us to do it quite easily. Our solution is to create our own subscribers that translate the notification POSTs into whatever protocol we ultimately want the notification delivered over. For example, when a recruiter sets up a live interview, emails need to be sent to the participants with the meeting link. We set up a subscriber for that client with a URL to our Email Service. The Email Service accepts a notification that a new interview has just been created, merges the participants with an email template and sends the email to each recipient. With this pattern, we can support any outgoing protocol. In addition, we gain flexibility for our consumers. Some of our consumers would rather send out communications themselves. They can set up their own subscribers to do the translation and delivery of these messages if they desire.
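A protocol-translating subscriber like the Email Service might look roughly like the sketch below: it takes the notification it was POSTed and merges each participant into an email template. The notification fields, the template text and the sender address are all hypothetical. Actually sending the messages (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage
from string import Template
from typing import Dict, List

# Hypothetical template; a real service would load this per client.
INVITE = Template("Hello $name,\n\nYour interview is scheduled. Join here: $link\n")


def build_invitations(notification: Dict, sender: str = "noreply@example.com") -> List[EmailMessage]:
    """Translate one 'interview created' notification into one email per participant."""
    messages = []
    for participant in notification["participants"]:
        message = EmailMessage()
        message["From"] = sender
        message["To"] = participant["email"]
        message["Subject"] = "Your interview invitation"
        message.set_content(
            INVITE.substitute(name=participant["name"], link=notification["meeting_link"])
        )
        messages.append(message)
    return messages
```

The hub never needs to know about SMTP. Swapping in an SMS or other subscriber means registering a different callback URL.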
  22. 22. As for what our endpoints look like, our pattern is to create an endpoint per notification type in the Notification API. Data is segregated by client, can be limited by the date/time of the notifications, and can have count limits set by the take and skip parameters. By using a feed that supports subscriptions, we give our consumers flexibility in how they wish to consume notifications. If they choose not to set up an endpoint that we can POST to for subscriptions, the consumer is able to poll their feed for updates (as long as they don’t DoS us). However, as we would prefer, they may set up a subscription to any of the notification endpoints and we will notify them of new events.
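The date/time limit and the take and skip parameters amount to a filter followed by a slice. This sketch assumes each notification carries an `updated` timestamp and that the date limit is a `since` parameter. Both names are guesses made for illustration, as is the default page size.

```python
from datetime import datetime
from typing import Dict, List, Optional


def page_notifications(notifications: List[Dict], since: Optional[datetime] = None,
                       skip: int = 0, take: int = 50) -> List[Dict]:
    """Return one page of a client's notifications, oldest first."""
    selected = [n for n in notifications if since is None or n["updated"] >= since]
    selected.sort(key=lambda n: n["updated"])
    return selected[skip:skip + take]
```

A polling consumer would walk the feed by increasing `skip` in steps of `take` until a page comes back short.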
  23. 23. This diagram represents the internals of the Montage Notification Hub. The Montage Notification Hub is a service that accepts update notifications from the Montage Core API when events occur, fetches data that applies to these events and forwards the data to anyone who has registered as a subscriber for that data. Our hub maintains its own database of feeds and subscribers as well as any subscription verification data it may need to accept new subscribers. There is one part of this diagram that has not yet been addressed, and that is message reliability. I mentioned earlier that we have pushed all reliability to the hub, and this is where it comes into play. We will be implementing a queuing system to give us message durability and reliability. The addition of this queue will allow our subscribers to configure what to do in the case of a failure. Will we try to resend? How many times will we try to resend, and how often? What happens when all that fails? The PuSH spec addresses reliability, but very loosely: “Hubs SHOULD retry notifications repeatedly until successful (up to some reasonable maximum over a reasonable time period).” Obviously the spec isn’t too helpful in this regard, so our intent is to leave the reliability configuration up to the subscriber.
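One way to picture "leaving the reliability configuration up to the subscriber" is a per-subscriber retry policy that answers the three questions above: whether to resend, how many times, and how often. This is a sketch, not the planned Montage queue. The policy fields, defaults and exponential backoff are assumptions, and a real hub would schedule retries through a durable queue rather than sleeping in process.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RetryPolicy:
    max_attempts: int = 5             # total tries, including the first
    base_delay_seconds: float = 30.0  # wait after the first failure
    backoff_factor: float = 2.0       # multiplier applied per further failure

    def delay_before_retry(self, failed_attempts: int) -> float:
        return self.base_delay_seconds * self.backoff_factor ** (failed_attempts - 1)


def deliver_with_retries(send: Callable[[], bool], policy: RetryPolicy,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call send() until it succeeds or the policy is exhausted."""
    for attempt in range(1, policy.max_attempts + 1):
        if send():
            return True
        if attempt < policy.max_attempts:
            sleep(policy.delay_before_retry(attempt))
    return False  # caller decides what happens when all retries fail
```

A `False` result is the point where the "what happens when all that fails?" decision applies, for example parking the message for reconciliation.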
  24. 24. As mentioned, the PuSH spec is pragmatic, not pure, and definitely doesn’t define every nuance of an implementation. One of those nuances is authentication. The only things close to authentication that PuSH addresses are transmission-layer security and message security. Transmission security is easy and obvious: HTTPS, where possible. The spec does mention how to distribute content securely over an open connection, which involves signing the payload of the message with an HMAC. What the spec doesn’t address is what happens if the subscriber requires the hub to authenticate with it before it will accept notifications. This may not seem like a big deal when you are using a hub to retrieve blog postings, but when you are transmitting sensitive data, authenticating with the subscriber isn’t a terrible thought. As a result of our clients’ needs, we have taken our own liberties with the spec and have implemented the ability to authenticate with our subscribers. Our architecture should allow us to authenticate with any subscriber in the manner in which they would like us to authenticate. HMAC Hashing
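For the HMAC hashing mentioned here, the PubSubHubbub 0.3 spec (as I understand it) has the subscriber supply a `hub.secret` at subscription time, and the hub then sends an `X-Hub-Signature: sha1=<hex digest>` header computed over the request body. The sketch below shows both sides. The secret and payload are placeholders, and this covers only the spec's message signing, not the custom subscriber authentication Montage added.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hub side: value for the X-Hub-Signature header."""
    return "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Subscriber side: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_payload(secret, body), header_value)


# Placeholder secret and notification body.
secret = b"shared-secret-from-hub.secret"
body = b"<feed>...</feed>"
header = sign_payload(secret, body)
```

The signature proves the body came from someone holding the shared secret and was not altered in transit. It does not hide the content, so HTTPS is still needed for confidentiality.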
  25. 25. Ideally we would like to use a public hub or some other service to do this. While we think that the solution to our problem has been well designed, maintaining a hub and sending notifications isn’t our core business. We need it to support what we do, but actually making it happen is a distraction from our core functionality. For us to go to a service, the service will have to be flexible enough to accommodate the varying needs of our consumers, or we will have to work with our consumers to conform to the way that we send notifications. While working with a service provider for our notifications would be ideal, that doesn’t seem like it’s in the cards anytime soon. As a result, we are planning to implement queuing and are looking for both queuing tools and tools to manage reading items off the queue. Managing reading the items off the queue, again, isn’t core to what we do, so any package we can use to manage that will be in our best interest. Self subscribe/unsubscribe will be added in the future, allowing our clients to maintain which notifications they would like to receive. Lastly, for simplicity all of our messaging within the Montage Network is HTTP/S, as are the notifications that are sent to our consumers. A change we would like to make is to move to a lighter, faster protocol internally, especially between the publisher and the hub.