Twitter Streaming API Architecture


Follow a Tweet from creation to timeline and Streaming API delivery. The design of streaming within Twitter is influenced by the entire Twitter architecture, the direction of the platform, data syndication policies and Quality of Service requirements. We'll discuss these influences and our system implementation.

Published in: Technology

  • So, where are we going with the Streaming API?
    What are our constraints?

    We have four big goals for Streaming.

    First, we want users to have a low latency experience.

    Instant feels like the right speed for Twitter. Not 18 seconds later. More or less right now.

    Second, every write into Twitter is an event that someone, somewhere might be interested in.

    We want to expose more event types than just new Tweets.

  • Third, we also want to provide full fidelity data for those that need it.

    Sometimes you just need everything. And you also need a place to put it.

    And finally, we need to make large scale integrations with other services as easy as possible.

    You shouldn’t have to wrestle with parallel fetching, rate limiting, and all that.

    It should be easier for all developers to get data out of Twitter.
  • The REST API is not the solution for a low latency experience,
    for large scale integrations, or for
    exposing more and more event types.

    The REST model may be great for many things, but for real-time Twitter
    where you just want to know what’s changed
    we’ve already pushed Request Response too far.

    It’s painful.

  • You can’t quickly poll for deltas on the social graph, friends timeline, user timeline, mentions, favorites, searches, lists, and trends for just one user, never mind all your followings.

    Or a million users. Or ten million. Impossible.

    As Twitter adds more features, this just gets worse.

    It’s just not practical to lift rate limits high enough to meet everyone’s goals.

    The real-time REST model is near the point of collapse.
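
To put rough numbers on the polling problem, here is a back-of-the-envelope sketch. The eight resource types are the ones listed above; the 60-second poll interval is an assumed figure chosen only for illustration, not a Twitter number.

```python
# Resource types a client would have to poll per user (from the list above).
RESOURCES = [
    "social graph", "friends timeline", "user timeline", "mentions",
    "favorites", "searches", "lists", "trends",
]


def polling_requests_per_second(users: int, poll_interval_s: float) -> float:
    """Requests per second needed to poll every resource for every user."""
    return users * len(RESOURCES) / poll_interval_s


# Assumed poll interval of 60 seconds (hypothetical, for illustration only).
for users in (1, 1_000_000, 10_000_000):
    rps = polling_requests_per_second(users, poll_interval_s=60)
    print(f"{users:>10,} users -> {rps:,.1f} requests/second")
```

Under that assumption, one million users already need about 133,000 requests per second, and most of those requests would find nothing new.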

  • Why is REST so expensive?

    A lot of effort goes into responding to each API request.

    There’s a lot to do, a lot of data to gather, and
    none of it is on that front end box handling the request.

    To make matters more difficult, the cost and latency distributions are very wide --
    from a cheap cache hit to a deep database crawl.

    Keeping latency low is a struggle.
  • Any solution powerful enough to solve all of these problems is going to be a bit dangerous.

    It needs some controls, especially if rate limits are removed.

    And it will still need to preserve all of our policies around
    abuse, privacy, terms of use, and so forth.
  • We’ve really tried to think through all of the policy implications here.

    Everyone has to play by the same rules and it must be
    possible for everyone to have a chance at building a sustainable business.

    We’ve come to some win-win decisions about the firehose and other elevated access levels.

    I think we can make nearly everyone very happy.

    Go to the Corp Dev Office hours at 2:30pm for more detail about our Commercial Data Licenses.

    Keep this in mind: solving these policy issues is a requirement, just as much as solving the technology issues is.

  • Our Solution for all this is the Streaming API.

    We’ve already proven that we can offer low latency streams of all Twitter events.

    We’ve been streaming these events to ourselves for quite some time.

    Twitter Analytics, for example, takes various private streams to feed experimental and production features.


    So, how does Streaming work?

  • In a nutshell,
    we gather interesting events everywhere in the Twitter system and
    apply those events to each Streaming server.

    Inside the server,
    we examine the event just once, and
    route to all interested clients.
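
The "examine once, route to all interested clients" idea can be sketched as follows. This is a minimal illustration, not Hosebird's implementation: the `Router` class, the keyword-based filter, and the event shape are all assumptions made for the example.

```python
from collections import defaultdict


class Router:
    """Toy router: each event is examined once, then copied to every interested client."""

    def __init__(self):
        self.clients_by_keyword = defaultdict(set)  # keyword -> client ids
        self.outboxes = defaultdict(list)           # client id -> delivered events

    def subscribe(self, client_id, keyword):
        self.clients_by_keyword[keyword.lower()].add(client_id)

    def route(self, event):
        # Examine the event once: tokenize the text a single time...
        words = set(event["text"].lower().split())
        # ...then collect every client whose keyword matched.
        interested = set()
        for word in words:
            interested |= self.clients_by_keyword.get(word, set())
        for client_id in interested:
            self.outboxes[client_id].append(event)
        return interested


router = Router()
router.subscribe("alice", "scala")
router.subscribe("bob", "scala")
router.subscribe("carol", "kestrel")
delivered_to = router.route({"id": 1, "text": "Hosebird is written in Scala"})
```

The work of looking at the event is paid once per server, no matter how many clients are connected; only the final copy step scales with the number of interested clients.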

  • This approach is a huge huge win over Request Response.

    It has turned out to be practical, stable and very efficient.

    Little effort is wasted.

    Yes, we look at each event on each of our streaming servers,
    but that’s really nothing compared to processing billions of requests
    only to say: sorry no new tweets yet.

    Since each event is delivered only once, there’s no bandwidth wasted.

    Latency is very low too. More on that later.

  • There’s a flexibility bonus here too.

    We can add new event types to streams without having everyone recode to hit new endpoints.

    Just like adding new fields to JSON markup is future proof,
    we can also easily add new events to existing streams.

    When you are ready to use the new events, you can,
    otherwise, ignore them.
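
On the client side, ignoring events you are not ready for can be as simple as the sketch below. The message shapes and key names (`text`, `delete`, `some_future_event`) are hypothetical, chosen only to show the pattern of dispatching on what you recognize and skipping the rest.

```python
import json


def handle_tweet(msg):
    return f"tweet: {msg['text']}"


def handle_delete(msg):
    return f"delete: {msg['delete']['id']}"


# Map a distinguishing top-level key to a handler (key names are hypothetical).
HANDLERS = {"text": handle_tweet, "delete": handle_delete}


def dispatch(raw_line):
    """Handle a message we understand; return None for anything else."""
    msg = json.loads(raw_line)
    for key, handler in HANDLERS.items():
        if key in msg:
            return handler(msg)
    return None  # unknown event type: ignore it until we add a handler


lines = [
    '{"text": "hello", "id": 1}',
    '{"delete": {"id": 1}}',
    '{"some_future_event": {"id": 2}}',
]
results = [dispatch(line) for line in lines]
```

A client written this way keeps working when a new event type appears on an existing stream; supporting it later means adding one entry to `HANDLERS`.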

  • What does a stream look like?

    Well, it’s a continuous stream of discrete JSON or XML messages.

    We deliver events at least once and in roughly sorted order.

    In general, during steady state, you’ll see each event exactly once, with a practical K-sorting: events can arrive slightly out of order, but only by a bounded number of positions.

    I’ll talk more about how these properties affect you at my other talk.

  • These properties mean that you need to do at least a little post processing on your end.

    The data isn’t always display ready or even display worthy --
    you need to post-process the Streaming API.

    Also, the Streaming API servers don’t do much markup rendering -- that happens upstream in Ruby daemons -- so whatever rendering quirks you are used to on the REST API, well, they’ll be here too.

    At least it’s always the same quirks.
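
One possible shape for that client-side post-processing is sketched below: drop repeats (at-least-once delivery) and restore order with a small buffer (rough sorting). It assumes each event carries an `id` that increases with time and that no event is more than `k` positions out of place; both are assumptions for the example, and the function name is hypothetical.

```python
import heapq


def reorder_and_dedupe(events, k):
    """Yield events in id order, dropping duplicates.

    Assumes the input is K-sorted: no event arrives more than k positions
    away from its sorted position. Ids are assumed to increase with time.
    """
    heap = []
    seen = set()  # grows without bound here; a real client would expire old ids
    for event in events:
        if event["id"] in seen:
            continue  # at-least-once delivery: drop the repeat
        seen.add(event["id"])
        heapq.heappush(heap, (event["id"], event))
        if len(heap) > k:
            # The smallest buffered id can no longer be preceded by a late arrival.
            yield heapq.heappop(heap)[1]
    while heap:
        yield heapq.heappop(heap)[1]


stream = [{"id": i} for i in (2, 1, 3, 3, 5, 4)]
ordered_ids = [e["id"] for e in reorder_and_dedupe(stream, k=2)]
```

The buffer adds a delay of up to `k` events, so a client that only needs approximate order can use a small `k` or skip the reordering and keep just the duplicate check.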

  • So how does the Streaming API fit within the Twitter system?

    It’s all a downstream model.

    Users do things, stuff happens, and we route a copy to Streaming.

    Let’s look at how we handle a common event: the creation of a new tweet.

  • In the user visible loop, the FEs validate the input and update critical stores.

    They ack the user, then drop a message into a Kestrel message queue for offline processing.

    This way we can give user feedback, yet defer the heavy lifting to our event driven architecture.

    The tweets are fanned out to internal services: search, streaming, Facebook, mobile, lists, and timelines.

    As an example, timeline processing daemons read the event, serially look up all the followers in Flock and re-enqueue large batches of work.

    Even before this Flock lookup completes, another timeline daemon pool reads these batches, then
    updates the memcache timeline vector of all the followers in a massively parallel fashion.

    The other services do their own thing, and the tweet is eventually published everywhere.
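
The two-stage timeline fanout described above can be sketched in a few lines. Everything here is a single-threaded stand-in: the `FOLLOWERS` dict plays the role of Flock, the `deque` plays the role of a Kestrel queue, and the `timelines` dict plays the role of the memcache timeline vectors. The names and the batch size are assumptions for illustration.

```python
from collections import defaultdict, deque

FOLLOWERS = {"alice": ["bob", "carol", "dave", "erin", "frank"]}  # stand-in for Flock
work_queue = deque()           # stand-in for a Kestrel queue
timelines = defaultdict(list)  # stand-in for memcache timeline vectors


def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def enqueue_fanout(tweet, batch_size=2):
    """First daemon pool: look up followers and re-enqueue batches of work."""
    for batch in batches(FOLLOWERS.get(tweet["user"], []), batch_size):
        work_queue.append((tweet["id"], batch))


def drain_batches():
    """Second daemon pool: apply each batch to the followers' timelines."""
    while work_queue:
        tweet_id, batch = work_queue.popleft()
        for follower in batch:
            timelines[follower].insert(0, tweet_id)  # newest first


enqueue_fanout({"id": 42, "user": "alice", "text": "hi"})
queued = len(work_queue)
drain_batches()
```

In the system described above the two stages run concurrently across many daemons; the sketch only shows how batching decouples the follower lookup from the timeline updates.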

  • Here we can see how events are fanned out from the Kestrel cluster into a single Hosebird cluster.

    Hosebird is the name of the Streaming server implementation.
    I really don’t like it. But the name stuck.

    Anyway, we use Kestrel fanout queues to present each event to each fanout Hosebird process.

    Fanout queues duplicate each message for each known reader.

    Kestrel queues are bomb-proof and relatively inexpensive, but they aren’t free.
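
As a rough illustration of what "duplicate each message for each known reader" means, here is a toy in-memory fanout queue. It is not Kestrel's API; the class and method names are invented for the sketch.

```python
from collections import deque


class FanoutQueue:
    """Toy fanout queue: every put() is copied into one queue per known reader."""

    def __init__(self):
        self.readers = {}  # reader name -> that reader's private queue

    def add_reader(self, name):
        self.readers[name] = deque()

    def put(self, message):
        for reader_queue in self.readers.values():
            reader_queue.append(message)

    def get(self, name):
        reader_queue = self.readers[name]
        return reader_queue.popleft() if reader_queue else None

    def stored_copies(self):
        return sum(len(q) for q in self.readers.values())


fanout = FanoutQueue()
for reader in ("hosebird-1", "hosebird-2", "hosebird-3"):
    fanout.add_reader(reader)
fanout.put({"id": 1, "text": "hello"})
copies = fanout.stored_copies()
first = fanout.get("hosebird-1")
```

Each message is stored once per reader, so the queueing cost grows with the number of readers. That is the cost that makes it worth limiting how many processes read directly from the queues.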
  • So, within a streaming cluster, we keep costs down by cascading.

    Cascading is where a Hosebird process reads from a peer via streaming HTTP, just like any other streaming client.

    No coordination is needed and we’re eating our own dogfood.

    There’s hardly any latency added by cascading, but the cost savings are considerable when there’s a large cluster of Hosebird machines.

    Also, we get rack locality of bandwidth, as the Hosebirds are generally together in a rack,
    while the Kestrels are located in another aisle.

  • OK. We’ve routed the events to all of the Hosebird servers.

    How do the servers work internally?

    Hosebird runs on the JVM.

    It’s written in Scala.

    And uses an embedded Jetty webserver to handle the front end issues.

    We feed each process 8 cores and about 12 gigs of memory.

    And they can each send a lot of data to many, many clients.
  • Events flow through Scala actors that host the application logic.

    Filtered events are sent through a Java queue
    then read by the connection thread which handles the socket writing details.

    We use the Grabby Hands Kestrel client to provide
    highly parallel and low latency
    blocking transactional reads from Kestrel.

    We use our own Streaming client in the cascading case.

    Both fetching clients are very efficient and hardly use any CPU.

  • I used the Scala Actor model wherever practical,
    to prevent a lot of worrying about concurrency issues.

    It’s not a panacea but it has made much of this work trivial.

    Actors currently fall down if you have too many of them,
    so we use the Java concurrency model to host the connections.

    Otherwise it’s all Actors.
  • You may notice the apostasy of burning a thread per connection.

    The year 1997 is calling to mock me, I’m sure. But so far it hasn’t mattered.

    The memory utilization isn’t a limiting factor, and it keeps things very simple.
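
The hand-off from filtering logic to a per-connection thread through a queue can be sketched as below. Hosebird does this on the JVM with a Java queue; this Python version only mirrors the shape. The `Connection` class, the bounded queue size, and the choice to drop events for a slow client are assumptions for the example, not described behavior.

```python
import queue
import threading

SENTINEL = None  # tells a connection thread to stop


class Connection:
    """One thread per connection, fed by a bounded queue of filtered events."""

    def __init__(self, name, maxsize=100):
        self.name = name
        self.outbox = queue.Queue(maxsize=maxsize)
        self.written = []  # stand-in for bytes written to a socket
        self.thread = threading.Thread(target=self._write_loop, daemon=True)
        self.thread.start()

    def offer(self, event):
        """Called by the filtering logic; returns False if the client is too slow."""
        try:
            self.outbox.put_nowait(event)
            return True
        except queue.Full:
            return False  # assumed policy: drop rather than block the producer

    def _write_loop(self):
        while True:
            event = self.outbox.get()  # blocks until an event is available
            if event is SENTINEL:
                break
            self.written.append(event)

    def close(self):
        self.outbox.put(SENTINEL)
        self.thread.join()


conn = Connection("client-1")
accepted = [conn.offer({"id": i}) for i in range(3)]
conn.close()
```

The appeal of the design is that each connection's write path is plain blocking code, and a slow client can only back up its own queue.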

  • Feeds are logical groupings of events -- public statuses, direct messages, social graph changes, etc.

    Feeds keep a circular buffer of recent events to support the count parameter and some historical look back.

    I had to parallelize the JSON and XML parsing,
    which turned out to be the big CPU burn and probably our major tweets per second scaling risk.
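
A circular buffer that supports a count-style look back might look like the following minimal sketch. The `Feed` class and `backfill` method are hypothetical names; only the idea of a fixed-size buffer of recent events comes from the notes above.

```python
from collections import deque


class Feed:
    """Toy feed that remembers only the most recent `capacity` events."""

    def __init__(self, name, capacity):
        self.name = name
        self.recent = deque(maxlen=capacity)  # oldest events fall off automatically

    def publish(self, event):
        self.recent.append(event)

    def backfill(self, count):
        """Return up to `count` of the most recent events, oldest first."""
        if count <= 0:
            return []
        return list(self.recent)[-count:]


feed = Feed("public statuses", capacity=3)
for event_id in range(1, 6):
    feed.publish(event_id)
last_two = feed.backfill(2)
everything_available = feed.backfill(10)
```

A request for more history than the buffer holds simply gets whatever is still in the buffer, which is why the look back is bounded.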

  • Feeds can be reconfigured to internally forward events to other feeds.

    Arbitrary composition in conf files is a pretty powerful concept.

    So, to create user streams, I just had to forward events from all these other existing feeds
    into the User feed and write some custom delivery logic.

    Yes, there are streams of direct messages. And social graph changes. And other interesting things.

    We can’t expose them just yet due to privacy policy issues.

    But, we’ll get there. Plans have been laid.
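
Feed forwarding driven by configuration, as described earlier in this note, can be sketched like this. The `CONFIG` mapping stands in for a conf file, and its feed names and the `Feed` class are hypothetical; the sketch also has no cycle detection, which a real configuration loader would need.

```python
class Feed:
    """Toy feed that records events and forwards them to downstream feeds."""

    def __init__(self, name):
        self.name = name
        self.forward_to = []
        self.delivered = []

    def publish(self, event):
        self.delivered.append(event)
        for downstream in self.forward_to:
            downstream.publish(event)


# Hypothetical configuration: feed name -> feeds it forwards into.
CONFIG = {
    "statuses": ["user"],
    "direct_messages": ["user"],
    "social_graph": ["user"],
    "user": [],
}

feeds = {name: Feed(name) for name in CONFIG}
for name, targets in CONFIG.items():
    feeds[name].forward_to = [feeds[target] for target in targets]

feeds["statuses"].publish({"type": "status", "id": 1})
feeds["social_graph"].publish({"type": "follow", "id": 2})
user_events = [event["type"] for event in feeds["user"].delivered]
```

With the wiring kept in configuration, composing a new feed from existing ones needs no new plumbing code, only delivery logic specific to the new feed.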

  • It doesn’t take long for a tweet to be created,
    pass through all of these components, and
    be presented to your stream.

    If all is running well with all of the upstream systems --
    tweets and other events are usually delivered with an average latency of about 160ms.
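
A client can estimate this latency itself by subtracting an event's creation time from its arrival time. The sketch below assumes a `created_at` string in the form `Wed Apr 14 19:00:00 +0000 2010` and a reasonably accurate local clock; the function name and the sample arrival time are invented for the example.

```python
from datetime import datetime, timezone

# Assumed created_at layout, e.g. "Wed Apr 14 19:00:00 +0000 2010".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def delivery_latency_seconds(created_at, arrived_at):
    """Seconds between an event's created_at timestamp and its arrival time."""
    created = datetime.strptime(created_at, CREATED_AT_FORMAT)
    return (arrived_at - created).total_seconds()


# Hypothetical arrival 160 ms after the creation timestamp.
arrival = datetime(2010, 4, 14, 19, 0, 0, 160_000, tzinfo=timezone.utc)
latency = delivery_latency_seconds("Wed Apr 14 19:00:00 +0000 2010", arrival)
```

A timestamp in this form only has one-second resolution, so a single measurement is coarse; averaging over many events gives a more useful figure.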
  • Here’s a ganglia monitoring graph, one of hundreds just for the Streaming API.

    Sometimes I find it funny that outside devs say
    “hey, did you know that you are throwing 503s on this endpoint”.

    Yes, we know. There’s a graph for it.

    If there isn’t a graph -- we immediately add one.

    And we roll the key ones up into a grid of 12 summary graphs that everyone watches.

    There’s also a bank of graph monitors in ops.

  • Each line above represents the average latency from each of several Hosebird clusters.

    This was taken during peak load on a typical weekday.

    You can see a blip about half way through.

    Given that all clusters moved in unison, there was probably an upstream garbage collection in Kestrel, or something similar.

    We’ve put a lot of effort into lowering Twitter latency and keeping it low and predictable.

    (If visible, blue line is a cascaded cluster, where yellow and green are fanout only.)
  • User streams offer a much more engaging way to interact with Twitter.

    You get to see a lot that happens to you -- who favorited your tweet, who followed you, and so forth --
    in real time.

    You also get to see what your followings are doing. Who they favorited and followed.

    There’s a huge opportunity for discovery here with User Streams.
    If two friends favorite a tweet, and two others follow the tweeter, show me the tweet!

    We know that User Streams are transformative. Goldman and I were watching #chirp during Ryan’s talk.
    It was incredible to watch them scroll by.

    In the few days we’ve been using them at the office, everyone has been transfixed.
    Engineering productivity has plummeted!
    It’s the Farmville of Twitter.

  • OK, what next for Streaming?

    First we’re going to get user streams out there.

    We have some more critical features to add and we have to
    add capacity to handle potentially millions of connections.

    We’ve announced the details for a user stream preview period.
    Read them carefully before coding or planning anything.

    There are also some interesting events that we don’t yet publish.
    We’ll see what we can get out there for you.

    Once user streams are in a good spot, we want to get back to some interesting large scale integration features.

  • With user streams it’s all coming together.

    Real time Twitter.

    Lots of event types.

    More engagement.

    More discovery.

    New user experiences are now possible.

    Go out and build something great!

  • Twitter Streaming API Architecture

    1. Twitter Streaming API Architecture John Kalucki @jkalucki Infrastructure
    2. Heading • Immediate User Experience • More event types • Full fidelity • Easier integrations
    3. REST? • Downsides • Latency • Complexity • Expense • Prevents • At-Scale Integrations • Features • Fidelity
    4. Needy • Authenticate • Rate Limit • Query vast caches • Query deep data stores • Render. Render. Render. • All just to say: “No new tweets. Try later.”
    5. Policy • Prove Relationship via Auth Token • Terms of Use • No resyndication • Protect users, content, ecosystem
    6. Streaming Solution • Gather events • Route to all servers • Present to decision logic: Exactly once per server • Move ‘em out: Send just once* to clients
    7-8. Win, Huge • Low latency • Little duplicated computation • No wasted bandwidth • New event types without new endpoints
    9-10. Properties • At Least Once • Roughly Sorted (K-Sorted) by Time • Middleware - No rendering • More at 2:30pm talk - Thinking In Streams
    11. Gather All Events
    12. Push Events
    13. Latency
    14-15. [Latency graph] <created_at> - client arrival; y-axis: 0 ms to 200 ms; x-axis: 1 hour period
    16. User Streams chirpstream/2b/user.json
    17. Future • User Streams refinement and launch • More data types • Better query support • Large scale integration support