Thinking In Streaming
John Kalucki
@jkalucki
Infrastructure
Turtles All The Way Down
•   Your client ≅ Our server

•   Gather Events

•   Parse JSON

•   Match on Predicates

•   Route to Consumers
Properties
•   Offered

    •   At Least Once

    •   Roughly Sorted (K-Sorted)

•   Desired

    •   Exactly Once

    •   Sorted
Plan
•   Over-deliver
    Ensure At Least Once

•   De-duplicate
    Unordered Exactly Once

•   Sort
    Ordered Exactly Once
Why At Least Once?
•   Exactly Once impractical across streams

•   Clients must handle reconnect over-delivery

•   Reuse this capability to:

    •   Mask upstream failures

    •   Relax server restart issues
Startup
•   Prefetch from peer to populate circular buffer

•   Go multi-user

•   Consume Kestrel backlog - duplicates between:

    •   Buffer and backlog

    •   Previous connection and backlog

•   Steady State: Exactly Once Delivery
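
A minimal sketch of the circular buffer idea, assuming a fixed capacity and events that are opaque values. The class and method names (`RecentEvents`, `prefetch`, `replay`) are illustrative and not taken from the talk; the real server presumably does much more.

```python
from collections import deque


class RecentEvents:
    """Fixed-size buffer of the most recent events (hypothetical helper)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.buffer = deque(maxlen=capacity)

    def prefetch(self, peer_events):
        """Warm the buffer from a peer before serving clients."""
        self.buffer.extend(peer_events)

    def append(self, event):
        """Record one live event."""
        self.buffer.append(event)

    def replay(self, count):
        """Return up to `count` of the most recent events, oldest first."""
        if count <= 0:
            return []
        return list(self.buffer)[-count:]
```

Replaying from such a buffer after a reconnect over-delivers by design, so the receiver still has to de-duplicate.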
Upstream Failure
•   Cascaded source fails

•   Fail over to next peer

•   Over-request to avoid loss

•   Steady State: Exactly Once Delivery
Client Over-delivery
•   Use Count Parameter after fast reconnect

•   Deep backfill from REST API

    •   Client offline for a while

    •   User first issues new query

•   Overlap connections slightly
De-Duplication
Infinite Streams
•   De-duplicating a randomly ordered infinite
    stream requires infinite time and storage

•   Sorting? Ditto

•   I have neither infinite time nor storage
Roughly Sorted
•   A sequence α = a₁ … aₙ is k-sorted iff ∀ i, r with
    1 ≤ i ≤ r ≤ n, i ≤ r − k implies aᵢ ≤ aᵣ

•   Strictly sorted is 0-sorted.

•   Transpose two adjacent values in a 0-sorted
    sequence and it becomes 1-sorted.

•   K For the firehose?
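
A minimal sketch of measuring how roughly sorted a sequence is. It assumes k is read as the largest distance r - i over out-of-order pairs (aᵢ > aᵣ with i < r), which is the reading under which a sorted sequence scores 0 and one adjacent transposition scores 1. The function names are illustrative, and the quadratic scan is chosen for clarity, not speed.

```python
def roughness(seq):
    """Smallest k for which seq is k-sorted, under the assumption above."""
    worst = 0
    for i in range(len(seq)):
        for r in range(i + 1, len(seq)):
            if seq[i] > seq[r]:
                # An inverted pair: remember how far apart it is.
                worst = max(worst, r - i)
    return worst


def is_k_sorted(seq, k):
    """True if no inverted pair is more than k positions apart."""
    return roughness(seq) <= k
```

For example, `roughness([1, 3, 2, 4])` is 1, while one element that arrives three positions late, as in `[4, 1, 2, 3]`, gives 3.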
Firehose K
500k IDs    Sunday Night    Monday Peak
  100.0%            3356           2507
  99.99%             650            509
  99.90%             232            271
  99.00%             143            160
  90.00%              36             45
  50.00%               5              7
 Average              14             17

Noisy
Pessimist’s K
•   In theory could be hours & millions of events

•   Practically, if current and stale queues exist:

    •   We’ll flush the stale queues before exposing

    •   You’ll never know this happened

•   If all queues stale:

    •   We’ll deliver the backlog

    •   K remains reasonable
Unordered De-duplication
•   Create two HashSets: Primary, Secondary,
    each preallocated to size K

•   New event is duplicate if ID exists in Primary

•   Add new ID to both HashSets

•   When Primary.size > K / 2
    Primary.clear
    Swap Primary & Secondary
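
A minimal sketch of one way to realize the two-set scheme, assuming IDs are hashable and K is a chosen window size. The slide is terse about rotation timing, so the timing here is an assumption: the secondary set starts collecting once the primary is half full, and the sets rotate when the primary reaches K. That keeps at least the most recent K / 2 IDs remembered at all times. The class name is illustrative.

```python
class RotatingDeduplicator:
    """Bounded-memory duplicate filter built from two rotating sets."""

    def __init__(self, k):
        if k < 2:
            raise ValueError("k must be at least 2")
        self.k = k
        self.primary = set()
        self.secondary = set()

    def is_duplicate(self, event_id):
        """Return True if event_id was seen recently; otherwise record it."""
        if event_id in self.primary:
            return True
        self.primary.add(event_id)
        # Assumption: the secondary set shadows only the newer half.
        if len(self.primary) > self.k // 2:
            self.secondary.add(event_id)
        if len(self.primary) >= self.k:
            # Forget the older half; the newer half becomes the primary.
            self.primary.clear()
            self.primary, self.secondary = self.secondary, self.primary
        return False
```

A caller would emit an event only when `is_duplicate` returns False. IDs older than the window are forgotten, so a duplicate arriving later than that would be emitted again.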
Unordered De-duplication
•   Bounded memory consumption

•   O(n) behavior

•   Low latency

    •   Emit first tweet

    •   Discard subsequent duplicates

•   Cheaper than de-duplication by sorting?
    Probably depends on K
Ordered & De-duplicated
•   Insertion sort and de-duplicate by ID into a
    decreasing order list

•   While length > K, remove sorted tail
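
A minimal sketch of the buffer described above, assuming integer IDs that grow over time. Two details are assumptions, not from the slide: an ID at or below the last emitted one is dropped as late or already delivered, and a finite input is flushed at the end. The function name is illustrative.

```python
def sort_and_dedupe(ids, k):
    """Yield IDs in increasing order without duplicates, holding back k items."""
    buffer = []          # decreasing order: largest (newest) ID at index 0
    last_emitted = None
    for event_id in ids:
        if last_emitted is not None and event_id <= last_emitted:
            continue     # assumption: too late or already emitted, so drop
        # Walk from the head; nearly sorted input stops after a few steps.
        pos = 0
        while pos < len(buffer) and buffer[pos] > event_id:
            pos += 1
        if pos < len(buffer) and buffer[pos] == event_id:
            continue     # duplicate of an ID still in the buffer
        buffer.insert(pos, event_id)
        while len(buffer) > k:
            last_emitted = buffer.pop()   # sorted tail holds the smallest ID
            yield last_emitted
    # Assumption: when a finite input ends, flush what remains in order.
    while buffer:
        yield buffer.pop()
```

For instance, `list(sort_and_dedupe([3, 1, 2, 2, 5, 4, 4, 6], 3))` yields the IDs 1 through 6 in order. The walk from the head is what makes the cost depend on how unsorted the input is.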
Ordered & De-duplicated
•   Between O(n) and O(n * K)

•   Bounded memory consumption

•   Induces latency of K

•   Assumes average items not very unsorted

•   K is usually large to handle the outliers
Routing Events
•   By Keyword or by UserId

•   Add predicates to HashMap

•   Apply events to Map

•   Query holds private predicate set for later Map
    removal

•   O(n)
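
A minimal sketch of map-based routing, assuming an event is a dict with hypothetical `user_id` and `text` fields and that keywords match whole lowercase words. The `Router` class and its method names are illustrative. Each query keeps its own predicate set so it can be removed from the shared map later.

```python
from collections import defaultdict


class Router:
    """Routes events to queries by keyword or user ID (hypothetical sketch)."""

    def __init__(self):
        self.by_predicate = defaultdict(set)   # predicate -> query IDs
        self.query_predicates = {}             # query ID -> its own predicates

    def add_query(self, query_id, keywords=(), user_ids=()):
        predicates = {("keyword", k.lower()) for k in keywords}
        predicates |= {("user", u) for u in user_ids}
        self.query_predicates[query_id] = predicates
        for p in predicates:
            self.by_predicate[p].add(query_id)

    def remove_query(self, query_id):
        # The query's private set tells us which map entries to clean up.
        for p in self.query_predicates.pop(query_id, set()):
            consumers = self.by_predicate[p]
            consumers.discard(query_id)
            if not consumers:
                del self.by_predicate[p]

    def route(self, event):
        """Return the IDs of all queries whose predicates match the event."""
        candidates = {("user", event["user_id"])}
        candidates |= {("keyword", w) for w in event["text"].lower().split()}
        matched = set()
        for p in candidates:
            matched |= self.by_predicate.get(p, set())
        return matched
```

Each event costs one map lookup per word plus one for the user ID, independent of how many queries are registered.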
Reliability
•   Decompose

•   Decouple

•   Monitor
Monitoring
•   What to look at?

•   Latency

•   Throughput

•   Errors

•   Alerting
Horizontal Scale
•   The Firehose keeps growing.

•   Eventually a single Firehose stream will become
    impractical.

•   Partition the Firehose into N streams.

Thinking in Streaming - Twitter Streaming API

  • 1.
  • 3. Turtles All The Way Down • Your client ≅ Our server • Gather Events • Parse JSON • Match on Predicates • Route to Consumers
  • 4. Properties • Offered • At Least Once • Roughly Sorted (K-Sorted) • Desired • Exactly Once • Sorted
  • 5. Plan • Over-deliver Ensure At Least Once • De-duplicate Unordered Exactly Once • Sort Ordered Exactly Once
  • 6. Why At Least Once? • Exactly Once impractical across streams • Clients must handle reconnect over-delivery • Reuse this capability • Mask upstream failures • Relax server restart issues
  • 7. Why At Least Once? • Exactly Once impractical across streams • Clients must handle reconnect over-delivery • Reuse this capability • Mask upstream failures • Relax server restart issues
  • 8. Why At Least Once? • Exactly Once impractical across streams • Clients must handle reconnect over-delivery • Reuse this capability to • Mask upstream failures • Relax server restart issues
  • 9.
  • 10. Startup • Prefetch from peer to populate circular buffer • Go multi-user • Consume Kestrel backlog - duplicates between: • Buffer and backlog • Previous connection and backlog • Steady State: Exactly Once Delivery
  • 11. Startup • Prefetch from peer to populate circular buffer • Go multi-user • Consume Kestrel backlog - duplicates between: • Buffer and backlog • Previous connection and backlog • Steady State: Exactly Once Delivery
  • 12. Upstream Failure • Cascaded source fails • Fail over to next peer • Over-request to avoid loss • Steady State: Exactly Once Delivery
  • 13. Client Over-delivery • Use Count Parameter after fast reconnect • Deep backfill from REST API • Client offline for a while • User first issues new query • Overlap connections slightly
  • 15. Infinite Streams • De-duplicating a randomly ordered infinite stream requires infinite time and storage • Sorting? Ditto • I have neither infinite time nor storage
  • 16. Roughly Sorted • A sequence α is k-sorted IFF ∀ i, r, 1 ≤ i ≤ r ≤ n, i ≤ r - k implies aᵢ ≤ aᵣ • Strictly sorted is 0-sorted. • Transpose two adjacent values in a 0-sorted sequence, becomes 1-sorted. • K For the firehose?
  • 17. Firehose K (500k IDs, noisy)
        Percentile   Sunday Night   Monday Peak
        100.0%       3356           2507
        99.99%       650            509
        99.90%       232            271
        99.00%       143            160
        90.00%       36             45
        50.00%       5              7
        Average      14             17
  • 19. Pessimist’s K • In theory could be hours & millions of events • Practically, if current and stale queues exist: • We’ll flush the stale queues before exposing • You’ll never know this happened • If all queues stale: • We’ll deliver the backlog • K remains reasonable
  • 20. Unordered De-duplication • Create two HashSets, Primary and Secondary, each preallocated to size K • New event is duplicate if ID exists in Primary • Add new ID to both HashSets • When Primary.size > K / 2: Primary.clear, then swap Primary & Secondary
  • 21. Unordered De-duplication • Bounded memory consumption • O(n) behavior • Low latency • Emit first tweet • Discard subsequent duplicates • Cheaper than de-duplication by sorting? Probably depends on K
  • 22. Ordered & De-duplicated • Insertion sort and de-duplicate by ID into a decreasing order list • While length > K, remove sorted tail
  • 23. Ordered & De-duplicated • O(n) to O(n * K) • Bounded memory consumption • Induces latency of K • Assumes average items not very unsorted • K is usually large to handle the outliers
  • 24. Routing Events • By Keyword or by UserId • Add predicates to HashMap • Apply events to Map • Query holds private predicate set for later Map removal • O(n)
  • 25. Reliability • Decompose • Decouple • Monitor
  • 26. Monitoring • What to look at? • Latency • Throughput • Errors • Alerting
  • 27. Horizontal Scale • Firehose keeps Growing. • Eventually Firehose stream will become impractical. • Partition the Firehose into N streams.
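
The two-HashSet scheme on slides 20 and 21 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Hosebird implementation: the class and method names are hypothetical, and the rotation is triggered when Secondary reaches K distinct IDs instead of using the slide's `Primary.size > K / 2` test. That is one reading of the slide, chosen so that after each swap Primary still holds the K most recent distinct IDs.

```python
class RotatingDeduplicator:
    """De-duplicate a roughly sorted stream of IDs with two rotating sets.

    Hypothetical sketch: names and the rotation threshold are assumptions.
    """

    def __init__(self, k):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self.primary = set()    # consulted for membership
        self.secondary = set()  # warms up to become the next primary

    def is_new(self, event_id):
        """Return True the first time an ID is seen within the window."""
        if event_id in self.primary:
            return False  # late-arriving duplicate: discard
        # Add the new ID to both sets, as on the slide.
        self.primary.add(event_id)
        self.secondary.add(event_id)
        # Assumed rotation rule: once the secondary holds K distinct IDs,
        # drop the older set and promote the secondary.
        if len(self.secondary) >= self.k:
            self.primary.clear()
            self.primary, self.secondary = self.secondary, self.primary
        return True


dedup = RotatingDeduplicator(k=4)
stream = [1, 2, 2, 3, 1, 4, 5, 3]
unique = [i for i in stream if dedup.is_new(i)]
# unique == [1, 2, 3, 4, 5]
```

Events are emitted as they arrive, so no latency is added. Memory stays bounded because the two sets together never hold more than about 3K IDs, and an ID that reappears after it has aged out of Primary is treated as new again.
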

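The sorted variant on slides 22 and 23 can be sketched the same way. This is a hedged illustration: the names are hypothetical, IDs are assumed to be comparable integers that increase with creation time, and the buffer is a plain Python list walked from the head as in the insertion sort the talk describes.

```python
class SortingDeduplicator:
    """Emit IDs in increasing order, each once, after a delay of K items.

    Hypothetical sketch of "insertion sort into a decreasing-order list,
    then remove the sorted tail while length > K".
    """

    def __init__(self, k):
        self.k = k
        self.window = []  # decreasing order: newest at head, oldest at tail

    def push(self, event_id):
        """Insert one ID and return the IDs released from the tail."""
        pos = 0
        # Walk from the head; recent IDs usually stop after a few hops.
        while pos < len(self.window) and self.window[pos] > event_id:
            pos += 1
        if pos < len(self.window) and self.window[pos] == event_id:
            return []  # duplicate already buffered: discard
        self.window.insert(pos, event_id)
        released = []
        while len(self.window) > self.k:
            released.append(self.window.pop())  # oldest ID leaves the tail
        return released

    def flush(self):
        """Release everything still buffered, oldest first."""
        released = self.window[::-1]
        self.window = []
        return released


sorter = SortingDeduplicator(k=3)
output = []
for event_id in [2, 1, 3, 3, 5, 4, 6, 5]:
    output.extend(sorter.push(event_id))
output.extend(sorter.flush())
# output == [1, 2, 3, 4, 5, 6]
```

Nothing is released until the buffer holds more than K IDs, which is the latency of K noted on slide 23. An ID that arrives more than K positions late would be released out of order, so K has to be sized for the outliers.
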
Editor's Notes

  1. There is a lot of symmetry in what the Streaming API servers do and what your streaming clients do. In both cases we’re gathering events, parsing them, and farming them out to various consumers. The issues are similar at all processing points in the stream.
  2. We present a stream of events that is roughly sorted by created-at time. This means that the events are mostly in created-at time order, but not exactly so. We’ve designed our system to publish each event at least once -- which means none are lost, but there may, at times, be duplicates. I’ll discuss why our streams have these properties. Also, you’ll probably want to display or process tweets exactly once -- none missing and none duplicated. You might also want to present them sorted, or you might be OK with a rough sorting. I’ll go over two algorithms for converting what the API offers into the stream that you want.
  3. The basic plan is to over-deliver events and then de-duplicate them to provide an exactly-once quality of service. One technique is to de-duplicate with set logic; the other is to sort and de-duplicate. There are trade-offs with each.
  4. First, let’s see why the Streaming API offers events at least once. It would be nice if we could offer everything transactionally, that is, exactly once. But, it’s impractical to synchronize this state across client reconnections. For example, it’s unlikely that you’ll reconnect to the same server.
  5. Also, event streams aren’t strictly ordered, so we wouldn’t know what to deliver. We’d have to coordinate a large vector of sent events between servers. And, clients would have to transactionally acknowledge all events received. This is quite impractical at scale unless we sorted streams, but this would introduce latency. We’ll see why sorting induces latency later.
  6. Yet, first and foremost, we want a very low latency experience. And, we want a simple programming model for clients. So, we assume that clients can over-request when reconnecting, and post-process to get the required stream properties. Once we make this fundamental assumption, we can reuse it to handle the internal data loss risk as well.
  7. Our Streaming API server is called Hosebird. Hosebird receives events from the rest of the Twitter system through Kestrel message queues. Two Hosebird processes in each cluster read transactionally from Kestrel. The rest of the servers in a cluster cascade via Streaming HTTP.
  8. When a Hosebird server starts, it prefetches events from a peer to pre-populate its circular buffers. These buffers are used to support the count parameter, which allows some historical backfill on streaming queries. Count allows your stream to start back a few minutes, then catch up and transition to real-time streaming.
  9. This startup prefetching creates a window where you might see the same event twice, if you are unlucky enough to connect to a very recently restarted server. The backlog read from Kestrel will contain some of the same events that were prefetched into the buffer. The backlog may also have events that you read on your last connection. You might have to suffer through a minute or so of duplicates as the backlog is processed and displaces the prefetched events in the circular buffer. Outside of this restart case, during steady state processing, we deliver each event exactly once on fanout servers.
  10. When a cascaded server has its source Hosebird restart, say during a deploy, the server needs to quickly fail over to another source. A gap in the stream would be introduced during the failure, detection and reconnection window. We cover this gap by requesting some back-fill from the new source. This causes a short period of duplicated events. During steady state processing, however, we deliver each event exactly once on cascaded servers.
  11. Your client should use these same techniques on reconnect. Over-request with the count parameter if the connection was momentarily lost. If the client has been disconnected for an extended period, you’ll have to backfill from the REST API. When you need to make a predicate change, you can create a new connection, wait for the first event to arrive, then disconnect the old connection. This should generally produce an at-least-once stream.
  12. Let’s talk about de-duplication on your end.
  13. A finite stream looks a lot like a relational database table -- a finite relation. We’re used to thinking about finite relations. But, a stream appears as an infinite relation, you can’t ever read to the end. Also, since we want very low latency, we can’t wait to read to the end. We have to present results immediately.
  14. A roughly sorted sequence is mostly sorted, where no element is more than K positions away from its strictly sorted position. At Twitter, we talk about K sorted things all the time. K this, K that. Nothing is strictly ordered. We have relaxed various legs of the CAP theorem to make our distributed system feasible. We’ve never had strictly ordered event processing. Tweets are applied to your timelines in a rough ordering. On the REST API, we sort the vector before we present it to you, but it’s very loose behind the scenes. Likewise, events show up in the Streaming API roughly sorted by created at time.
  15. Here are two samples from the status firehose. I took five hundred thousand status ids, and did an insertion sort into a reverse sorted list. The most recent id at the head, the oldest status at the tail. These distributions show the number of list elements traversed before finding the sorted insertion point. So, the average and median number of hops are pretty small. The hundred percent case, the worst case, shows a much larger K.
  16. Assuming about 600 events per second on this stream, back when I took this sample, we can see that events show up as much as 5 seconds out of order. Close comparison of the distributions shows that they’re very noisy. If you took many samples, they’d all have a different shape. Having an idea of K helps us tune our de-duplication algorithms.
  17. Daily operational issues cause K to grow beyond 5 seconds now and then. It’s hard to say what a good upper bound for a display client should be. Something around a few minutes would cover most issues we’ve had over the last six months. A long-term storage client might want to assume a K of a few hours or a day or so. In the unlikely event that something goes really wrong with the system, we’ll make a judgement call on recovery. We’ll probably bias towards delivering the backlog, but, if there’s a partial failure, we’ll keep your K in mind.
  18. Now that we have a handle on K, we can think about de-duplication. An infinite, but roughly sorted, stream can be de-duplicated with some set logic. The key is efficiently aging out irrelevant set members. One way is to keep two hashes, and alternately clear them. You don’t have to do any fancy tracking of items, and off-the-shelf HashSets will work just fine. The union of the two sets contains at least K items and allows de-duplication of a K-sorted sequence.
  19. Given the Firehose K, you don’t even need all that much space to de-duplicate. Please don’t resort to using MySQL primary keys to de-duplicate streams. It’s unnecessary. The nice thing here is that we can emit events as they arrive and throw away late-arriving duplicates. We don’t need to add any latency.
  20. On the other hand, if we want a sorted and deduplicated stream, we have to do a little more work. Given the Firehose K distribution, doing an insertion sort isn’t the worst thing. Most events don’t need to traverse too deeply into the list. Elements dequeued from the tail of the list are sorted and deduplicated.
  21. This algorithm does, however, introduce a latency of K. We can’t emit a sorted event unless we have at least K elements to examine. Still, this is quite practical to do in memory. You can plow through a lot of ids per second even in a scripting language like Ruby.
  22. Now that we have a de-duplicated stream, we need to route it to consumers. This can be done very cheaply by registering every consumer’s predicates in a HashMap. If, say, you are displaying columns of search results, like TweetDeck, you can have each column register its keywords in the HashMap. Each new event is applied to the HashMap, and routed to all consumers easily. Duplicates can arise, as a given column may have several OR predicates that match. Hosebird uses a generational de-duplication scheme to solve this. This scheme is the degenerate case of the sorted algorithm above. Each client stream maintains just the primary key of the last event. If the same id is presented twice in a row, it can be discarded.
  23. Break things up into components. Host components in separate processes. Measure what happens between components. Use (reliable) queues between components.
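
The HashMap routing described in note 22 can be sketched as below. This is an illustration under assumptions, not Hosebird's code: the names are hypothetical, predicates are single lowercase keywords, and event text is tokenized by whitespace. Each consumer keeps its own predicate set for later removal, and remembers the last delivered ID so that an event matching several of its OR predicates is delivered only once.

```python
from collections import defaultdict


class Router:
    """Route events to consumers by keyword (hypothetical sketch)."""

    def __init__(self):
        self.consumers_by_keyword = defaultdict(set)  # keyword -> consumers
        self.keywords_by_consumer = {}  # private predicate set per consumer
        self.last_id = {}               # consumer -> last delivered event ID

    def register(self, consumer, keywords):
        """Add a consumer's keyword predicates to the shared map."""
        keywords = {k.lower() for k in keywords}
        self.keywords_by_consumer[consumer] = keywords
        for keyword in keywords:
            self.consumers_by_keyword[keyword].add(consumer)

    def unregister(self, consumer):
        """Remove a consumer using its private predicate set."""
        for keyword in self.keywords_by_consumer.pop(consumer, set()):
            self.consumers_by_keyword[keyword].discard(consumer)
            if not self.consumers_by_keyword[keyword]:
                del self.consumers_by_keyword[keyword]
        self.last_id.pop(consumer, None)

    def route(self, event_id, text):
        """Return the consumers that should receive this event, once each."""
        delivered = []
        for word in text.lower().split():
            # .get avoids creating empty entries for unmatched words.
            for consumer in self.consumers_by_keyword.get(word, ()):
                if self.last_id.get(consumer) == event_id:
                    continue  # same ID twice in a row: discard
                self.last_id[consumer] = event_id
                delivered.append(consumer)
        return delivered


router = Router()
router.register("col_a", ["python", "streams"])
router.register("col_b", ["streams"])
first = sorted(router.route(1, "Python streams are fun"))
router.unregister("col_a")
second = router.route(2, "python streams")
```

Here `first` contains both columns once each even though `col_a` matches two keywords, and `second` contains only `col_b` after `col_a` is removed. The work per event is proportional to the number of words in the event, not to the number of registered consumers.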