Routing Trillions of Events Per Day @Twitter
#ApacheBigData 2017
Lohit VijayaRenu & Gary Steelman
@lohitvijayarenu @efsie
In this talk
1. Event Logs at Twitter
2. Log Collection
3. Log Processing
4. Log Replication
5. The Future
6. Questions

Overview
Life of an Event
● Clients log events specifying a Category name, e.g. ads_view, login_event ...
● Events are grouped together across all clients into the Category
● Events are stored on the Hadoop Distributed File System, bucketed every hour into separate directories
○ /logs/ads_view/2017/05/01/23
○ /logs/login_event/2017/05/01/23

[Diagram: Clients → Client Daemon → HTTP Endpoint → aggregated by Category → Storage (HDFS)]
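As a small illustration of the hourly bucketing above, a minimal sketch of a hypothetical helper (not Twitter's actual client code) that derives the HDFS directory for a category and event timestamp:

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class EventPaths {
    // Matches the /logs/<category>/yyyy/MM/dd/HH layout described above.
    private static final DateTimeFormatter HOUR_BUCKET =
        DateTimeFormatter.ofPattern("yyyy/MM/dd/HH");

    static String hourlyDir(String category, ZonedDateTime eventTime) {
        return "/logs/" + category + "/" +
            HOUR_BUCKET.format(eventTime.withZoneSameInstant(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        ZonedDateTime t = ZonedDateTime.parse("2017-05-01T23:15:00Z");
        // Prints /logs/ads_view/2017/05/01/23
        System.out.println(hourlyDir("ads_view", t));
    }
}
```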
Event Log Stats
● >1 trillion events a day, across millions of clients
● ~3 PB of data a day, incoming uncompressed
● <1,500 nodes, collocated with HDFS DataNodes
● >600 categories (events grouped by category)
Event Log Architecture

[Diagram: inside each data center, Clients (via a local log collection daemon) and remote
HTTP clients send events to aggregators, which aggregate log events grouped by Category
into Storage (HDFS). A Log Processor reads that storage and writes processed output back
to Storage (HDFS); a Log Replicator then copies it to Storage (HDFS) on other clusters
and to Storage (Streaming).]
Event Log Architecture

[Diagram: events in each data center (DC1 ... DCN) land first in that DC's RT Storage
(HDFS); from there they are replicated to DW, Prod, and Cold Storage (HDFS) clusters.]
Collection
Event Collection Overview
● Past: Scribe client daemons → Scribe aggregator daemons
● Present: Scribe client daemons → Flume aggregator daemons
● Future: Flume client daemons → Flume aggregator daemons
Event Collection
Past
Challenges with Scribe
● Too many open file handles to HDFS
○ 600 categories × 1500 aggregators × >6 files per hour ≈ >5.4M files per hour
● High IO wait on DataNodes at scale
● Hard limit on maximum throughput per aggregator
● Difficult to track message drops
● No longer under active open source development
Event Collection
Present
Apache Flume
● Well-defined interfaces
● Open source
● Concept of transactions
● Existing implementations of the interfaces

[Diagram: Client → Source → Channel → Sink → HDFS, all inside one Flume Agent]
Event Collection
Present
Category Group
● Combine multiple related categories into a category group
● Provide different properties per group
● A group contains events from multiple categories, to generate fewer combined sequence files

[Diagram: Agents 1-3 each route Categories 1-3 into a single Category Group]

Event Collection
Present
Aggregator Group
● A set of aggregators hosting the same set of category groups
● Easier to manage a group of aggregators hosting a subset of categories

[Diagram: Agents 1 through 8 send their category groups (Group 1, Group 2) to
Aggregator Group 1 and Aggregator Group 2]
Event Collection
Present
Flume features for category/aggregator groups
● Extend Interceptor to multiplex events into groups (see the sketch after this list)
● Implement a Memory Channel Group, providing a separate memory channel per category group
● ZooKeeper registration per category group, for service discovery
● Metrics per category group
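As a rough sketch of the first bullet, a custom Flume Interceptor that stamps each event with a category-group header derived from its category header, so a multiplexing channel selector can route it to that group's channel. The header names and the group map are illustrative assumptions, not Twitter's actual implementation:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

// Sketch of a multiplexing interceptor: maps a "category" header to a
// "category_group" header. Header names and the group map are assumptions.
public class CategoryGroupInterceptor implements Interceptor {
    private final Map<String, String> categoryToGroup;

    CategoryGroupInterceptor(Map<String, String> categoryToGroup) {
        this.categoryToGroup = categoryToGroup;
    }

    @Override public void initialize() {}

    @Override
    public Event intercept(Event event) {
        String category = event.getHeaders().get("category");
        String group = categoryToGroup.getOrDefault(category, "default_group");
        event.getHeaders().put("category_group", group);
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event e : events) intercept(e);
        return events;
    }

    @Override public void close() {}

    // Flume instantiates interceptors through a Builder.
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            Map<String, String> groups = new HashMap<>();
            groups.put("ads_click", "ads_group");
            groups.put("ads_view", "ads_group");
            groups.put("login_event", "login_group");
            return new CategoryGroupInterceptor(groups);
        }
        @Override public void configure(Context context) {}
    }
}
```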
Event Collection
Present
Flume performance improvements
● Increased HDFSEventSink batching, improving throughput ~5x and reducing spikes on the memory channel (a schematic sketch follows this list)
● Implemented buffering in HDFSEventSink instead of using SpillableMemoryChannel
● Stream events at close to network speed
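A minimal sketch of the batching idea: take many events per channel transaction and flush them in one write, rather than one transaction per event. This is schematic (the write step is a stub), not the actual HDFSEventSink change:

```java
import org.apache.flume.Channel;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.sink.AbstractSink;

// Schematic batching sink: drains up to BATCH_SIZE events inside a single
// channel transaction and writes them together, amortizing per-write overhead.
public class BatchingSinkSketch extends AbstractSink {
    private static final int BATCH_SIZE = 1000; // illustrative value

    @Override
    public Status process() throws EventDeliveryException {
        Channel channel = getChannel();
        Transaction txn = channel.getTransaction();
        txn.begin();
        try {
            int taken = 0;
            for (; taken < BATCH_SIZE; taken++) {
                Event event = channel.take();
                if (event == null) break; // channel drained
                appendToHdfsBuffer(event); // stub: buffer for one bulk write
            }
            flushHdfsBuffer(); // stub: single write/sync for the whole batch
            txn.commit();
            return taken == 0 ? Status.BACKOFF : Status.READY;
        } catch (Exception e) {
            txn.rollback();
            throw new EventDeliveryException("batch failed", e);
        } finally {
            txn.close();
        }
    }

    private void appendToHdfsBuffer(Event event) { /* stub */ }
    private void flushHdfsBuffer() { /* stub */ }
}
```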
Processing
Log Processor Stats
Processing a Trillion Events per Day
● >1 PB of data per day: the output of cleaned, compressed, consolidated, and converted logs
● 20-50% disk space saved by processing the Flume sequence files
● 8 wall clock hours to process one day of data
Log Processor Needs
Processing a Trillion Events per Day
● Make processing log data easier for analytics teams
● Disk space is at a premium on analytics clusters
● Too many files still cause increased pressure on the NameNode
● Log data is read many times, and different teams all perform the same pre-processing steps on the same data sets
Log Processor Steps (Datacenter 1)

[Diagram: Category Groups → Demux Jobs → Categories]
  ads_group/yyyy/mm/dd/hh   → ads_group_demuxer   → ads_click/yyyy/mm/dd/hh
                                                    ads_view/yyyy/mm/dd/hh
  login_group/yyyy/mm/dd/hh → login_group_demuxer → login_event/yyyy/mm/dd/hh
Log Processor Steps
1. Decode: Base64 encoding from the logged data
2. Demux: category groups into individual categories, for easier consumption by analytics teams
3. Clean: corrupt, empty, or invalid records, so data sets are more reliable
4. [Convert]: some categories into Parquet, for fastest use in ad-hoc exploratory tools
5. Consolidate: small files, to reduce pressure on the NameNode (see the sketch after this list)
6. Compress: logged data at the highest level to save disk space, moving from LZO level 3 to LZO level 7
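For the Consolidate step, a minimal sketch of one way to merge an hour's worth of small files into a single file, using Hadoop's FileUtil.copyMerge (available through Hadoop 2.x). This is an illustration, not the processor's actual consolidation job; the paths are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class ConsolidateSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Merge every small file under one hourly partition into a single
        // file, cutting the number of objects the NameNode must track.
        Path hourDir = new Path("/logs/ads_view/2017/05/01/23");
        Path merged = new Path("/logs/ads_view/2017/05/01/23.merged");
        FileUtil.copyMerge(fs, hourDir, fs, merged,
            /* deleteSource */ false, conf, /* addString */ null);
    }
}
```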
Why Base64 Decoding?
Legacy Choices
● Scribe's contract amounts to sending a binary blob to a port
● Scribe used newline characters to delimit records in a binary blob batch of records
● Valid records may themselves include newline characters
● Scribe therefore Base64-encoded received binary blobs, to avoid confusing record bytes with the record delimiter
● Base64 encoding is no longer necessary, because we have moved to one serialized Thrift object per binary blob
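A small sketch of why the encoding was needed: a record containing a newline would be split in two by newline-delimited framing, while its Base64 form is newline-free (java.util.Base64's basic encoder emits no line breaks), so it survives the framing and decodes back exactly:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64FramingSketch {
    public static void main(String[] args) {
        // A single logical record that happens to contain the delimiter.
        byte[] record = "field1\nfield2".getBytes(StandardCharsets.UTF_8);

        // Newline-delimited framing would split this record in two.
        System.out.println(
            new String(record, StandardCharsets.UTF_8).split("\n").length); // 2

        // The Base64 form contains no newlines, so it survives the framing...
        String encoded = Base64.getEncoder().encodeToString(record);
        System.out.println(encoded.contains("\n")); // false

        // ...and the processor's Decode step recovers the original bytes.
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(
            new String(decoded, StandardCharsets.UTF_8).equals("field1\nfield2")); // true
    }
}
```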
Log Demux Visual

[Diagram: the demux job reads one category-group sequence file and fans it out into
per-category LZO files]
  /raw/ads_group/yyyy/mm/dd/hh/ads_group_1.seq
    → DEMUX →
  /logs/ads_click/yyyy/mm/dd/hh/1.lzo
  /logs/ads_view/yyyy/mm/dd/hh/1.lzo
  /logs/ads_view/yyyy/mm/dd/hh/2.lzo
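A schematic of that fan-out, with plain local writers standing in for the job's real LZO/HDFS outputs; the record representation and paths are assumptions for illustration:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Schematic demux: route each (category, record) pair from a group file to a
// per-category output, opening one writer per category on first use.
public class DemuxSketch {
    private final Map<String, BufferedWriter> writers = new HashMap<>();

    void demux(List<Map.Entry<String, String>> groupRecords) throws IOException {
        for (Map.Entry<String, String> rec : groupRecords) {
            writerFor(rec.getKey()).write(rec.getValue() + "\n");
        }
        for (BufferedWriter w : writers.values()) w.close();
    }

    private BufferedWriter writerFor(String category) throws IOException {
        BufferedWriter w = writers.get(category);
        if (w == null) {
            // Stand-in for /logs/<category>/yyyy/mm/dd/hh/N.lzo
            Path out = Paths.get("out", category + ".part");
            Files.createDirectories(out.getParent());
            w = Files.newBufferedWriter(out);
            writers.put(category, w);
        }
        return w;
    }
}
```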
Log Processor Daemon
● One log processor daemon per RT Hadoop cluster, where Flume aggregates logs
● Primarily responsible for demuxing category groups out of the Flume sequence files
● The daemon schedules Tez jobs every hour for every category group, in a thread pool (see the scheduling sketch below)
● The daemon atomically presents processed category instances, so partial data can't be read
● Processing proceeds according to criticality of data, or "tiers"
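A minimal sketch of the hourly scheduling loop, assuming a hypothetical submitDemuxJob helper; the real daemon submits Tez DAGs and tracks their completion:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LogProcessorDaemonSketch {
    private final List<String> categoryGroups = Arrays.asList("ads_group", "login_group");

    // Pool of workers: one hourly tick fans out one job per category group.
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(4);

    void start() {
        scheduler.scheduleAtFixedRate(() -> {
            for (String group : categoryGroups) {
                // In the real daemon this submits a Tez demux DAG for the
                // group's latest hourly partition.
                submitDemuxJob(group);
            }
        }, 0, 1, TimeUnit.HOURS);
    }

    private void submitDemuxJob(String categoryGroup) { /* stub */ }
}
```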
Why Apache Tez?
● Some categories are significantly larger than other categories (KBs vs. TBs)
● MapReduce demux? Each reducer handles a single category
● Streaming demux? Each spout or channel handles a single category
● Massive skew when partitioning by category causes long-running tasks, which slows down job completion time
● Tez has relatively well understood fault-tolerance semantics, similar to MapReduce, Spark, etc.
Dynamic Partitioning
● Tez's dynamic hash partitioner adjusts partitions at runtime if necessary, allowing large partitions to be split further so that multiple tasks process events for a single category instead of one task (a toy illustration follows this list)
○ More info at TEZ-3209
○ Thanks to team member Ming Ma for the contribution!
● Easier horizontal scaling, while simultaneously providing more predictable processing times
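To illustrate the idea only (this is not Tez's implementation; see TEZ-3209 for that), a toy partitioner that assigns each cold category to a single partition but scatters a known-hot category across several partitions, so no single task must absorb terabytes alone:

```java
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Toy skew-aware partitioner. The fanout map (e.g. {"ads_view": 8}) is an
// assumption standing in for runtime skew detection.
public class SkewAwarePartitionerSketch {
    private final int numPartitions;
    private final Map<String, Integer> hotCategoryFanout;

    SkewAwarePartitionerSketch(int numPartitions, Map<String, Integer> hotCategoryFanout) {
        this.numPartitions = numPartitions;
        this.hotCategoryFanout = hotCategoryFanout;
    }

    int partitionFor(String category) {
        int base = Math.floorMod(category.hashCode(), numPartitions);
        Integer fanout = hotCategoryFanout.get(category);
        if (fanout == null) return base; // cold category: one partition
        // Hot category: scatter its records over `fanout` consecutive partitions.
        int offset = ThreadLocalRandom.current().nextInt(fanout);
        return (base + offset) % numPartitions;
    }
}
```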
[Typical Partitioning Visual: Input File 1 is partitioned across Tasks 1-3]

[Hash Partitioning Visual: the same Input File 1 is partitioned across Tasks 1-5]
Replication
Log Replication Stats
Replicating a Trillion Events per Day
● >24k copy jobs per day, across all analytics clusters
● >1 PB of data per day, replicated to the analytics clusters
● ~10 analytics clusters
Log Replication Needs
Replicating a Trillion Events per Day
● Collocate logs with compute and disk capacity for analytics teams
○ Cross-data center reads are incredibly expensive
○ Cross-rack reads within a data center are still expensive
● Too many users and jobs to fit into only the Hadoop RT clusters
● Hadoop RT clusters have a high SLA, so we cannot allow ad-hoc usage
● Critical data set copies and backups, in case of data center failures
Log Replication Visual

[Diagram: hourly category partitions in Datacenter 1 are copied by per-category
replication jobs into Datacenter N; a category such as ads_click may have more than one
replication job when it is copied to multiple destination clusters]
  ads_click/yyyy/mm/dd/hh   → ads_click_repl   → ads_click/yyyy/mm/dd/hh
  ads_view/yyyy/mm/dd/hh    → ads_view_repl    → ads_view/yyyy/mm/dd/hh
  login_event/yyyy/mm/dd/hh → login_event_repl → login_event/yyyy/mm/dd/hh
Log Replication Steps
1. Copy: logged data from all processing clusters to the target cluster
2. Merge: copied data into one directory
3. Present: data atomically, by renaming it to an accessible location (see the sketch below)
4. Publish: metadata to the Data Abstraction Layer, to notify analytics teams that data is ready for consumption
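A minimal sketch of the copy-then-present pattern using Hadoop's FileSystem API: write into a staging directory, then expose the partition with a single rename, which HDFS performs atomically from a reader's point of view. The paths are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AtomicPresentSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // 1-2. Copy/merge into a staging directory invisible to consumers.
        Path staging = new Path("/logs/.tmp/ads_click/2017/05/01/23");
        // ... copy and merge jobs write here ...

        // 3. Present: one HDFS rename, so the partition appears all at once
        // or not at all; readers never see partial data.
        Path finalDir = new Path("/logs/ads_click/2017/05/01/23");
        fs.mkdirs(finalDir.getParent());
        if (!fs.rename(staging, finalDir)) {
            throw new IllegalStateException("present failed for " + finalDir);
        }
        // 4. Publish: notify the Data Abstraction Layer the partition is ready.
    }
}
```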
Log Replicator Daemon
Distributing a Trillion Events per Day
● One log replicator daemon per DW, PROD, or COLD Hadoop cluster, where analytics users run queries and jobs
● Primarily responsible for copying category partitions out of the RT Hadoop clusters
● The daemons schedule Hadoop DistCp jobs every hour for every category (see the sketch below)
● The daemon atomically presents replicated category instances, so partial data can't be read
● Replication proceeds according to criticality of data, or "tiers"
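A rough sketch of one hourly DistCp submission, driven here through the standard hadoop CLI for simplicity (the real daemon drives DistCp jobs programmatically and tracks their state; the cluster URIs and paths are placeholders):

```java
public class ReplicationJobSketch {
    // Launch one DistCp copy for one category's hourly partition.
    // -update makes reruns idempotent for files that already match.
    static int runDistCp(String category, String hourPath) throws Exception {
        Process p = new ProcessBuilder(
                "hadoop", "distcp", "-update",
                "hdfs://rt-cluster/logs/" + category + hourPath,     // source: RT cluster
                "hdfs://dw-cluster/logs/.tmp/" + category + hourPath // staging on target
            ).inheritIO().start();
        return p.waitFor(); // 0 on success; then present + publish
    }

    public static void main(String[] args) throws Exception {
        System.exit(runDistCp("ads_click", "/2017/05/01/23"));
    }
}
```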
The Future
Future of Log Management
● Flume client improvements for tracing, SLAs, and throttling
● Flume client support for message validation before logging
● Centralized configuration management
● Processing and replication every 10 minutes instead of every hour
Thank You
Questions?