5. Scale of Event Log Aggregation
How many and how big?
● ~10PB of incoming data a day, uncompressed
● ~3-4.1 trillion events a day, across millions of clients
6. Events and Event Logs @Twitter
● Clients log events, each tagged with a category name.
○ E.g. ads_click, like_tweet_event …
○ Multiple formats: Thrift, JSON, etc. (see the sketch below)
● Events are grouped across all clients by category.
○ Clients can send events via REST endpoints, e.g. from the web.
○ Clients can generate events via logging libraries, e.g. a Thrift client library.
What is an event?
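A minimal sketch of a category-tagged event as described above; the field names and types are assumptions for illustration, since the deck only says that an event carries a category name and a serialized body in one of several formats:

```scala
// Hypothetical event shape: the deck specifies a category name plus a
// serialized body (Thrift, JSON, ...); field names here are illustrative.
final case class LogEvent(
  category: String,    // e.g. "ads_click", "like_tweet_event"
  timestampMs: Long,   // client-side event time
  payload: Array[Byte] // serialized body in whatever format the client uses
)
```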
7. Events and Event Logs @Twitter
● Events are stored on HDFS, bucketed every hour into separate directories (see the path sketch below).
○ E.g. /logs/ads_click/2020/09/01/23
● Events are delivered in various formats.
○ Parquet format.
○ Row-based thrift-lzo format.
● Event logs are replicated to other clusters.
○ Production clusters, ad hoc clusters, test clusters.
● Multiple consumers.
○ Presto, Spark, Scalding, etc.
○ Streaming systems.
How do we deliver events, and how are they consumed?
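A small sketch of the hourly bucketing, using only the /logs/&lt;category&gt;/YYYY/MM/DD/HH layout shown in the example path above (the helper name is an invention):

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

// Derive the hourly HDFS bucket for a category, following the
// /logs/<category>/YYYY/MM/DD/HH layout from the example above.
object HourlyBucket {
  private val fmt =
    DateTimeFormatter.ofPattern("yyyy/MM/dd/HH").withZone(ZoneOffset.UTC)

  def path(category: String, arrival: Instant): String =
    s"/logs/$category/${fmt.format(arrival)}"
}

// HourlyBucket.path("ads_click", Instant.parse("2020-09-01T23:15:00Z"))
//   == "/logs/ads_click/2020/09/01/23"
```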
9. Architecture
● Modularized and independently scalable
○ Client daemon
○ Aggregator daemon
● Users choose destinations based on need (see the routing sketch below)
○ Message-queue-based systems for stream processing
○ HDFS for batch processing
Overview
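A sketch of the destination choice described above; the routing-table shape and entries are assumptions, since the deck only says users pick message queues for streaming, HDFS for batch, or both:

```scala
// Hypothetical per-category routing: the deck says users choose destinations
// per need; this table shape is an illustrative assumption.
sealed trait Destination
case object MessageQueue extends Destination // stream processing
case object Hdfs extends Destination         // batch processing

val routing: Map[String, Set[Destination]] = Map(
  "ads_click"        -> Set(MessageQueue, Hdfs),
  "like_tweet_event" -> Set(Hdfs)
)
```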
10. Architecture
● The client daemon is a long-running process, one per host.
● It provides a simple interface to services on the same host and hides backend details.
● It leverages service discovery to forward events to aggregator daemons.
● Clients log events to the local client daemon, which can buffer to disk (see the sketch below).
Client daemon
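A minimal sketch of the client-side contract described above: services hand events to a daemon on the same host, and events can spill to disk when the daemon is unreachable. The port, wire framing, and spill path are all assumptions for illustration:

```scala
import java.net.Socket
import java.nio.file.{Files, Paths, StandardOpenOption}

// Sketch: services talk only to the local daemon; the daemon, not the
// service, knows about aggregators. Port, framing, and spill path are
// hypothetical.
object LocalClientDaemon {
  private val DaemonPort = 9999                         // assumed local port
  private val SpillFile  = Paths.get("/tmp/events.buf") // assumed disk buffer

  def log(category: String, body: Array[Byte]): Unit = {
    val line = (category + "\t").getBytes("UTF-8") ++ body ++ "\n".getBytes("UTF-8")
    try {
      val s = new Socket("127.0.0.1", DaemonPort)
      try s.getOutputStream.write(line) finally s.close()
    } catch {
      case _: java.io.IOException =>
        // daemon unreachable: buffer to disk so events survive restarts
        Files.write(SpillFile, line, StandardOpenOption.CREATE, StandardOpenOption.APPEND)
    }
  }
}
```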
12. Aggregators
● Aggregate events from a massive number of clients into per-category files (or per-category-group files, in the new framework) on HDFS.
● Write data in a YYYY/MM/DD/HH directory structure based on event arrival time.
● Deliver data to message-queue-based systems, depending on configuration.
● Send a back-off signal on high load or downstream failure (see the retry sketch below).
Overview
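A sketch of the back-pressure loop that the back-off signal implies: when an aggregator reports high load or downstream failure, the sender retries with exponential back-off. The retry bounds and function shape are assumptions:

```scala
// Hypothetical sender-side reaction to the aggregator's back-off signal.
def sendWithBackoff(send: () => Boolean, maxRetries: Int = 5): Boolean = {
  var attempt = 0
  var delayMs = 100L
  while (attempt < maxRetries) {
    if (send()) return true // batch accepted by the aggregator
    Thread.sleep(delayMs)   // back off before retrying
    delayMs = math.min(delayMs * 2, 5000L) // exponential, capped
    attempt += 1
  }
  false // give up; the caller may buffer to disk instead
}
```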
13. Aggregator
● ZooKeeper-based service discovery
○ DC failover support
○ Aggregators register as ephemeral nodes (see the sketch below).
● Tier-based approach
○ Categories are divided into tiers based on priority
○ Controls blast radius
○ Scales independently
○ Different semantics and parameters per tier
● Category-level parameter tuning
○ Batch size
○ Retry parameters such as timeouts
Service discovery
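A sketch of ephemeral-node registration using the plain ZooKeeper client API; the znode path is an assumption, and the deck does not say which client library is actually used. The point of EPHEMERAL is that the node vanishes when the aggregator's session dies, so dead aggregators drop out of discovery automatically:

```scala
import org.apache.zookeeper.{CreateMode, ZooDefs, ZooKeeper}

// Register an aggregator for discovery. The /aggregators/<tier> path is a
// hypothetical layout; EPHEMERAL ties the node's lifetime to the session.
def register(zk: ZooKeeper, tier: String, host: String, port: Int): String =
  zk.create(
    s"/aggregators/$tier/$host:$port",
    Array.emptyByteArray,
    ZooDefs.Ids.OPEN_ACL_UNSAFE,
    CreateMode.EPHEMERAL
  )
```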
15. Category group
● Scalability of HDFS
○ Small files caused by traffic distribution
○ Small files generated by low-traffic log categories
○ The file count is too large for the NameNode to handle (see the back-of-envelope sketch below)
● What is a category group?
○ Multiple categories with similar properties, grouped together by Flume and written to the same HDFS files.
Why and what?
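A back-of-envelope sketch of why per-category files strain the NameNode; all the numbers below are illustrative, not from the deck:

```scala
// Illustrative only: file count grows with categories x writers x hours.
val categories         = 1000    // distinct log categories (assumed)
val writersPerCategory = 10      // aggregators flushing in parallel (assumed)
val hourlyBuckets      = 24 * 30 // a month of hourly directories
val filesWithoutGrouping =
  categories.toLong * writersPerCategory * hourlyBuckets // 7,200,000 files
// Grouping k small categories into one file set divides the category
// factor by k, which is how category groups cut the file count >3x.
```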
16. Event Logs on HDFS
Category Group
How?
[Diagram: clients 1…N log to per-host client daemons, which forward to aggregators (including new aggregators) that write category groups in the new format.]
● Category groups are transparent to users.
● Category groups are configured in the implementation and invisible externally.
● Category groups are created based on traffic, output format, etc.
● Aggregators write the data of a category group into the same files (see the record sketch below).
● Category groups eliminated small files and reduced the file count by more than 3x under the current config.
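A sketch of what a grouped file implies for readers: since several categories share one file, each record must carry its category so consumers can filter. The record shape below is an assumption; the deck does not specify the on-disk encoding of the new format:

```scala
// Hypothetical record in a category-group file: the category travels with
// each record so readers can demultiplex a shared file.
final case class GroupedRecord(category: String, payload: Array[Byte])

def splitByCategory(records: Seq[GroupedRecord]): Map[String, Seq[Array[Byte]]] =
  records.groupBy(_.category).map { case (c, rs) => c -> rs.map(_.payload) }
```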
17. Aggregator Group
● Aggregator groups are configured in the implementation and invisible externally.
● Aggregator groups are configured based on traffic, priority, etc. (see the sketch below)
● They scale independently and provide resource isolation.
● Friendly for debugging, experiments, testing, and migration.
Why and how?
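A sketch of an aggregator-group assignment under assumed names: each group owns a disjoint set of category groups, so hot traffic is isolated and each group can be scaled or migrated on its own. The group names and host counts are invented:

```scala
// Hypothetical aggregator-group layout; names and sizes are illustrative.
final case class AggregatorGroup(hosts: Int, categoryGroups: Set[String])

val aggregatorGroups: Map[String, AggregatorGroup] = Map(
  "high-priority" -> AggregatorGroup(hosts = 200, categoryGroups = Set("ads")),
  "bulk"          -> AggregatorGroup(hosts = 50,  categoryGroups = Set("small-logs"))
)
```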
18. Single Aggregator Improvement
● Memory model improvement
○ Introduce memory channel groups; memory is shared within a group (see the sketch below)
○ Configure channel groups based on destination, data priority, etc.
○ Set a max memory usage per group to contain misbehaving clients
● Bounded APIs to tackle slowness
● Introduce micro-batching
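A minimal sketch of the shared per-group memory budget with a bounded put; using a semaphore as a byte budget is an illustrative mechanism, not necessarily how the actual channel group is implemented:

```scala
import java.util.concurrent.Semaphore

// Channels in a group draw from one shared byte budget; a hard cap keeps a
// misbehaving client from starving the rest. Semaphore-as-budget is a sketch.
final class MemoryChannelGroup(maxBytes: Int) {
  private val budget = new Semaphore(maxBytes)

  // Bounded put: fails fast instead of blocking when the group is full,
  // so the caller can apply back-pressure (cf. the back-off sketch earlier).
  def tryPut(event: Array[Byte]): Boolean = budget.tryAcquire(event.length)

  // Return the budget once the event has been drained downstream.
  def release(event: Array[Byte]): Unit = budget.release(event.length)
}
```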
19. Single Aggregator Improvement
● Batch isolation
● Improvement
○ Max lim
● Resource isolation
● Friendly for debugging, experiments, testing, and migration
Micro-batch benchmark