DATA QUALITY MONITORING IN REAL-TIME AND AT SCALE
ALEXIS SEIGNEURIN - @ASEIGNEURIN
OCTOBER 2017
MYSELF
• Data Engineer at
• aseigneurin.github.io or @aseigneurin
DATA QUALITY MONITORING
PART 1
THE PROJECT
• A few Kafka clusters, lots of topics
• Analyze all the messages of all the Kafka topics
• Count the number of valid or invalid messages per second
• Push metrics to InfluxDB, graph with Grafana
LOTS OF COMPLEXITY TO HANDLE
• Topics = multiple partitions → Final results must be aggregated
• High volumes: 100k+ messages / second
• Fault tolerance + exactly once processing
• Count per window of 1 second with low latency
• Event-time processing (Kafka 0.10+)
• Data can arrive late → Update results
EXAMPLE
              t0                    t1                 t2
      -------------------------------------------------------------
tx-0  | v | v | v | i | v |     | v | i |         | i | v | v |
      -------------------------------------------------------------
tx-1  | v | i | v | v |         | v | v | v |     | i | i | v | v |
      -------------------------------------------------------------
• t0
- 7 valid messages (4 in partition tx-0 + 3 in partition tx-1)
- 2 invalid messages (1 in each partition)
• t1
- 4 valid messages
- 1 invalid message
• t2
- 4 valid messages
- 3 invalid messages
KAFKA → KAFKA
• Kafka (raw data) → Kafka (metrics)
• Microservice (one function, only depends on Kafka)
$ kafka-console-consumer --topic data-checker-changelog --property print.key=true ...
{"topic":"tx","window":1501273548000,"status":"valid"} 7
{"topic":"tx","window":1501273548000,"status":"invalid"} 2
{"topic":"tx","window":1501273549000,"status":"valid"} 4
{"topic":"tx","window":1501273549000,"status":"invalid"} 1
{"topic":"tx","window":1501273550000,"status":"valid"} 4
{"topic":"tx","window":1501273550000,"status":"invalid"} 3
LATE DATA
• E.g. one more valid message with event time = t1
• Outputs one new metric:
• This topic is a change log
• Can use a compacted topic (see the sketch below)
{"topic":"tx","window":1501273549000,"status":"valid"} 4
...
{"topic":"tx","window":1501273549000,"status":"valid"} 5
KAFKA → INFLUXDB
• Kafka (metrics change log) → InfluxDB (time series of metrics)
• Microservice: one function, can be restarted independently
> select valid, invalid from tx
name: tx
time valid invalid
---- ----- -------
1501273548000000000 7 2
1501273549000000000 4 1
1501273550000000000 4 3
LATE DATA
• Only the latest value is stored in InfluxDB
• New change log item = update in InfluxDB
> select valid, invalid from tx where time=1501273549000000000
name: tx
time valid invalid
---- ----- -------
1501273549000000000 5 1
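A minimal sketch of this second microservice, assuming the influxdb-java client and the JSON key format shown above. The database name, credentials and the naive key parsing are illustrative assumptions, not the actual implementation.

import java.util.{Collections, Properties}
import java.util.concurrent.TimeUnit
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.influxdb.InfluxDBFactory
import org.influxdb.dto.Point
import scala.collection.JavaConverters._

object MetricsToInflux extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "metrics-to-influxdb")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[String, String](props)
  consumer.subscribe(Collections.singletonList("metrics"))

  val influx = InfluxDBFactory.connect("http://localhost:8086", "user", "password")

  // naive parsing of the JSON key shown above - illustration only
  val KeyPattern = """\{"topic":"([^"]+)","window":(\d+),"status":"([^"]+)"\}""".r

  while (true) {
    for (record <- consumer.poll(1000).asScala) {
      val KeyPattern(topic, window, status) = record.key
      val point = Point.measurement(topic)
        .time(window.toLong, TimeUnit.MILLISECONDS)   // window start = timestamp of the point
        .addField(status, record.value.toLong)        // field "valid" or "invalid"
        .build()
      // a point written at an existing timestamp replaces the previous value,
      // which is how late-data updates overwrite the old count
      influx.write("metrics", "autogen", point)
    }
  }
}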
FIRST IMPLEMENTATION WITH KAFKA STREAMS
PART 2
KAFKA STREAMS
• docs.confluent.io/current/streams/index.html
• Library to process data from Kafka
• Built on top of the Java Kafka client
• DSL + low-level API
• Leverages Consumer Groups → Horizontal scalability
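For context, a Kafka Streams application is just a JVM process: you build a topology and start it, and instances sharing the same application.id form a consumer group that splits the partitions between them. Below is a minimal bootstrap sketch using the plain Java API (with the StreamsBuilder introduced in Kafka 1.0; older versions used KStreamBuilder). The processing logic on the next slide is expressed with the author's Scala wrapper, but the bootstrap is the same idea.

import java.util.Properties
import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.{KafkaStreams, StreamsBuilder, StreamsConfig}

object DataCheckerStreamsApp extends App {
  val props = new Properties()
  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "data-checker")       // consumer group + prefix of internal topics
  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass)
  props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass)

  val builder = new StreamsBuilder()
  val messages = builder.stream[String, String]("tx")   // raw data topic
  // ... topology goes here (the next slide shows it with the Scala wrapper) ...

  val streams = new KafkaStreams(builder.build(), props)
  streams.start()
  sys.addShutdownHook(streams.close())
}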
IMPLEMENTATION
• Thin Scala wrapper for the Kafka Streams API
github.com/aseigneurin/kafka-streams-scala
messages
.map((_, message) => message match {
case _: GoodMessage => ("valid", 1)
case _: BadMessage => ("invalid", 1)
})
.groupByKey
.count(TimeWindows.of(1000), "metrics-agg-store")
.toStream
.map((k, v) => (MetricKey(inputTopic, k.window.start, k.key), v))
.to(metricsTopic)
REPARTITIONING
• Aggregations can only be done by key
• Repartition topic (internal topic) created by Streams
• (timestamps are preserved)
.map((_, message) => message match {
case _: GoodMessage => ("valid", 1)
case _: BadMessage => ("invalid", 1)
})
.groupByKey
valid 1
valid 1
invalid 1
valid 1
invalid 1
COUNTING
• Count per window of 1 second
• Streams creates an in-memory state store
• Backed by an internal change log (internal topic) for fault tolerance
.count(TimeWindows.of(1000), "metrics-agg-store")
OUTPUT
• Aggregation result is a KTable → Turn it into a KStream
• Must read:
Duality of Streams and Tables, Confluent
docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
• Write the result to a topic
.toStream
.map((k, v) => (MetricKey(inputTopic, k.window.start, k.key), v))
.to(metricsTopic)
OUTPUT & LATE DATA
• Write with a key to preserve ordering
• Must read:
The world beyond batch: Streaming 101, Tyler Akidau
www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
$ kafka-console-consumer --topic metrics --property print.key=true ...
{"topic":"tx","window":1501273548000,"status":"valid"} 7
{"topic":"tx","window":1501273548000,"status":"invalid"} 2
{"topic":"tx","window":1501273549000,"status":"valid"} 4
...
{"topic":"tx","window":1501273549000,"status":"valid"} 5
PROBLEM 1 - HOT PARTITION
• Most messages are valid → Repartitioning creates a hot partition
• Can only be processed by 1 thread
-----------------------------
repartition-0
-----------------------------
repartition-1 |v|v|v|v|v|v|v|v|v|v|v|v|v|v|
-----------------------------
repartition-2 |i|i|i|
-----------------------------
repartition-3
-----------------------------
PROBLEM 2 - INTERNAL TOPICS
• 2 “internal” topics per input topic
• repartition
• changelog
• Need to allocate threads to read from the repartition topic
• Cannot reuse these topics for multiple input topics
REIMPLEMENTATION WITH PLAIN KAFKA CONSUMERS
PART 3
DESIGN - KEY IDEAS
• No repartitioning
• Directly aggregate per partition
• Final aggregation (across partitions) made by InfluxDB
• A single change log topic shared by all instances
DESIGN - IDEAS FROM KAFKA STREAMS
• Multi-threaded with one state store per thread
• 1 thread = 1 or multiple partitions
• No sharing of state across threads
• State store backed by the change log
• Used to read the state in case of crash / repartitioning
• Event time processing + Handling of late data
• Expiration of old windows
AGGREGATE PER PARTITION
              t0                    t1                 t2
      -------------------------------------------------------------
tx-0  | v | v | v | i | v |     | v | i |         | i | v | v |
      -------------------------------------------------------------
tx-1  | v | i | v | v |         | v | v | v |     | i | i | v | v |
      -------------------------------------------------------------
• Metrics per partition
$ kafka-console-consumer --topic metrics --property print.key=true ...
{"topic":"tx","partition":0,"window":1501273548000,"status":"valid"} 4
{"topic":"tx","partition":0,"window":1501273548000,"status":"invalid"} 1
{"topic":"tx","partition":1,"window":1501273548000,"status":"valid"} 3
{"topic":"tx","partition":1,"window":1501273548000,"status":"invalid"} 1
{"topic":"tx","partition":0,"window":1501273549000,"status":"valid"} 1
{"topic":"tx","partition":0,"window":1501273549000,"status":"invalid"} 1
{"topic":"tx","partition":1,"window":1501273549000,"status":"valid"} 3
{"topic":"tx","partition":0,"window":1501273550000,"status":"valid"} 2
{"topic":"tx","partition":0,"window":1501273550000,"status":"invalid"} 1
{"topic":"tx","partition":1,"window":1501273550000,"status":"valid"} 2
{"topic":"tx","partition":1,"window":1501273550000,"status":"invalid"} 2
STORAGE IN INFLUXDB
• Add the partition number as a tag (see the sketch below)
> select valid, invalid from tx
name: tx
time partition valid invalid
---- --------- ----- -------
1501273548000000000 0 4 1
1501273548000000000 1 3 1
1501273549000000000 0 1 1
...
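With the influxdb-java client from the earlier sketch, this only changes how the point is built. Because the partition is a tag rather than a field, points sharing a timestamp but coming from different partitions belong to different series and do not overwrite each other. The helper below is illustrative, not the actual code.

import java.util.concurrent.TimeUnit
import org.influxdb.dto.Point

def buildPoint(topic: String, partition: Int, window: Long, status: String, count: Long): Point =
  Point.measurement(topic)
    .time(window, TimeUnit.MILLISECONDS)
    .tag("partition", partition.toString)   // tag, not field: tags are indexed and identify the series
    .addField(status, count)                // "valid" or "invalid" count
    .build()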
AGGREGATION WITH INFLUXDB
• Leverage InfluxDB’s aggregation functionality
• Supported by Grafana
> select sum(valid) as valid, sum(invalid) as invalid from "topic-health"
  where time >= 1501273548000000000 and time <= 1501273550000000000 group by time(1s)
name: topic-health
time valid invalid
---- ----- -------
1501273548000000000 7 2
1501273549000000000 4 1
1501273550000000000 4 3
CHANGE LOG
• Change log = Metrics topic + source offsets
• When partitions are assigned, populate the state store from the change log
• Filter using the topic + partition number
$ kafka-console-consumer --topic changelog --property print.key=true ...
{"topic":"tx","partition":0,"window":1501273548000,"status":"valid"} {"value":4,"offset":5}
{"topic":"tx","partition":0,"window":1501273548000,"status":"invalid"} {"value":1,"offset":4}
{"topic":"tx","partition":1,"window":1501273548000,"status":"valid"} {"value":3,"offset":4}
...
MAIN COMPONENTS
• StateStore
• In-memory store of active counts
• Cleanup every few minutes
• ChangelogWriter / ChangelogReader
• Write / read the state store to / from the change log topic
• DataCheckerThread
• Main consumer loop
• ConsumerRebalanceListener
• Reconfigures the application when a partition rebalancing happens
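A sketch of that last component, using Kafka's ConsumerRebalanceListener interface. Here restoreState and flushState stand in for the ChangelogReader / ChangelogWriter calls described above; their signatures are assumptions.

import java.util.{Collection => JCollection}
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener
import org.apache.kafka.common.TopicPartition
import scala.collection.JavaConverters._

class DataCheckerRebalanceListener(
    restoreState: Set[TopicPartition] => Unit,   // read the change log, filter on topic + partition, seek to saved offsets
    flushState: () => Unit                       // dump the current counts to the change log
) extends ConsumerRebalanceListener {

  // called before a rebalance takes effect: persist what this thread has so far
  override def onPartitionsRevoked(partitions: JCollection[TopicPartition]): Unit =
    flushState()

  // called with the new assignment (also on startup): rebuild the state store
  override def onPartitionsAssigned(partitions: JCollection[TopicPartition]): Unit =
    restoreState(partitions.asScala.toSet)
}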
LIFECYCLE
• On startup
• Wait for partitions to be assigned
• Read the change log → State store
• Start consuming data
• Every second
• Dump new values of the State Store to the change log
• Every 5 minutes
• Discard old windows
• On repartitioning = On startup
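A stripped-down sketch of that lifecycle in the main consumer loop (the DataCheckerThread). The validity check, the change log writer and the offset handling are simplified placeholders; only the timing of the steps follows the slide.

import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import scala.collection.JavaConverters._
import scala.collection.mutable

object DataCheckerLoop extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "data-checker")
  props.put("enable.auto.commit", "false")   // progress is tracked via the offsets stored in the change log
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[String, String](props)
  consumer.subscribe(Collections.singletonList("tx"))   // a rebalance listener (previous sketch) would be passed here

  // state store: (topic, partition, window start, status) -> count
  val stateStore = mutable.Map.empty[(String, Int, Long, String), Long].withDefaultValue(0L)

  def isValid(value: String): Boolean = value != null && value.nonEmpty   // placeholder validity check

  var lastFlush = System.currentTimeMillis
  var lastCleanup = System.currentTimeMillis

  while (true) {
    for (record <- consumer.poll(100).asScala) {
      val window = record.timestamp / 1000 * 1000   // event time, floored to a 1-second window
      val status = if (isValid(record.value)) "valid" else "invalid"
      stateStore((record.topic, record.partition, window, status)) += 1
    }

    val now = System.currentTimeMillis
    if (now - lastFlush >= 1000) {              // every second: dump new values to the change log
      // changelogWriter.write(stateStore) would go here
      lastFlush = now
    }
    if (now - lastCleanup >= 5 * 60 * 1000) {   // every 5 minutes: discard old windows (cutoff is an assumption)
      val cutoff = now - 5 * 60 * 1000
      stateStore.retain { case ((_, _, window, _), _) => window >= cutoff }
      lastCleanup = now
    }
  }
}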
CATCHING UP
• After an interruption
• Catches up at a different rate for each partition
SUMMARY
• New implementation
✓ Very robust: fault-tolerant, exactly-once processing
✓ Event-time + late data processing
✓ 10,000 messages per second with 2 threads
• Kafka Streams?
• Great library, but it would need a 2-step aggregation: per partition, then across partitions
• Flink?
• Cluster management :-/
• aseigneurin.github.io/2017/08/04/why-kafka-streams-didnt-work-for-us-part-1.html