Choose Right Stream Storage: Amazon Kinesis Data Streams vs MSK

Choose Right Stream Storage:
Kinesis Data Streams vs MSK
Sungmin, Kim
Solutions Architect, AWS
2020-10-07

Agenda
• Key Components of Real-time Analytics
• Anatomy of Amazon Kinesis Data Streams and MSK
• Comparing Amazon Kinesis Data Streams to MSK
• Monitoring Metrics
• Reference Architecture
• Key Takeaways

Key Components of Real-time Analytics

From Batch to Real-time: Lambda Architecture
[Diagram: Data Source → Stream Ingestion → Stream Storage (Streaming Data)
 Speed Layer: Stream Process → Real-time View
 Batch Layer: Stream Delivery → Raw Data Storage → Batch Process → Batch View
 Service Layer: Query & Merge Results → Consumer]

Lambda Architecture
[Diagram: Streaming Data feeds both layers
 Batch Layer: Raw Data → Batch Process → Batch View
 Speed Layer: Stream Process → Real-time View
 Serving Layer: each Query reads the Batch View and the Real-time View]
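The serving-layer merge in the diagram above can be sketched in a few lines of Python. This is only an illustration: the page-view counters and the `merge_views` helper are hypothetical names, not part of the original slides, and a real serving layer would query two stores rather than two dictionaries.

```python
from collections import Counter

def merge_views(batch_view: dict, realtime_view: dict) -> dict:
    """Answer a query by combining the batch view with the real-time view."""
    merged = Counter(batch_view)
    merged.update(realtime_view)  # adds counts key by key
    return dict(merged)

# Hypothetical page-view counts.
batch_view = {"page_a": 100, "page_b": 40}   # precomputed by the batch layer
realtime_view = {"page_a": 3, "page_c": 1}   # accumulated by the speed layer since the last batch run

print(merge_views(batch_view, realtime_view))
```

The batch view is complete but stale; the real-time view is fresh but covers only recent data, which is why a query needs both.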

Key Components of Real-time Analytics
Data Source → Stream Ingestion → Stream Storage → Stream Process → Data Sink
• Data Source: Devices and/or applications that produce real-time data at high velocity
• Stream Ingestion: Data from tens of thousands of data sources can be written to a single stream
• Stream Storage: Data is stored in the order it was received for a set duration of time and can be replayed any number of times during that time
• Stream Process: Records are read in the order they are produced, enabling real-time analytics or streaming ETL
• Data Sink: Data lake (most common) or database (least common)

Stream Storage
Data Source → Stream Ingestion → Stream Storage → Stream Process → Data Sink
Stream Storage options: Amazon Kinesis Data Streams, Amazon Managed Streaming for Kafka

Anatomy of Amazon Kinesis Data Streams and MSK

Key Features of Kinesis Data Streams and MSK
• Distributed Queue • Stream Storage
#Queue #Distributed #Storage

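#Queue: FIFO, Scale-Up vs Scale-Out
[Diagram: Producers append records (0 to 5) to a queue and the Consumer reads them from the oldest data to the newest data; one large queue (scale-up) is contrasted with several smaller queues (scale-out)]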

#Distributed: Scale-Out
[Diagram: Producers attach a partition key (PK) to each record; a Hash Function maps the key to shard/partition-1, shard/partition-2, or shard/partition-3. Each shard/partition holds its own records ordered from oldest to newest, and the Consumers in a Consumer Group read the shards/partitions in parallel]
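The routing step in the diagram above can be sketched as follows. This is a simplification under stated assumptions: Kinesis Data Streams hashes the partition key with MD5 and assigns each shard a range of the 128-bit hash space, while Kafka's default partitioner uses a murmur2 hash; the modulo below stands in for both, and `shard_for_key` is a hypothetical helper.

```python
import hashlib

def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard index via a hash function (simplified)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")  # 128-bit integer
    return hash_value % num_shards              # assumption: modulo instead of hash-key ranges

# Records with the same key always land in the same shard,
# which is what preserves per-key ordering.
for key in ["device-1", "device-2", "device-1"]:
    print(key, "->", shard_for_key(key, num_shards=3))
```

A poorly chosen partition key (for example, a constant) sends every record to one shard, so scale-out only helps when keys are spread evenly.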

#Storage: Stream Buffer
[Diagram: Producers → Hash Function on the partition key (PK) → shard/partition-1, shard/partition-2, shard/partition-3 → three Consumers in a Consumer Group. Each shard/partition keeps its records ordered from oldest to newest, with a marker for the next consumer offset, so records remain in the buffer after they are read]
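A minimal in-memory sketch of the stream buffer idea above: records are appended to a log, and each consumer group only moves its own "next consumer offset". The `StreamBuffer` class and its method names are hypothetical, and retention (expiring old records) is left out for brevity.

```python
class StreamBuffer:
    """One shard/partition: an append-only log plus per-group read offsets."""

    def __init__(self):
        self.records = []   # position in the list == offset
        self.offsets = {}   # consumer group -> next offset to read

    def put(self, record):
        self.records.append(record)
        return len(self.records) - 1   # offset assigned to the record

    def poll(self, group, max_records=10):
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)   # reading does not delete anything
        return batch

    def seek(self, group, offset):
        self.offsets[group] = offset   # rewind to replay older records

buf = StreamBuffer()
for i in range(5):
    buf.put(f"record-{i}")

print(buf.poll("analytics", max_records=3))  # first three records
print(buf.poll("archiver"))                  # an independent group still sees all five
```

Because reads are non-destructive, several consumer groups can process the same data at their own pace, and a group can rewind with `seek` to reprocess.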

Anatomy of Amazon Kinesis Data Streams and Amazon Managed Streaming for Kafka
[Diagram: Producers → Hash Function on the partition key (PK) → shard/partition-1, shard/partition-2, shard/partition-3 → Consumers in a Consumer Group. Records in each shard/partition are ordered from oldest to newest, with a marker for the next consumer offset]

Benefits of Stream Storage
• Decouple producers & consumers
• Persistent buffer
• Collect multiple streams
• Preserve client ordering
• Parallel consumption
• Streaming MapReduce

Comparing Amazon Kinesis Data Streams to MSK

Comparing Kinesis Data Streams to MSK
[Diagram: Amazon Kinesis Data Streams shown next to Amazon Managed Streaming for Kafka; the only surviving label is "Topic"]

Amazon Kinesis Data Streams:
• Operational Perspective
  • Number of Kinesis Data Streams?
  • Number of shards per stream?
• Throughput provisioning model
• Increase/decrease the number of shards
• Full integration with AWS services such as Lambda functions, Kinesis Data Analytics, etc.
Amazon Managed Streaming for Kafka:
• Operational Perspective
  • Number of clusters?
  • Number of brokers per cluster?
  • Number of topics per broker?
  • Number of partitions per topic?
• Cluster provisioning model
• Can only increase the number of partitions; can't decrease it
• Integration with a few AWS services, such as Kinesis Data Analytics for Java

Metrics to Monitor: MSK (Kafka)
• RequestQueue: Length, WaitTime
• ResponseQueue: Length, WaitTime
• Network: packet drops?
• Produce/consume rate imbalance
• Who is the leader?
• Disk full?
• Too many topics?

Metrics to Monitor: MSK (Kafka)
Metric | Level | Description
ActiveControllerCount | DEFAULT | Only one controller per cluster should be active at any given time.
OfflinePartitionsCount | DEFAULT | Total number of partitions that are offline in the cluster.
GlobalPartitionCount | DEFAULT | Total number of partitions across all brokers in the cluster.
GlobalTopicCount | DEFAULT | Total number of topics across all brokers in the cluster.
KafkaAppLogsDiskUsed | DEFAULT | The percentage of disk space used for application logs.
KafkaDataLogsDiskUsed | DEFAULT | The percentage of disk space used for data logs.
RootDiskUsed | DEFAULT | The percentage of the root disk used by the broker.
PartitionCount | PER_BROKER | The number of partitions for the broker.
LeaderCount | PER_BROKER | The number of leader replicas.
UnderMinIsrPartitionCount | PER_BROKER | The number of under minIsr partitions for the broker.
UnderReplicatedPartitions | PER_BROKER | The number of under-replicated partitions for the broker.
FetchConsumerTotalTimeMsMean | PER_BROKER | The mean total time in milliseconds that consumers spend on fetching data from the broker.
ProduceTotalTimeMsMean | PER_BROKER | The mean produce time in milliseconds.
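As a rough illustration of how the metrics above might be turned into alerts, the sketch below checks one snapshot of metric values. The `msk_alerts` function, the input dictionary shape, and the 85% disk threshold are all assumptions made for this example; they are not MSK defaults.

```python
def msk_alerts(metrics: dict, disk_threshold_pct: float = 85.0) -> list:
    """Return human-readable alerts for one snapshot of MSK metric values."""
    alerts = []
    # Exactly one controller should be active; a missing value is treated as unhealthy.
    if metrics.get("ActiveControllerCount") != 1:
        alerts.append("ActiveControllerCount should be exactly 1")
    # These counts should be zero in a healthy cluster.
    for name in ("OfflinePartitionsCount", "UnderMinIsrPartitionCount",
                 "UnderReplicatedPartitions"):
        if metrics.get(name, 0) > 0:
            alerts.append(f"{name} is {metrics[name]}, expected 0")
    # Disk usage metrics are percentages; the threshold is a hypothetical choice.
    for name in ("KafkaAppLogsDiskUsed", "KafkaDataLogsDiskUsed", "RootDiskUsed"):
        if metrics.get(name, 0.0) >= disk_threshold_pct:
            alerts.append(f"{name} is at {metrics[name]}%")
    return alerts

snapshot = {"ActiveControllerCount": 1, "OfflinePartitionsCount": 0,
            "UnderReplicatedPartitions": 2, "KafkaDataLogsDiskUsed": 91.5}
for alert in msk_alerts(snapshot):
    print(alert)
```

In practice the same conditions would more likely be expressed as CloudWatch alarms; the point here is only which metrics have an obvious "healthy" value.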

How about monitoring Kinesis Data Streams?
How long does a record stay in a shard?
• Writes: up to 1 MB or 1,000 records per second, per shard
• Reads (the Consumer Application calls GetRecords()): 5 transactions per second, per shard
  • With only one consumer application, records can be retrieved every 200 ms
  • Up to 10 MB or 10,000 records per call (sustained reads are limited to 2 MB per second, per shard)
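The per-shard figures on this slide lend themselves to a quick back-of-the-envelope calculation. The sketch below uses only the write limits (1 MB or 1,000 records per second, per shard) and the read limit of 5 transactions per second, per shard; the function names are hypothetical and the result is a lower bound, not a sizing recommendation.

```python
import math

WRITE_MB_PER_SEC_PER_SHARD = 1.0
WRITE_RECORDS_PER_SEC_PER_SHARD = 1000
READ_TRANSACTIONS_PER_SEC_PER_SHARD = 5

def min_shards_for_writes(mb_per_sec: float, records_per_sec: float) -> int:
    """Smallest shard count that satisfies both write limits."""
    by_bytes = math.ceil(mb_per_sec / WRITE_MB_PER_SEC_PER_SHARD)
    by_records = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC_PER_SHARD)
    return max(1, by_bytes, by_records)

def min_poll_interval_ms(num_consumer_apps: int) -> float:
    """How often each application can call GetRecords on a shard when
    several applications share the 5 transactions per second."""
    return num_consumer_apps * 1000 / READ_TRANSACTIONS_PER_SEC_PER_SHARD

print(min_shards_for_writes(mb_per_sec=3.5, records_per_sec=2000))  # limited by bytes
print(min_shards_for_writes(mb_per_sec=0.5, records_per_sec=4500))  # limited by record count
print(min_poll_interval_ms(1), min_poll_interval_ms(3))
```

With one consumer application the interval is the 200 ms quoted on the slide; every additional application that polls the same shard stretches it, which is one reason records can sit in a shard longer than expected.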

Metrics to Monitor: Kinesis Data Streams
Metric | Description
GetRecords.IteratorAgeMilliseconds | Age of the last record in all GetRecords calls
ReadProvisionedThroughputExceeded | Number of GetRecords calls throttled
WriteProvisionedThroughputExceeded | Number of PutRecord(s) calls throttled
PutRecord.Success, PutRecords.Success | Number of successful PutRecord(s) operations
GetRecords.Success | Number of successful GetRecords operations
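One way to watch the iterator age listed above is to query CloudWatch for it. The sketch below only builds the request parameters and summarizes returned datapoints, so it runs without AWS credentials; the helper names and the stream name `demo-stream` are hypothetical, while the `AWS/Kinesis` namespace and the `StreamName` dimension are the ones Kinesis Data Streams publishes under.

```python
from datetime import datetime, timedelta, timezone

def iterator_age_query(stream_name: str, minutes: int = 10) -> dict:
    """Parameters for a CloudWatch GetMetricStatistics request."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": "GetRecords.IteratorAgeMilliseconds",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,               # one datapoint per minute
        "Statistics": ["Maximum"],
    }

def max_iterator_age_ms(datapoints: list) -> float:
    """Largest iterator age in a list of CloudWatch datapoints (0.0 if empty)."""
    return max((dp["Maximum"] for dp in datapoints), default=0.0)

params = iterator_age_query("demo-stream")
print(params["MetricName"], params["Dimensions"])
```

With boto3 installed and credentials configured, the dictionary can be passed as `boto3.client("cloudwatch").get_metric_statistics(**params)`, and the response's `Datapoints` list fed to `max_iterator_age_ms`. An iterator age that keeps growing means consumers are falling behind producers.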

Choosing Right Metrics
Too Much = Useless = Too Little

Kafka vs MSK vs Kinesis Data Streams
[Diagram: Kinesis Data Streams, Amazon MSK, and Kafka placed along two axes: Operational Excellence and Degree of Freedom (≈ Complexity)]

Comparison Summary
Attribute | Apache Kafka | Kinesis Streams | Managed Streaming for Kafka
Cost | $$$ | $ (pay for what you use) | $$ (pay for infrastructure)
Ease of use | Advanced setup required | Get started in minutes | Get started in minutes
Management Overhead | High | Low | Low
Scalability | Difficult to scale | Scale in seconds with one click | Scale in minutes with one click
Throughput | Infinite | Scales with shards, supports up to 1 MB payloads | Infinite
Durability | Configurable | 3x by default | Configurable
Infrastructure | You manage | AWS manages | AWS manages
Write-to-Read Latency | <100 ms is achievable | <100 ms (with HTTP/2) | <100 ms is achievable
Open Sourced? | Yes | No | Yes

Data Hub: (Asynchronous) Event-Bus

Example Usage Pattern 1: Data Hub
[Diagram: Kinesis Data Streams (or Amazon MSK) as the hub, connected to Kinesis Data Firehose, Amazon S3, Amazon EC2, AWS Lambda, Amazon ECS, Kinesis Data Analytics, Amazon ES, Amazon Athena, and Amazon CloudWatch]
https://aws.amazon.com/solutions/case-studies/autodesk-log-analytics/

Log Aggregation
[Diagram: Web server access logs → Aggregated logs]

Example Usage Pattern 2: Web Analytics and Leaderboards
[Diagram: Lightweight JS client code (with Amazon Cognito) OR a web server on Amazon EC2 → Amazon Kinesis Data Streams (or Amazon MSK) → Amazon Kinesis Data Analytics → Lambda function → Amazon DynamoDB]
• Ingest web app data
• Compute top 10 users
• Persist to feed live apps
https://aws.amazon.com/solutions/implementations/real-time-web-analytics-with-kinesis/

IoT
[Diagram: IoT Things produce Events that feed Analytics and AI/ML]
• Remote Control
• Prediction/Fraud Detection
• Device Monitoring
• Quality Control
• Data Visualization

Example Usage Pattern 3: Monitoring IoT Devices
[Diagram: Raspberry Pi + Sense HAT → MQTT Topic → AWS IoT Core → IoT rule → Amazon Kinesis Data Streams → AWS Glue Streaming Job → S3 Bucket → Glue Crawler → Glue Data Catalog → Amazon Athena; the AWS services run inside the AWS Cloud]
• Ingest sensor data
• Convert JSON to Parquet
• Store all data points in an S3 data lake
https://aws.amazon.com/blogs/aws/new-serverless-streaming-etl-with-aws-glue/

Event Sourcing and CQRS
[Diagram: App Write Interface → Event Queue / Event Store (Kafka Topic) → Event Handler + App State (Kafka Streams Topology with a Kafka Streams State Store) → Application State → App Read Interface]
https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/
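The split between the write interface, the event store, and the read interface can be sketched without Kafka at all. In the example below a Python list stands in for the Kafka topic and a dictionary for the state store; the bank-account events and the function names are hypothetical and serve only to show the pattern.

```python
from collections import defaultdict

def handle_write(event_store: list, account: str, amount: int) -> None:
    """Write side: append an event describing what happened; never update state in place."""
    event_store.append({"account": account, "amount": amount})

def project_balances(event_store: list) -> dict:
    """Event handler: fold the event log into application state for the read side."""
    balances = defaultdict(int)
    for event in event_store:
        balances[event["account"]] += event["amount"]
    return dict(balances)

event_store = []                       # stands in for the Kafka topic
handle_write(event_store, "alice", 100)
handle_write(event_store, "bob", 50)
handle_write(event_store, "alice", -30)

app_state = project_balances(event_store)   # read side queries this view
print(app_state)
```

Because the event store keeps every event in order, the application state can be rebuilt at any time by replaying the log, which is the property stream storage provides.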

Example Usage Pattern 4: Streaming SQL
[Diagram: Event Store (Amazon Kinesis Data Streams) → Event Handler + App State join (Amazon Kinesis Data Analytics (SQL)), with reference data read from an S3 Bucket]
• Continuous filter
• Aggregate function
• Data enrichment (join)
• Anomaly Detection
Stream records:
{"TICKER_SYMBOL": "CVB",
"SECTOR": "TECHNOLOGY",
"CHANGE": 0.81,
"PRICE": 53.63}
{"TICKER_SYMBOL": "ABC",
"SECTOR": "RETAIL",
"CHANGE": -1.14,
"PRICE": 23.64}
{"TICKER_SYMBOL": "JKL",
"SECTOR": "TECHNOLOGY",
"CHANGE": 0.22,
"PRICE": 15.32}
Reference data (S3 Bucket):
Ticker, Company
AMZN, Amazon
ASD, SomeCompanyA
BAC, SomeCompanyB
CRM, SomeCompanyC
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/examples.html
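To make the three operations on this slide concrete, here is a small Python sketch over the sample records and reference data shown above. It is not Kinesis Data Analytics SQL: the function names are hypothetical, the aggregate runs over the whole list rather than a time window, and the AMZN record is an invented example added so the join has a match.

```python
from collections import defaultdict

records = [
    {"TICKER_SYMBOL": "CVB", "SECTOR": "TECHNOLOGY", "CHANGE": 0.81, "PRICE": 53.63},
    {"TICKER_SYMBOL": "ABC", "SECTOR": "RETAIL", "CHANGE": -1.14, "PRICE": 23.64},
    {"TICKER_SYMBOL": "JKL", "SECTOR": "TECHNOLOGY", "CHANGE": 0.22, "PRICE": 15.32},
]
companies = {"AMZN": "Amazon", "ASD": "SomeCompanyA",
             "BAC": "SomeCompanyB", "CRM": "SomeCompanyC"}

def continuous_filter(stream, sector):
    """Continuous filter: pass through only records of one sector."""
    for record in stream:
        if record["SECTOR"] == sector:
            yield record

def average_price_by_sector(stream):
    """Aggregate function: average PRICE per SECTOR."""
    totals = defaultdict(lambda: [0.0, 0])   # sector -> [sum of prices, count]
    for record in stream:
        totals[record["SECTOR"]][0] += record["PRICE"]
        totals[record["SECTOR"]][1] += 1
    return {sector: total / count for sector, (total, count) in totals.items()}

def enrich(stream, reference):
    """Data enrichment (join): add COMPANY from reference data, None if unknown."""
    for record in stream:
        yield {**record, "COMPANY": reference.get(record["TICKER_SYMBOL"])}

print([r["TICKER_SYMBOL"] for r in continuous_filter(records, "TECHNOLOGY")])
print(average_price_by_sector(records))

# Hypothetical record, not from the slide, whose ticker exists in the reference data.
extra = {"TICKER_SYMBOL": "AMZN", "SECTOR": "TECHNOLOGY", "CHANGE": 1.5, "PRICE": 100.0}
print([r["COMPANY"] for r in enrich(records + [extra], companies)])
```

In a streaming SQL engine the same filter and join are declared once and applied to every arriving record, and aggregates are computed per window instead of over a finite list.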

Lambda vs Kappa Architecture
[Diagram: the Lambda architecture shown side by side with the Kappa architecture]

Key Takeaways
• Distributed Queue as Stream Storage
  • Preserve Ordering
  • Parallel Consumption
  • Persistent Buffer
  • Decouple producers & consumers
• Trade-off: Operational Excellence vs Degree of Freedom
• MUST keep an eye on the right monitoring metrics
• Architectural Patterns
  • Data Hub: (Asynchronous) Event-Bus
  • Log Aggregation
  • IoT
  • Event Sourcing and CQRS

Where To Go Next?
• Amazon MSK Labs
https://amazonmsk-labs.workshop.aws/
• Amazon Managed Streaming for Kafka: Best Practices
https://docs.aws.amazon.com/msk/latest/developerguide/bestpractices.html
• Monitoring Kafka performance metrics (2020-04-16)
https://tinyurl.com/y6hrhwbq
• Understanding and Optimizing Metrics for Apache Kafka Monitoring (2018-11)
https://tinyurl.com/y4uwyenx
• AWS Analytics Immersion Day - Build BI System from Scratch
• Workshop - https://tinyurl.com/yapgwv77
• Slides - https://tinyurl.com/ybxkb74b
• Realtime Analytics on AWS
https://tinyurl.com/y3evwm3v
• Writing SQL on Streaming Data with Amazon Kinesis Analytics – Part 1, 2
• Part1 - https://tinyurl.com/y8vo8q7o
• Part2 - https://tinyurl.com/ycbv7wel
