AWS Community Nordics Virtual Meetup

Serverless Data
Streaming at Scale
Anahit Pogosova
Lead Cloud Software Engineer (Solita / Yle)
20.10.2020

o Who, What & Why?
o Under the Hood
o Gotchas and Lessons Learned

• Lead Cloud Software Engineer
• Part of the Data & AI team at Finland’s national public broadcasting company
• AWS Community Builder
• 10+ years, “full stack”, all kinds of stuff
Me

• Yle Areena, the biggest streaming service in Finland
• Areena recommendations
• Areena image personalisation
• Automatic image extraction
• Article recommendations (yle.fi)
• Smart notifications (Yle Uutisvahti)
• … and more
Yle

• Data!
• user interaction and content metadata
• Collecting the data
• Storing the data
• Visualizing the data
• Utilizing the data (ML & AI)
• To understand the customers
• To help provide better service for everyone
{
"adobe":true,
"is_heartbeat":true,
"collectorreceived":1555267233549,
...
"s:asset:name":"Yle TV1",
"s:event:type":"start",
"s:meta:category":"nettitv",
"s:meta:content_type":"livetv",
"s:meta:ns_st_st":"yle tv1",
...
"s:meta:title":"eduskuntavaalit 2019 - tulosilta",
...
"s:meta:yle.vrsContent":"video",
"s:meta:yle.vrsDevice":"android",
"s:meta:yle.vrsPlatform":"mobile",
"s:meta:yle.vrsProduct":"areena",
"s:meta:yle_client":"android.areena.481-b4ce224bf",
"s:meta:yle_language":"fi",
"s:sp:channel":"yleisradio",
"s:sp:hb_version":"android-2.2.1.214-d5c678",
"s:user:mid":"71057009616815049761612335654599557361"
}
Yle

• ~ 500 000 000 requests per day
• ~ 600 000 requests per minute (rpm) during prime time
• > 0.5 TB of event data per day
• Apache Parquet
• JSON
• Max so far: ~ 2.5 million rpm
• elections + hockey finals
Yle

Agenda
Load the data to the data lake in a columnar format
Enable content personalization through near real-time analytics

No server is easier to manage than
“no server”.
Dr. Werner Vogels
(CTO, Amazon)

Kinesis Data Streams
• Fully managed and massively scalable service to stream data
• Data available in milliseconds and stored from 24 hours up to 7 days
• Custom stream processing with consumers
• Shard is the unit of parallelism
• In: 1 MB/sec or 1 000 records/sec per shard
• Out: 2 MB/sec per shard
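The per-shard write limits translate directly into a minimum shard count. The sketch below is a rough estimate only: the helper name `estimateShards` and the 1 KiB average record size are assumptions for illustration, and the load is assumed to be spread evenly across seconds and partition keys.

```javascript
// Per-shard write limits quoted on the slide (1 MB is taken as 1 MiB here).
const SHARD_MAX_RECORDS_PER_SEC = 1000;
const SHARD_MAX_BYTES_PER_SEC = 1024 * 1024;

// Hypothetical helper: minimum number of shards for an evenly spread load.
function estimateShards(recordsPerMinute, avgRecordBytes) {
  const recordsPerSec = recordsPerMinute / 60;
  const bytesPerSec = recordsPerSec * avgRecordBytes;
  const byRecords = Math.ceil(recordsPerSec / SHARD_MAX_RECORDS_PER_SEC);
  const byBytes = Math.ceil(bytesPerSec / SHARD_MAX_BYTES_PER_SEC);
  return Math.max(1, byRecords, byBytes);
}

// Prime-time load from the earlier slide, with an assumed 1 KiB record size.
console.log(estimateShards(600000, 1024)); // 10
```

Real traffic is bursty, so an estimate like this is a lower bound rather than a sizing recommendation.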

Amazon Kinesis Agent
• Stand-alone Java application to stream data from files
Service Integrations
• CloudWatch Logs
• CloudWatch Events
• AWS IoT
• DB Migration Service
• API Gateway
Amazon Kinesis Producer Library (KPL)
• Provides higher level of abstraction over API calls
Amazon Kinesis API (AWS SDK)
• Most flexible
• Allows full control over writing data
Kinesis Data Streams, Writing Data

• putRecord(params, callback)
• putRecords(params, callback)
• Up to 500 records
• Up to 5 MiB
Kinesis Data Streams, AWS SDK
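Because a single putRecords call takes at most 500 records and 5 MiB, larger sets of records have to be split before sending. The sketch below shows one way to do that; `toBatches` and `recordSize` are hypothetical helper names, and the record shape (`Data`, `PartitionKey`) follows the entries of the `Records` parameter.

```javascript
// Request limits quoted on the slide.
const MAX_RECORDS_PER_REQUEST = 500;
const MAX_BYTES_PER_REQUEST = 5 * 1024 * 1024;

// Size of one entry: payload plus partition key, in bytes.
function recordSize(record) {
  return Buffer.byteLength(record.Data) + Buffer.byteLength(record.PartitionKey);
}

// Split records into batches that respect both the count and the size limit.
function toBatches(records) {
  const batches = [];
  let current = [];
  let currentBytes = 0;
  for (const record of records) {
    const size = recordSize(record);
    const full =
      current.length === MAX_RECORDS_PER_REQUEST ||
      currentBytes + size > MAX_BYTES_PER_REQUEST;
    if (full && current.length > 0) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(record);
    currentBytes += size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Each resulting batch can then be passed as `Records` in the params of one putRecords call.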

putRecords(params, callback)
• Request failure
• Retries by default up to 3 times
• Uses exponential backoff
• Base delay by default is 100 ms
Kinesis Data Streams, AWS SDK
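To get a feel for the numbers, the sketch below computes the upper bound of the delay before each of the 3 default retries with a 100 ms base delay. It is a plain exponential schedule written for illustration; `backoffCeilingMs` is a hypothetical name, and the SDK's actual delay calculation may differ (for example by randomizing the delay below this ceiling).

```javascript
// Upper bound of the delay before retry number `retryCount` (0-based).
function backoffCeilingMs(retryCount, baseMs = 100) {
  return baseMs * Math.pow(2, retryCount);
}

// Three retries with the default base delay.
const schedule = [0, 1, 2].map((n) => backoffCeilingMs(n));
console.log(schedule); // [ 100, 200, 400 ]
```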

Lambda
o One Lambda is invoked per shard by default
• NEW(ish)! Parallelization factor (max 10)
• Up to 10 times as many concurrent Lambdas as there are shards!

o Lambda is invoked once per second, or when:
• the number of records reaches the configured batch size (max 10 000 records)
• the batch reaches the payload limit of a synchronous Lambda invocation (6 MB)
• NEW(ish)! the batch window expires (max 5 min)
Lambda
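On the consuming side, each invocation receives one batch of records from a single shard, with every payload base64-encoded under `kinesis.data`. A minimal sketch of a consumer is shown below; `summarizeBatch` and the "count start events" step are made up for illustration, while the field `s:event:type` comes from the sample event shown earlier.

```javascript
// Decode one Kinesis record; Lambda delivers the payload base64-encoded.
function decodeRecord(record) {
  const json = Buffer.from(record.kinesis.data, "base64").toString("utf8");
  return JSON.parse(json);
}

// Hypothetical processing step: count the "start" events in the batch.
function summarizeBatch(event) {
  const events = event.Records.map(decodeRecord);
  const starts = events.filter((e) => e["s:event:type"] === "start").length;
  return { received: events.length, starts };
}

// Handler sketch; in a real function this would be exported as the handler.
async function handler(event) {
  return summarizeBatch(event);
}
```

If the handler throws, the whole batch counts as failed, which is where the error handling settings come in.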

Before..
• Lambda retries the batch until success or data expiration
• No other batches are processed from the shard (aka poison pill)!
After!
• Maximum retry attempts (max 10 000)
• Maximum record age (1 min – 7 days)
• Bisect batch on function failure
• On-failure destination (SQS or SNS)
Lambda, Error Handling
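These settings live on the event source mapping between the stream and the function. The sketch below shows what the parameters of a CreateEventSourceMapping call could look like, together with a small check against the ranges from the slides. All ARNs, names and values are placeholders, and `validateMapping` is a hypothetical helper that covers only the ranges mentioned here.

```javascript
// Example parameters; ARNs, function name and values are placeholders.
const eventSourceMappingParams = {
  EventSourceArn: "arn:aws:kinesis:eu-west-1:123456789012:stream/example-stream",
  FunctionName: "example-consumer",
  StartingPosition: "LATEST",
  BatchSize: 1000, // max 10 000 records
  MaximumBatchingWindowInSeconds: 30, // max 5 min
  ParallelizationFactor: 2, // max 10
  MaximumRetryAttempts: 3, // max 10 000
  MaximumRecordAgeInSeconds: 3600, // 1 min to 7 days
  BisectBatchOnFunctionError: true,
  DestinationConfig: {
    OnFailure: { Destination: "arn:aws:sqs:eu-west-1:123456789012:example-dlq" },
  },
};

// Hypothetical check: returns the names of settings outside the slide's ranges.
function validateMapping(p) {
  const problems = [];
  const outside = (value, min, max) => value < min || value > max;
  if (outside(p.BatchSize, 1, 10000)) problems.push("BatchSize");
  if (outside(p.MaximumBatchingWindowInSeconds, 0, 300)) problems.push("MaximumBatchingWindowInSeconds");
  if (outside(p.ParallelizationFactor, 1, 10)) problems.push("ParallelizationFactor");
  if (outside(p.MaximumRetryAttempts, 0, 10000)) problems.push("MaximumRetryAttempts");
  if (outside(p.MaximumRecordAgeInSeconds, 60, 7 * 24 * 3600)) problems.push("MaximumRecordAgeInSeconds");
  return problems;
}

console.log(validateMapping(eventSourceMappingParams)); // []
```

With a finite retry count or record age plus an on-failure destination, a poison pill no longer blocks the shard until the data expires.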

• Fully managed service to load streaming data into a data lake
• S3, Redshift, Amazon Elasticsearch Service
• HTTP endpoints (New!)
• Datadog, New Relic, MongoDB, and Splunk (Newish!)
• Lets you load streaming data with zero lines of code
• Scales automatically (no shards to manage)
• Can batch, compress, transform and convert data before loading it to the destination
Kinesis Firehose

• Data stored up to 24 hours
• Batches records up to a certain size or for a certain period of time
• 1 to 128 MB
• 60 to 900 seconds
• Uses Glue Data Catalog to convert JSON to
• Apache Parquet
• Apache ORC
Kinesis Firehose
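The two buffering settings work as "whichever comes first": a batch is delivered once either the size or the time threshold is reached. The sketch below only illustrates that rule in plain JavaScript. The field names follow the `BufferingHints` structure of a delivery stream destination, the values are arbitrary examples, and `shouldFlush` is a hypothetical helper rather than anything Firehose exposes.

```javascript
// Example buffering hints (size 1 to 128 MB, interval 60 to 900 seconds).
const bufferingHints = { SizeInMBs: 128, IntervalInSeconds: 300 };

// A buffer is flushed when either threshold is reached, whichever comes first.
function shouldFlush(bufferedBytes, secondsBuffered, hints) {
  const sizeReached = bufferedBytes >= hints.SizeInMBs * 1024 * 1024;
  const timeReached = secondsBuffered >= hints.IntervalInSeconds;
  return sizeReached || timeReached;
}

// 10 MiB buffered for 300 seconds: the interval triggers the flush.
console.log(shouldFlush(10 * 1024 * 1024, 300, bufferingHints)); // true
```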

Kinesis Streams vs. Firehose
Kinesis Data Streams:
• Fully managed service to stream data
• Data available up to 7 days
• Scaling using shards
• Custom stream processing with consumers
Kinesis Firehose:
• Fully managed service to load data into a data lake
• Data available for 24 hours
• Scales automatically
• Batching, compressing, converting data out of the box
+ custom transformations with

Amazon Kinesis Agent
• Stand-alone Java application to stream data from files
Service Integrations
• Kinesis Streams
• CloudWatch Logs
• CloudWatch Events
• AWS IoT
Amazon Kinesis API (AWS SDK)
• Most flexible
• Allows full control over writing data
Kinesis Firehose, Writing Data

putRecordBatch(params, callback)
• Request failure
• Retries by default up to 3 times
• Uses exponential backoff
• Base delay by default is 100 ms
Kinesis Firehose, AWS SDK

Load the data to the data lake in a columnar format
Enable content personalization through near real-time analytics
Agenda

• Fully managed service to run SQL queries on the streaming data
• Join, filter and aggregate data over a time-based or a row-based window
Ingests data from:
• Kinesis Data Stream
• Kinesis Firehose
Sends results to:
• Kinesis Data Stream
• Kinesis Firehose
• Lambda function
Kinesis Data Analytics

putRecords(params, callback)
Partial failure:
• Exponential backoff + jitter
Gotcha!
Writing to Kinesis Streams
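A putRecords call can succeed as a request while some of its records are rejected, so the response has to be inspected and only the failed records sent again. The sketch below assumes the documented response shape (`FailedRecordCount`, and a `Records` array in request order whose failed entries carry an `ErrorCode`). The names `failedRecords`, `jitteredDelayMs` and `putAll` are hypothetical, and `putRecords` is an injected function returning a promise, for example a thin wrapper around the SDK call.

```javascript
// Keep only the records whose result entry carries an ErrorCode.
function failedRecords(sentRecords, response) {
  if (!response.FailedRecordCount) return [];
  return sentRecords.filter((_, i) => response.Records[i].ErrorCode !== undefined);
}

// Full jitter: a random delay between 0 and baseMs * 2^attempt.
function jitteredDelayMs(attempt, baseMs = 100, random = Math.random) {
  return Math.floor(random() * baseMs * Math.pow(2, attempt));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Send records and resend only the failed ones, backing off between attempts.
async function putAll(putRecords, streamName, records, maxAttempts = 3) {
  let pending = records;
  for (let attempt = 0; attempt < maxAttempts && pending.length > 0; attempt++) {
    if (attempt > 0) await sleep(jitteredDelayMs(attempt - 1));
    const response = await putRecords({ StreamName: streamName, Records: pending });
    pending = failedRecords(pending, response);
  }
  return pending; // records still failing after the last attempt
}
```

The jitter spreads the retries of many producers over time, so they do not all hit the same shard again in the same instant.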

• Kinesis limits are per second, CloudWatch metrics are per minute
• 1 MB/sec or 1 000 records/sec
• Can 5 000 records/minute exceed the throughput?
• Beware of network latency!
• can be one reason for bursts in Kinesis
• avoid the external network by using a Kinesis VPC endpoint
Gotcha!
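The answer to the question above is yes: a per-minute count says nothing about how the records are spread across seconds. The sketch below makes this concrete by bucketing arrival timestamps per second; `peakRecordsPerSecond` is a hypothetical helper and both sets of timestamps are synthetic.

```javascript
// Highest number of records that arrived within any single second.
function peakRecordsPerSecond(timestampsMs) {
  const perSecond = new Map();
  for (const t of timestampsMs) {
    const second = Math.floor(t / 1000);
    perSecond.set(second, (perSecond.get(second) || 0) + 1);
  }
  return Math.max(0, ...perSecond.values());
}

// 5 000 records spread evenly over one minute (one every 12 ms).
const even = Array.from({ length: 5000 }, (_, i) => i * 12);
// The same 5 000 records arriving within a single second.
const burst = Array.from({ length: 5000 }, (_, i) => Math.floor(i / 5));

console.log(peakRecordsPerSecond(even)); // 84
console.log(peakRecordsPerSecond(burst)); // 5000
```

Both series show up as 5 000 records in a one-minute metric, but only the second one exceeds the 1 000 records/sec limit of a shard.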

• IncomingRecords = PutRecord + PutRecords
The number of records successfully put to the Kinesis stream
• WriteProvisionedThroughputExceeded = PutRecord + PutRecords
The number of records rejected due to throttling
• IncomingRecords + WriteProvisionedThroughputExceeded = Total number of incoming records
Gotcha!
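Put together, the two metrics give the share of records that were throttled. A small sketch, assuming both values are sums over the same period; `throttledShare` is a hypothetical helper name and the numbers are made up.

```javascript
// Share of incoming records that were rejected due to throttling.
function throttledShare(incomingRecords, writeThrottled) {
  const total = incomingRecords + writeThrottled;
  return total === 0 ? 0 : writeThrottled / total;
}

// 9 500 records accepted and 500 rejected in the same period.
console.log(throttledShare(9500, 500)); // 0.05
```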

• IteratorAge
• latency between when a record is added, and when it is processed
• If it’s increasing, increase the number of shards, or
• increase the parallelization factor (NEWish)
• Two different iterator age metrics
• Kinesis stream iterator age is a combination metric across all consumers
• not too informative
• Lambda’s own iterator age should be used instead!
Gotcha!
Reading from Kinesis Streams

• Beware of timeouts!
• connectTimeout: timeout for establishing a new connection on a socket
• if not explicitly set, this value will default to the value of timeout
• timeout: read timeout for an existing socket (2 min by default)
• time between when the request ends and the response is received, including service and network round-trips
Gotcha!
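The reason to care: with the default 2 minute timeout and 3 retries, a single call can in the worst case block for several minutes. The sketch below shows an options object of the kind that can be passed to an SDK client constructor, plus a rough upper bound for one call. The 1 000 ms values are arbitrary examples, `worstCaseMs` is a hypothetical helper, and the estimate assumes every attempt runs into the read timeout and is retried with a plain exponential backoff.

```javascript
// Example client options; the timeout values are placeholders.
const clientOptions = {
  maxRetries: 3,
  retryDelayOptions: { base: 100 },
  httpOptions: { connectTimeout: 1000, timeout: 1000 },
};

// Rough upper bound for one call: every attempt times out, plus backoff.
function worstCaseMs(options) {
  const attempts = options.maxRetries + 1;
  let backoff = 0;
  for (let retry = 0; retry < options.maxRetries; retry++) {
    backoff += options.retryDelayOptions.base * Math.pow(2, retry);
  }
  return attempts * options.httpOptions.timeout + backoff;
}

console.log(worstCaseMs(clientOptions)); // 4700

// The same calculation with the default 2 minute read timeout.
const defaults = { ...clientOptions, httpOptions: { timeout: 120000 } };
console.log(worstCaseMs(defaults)); // 480700
```

About 8 minutes in the default case is a long time for a producer that is expected to keep up with live traffic.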

• Firehose scales endlessly!
Or does it?
• “It is a fully managed service that automatically scales to match the throughput of your data.”
• “When Direct PUT is configured as the data source, each Kinesis Data Firehose delivery stream is subject to the following limits: […] 5,000 records/second, 2,000 transactions/second, and 5 MiB/second.”
• ThrottledRecords: the number of records that were throttled because data ingestion exceeded one of the delivery stream limits.
Gotcha!
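A quick check of a planned load against the quoted Direct PUT limits shows how easily they are reached. The sketch below uses the three numbers from the quote above; `exceededLimits` is a hypothetical helper, and the example load (prime-time traffic, 1 KiB records, 500 records per request) is an assumption.

```javascript
// Direct PUT limits per delivery stream, as quoted on the slide.
const DIRECT_PUT_LIMITS = {
  recordsPerSec: 5000,
  requestsPerSec: 2000,
  bytesPerSec: 5 * 1024 * 1024,
};

// Names of the limits that a given load would exceed.
function exceededLimits(load) {
  return Object.keys(DIRECT_PUT_LIMITS).filter(
    (name) => load[name] > DIRECT_PUT_LIMITS[name]
  );
}

// 600 000 rpm = 10 000 records/sec, sent as 20 requests/sec of 500 records.
const primeTime = {
  recordsPerSec: 10000,
  requestsPerSec: 20,
  bytesPerSec: 10000 * 1024,
};
console.log(exceededLimits(primeTime)); // [ 'recordsPerSec', 'bytesPerSec' ]
```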

• Always learn about the service limits
• (there are always limits)
• hard and soft limits
• Keep a close eye on Lambda’s concurrency limits
• Deep dive into the error handling
• Don’t just assume things…
• If not sure, ask AWS Support
• Keep a close eye on service updates
• Everything fails all the time, especially at scale, so be prepared and fail fast
“Everything fails,
all the time”
Dr. Werner Vogels
(CTO, Amazon)
Lessons Learned

Thank you!
ANAHIT POGOSOVA
@anahit_fi

Shameless Plug
Real World Serverless with Yan Cui, @theburningmonk
episode #14
Mastering AWS Kinesis Data Streams, Part 1 (2)
dev.solita.fi
@anahit_fi
Anahit Pogosova
