Processing Big Data in Motion
Streaming Data Ingestion and Processing
Roger Barga, General Manager, Kinesis Streaming Services, AWS
April 7, 2016
Riding the Streaming Rapids
(Timeline, 2007–2016: Complex Event Processing over Streaming Data; Relational Semantics and Implementation; Streaming Map Reduce & Machine Learning over Streams; Azure Stream Analytics)
Interest in and demand for stream data processing is rapidly increasing*…
* Understatement of the year…
Most data is produced continuously
Why?
{
"payerId": "Joe",
"productCode": "AmazonS3",
"clientProductCode": "AmazonS3",
"usageType": "Bandwidth",
"operation": "PUT",
"value": "22490",
"timestamp": "1216674828"
}
Metering Record
127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
Common Log Entry
<165>1 2003-10-11T22:14:15.003Z
mymachine.example.com evntslog - ID47
[exampleSDID@32473 iut="3" eventSource="Application"
eventID="1011"][examplePriority@32473 class="high"]
Syslog Entry
“SeattlePublicWater/Kinesis/123/Realtime” –
412309129140
MQTT Record
<R,AMZN,T,G,R1>
NASDAQ OMX Record
Time is money
• Perishable Insights (Forrester)
Why?
• Hourly server logs: how your systems were misbehaving an hour ago
• Weekly / monthly bill: what you spent this past billing cycle
• Daily fraud reports: tell you if there was fraud yesterday
• CloudWatch metrics: what just went wrong now
• Real-time spending alerts/caps: guarantee you can't overspend
• Real-time detection: blocks fraudulent use now
Time is money
• Perishable Insights (Forrester)
• A more efficient implementation
• Most 'Big Data' deployments process continuously generated data (in batches)
Why?
Availability
Variety of stream data processing systems,
active ecosystem but still early days…
Why?
Disruptive
Foundational for business-critical workflows
Enables a new class of applications & services that process data continuously.
Why?
Need to begin thinking about applications &
services in terms of streams of data and
continuous processing.
You
A change in perspective is worth 80 IQ points… – Alan Kay
Agenda
• Scalable & Durable Data Ingest
  – A quick word on our motivation
  – Kinesis Streams, through a simple example
• Continuous Stream Data Processing
  – Kinesis Client Library (KCL)
  – One select design challenge: dynamic resharding
  – How customers are using Kinesis Streams today
• Building on Kinesis Streams
  – Kinesis Firehose
  – AWS Event Driven Computing
Our Motivation for Continuous Processing
AWS Metering service
• 100s of millions of billing records per second
• Terabytes++ per hour
• Hundreds of thousands of sources
• For each customer: gather all metering records & compute monthly bill
• Auditors guarantee 100% accuracy at month's end
Seems perfectly reasonable to run as a batch, but there is relentless pressure for real time…
With a Data Warehouse to load
• 1,000s of extract-transform-load (ETL) jobs every day
• Hundreds of thousands of files per load cycle
• Thousands of daily users, hundreds of queries per hour
Our Motivation for Continuous Processing
AWS Metering service
• 100s of millions of billing records per second
• Terabytes++ per hour
• Hundreds of thousands of sources
• For each customer: gather all metering records & compute monthly bill
• Auditors guarantee 100% accuracy at month's end
Other Service Teams, Similar Requirements
• CloudWatch Logs and CloudWatch Metrics
• CloudFront API logging
• ‘Snitch’ internal datacenter hardware metrics
Real-time Ingest
• Highly Scalable
• Durable
• Replayable Reads
Continuous Processing
• Support multiple simultaneous
data processing applications
• Load-balancing incoming
streams, scale out processing
• Fault-tolerance, Checkpoint /
Replay
Right Tool for the Job
Enable Streaming Data Ingestion and Processing
twitter-trends.com
Example application: the twitter-trends.com website, hosted on Elastic Beanstalk
Too big to handle on one box
The solution: streaming map/reduce. Each worker computes its own "My top-10" list, and a reducer merges them into the "Global top-10".
twitter-trends.com
Core concepts: data record, stream, partition key, shard, worker, sequence number
(Diagram: a shard holds an ordered run of data records identified by sequence numbers, e.g. Shard: 14 17 18 21 23; each worker's local "My top-10" feeds the "Global top-10")
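To make the partition key and shard concepts concrete, here is a minimal Python sketch of how Kinesis routes a record: the partition key is MD5-hashed into a 128-bit key space, and the record goes to the shard whose hash-key range contains the hash. The shard ranges below are illustrative, not real shard metadata.

import hashlib

def shard_for(partition_key, shards):
    # Kinesis hashes the partition key (MD5) into a 128-bit key space and
    # routes the record to the shard whose hash-key range contains it.
    h = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    for shard in shards:
        if shard["start"] <= h <= shard["end"]:
            return shard["id"]

# Two illustrative shards splitting the key space in half.
SPACE = 2 ** 128
shards = [
    {"id": "shard-0", "start": 0, "end": SPACE // 2 - 1},
    {"id": "shard-1", "start": SPACE // 2, "end": SPACE - 1},
]
print(shard_for("user-42", shards))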
twitter-trends.com
How this relates to Kinesis
(Diagram: the Kinesis stream feeds a Kinesis application)
Kinesis Streaming Data Ingestion
• Streams are made of Shards
• Each Shard ingests up to 1 MB/sec and up to 1,000 TPS
• Producers use a PUT call to store data in a Stream: PutRecord {Data, PartitionKey, StreamName} (see the producer sketch below)
• Each Shard emits up to 2 MB/sec
• All data is stored for 24 hours, or 7 days if extended retention is 'ON'
• Scale Kinesis streams by adding or removing Shards
• Replay data from the retention period
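For illustration, a minimal producer sketch using boto3, the AWS SDK for Python. The stream name "twitter-trends" and the tweet fields are hypothetical, and a production producer would batch records with PutRecords and handle throttling.

import json
import boto3  # AWS SDK for Python

kinesis = boto3.client("kinesis")

def put_tweet(tweet):
    # The partition key determines the target shard; keying by user
    # spreads load across shards while keeping each user's records ordered.
    kinesis.put_record(
        StreamName="twitter-trends",              # hypothetical stream name
        Data=json.dumps(tweet).encode("utf-8"),
        PartitionKey=tweet["user"],               # hypothetical field
    )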
Real-Time Streaming Data Ingestion
(Architecture diagram: millions of sources producing 100s of terabytes per hour → front end (authentication, authorization) → durable, highly consistent storage that replicates data across three data centers (availability zones) → an ordered stream of events supporting multiple readers: real-time dashboards and alarms; custom-built streaming applications (KCL); machine learning algorithms or sliding-window analytics; aggregate analysis in Hadoop or a data warehouse; aggregate and archive to S3)
Inexpensive: $0.014 per 1,000,000 PUT Payload Units ($0.028 per million puts)
Latencies shown: 25–40 ms and 100–150 ms
Kinesis Client Library
twitter-trends.com
Using the Kinesis API directly
// Per-shard worker: tail one shard and maintain a local top-10.
iterator = getShardIterator(shardId, LATEST);
while (true) {
  [records, iterator] =
      getNextRecords(iterator, maxRecsToReturn);
  process(records);
}

process(records) {
  for (record in records) {
    updateLocalTop10(record);
  }
  if (timeToDoOutput()) {
    writeLocalTop10ToDDB();  // persist this worker's top-10 to DynamoDB
  }
}

// Aggregator: periodically merge all local top-10 lists into the global top-10.
while (true) {
  localTop10Lists = scanDDBTable();
  updateGlobalTop10List(localTop10Lists);
  sleep(10);
}
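For comparison, here is a minimal runnable version of the reader loop using boto3; the pseudocode's getNextRecords corresponds to the GetRecords API. The process callback stands in for the application logic above.

import time
import boto3

kinesis = boto3.client("kinesis")

def tail_shard(stream_name, shard_id, process, max_recs=1000):
    # Start at the tip of the stream, as in the pseudocode above (LATEST).
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="LATEST",
    )["ShardIterator"]
    while iterator is not None:
        resp = kinesis.get_records(ShardIterator=iterator, Limit=max_recs)
        process(resp["Records"])
        iterator = resp.get("NextShardIterator")  # None once the shard is closed
        time.sleep(0.2)  # stay under the per-shard GetRecords rate limit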
twitter-trends.com
Challenges with using the Kinesis API directly
• Manual creation of workers and assignment to shards
• How many workers per EC2 instance?
• How many EC2 instances?
twitter-trends.com
Using the Kinesis Client Library
(Diagram: the Kinesis application's workers coordinate through a shard mgmt table)
twitter-trends.com
Elasticity and Load Balancing
(Diagram sequence: the Kinesis application's workers run in an Auto Scaling group; as instances are added or removed, shard leases are rebalanced through the shard mgmt table)
twitter-trends.com
Fault Tolerance Support
(Diagram sequence: workers in Availability Zone 1 and Availability Zone 3 share the shard mgmt table; when a worker fails (X), its shards are taken over by surviving workers)
Worker Fail Over
Shard-0
Shard-1
Shard-2
Worker1
Worker2
Worker3
LeaseKey LeaseOwner LeaseCounter
Shard-0 Worker1 85
Shard-1 Worker2 94
Shard-2 Worker3 76
Worker Fail Over
Shard-0
Shard-1
Shard-2
Worker1
Worker2
Worker3
LeaseKey LeaseOwner LeaseCounter
Shard-0 Worker1 85 86
Shard-1 Worker2 94
Shard-2 Worker3 76 77
X
Worker Fail Over
Shard-0
Shard-1
Shard-2
Worker1
Worker2
Worker3
LeaseKey LeaseOwner LeaseCounter
Shard-0 Worker1 85 86 87
Shard-1 Worker2 94
Shard-2 Worker3 76 77 78
X
Worker Fail Over
Shard-0
Shard-1
Shard-2
Worker1
Worker2
Worker3
LeaseKey LeaseOwner LeaseCounter
Shard-0 Worker1 85 86 87 88
Shard-1 Worker3 94 95
Shard-2 Worker3 76 77 78 79
X
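The tables above illustrate the lease protocol: each worker periodically increments the LeaseCounter on the leases it owns, and a lease whose counter has not advanced for a full failover interval is presumed abandoned and may be taken over. Below is a minimal in-memory sketch of that rule; the real KCL stores leases in a DynamoDB table and takes them with conditional writes, and the failover interval here is an assumed value.

FAILOVER_MILLIS = 10_000  # assumed failover interval

def find_expired_leases(leases, last_seen, now_ms):
    # A lease whose counter has not advanced within the failover window is
    # presumed abandoned; its owner has stopped renewing it.
    expired = []
    for lease in leases:
        key = lease["LeaseKey"]
        counter, seen_at = last_seen.get(key, (None, now_ms))
        if lease["LeaseCounter"] != counter:
            last_seen[key] = (lease["LeaseCounter"], now_ms)  # renewed; remember it
        elif now_ms - seen_at > FAILOVER_MILLIS:
            expired.append(lease)
    return expired

def take_lease(lease, new_owner):
    # In the real KCL this is a conditional write to the DynamoDB lease table:
    # it succeeds only if LeaseCounter still equals the value last observed.
    lease["LeaseOwner"] = new_owner
    lease["LeaseCounter"] += 1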
Worker Load Balancing
Shard-0
Shard-1
Shard-2
Worker1
Worker2
Worker3
Worker4
LeaseKey LeaseOwner LeaseCounter
Shard-0 Worker1 88
Shard-1 Worker3 96
Shard-2 Worker3 78
X
Worker Load Balancing
Shard-0
Shard-1
Shard-2
Worker1
Worker2
Worker3
Worker4
LeaseKey LeaseOwner LeaseCounter
Shard-0 Worker1 88
Shard-1 Worker3 96
Shard-2 Worker4 79
X
Resharding
Shard-0
Worker1
Worker2
LeaseKey LeaseOwner LeaseCounter checkpoint
Shard-0 Worker1 90 SHARD_END
Shard-0
Shard-1
Shard-2
Resharding
Shard-0
Shard-1
Shard-2
Worker1
Worker2
LeaseKey LeaseOwner LeaseCounter checkpoint
Shard-0 Worker1 90 SHARD_END
Shard-1 0 TRIM_HORIZON
Shard-2 0 TRIM_HORIZON
Shard-0
Shard-1
Shard-2
Resharding
Shard-0
Shard-1
Shard-2
Worker1
Worker2
LeaseKey LeaseOwner LeaseCounter checkpoint
Shard-0 Worker1 90 SHARD_END
Shard-1 Worker1 2 TRIM_HORIZON
Shard-2 Worker2 3 TRIM_HORIZON
Shard-0
Shard-1
Shard-2
Resharding
Shard-1
Shard-2
Worker1
Worker2
LeaseKey LeaseOwner LeaseCounter checkpoint
Shard-1 Worker1 2 TRIM_HORIZON
Shard-2 Worker2 3 TRIM_HORIZON
Shard-0
Shard-1
Shard-2
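Resharding itself is driven by API calls. Here is a sketch of splitting a hot shard at the midpoint of its hash-key range using boto3 (the stream name is hypothetical); once the parent shard is drained to SHARD_END, the KCL picks up the child shards as in the frames above.

import boto3

kinesis = boto3.client("kinesis")

stream = "twitter-trends"  # hypothetical stream name
shard = kinesis.describe_stream(StreamName=stream)["StreamDescription"]["Shards"][0]
lo = int(shard["HashKeyRange"]["StartingHashKey"])
hi = int(shard["HashKeyRange"]["EndingHashKey"])

kinesis.split_shard(
    StreamName=stream,
    ShardToSplit=shard["ShardId"],
    NewStartingHashKey=str((lo + hi) // 2),  # midpoint: two equal halves
)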
Putting this into production: Cost & Scale
• 500MM tweets/day = ~5,800 tweets/sec
• 2k/tweet is ~12 MB/sec (~1 TB/day)
• $0.015/hour per shard, $0.014/million PUTs
• Kinesis cost is $0.47/hour: 12 shards (one per MB/sec of ingest) at $0.015 = $0.18/hour, plus ~21 million PUTs/hour at $0.014/million = $0.29/hour
• Redshift cost is $0.850/hour (for a 2TB node)
• Total: $1.32/hour
Design Challenge(s)
• Dynamic Resharding & Scale Out
• Enforcing Quotas (think of a proxy fleet with 1,000s of servers)
• Distributed Denial of Service Attacks (unintentional)
• Dynamic Load Balancing on Storage Servers
• Heterogeneous Workloads (tip-of-stream vs. 7-day-old reads)
• Optimizing Fleet Utilization (proxy, control, and data planes)
• Avoiding Scaling Cliffs
• …
Kinesis Streams: Streaming Data the AWS Way
• Pay as you go, no up-front costs
• Elastically scalable
• Choose the service, or combination of services, for your specific use cases
• Real-time latencies
• Easy to provision, deploy, and manage
Sushiro: Kaiten Sushi Restaurants
380 stores stream data from sushi-plate sensors to Kinesis
Real-Time Streaming Data with Kinesis Streams
• 5 billion events/wk from connected devices | IoT
• 17 PB of game data per season | Entertainment
• 100 billion ad impressions/day, 30 ms response time | Ad Tech
• 100 GB/day click streams, 250+ sites | Enterprise
• 50 billion ad impressions/day, sub-50 ms responses | Ad Tech
• 17 million events/day | Technology
• 1 billion transactions per day | Bitcoin
• 1 TB+/day game data analyzed in real-time | Gaming
Streams provide a foundational
abstraction on which to build higher
level services
Amazon Kinesis Firehose
• Zero Admin: Capture and deliver streaming data into S3, Redshift, and other
destinations without writing an application or managing infrastructure
• Direct-to-data store integration: Batch, compress, and encrypt streaming data
for delivery into S3, and other destinations in as little as 60 secs, set up in minutes
• Seamless elasticity: Seamlessly scales to match data throughput
Capture and submit
streaming data to Firehose
Firehose loads streaming data
continuously into S3 and Redshift
Analyze streaming data using your favorite
BI tools
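A minimal delivery sketch using boto3; the delivery stream name and event fields are hypothetical, and the trailing newline is a common convention so records land line-separated in S3.

import json
import boto3

firehose = boto3.client("firehose")

record = {"page": "/home", "user": "u-42"}      # hypothetical event
firehose.put_record(
    DeliveryStreamName="click-events",           # hypothetical delivery stream
    Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
)
# Firehose batches, compresses, and encrypts the data, then loads it into
# the configured destination (e.g., S3 or Redshift).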
Amazon Kinesis Firehose: Fully Managed Service for Delivering Data Streams into AWS Destinations
(Diagram: data sources → AWS endpoint [batch, compress, encrypt] → S3, Redshift)
No partition keys | No provisioning | End-to-end elastic
AWS Event-Driven Computing
• Compute in response to recently occurring events
• Newly arrived/changed data
– Example: generate thumbnail for an image uploaded to S3 (see the sketch below)
• Newly occurring system state changes
– Example: EC2 instance created
– Example: DynamoDB table deleted
– Example: Auto Scaling group membership change
– Example: RDS-HA primary fail-over occurs
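For the first example above, a minimal sketch of an AWS Lambda handler invoked by an S3 event notification; generate_thumbnail is a hypothetical helper.

def handler(event, context):
    # S3 delivers one notification event with one or more records.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        generate_thumbnail(bucket, key)  # hypothetical: thumbnail the new image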
Event Driven Computing in AWS Today
• SQS
• S3 event notifications
• DynamoDB Update Streams
• CloudTrail event log for API calls
(Diagram: S3 events fanning out to Customer 1, Customer 2, and Customer 3)
Event Driven Computing in AWS Tomorrow
• A single event log for asynchronous service events
• Event logs from other data storage services
(Diagram: per-customer event logs for Customer 1, Customer 2, and Customer 3)
A Unified Event Log Approach
SQS (Unordered Events) | Kinesis (Ordered Events)
Ordered Event Log Using Kinesis Streams and the Kinesis Client Library
(Diagram: a Kinesis stream feeds the AWS EDC application, which runs in an Auto Scaling group (ASG) and coordinates through a shard mgmt table, keeping user state)
Use of the KCL: mostly writing business logic
EDC Rules Language: simple CloudWatch actions in response to matching rules
Event Logs for Customers’ Services
Vision: customers' services and applications leverage the AWS event log infrastructure
(Diagram: www.widget.com with entities Widget A, Widget B, Widget C and customers 2593, 7302, 3826)
Per-customer control plane events sent to the customer's unified control plane log
Per-entity data plane event logs
Closing Thoughts
Streaming data is highly prevalent and relevant, and stream data processing is on the rise.
It is a key part of business-critical workflows today, and a powerful abstraction for building a new class of applications & data-intensive services tomorrow.
It is a rich area for distributed systems, programming model, IoT, and new services research.
Questions