SlideShare a Scribd company logo
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Amazon DynamoDB Design Workshop
Sean Shriver
NoSQL Solutions Architect, AWS
Data Modeling
Hierarchical data structures as items
• Use composite sort key to define a hierarchy
• Highly selective result sets with sort key queries
• Index anything, scales to any size
Primary Key
Attributes
ProductID type
Items
1 bookID
title author genre publisher datePublished ISBN
Some Book John Smith Science Fiction Ballantine Oct-70 0-345-02046-4
2 albumID
title artist genre label studio released producer
Some Album Some Band Progressive Rock Harvest Abbey Road 3/1/73 Somebody
2 albumID:trackID
title length music vocals
Track 1 1:30 Mason Instrumental
2 albumID:trackID
title length music vocals
Track 2 2:43 Mason Mason
2 albumID:trackID
title length music vocals
Track 3 3:30 Smith Johnson
3 movieID
title genre writer producer
Some Movie Scifi Comedy Joe Smith 20th Century Fox
3 movieID:actorID
name character image
Some Actor Joe img2.jpg
3 movieID:actorID
name character image
Some Actress Rita img3.jpg
3 movieID:actorID
name character image
Some Actor Frito img1.jpg
… or as documents (JSON)
• JSON data types (M, L, BOOL, NULL)
• Document SDKs available
• 400 KB maximum item size (limits hierarchical data structure)
Primary Key
Attributes
ProductID
Items
1
id title author genre publisher datePublished ISBN
bookID Some Book Some Guy Science Fiction Ballantine Oct-70 0-345-02046-4
2
id title artist genre Attributes
albumID Some Album Some Band Progressive Rock
{ label:"Harvest", studio: "Abbey Road", published: "3/1/73", producer: "Pink
Floyd", tracks: [{title: "Speak to Me", length: "1:30", music: "Mason", vocals:
"Instrumental"},{title: ”Breathe", length: ”2:43", music: ”Waters, Gilmour,
Wright", vocals: ”Gilmour"},{title: ”On the Run", length: “3:30", music: ”Gilmour,
Waters", vocals: "Instrumental"}]}
3
id title genre writer Attributes
movieID Some Movie Scifi Comedy Joe Smith
{ producer: "20th Century Fox", actors: [{ name: "Luke Wilson", dob: "9/21/71",
character: "Joe Bowers", image: "img2.jpg"},{ name: "Maya Rudolph", dob:
"7/27/72", character: "Rita", image: "img1.jpg"},{ name: "Dax Shepard", dob:
"1/2/75", character: "Frito Pendejo", image: "img3.jpg"}]
1:1 relationships or key-values
• Use a table or GSI with a partition key
• Use GetItem or BatchGetItem API
Example: Given a user or email, get attributes
Users Table
Partition key Attributes
UserId = bob Email = bob@gmail.com, JoinDate = 2011-11-15
UserId = fred Email = fred@yahoo.com, JoinDate = 2011-12-01
Users-Email-GSI
Partition key Attributes
Email = bob@gmail.com UserId = bob, JoinDate = 2011-11-15
Email = fred@yahoo.com UserId = fred, JoinDate = 2011-12-01
1:N relationships or parent-children
• Use a table or GSI with partition and sort key
• Use Query API
Example:
Given a device, find all readings between epoch X, Y
Device-measurements
Part. Key Sort key Attributes
DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90
DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90
N:M relationships
• Use a table and GSI with partition and sort key elements
switched
• Use Query API
Example:
Given a user, find all games. Or given a game, find all
users.
User-Games-Table
Part. Key Sort key
UserId = bob GameId = Game1
UserId = fred GameId = Game2
UserId = bob GameId = Game3
Game-Users-GSI
Part. Key Sort key
GameId = Game1 UserId = bob
GameId = Game2 UserId = fred
GameId = Game3 UserId = bob
Data Modeling Exercise #1
• A shopping cart use case
• Attributes: cartId, dateTime, set of SKU’s, etc.
• Mostly under 1KB in size
• Up to 100 million carts in the table – 100GB total
• Accessed by cartId (key-value access pattern)
• 10K writes, and 10K reads per second
• TTL to expire “old” carts
• Also:
• Notify when there is a price change for a SKU in a cart
Cost estimation
• The table: cartId, dateTime, set of SKU’s, etc.
• Under 1KB in size
• Up to 100 million carts in the table – up to 100GB total
• Accessed by cartId (key-value access pattern)
• 10K writes, and 10K reads per second
• Cost estimate
• Enter the storage, item size, reads, and writes
• Get an estimate of your monthly bill: ~$6300
• The price notification job
• E.g. 100 million items, scan/update over 24 hrs
• ~ 1200 RCU and WCU for 24 hrs: ~ $22
Use case: Time Series Data
sensor_data
DeviceId: 123
Timestamp: 1492641900
Temp: 172
AWS Lambda
Kinesis Stream
• Sensor data (temp) from 1000’s of
sensors
• Need to store for fast access
• Access by sensor ID + time range
• Store for 90 days
DynamoDB
Solution DeviceId: 123
Timestamp: 1492641900
Temp: 172
MyTTL: 1492736400
AWS Lambda
Kinesis Stream
Group multiple data points into
a single item
Save on writes
Use TTL to manage data lifetime
sensor_data
DynamoDB
Time Series Data Example Architecture
Sensors
AWS
Lambda
Amazon
S3 hosted
site
Amazon
DynamoDB
Amazon
Kinesis
Amazon
CloudWatch
alarm
Amazon
Cognito
Javascript SDK
request
Amazon
SNS
Python (boto) JavaScriptJava
Time Series Data
 High volume ingest: Kinesis + DynamoDB
 Fast access: DynamoDB
 At a reasonable cost: <$10K/month for:
• 2.5TB of stored data points per month
• Ingest of 100K data points per second
All using a serverless architecture with managed services on AWS
Cost estimate
• Data rate: 100000 data points per second
• Data storage: 1 month’s worth = ~2.5TB
Kinesis: 100000 records -> 100 shards -> ~$5K per month
DynamoDB: 100000 WCU’s -> $50K per month
• Binned writes: 10000 WCU’s -> $5000 per month (10x less…)
• Diff. over 1 year: $600K vs. $60K
We can save a lot by storing multiple data points per item
Using TTL to age out cold data
RCUs = 1000
WCUs = 1000
HotdataColddata
Use DynamoDB TTL and Streams
to archive cold data
Events_table_2016_April
Event_id
(Partition key)
Timestamp
(sort key)
myTTL
1489188093
…. Attribute NData table
S3
AWS Lambda
Amazon Kinesis
Firehose
Age out cold data
• Configure a TTL attribute for table
• For each item, set it to item expiration date/time
• DynamoDB automatically deletes expired items
• Use DynamoDB Streams to act on expired items
Dealing with data that gets cold over
time (all data!)
Messages
table
Messages app
David
SELECT *
FROM Messages
WHERE Recipient='David'
LIMIT 50
ORDER BY Date DESC
Inbox
SELECT *
FROM Messages
WHERE Sender ='David'
LIMIT 50
ORDER BY Date DESC
Outbox
Use case: Messaging App
Recipient Date Sender Message
David 2014-10-02 Bob …
… 48 more messages for David …
David 2014-10-03 Alice …
Alice 2014-09-28 Bob …
Alice 2014-10-01 Carol …
Solution v.1
(Many more messages)
David
Messages table
50 items × 256 KB each
Partition key Sort key
Large message bodies
Attachments
SELECT *
FROM Messages
WHERE Recipient='David'
LIMIT 50
ORDER BY Date DESC
Inbox
Computing inbox query cost
Items evaluated by query
Average item size
Conversion ratio
Eventually consistent reads
50 * 256KB * (1 RCU / 4 KB) * (1 / 2) = 1600 RCU
Recipient Date Sender Subject MsgId
David 2014-10-02 Bob Hi!… afed
David 2014-10-03 Alice RE: The… 3kf8
Alice 2014-09-28 Bob FW: Ok… 9d2b
Alice 2014-10-01 Carol Hi!... ct7r
Solution v.2: Separate the bulk data
Inbox-GSI Messages table
MsgId …
9d2b …
3kf8 …
ct7r …
afed …
David
(50 sequential items at 128 bytes)
Query inbox-GSI: 50 * 128B * (1 RCU / 4 KB) * (1 / 2) = 1 RCU
Inbox GSI
Outbox GSI
SELECT *
FROM Messages
WHERE Sender ='David'
LIMIT 50
ORDER BY Date DESC
Messaging app
Messages
table
David
Inbox
global secondary
index
Inbox
Outbox
global secondary
index
Outbox
• Reduce one-to-many item sizes
• Configure secondary index projections
• Use GSIs to model M:N relationship
between sender and recipient
• Use GSIs to select the right subset of data
for better performance and lower cost
Distribute large items
Querying many large items at once
InboxMessagesOutbox
Use case: Real-time voting
Votes Table
Voters
RawVotes Table
Voting App
Partition 1
1000 WCUs
Partition K
1000 WCUs
Partition M
1000 WCUs
Partition N
1000 WCUs
Votes Table
Candidate A Candidate B
Real-time voting: scaling bottlenecks
Voters
Provision 200,000 WCUs
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_7 Candidate B_8
UpdateItem: “CandidateA_” + rand(0, 10)
ADD 1 to Votes
Candidate A_6
Candidate A_8
Candidate A_5
Voter
Votes Table
Write-sharding
Votes Table
Shard aggregation
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_5
Candidate A_6 Candidate A_8
Candidate A_7 Candidate B_8
Periodic
process
Candidate A
Total: 2.5M
1. Sum
2. Store Voter
Real-time voting: tables
UserId Candidate Date
Alice A 2016-10-02
Bob B 2016-10-02
Eve B 2016-10-02
Chuck A 2016-10-02
RawVotes Table
Segment Votes
A_1 23
B_2 12
B_1 14
A_2 25
Votes Table
1. Record vote and de-dupe 2. Increment candidate counter
• Trade off read cost for write scalability
• Consider throughput per partition key
Shard write-heavy partition keys
Your write workload is not horizontally
scalable
Real-time voting v2: aggregation with Streams
Votes Table
Amazon
Redshift Amazon EMR
Amazon Kinesis–
Enabled App
Voters RawVotes TableVoting App RawVotes
DynamoDB
Stream
Real-time voting v2: tables
UserId Candidate Date
Alice A 2016-10-02
Bob B 2016-10-02
Eve B 2016-10-02
Chuck A 2016-10-02
RawVotes Table
Candidate Votes
A 23
B 12
C 14
D 25
Votes Table
1. Record vote and de-dupe 2. Increment candidate counter
Analytics with DynamoDB Streams
• Collect and de-dupe data in DynamoDB
• Aggregate data in-memory and flush periodically
Performing real-time aggregation and
analytics
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved
aws.amazon.com/activate
Everything and Anything Startups
Need to Get Started on AWS
Thank you!

More Related Content

What's hot

Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache Spark
Databricks
 
Scaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and FeastScaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and Feast
Databricks
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Ververica
 
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin KnaufWebinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
Ververica
 
Building Modern Data Streaming Apps with Python
Building Modern Data Streaming Apps with PythonBuilding Modern Data Streaming Apps with Python
Building Modern Data Streaming Apps with Python
Timothy Spann
 
Delta Lake: Optimizing Merge
Delta Lake: Optimizing MergeDelta Lake: Optimizing Merge
Delta Lake: Optimizing Merge
Databricks
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache Flink
Flink Forward
 
Stateful stream processing with Apache Flink
Stateful stream processing with Apache FlinkStateful stream processing with Apache Flink
Stateful stream processing with Apache Flink
Knoldus Inc.
 
Data Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its EcosystemData Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its Ecosystem
Databricks
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastore
Kishore Gopalakrishna
 
Apache Spark at Airbnb
Apache Spark at AirbnbApache Spark at Airbnb
Apache Spark at Airbnb
Databricks
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Databricks
 
Spark Tips & Tricks
Spark Tips & TricksSpark Tips & Tricks
Spark Tips & Tricks
Jason Hubbard
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
Kostas Tzoumas
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
Alluxio, Inc.
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
Ryan Blue
 
How to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboardsHow to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboards
Wiiisdom
 
Common Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseCommon Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta Lakehouse
Databricks
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
Altinity Ltd
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
Flink Forward
 

What's hot (20)

Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache Spark
 
Scaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and FeastScaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and Feast
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin KnaufWebinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf
 
Building Modern Data Streaming Apps with Python
Building Modern Data Streaming Apps with PythonBuilding Modern Data Streaming Apps with Python
Building Modern Data Streaming Apps with Python
 
Delta Lake: Optimizing Merge
Delta Lake: Optimizing MergeDelta Lake: Optimizing Merge
Delta Lake: Optimizing Merge
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache Flink
 
Stateful stream processing with Apache Flink
Stateful stream processing with Apache FlinkStateful stream processing with Apache Flink
Stateful stream processing with Apache Flink
 
Data Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its EcosystemData Quality With or Without Apache Spark and Its Ecosystem
Data Quality With or Without Apache Spark and Its Ecosystem
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastore
 
Apache Spark at Airbnb
Apache Spark at AirbnbApache Spark at Airbnb
Apache Spark at Airbnb
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
 
Spark Tips & Tricks
Spark Tips & TricksSpark Tips & Tricks
Spark Tips & Tricks
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
How to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboardsHow to design web intelligence reports that behave like real dashboards
How to design web intelligence reports that behave like real dashboards
 
Common Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseCommon Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta Lakehouse
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 

Similar to Amazon DynamoDB Design Workshop

DynamoDB Design Workshop
DynamoDB Design WorkshopDynamoDB Design Workshop
DynamoDB Design Workshop
Amazon Web Services
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
Amazon Web Services
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
Amazon Web Services
 
AWS Webcast - Dynamo DB
AWS Webcast - Dynamo DBAWS Webcast - Dynamo DB
AWS Webcast - Dynamo DB
Amazon Web Services
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
Amazon Web Services
 
DynamodbDB Deep Dive
DynamodbDB Deep DiveDynamodbDB Deep Dive
DynamodbDB Deep Dive
Amazon Web Services
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
Amazon Web Services
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
Amazon Web Services
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
Amazon Web Services
 
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
Amazon Web Services Korea
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
Amazon Web Services
 
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
Amazon Web Services
 
AWS Data Collection & Storage
AWS Data Collection & StorageAWS Data Collection & Storage
AWS Data Collection & Storage
Amazon Web Services
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
Amazon Web Services
 
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Amazon Web Services
 
Deep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDBDeep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDB
Amazon Web Services
 
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
Amazon Web Services
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDB
Amazon Web Services
 
(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming
Amazon Web Services
 
How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks
How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech TalksHow to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks
How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks
Amazon Web Services
 

Similar to Amazon DynamoDB Design Workshop (20)

DynamoDB Design Workshop
DynamoDB Design WorkshopDynamoDB Design Workshop
DynamoDB Design Workshop
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
AWS Webcast - Dynamo DB
AWS Webcast - Dynamo DBAWS Webcast - Dynamo DB
AWS Webcast - Dynamo DB
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
 
DynamodbDB Deep Dive
DynamodbDB Deep DiveDynamodbDB Deep Dive
DynamodbDB Deep Dive
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
 
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
 
AWS Data Collection & Storage
AWS Data Collection & StorageAWS Data Collection & Storage
AWS Data Collection & Storage
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
 
Deep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDBDeep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDB
 
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDB
 
(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming
 
How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks
How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech TalksHow to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks
How to Migrate from Cassandra to Amazon DynamoDB - AWS Online Tech Talks
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
Amazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
Amazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
Amazon Web Services
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Amazon Web Services
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
Amazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
Amazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Amazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
Amazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Amazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Amazon DynamoDB Design Workshop

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved Amazon DynamoDB Design Workshop Sean Shriver NoSQL Solutions Architect, AWS
  • 3. Hierarchical data structures as items • Use composite sort key to define a hierarchy • Highly selective result sets with sort key queries • Index anything, scales to any size Primary Key Attributes ProductID type Items 1 bookID title author genre publisher datePublished ISBN Some Book John Smith Science Fiction Ballantine Oct-70 0-345-02046-4 2 albumID title artist genre label studio released producer Some Album Some Band Progressive Rock Harvest Abbey Road 3/1/73 Somebody 2 albumID:trackID title length music vocals Track 1 1:30 Mason Instrumental 2 albumID:trackID title length music vocals Track 2 2:43 Mason Mason 2 albumID:trackID title length music vocals Track 3 3:30 Smith Johnson 3 movieID title genre writer producer Some Movie Scifi Comedy Joe Smith 20th Century Fox 3 movieID:actorID name character image Some Actor Joe img2.jpg 3 movieID:actorID name character image Some Actress Rita img3.jpg 3 movieID:actorID name character image Some Actor Frito img1.jpg
  • 4. … or as documents (JSON) • JSON data types (M, L, BOOL, NULL) • Document SDKs available • 400 KB maximum item size (limits hierarchical data structure) Primary Key Attributes ProductID Items 1 id title author genre publisher datePublished ISBN bookID Some Book Some Guy Science Fiction Ballantine Oct-70 0-345-02046-4 2 id title artist genre Attributes albumID Some Album Some Band Progressive Rock { label:"Harvest", studio: "Abbey Road", published: "3/1/73", producer: "Pink Floyd", tracks: [{title: "Speak to Me", length: "1:30", music: "Mason", vocals: "Instrumental"},{title: ”Breathe", length: ”2:43", music: ”Waters, Gilmour, Wright", vocals: ”Gilmour"},{title: ”On the Run", length: “3:30", music: ”Gilmour, Waters", vocals: "Instrumental"}]} 3 id title genre writer Attributes movieID Some Movie Scifi Comedy Joe Smith { producer: "20th Century Fox", actors: [{ name: "Luke Wilson", dob: "9/21/71", character: "Joe Bowers", image: "img2.jpg"},{ name: "Maya Rudolph", dob: "7/27/72", character: "Rita", image: "img1.jpg"},{ name: "Dax Shepard", dob: "1/2/75", character: "Frito Pendejo", image: "img3.jpg"}]
  • 5. 1:1 relationships or key-values • Use a table or GSI with a partition key • Use GetItem or BatchGetItem API Example: Given a user or email, get attributes Users Table Partition key Attributes UserId = bob Email = bob@gmail.com, JoinDate = 2011-11-15 UserId = fred Email = fred@yahoo.com, JoinDate = 2011-12-01 Users-Email-GSI Partition key Attributes Email = bob@gmail.com UserId = bob, JoinDate = 2011-11-15 Email = fred@yahoo.com UserId = fred, JoinDate = 2011-12-01
  • 6. 1:N relationships or parent-children • Use a table or GSI with partition and sort key • Use Query API Example: Given a device, find all readings between epoch X, Y Device-measurements Part. Key Sort key Attributes DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90 DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90
  • 7. N:M relationships • Use a table and GSI with partition and sort key elements switched • Use Query API Example: Given a user, find all games. Or given a game, find all users. User-Games-Table Part. Key Sort key UserId = bob GameId = Game1 UserId = fred GameId = Game2 UserId = bob GameId = Game3 Game-Users-GSI Part. Key Sort key GameId = Game1 UserId = bob GameId = Game2 UserId = fred GameId = Game3 UserId = bob
  • 8. Data Modeling Exercise #1 • A shopping cart use case • Attributes: cartId, dateTime, set of SKU’s, etc. • Mostly under 1KB in size • Up to 100 million carts in the table – 100GB total • Accessed by cartId (key-value access pattern) • 10K writes, and 10K reads per second • TTL to expire “old” carts • Also: • Notify when there is a price change for a SKU in a cart
  • 9. Cost estimation • The table: cartId, dateTime, set of SKU’s, etc. • Under 1KB in size • Up to 100 million carts in the table – up to 100GB total • Accessed by cartId (key-value access pattern) • 10K writes, and 10K reads per second • Cost estimate • Enter the storage, item size, reads, and writes • Get an estimate of your monthly bill: ~$6300 • The price notification job • E.g. 100 million items, scan/update over 24 hrs • ~ 1200 RCU and WCU for 24 hrs: ~ $22
  • 10. Use case: Time Series Data sensor_data DeviceId: 123 Timestamp: 1492641900 Temp: 172 AWS Lambda Kinesis Stream • Sensor data (temp) from 1000’s of sensors • Need to store for fast access • Access by sensor ID + time range • Store for 90 days DynamoDB
  • 11. Solution DeviceId: 123 Timestamp: 1492641900 Temp: 172 MyTTL: 1492736400 AWS Lambda Kinesis Stream Group multiple data points into a single item Save on writes Use TTL to manage data lifetime sensor_data DynamoDB
  • 12. Time Series Data Example Architecture Sensors AWS Lambda Amazon S3 hosted site Amazon DynamoDB Amazon Kinesis Amazon CloudWatch alarm Amazon Cognito Javascript SDK request Amazon SNS Python (boto) JavaScriptJava
  • 13. Time Series Data  High volume ingest: Kinesis + DynamoDB  Fast access: DynamoDB  At a reasonable cost: <$10K/month for: • 2.5TB of stored data points per month • Ingest of 100K data points per second All using a serverless architecture with managed services on AWS
  • 14. Cost estimate • Data rate: 100000 data points per second • Data storage: 1 month’s worth = ~2.5TB Kinesis: 100000 records -> 100 shards -> ~$5K per month DynamoDB: 100000 WCU’s -> $50K per month • Binned writes: 10000 WCU’s -> $5000 per month (10x less…) • Diff. over 1 year: $600K vs. $60K We can save a lot by storing multiple data points per item
  • 15. Using TTL to age out cold data RCUs = 1000 WCUs = 1000 HotdataColddata Use DynamoDB TTL and Streams to archive cold data Events_table_2016_April Event_id (Partition key) Timestamp (sort key) myTTL 1489188093 …. Attribute NData table S3 AWS Lambda Amazon Kinesis Firehose
  • 16. Age out cold data • Configure a TTL attribute for table • For each item, set it to item expiration date/time • DynamoDB automatically deletes expired items • Use DynamoDB Streams to act on expired items Dealing with data that gets cold over time (all data!)
  • 17. Messages table Messages app David SELECT * FROM Messages WHERE Recipient='David' LIMIT 50 ORDER BY Date DESC Inbox SELECT * FROM Messages WHERE Sender ='David' LIMIT 50 ORDER BY Date DESC Outbox Use case: Messaging App
  • 18. Recipient Date Sender Message David 2014-10-02 Bob … … 48 more messages for David … David 2014-10-03 Alice … Alice 2014-09-28 Bob … Alice 2014-10-01 Carol … Solution v.1 (Many more messages) David Messages table 50 items × 256 KB each Partition key Sort key Large message bodies Attachments SELECT * FROM Messages WHERE Recipient='David' LIMIT 50 ORDER BY Date DESC Inbox
  • 19. Computing inbox query cost Items evaluated by query Average item size Conversion ratio Eventually consistent reads 50 * 256KB * (1 RCU / 4 KB) * (1 / 2) = 1600 RCU
  • 20. Recipient Date Sender Subject MsgId David 2014-10-02 Bob Hi!… afed David 2014-10-03 Alice RE: The… 3kf8 Alice 2014-09-28 Bob FW: Ok… 9d2b Alice 2014-10-01 Carol Hi!... ct7r Solution v.2: Separate the bulk data Inbox-GSI Messages table MsgId … 9d2b … 3kf8 … ct7r … afed … David (50 sequential items at 128 bytes) Query inbox-GSI: 50 * 128B * (1 RCU / 4 KB) * (1 / 2) = 1 RCU
  • 22. Outbox GSI SELECT * FROM Messages WHERE Sender ='David' LIMIT 50 ORDER BY Date DESC
  • 24. • Reduce one-to-many item sizes • Configure secondary index projections • Use GSIs to model M:N relationship between sender and recipient • Use GSIs to select the right subset of data for better performance and lower cost Distribute large items Querying many large items at once InboxMessagesOutbox
  • 25. Use case: Real-time voting Votes Table Voters RawVotes Table Voting App
  • 26. Partition 1 1000 WCUs Partition K 1000 WCUs Partition M 1000 WCUs Partition N 1000 WCUs Votes Table Candidate A Candidate B Real-time voting: scaling bottlenecks Voters Provision 200,000 WCUs
  • 27. Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_7 Candidate B_8 UpdateItem: “CandidateA_” + rand(0, 10) ADD 1 to Votes Candidate A_6 Candidate A_8 Candidate A_5 Voter Votes Table Write-sharding
  • 28. Votes Table Shard aggregation Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_5 Candidate A_6 Candidate A_8 Candidate A_7 Candidate B_8 Periodic process Candidate A Total: 2.5M 1. Sum 2. Store Voter
  • 29. Real-time voting: tables UserId Candidate Date Alice A 2016-10-02 Bob B 2016-10-02 Eve B 2016-10-02 Chuck A 2016-10-02 RawVotes Table Segment Votes A_1 23 B_2 12 B_1 14 A_2 25 Votes Table 1. Record vote and de-dupe 2. Increment candidate counter
  • 30. • Trade off read cost for write scalability • Consider throughput per partition key Shard write-heavy partition keys Your write workload is not horizontally scalable
  • 31. Real-time voting v2: aggregation with Streams Votes Table Amazon Redshift Amazon EMR Amazon Kinesis– Enabled App Voters RawVotes TableVoting App RawVotes DynamoDB Stream
  • 32. Real-time voting v2: tables UserId Candidate Date Alice A 2016-10-02 Bob B 2016-10-02 Eve B 2016-10-02 Chuck A 2016-10-02 RawVotes Table Candidate Votes A 23 B 12 C 14 D 25 Votes Table 1. Record vote and de-dupe 2. Increment candidate counter
  • 33. Analytics with DynamoDB Streams • Collect and de-dupe data in DynamoDB • Aggregate data in-memory and flush periodically Performing real-time aggregation and analytics
  • 34. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved aws.amazon.com/activate Everything and Anything Startups Need to Get Started on AWS Thank you!