Deep Dive on Amazon DynamoDB

In this session, we explore Amazon DynamoDB capabilities and benefits in detail and discuss how to get the most out of your DynamoDB database. We go over schema design best practices with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others. We also explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including JSON document support, Streams, and more.


  1. 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved Deep Dive: Amazon DynamoDB Sean Shriver NoSQL Solutions Architect Amazon Web Services
  2. 2. Agenda • Tables, API, data types, indexes • Scaling • Data modeling • Scenarios and best practices • DynamoDB Streams • Reference architecture
  3. 3. Amazon DynamoDB • Managed NoSQL database service • Supports both document and key-value data models • Highly scalable • Consistent, single-digit millisecond latency at any scale • Highly available—3x replication • Simple and powerful API
  4. 4. Tables, Partitioning
  5. 5. Table Table Items Attributes Partition Key Sort Key Mandatory Key-value access pattern Determines data distribution Optional Model 1:N relationships Enables rich query capabilities All items for a partition key ==, <, >, >=, <= “begins with” “between” sorted results counts top/bottom N values paged responses
  6. 6. • CreateTable • UpdateTable • DeleteTable • DescribeTable • ListTables • UpdateTimeToLive • DescribeTimeToLive • GetItem • Query • Scan • BatchGetItem _______________ • PutItem • UpdateItem • DeleteItem • BatchWriteItem • ListStreams • DescribeStream • GetShardIterator • GetRecords Stream API DynamoDB Table Item APIs
  7. 7. Data types • String (S) • Number (N) • Binary (B) • String Set (SS) • Number Set (NS) • Binary Set (BS) • Map (M) • List (L) • Boolean (BOOL) • Null (NULL) Used for storing nested JSON documents
  8. 8. 00 55 A954 AA FF Partition table • Partition key uniquely identifies an item • Partition key is used for building an unordered hash index • Table can be partitioned for scale 00 FF Id = 1 Name = Jim Hash (1) = 7B Id = 2 Name = Andy Dept = Engg Hash (2) = 48 Id = 3 Name = Kim Dept = Ops Hash (3) = CD Key Space
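
The key-space picture on this slide can be sketched in a few lines of Python. This is only an illustration: the three key ranges (00-54, 55-A9, AA-FF) are read off the slide, `partition_for_hash` and `key_hash` are hypothetical helpers, and MD5 is a stand-in, since the slide does not say which hash function DynamoDB uses internally.

```python
import hashlib

# Key ranges read off the slide: 00-54, 55-A9, AA-FF (an assumption for illustration).
PARTITION_RANGES = [(0x00, 0x54), (0x55, 0xA9), (0xAA, 0xFF)]


def key_hash(partition_key: str) -> int:
    """Map a partition key to one byte (0x00-0xFF). MD5 is a stand-in only."""
    return hashlib.md5(partition_key.encode("utf-8")).digest()[0]


def partition_for_hash(hash_value: int) -> int:
    """Return the 1-based partition whose key range contains hash_value."""
    for index, (low, high) in enumerate(PARTITION_RANGES, start=1):
        if low <= hash_value <= high:
            return index
    raise ValueError(f"hash {hash_value:#x} is outside the key space")


# Hash values quoted on the slide: Hash(1) = 7B, Hash(2) = 48, Hash(3) = CD.
for item_id, slide_hash in [(1, 0x7B), (2, 0x48), (3, 0xCD)]:
    print(f"Id = {item_id} lands in partition {partition_for_hash(slide_hash)}")
```

Because the hash scatters adjacent Ids across the key space, items with neighboring keys usually end up in different partitions, which is why the index is described as unordered.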
  9. 9. Partitions are three-way replicated Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Replica 1 Replica 2 Replica 3 Partition 1 Partition 2 Partition N
  10. 10. Partition-sort key table • Partition key and sort key together uniquely identify an Item • Within unordered partition key-space, data is sorted by the sort key • No limit on the number of items (∞) per partition key – Except if you have local secondary indexes 00:0 FF:∞ Hash (2) = 48 Customer# = 2 Order# = 10 Item = Pen Customer# = 2 Order# = 11 Item = Shoes Customer# = 1 Order# = 10 Item = Toy Customer# = 1 Order# = 11 Item = Boots Hash (1) = 7B Customer# = 3 Order# = 10 Item = Book Customer# = 3 Order# = 11 Item = Paper Hash (3) = CD 55:0 A9:∞54:∞ AA:0 Partition 1 Partition 2 Partition 3
  11. 11. Indexes
  12. 12. Global secondary index (GSI) • Alternate partition (+sort) key • Index is across all table partition keys GSIs A5 (part.) A4 (sort) A1 (table key) A3 (projected) Table INCLUDE A3 A4 (part.) A5 (sort) A1 (table key) A2 (projected) A3 (projected) ALL A2 (part.) A1 (table key) KEYS_ONLY RCU/WCU provisioned separately for GSIs Online Indexing A1 (partition) A2 A3 A4 A5
  13. 13. How do GSI updates work? Table Primary table Primary table Primary table Primary table Global Secondary Index Client 3. Asynchronous update (in progress) If GSIs don’t have enough write capacity, table writes will be throttled!
  14. 14. How do GSI updates work? Table Primary table Primary table Primary table Primary table Global Secondary Index Client If GSI is under provisioned, back pressure can occur
  15. 15. Local secondary index (LSI) • Alternate sort key attribute • Index is local to a partition key A1 (partition) A3 (sort) A2 (table key) A1 (partition) A2 (sort) A3 A4 A5 LSIs A1 (partition) A4 (sort) A2 (table key) A3 (projected) Table KEYS_ONLY INCLUDE A3 A1 (partition) A5 (sort) A2 (table key) A3 (projected) A4 (projected) ALL 10 GB max per partition key, therefore LSIs limit the # of sort keys!
  16. 16. LSI or GSI? • LSI can be modeled as a GSI • If data size in an item collection > 10 GB, use GSI • If eventual consistency is okay for your scenario, use GSI!
  17. 17. Scaling
  18. 18. Scaling: Disk Size • Scaling is achieved through partitioning • Size on Disk – Partitions are ~10GB – Add any number of items to a table • Max item size is 400 KB • LSIs limit the number of items due to 10 GB limit
  19. 19. Scaling: Throughput • Provisioned at the table level – Write capacity units (WCUs) are measured in 1 KB per second – Read capacity units (RCUs) are measured in 4 KB per second • RCUs measure strongly consistent reads • Eventually consistent reads cost 1/2 of strongly consistent reads • Read and write throughput limits are independent
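
The unit sizes above translate into a small calculation. The sketch below assumes item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes) and that every request costs at least one unit; the function names are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one RCU covers up to 4 KB per read
WRITE_UNIT_BYTES = 1 * 1024  # one WCU covers up to 1 KB per write


def read_capacity_units(item_size_bytes: int, strongly_consistent: bool = True) -> float:
    """RCUs consumed by reading one item of the given size."""
    units = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    # Eventually consistent reads cost half as much.
    return units if strongly_consistent else units / 2


def write_capacity_units(item_size_bytes: int) -> int:
    """WCUs consumed by writing one item of the given size."""
    return max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))


print(read_capacity_units(6000))                             # strongly consistent read
print(read_capacity_units(6000, strongly_consistent=False))  # eventually consistent read
print(write_capacity_units(1500))
```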
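  20. 20. Partitioning math
      $$\text{\# of partitions}_{\text{(for size)}} = \frac{\text{table size in GB}}{10\ \text{GB}}$$
      $$\text{\# of partitions}_{\text{(for throughput)}} = \frac{\text{RCU}_{\text{for reads}}}{3000\ \text{RCU}} + \frac{\text{WCU}_{\text{for writes}}}{1000\ \text{WCU}}$$
      $$\text{\# of partitions}_{\text{(total)}} = \left\lceil \max\left(\text{\# of partitions}_{\text{(for size)}},\ \text{\# of partitions}_{\text{(for throughput)}}\right) \right\rceil$$
      In the future, these details might change…
  21. 21. Partitioning example: Table size = 8 GB, RCUs = 5000, WCUs = 500
      $$\text{\# of partitions}_{\text{(for size)}} = \frac{8\ \text{GB}}{10\ \text{GB}} = 0.8 \rightarrow 1$$
      $$\text{\# of partitions}_{\text{(for throughput)}} = \frac{5000\ \text{RCU}}{3000\ \text{RCU}} + \frac{500\ \text{WCU}}{1000\ \text{WCU}} = 2.17 \rightarrow 3$$
      $$\text{\# of partitions}_{\text{(total)}} = \lceil \max(0.8,\ 2.17) \rceil = 3$$
      RCUs per partition = 5000/3 = 1666.67 WCUs per partition = 500/3 = 166.67 Data/partition = 10/3 = 3.33 GB RCUs and WCUs are uniformly spread across partitions on table creation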
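
The partitioning arithmetic can be checked with a short script. It is a sketch under the slide's stated constants (10 GB, 3000 RCU, and 1000 WCU per partition); taking the larger of the two estimates and rounding up is an assumption that matches the worked example, and the slide itself warns that these details might change.

```python
import math

GB_PER_PARTITION = 10
RCU_PER_PARTITION = 3000
WCU_PER_PARTITION = 1000


def partitions_for_size(table_size_gb: float) -> float:
    """Raw (unrounded) partition estimate driven by storage."""
    return table_size_gb / GB_PER_PARTITION


def partitions_for_throughput(rcus: float, wcus: float) -> float:
    """Raw (unrounded) partition estimate driven by provisioned throughput."""
    return rcus / RCU_PER_PARTITION + wcus / WCU_PER_PARTITION


def total_partitions(table_size_gb: float, rcus: float, wcus: float) -> int:
    """Assumed rule: take the larger estimate and round up."""
    return math.ceil(max(partitions_for_size(table_size_gb),
                         partitions_for_throughput(rcus, wcus)))


# Worked example from the slide: 8 GB, 5000 RCUs, 500 WCUs.
count = total_partitions(8, 5000, 500)
print("partitions:", count)
print("RCUs per partition:", round(5000 / count, 2))
print("WCUs per partition:", round(500 / count, 2))
```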
  22. 22. Getting the most out of DynamoDB throughput “To get the most out of DynamoDB throughput, create tables where the partition key has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible.” —DynamoDB Developer Guide 1. Key Choice: High key cardinality 2. Uniform Access: access is evenly spread over the key-space 3. Time: requests arrive evenly spaced in time
  23. 23. Example: Key Choice or Uniform Access [Heat map; axes: Partition, Time; color: Heat]
  24. 24. Example: Time
  25. 25. Example: Uniform access
  26. 26. How does DynamoDB handle bursts? • DynamoDB saves 300 seconds of unused capacity per partition Bursting is best effort!
  27. 27. Burst capacity is built-in [Chart: capacity units (0 to 1600) over time; series: Provisioned, Consumed] • “Save up” unused capacity • Consume saved-up capacity • Burst: 300 seconds (1200 × 300 = 360k CU)
  28. 28. Burst capacity may not be sufficient [Chart: capacity units (0 to 1600) over time; series: Provisioned, Consumed, Attempted; throttled requests marked] • Don’t completely depend on burst capacity… provision sufficient throughput • Burst: 300 seconds (1200 × 300 = 360k CU)
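
The two burst charts can be approximated with a simple token-bucket model. This is a simplified sketch, not DynamoDB's actual algorithm: it assumes one bucket, one-second steps, and a saved-capacity ceiling of `provisioned × 300`; real bursting is per partition and best effort. `simulate_burst` is a hypothetical name.

```python
def simulate_burst(provisioned, demand_per_second, burst_seconds=300):
    """Return the number of throttled capacity units for each second.

    Unused capacity is saved (up to provisioned * burst_seconds) and spent
    when demand exceeds the provisioned rate.
    """
    max_saved = provisioned * burst_seconds
    saved = 0
    throttled = []
    for demand in demand_per_second:
        available = provisioned + saved
        served = min(demand, available)
        throttled.append(demand - served)
        # Whatever was not served from this second's budget is carried over.
        saved = min(max_saved, available - served)
    return throttled


# Small numbers for readability: 10 units/s provisioned, 3 s of saved burst.
result = simulate_burst(provisioned=10,
                        demand_per_second=[0, 0, 0, 0, 25, 25, 25],
                        burst_seconds=3)
print(result)
```

In this run the first two spikes are absorbed by saved capacity and the third is partly throttled, which mirrors the message of the second chart: a sustained overshoot eventually drains the burst reserve.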
  29. 29. What causes throttling? • If sustained throughput goes beyond provisioned throughput per partition • From the example before: – Table created with 5000 RCUs, 500 WCUs – RCUs per partition = 1666.67 – WCUs per partition = 166.67 – If sustained throughput > (1666 RCUs or 166 WCUs) per key or partition, DynamoDB may throttle requests • Solution: Increase provisioned throughput
  30. 30. What causes throttling? • Non-uniform workloads – Hot keys/hot partitions – Large collection sizes [100s of MBs under one partition key] • Dilution of throughput across partitions caused by mixing hot data with cold data – Use a table per time period for storing time series data so WCUs and RCUs are applied to the hot data set
  31. 31. Data Modeling Store data based on how you will access it!
  32. 32. 1:1 relationships or key-values • Use a table or GSI with a partition key • Use GetItem or BatchGetItem API Example: Given a user or email, get attributes Users Table Partition key Attributes UserId = bob Email = bob@gmail.com, JoinDate = 2016-11-15 UserId = fred Email = fred@yahoo.com, JoinDate = 2016-12-01 Users-Email-GSI Partition key Attributes Email = bob@gmail.com UserId = bob, JoinDate = 2016-11-15 Email = fred@yahoo.com UserId = fred, JoinDate = 2016-12-01
  33. 33. 1:N relationships or parent-children • Use a table or GSI with partition and sort key • Use Query API Example: Given a device, find all readings between epoch X, Y Device-measurements Part. Key Sort key Attributes DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90 DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90
  34. 34. N:M relationships • Use a table and GSI with partition and sort key elements switched • Use Query API Example: Given a user, find all games. Or given a game, find all users. User-Games-Table Part. Key Sort key UserId = bob GameId = Game1 UserId = fred GameId = Game2 UserId = bob GameId = Game3 Game-Users-GSI Part. Key Sort key GameId = Game1 UserId = bob GameId = Game2 UserId = fred GameId = Game3 UserId = bob
  35. 35. Documents (JSON) • Data types (M, L, BOOL, NULL) introduced to support JSON • Document SDKs – Simple programming model – Conversion to/from JSON – Java, JavaScript, Ruby, .NET • Cannot create an index on elements of a JSON object stored in a Map – They need to be modeled as top-level table attributes to be used in LSIs and GSIs • Set, Map, and List have no element limit, but nesting depth is limited to 32 levels • JavaScript to DynamoDB type mapping: string → S, number → N, boolean → BOOL, null → NULL, array → L, object → M
  36. 36. Rich expressions • Projection expression – Query/Get/Scan: ProductReviews.FiveStar[0] • Filter expression – Query/Scan: #V > :num (#V is a placeholder for keyword VIEWS) • Conditional expression – Put/Update/DeleteItem: attribute_not_exists (#pr.FiveStar) • Update expression – UpdateItem: set Replies = Replies + :num
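
The expressions above travel as plain request parameters. The sketch below only builds the parameter dictionaries and does not call any SDK; the table name `Forum`, the key attribute `Id`, and the helper names are hypothetical, while the parameter names follow the DynamoDB low-level API.

```python
def build_update_request(table_name: str, key: dict, increment: int) -> dict:
    """Parameters for an UpdateItem call that adds `increment` to Replies."""
    return {
        "TableName": table_name,
        "Key": key,
        "UpdateExpression": "SET Replies = Replies + :num",
        # Values are typed; numbers are sent as strings under the "N" tag.
        "ExpressionAttributeValues": {":num": {"N": str(increment)}},
    }


def build_filter_parameters(min_views: int) -> dict:
    """Filter parameters for a Query or Scan; #V stands in for the keyword VIEWS."""
    return {
        "FilterExpression": "#V > :num",
        "ExpressionAttributeNames": {"#V": "VIEWS"},
        "ExpressionAttributeValues": {":num": {"N": str(min_views)}},
    }


update_request = build_update_request("Forum", {"Id": {"S": "thread-1"}}, 1)
filter_parameters = build_filter_parameters(100)
print(update_request["UpdateExpression"])
print(filter_parameters["FilterExpression"])
```

Placeholders such as `#V` keep reserved words out of the expression text, and `:num` keeps values out of it, so the same expression string can be reused with different inputs.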
  37. 37. Scenarios and Best Practices
  38. 38. Event Logging Storing time series data
  39. 39. Time series tables • Current table (hot data): Events_table_2016_April, RCUs = 10000, WCUs = 10000 • Older tables (cold data): Events_table_2016_March, RCUs = 1000, WCUs = 100; Events_table_2016_February, RCUs = 100, WCUs = 1; Events_table_2016_January, RCUs = 10, WCUs = 1 • Each table: Event_id (Partition key), Timestamp (sort key), Attribute1 … Attribute N • Don’t mix hot and cold data; archive cold data to Amazon S3
  40. 40. DynamoDB TTL • Current table (hot data): Events_table_2016_April, RCUs = 10000, WCUs = 10000; Event_id (Partition key), Timestamp (sort key), myTTL = 1489188093, … Attribute N • Events_Archive (cold data): RCUs = 100, WCUs = 1; Event_id (Partition key), Timestamp (sort key), Attribute1 … Attribute N • Use DynamoDB TTL and Streams to archive
  41. 41. Isolate cold data from hot data • Pre-create daily, weekly, monthly tables • Provision required throughput for current table • Writes go to the current table • Turn off (or reduce) throughput for older tables OR move items to separate table with TTL Dealing with time series data
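
Routing writes to the current table and stamping items with an expiry time can be sketched as follows. The naming pattern follows the `Events_table_2016_April` tables on the earlier slide; the helper names and the retention period are assumptions for illustration.

```python
from datetime import date, datetime, timedelta, timezone

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def events_table_name(day: date) -> str:
    """Name of the monthly table that should receive writes for `day`."""
    return f"Events_table_{day.year}_{MONTH_NAMES[day.month - 1]}"


def ttl_epoch(event_time: datetime, retention_days: int) -> int:
    """Expiry time in epoch seconds, suitable for a TTL attribute such as myTTL."""
    return int((event_time + timedelta(days=retention_days)).timestamp())


print(events_table_name(date(2016, 4, 15)))
print(ttl_epoch(datetime(2017, 1, 1, tzinfo=timezone.utc), retention_days=1))
```

Using an explicit month list avoids locale-dependent formatting, so every writer derives the same table name.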
  42. 42. Product Catalog Popular items (read)
  43. 43. Scaling bottlenecks • Shoppers read Product A and Product B from the ProductCatalog Table (Partition 1 … Partition K … Partition M … Partition 50, 2000 RCUs each) • $\frac{100{,}000\ \text{RCU}}{50\ \text{partitions}} \approx 2000\ \text{RCU per partition}$ • SELECT Id, Description, ... FROM ProductCatalog WHERE Id="POPULAR_PRODUCT"
  44. 44. [Chart: Request Distribution Per Partition Key; x-axis: Item Primary Key; y-axis: Requests Per Second; series: DynamoDB Requests]
  45. 45. Partition 1 Partition 2 ProductCatalog Table User DynamoDB User SELECT Id, Description, ... FROM ProductCatalog WHERE Id="POPULAR_PRODUCT"
  46. 46. [Chart: Request Distribution Per Partition Key; x-axis: Item Primary Key; y-axis: Requests Per Second; series: DynamoDB Requests, Cache Hits]
  47. 47. Messaging App Large items Filters vs. indexes M:N Modeling—inbox and outbox
  48. 48. Messages Table Messages App David SELECT * FROM Messages WHERE Recipient='David' LIMIT 50 ORDER BY Date DESC Inbox SELECT * FROM Messages WHERE Sender ='David' LIMIT 50 ORDER BY Date DESC Outbox
  49. 49. Recipient Date Sender Message David 2016-10-02 Bob … … 48 more messages for David … David 2016-10-03 Alice … Alice 2016-09-28 Bob … Alice 2016-10-01 Carol … Large and small attributes mixed (Many more messages) David Messages Table 50 items × 256 KB each Large message bodies Attachments SELECT * FROM Messages WHERE Recipient='David' LIMIT 50 ORDER BY Date DESC Inbox
  50. 50. Computing inbox query cost $50\ \text{(items evaluated by query)} \times 256\ \text{KB (average item size)} \times \frac{1\ \text{RCU}}{4\ \text{KB}}\ \text{(conversion ratio)} \times \frac{1}{2}\ \text{(eventually consistent reads)} = 1600\ \text{RCU}$
  51. 51. Separate the bulk data
      Inbox-GSI:
      Recipient | Date       | Sender | Subject  | MsgId
      David     | 2016-10-02 | Bob    | Hi!…     | afed
      David     | 2016-10-03 | Alice  | RE: The… | 3kf8
      Alice     | 2016-09-28 | Bob    | FW: Ok…  | 9d2b
      Alice     | 2016-10-01 | Carol  | Hi!...   | ct7r
      Messages Table: MsgId → Body (9d2b …, 3kf8 …, ct7r …, afed …)
      1. Query Inbox-GSI: 1 RCU (50 sequential items at 128 bytes)
      2. BatchGetItem Messages: 1600 RCU (50 separate items at 256 KB)
      Uniformly distributes large item reads
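
The RCU figures on these slides can be reproduced with a little arithmetic. The sketch assumes that a Query rounds the combined size of the items it reads up to 4 KB, that BatchGetItem rounds each item up separately, and that eventually consistent reads cost half; the function names are hypothetical.

```python
import math

RCU_BYTES = 4 * 1024


def query_rcus(item_sizes, strongly_consistent=False):
    """Query: sum the item sizes first, then round up to 4 KB units."""
    units = math.ceil(sum(item_sizes) / RCU_BYTES)
    return units if strongly_consistent else units / 2


def batch_get_rcus(item_sizes, strongly_consistent=False):
    """BatchGetItem: round every item up to 4 KB units on its own."""
    units = sum(math.ceil(size / RCU_BYTES) for size in item_sizes)
    return units if strongly_consistent else units / 2


full_messages = [256 * 1024] * 50   # 50 items at 256 KB
index_entries = [128] * 50          # 50 items at 128 bytes

print("Inbox query on full items:", query_rcus(full_messages))
print("Query on Inbox-GSI:", query_rcus(index_entries))
print("BatchGetItem on Messages:", batch_get_rcus(full_messages))
```

The total cost of fetching every body is unchanged, but the cheap index query lets the app list an inbox without paying for message bodies it may never display.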
  52. 52. Inbox GSI
  53. 53. Simplified writes David PutItem { MsgId: 123, Body: ..., Recipient: Steve, Sender: David, Date: 2016-10-23, ... } Inbox Global secondary index Messages Table
  54. 54. Messaging app Messages Table David Inbox Global secondary index Inbox Outbox Global secondary index Outbox
  55. 55. • Reduce one-to-many item sizes • Configure secondary index projections • Use GSIs to model M:N relationship between sender and recipient Distribute large items Querying many large items at once InboxMessagesOutbox
  56. 56. Multiplayer Online Gaming Query filters vs. composite key indexes
  57. 57. Multiplayer online game data (Games Table)
      GameId | Date       | Host  | Opponent | Status
      d9bl3  | 2016-10-02 | David | Alice    | DONE
      72f49  | 2016-09-30 | Alice | Bob      | PENDING
      o2pnb  | 2016-10-08 | Bob   | Carol    | IN_PROGRESS
      b932s  | 2016-10-03 | Carol | Bob      | PENDING
      ef9ca  | 2016-10-03 | David | Bob      | IN_PROGRESS
  58. 58. Query for incoming game requests • DynamoDB indexes provide partition and sort key • What about queries for two equalities and a sort? SELECT * FROM Game WHERE Opponent='Bob' AND Status='PENDING' ORDER BY Date DESC (partition) (sort) (???)
  59. 59. Secondary Index Opponent Date GameId Status Host Alice 2016-10-02 d9bl3 DONE David Carol 2016-10-08 o2pnb IN_PROGRESS Bob Bob 2016-09-30 72f49 PENDING Alice Bob 2016-10-03 b932s PENDING Carol Bob 2016-10-03 ef9ca IN_PROGRESS David Approach 1: Query filter Bob
  60. 60. Secondary Index Approach 1: Query filter Bob Opponent Date GameId Status Host Alice 2016-10-02 d9bl3 DONE David Carol 2016-10-08 o2pnb IN_PROGRESS Bob Bob 2016-09-30 72f49 PENDING Alice Bob 2016-10-03 b932s PENDING Carol Bob 2016-10-03 ef9ca IN_PROGRESS David SELECT * FROM Game WHERE Opponent='Bob' ORDER BY Date DESC FILTER ON Status='PENDING' (filtered out)
  61. 61. Needle in a haystack Bob
  62. 62. • Send back less data “on the wire” • Simplify application code • Simple SQL-like expressions – AND, OR, NOT, () Use query filter Your index isn’t entirely selective
  63. 63. Approach 2: Composite key
      Status      | Date       | StatusDate
      DONE        | 2016-10-02 | DONE_2016-10-02
      IN_PROGRESS | 2016-10-08 | IN_PROGRESS_2016-10-08
      IN_PROGRESS | 2016-10-03 | IN_PROGRESS_2016-10-03
      PENDING     | 2016-09-30 | PENDING_2016-09-30
      PENDING     | 2016-10-03 | PENDING_2016-10-03
  64. 64. Secondary Index Approach 2: Composite key Opponent StatusDate GameId Host Alice DONE_2016-10-02 d9bl3 David Carol IN_PROGRESS_2016-10-08 o2pnb Bob Bob IN_PROGRESS_2016-10-03 ef9ca David Bob PENDING_2016-09-30 72f49 Alice Bob PENDING_2016-10-03 b932s Carol
  65. 65. Opponent StatusDate GameId Host Alice DONE_2016-10-02 d9bl3 David Carol IN_PROGRESS_2016-10-08 o2pnb Bob Bob IN_PROGRESS_2016-10-03 ef9ca David Bob PENDING_2016-09-30 72f49 Alice Bob PENDING_2016-10-03 b932s Carol Secondary Index Approach 2: Composite key Bob SELECT * FROM Game WHERE Opponent='Bob' AND StatusDate BEGINS_WITH 'PENDING'
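
The composite-key approach can be mimicked in memory to show why a prefix match on the sort key replaces the filter. This is a sketch, not DynamoDB code: a dictionary of sorted lists stands in for the secondary index, the rows come from the Games table shown earlier, and `build_index` and `query_begins_with` are hypothetical helpers.

```python
from bisect import bisect_left
from collections import defaultdict

GAMES = [
    {"GameId": "d9bl3", "Date": "2016-10-02", "Host": "David", "Opponent": "Alice", "Status": "DONE"},
    {"GameId": "72f49", "Date": "2016-09-30", "Host": "Alice", "Opponent": "Bob", "Status": "PENDING"},
    {"GameId": "o2pnb", "Date": "2016-10-08", "Host": "Bob", "Opponent": "Carol", "Status": "IN_PROGRESS"},
    {"GameId": "b932s", "Date": "2016-10-03", "Host": "Carol", "Opponent": "Bob", "Status": "PENDING"},
    {"GameId": "ef9ca", "Date": "2016-10-03", "Host": "David", "Opponent": "Bob", "Status": "IN_PROGRESS"},
]


def build_index(games):
    """Partition by Opponent; sort each partition by the StatusDate composite key."""
    index = defaultdict(list)
    for game in games:
        status_date = f"{game['Status']}_{game['Date']}"
        index[game["Opponent"]].append((status_date, game["GameId"]))
    for rows in index.values():
        rows.sort()
    return index


def query_begins_with(index, opponent, prefix):
    """Return GameIds whose StatusDate starts with `prefix`, in sort-key order."""
    rows = index.get(opponent, [])
    start = bisect_left(rows, (prefix,))  # jump straight to the first candidate
    matches = []
    for status_date, game_id in rows[start:]:
        if not status_date.startswith(prefix):
            break  # sorted order means no later row can match
        matches.append(game_id)
    return matches


index = build_index(GAMES)
print(query_begins_with(index, "Bob", "PENDING"))
```

Only matching rows are touched, which is the difference between the needle in a haystack of Approach 1 and the sorted haystack of Approach 2.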
  66. 66. Needle in a sorted haystack Bob
  67. 67. Approach 2: Simplified with sparse indexes
      Game-scores-table:
      Id (Part.) | User  | Game | Score | Date       | Award
      1          | Bob   | G1   | 1300  | 2016-12-23 |
      2          | Bob   | G1   | 1450  | 2016-12-23 |
      3          | Jay   | G1   | 1600  | 2016-12-24 |
      4          | Mary  | G1   | 2000  | 2016-10-24 | Champ
      5          | Ryan  | G2   | 123   | 2016-03-10 |
      6          | Jones | G2   | 345   | 2016-03-20 |
      Award-GSI:
      Award (Part.) | Id | User | Score
      Champ         | 4  | Mary | 2000
      Scan sparse partition GSIs
  68. 68. • Concatenate attributes to form useful secondary index keys • Take advantage of sparse indexes Replace filter with indexes You want to optimize a query as much as possible Status + Date
  69. 69. Real-Time Voting Write-heavy items
  70. 70. Requirements for voting • Allow each person to vote only once • No changing votes • Real-time aggregation • Voter analytics, demographics
  71. 71. Real-time voting architecture AggregateVotes Table Voters RawVotes Table Voting App
  72. 72. Partition 1 1000 WCUs Partition K 1000 WCUs Partition M 1000 WCUs Partition N 1000 WCUs Votes Table Candidate A Candidate B Scaling bottlenecks Voters Provision 200,000 WCUs
  73. 73. Write sharding Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_7 Candidate B_8 Candidate A_6 Candidate A_8 Candidate A_5 Voter Votes Table
  74. 74. Write sharding Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_7 Candidate B_8 UpdateItem: “CandidateA_” + rand(0, 10) ADD 1 to Votes Candidate A_6 Candidate A_8 Candidate A_5 Voter Votes Table
  75. 75. Votes Table Shard aggregation Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_5 Candidate A_6 Candidate A_8 Candidate A_7 Candidate B_8 Periodic Process Candidate A Total: 2.5M 1. Sum 2. Store Voter
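
Write sharding and the periodic aggregation step can be sketched with an in-memory counter standing in for the Votes table. The shard count of 8 follows the shard labels in the diagrams (the slide's `rand(0, 10)` suggests a different range), and the function names are hypothetical.

```python
import random
from collections import Counter

SHARD_COUNT = 8  # assumption: shards _1 .. _8, as drawn on the slides


def shard_key(candidate: str, rng: random.Random) -> str:
    """Pick one of the candidate's shards at random for this write."""
    return f"{candidate}_{rng.randint(1, SHARD_COUNT)}"


def record_vote(votes_table: Counter, candidate: str, rng: random.Random) -> None:
    """Stand-in for UpdateItem ... ADD 1 to Votes on a random shard."""
    votes_table[shard_key(candidate, rng)] += 1


def aggregate(votes_table: Counter, candidate: str) -> int:
    """Periodic process, step 1: sum the counters of every shard."""
    return sum(votes_table[f"{candidate}_{n}"] for n in range(1, SHARD_COUNT + 1))


rng = random.Random(0)
votes_table = Counter()
for _ in range(1000):
    record_vote(votes_table, "CandidateA", rng)
for _ in range(500):
    record_vote(votes_table, "CandidateB", rng)

print("CandidateA total:", aggregate(votes_table, "CandidateA"))
print("CandidateB total:", aggregate(votes_table, "CandidateB"))
```

Each write touches one of several partition keys, so no single key absorbs all the traffic; the price is that reading a total requires one read per shard.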
  76. 76. • Trade off read cost for write scalability • Consider throughput per partition key and per partition Shard write-heavy partition keys Your write workload is not horizontally scalable
  77. 77. Correctness in voting UserId Candidate Date Alice A 2016-10-02 Bob B 2016-10-02 Eve B 2016-10-02 Chuck A 2016-10-02 RawVotes Table Segment Votes A_1 23 B_2 12 B_1 14 A_2 25 AggregateVotes Table Voter 1. Record vote and de-dupe; retry 2. Increment candidate counter
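
The two numbered steps can be sketched in memory to show the de-duplication logic. Dictionaries stand in for the RawVotes and AggregateVotes tables, and the membership check plays the role of a conditional write; the class and method names are hypothetical.

```python
class VotingApp:
    """In-memory stand-in for the RawVotes and AggregateVotes tables."""

    def __init__(self):
        self.raw_votes = {}        # UserId -> vote item
        self.aggregate_votes = {}  # Candidate -> count

    def vote(self, user_id: str, candidate: str, date: str) -> bool:
        # Step 1: record the vote only if this user has not voted before
        # (the role a conditional put would play against a real table).
        if user_id in self.raw_votes:
            return False
        self.raw_votes[user_id] = {"Candidate": candidate, "Date": date}
        # Step 2: increment the candidate counter.
        self.aggregate_votes[candidate] = self.aggregate_votes.get(candidate, 0) + 1
        return True


app = VotingApp()
print(app.vote("Alice", "A", "2016-10-02"))  # first vote is accepted
print(app.vote("Alice", "B", "2016-10-02"))  # second vote is rejected
print(app.aggregate_votes)
```

In a real deployment the two steps are separate writes, so a failure between them leaves the counter behind the raw votes; the sketch does not model that gap.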
  78. 78. Correctness in aggregation? UserId Candidate Date Alice A 2016-10-02 Bob B 2016-10-02 Eve B 2016-10-02 Chuck A 2016-10-02 RawVotes Table Segment Votes A_1 23 B_2 12 B_1 14 A_2 25 AggregateVotes Table Voter
  79. 79. DynamoDB Streams
  80. 80. • Stream of updates to a table • Asynchronous • Exactly once • Strictly ordered – Per item • Highly durable • Scale with table • 24-hour lifetime • Sub-second latency DynamoDB Streams
  81. 81. View Type Destination Old image—before update Name = John, Destination = Mars New image—after update Name = John, Destination = Pluto Old and new images Name = John, Destination = Mars Name = John, Destination = Pluto Keys only Name = John View types UpdateItem (Name = John, Destination = Pluto)
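
The four view types differ only in which images a stream record carries. The sketch below builds a simplified record for the slide's example; it uses plain Python values rather than DynamoDB's typed attribute values, and `stream_record` is a hypothetical helper.

```python
VIEW_TYPES = ("KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES")


def stream_record(view_type: str, keys: dict, old_item: dict, new_item: dict) -> dict:
    """Return the parts of an update that a stream record would expose."""
    if view_type not in VIEW_TYPES:
        raise ValueError(f"unknown view type: {view_type}")
    record = {"Keys": keys}  # the key attributes are always present
    if view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["OldImage"] = old_item
    if view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["NewImage"] = new_item
    return record


keys = {"Name": "John"}
before = {"Name": "John", "Destination": "Mars"}
after = {"Name": "John", "Destination": "Pluto"}

for view_type in VIEW_TYPES:
    print(view_type, stream_record(view_type, keys, before, after))
```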
  82. 82. Stream Partition 1 Partition 2 Partition 3 Partition 4 Table Shard 1 Shard 2 Shard 3 Shard 4 KCL Worker KCL Worker KCL Worker KCL Worker Amazon Kinesis Client Library Application DynamoDB Client Application Updates DynamoDB Streams and Amazon Kinesis Client Library
  83. 83. DynamoDB Streams Open Source Cross- Region Replication Library Asia Pacific (Sydney) EU (Ireland) Replica US East (N. Virginia) Cross-region replication
  84. 84. DynamoDB Streams and AWS Lambda
  85. 85. Real-time voting architecture (improved) AggregateVotes Table Amazon Redshift Amazon EMR Your Amazon Kinesis– Enabled App Voters RawVotes TableVoting App RawVotes DynamoDB Stream
  90. 90. Analytics with DynamoDB Streams • Collect and de-dupe data in DynamoDB • Aggregate data in-memory and flush periodically Performing real-time aggregation and analytics
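
Aggregating in memory and flushing periodically can be sketched as a small consumer class. This is an illustration only: it flushes after a fixed number of records rather than on a timer, a `Counter` stands in for the AggregateVotes table, and the class name is hypothetical.

```python
from collections import Counter


class VoteAggregator:
    """Buffers vote counts in memory and writes them out in batches."""

    def __init__(self, flush_every: int):
        self.flush_every = flush_every
        self.pending = Counter()  # counts not yet written out
        self.seen = 0             # records buffered since the last flush
        self.totals = Counter()   # stand-in for the AggregateVotes table

    def on_record(self, candidate: str) -> None:
        """Handle one stream record describing a new raw vote."""
        self.pending[candidate] += 1
        self.seen += 1
        if self.seen >= self.flush_every:
            self.flush()

    def flush(self) -> None:
        """Apply buffered counts in one batch: one write per candidate."""
        self.totals.update(self.pending)
        self.pending.clear()
        self.seen = 0


aggregator = VoteAggregator(flush_every=3)
for candidate in ["A", "B", "A", "A"]:
    aggregator.on_record(candidate)
print("flushed:", dict(aggregator.totals), "pending:", dict(aggregator.pending))
```

Batching trades a short delay in the visible totals for far fewer writes to the aggregate table.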
  91. 91. Architecture
  92. 92. Reference Architecture
  93. 93. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved aws.amazon.com/activate Everything and Anything Startups Need to Get Started on AWS
