AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Dive into Global Secondary Indexes

Amazon DynamoDB is a fully managed, highly scalable distributed database service. Global Secondary Indexes (GSI) give you the flexibility to query your DynamoDB tables in new and powerful ways.

In this session, we will:
• Describe how GSIs work under the covers to ensure consistent low latency at any scale.
• Walk through various access patterns so that you learn how to take full advantage of GSIs and implement best-practice designs that scale efficiently and cost-effectively.

This session is designed for developers and architects seeking to build rich applications that require performance and availability with absolute data durability.

Presentation Transcript

  • Optimize Your Database for the Cloud with DynamoDB: A Deep Dive into Global Secondary Indexes (GSI). David Pearson, Siva Raghupathy. © 2011 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • DynamoDB as a database service: automated operations, predictable performance, durable, low latency, cost effective.
  • Durable, Low Latency
    - WRITES: continuously replicated to 3 AZs; quorum acknowledgment; persisted to disk (custom SSD)
    - READS: strongly or eventually consistent; no trade-off in latency
  • Recent Announcements
    - Secondary Indexes (Local and Global)
    - DynamoDB Local: disconnected development with full API support; no network; no usage costs; no SLA
    - Fine-Grained Access Control: direct-to-DynamoDB access for mobile devices
    - Geospatial and Transaction Libraries
  • DynamoDB Concepts: a table holds items, and items hold attributes. Tables are schema-less; the schema is defined per attribute.
  • DynamoDB Concepts: hash keys
    - Mandatory for all items in a table
    - Key-value access pattern
    - Determine data distribution across partitions 1 .. N
  • DynamoDB Concepts: range keys (hash + range = composite primary key)
    - Model 1:N relationships
    - Enable rich query capabilities: all items for a hash key; ==, <, >, >=, <=; "begins with"; "between"; sorted results; counts; top / bottom N values; paged responses
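The query capabilities listed for range keys can be illustrated with a small in-memory model. This is only a sketch of the semantics, not DynamoDB code: the class `HashRangeTable` and its method names are hypothetical, and it assumes string range keys that sort lexicographically (as ISO dates do).

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class HashRangeTable:
    """Toy in-memory model of a hash + range keyed table (not DynamoDB itself)."""

    def __init__(self):
        # hash key -> list of (range_key, item), kept sorted by range key
        self._partitions = defaultdict(list)

    def put(self, hash_key, range_key, item):
        rows = self._partitions[hash_key]
        keys = [k for k, _ in rows]
        pos = bisect_left(keys, range_key)
        if pos < len(rows) and rows[pos][0] == range_key:
            rows[pos] = (range_key, item)       # same composite key: overwrite
        else:
            rows.insert(pos, (range_key, item))

    def query(self, hash_key, low=None, high=None, prefix=None,
              forward=True, limit=None):
        """Return items for one hash key, optionally filtered on the range key."""
        rows = self._partitions.get(hash_key, [])
        keys = [k for k, _ in rows]
        start = 0 if low is None else bisect_left(keys, low)      # >= low
        stop = len(rows) if high is None else bisect_right(keys, high)  # <= high
        selected = rows[start:stop]
        if prefix is not None:                  # "begins with"
            selected = [r for r in selected if r[0].startswith(prefix)]
        if not forward:                         # descending order
            selected = selected[::-1]
        if limit is not None:                   # top / bottom N
            selected = selected[:limit]
        return [item for _, item in selected]
```

Passing both `low` and `high` gives "between", `forward=False` with `limit=1` gives a top-1 lookup, and every query is scoped to a single hash key, which mirrors the composite-key access pattern on the slide.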
  • DynamoDB Concepts: local secondary indexes (LSI)
    - Alternate range key + same hash key
    - Index and table data are co-located (same partition)
  • LSI Attribute Projections
    Table: A1 (hash), A2 (range), A3, A4, A5
    LSIs:
    - KEYS_ONLY: A1 (hash), A3 (range), A2 (table key)
    - INCLUDE A3: A1 (hash), A4 (range), A2 (table key), A3 (projected)
    - ALL: A1 (hash), A4 (range), A2 (table key), A3 (projected), A5 (projected)
  • DynamoDB Concepts: global secondary indexes (GSI). Any attribute can be indexed as a new hash and/or range key.
  • Local Secondary Index (LSI) vs. Global Secondary Index (GSI)
    1. Key
       - LSI: hash key and a range key
       - GSI: hash or hash-and-range
    2. Key attributes
       - LSI: hash key is the same attribute as that of the table; range key can be any scalar table attribute
       - GSI: the index hash key and range key (if present) can be any scalar table attributes
    3. Size
       - LSI: for each hash key, the total size of all indexed items must be 10 GB or less
       - GSI: no size restrictions
    4. Query scope
       - LSI: query over a single partition, as specified by the hash key value in the query
       - GSI: query over the entire table, across all partitions
    5. Consistency
       - LSI: eventual consistency or strong consistency
       - GSI: eventual consistency only
    6. Capacity
       - LSI: read and write capacity units are consumed from the table
       - GSI: every global secondary index has its own provisioned read and write capacity units
    7. Projected attributes
       - LSI: a query will automatically fetch non-projected attributes from the table
       - GSI: a query can only request projected attributes; it will not fetch any attributes from the table
  • GSI Attribute Projections
    Table: A1 (hash), A2, A3, A4, A5
    GSIs:
    - KEYS_ONLY: A2 (hash), A1 (table key)
    - KEYS_ONLY: A5 (hash), A3 (range), A1 (table key)
    - INCLUDE A3: A5 (hash), A4 (range), A1 (table key), A3 (projected)
    - ALL: A4 (hash), A5 (range), A1 (table key), A2 (projected), A3 (projected)
  • GSI Query Pattern
    - Query covered by GSI: query the GSI and get the attributes
    - Query not covered by GSI: query the GSI to get the table key(s), then BatchGetItem/GetItem from the table (2 or more round trips to DynamoDB)
    - Tip: if you need very low latency, project all required attributes into the GSI
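The covered versus not-covered distinction comes down to a set comparison between the attributes a caller wants and the attributes the index projects. The helper below is a hypothetical planning function (the name `plan_gsi_read` is not part of any SDK) that performs no I/O and only reports the minimum number of round trips implied by the slide.

```python
def plan_gsi_read(requested, projected):
    """Return (minimum_round_trips, steps) for a read served through a GSI.

    requested: attribute names the caller needs
    projected: attribute names available in the index (keys included)
    """
    missing = sorted(set(requested) - set(projected))
    if not missing:
        # Covered query: everything needed is already in the index.
        return 1, ["Query GSI"]
    # Not covered: the index only yields table keys; the rest comes from the table.
    return 2, [
        "Query GSI for table keys",
        "BatchGetItem/GetItem from table for: " + ", ".join(missing),
    ]
```

For example, with an index that projects `UserId`, `FileId` and `Name`, asking for `Name` alone is covered, while also asking for `S3key` forces a second trip to the table.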
  • How do GSI updates work [Diagram: client, primary table partitions, and global secondary index; step 2: asynchronous update of the index (in progress)]
  • 1 table update = 0, 1 or 2 GSI updates
    - Item not in index before or after update: 0 GSI updates
    - Update introduces a new indexed attribute: 1 GSI update
    - Update deletes the indexed attribute: 1 GSI update
    - Update changes the value of an indexed attribute from A to B: 2 GSI updates
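The rules on this slide can be written as a small function that compares an item before and after a table write. This is a sketch under assumptions: `gsi_updates` is a hypothetical helper, items are plain dicts, and the last branch (index key unchanged but a projected attribute changed, counted as one index write) is not on the slide and reflects how DynamoDB is generally documented to behave.

```python
def gsi_updates(old_item, new_item, index_key, projected=()):
    """Count index writes caused by one table write (0, 1 or 2)."""
    old_key = old_item.get(index_key)
    new_key = new_item.get(index_key)
    if old_key is None and new_key is None:
        return 0                      # item not in the index before or after
    if old_key is None or new_key is None:
        return 1                      # indexed attribute introduced or deleted
    if old_key != new_key:
        return 2                      # delete the old entry, put the new one
    # Assumption beyond the slide: same key, but a projected attribute changed.
    changed = any(old_item.get(a) != new_item.get(a) for a in projected)
    return 1 if changed else 0
```

The worst case of two index writes per table write is what drives the provisioning advice later in the deck.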
  • GSI EXAMPLES
  • Example 1: Multi-tenant application for file storing and sharing
    Access patterns:
    1. Users should be able to query all the files they own
    2. Search by File Name
    3. Search by File Type
    4. Search by Date Range
    5. Keep track of Shared Files
    6. Search by descending order of File Size
  • DynamoDB Data Model
    - Users: hash key = UserId (S); attributes = User Name (S), Email (S), Address (SS), etc.
    - User_Files: hash key = UserId (S), which is also the tenant id; range key = FileId (S); attributes = Name (S), Type (S), Size (N), Date (S), SharedFlag (S), S3key (S)
  • Global Secondary Indexes
    Table Name | Index Name      | Attribute to Index | Projected Attribute
    User_Files | NameIndex       | Name               | KEYS
    User_Files | TypeIndex       | Type               | KEYS + Name
    User_Files | DateIndex       | Date               | KEYS + Name
    User_Files | SharedFlagIndex | SharedFlag         | KEYS + Name
    User_Files | SizeIndex       | Size               | KEYS + Name
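As a rough illustration, the five indexes in this table could be described with the `GlobalSecondaryIndexes` structure that the DynamoDB CreateTable API accepts. The sketch assumes each index uses `UserId` as hash key and the indexed attribute as range key (as the access-pattern slides show), and the capacity value of 5 units is a placeholder, not a figure from the deck. The helper `gsi_definition` is hypothetical.

```python
def gsi_definition(index_name, range_attr, include=(), read_units=5, write_units=5):
    """Build one GlobalSecondaryIndexes entry for a CreateTable request."""
    if include:
        projection = {"ProjectionType": "INCLUDE", "NonKeyAttributes": list(include)}
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}
    return {
        "IndexName": index_name,
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},
            {"AttributeName": range_attr, "KeyType": "RANGE"},
        ],
        "Projection": projection,
        # Placeholder capacity; every GSI is provisioned separately from the table.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


USER_FILES_GSIS = [
    gsi_definition("NameIndex", "Name"),                       # KEYS
    gsi_definition("TypeIndex", "Type", include=["Name"]),     # KEYS + Name
    gsi_definition("DateIndex", "Date", include=["Name"]),
    gsi_definition("SharedFlagIndex", "SharedFlag", include=["Name"]),
    gsi_definition("SizeIndex", "Size", include=["Name"]),
]
```

A real CreateTable call would also need matching `AttributeDefinitions` entries for every key attribute named here.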
  • Access Pattern 1: Find all files owned by a user
    Query (UserId = 2)
    UserId (hash) | FileId (range) | Name  | Date       | Type | SharedFlag | Size | S3key
    1             | 1              | File1 | 2013-04-23 | JPG  |            | 1000 | bucket1
    1             | 2              | File2 | 2013-03-10 | PDF  | Y          | 100  | bucket2
    2             | 3              | File3 | 2013-03-10 | PNG  | Y          | 2000 | bucket3
    2             | 4              | File4 | 2013-03-10 | DOC  |            | 3000 | bucket4
    3             | 5              | File5 | 2013-04-10 | TXT  |            | 400  | bucket5
  • Access Pattern 2: Search by file name
    Query (IndexName = NameIndex, UserId = 1, Name = File1)
    NameIndex:
    UserId (hash) | Name (range) | FileId
    1             | File1        | 1
    1             | File2        | 2
    2             | File3        | 3
    2             | File4        | 4
    3             | File5        | 5
  • Access Pattern 3: Search for file name by file type
    Query (IndexName = TypeIndex, UserId = 2, Type = DOC)
    TypeIndex (FileId and Name are the projection):
    UserId (hash) | Type (range) | FileId | Name
    1             | JPG          | 1      | File1
    1             | PDF          | 2      | File2
    2             | DOC          | 4      | File4
    2             | PNG          | 3      | File3
    3             | TXT          | 5      | File5
  • Access Pattern 4: Search for file name by date range
    Query (IndexName = DateIndex, UserId = 1, Date between 2013-03-01 and 2013-03-29)
    DateIndex (FileId and Name are the projection):
    UserId (hash) | Date (range) | FileId | Name
    1             | 2013-03-10   | 2      | File2
    1             | 2013-04-23   | 1      | File1
    2             | 2013-03-10   | 3      | File3
    2             | 2013-03-10   | 4      | File4
    3             | 2013-04-10   | 5      | File5
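The date-range query on this slide might look as follows when expressed as keyword arguments for the low-level DynamoDB Query operation. This is a sketch under assumptions: it uses the expression-style parameters (`KeyConditionExpression`), which are newer than this deck, and the function name `date_range_query` is hypothetical. Only the request is built; nothing is sent.

```python
def date_range_query(user_id, start_date, end_date):
    """Build Query parameters for DateIndex: one user, dates in [start, end]."""
    return {
        "TableName": "User_Files",
        "IndexName": "DateIndex",
        # "Date" is a DynamoDB reserved word, so it is aliased as #d.
        "KeyConditionExpression": "UserId = :uid AND #d BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#d": "Date"},
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},      # UserId is a string (S) in the data model
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
    }


params = date_range_query("1", "2013-03-01", "2013-03-29")
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("dynamodb").query(**params)`. ISO-formatted date strings sort lexicographically, which is why a string range key works for date ranges.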
  • Access Pattern 5: Search for names of shared files
    Query (IndexName = SharedFlagIndex, UserId = 1, SharedFlag = Y)
    SharedFlagIndex (FileId and Name are the projection):
    UserId (hash) | SharedFlag (range) | FileId | Name
    1             | Y                  | 2      | File2
    2             | Y                  | 3      | File3
  • Access Pattern 6: Query for file names by descending order of file size
    Query (IndexName = SizeIndex, UserId = 1, ScanIndexForward = false)
    SizeIndex (FileId and Name are the projection):
    UserId (hash) | Size (range) | FileId | Name
    1             | 100          | 2      | File2
    3             | 400          | 5      | File5
    1             | 1000         | 1      | File1
    2             | 2000         | 3      | File3
    2             | 3000         | 4      | File4
  • Example 2: Find top score for game G1
    Game-scores-table:
    Id (hash key) | User  | Game | Score | Date
    1             | Bob   | G1   | 1300  | 2012-12-23 18:00:00
    2             | Bob   | G1   | 1450  | 2012-12-23 19:00:00
    3             | Jay   | G1   | 1600  | 2012-12-24 20:00:00
    4             | Mary  | G1   | 2000  | 2012-10-24 17:00:00
    5             | Ryan  | G2   | 123   | 2012-03-10 15:00:00
    6             | Jones | G2   | 345   | 2012-03-20 15:00:00
  • GameScoresIndex
    - Game-scores-table: Id (hash key), User, Game, Score, Date
    - Game-scores-index: Game (hash), Score (range), Id (table key), User (projected), Date (projected)
  • Game-scores-index
    Game (hash) | Score (range) | Id | User  | Date
    G1          | 2000          | 4  | Mary  | 2012-10-24 17:00:00
    G1          | 1600          | 3  | Jay   | 2012-12-24 20:00:00
    G1          | 1450          | 2  | Bob   | 2012-12-23 19:00:00
    G1          | 1300          | 1  | Bob   | 2012-12-23 18:00:00
    G2          | 345           | 6  | Jones | 2012-03-20 15:00:00
    G2          | 123           | 5  | Ryan  | 2012-03-10 15:00:00
  • Query: Find top score for game G1
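The slide gives no query text, so the following is a minimal in-memory sketch of what the index lookup amounts to, using the sample rows from the Game-scores-table. The function names are hypothetical. Against DynamoDB itself, the equivalent would presumably be a Query on the index with `Game = G1`, `ScanIndexForward = false` and a `Limit` of 1.

```python
SCORES = [
    {"Id": 1, "User": "Bob", "Game": "G1", "Score": 1300, "Date": "2012-12-23 18:00:00"},
    {"Id": 2, "User": "Bob", "Game": "G1", "Score": 1450, "Date": "2012-12-23 19:00:00"},
    {"Id": 3, "User": "Jay", "Game": "G1", "Score": 1600, "Date": "2012-12-24 20:00:00"},
    {"Id": 4, "User": "Mary", "Game": "G1", "Score": 2000, "Date": "2012-10-24 17:00:00"},
    {"Id": 5, "User": "Ryan", "Game": "G2", "Score": 123, "Date": "2012-03-10 15:00:00"},
    {"Id": 6, "User": "Jones", "Game": "G2", "Score": 345, "Date": "2012-03-20 15:00:00"},
]


def build_game_scores_index(items):
    """Project table items into {Game: [entries sorted by Score, descending]}."""
    index = {}
    for item in items:
        entry = {k: item[k] for k in ("Game", "Score", "Id", "User", "Date")}
        index.setdefault(item["Game"], []).append(entry)
    for entries in index.values():
        entries.sort(key=lambda e: e["Score"], reverse=True)
    return index


def top_score(index, game):
    """Return the highest-scoring entry for a game, or None if the game is unknown."""
    entries = index.get(game, [])
    return entries[0] if entries else None


GAME_SCORES_INDEX = build_game_scores_index(SCORES)
```

Because the index is keyed by game and ordered by score, the top score is simply the first entry when reading in descending order, with no scan of the base table.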
  • DATA MODELING WITH GSI
  • Modeling 1:1 relationships
    Use a table with a hash key or a GSI with a hash key.
    Example: Users table (hash key = UserId) and Users-email-GSI (hash key = Email)
    Users table:
    - UserId = bob: Email = bob@gmail.com, JoinDate = 2011-11-15
    - UserId = fred: Email = fred@yahoo.com, JoinDate = 2011-12-01, Sex = M
    Users-email-GSI:
    - Email = bob@gmail.com: UserId = bob, JoinDate = 2011-11-15
    - Email = fred@yahoo.com: UserId = fred, JoinDate = 2011-12-01, Sex = M
  • Modeling 1:N relationships
    Use a table with hash and range key, or a GSI.
    Example: one (1) user can play many (N) games
    User-Games-Table (hash key, then attributes):
    - UserId = bob: GameId = Game1, HighScore = 10500, ScoreDate = 2011-10-20
    - UserId = fred: GameId = Game2, HighScore = 12000, ScoreDate = 2012-01-10
    - UserId = bob: GameId = Game3, HighScore = 20000, ScoreDate = 2012-02-12
    User-Games-GSI (hash key, range key, then attributes):
    - UserId = bob, GameId = Game1: HighScore = 10500, ScoreDate = 2011-10-20
    - UserId = fred, GameId = Game2: HighScore = 12000, ScoreDate = 2012-01-10
    - UserId = bob, GameId = Game3: HighScore = 20000, ScoreDate = 2012-02-12
  • Modeling N:M relationships
    Use a GSI. Example: 1 user plays multiple games and 1 game has multiple users.
    User-Games-Table (hash key, range key):
    - UserId = bob, GameId = Game1
    - UserId = fred, GameId = Game2
    - UserId = bob, GameId = Game3
    Game-Users-GSI (hash key, range key):
    - GameId = Game1, UserId = bob
    - GameId = Game2, UserId = fred
    - GameId = Game3, UserId = bob
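The N:M pattern is an inverted index: the GSI swaps the roles of the table's hash and range keys so the relationship can be traversed in both directions. The sketch below models that with plain dictionaries; the names `USER_GAMES`, `by_user` and `by_game` are hypothetical.

```python
from collections import defaultdict

# (UserId, GameId) pairs, as in User-Games-Table
USER_GAMES = [("bob", "Game1"), ("fred", "Game2"), ("bob", "Game3")]


def by_user(pairs):
    """Table view: hash key UserId, range key GameId."""
    view = defaultdict(list)
    for user_id, game_id in pairs:
        view[user_id].append(game_id)
    return {user: sorted(games) for user, games in view.items()}


def by_game(pairs):
    """GSI view: hash key GameId, range key UserId (keys swapped)."""
    view = defaultdict(list)
    for user_id, game_id in pairs:
        view[game_id].append(user_id)
    return {game: sorted(users) for game, users in view.items()}
```

"Which games does bob play?" is answered from the table view, and "who plays Game1?" from the GSI view, each with a single hash-key lookup.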
  • Best Practices: choose a GSI hash key with high cardinality
    - Employee-Table: Id (hash), Name, Sex, DOB, Address
    - SexDOB-GSI: Sex (hash), DOB, Id, Name, Address
    - Problem: cardinality of Sex = 2 (M/F)
    - Solution: generate aliases for M/F by suffixing a known range of integers (say 1 to 100), and query for each value M_1 to M_100
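The aliasing trick can be sketched as two helpers: one that picks a suffixed hash key when an item is written, and one that lists every key to query when reading. The function names are hypothetical, and using CRC32 of the employee id to pick the suffix is an assumption; the slide only says to suffix a known range of integers.

```python
import zlib

SHARDS = 100  # known range of suffixes: 1 to 100, as on the slide


def sharded_key(sex, employee_id, shards=SHARDS):
    """Return an alias such as 'M_37' to use as the GSI hash key value."""
    # CRC32 gives a stable suffix per employee (assumption: any stable hash works).
    suffix = zlib.crc32(str(employee_id).encode("utf-8")) % shards + 1
    return f"{sex}_{suffix}"


def keys_to_query(sex, shards=SHARDS):
    """Return every alias that must be queried to read all items for one value."""
    return [f"{sex}_{n}" for n in range(1, shards + 1)]
```

The trade-off is that one logical read becomes up to 100 queries whose results have to be merged on the client, in exchange for writes that spread across many hash key values instead of two.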
  • Best Practices: take advantage of sparse indexes
    Game-scores-table:
    Id (hash) | User  | Game | Score | Date       | Award
    1         | Bob   | G1   | 1300  | 2012-12-23 |
    2         | Bob   | G1   | 1450  | 2012-12-23 |
    3         | Jay   | G1   | 1600  | 2012-12-24 |
    4         | Mary  | G1   | 2000  | 2012-10-24 | Champ
    5         | Ryan  | G2   | 123   | 2012-03-10 |
    6         | Jones | G2   | 345   | 2012-03-20 |
    Award-GSI:
    Award (hash) | Id | User | Score
    Champ        | 4  | Mary | 2000
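A sparse index contains only the items that actually carry the indexed attribute. The in-memory sketch below shows that filtering step on a subset of the sample rows; `build_sparse_index` is a hypothetical helper, not an SDK function.

```python
ITEMS = [
    {"Id": 1, "User": "Bob", "Game": "G1", "Score": 1300},
    {"Id": 3, "User": "Jay", "Game": "G1", "Score": 1600},
    {"Id": 4, "User": "Mary", "Game": "G1", "Score": 2000, "Award": "Champ"},
    {"Id": 6, "User": "Jones", "Game": "G2", "Score": 345},
]


def build_sparse_index(items, index_key, projected):
    """Index only the items that have index_key; others never reach the index."""
    index = {}
    for item in items:
        if index_key not in item:
            continue  # no indexed attribute, so no index entry
        entry = {attr: item[attr] for attr in projected if attr in item}
        index.setdefault(item[index_key], []).append(entry)
    return index


AWARD_GSI = build_sparse_index(ITEMS, "Award", ("Id", "User", "Score"))
```

Here the index ends up with a single entry even though the table has several items, so it stays small and cheap to read and write.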
  • Best Practices: query a GSI for quick item lookups (fewer read capacity units consumed)
    - Mail Box-Table: ID (hash key), Timestamp (range key), Attribute1, Attribute2, Attribute3, …, LargeAttachment
    - Mail Box-lookup-GSI: ID (hash key), Timestamp (range key), Attribute1, Attribute2, Attribute3
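The saving can be estimated with the standard read capacity arithmetic: one read capacity unit covers a strongly consistent read of up to 4 KB, and an eventually consistent read costs half as much. The function below is a per-item approximation (a real Query sums the sizes of all returned items before rounding), and the item sizes in the example are made-up figures, not numbers from the deck.

```python
import math


def read_units(item_size_bytes, strongly_consistent=False):
    """Approximate read capacity units needed to read one item."""
    units = math.ceil(item_size_bytes / 4096)   # 4 KB per unit
    return units if strongly_consistent else units / 2


# Hypothetical sizes: a mail item with a 300 KB attachment in the table,
# versus a 1 KB entry in the lookup GSI that leaves the attachment out.
table_cost = read_units(300 * 1024)   # eventually consistent read from the table
gsi_cost = read_units(1 * 1024)       # GSI reads are eventually consistent only
```

Under these assumed sizes the table read costs 37.5 units and the GSI read costs 0.5, which is the effect the slide describes.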
  • Best Practices: provision enough throughput for each GSI
    - One update to the table may result in two writes to an index
    - If GSIs do not have enough write capacity, table writes will eventually be throttled down to what the "slowest" index can consume
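A rough sizing calculation follows from these two points. The sketch assumes the standard definition of a write capacity unit (one write per second of up to 1 KB) and defaults to the worst case of two index writes per table write; `gsi_write_units` and the traffic figures are hypothetical.

```python
import math


def gsi_write_units(table_writes_per_sec, index_entry_bytes,
                    index_writes_per_table_write=2):
    """Estimate write capacity units to provision on one GSI."""
    units_per_write = math.ceil(index_entry_bytes / 1024)   # 1 KB per unit
    return table_writes_per_sec * index_writes_per_table_write * units_per_write


# Hypothetical workload: 100 table writes/sec, 500-byte index entries,
# every write changing the indexed attribute's value.
worst_case = gsi_write_units(100, 500)
```

With these assumed numbers the index needs about 200 write units, twice the table's write rate, to avoid becoming the "slowest" index.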
  • Debugging Throughput Issues: ProvisionedThroughputExceededException (HTTP status code 400)
    "The level of configured provisioned throughput for one or more global secondary indexes of the table was exceeded. Consider increasing your provisioning level for the under-provisioned global secondary indexes with the UpdateTable API"
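Following the error message's advice, the request to raise a GSI's provisioned throughput might be assembled as below. The structure matches the `GlobalSecondaryIndexUpdates` parameter of the UpdateTable API; the helper name `raise_gsi_throughput` and the capacity numbers are assumptions for illustration, and nothing is sent to AWS.

```python
def raise_gsi_throughput(table_name, index_name, read_units, write_units):
    """Build UpdateTable parameters that change one GSI's provisioned throughput."""
    return {
        "TableName": table_name,
        "GlobalSecondaryIndexUpdates": [
            {
                "Update": {
                    "IndexName": index_name,
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


# Hypothetical new capacity values for the SizeIndex of User_Files.
update_params = raise_gsi_throughput("User_Files", "SizeIndex", 10, 200)
```

With boto3 available, these parameters could be passed as `boto3.client("dynamodb").update_table(**update_params)`.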
  • Debugging Throughput Issues: GSI CloudWatch metrics
    - ProvisionedReadCapacityUnits vs. ConsumedReadCapacityUnits
    - ProvisionedWriteCapacityUnits vs. ConsumedWriteCapacityUnits
    - ReadThrottleEvents
    - WriteThrottleEvents
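Comparing provisioned and consumed capacity usually means turning a consumed total into a per-second rate first. The sketch assumes the consumed metric is read with the Sum statistic over a fixed period, so dividing by the period length gives average units per second; `utilization` is a hypothetical helper and the sample numbers are invented.

```python
def utilization(consumed_sum, period_seconds, provisioned_units):
    """Fraction of provisioned capacity used, from a Sum over one period."""
    if period_seconds <= 0 or provisioned_units <= 0:
        raise ValueError("period_seconds and provisioned_units must be positive")
    average_per_second = consumed_sum / period_seconds
    return average_per_second / provisioned_units


# Hypothetical reading: 3000 write units consumed in a 60-second period
# on an index provisioned with 100 write units.
ratio = utilization(3000, 60, 100)
```

A ratio near or above 1.0, together with non-zero WriteThrottleEvents on the index, would point to the index as the bottleneck. Averages can hide short bursts, so throttle events are worth checking even when the ratio looks comfortable.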
  • Questions