
Getting started with Amazon DynamoDB

Learn the fundamentals of Amazon DynamoDB and see the DynamoDB console first-hand as we walk through a demo of building a serverless web application using this high-performance key-value and JSON document store.


  1. Getting Started with NoSQL on AWS. Padma Malligarjunan, Sr. Technical Account Manager, AWS Enterprise Support. September 14, 2016
  2. Agenda
      • Brief history of data processing
      • Relational (SQL) vs. nonrelational (NoSQL)
      • NoSQL solutions on AWS
      • Introduction to MongoDB on AWS
      • Amazon DynamoDB’s fully managed features
      • Demo: serverless applications
  3. Data volume since 2010
      • 90% of stored data generated in last 2 years
      • 1 terabyte of data in 2010 equals 6.5 petabytes today
      • Linear correlation between data pressure and technical innovation
      • No reason these trends will not continue over time
  4. Timeline of database technology (chart; axis label: data pressure)
  5. Technology adoption and the hype curve
  6. Relational (SQL) vs. nonrelational (NoSQL)
  7. Relational vs. nonrelational databases (diagram). Traditional SQL: a primary DB with a secondary; scale up. NoSQL: many DB nodes; scale out.
  8. SQL vs. NoSQL schema design: NoSQL design optimizes for compute instead of storage
  9. Why NoSQL?
      SQL                      NoSQL
      Optimized for storage    Optimized for compute
      Normalized/relational    Denormalized/hierarchical
      Ad hoc queries           Instantiated views
      Scale vertically         Scale horizontally
      Good for OLAP            Built for OLTP at scale
  10. NoSQL solutions on AWS
      • Bring your own NoSQL (or) use Amazon DynamoDB
      • The widest range of NoSQL options
      • Avoid the overhead of provisioning hardware
      • Visit https://aws.amazon.com/nosql/document/
      Options shown: MongoDB, Cassandra, Couchbase, MarkLogic, Amazon DynamoDB
  11. NoSQL solutions using Amazon EC2 and EBS (diagram comparing a DB hosted on-premises with a DB hosted on Amazon EC2)
  12. MongoDB Best Practices on Amazon Web Services
  13. Relational Data Model
      • Game: GameID, PlayerOneID, PlayerTwoID, GameParam1, GameParamN, StartTime, EndTime, State
      • GameMoves: GameID, MoveID, PlayerID, MoveToken, Timestamp
      • Player: PlayerID, Name, Wins, Losses, Email, Password
      • Dashboard1: GameID, DashID, Placement, <De-normalized Fields?>
  14. MongoDB – JSON style
      game {
        player_one: PlayerID,
        player_two: PlayerID,
        moves: [ { token: <moveToken1>, ts: timestamp1 }, { token: <moveToken2>, ts: timestamp2 }, ... ],
        game_parameters: { param1 : v, param2 : v, ... },
        start_time: ts,
        end_time: ts,
        state: <state>, // in_play, mate, resign, draw
        winner: W || B || D
      }
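The document shape on slide 14 can be sketched as a plain Python dictionary. This is only an illustration: the field names follow the slide, while every value (player IDs, move tokens, timestamps) and the `add_move` helper are hypothetical. It shows how the moves that the relational model keeps in a separate GameMoves table are embedded in the game document itself.

```python
import json

# Field names follow the slide; all values are hypothetical sample data.
game = {
    "player_one": "player-17",
    "player_two": "player-42",
    "moves": [
        {"token": "e2e4", "ts": 1473811200},
        {"token": "e7e5", "ts": 1473811215},
    ],
    "game_parameters": {"param1": "v", "param2": "v"},
    "start_time": 1473811200,
    "end_time": None,       # game still running
    "state": "in_play",     # in_play, mate, resign, draw
    "winner": None,         # W, B, or D once the game ends
}


def add_move(doc, token, ts):
    """Append a move to the embedded list and return the new move count."""
    doc["moves"].append({"token": token, "ts": ts})
    return len(doc["moves"])


# The whole game, moves included, serializes as one JSON document.
game_json = json.dumps(game, indent=2)
```

Reading a game back needs no join: one document lookup returns the players, parameters, and the full move history.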
  15. MongoDB named a leader in The Forrester Wave™: Document Stores, Q3 2016. The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
  16. MongoDB – Deployment Best Practices
      Packages
      • Always use 64-bit builds for production. 32-bit builds support systems that have only 2 GB of memory
      • Use the latest version of MongoDB
      Networking
      • Limit exposure by using network rules that prevent access from unknown machines, systems, and networks
      Storage
      • When using the default WiredTiger storage engine, the use of the XFS file system is strongly recommended
      • Turn off atime and diratime when you mount the data volume
      • For improved performance, consider separating your data, journal, and logs onto separate storage devices
      • Assign swap space for your system
      • Use a NOOP scheduler for best performance
      Operating system
      • Raise file descriptor limits. The default limit of 1,024 open files on most systems won’t work for most production-scale workloads
      • Disable transparent huge pages. MongoDB performs better with standard (4,096-byte) virtual memory pages
      Visit: https://d0.awsstatic.com/whitepapers/AWS_NoSQL_MongoDB.pdf
  17. MongoDB – High Availability and Horizontal Scale Out (diagram: one AWS Region with Availability Zones A, B, and C; Shard A and Shard B each have a primary and two secondaries). You can also get an optimally configured cluster with MongoDB Atlas, the new database as a service for MongoDB available on AWS. Visit mongodb.com/atlas to learn more.
  18. Amazon DynamoDB: Run your business, not your database
  19. DynamoDB Benefits
      • Fully managed
      • Fast, consistent performance
      • Highly scalable
      • Flexible
      • Event-driven programming
      • Fine-grained access control
  20. Fully managed service = automated operations (diagram: DB hosted on-premises vs. DB hosted on Amazon EC2)
  21. Fully managed service = automated operations (diagram: DB hosted on-premises vs. DynamoDB)
  22. Consistently low latency at scale. PREDICTABLE PERFORMANCE!
  23. High availability and durability
      WRITES
      • Replicated continuously to 3 AZs
      • Persisted to disk (custom SSD)
      READS
      • Strongly or eventually consistent
      • No latency trade-off
      Designed to support 99.99% availability. Built for high durability.
  24. Customer use cases
  25. Amazon’s Path to DynamoDB (diagram: RDBMS to DynamoDB)
  26. Major League Baseball Fields Big Data, Excitement with Amazon DynamoDB
      MLBAM (MLB Advanced Media) is a full service solutions provider, operating a powerful content delivery platform.
      “For the first time, we can measure things we’ve never been able to measure before.” Joe Inzerillo, Executive Vice President and CTO, MLBAM
      • MLBAM can scale to support many games on a single day.
      • Amazon DynamoDB powers queries and supports the fast data retrieval required.
      • MLBAM distributes 25,000 live events annually and 10 million streams daily.
  27. Redfin Is Revolutionizing Home Buying and Selling with Amazon DynamoDB
      Redfin is a full-service real estate company with local agents and online tools to help people buy and sell homes.
      “We have billions of records on DynamoDB being refreshed daily or hourly or even by seconds.” Yong Huang, Director, Big Data Analytics, Redfin
      • Redfin provides property and agent details and ratings through its websites and apps.
      • With DynamoDB, latency for “similar” properties improved from 2 seconds to just 12 milliseconds.
      • Redfin stores and processes five billion items in DynamoDB.
  28. Expedia’s Real-time Analytics Application Uses Amazon DynamoDB
      Expedia is a leader in the $1 trillion travel industry, with an extensive portfolio that includes some of the world’s most trusted travel brands.
      “With DynamoDB we were up and running in less than a day, and there is no need for a team to maintain.” Kuldeep Chowhan, Principal Engineer, Expedia
      • Expedia’s real-time analytics application collects data for its “test & learn” experiments on Expedia sites.
      • The analytics application processes ~200 million messages daily.
      • Ease of setup, monitoring, and scaling were key factors in choosing Amazon DynamoDB.
  29. Nexon Scales Mobile Gaming with Amazon DynamoDB
      Nexon is a leading South Korean video game developer and a pioneer in the world of interactive entertainment.
      “By using AWS, we decreased our initial investment costs, and only pay for what we use.” Chunghoon Ryu, Department Manager, Nexon
      • Nexon used Amazon DynamoDB as its primary game database for a new blockbuster mobile game, HIT.
      • HIT became the #1 mobile game in Korea within the first day of launch and has more than 2M registered users.
      • Nexon’s HIT leverages DynamoDB to deliver steady latency of less than 10 ms and a fantastic mobile gaming experience for 170,000 concurrent players.
  30. Scaling high-velocity use cases with DynamoDB: Ad Tech, Gaming, Mobile, IoT, Web
  31. That sounds really good. How do I get started? Let’s create a table.
  32. Table: Products; key attribute: Product_Id
  33. DynamoDB table structure: a table holds items, and items hold attributes.
      Partition key
      • Mandatory
      • Key-value access pattern
      • Determines data distribution
      Sort key
      • Optional
      • Model 1:N relationships
      • Enables rich query capabilities
      Query capabilities: all items for a key; ==, <, >, >=, <=; “begins with”; “between”; “contains”; “in”; sorted results; counts; top/bottom N values
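The table structure on slides 32 and 33 can be sketched as the parameter dictionary that the DynamoDB `CreateTable` operation expects. The `Products` table and `Product_Id` key come from the slides; the string attribute type, the capacity values, and the `Created_At` sort key in the usage note are assumptions made for illustration. The block only builds the dictionary and does not call AWS.

```python
def table_definition(table_name, partition_key, sort_key=None,
                     read_units=5, write_units=5):
    """Build CreateTable parameters for a table with a mandatory partition key
    and an optional sort key.

    Assumptions: every key attribute is a string ("S"), and the capacity
    defaults are arbitrary placeholder values.
    """
    attribute_definitions = [
        {"AttributeName": partition_key, "AttributeType": "S"},
    ]
    key_schema = [
        {"AttributeName": partition_key, "KeyType": "HASH"},  # partition key
    ]
    if sort_key is not None:
        attribute_definitions.append(
            {"AttributeName": sort_key, "AttributeType": "S"}
        )
        key_schema.append(
            {"AttributeName": sort_key, "KeyType": "RANGE"}  # sort key
        )
    return {
        "TableName": table_name,
        "AttributeDefinitions": attribute_definitions,
        "KeySchema": key_schema,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Slide 32: a Products table keyed by Product_Id only.
products = table_definition("Products", "Product_Id")
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("dynamodb").create_table(**products)`. Passing a sort key, for example `table_definition("Products", "Product_Id", sort_key="Created_At")`, adds the optional `RANGE` entry.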
  34. Local secondary index (LSI): alternate sort key attribute; index is local to a partition key
      Table: A1 (partition), A2 (sort), A3, A4, A5
      LSIs:
      • KEYS_ONLY: A1 (partition), A3 (sort), A2 (item key)
      • INCLUDE A3: A1 (partition), A4 (sort), A2 (item key), A3 (projected)
      • ALL: A1 (partition), A5 (sort), A2 (item key), A3 (projected), A4 (projected)
      10 GB maximum per partition key; LSIs limit the number of range keys!
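The three projection types on slide 34 can be illustrated with a small in-memory model. This is not DynamoDB code: `project_item` is a hypothetical helper that computes which attributes of an item would be copied into an index, assuming the table keys are always carried along with the index keys, as the slide's diagram shows.

```python
def project_item(item, table_keys, index_keys, projection_type, include=()):
    """Return the attributes of `item` that an index with the given
    projection type would hold (in-memory illustration only)."""
    key_attrs = set(table_keys) | set(index_keys)
    if projection_type == "ALL":
        return dict(item)
    if projection_type == "KEYS_ONLY":
        wanted = key_attrs
    elif projection_type == "INCLUDE":
        wanted = key_attrs | set(include)
    else:
        raise ValueError(f"unknown projection type: {projection_type}")
    return {name: value for name, value in item.items() if name in wanted}


# Slide 34's table: A1 is the partition key, A2 the sort key.
item = {"A1": "p", "A2": "s", "A3": 3, "A4": 4, "A5": 5}
table_keys = ("A1", "A2")

keys_only = project_item(item, table_keys, ("A1", "A3"), "KEYS_ONLY")
include_a3 = project_item(item, table_keys, ("A1", "A4"), "INCLUDE", include=("A3",))
everything = project_item(item, table_keys, ("A1", "A5"), "ALL")
```

The narrower the projection, the less data the index stores, at the price of extra reads against the table for attributes the index does not hold.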
  35. Global secondary index (GSI): alternate partition and/or sort key; index is across all partition keys
      Table: A1 (partition), A2, A3, A4, A5
      GSIs:
      • INCLUDE A3: A5 (partition), A4 (sort), A1 (item key), A3 (projected)
      • ALL: A4 (partition), A5 (sort), A1 (item key), A2 (projected), A3 (projected)
      • KEYS_ONLY: A2 (partition), A1 (item key)
      Online indexing. Read capacity units (RCUs) and write capacity units (WCUs) are provisioned separately for GSIs.
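A GSI like the first one on slide 35 (partition key A5, sort key A4, projecting A3) could be described with the structure used in the `GlobalSecondaryIndexes` parameter of `CreateTable`. The index name and the capacity numbers below are assumptions; the point of the sketch is that each GSI carries its own `ProvisionedThroughput`, separate from the table's.

```python
def gsi_definition(index_name, partition_key, sort_key=None,
                   projection_type="ALL", include=(),
                   read_units=5, write_units=5):
    """Build one GlobalSecondaryIndexes entry (capacity values are placeholders)."""
    if projection_type not in ("ALL", "KEYS_ONLY", "INCLUDE"):
        raise ValueError(f"unknown projection type: {projection_type}")
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    projection = {"ProjectionType": projection_type}
    if projection_type == "INCLUDE":
        # Only INCLUDE projections list extra non-key attributes.
        projection["NonKeyAttributes"] = list(include)
    return {
        "IndexName": index_name,
        "KeySchema": key_schema,
        "Projection": projection,
        # Provisioned separately from the base table's throughput.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical index name; keys and projection follow slide 35.
a5_a4_index = gsi_definition("A5-A4-index", "A5", sort_key="A4",
                             projection_type="INCLUDE", include=("A3",))
```

Because the index has its own write capacity, it needs to be sized for the table's write rate as well as for its own reads.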
  36. How do GSI updates work? (diagram: a client writes to the table’s primary partitions; step 2, “Asynchronous update (in progress)”, propagates the write to the global secondary index). If GSIs don’t have enough write capacity, table writes will be throttled!
  37. LSI or GSI?
      • LSI can be modeled as a GSI
      • If data size in an item collection > 10 GB, use GSI
      • If eventual consistency is okay for your scenario, use GSI!
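The rule of thumb on slide 37 can be written as a small decision helper. This is only a restatement of the slide's bullets in code; the function name and its arguments are hypothetical, and a real design decision would weigh more factors than these two.

```python
LSI_ITEM_COLLECTION_LIMIT_GB = 10  # from the slide: 10 GB per partition key


def choose_index(item_collection_gb, eventual_consistency_ok):
    """Apply the slide's guidance: prefer a GSI unless the item collection
    fits within the LSI limit and the scenario cannot accept eventual
    consistency."""
    if item_collection_gb > LSI_ITEM_COLLECTION_LIMIT_GB:
        return "GSI"
    if eventual_consistency_ok:
        return "GSI"
    return "LSI"
```

For example, `choose_index(2, eventual_consistency_ok=False)` returns `"LSI"`, while any collection above 10 GB returns `"GSI"` regardless of the consistency requirement.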
  38. Advanced topics in DynamoDB
      • Design patterns and best practices
      • Data modeling
      • Understanding partitions
      • DynamoDB scaling
  39. To learn more, please attend: Deep Dive on Amazon DynamoDB, Sean Shriver, AWS NoSQL Solutions Architect
  40. Demo: Serverless Web Apps with Amazon DynamoDB, API Gateway, and AWS Lambda
  41. Simple serverless web application – use case
  42. Architecture of a simple serverless web application (diagram components: users, internet, S3 bucket, JavaScript, API Gateway, Lambda, DynamoDB, Identity & Access Management)
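The Lambda piece of the architecture on slide 42 might look roughly like the handler below. The deck does not show the demo code, so this is a sketch under assumptions: the request arrives in API Gateway's proxy format (a JSON string in `event["body"]`), items are keyed by `Product_Id` as on slide 32, and the table object only needs a `put_item(Item=...)` method. `InMemoryTable` is a hypothetical stand-in so the block runs without AWS.

```python
import json


class InMemoryTable:
    """Stand-in for a DynamoDB table object; stores items in a dict."""

    def __init__(self):
        self.items = {}

    def put_item(self, Item):
        self.items[Item["Product_Id"]] = Item


def _response(status, payload):
    """Shape a response the way an API Gateway proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def make_handler(table):
    """Return a Lambda-style handler that saves the posted item to `table`."""

    def handler(event, context):
        try:
            body = json.loads(event.get("body") or "{}")
        except ValueError:
            return _response(400, {"error": "request body must be JSON"})
        if not isinstance(body, dict) or "Product_Id" not in body:
            return _response(400, {"error": "Product_Id is required"})
        table.put_item(Item=body)
        return _response(201, {"saved": body["Product_Id"]})

    return handler


table = InMemoryTable()
handler = make_handler(table)
result = handler({"body": json.dumps({"Product_Id": "p-1", "Name": "Widget"})}, None)
```

In a deployed function, the stand-in could be replaced with `boto3.resource("dynamodb").Table("Products")`, which exposes the same `put_item(Item=...)` call; the IAM role attached to the function would need permission to write to that table.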
  47. Demo
  48. DynamoDB Pricing & Free Tier
      • Free Tier
        - 25 GB of storage
        - 25 reads per second
        - 25 writes per second
      • Pricing for additional usage in US East (N. Virginia)
        - $0.25 per GB per month
        - Write throughput: $0.0065 per hour for every 10 units of write capacity
        - Read throughput: $0.0065 per hour for every 50 units of read capacity
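The prices on slide 48 can be turned into a rough monthly estimate. The rates and free-tier amounts are taken from the slide (2016, US East); the 730-hour month, the proportional billing of partial capacity blocks, and the reading of "25 reads/writes per second" as 25 capacity units are assumptions made for this sketch.

```python
HOURS_PER_MONTH = 730            # assumption: average hours in a month

FREE_STORAGE_GB = 25             # free tier, from the slide
FREE_READ_UNITS = 25
FREE_WRITE_UNITS = 25

STORAGE_PER_GB_MONTH = 0.25      # USD, from the slide
WRITE_PER_10_UNITS_HOUR = 0.0065
READ_PER_50_UNITS_HOUR = 0.0065


def estimate_monthly_cost(storage_gb, read_units, write_units):
    """Estimate a monthly bill in USD after subtracting the free tier.

    Assumes capacity is billed proportionally rather than in whole
    blocks of 10 or 50 units.
    """
    billable_gb = max(0, storage_gb - FREE_STORAGE_GB)
    billable_reads = max(0, read_units - FREE_READ_UNITS)
    billable_writes = max(0, write_units - FREE_WRITE_UNITS)

    storage = billable_gb * STORAGE_PER_GB_MONTH
    reads = billable_reads / 50 * READ_PER_50_UNITS_HOUR * HOURS_PER_MONTH
    writes = billable_writes / 10 * WRITE_PER_10_UNITS_HOUR * HOURS_PER_MONTH
    return {
        "storage": storage,
        "reads": reads,
        "writes": writes,
        "total": storage + reads + writes,
    }


estimate = estimate_monthly_cost(storage_gb=100, read_units=200, write_units=100)
```

At these rates a write unit costs five times as much as a read unit, so the write side tends to dominate the estimate for write-heavy workloads.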
  49. Resources
      • Padma Malligarjunan | pmalli@amazon.com
      • Amazon DynamoDB: https://aws.amazon.com/dynamodb/
      • NoSQL on AWS: https://aws.amazon.com/nosql/document/
      • Upcoming session: Deep Dive on Amazon DynamoDB
  50. aws.amazon.com/activate: Everything and Anything Startups Need to Get Started on AWS
