• Get an overview of managed database services available on AWS
• Learn how to combine them for high-performance, cost-effective architectures
• Learn how to choose between the AWS database services based on your use case
On AWS you can choose from a variety of managed database services that save effort, save time, and unlock new capabilities and economies. In this session, we make it easy to understand how they differ, what they have in common, and how to choose one or more. We'll explain the fundamentals of Amazon RDS, a managed relational database service in the cloud; Amazon DynamoDB, a fully managed NoSQL database service; Amazon ElastiCache, a fast, in-memory caching service in the cloud; and Amazon Redshift, a fully managed, petabyte-scale data-warehouse solution that can be economical. We will cover how each service might help support your application and how to get started.
2. Today’s agenda
• Why managed database services?
• A non-relational managed database
• A relational managed database
• A managed in-memory cache
• A managed data warehouse
• What to do next
4. Options for running your database
• Self-managed: You are responsible for the hardware, OS, security, updates, backups, replication, and so on, but you have full control. This is typically on premises.
• EC2 instances: You only need to focus on database-level updates, patches, replication, backups, and so on, and don’t have to worry about the hardware and underlying infrastructure.
• Fully managed: You get features such as backup and replication as a packaged service and don’t have to bother with patching and updates.
6. A managed service for each major DB type
• Amazon DynamoDB: document and key-value store
• Amazon RDS: SQL database engines
• Amazon ElastiCache: in-memory key-value store
• Amazon Redshift: data warehouse
12. Amazon RDS is simple and fast to scale
• Database instance types offer a range of CPU and memory selections
• Scale up or down among instance types on demand
• Database storage is scalable on demand
13. Amazon RDS offers fast, predictable storage
• General Purpose (SSD) for most workloads
• Provisioned IOPS (SSD) for OLTP workloads, up to 30,000 IOPS
• Magnetic for small workloads with infrequent access
14. High-availability Multi-AZ deployments
Enterprise-grade fault-tolerance solution for production databases
15. Choose Read Replicas for greater scalability
• Bring data close to your customers’ applications in different regions
• Relieve pressure on your master node, which supports both reads and writes
• Promote a read replica to a master for faster recovery in the event of disaster
16. Choose cross-region replication for enhanced data locality, even more ease of migration
• Even faster recovery in the event of disaster
• Bring data close to your customers
• Promote to a master for easy migration
17. Choose cross-region snapshot copy for even greater durability, ease of migration
• Copy a database snapshot to a different AWS region
• Warm standby for disaster recovery
• Base for migration to a different region
18. How Amazon RDS backups work
Automated backups
• Restore your database to a point in time
• Enabled by default
• Choose a retention period, up to 35 days
Manual snapshots
• Build a new database instance from a snapshot when needed
• Initiated by you
• Persist until you delete them
• Stored in Amazon S3
19. You pay for the resources that you use
Monthly bill = N × duration for which DB instances were used (price depends on type of DB instance) + GB of storage consumed (price depends on type of storage)
Free tier (for first 12 months)
• 750 micro DB instance hours
• 20 GB of DB storage
• 20 GB for backups
• 10 million I/O operations
Further details at http://aws.amazon.com/rds/pricing/
23. Amazon DynamoDB: a managed document and key-value store
• Simple and fast to deploy
• Simple and fast to scale, to millions of IOPS
• Data is automatically replicated
• Fast, predictable performance, backed by SSD storage
• Secondary indexes offer fast lookups
• No cost to get started; pay only for what you consume
24. Popular use cases
• Ad Tech: ad serving, retargeting, ID lookup, user profile management, session tracking, RTB
• IoT: tracking state, metadata, and readings from millions of devices; real-time notifications
• Gaming: recording game details, leaderboards, session information, usage history, and logs
• Mobile & Web: storing user profiles, session details, personalization settings, entity-specific metadata
25. Automatic replication for rock-solid durability and availability
Writes
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)
Reads
• Strongly or eventually consistent
• No latency trade-off
26. Amazon DynamoDB is a schemaless database
A table contains items; each item is made up of attributes (name-value pairs).
27. Each item must include a key
• Hash key (DynamoDB maintains an unordered index)
28. Each item must include a key
• Hash key
• Range key (DynamoDB maintains a sorted index)
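The difference between the two key schemas can be sketched with a toy in-memory model. This is only an illustration and not how DynamoDB is implemented; the class `ToyTable` and its methods are hypothetical names. Items are grouped by hash key (unordered) and, within one hash key, kept sorted by range key so that a range of keys can be read in order.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """Toy model of a table with a hash key and a range key (illustrative only)."""

    def __init__(self):
        # hash key -> sorted list of range keys, with a parallel list of items
        self._range_keys = defaultdict(list)
        self._items = defaultdict(list)

    def put(self, hash_key, range_key, attributes):
        keys = self._range_keys[hash_key]
        pos = bisect.bisect_left(keys, range_key)
        if pos < len(keys) and keys[pos] == range_key:
            # Same hash key and range key: overwrite the existing item
            self._items[hash_key][pos] = attributes
        else:
            keys.insert(pos, range_key)
            self._items[hash_key].insert(pos, attributes)

    def query(self, hash_key, low, high):
        """Return items under hash_key whose range key lies in [low, high]."""
        keys = self._range_keys.get(hash_key, [])
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return self._items.get(hash_key, [])[start:end]


table = ToyTable()
table.put("player-1", 3, {"score": 70})
table.put("player-1", 1, {"score": 50})
table.put("player-1", 2, {"score": 60})
table.put("player-2", 1, {"score": 10})

# Items come back ordered by range key, regardless of insertion order
results = table.query("player-1", 2, 3)
```

With a hash key alone, only exact-key lookups are natural; the sorted range key is what makes the range query cheap.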
30. Global secondary indexes = “pivot charts” for your table
Choose which attributes to project (if any)
31. Define the desired performance using provisioned throughput
• Read capacity units
• Write capacity units
• 1 request per second adds up to more than 2.5 million requests in a month (about 2.59 million over 30 days)
32. DynamoDB: What are capacity units?
• One read capacity unit: one strongly consistent read per second up to 4 KB, or two eventually consistent reads per second
• One write capacity unit: one write per second up to 1 KB
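The definitions above can be turned into a rough sizing calculation. The sketch below assumes that an item larger than the unit size (4 KB for reads, 1 KB for writes) consumes additional whole units, rounded up; that rounding rule and the function names are assumptions for illustration, not taken from the slides.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """Estimate read capacity units for a given item size and read rate."""
    # Assumption: each read consumes one unit per started 4 KB of item size
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # One unit covers two eventually consistent reads per second
        total = total / 2
    return math.ceil(total)


def write_capacity_units(item_size_kb, writes_per_second):
    """Estimate write capacity units for a given item size and write rate."""
    # Assumption: each write consumes one unit per started 1 KB of item size
    return math.ceil(item_size_kb) * writes_per_second


strong_reads = read_capacity_units(3, 80)                               # 3 KB items, 80 reads/s
eventual_reads = read_capacity_units(3, 80, strongly_consistent=False)  # same load, relaxed consistency
large_item_reads = read_capacity_units(6, 10)                           # 6 KB items span two 4 KB units
writes = write_capacity_units(1.5, 10)                                  # 1.5 KB items span two 1 KB units
```

Eventually consistent reads halve the read capacity needed for the same request rate, which is why the choice of consistency affects cost.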
33. Simple app architecture with Amazon DynamoDB
Clients → Elastic Load Balancing → Amazon EC2 app instances (business logic) → DynamoDB
34. How DynamoDB billing works
Monthly bill = charge for storage consumed in GB (plus 100 bytes per item) + charge for write capacity units per hour + charge for read capacity units per hour
≈ 5 GB * $0.25 + 21 * 720 hrs * $0.0065/10 + 35 * 720 hrs * $0.0065/50
≈ $14.36
Assumes the database is accessed only from within the AWS region
Further details at http://aws.amazon.com/dynamodb/pricing/
35. How DynamoDB billing works (with free tier)
Monthly bill = charge for storage consumed in GB (plus 100 bytes per item) + charge for write capacity units per hour + charge for read capacity units per hour
≈ (5 − 25) GB * $0.25 + (21 − 25) * 720 hrs * $0.0065/10 + (35 − 25) * 720 hrs * $0.0065/50
(usage below the free-tier allowance is billed as zero)
Free tier (for first 12 months)
• 25 GB Storage
• 25 Units Write Capacity
• 25 Units Read Capacity
Assumes the database is accessed only from within the AWS region
Further details at http://aws.amazon.com/dynamodb/pricing/
36. How DynamoDB billing works (with free tier)
Monthly bill = charge for storage consumed in GB (plus 100 bytes per item) + charge for write capacity units per hour + charge for read capacity units per hour
≈ 0 + 0 + 10 * 720 hrs * $0.0065/50
≈ $0.94
Assumes the database is accessed only from within the AWS region
Further details at http://aws.amazon.com/dynamodb/pricing/
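The billing arithmetic on these slides can be reproduced with a short sketch. It uses the prices shown on the slides ($0.25 per GB, $0.0065 per hour per 10 write units, $0.0065 per hour per 50 read units, 720 hours per month); current prices may differ. The function name is hypothetical, and the per-item 100-byte overhead is ignored for simplicity.

```python
HOURS_PER_MONTH = 720
STORAGE_PRICE_PER_GB = 0.25            # price used on the slides
PRICE_PER_10_WRITE_UNITS_HOUR = 0.0065  # price used on the slides
PRICE_PER_50_READ_UNITS_HOUR = 0.0065   # price used on the slides


def monthly_bill(storage_gb, write_units, read_units,
                 free_storage_gb=0, free_write_units=0, free_read_units=0):
    """Estimate the monthly bill; free-tier allowances are subtracted first."""
    # Usage below the free-tier allowance is billed as zero, never negative
    billable_storage = max(storage_gb - free_storage_gb, 0)
    billable_writes = max(write_units - free_write_units, 0)
    billable_reads = max(read_units - free_read_units, 0)

    storage_cost = billable_storage * STORAGE_PRICE_PER_GB
    write_cost = billable_writes * HOURS_PER_MONTH * PRICE_PER_10_WRITE_UNITS_HOUR / 10
    read_cost = billable_reads * HOURS_PER_MONTH * PRICE_PER_50_READ_UNITS_HOUR / 50
    return storage_cost + write_cost + read_cost


# 5 GB of storage, 21 write capacity units, 35 read capacity units
without_free_tier = monthly_bill(5, 21, 35)
with_free_tier = monthly_bill(5, 21, 35,
                              free_storage_gb=25,
                              free_write_units=25,
                              free_read_units=25)
```

Without the free tier the sum is 1.25 + 9.828 + 3.276 = 14.354; the slide's $14.36 corresponds to rounding each term to the cent before adding. With the free tier, only 10 read capacity units are billable, giving 0.936, or about $0.94.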
38. NoSQL vs. SQL for a new app: how to choose?
SQL
• Strong schema, complex relationships, transactions, and joins
• Scaling is difficult
• Focus on consistency over scale and availability
NoSQL
• Schema-less, easy reads and writes, simple data model
• Scaling is easy
• Focus on performance and availability at any scale
40. Amazon Redshift: a lot faster, a lot cheaper, a whole lot simpler
• Relational data warehouse
• Massively parallel; petabyte scale
• Fully managed
• HDD and SSD platforms
• $1,000/TB/year; starts at $0.25/hour
41. Who uses Amazon Redshift?
Traditional enterprise DW
• Reduce costs by extending DW rather than adding HW
• Migrate completely from existing DW systems
• Respond faster to business; provision in minutes
Companies with big data
• Improve performance by an order of magnitude
• Make more data available for analysis
• Access business data via standard reporting tools
SaaS companies
• Add analytic functionality to applications
• Scale DW capacity as demand grows
• Reduce HW and SW costs by an order of magnitude
42. Amazon Redshift architecture
Leader node
• Simple SQL endpoint
• Stores metadata
• Optimizes query plan
• Coordinates query execution
Compute nodes
• Local columnar storage
• Parallel/distributed execution of all queries, loads, backups, restores, resizes
Start at just $0.25/hour, grow to 2 PB (compressed)
• DC1: SSD; scale 160 GB–326 TB
• DS2: HDD; scale 2 TB–2 PB
Diagram labels: JDBC/ODBC; 10 GigE (HPC); ingestion, backup, restore
43. Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
With row storage, you do unnecessary I/O: to get the total amount, you have to read everything.
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
44. Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
With column storage, you only read the data you need.
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
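The contrast between row and column storage can be sketched using the four-row table above. This is a simplified in-memory illustration, not Redshift's storage format; the function names are hypothetical, and "values read" stands in for I/O.

```python
# The example table from the slides: (ID, Age, State, Amount)
rows = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]


def total_amount_row_store(rows):
    """Sum Amount from row storage; every value of every row is read."""
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)  # the whole row comes off storage together
        total += row[3]
    return total, values_read


# The same data laid out column by column
columns = {
    "ID": [r[0] for r in rows],
    "Age": [r[1] for r in rows],
    "State": [r[2] for r in rows],
    "Amount": [r[3] for r in rows],
}


def total_amount_column_store(columns):
    """Sum Amount from column storage; only the Amount column is read."""
    amounts = columns["Amount"]
    return sum(amounts), len(amounts)


row_total, row_values_read = total_amount_row_store(rows)
col_total, col_values_read = total_amount_column_store(columns)
```

Both layouts give the same total of 1250, but the row layout touches 16 values while the column layout touches 4.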
46. Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Track the minimum and maximum value for each block
• Skip over blocks that don’t contain relevant data
Example blocks:
• 10 | 13 | 14 | 26 | … | 100 | 245 | 324 (min 10, max 324)
• 375 | 393 | 417 | … | 512 | 549 | 623 (min 375, max 623)
• 637 | 712 | 809 | … | 834 | 921 | 959 (min 637, max 959)
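A minimal sketch of the zone-map idea, using only the values visible in the example blocks (the elided middle values are left out). The function names are hypothetical, and real zone maps operate on storage blocks rather than Python lists.

```python
# Values visible in the slide's three blocks; elided values are omitted
blocks = [
    [10, 13, 14, 26, 100, 245, 324],
    [375, 393, 417, 512, 549, 623],
    [637, 712, 809, 834, 921, 959],
]


def build_zone_map(blocks):
    """Record the (minimum, maximum) value of each block."""
    return [(min(block), max(block)) for block in blocks]


def scan_range(blocks, zone_map, low, high):
    """Return values in [low, high], reading only blocks that may contain them."""
    matches = []
    blocks_read = 0
    for block, (block_min, block_max) in zip(blocks, zone_map):
        if block_max < low or block_min > high:
            continue  # the block cannot contain relevant data, so skip it
        blocks_read += 1
        matches.extend(value for value in block if low <= value <= high)
    return matches, blocks_read


zone_map = build_zone_map(blocks)
matches, blocks_read = scan_range(blocks, zone_map, 400, 600)
```

For the range 400 to 600, the first block (max 324) and the third block (min 637) are skipped, so only one of the three blocks is read.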
47. Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
DW.HS1.8XL:
• > 2 GB/sec scan rate
• Optimized for data processing
• High disk density
DW.HS1.XL:
48. Fully managed, continuous/incremental backups
• Multiple copies within cluster
• Continuous and incremental backups to Amazon S3
• Continuous and incremental backups across regions
• Streaming restore
Diagram labels: Amazon S3 (Region 1), Amazon S3 (Region 2)
49. Amazon Redshift offers rock-solid fault tolerance
• Disk failures
• Node failures
• Network failure
• AZ/region level disasters
Diagram labels: Amazon S3 (Region 1), Amazon S3 (Region 2)
50. You pay for what you use
Monthly bill = N × duration for which the nodes were used, where N is the number of nodes (price depends on type of node)
Free Tier (2 month free trial)
• 750 DC1.Large hours per month
Further details at https://aws.amazon.com/redshift/pricing/
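As a rough sketch of this formula, the function below multiplies node count by hours used and an hourly node price. The $0.25/hour figure is the starting price quoted in this deck; treating it as the price of the node type in use, and subtracting free-trial hours from total node hours, are assumptions made for illustration. The function name is hypothetical.

```python
def redshift_monthly_bill(num_nodes, hours_used, price_per_node_hour, free_node_hours=0):
    """Estimate the monthly bill as billable node hours times the hourly price."""
    node_hours = num_nodes * hours_used
    # Assumption: free-trial hours reduce total node hours, never below zero
    billable_hours = max(node_hours - free_node_hours, 0)
    return billable_hours * price_per_node_hour


# Two nodes running for a 720-hour month at the deck's starting price
full_price = redshift_monthly_bill(2, 720, 0.25)
# Same usage with the 750 free-trial hours applied
with_free_trial = redshift_monthly_bill(2, 720, 0.25, free_node_hours=750)
```

Under these assumptions the two-node cluster costs 1,440 node hours × $0.25 = $360 per month, or $172.50 during the free trial.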
51. Redshift has a large ecosystem
• Data Integration
• Business Intelligence
• Systems Integrators
55. Popular use cases
• Caching layer for performance or cost optimization of an underlying database
• Storage of ephemeral key-value data
• High-performance application patterns such as leaderboards (for gaming users), session management, event counters, in-memory lists
59. How ElastiCache billing works
Monthly bill = N × duration for which the nodes were used, where N is the number of nodes (price depends on type of node)
Free tier (for first 12 months)
• 750 micro cache node hours
Further details at http://aws.amazon.com/elasticache/pricing/