The document discusses database choices and migrations at Deliveroo. It describes moving from PostgreSQL 9.6 to 10 using AWS Database Migration Service (DMS) with minimal downtime. It also discusses using DynamoDB for high-throughput services like order status because of its speed, reliability, and scalability. Key learnings included creating schemas properly, using replication slots, and following DynamoDB best practices such as avoiding scans.
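As a concrete illustration of the scan-avoidance point, here is a minimal boto3 sketch; the table, key, and attribute names are hypothetical rather than taken from the Deliveroo setup.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Hypothetical order-status table with partition key "order_id".
table = boto3.resource("dynamodb").Table("orders")

# A Query touches only one partition, so its cost scales with the
# items returned rather than with the size of the table.
response = table.query(KeyConditionExpression=Key("order_id").eq("order-123"))
for item in response["Items"]:
    print(item.get("status"))

# table.scan(), by contrast, reads every item in the table and burns
# read capacity, which is why scans are avoided on hot tables.
```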
The document discusses different AWS database services and when to use each one. It provides overviews of Amazon Aurora, Amazon RDS, DynamoDB, DocumentDB, ElastiCache, Neptune, Timestream, and QLDB. It describes their common use cases, features, and benefits including performance, scalability, availability and security. It also briefly mentions AWS Database Migration Service and provides an example retail application using several AWS databases.
Databricks CEO Ali Ghodsi introduces Databricks Delta, a new data management system that combines the scale and cost-efficiency of a data lake, the performance and reliability of a data warehouse, and the low latency of streaming.
A closer look at the MySQL and PostgreSQL compatible relational database built for the cloud that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. We’ll explore how Aurora uses the AWS cloud to provide high reliability, high durability, and high throughput.
Speakers:
Steve Abraham - Principal Database Specialist Solutions Architect, AWS
Peter Dachnowicz - Sr. Technical Account Manager, AWS
Big data architectures and the data lake - James Serra
The document provides an overview of big data architectures and the data lake concept. It discusses why organizations are adopting data lakes to handle increasing data volumes and varieties. The key aspects covered include:
- Defining top-down and bottom-up approaches to data management
- Explaining what a data lake is and how Hadoop can function as the data lake
- Describing how a modern data warehouse combines features of a traditional data warehouse and data lake
- Discussing how federated querying allows data to be accessed across multiple sources
- Highlighting benefits of implementing big data solutions in the cloud
- Comparing shared-nothing, massively parallel processing (MPP) architectures to symmetric multi-processing (SMP) architectures
The document discusses migrating a data warehouse to the Databricks Lakehouse Platform. It outlines why legacy data warehouses are struggling, how the Databricks Platform addresses these issues, and key considerations for modern analytics and data warehousing. The document then provides an overview of the migration methodology, approach, strategies, and key takeaways for moving to a lakehouse on Databricks.
This document is a training presentation on Databricks fundamentals and the data lakehouse concept by Dalibor Wijas from November 2022. It introduces Wijas and his experience. It then discusses what Databricks is, why it is needed, what a data lakehouse is, how Databricks enables the data lakehouse concept using Apache Spark and Delta Lake. It also covers how Databricks supports data engineering, data warehousing, and offers tools for data ingestion, transformation, pipelines and more.
Amazon QuickSight is a fast, cloud-powered business intelligence (BI) service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data. In this session, we demonstrate how you can point Amazon QuickSight to AWS data stores, flat files, or other third-party data sources and begin visualizing your data in minutes. We also introduce SPICE - a new Super-fast, Parallel, In-memory Calculation Engine in Amazon QuickSight, which performs advanced calculations and renders visualizations rapidly without requiring any additional infrastructure, SQL programming, or dimensional modeling, so you can seamlessly scale to hundreds of thousands of users and petabytes of data. Lastly, you will see how Amazon QuickSight provides smart visualizations and graphs optimized for your different data types, ensuring the most suitable visualization for your analysis, and how to share these visualization stories using the built-in collaboration tools.
Presented by: Matthew McClean, AWS Partner Solutions Architect, Amazon Web Services
Learning Objectives:
- Learn the common use-cases for using Athena, AWS' interactive query service on S3
- Learn best practices for creating tables and partitions and performance optimizations (a short sketch follows this list)
- Learn how Athena handles security, authorization, and authentication
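To make the table-and-partition objective concrete, here is a hedged sketch that submits partitioned-table DDL through boto3; the bucket, table, and column names are illustrative, not from the session.

```python
import boto3

athena = boto3.client("athena")

# Illustrative names; partitioning on a date column and storing Parquet
# are the usual Athena performance recommendations.
ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS access_logs (
    request_id string,
    status int
)
PARTITIONED BY (dt string)
STORED AS PARQUET
LOCATION 's3://my-example-bucket/access-logs/'
"""

athena.start_query_execution(
    QueryString=ddl,
    ResultConfiguration={"OutputLocation": "s3://my-example-bucket/athena-results/"},
)
# New partitions still need to be registered, e.g. by running
# "MSCK REPAIR TABLE access_logs" or adding them explicitly.
```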
Data lineage and observability with Marquez - subsurface 2020 - Julien Le Dem
This document discusses Marquez, an open source metadata management system. It provides an overview of Marquez and how it can be used to track metadata in data pipelines. Specifically:
- Marquez collects and stores metadata about data sources, datasets, jobs, and runs to provide data lineage and observability.
- It has a modular framework to support data governance, data lineage, and data discovery. Metadata can be collected via REST APIs or language SDKs (see the sketch after this list).
- Marquez integrates with Apache Airflow to collect task-level metadata, dependencies between DAGs, and link tasks to code versions. This enables understanding of operational dependencies and troubleshooting.
- The Marquez community aims to build an open
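For a feel of the REST surface mentioned above, a minimal read against a local Marquez instance might look like the following; the port and path reflect Marquez defaults but should be treated as assumptions and checked against the docs for your version.

```python
import requests

MARQUEZ_URL = "http://localhost:5000"  # default Marquez HTTP port; adjust as needed

# Listing namespaces is one of the simplest reads in the Marquez API.
response = requests.get(f"{MARQUEZ_URL}/api/v1/namespaces")
response.raise_for_status()
for namespace in response.json().get("namespaces", []):
    print(namespace["name"])
```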
Building a Logical Data Fabric using Data Virtualization (ASEAN) - Denodo
Watch full webinar here: https://bit.ly/3FF1ubd
In the recent Building the Unified Data Warehouse and Data Lake report by leading industry analysts TDWI, 64% of organizations stated that the objective of a unified Data Warehouse and Data Lake is to get more business value, and 84% of organizations polled felt that a unified approach to Data Warehouses and Data Lakes was either extremely or moderately important.
In this session, you will learn how your organization can apply a logical data fabric, and how the associated technologies of machine learning, artificial intelligence, and data virtualization can reduce time to value, increasing the overall business value of your data assets.
KEY TAKEAWAYS:
- How a Logical Data Fabric is the right approach to assist organizations to unify their data.
- The advanced features of a Logical Data Fabric that assist with the democratization of data, providing an agile and governed approach to business analytics and data science.
- How a Logical Data Fabric with Data Virtualization enhances your legacy data integration landscape to simplify data access and encourage self-service.
The document provides an overview of the Databricks platform, which offers a unified environment for data engineering, analytics, and AI. It describes how Databricks addresses the complexity of managing data across siloed systems by providing a single "data lakehouse" platform where all data and analytics workloads can be run. Key features highlighted include Delta Lake for ACID transactions on data lakes, auto loader for streaming data ingestion, notebooks for interactive coding, and governance tools to securely share and catalog data and models.
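To ground the Auto Loader mention, here is a minimal sketch of the streaming-ingestion pattern as typically written in a Databricks notebook (where `spark` is predefined); the paths and table name are illustrative.

```python
# Auto Loader: the "cloudFiles" source incrementally discovers new files.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/tmp/schemas/events")
    .load("s3://my-example-bucket/raw/events/")
)

# Write the stream into a Delta table, checkpointing progress so the
# job can resume where it left off.
(
    stream.writeStream
    .option("checkpointLocation", "/tmp/checkpoints/events")
    .trigger(availableNow=True)
    .toTable("bronze.events")
)
```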
Delta Lake is an open source storage layer that sits on top of data lakes and brings ACID transactions and reliability to Apache Spark. It addresses challenges with data lakes like lack of schema enforcement and transactions. Delta Lake provides features like ACID transactions, scalable metadata handling, schema enforcement and evolution, time travel/data versioning, and unified batch and streaming processing. Delta Lake stores data in Apache Parquet format and uses a transaction log to track changes and ensure consistency even for large datasets. It allows for updates, deletes, and merges while enforcing schemas during writes.
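A short sketch of two of those features, schema enforcement and time travel, using the open source delta-spark package; the paths are illustrative and this is not code from the document.

```python
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

builder = (
    SparkSession.builder.appName("delta-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

spark.range(5).write.format("delta").save("/tmp/delta/numbers")

# Schema enforcement: appending a frame with a different schema fails.
# spark.createDataFrame([("a",)], ["name"]).write.format("delta") \
#     .mode("append").save("/tmp/delta/numbers")  # raises AnalysisException

# Time travel: read the table as of an earlier version in the log.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/numbers")
v0.show()
```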
An overview of the Amazon ElastiCache managed service, with examples of how it can be used to increase performance, lower costs and augment other database services and databases to make things faster, easier and less expensive.
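The classic way ElastiCache lowers cost and latency is the cache-aside pattern; below is a minimal sketch with the redis-py client, where the endpoint and the database call are placeholders.

```python
import json
import redis

# Placeholder endpoint; point this at your ElastiCache for Redis node.
cache = redis.Redis(host="my-cache.example.cache.amazonaws.com", port=6379)

def get_product(product_id, ttl_seconds=300):
    """Cache-aside: serve from Redis when possible, fall back to the DB."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    product = load_product_from_db(product_id)  # hypothetical database call
    cache.setex(key, ttl_seconds, json.dumps(product))  # expire stale data
    return product
```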
Data Warehouse or Data Lake, Which Do I Choose? - DATAVERSITY
Today’s data-driven companies have a choice to make – where do we store our data? As the move to the cloud continues to be a driving factor, the choice becomes either the data warehouse (Snowflake et al.) or the data lake (AWS S3 et al.). There are pros and cons to each approach. While data warehouses give you strong data management with analytics, they don’t do well with semi-structured and unstructured data, they tightly couple storage and compute, and they bring expensive vendor lock-in. On the other hand, data lakes allow you to store all kinds of data and are extremely affordable, but they’re only meant for storage and by themselves provide no direct value to an organization.
Enter the Open Data Lakehouse, the next evolution of the data stack that gives you the openness and flexibility of the data lake with the key aspects of the data warehouse like management and transaction support.
In this webinar, you’ll hear from Ali LeClerc who will discuss the data landscape and why many companies are moving to an open data lakehouse. Ali will share more perspective on how you should think about what fits best based on your use case and workloads, and how some real world customers are using Presto, a SQL query engine, to bring analytics to the data lakehouse.
Modernizing to a Cloud Data Architecture - Databricks
Organizations with on-premises Hadoop infrastructure are bogged down by system complexity, unscalable infrastructure, and the increasing burden on DevOps to manage legacy architectures. Costs and resource utilization continue to go up while innovation has flatlined. In this session, you will learn why, now more than ever, enterprises are looking for cloud alternatives to Hadoop and are migrating off of the architecture in large numbers. You will also learn how elastic compute models’ benefits help one customer scale their analytics and AI workloads and best practices from their experience on a successful migration of their data and workloads to the cloud.
Should I move my database to the cloud? - James Serra
So you have been running on-prem SQL Server for a while now. Maybe you have taken the step to move it from bare metal to a VM, and have seen some nice benefits. Ready to see a TON more benefits? If you said “YES!”, then this is the session for you as I will go over the many benefits gained by moving your on-prem SQL Server to an Azure VM (IaaS). Then I will really blow your mind by showing you even more benefits by moving to Azure SQL Database (PaaS/DBaaS). And for those of you with a large data warehouse, I’ve also got you covered with Azure SQL Data Warehouse. Along the way I will talk about the many hybrid approaches so you can take a gradual approach to moving to the cloud. If you are interested in cost savings, additional features, ease of use, quick scaling, improved reliability and ending the days of upgrading hardware, this is the session for you!
Making the move to a document database can be intimidating. Yes, its flexible data model gives you a lot of choices, but it also raises questions: Which way is the right way? Is a document database even the right tool? Join this live session on the basics of data modeling with JSON to learn:
- How a document database compares to a traditional RDBMS
- What JSON data modeling means for your application code
- Which tools might be helpful along the way
The examples in this session use the free, open-source Couchbase Server document database, but the principles you’ll learn can also be applied to Cosmos DB, MongoDB, RavenDB, and others.
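As a taste of what JSON modeling means in practice, here is an illustrative order document: data that would span several joined tables in an RDBMS is embedded and retrieved in one key lookup. The field names are invented for the example.

```python
# In a relational design this order would span orders, order_items,
# and addresses tables; as a document, the related data is embedded.
order_document = {
    "type": "order",
    "id": "order::1001",
    "customer": {"name": "Ada Lovelace", "email": "ada@example.com"},
    "ship_to": {"street": "1 Analytical Way", "city": "London"},
    "items": [
        {"sku": "SKU-1", "qty": 2, "price": 9.99},
        {"sku": "SKU-7", "qty": 1, "price": 24.50},
    ],
}

# With the Couchbase Python SDK the write is roughly:
#   collection.upsert(order_document["id"], order_document)
```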
DI&A Slides: Data Lake vs. Data Warehouse - DATAVERSITY
Modern data analysis is moving beyond the Data Warehouse to the Data Lake where analysts are able to take advantage of emerging technologies to manage complex analytics on large data volumes and diverse data types. Yet, for some business problems, a Data Warehouse may still be the right solution.
If you’re on the fence, join this webinar as we compare and contrast Data Lakes and Data Warehouses, identifying situations where one approach may be better than the other and highlighting how the two can work together.
Get tips, takeaways and best practices about:
- The benefits and problems of a Data Warehouse
- How a Data Lake can solve the problems of a Data Warehouse
- Data Lake Architecture
- How Data Warehouses and Data Lakes can work together
Learn about the new AWS Database Migration Service, which helps you migrate databases with minimal downtime from on-premises and Amazon EC2 environments to Amazon RDS, Amazon Redshift, Amazon Aurora and EC2 databases. We discuss homogeneous (e.g. Oracle-to-Oracle, PostgreSQL-to-PostgreSQL, etc.) and heterogeneous (e.g. Oracle to Aurora, SQL Server to MariaDB) database migrations. We also talk about the new AWS Schema Conversion Tool that saves you development time when migrating your Oracle and SQL Server database schemas, including PL/SQL and T-SQL procedural code, to their MySQL, MariaDB and Aurora equivalents.
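A hedged boto3 sketch of the kind of task DMS runs for a minimal-downtime migration; the ARNs are placeholders, and the endpoints and replication instance are assumed to exist already.

```python
import json
import boto3

dms = boto3.client("dms")

# "full-load-and-cdc" copies the existing data, then keeps applying
# ongoing changes, which is what keeps downtime minimal.
dms.create_replication_task(
    ReplicationTaskIdentifier="orders-migration",
    SourceEndpointArn="arn:aws:dms:REGION:ACCOUNT:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:REGION:ACCOUNT:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:REGION:ACCOUNT:rep:INSTANCE",
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-public-schema",
            "object-locator": {"schema-name": "public", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)
```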
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data Architecture - DATAVERSITY
Whether to take data ingestion cycles off the ETL tool and the Data Warehouse or to facilitate competitive Data Science and building algorithms in the organization, the Data Lake — a place for unmodeled and vast data — will be provisioned widely in 2019.
Though it doesn’t have to be complicated, the Data Lake has a few key design points that are critical, and it does need to follow some principles for success. Avoid building the Data Swamp, but not the Data Lake! The tool ecosystem is building up around the Data Lake and soon many will have a robust Lake and Data Warehouse. We will discuss policy to keep them straight, send “horses to courses,” and keep up users’ confidence in the Data Platforms.
As for platform, although Hadoop received the early majority of Data Lakes, organizations are now weighing in that the Data Lake will be built in Cloud object storage. We’ll discuss these options as well.
Get this data point for your Data Lake journey.
This document provides an overview of AWS Lake Formation and related services for building a secure data lake. It discusses how Lake Formation provides a centralized management layer for data ingestion, cleaning, security and access. It also describes how Lake Formation integrates with services like AWS Glue, Amazon S3 and ML transforms to simplify and automate many data lake tasks. Finally, it provides an example workflow for using Lake Formation to deduplicate data from various sources and grant secure access for analysis.
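To make the centralized-access idea concrete, here is a minimal boto3 sketch of a Lake Formation grant; the account ID, role, database, and table names are placeholders.

```python
import boto3

lakeformation = boto3.client("lakeformation")

# Grant a hypothetical analyst role SELECT on one Glue Data Catalog
# table, so access is managed centrally instead of via S3 policies.
lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/analyst"
    },
    Resource={"Table": {"DatabaseName": "sales", "Name": "orders"}},
    Permissions=["SELECT"],
)
```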
Databricks on AWS provides a unified analytics platform using Apache Spark. It allows companies to unify their data science, engineering, and business teams on one platform. Databricks accelerates innovation across the big data and machine learning lifecycle. It uniquely combines data and AI technologies on Apache Spark. Enterprises face challenges beyond just Apache Spark, including having data scientists and engineers in separate silos with complex data pipelines and infrastructure. Azure Databricks provides a fast, easy, and collaborative Apache Spark-based analytics platform on Azure that is optimized for the cloud. It offers the benefits of Databricks and Microsoft with one-click setup, a collaborative workspace, and native integration with Azure services. Over 500 customers participated in the
Building Modern Data Platform with Microsoft Azure - Dmitry Anoshin
This document provides an overview of building a modern cloud analytics solution using Microsoft Azure. It discusses the role of analytics, a history of cloud computing, and a data warehouse modernization project. Key challenges covered include lack of notifications, logging, self-service BI, and integrating streaming data. The document proposes solutions to these challenges using Azure services like Data Factory, Kafka, Databricks, and SQL Data Warehouse. It also discusses alternative implementations using tools like Matillion ETL and Snowflake.
Learn to Use Databricks for Data Science - Databricks
Data scientists face numerous challenges throughout the data science workflow that hinder productivity. As organizations continue to become more data-driven, a collaborative environment is more critical than ever — one that provides easier access and visibility into the data, reports and dashboards built against the data, reproducibility, and insights uncovered within the data. Join us to hear how Databricks’ open and collaborative platform simplifies data science by enabling you to run all types of analytics workloads, from data preparation to exploratory analysis and predictive analytics, at scale — all on one unified platform.
Definitive Guide to Select Right Data Warehouse (2020) - Sprinkle Data Inc
Choosing the right data warehouse is a big challenge for organisations. In this doc, we have made an end-to-end comparison of leading data warehouses: Snowflake vs Redshift vs BigQuery vs Hive vs Athena.
Sprinkledata.com
This document provides an overview and summary of the author's background and expertise. It states that the author has over 30 years of experience in IT working on many BI and data warehouse projects. It also lists that the author has experience as a developer, DBA, architect, and consultant. It provides certifications held and publications authored as well as noting previous recognition as an SQL Server MVP.
All Databases Are Equal, But Some Databases Are More Equal than Others: How t... - javier ramirez
Data comes in different shapes and sizes, at different speeds, and with different processing needs. AWS offers managed databases covering SQL, NoSQL, in-memory, graphs, time series, and ledger databases. Learn which are the main use cases for each of those, which ones offer autoscaling or serverless capabilities, how you can start from scratch or via database migration services, and why Amazon Aurora is the fastest-growing service in the history of AWS.
This document discusses how companies are increasingly data-centric and how data has become a strategic asset. It introduces several AWS database and data storage services like Amazon Aurora, DynamoDB, DocumentDB, ElastiCache, Neptune, Timestream, and QLDB. These services provide different data models and use cases like relational, key-value, document, in-memory, graph, time-series, and ledger data. The document highlights features of each service like performance, scalability, availability, security, and ease of use. It also discusses how the AWS Database Migration Service can help migrate databases to AWS.
AWS Quick Start is a free event designed to educate you about the AWS platform with architectural best practices, expert tips, and demonstrations. Join us as we cover a broad range of getting started topics to help you fast-track your success and build on AWS.
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AWS Summit - Amazon Web Services
In this session, we cover a purpose-built strategy for databases, where you choose the right tool for the job. We explain why your application should drive the requirements of a database, not the other way around. We introduce AWS databases that are purpose-built for your application scenarios. We explain why different database services solve different aspects of an application. We demo application scenarios that lend themselves well to specific data services. If you’re building modern applications requiring high performance, scale, and functional databases, and trying to determine which relational and nonrelational data services to use, this session is for you.
Deriving Value with Next Gen Analytics and ML Architectures - Amazon Web Services
This presentation was delivered on March 19, 2019, at Gartner's Data and Analytics Summit in Orlando, FL. Rahul Pathak, GM at AWS, discusses Deriving Value with Next Gen Analytics and ML Architectures on AWS.
AWS Purpose-Built Database Strategy: The Right Tool for The Right Job - Amazon Web Services
Learn why AWS is building a comprehensive database and analytics platform with purpose-built databases designed to solve specific customer problems.
We dive deeper into the operational database services that AWS offers, such as Amazon RDS, Amazon DynamoDB, Amazon ElastiCache, and the new Amazon Neptune graph database. Finally, through a demonstration of Amazon RDS, you get to see how easy it is to use a managed database service.
Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AWS Summit - Amazon Web Services
In this session, we give an overview of how to build low-latency and high-throughput applications with Amazon ElastiCache. We also provide an introduction to Redis and Memcached, two of the world’s leading in-memory databases, and we cover the specific use cases customers are solving with them. From caching and session storage to leaderboards, a wide range of applications can see dramatic performance improvements with an added in-memory database.
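Of the use cases listed, the leaderboard maps most directly onto a Redis data structure; here is a small redis-py sketch with an illustrative endpoint and invented player names.

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # illustrative endpoint

# A sorted set keeps members ordered by score: ZADD inserts,
# ZINCRBY updates a score in place.
r.zadd("leaderboard", {"alice": 120, "bob": 95})
r.zincrby("leaderboard", 15, "bob")

# Top three players, highest score first.
print(r.zrevrange("leaderboard", 0, 2, withscores=True))
```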
Building data lakes for analytics on AWS - ADB201 - Santa Clara AWS Summit - Amazon Web Services
AWS provides the most comprehensive, secure, scalable, and cost-effective portfolio of services for building data lakes for analytics. In this session, learn how to discover, load, store, catalog, prepare, and secure your data in a data lake. Then, learn to analyze with the largest choice of analytics approaches, including big data, data warehouse, operational, and real-time streaming analytics, and even ML and AI. Ensure that your needs are met for existing and future analytics use cases. Hear how leading companies found success with their data lake initiatives.
Building with Purpose-Built Databases: Match Your Workload to the Right Database - AWS Summits
Learn how to evaluate a new workload for the best managed database option based on specific application needs related to data shape, data size at limit, computational requirements, programmability, throughput and latency needs, and more. This session explains the ideal use cases for relational and non-relational database services, including Amazon Aurora, Amazon DynamoDB, Amazon ElastiCache for Redis, Amazon Neptune, and Amazon Redshift.
Laura Caicedo, Solutions Architect, Amazon Web Services
AWS Summit Singapore 2019 | Big Data Analytics Architectural Patterns and Best Practices - AWS Summits
Speaker: Renee Lo, Head of Big Data, Analytics, and AI, ASEAN, AWS
Customer Speaker: Natalia Kozyura, Head of Innovation Center, FWD Group
We discuss architectural principles that simplify big data analytics. We'll apply these principles to various stages of big data processing: collect, store, process, analyse, and visualise. We'll discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architectures, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Building with Purpose-Built Databases: Match Your Workload to the Right Database - Amazon Web Services
Learn how to evaluate a new workload for the best managed database option based on specific application needs related to data shape, data size at limit, computational requirements, programmability, throughput and latency needs, and more. This session explains the ideal use cases for relational and non-relational database services, including Amazon Aurora, Amazon DynamoDB, Amazon ElastiCache for Redis, Amazon Neptune, and Amazon Redshift.
Using AWS Purpose-Built Databases to Modernize your Applications - Amazon Web Services
As you look to modernizing your applications, you will need to consider your database options to meet the new application requirements. AWS offers a series of purpose-built databases that include relational, key value, document, graph and cache use cases to help you deliver new and enhanced functionalities. In this webinar session, we share the different modern application architectures, and how to combine different database services to meet your requirements. Understand how to modernize your relational databases through easy upgrades with Amazon Relational Database Service and learn how to migrate from one database to another with AWS Database Migration Service and AWS Schema Conversion Tool.
Speaker:
Blair Layton, Business Development Manager, Amazon Web Services
As the cloud lowered the cost of storing and processing data and a new generation of applications emerged, new requirements for databases arose. These applications need databases that can store tera- or petabytes of data and new data types, respond in milliseconds, and handle millions of requests per second from millions of users anywhere in the world. To support such requirements, you need both relational and non-relational databases designed to meet the specific needs of your applications.
To learn more about which database systems you can use on AWS for your applications, join our next AWS Czech-Slovak webinar. We will demonstrate various database solutions on AWS, describe use cases and best practices, and show several demos.
Premiere: 09/07/2019
Building Data Lakes for Analytics on AWS - ADB201 - Anaheim AWS Summit - Amazon Web Services
AWS provides the most comprehensive, secure, scalable, and cost-effective portfolio of services for building data lakes for analytics. In this session, learn how to discover, load, store, catalog, prepare, and secure your data in a data lake. Then, learn to analyze with the largest choice of analytics approaches, including big data, data warehouse, operational, real-time streaming analytics, and even ML and AI. Ensure that your needs are met for existing and future analytics use cases, and discover how leading companies found success with their data lake initiatives.
The document discusses different database models and their common use cases. It describes relational, key-value, document, in-memory, graph, and time-series database models. For each model it provides examples of common data structures, features, and example use cases such as ERP/CRM systems, shopping carts, content management, fraud detection, and IoT applications. The goal is to help readers understand which database model to select based on their specific data and requirements.
Modern Data Platforms - Thinking Data Flywheel on the Cloud - Alluxio, Inc.
Data Orchestration Summit
www.alluxio.io/data-orchestration-summit-2019
November 7, 2019
Modern Data Platforms - Thinking Data Flywheel on the Cloud
Speaker:
Roy Ben-Alta, AWS
For more Alluxio events: https://www.alluxio.io/events/
The Future of Database Migration is Cloud, AWS Federal Pop-Up Loft - Amazon Web Services
According to Gartner, “the future of the DBMS is the cloud.” But what does it mean to migrate an on-premises database to the cloud? What database options are available? Do I lift-and-shift, re-platform, or re-factor to a modern database engine? How do I size my database workload in AWS? Join AWS experts for an in-depth discussion based on real-world use cases to explore these topics and more as you plan your journey to the cloud.
Building with AWS Databases: Match Your Workload to the Right Database | AWS ... - AWS Summits
In this session we will discuss the ideal use cases for relational and nonrelational data services, including Amazon ElastiCache for Redis, Amazon DynamoDB, Amazon Aurora, Amazon Neptune, Amazon Elasticsearch Service, Amazon Timestream, Amazon QLDB, and Amazon DocumentDB. This session will focus on how to evaluate a new workload for the best managed database option.
Building with AWS Databases: Match Your Workload to the Right Database | AWS ... - Amazon Web Services
The document discusses choosing the appropriate database for different types of workloads. It outlines several AWS databases and their best use cases based on data shape, size, and compute needs. Examples include using DynamoDB for key-value workloads, DocumentDB for flexible schemas, and Neptune for graph queries. The document emphasizes starting with application requirements and choosing a database tailored to the specific workload.
Similar to How to choose the right database for your workload
How to build Forecasting services using ML and deep learning algorithms - Amazon Web Services
Forecasting is an important process for many companies and is used in various areas to accurately predict the growth and distribution of a product, the resource utilization needed on production lines, financial reporting, and much more. Amazon uses advanced forecasting techniques, and some of these services have been made available to all AWS customers.
In this session we will show how to pre-process data that contains a time component and then use an algorithm that, based on the type of data analyzed, produces an accurate forecast.
Big Data for Startups: how to build Big Data applications in Serverless mode - Amazon Web Services
The variety and volume of data created every day is accelerating ever faster and represents a unique opportunity to innovate and create new startups.
However, managing large amounts of data can seem complex: building large-scale Big Data clusters looks like an investment affordable only to established companies. But the elasticity of the cloud and, in particular, serverless services allow us to break through these limits.
Let's see how it is possible to develop Big Data applications quickly, without worrying about infrastructure, dedicating all our resources to developing our ideas and creating innovative products.
You can now use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate, the serverless compute engine built for containers on AWS. This makes it easier than ever to build and run your Kubernetes applications in the AWS cloud. In this session we will present the main features of the service and how to deploy your application in a few steps.
Twenty years ago, Amazon went through a radical transformation aimed at increasing the pace of innovation. During this period we learned how changing our approach to application development allowed us to greatly increase agility and release velocity and, ultimately, enabled us to build more reliable and scalable applications. In this session we will explain how we define modern applications and how building modern apps affects not only the application architecture, but also the organizational structure, development release pipelines, and even the operating model. We will also describe common approaches to modernization, including the approach used by Amazon.com itself.
How to spend up to 90% less with containers and Spot Instances - Amazon Web Services
Container usage keeps growing.
When designed correctly, container-based applications are very often stateless and flexible.
AWS ECS, EKS, and Kubernetes on EC2 can take advantage of Spot Instances, leading to average savings of 70% compared to On-Demand Instances. In this session we will explore the characteristics of Spot Instances and how they can be easily used on AWS. We will also learn how Spreaker uses Spot Instances to run applications of various kinds, in production, at a fraction of the on-demand cost!
In recent months, many customers have been asking us how to monetise Open APIs, simplify Fintech integrations, and accelerate adoption of various Open Banking business models. Therefore, AWS and FinConecta would like to invite you to the Open Finance marketplace presentation on October 20th.
Event Agenda:
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Make your startup's offering unique in the market with Machine Learning services - Amazon Web Services
To create value and build a differentiated, recognizable offering, successful startups know how to combine established technologies with innovative, purpose-built components.
AWS provides ready-to-use services and, at the same time, lets you customize and create the differentiating elements of your offering.
Focusing on Machine Learning technologies, we will see how to select the artificial intelligence services offered by AWS and, also through a demo, how to build custom Machine Learning models using SageMaker Studio.
OpsWorks Configuration Management: automate the management and deployments of... - Amazon Web Services
With the traditional approach to IT, it was difficult for many years to implement DevOps techniques, which until now have often involved manual activities, occasionally leading to application downtime and interrupting users' work. With the advent of the cloud, DevOps techniques are now within everyone's reach at low cost for any kind of workload, guaranteeing greater system reliability and resulting in significant improvements in business continuity.
AWS offers AWS OpsWorks as a Configuration Management tool that aims to automate and simplify the management and deployment of EC2 instances by means of Chef and Puppet workloads.
Discover how to leverage AWS OpsWorks to guarantee the reliability of your application running on EC2 instances.
Microsoft Active Directory on AWS to support your Windows Workloads - Amazon Web Services
Do you want to learn about the options for running Microsoft Active Directory on AWS? When moving Microsoft workloads to AWS, it is important to consider how to deploy Microsoft Active Directory to support group policy management, authentication, and authorization. In this session, we will discuss options for deploying Microsoft Active Directory on AWS, including AWS Directory Service for Microsoft Active Directory and deploying Active Directory on Windows on Amazon Elastic Compute Cloud (Amazon EC2). We cover topics such as integrating your on-premises Microsoft Active Directory environment into the cloud and using SaaS applications, such as Office 365, with AWS Single Sign-On.
From facial recognition to detecting fraud or manufacturing defects, image and video analysis powered by artificial intelligence techniques is evolving and being refined at a rapid pace. In this webinar we will explore the possibilities offered by AWS services for applying state-of-the-art computer vision techniques to real-world scenarios.
Amazon Web Services and VMware are organizing a free virtual event next Wednesday, October 14, from 12:00 to 13:00 dedicated to VMware Cloud™ on AWS, the on-demand service that lets you run applications in cloud environments based on VMware vSphere® and access a wide range of AWS services, fully exploiting the potential of the AWS cloud while protecting existing VMware investments.
Build your first serverless ledger-based app with QLDB and NodeJS - Amazon Web Services
Many companies today build applications with ledger-type functionality, for example to verify the history of credits or debits in banking transactions, or to track the supply chain flow of their products.
At the heart of these solutions are ledger databases, which provide a transparent, immutable, and cryptographically verifiable transaction log, but they are complex and costly tools to manage.
Amazon QLDB eliminates the need to build custom, complex systems by providing a fully managed serverless ledger database.
In this session we will discover how to build a complete serverless application that uses QLDB's capabilities.
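The session uses NodeJS, but for consistency with the other sketches here is the same idea with the pyqldb Python driver; the ledger name is illustrative and the Transactions table is assumed to exist.

```python
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="orders-ledger")  # illustrative ledger name

def record_debit(txn):
    # Each statement runs in one transaction that QLDB appends to its
    # immutable, cryptographically verifiable journal.
    txn.execute_statement(
        "INSERT INTO Transactions ?",
        {"account": "acct-1", "amount": -25, "type": "debit"},
    )

driver.execute_lambda(record_debit)
```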
With the rise of microservices architectures and rich mobile and web applications, APIs are more important than ever for offering end users an exceptional experience. In this session we will learn how to tackle modern API design challenges with GraphQL, an open source API query language used by Facebook, Amazon, and others, and how to use AWS AppSync, a managed serverless GraphQL service on AWS. We will dive into several scenarios, understanding how AppSync can help solve these use cases by building modern APIs with real-time and offline data update capabilities.
In addition, we will learn how Sky Italia uses AWS AppSync to deliver real-time sports updates to users of its web portal.
Oracle databases and VMware Cloud™ on AWS: myths to debunk - Amazon Web Services
Many organizations take advantage of the cloud by migrating their Oracle workloads, securing significant benefits in agility and cost efficiency.
Migrating these workloads can create complexity during application modernization and refactoring, compounded by performance risks that can be introduced when moving applications out of on-premises data centers.
In these slides, AWS and VMware experts present simple, practical tips to facilitate and simplify the migration of Oracle workloads while accelerating the transformation to the cloud; they dive into the architecture and demonstrate how to fully leverage the potential of VMware Cloud™ on AWS.
- The document discusses building a minimum viable product (MVP) using Amazon Web Services (AWS).
- It provides an example of an MVP for an omni-channel messenger platform, built from 2017 onward, that connects ecommerce stores to customers via web chat, Facebook Messenger, WhatsApp, and other channels.
- The founder discusses how they started with an MVP in 2017 with 200 ecommerce stores in Hong Kong and Taiwan, and have since expanded to over 5,000 clients across Southeast Asia using AWS for scaling.
This document discusses pitch decks and fundraising materials. It explains that venture capitalists will typically spend only 3 minutes and 44 seconds reviewing a pitch deck. Therefore, the deck needs to tell a compelling story to grab their attention. It also provides tips on tailoring different types of decks for different purposes, such as creating a concise 1-2 page teaser, a presentation deck for pitching in-person, and a more detailed read-only or fundraising deck. The document stresses the importance of including key information like the problem, solution, product, traction, market size, plans, team, and ask.
This document discusses building serverless web applications using AWS services like API Gateway, Lambda, DynamoDB, S3 and Amplify. It provides an overview of each service and how they can work together to create a scalable, secure and cost-effective serverless application stack without having to manage servers or infrastructure. Key services covered include API Gateway for hosting APIs, Lambda for backend logic, DynamoDB for database needs, S3 for static content, and Amplify for frontend hosting and continuous deployment.
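To illustrate how the pieces fit, here is a hedged sketch of a Lambda handler that API Gateway could invoke, backed by a hypothetical DynamoDB table; S3 and Amplify would serve the frontend separately.

```python
import json
import boto3

# Hypothetical table storing notes keyed by "id".
table = boto3.resource("dynamodb").Table("notes")

def handler(event, context):
    """API Gateway (proxy integration) -> Lambda -> DynamoDB read path."""
    note_id = event["pathParameters"]["id"]
    item = table.get_item(Key={"id": note_id}).get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(item, default=str)}
```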
This document provides tips for fundraising from startup founders Roland Yau and Sze Lok Chan. It discusses generating competition to create urgency for investors, fundraising in parallel rather than sequentially, having a clear fundraising narrative focused on what you do and why it's compelling, and prioritizing relationships with people over firms. It also notes how the pandemic has changed fundraising, with examples of deals done virtually during this time. The tips emphasize being fully prepared before fundraising and cultivating connections with investors in advance.
AWS_HK_StartupDay_Building Interactive websites while automating for efficien... - Amazon Web Services
This document discusses Amazon's machine learning services for building conversational interfaces and extracting insights from unstructured text and audio. It describes Amazon Lex for creating chatbots, Amazon Comprehend for natural language processing tasks like entity extraction and sentiment analysis, and how they can be used together for applications like intelligent call centers and content analysis. Pre-trained APIs simplify adding machine learning to apps without requiring ML expertise.
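Since these are pre-trained APIs, calling them takes a few lines of boto3; the sample text is an invented example.

```python
import boto3

comprehend = boto3.client("comprehend")
text = "The delivery was late, but the support team resolved it quickly."

# Pre-trained models: no training step is needed before calling.
sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
entities = comprehend.detect_entities(Text=text, LanguageCode="en")

print(sentiment["Sentiment"])                     # e.g. "MIXED"
print([e["Text"] for e in entities["Entities"]])  # detected entities
```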
Amazon Elastic Container Service (Amazon ECS) is a highly scalable container management service that simplifies the management of Docker containers through an orchestration layer controlling deployment and the related lifecycle. In this session we will present the main features of the service, reference architectures for different workloads, and the simple steps needed to quickly migrate one or more of your containers.