DAT203 - AWS Storage and Database
Architecture Best Practices
Siva Raghupathy, Amazon Web Services

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
The Third Platform
• Built on:
– Mobile devices
– Cloud services
– Social technologies
– Big data

• Billions of users
• Millions of apps
Data Volume, Velocity, Variety
• 2.7 zettabytes (ZB) of data
exists in the digital universe
today
– 1 ZB = 1 billion terabytes

• 450 billion transactions per day
by 2020
• More unstructured data than
structured data
Common Questions from Database Developers
Cloud Migration
• How do I move (my data) to the
cloud?
Data/Storage Technologies
• What data store should I use?
– SQL or NoSQL?
– Hadoop or DW?
– What about search?

Management Concerns
• Is my data (in the cloud) secure?
• Relational features w/o management
nightmares?
• My data volume, velocity, and variety
are exploding!
• How can I reduce cost?
Performance and Delivery
• Need low latency (ms or µs)
• Need high throughput
• Need to ship in days – not years!
Cloud Data Tier Anti-Pattern

Data Tier
Cloud Data Tier Architecture – Use the Right Tool for the Job!
Client Tier

App/Web Tier

Data Tier
Search

Cache

Blob Store

ETL

NoSQL

SQL

Data
Warehouse

Hadoop
AWS
Deployment & Administration

App Services

Compute

Storage

Database

Networking
AWS Global Infrastructure
AWS Managed Database & Storage Services
Structured – Complex Query
• SQL
– Amazon RDS
(MySQL, Oracle, SQL Server)

• Data Warehouse
– Amazon Redshift

Structured – Simple Query
• NoSQL
– Amazon DynamoDB

• Cache
– Amazon ElastiCache
(Memcached, Redis)

• Search
– Amazon
CloudSearch

Unstructured – Custom Query
• Hadoop
– Amazon Elastic MapReduce
(EMR)

Unstructured – No Query
• Cloud Storage
– Amazon S3
– Amazon Glacier
AWS Primitive Compute and Storage
Compute Capabilities
• Many different EC2 instance types
– General purpose
– Compute optimized
– Storage optimized
– Memory optimized
• Host any major data storage technology
– RDBMS
– NoSQL
– Cache

Raw Storage Options
• EC2 Instance store (ephemeral)
• Amazon Elastic Block Store (EBS)
– Standard volume
• 1 TB, ~100 IOPS per volume
– Provisioned IOPS volume
• 1 TB, up to 4000 IOPS per volume
– Stripe multiple volumes for higher IOPS or storage

Primitives add flexibility, but also come with operational burden!
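The striping advice can be turned into a quick sizing calculation. The sketch below is illustrative only: it assumes IOPS add up linearly across striped volumes and ignores instance-level network limits, and the function name is hypothetical. The 4000 IOPS per-volume ceiling is the figure quoted on the slide.

```python
import math

# Per-volume ceiling quoted on the slide for a Provisioned IOPS volume.
PIOPS_PER_VOLUME = 4000


def volumes_for_target_iops(target_iops, per_volume_iops=PIOPS_PER_VOLUME):
    """Smallest number of striped volumes whose combined ceiling meets target_iops.

    Assumption: striping (RAID 0) scales IOPS linearly, which is an idealization.
    """
    if target_iops <= 0:
        raise ValueError("target_iops must be positive")
    return math.ceil(target_iops / per_volume_iops)
```

For example, under these assumptions a 10,000 IOPS target needs three striped volumes, each of which then has to be managed, snapshotted, and monitored, which is the operational burden the slide refers to.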
AWS Data Tier Architecture - Use the right tool for the job!

Data Tier
Amazon
ElastiCache

Amazon
CloudSearch

Amazon
Elastic MapReduce

Amazon S3

Amazon
Glacier
Amazon DynamoDB

Amazon RDS

Amazon Redshift

AWS Data Pipeline
Reference Architecture
Amazon
CloudSearch

Amazon
ElastiCache
Amazon
RDS

Amazon
EMR

Amazon
DynamoDB

Amazon
Redshift

AWS Data Pipeline

Reference Architecture

Amazon
S3

Amazon
Glacier
Use Case: A Video Streaming Application
Use Case: A Video Streaming App – Upload

Amazon
CloudSearch
Amazon
RDS

Amazon
DynamoDB

Amazon
S3
A Video Streaming App – Discovery
CloudFront

Amazon
CloudSearch
Amazon
ElastiCache
Amazon
RDS

X

Amazon
DynamoDB

Amazon
S3

Amazon
Glacier
Use Case: A Video Streaming App – Recs

Amazon
DynamoDB

Amazon
EMR

Amazon
S3

Amazon
Glacier
Use Case: A Video Streaming App – Analytics

Amazon
EMR

Amazon
S3

Amazon
Redshift

Amazon
Glacier
What is the temperature of your data?
Data Characteristics: Hot, Warm, Cold
               Hot         Warm       Cold
Volume         MB–GB       GB–TB      PB
Item size      B–KB        KB–MB      KB–TB
Latency        ms          ms, sec    min, hrs
Durability     Low–High    High       Very High
Request rate   Very High   High       Low
Cost/GB        $$-$        $-¢¢       ¢
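[Figure: AWS data stores placed along a structure axis (low to high) and a data temperature spectrum: Amazon Glacier, Amazon S3, Amazon ElastiCache, Amazon EMR, Amazon DynamoDB, Amazon RDS, Amazon Redshift.]
Hot end: high request rate, high cost/GB, low latency, low data volume
Cold end: low request rate, low cost/GB, high latency, high data volume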
What data store should I use?
Data store           Average latency       Data volume            Item size            Request rate                  Storage cost ($/GB/month)   Durability
Amazon ElastiCache   ms                    GB                     B–KB                 Very High                     $$                          Low–Moderate
Amazon DynamoDB      ms                    GB–TBs (no limit)      KB (64 KB max)       Very High                     ¢¢                          Very High
Amazon RDS           ms, sec               GB–TB (3 TB max)       KB (~row size)       High                          ¢¢                          High
Amazon CloudSearch   ms, sec               GB–TB                  KB (1 MB max)        High                          $                           High
Amazon Redshift      sec, min              TB–PB (1.6 PB max)     KB (64 K max)        Low                           ¢                           High
Amazon EMR (Hive)    sec, min, hrs         GB–PB (~nodes)         KB–MB                Low                           ¢                           High
Amazon S3            ms, sec, min (~size)  GB–PB (no limit)       KB–GB (5 TB max)     Low–Very High (no limit)      ¢                           Very High
Amazon Glacier       hrs                   GB–PB (no limit)       GB (40 TB max)       Very Low (no limit)           ¢                           Very High

Ordered from hot data (top) through warm data to cold data (bottom).
AWS Data Tier Architecture - Use the right tool for the job!

Data Tier
Amazon
ElastiCache

Amazon
CloudSearch

Amazon
Elastic MapReduce

Amazon S3
Amazon
Glacier

Amazon DynamoDB

Amazon RDS

Amazon Redshift

AWS Data Pipeline
Cost Conscious Design
Cost Conscious Design
Example: Should I use Amazon S3 or Amazon DynamoDB?
“I’m currently scoping out a project that will greatly increase
my team’s use of Amazon S3. Hoping you could answer
some questions. The current iteration of the design calls for
many small files, perhaps up to a billion during peak. The
total size would be on the order of 1.5 TB per month…”
Request rate    Object size   Total size    Objects per month
(writes/sec)    (bytes)       (GB/month)
300             2,048         1,483         777,600,000
Cost Conscious Design
Example: Should I use Amazon S3 or Amazon DynamoDB?

             Request rate    Object size   Total size    Objects per month   Use
             (writes/sec)    (bytes)       (GB/month)
Scenario 1   300             2,048         1,483         777,600,000         Amazon DynamoDB
Scenario 2   300             32,768        23,730        777,600,000         Amazon S3
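The figures in the scenario table follow from the request rate and object size alone. The sketch below reproduces that arithmetic; it assumes a 30-day month and binary gigabytes (2^30 bytes), which are the assumptions that match the slide's numbers, and the function name is hypothetical. It deliberately leaves out pricing, which the slide does not state.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30        # assumption: a 30-day month
BYTES_PER_GB = 2 ** 30     # assumption: binary gigabytes


def monthly_footprint(writes_per_sec, object_bytes):
    """Return (objects written per month, total GB written per month)."""
    objects = writes_per_sec * SECONDS_PER_DAY * DAYS_PER_MONTH
    total_gb = objects * object_bytes / BYTES_PER_GB
    return objects, total_gb
```

With 300 writes per second, both scenarios produce the same 777,600,000 objects per month; only the object size changes the stored volume, from roughly 1,483 GB at 2,048 bytes to roughly 23,730 GB at 32,768 bytes. Request count and stored volume are the two inputs a cost comparison between the services would start from.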
Best Practices
Amazon RDS
When to use
• Transactions
• Complex queries
• Medium to high query/write rate
– Up to 30 K IOPS (15 K reads + 15 K writes)
• 100s of GB to low TBs
• Workload can fit in a single node
• High durability

When not to use
• Massive read/write rates
– Example: 150 K write requests per second
• Data size or throughput demands sharding
– Example: 10s or 100s of terabytes
• Simple Get/Put and queries that a NoSQL can handle
• Complex analytics
Push-Button Scaling

Multi-AZ

AZ 1

AZ 2

Region

Read Replicas
Amazon RDS Best Practices
• Use the right DB instance class
• Use EBS-optimized instances
– db.m1.large, db.m1.xlarge, db.m2.2xlarge, db.m2.4xlarge,
db.cr1.8xlarge

• Use provisioned IOPS
• Use multi-AZ for high availability
• Use read replicas for
– Scaling reads
– Schema changes
– Additional failure recovery
Amazon DynamoDB
When to use
• Fast and predictable performance
• Seamless/massive scale
• Autosharding
• Consistent/low latency
• No size or throughput limits
• Very high durability
• Key-value or simple queries

When not to use
• Need multi-item/row or cross table transactions
• Need complex queries, joins
• Need real-time analytics on historic data
• Storing cold data
Amazon DynamoDB Best Practices
• Keep item size small
• Store metadata in Amazon DynamoDB and
large blobs in Amazon S3
• Use a table with a hash key for extremely
high scale
• Use table per day, week, month etc. for
storing time series data
• Use conditional/OCC updates
• Use hash-range key to model
– 1:N relationships
– Multi-tenancy

• Avoid hot keys and hot partitions
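The "conditional/OCC updates" bullet refers to optimistic concurrency control: read an item, then write it back only if nobody changed it in between. The sketch below illustrates the pattern with an in-memory stand-in rather than the DynamoDB API; `VersionedStore`, `put_if_version`, and the `version` attribute are all hypothetical names chosen for illustration.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored version differs from the expected one."""


class VersionedStore:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return dict(self._items[key])  # return a copy, like a fresh read

    def put_if_version(self, key, item, expected_version):
        current = self._items.get(key)
        current_version = current["version"] if current else None
        if current_version != expected_version:
            raise ConditionalCheckFailed(key)
        stored = dict(item)
        stored["version"] = (expected_version or 0) + 1
        self._items[key] = stored


def increment_views(store, key, max_retries=5):
    """Read-modify-write that retries when another writer got there first."""
    for _ in range(max_retries):
        item = store.get(key)
        updated = dict(item, views=item["views"] + 1)
        try:
            store.put_if_version(key, updated, item["version"])
            return updated["views"]
        except ConditionalCheckFailed:
            continue  # someone else updated the item; re-read and try again
    raise RuntimeError("too many concurrent updates")
```

The design choice is that no lock is held between the read and the write; a conflicting writer simply causes a retry, which suits workloads where conflicts are rare.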

Events_table_2012
Event_id (hash key) | Timestamp (range key) | Attribute1 | … | Attribute N

Events_table_2012_05_week1
Events_table_2012_05_week2
Events_table_2012_05_week3
(each with the same columns)
Event_id (hash key) | Timestamp (range key) | Attribute1 | … | Attribute N
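With one table per period, the application has to route each event to the right table. A minimal sketch of that routing follows, using the naming pattern from the slide; the week-of-month rule (days 1 to 7 are week 1, and so on) and the function name are assumptions, since the slide does not define how weeks are counted.

```python
from datetime import datetime


def events_table_for(ts):
    """Name of the weekly table that an event with timestamp ts belongs to.

    Assumption: weeks are counted within the month, days 1-7 being week 1.
    """
    week_of_month = (ts.day - 1) // 7 + 1
    return f"Events_table_{ts.year}_{ts.month:02d}_week{week_of_month}"
```

Older tables can then be given lower throughput, archived, or dropped as a whole once their period has passed, rather than deleting items one at a time.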
Amazon ElastiCache (Memcached)
When to use
• Transient key-value store
• Need to speed up reads/writes
• Caching frequent SQL, NoSQL or DW query results
• Saving transient and frequently updated data
– Increment/decrement game scores/counters
– Web application session storage
• Best effort deduplication

When not to use
• Store infrequently used data
• Need persistence
Amazon ElastiCache (Memcached) Best Practices
• Use autodiscovery
• Share memcached client objects in application
• Use TTLs
• Consider memory for connections overhead
• Use Amazon CloudWatch alarms / SNS alerts
– Number of connections
– Swap memory usage
– Freeable memory
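The "Use TTLs" advice pairs naturally with the cache-aside pattern for query results. The sketch below uses a small in-memory class in place of a real Memcached client so that it runs on its own; `TTLCache` and `cached_query` are hypothetical names, and the 60-second default TTL is an arbitrary choice.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache whose entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value


def cached_query(cache, key, run_query, ttl_seconds=60):
    """Cache-aside: return the cached result, or run the query and cache it."""
    value = cache.get(key)
    if value is None:
        value = run_query()
        cache.set(key, value, ttl_seconds)
    return value
```

Because every entry expires on its own, stale results age out without explicit invalidation, and a lost cache node only costs extra queries rather than data.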
Amazon ElastiCache (Redis)
When to use
• Key-value store with advanced data structures
– Strings, lists, sets, sorted sets, hashes
• Caching
• Leader boards
• High-speed sorting
• Atomic counters
• Queuing systems
• Activity streams

When not to use
• Need “native” sharding or scale-out
• Need “hard” persistence
• Data won’t fit in memory
• Need transaction rollback even under exceptions
Amazon ElastiCache (Redis) Best Practices
• Use TTL
• Use the right instance types
– Instances with high ECU/vCPU and network performance yield the highest throughput. Example: m2.4xlarge, m2.2xlarge
• Use read replicas
– Increase read throughput
– AOF cannot protect against all failure modes
– Promote read replicas to primary
• Use RDB file snapshot for on-premises to Amazon ElastiCache migration
• Key parameter group settings
– Avoid “AOF with fsync always” – huge impact on performance
– AOF (+ RDB) with fsync everysec – best durability + performance
– Pub-sub: set client-output-buffer-limit-pubsub-hard-limit and client-output-buffer-limit-pubsub-soft-limit based on the workloads
Amazon CloudSearch
When to use
• No search expertise
• Full-text search
• Ranking
• Relevance
• Structured and unstructured data
• Faceting
– $0 to $10 (4 items)
– $10 and above (3 items)

When not to use
• Not as replacement for a database
– Not as a system of record
– Transient data
– Nonatomic updates
Amazon CloudSearch Best Practices
• Batch documents for uploading
• Use Amazon CloudSearch for searching and another
store for retrieving full records for the UI (i.e. don’t use
return fields)
• Include other data like popularity scores in documents
• Use stop words to remove common terms
• Use fielded queries to reduce match sets
• Query latency is proportional to query specificity
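Batching documents for upload amounts to grouping them so that each request stays under a size limit. The sketch below shows one way to do that; the function name is hypothetical, documents are treated as opaque dictionaries, and the 5 MB default is an assumption that should be checked against the service's current batch limit.

```python
import json


def batch_documents(docs, max_batch_bytes=5 * 1024 * 1024):
    """Yield JSON array strings, each holding as many docs as fit in the limit.

    A single document larger than the limit is still emitted on its own;
    rejecting or splitting it is left to the caller.
    """
    batch, size = [], 2  # 2 bytes for the enclosing "[" and "]"
    for doc in docs:
        encoded = json.dumps(doc, separators=(",", ":"))
        doc_size = len(encoded.encode("utf-8")) + 1  # +1 for a separating comma
        if batch and size + doc_size > max_batch_bytes:
            yield "[" + ",".join(batch) + "]"
            batch, size = [], 2
        batch.append(encoded)
        size += doc_size
    if batch:
        yield "[" + ",".join(batch) + "]"
```

Each yielded string can then be sent as one upload request, which keeps the number of requests far below one per document.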
Amazon Redshift
When to use
• Information analysis and reporting
• Complex DW queries that summarize historical data
• Batched large updates e.g. daily sales totals
• 10s of concurrent queries
• 100s GB to PB
• Compression
• Column based
• Very high durability

When not to use
• OLTP workloads
– 1000s of concurrent users
– Large number of singleton updates
Amazon Redshift Best Practices
• Use COPY command to load large data sets from Amazon
S3, Amazon DynamoDB, Amazon EMR/EC2/Unix/Linux hosts
– Split your data into multiple files
– Use GZIP or LZOP compression
– Use manifest file

• Choose proper sort key
– Range or equality on WHERE clause

• Choose proper distribution key
– Join column, foreign key or largest dimension, group by column
– Avoid distribution key for denormalized data
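The COPY advice (split the data into multiple files, compress them, and list them in a manifest) can be sketched as a small helper that names the parts and builds the manifest document. The bucket name, key prefix, and function names below are hypothetical; the manifest layout, a JSON object with an `entries` list of `url` and `mandatory` fields, follows the format Redshift documents for COPY manifests.

```python
import json


def part_keys(prefix, parts, suffix=".gz"):
    """Key names for a data set split into `parts` compressed files."""
    return [f"{prefix}.part{n:04d}{suffix}" for n in range(parts)]


def build_manifest(bucket, keys, mandatory=True):
    """JSON manifest listing every file that COPY should load."""
    entries = [
        {"url": f"s3://{bucket}/{key}", "mandatory": mandatory}
        for key in keys
    ]
    return json.dumps({"entries": entries}, indent=2)
```

Splitting the input lets the load run in parallel across the cluster, and marking entries as mandatory makes the load fail loudly if a listed file is missing instead of silently loading a partial data set.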
Amazon Elastic MapReduce
When to use
• Batch analytics/processing
– Answers in minutes or hours
• Structured and unstructured data
• Parallel scans of the entire dataset with uniform query performance
• Supports Hive QL + other languages
• GB, TB, or PB of data
• Replicated data store (HDFS) for ad-hoc and real-time queries (HBase)

When not to use
• Real-time analytics (DW)
– Need answers in seconds
• 1000s of concurrent users
Amazon Elastic MapReduce Best Practices
• Choose between transient and persistent
clusters for best TCO
• Leverage Amazon S3 integration for
highly durable and interim storage
• Right-size cluster instances based on
each job – not one size fits all
• Leverage resizing and spot to add and
remove capacity cost-effectively
• Tuning cluster instances can be easier
than tuning Hadoop code

Job Flow

Duration:
14 Hours

Job Flow

Duration:
7 Hours
AWS Data Pipeline
When to use
• Automate movement and transformation of data (ETL in the cloud)
• Dependency management
– Data
– Control
• Schedule management
• Transient Amazon EMR clusters
• Regular data move pattern
– Every hour, day
– Every 30 minutes
• Amazon DynamoDB backups
– Cross region

When not to use
• Less than 15 minutes scheduling interval
• Execution latency less than a minute
• Event-based scheduling
AWS Data Pipeline Best Practices
• Use dependency rather than time based
• Make your activities idempotent
• Add in your tools using shell activity
• Use Amazon S3 for staging
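An idempotent activity produces the same result no matter how many times it is retried for the same scheduled period. The sketch below illustrates one common way to get there: derive the output location from the scheduled time, skip the work if the output already exists, and publish it with an atomic rename. A local directory stands in for the staging area, and the function and file names are hypothetical.

```python
from pathlib import Path


def run_activity(staging_dir, scheduled_for, compute):
    """Run `compute` at most once per scheduled period.

    `scheduled_for` is a string identifying the period, e.g. "2013-11-14T10-00".
    """
    staging_dir = Path(staging_dir)
    target = staging_dir / f"output-{scheduled_for}.txt"
    if target.exists():
        return target  # a previous attempt already finished this period
    tmp = target.with_suffix(".tmp")
    tmp.write_text(compute())
    tmp.replace(target)  # rename, so readers never see a half-written file
    return target
```

A retry after a failure either finds the finished output and does nothing or redoes the work from scratch; in neither case is data duplicated downstream.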
Amazon S3
When to use
• Store large objects
• Key-value store - Get/Put/List
• Unlimited storage
• Versioning
• Very high durability
– 99.999999999%
• Very high throughput (via parallel clients)
• Use for storing persistent data
– Backups
– Source/target for EMR
– Blob store with metadata in SQL or NoSQL

When not to use
• Complex queries
• Very low latency (ms)
• Search
• Read-after-write consistency for overwrites
• Need transactions
Amazon S3 Best Practices
• Use random hash prefix for keys
• Ensure a random access pattern
• Use Amazon CloudFront for high throughput GETs and PUTs
• Leverage the high durability, high throughput design of Amazon S3 for backup and as a common storage sink
– Durable sink between data services
– Supports de-coupling and asynchronous delivery
– Consider RRS for lower cost, lower durability storage of derivatives or copies
• Consider parallel threads and multipart upload for faster writes
• Consider parallel threads and range get for faster reads
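A random hash prefix spreads keys that would otherwise share a common, sequential beginning (dates, counters) across the key space. The sketch below shows one way to derive such a prefix; the choice of MD5, the four-character length, and the function name are assumptions made for illustration, not something the slide prescribes.

```python
import hashlib


def prefixed_key(key, prefix_len=4):
    """Prepend a short hash of the key so that keys do not sort sequentially.

    The prefix is derived from the key itself, so the full key name can be
    recomputed from the original key whenever the object has to be read.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"
```

Because the prefix is deterministic, no lookup table is needed; the trade-off is that listing objects in their natural order is no longer possible with a simple prefix listing.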
Amazon Glacier
When to use
• Infrequently accessed data sets
• Very low cost storage
• Data retrieval times of several hours is acceptable
• Encryption at rest
• Very high durability
– 99.999999999%
• Unlimited amount of storage

When not to use
• Frequent access
• Low latency access
Amazon Glacier Best Practices
• Reduce request and storage costs with aggregation
– Aggregate your files into bigger files before sending them to Amazon Glacier
– Store checksums along with your files
– Use a format that allows you to access files within your aggregate archive
• Improve speed and reliability with multipart upload
• Reduce costs with ranged retrievals
• Maintain your own index in a highly durable store
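The aggregation, checksum, and index bullets fit together: pack many small files into one archive, and record where each file sits so it can later be fetched with a ranged retrieval. The sketch below does this in memory with an uncompressed tar file; the function names and the index layout are hypothetical, and the upload to Glacier itself is left out.

```python
import hashlib
import io
import tarfile


def build_archive(files):
    """Pack {name: bytes} into one tar archive and index each member.

    Returns (archive_bytes, index) where index maps each name to its byte
    offset, size, and SHA-256 checksum inside the archive.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    archive = buf.getvalue()

    index = {}
    # Re-read the archive to learn where each member's data starts.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar.getmembers():
            index[member.name] = {
                "offset": member.offset_data,
                "size": member.size,
                "sha256": hashlib.sha256(files[member.name]).hexdigest(),
            }
    return archive, index


def read_member(archive, entry):
    """Extract one file using only its byte range, as a ranged retrieval would."""
    start = entry["offset"]
    return archive[start:start + entry["size"]]
```

The index is what would be kept in a separate durable store: with it, a single file can be requested by byte range and verified against its checksum without retrieving the whole archive. An uncompressed tar is used here because compressing the whole archive would make byte ranges of individual files unusable.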
Amazon EC2 + Amazon EBS/Instance
Storage
When to use
• Alternate data store technologies
• Hand-tuned performance needs
• Direct/admin access required

When not to use
• When a managed service will do the job
• When operational experience is low
Amazon EBS Best Practices
• Pick the right EC2 instance type
– Higher “network performance” instances for driving more Amazon EBS IOPS
– EBS-Optimized EC2 instances for dedicated throughput between EC2 & Amazon EBS
• Use provisioned IOPS volumes for database workloads requiring consistent IOPS
• Use standard volumes for workloads requiring low to moderate IOPS & occasional bursts
• Stripe multiple Amazon EBS volumes for higher IOPS or storage
– RAID0 for higher I/O
– RAID10 for highest local durability
• Amazon EBS snapshots
– Quiesce the file system and take a snapshot
Amazon EC2 Best Practices

HI-Best IOPS/$
HS-Best GB/$

Best vCPU/$

Best MemoryGiB/$
Summary
Cloud Data Tier Architecture Anti-Pattern

Data Tier
AWS Data Tier Architecture - Use the right tool for the job!

Data Tier
Amazon
ElastiCache

Amazon
CloudSearch

Amazon
Elastic MapReduce

Amazon S3
Amazon
Glacier

Amazon DynamoDB

Amazon RDS

Amazon Redshift

AWS Data Pipeline
Amazon
CloudSearch

Amazon
ElastiCache
Amazon
RDS

Amazon
EMR

Amazon
DynamoDB

Amazon
Redshift

AWS Data Pipeline

Reference Architecture

Amazon
S3

Amazon
Glacier
Cost Conscious Design

How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 

Recently uploaded (20)

A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 

AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent 2013

  • 1. DAT203 - AWS Storage and Database Architecture Best Practices Siva Raghupathy, Amazon Web Services © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 2. The Third Platform • Built on: – Mobile devices – Cloud services – Social technologies – Big data • Billions of users • Millions of apps
  • 3. Data Volume, Velocity, Variety • 2.7 zettabytes (ZB) of data exists in the digital universe today – 1 ZB = 1 billion terabytes • 450 billion transactions per day by 2020 • More unstructured data than structured data
  • 4. Common Questions from Database Developers Cloud Migration • How do I move (my data) to the cloud? Data/Storage Technologies • What data store should I use? – SQL or NoSQL? – Hadoop or DW? – What about search? Management Concerns • Is my data (in the cloud) secure? • Relational features w/o management nightmares? • My data volume, velocity, and variety are exploding! • How can I reduce cost? Performance and Delivery • Need low latency (ms or µs) • Need high throughput • Need to ship in days – not years!
  • 5. Cloud Data Tier Anti-Pattern Data Tier
  • 6. Cloud Data Tier Architecture – Use the Right Tool for the Job! Client Tier App/Web Tier Data Tier Search Cache Blob Store ETL NoSQL SQL Data Warehouse Hadoop
  • 7.
  • 8. AWS Deployment & Administration App Services Compute Storage Database Networking AWS Global Infrastructure
  • 9. AWS Managed Database & Storage Services Structured – Complex Query • SQL – Amazon RDS (MySQL, Oracle, SQL Server) • Data Warehouse – Amazon Redshift Structured – Simple Query • NoSQL – Amazon DynamoDB • Cache – Amazon ElastiCache (Memcached, Redis) • Search – Amazon CloudSearch Unstructured – Custom Query • Hadoop – Amazon Elastic MapReduce (EMR) Unstructured – No Query • Cloud Storage – Amazon S3 – Amazon Glacier
  • 10. AWS Primitive Compute and Storage Compute Capabilities • Many different EC2 instance types – General purpose – Compute optimized – Storage optimized – Memory optimized • Host any major data storage technology – RDBMS – NoSQL – Cache Raw Storage Options • EC2 Instance store (ephemeral) • Amazon Elastic Block Store (EBS) – Standard volume: 1 TB, ~100 IOPS per volume – Provisioned IOPS volume: 1 TB, up to 4000 IOPS per volume – Stripe multiple volumes for higher IOPS or storage Primitives add flexibility, but also come with operational burden!
  • 11. AWS Data Tier Architecture - Use the right tool for the job! Data Tier Amazon ElastiCache Amazon CloudSearch Amazon Elastic MapReduce Amazon S3 Amazon Glacier Amazon DynamoDB Amazon RDS Amazon Redshift AWS Data Pipeline
  • 14. Use Case: A Video Streaming Application
  • 15. Use Case: A Video Streaming App – Upload Amazon CloudSearch Amazon RDS Amazon DynamoDB Amazon S3
  • 16. A Video Streaming App – Discovery CloudFront Amazon CloudSearch Amazon ElastiCache Amazon RDS X Amazon DynamoDB Amazon S3 Amazon Glacier
  • 17. Use Case: A Video Streaming App – Recs Amazon DynamoDB Amazon EMR Amazon S3 Amazon Glacier
  • 18. Use Case: A Video Streaming App – Analytics Amazon EMR Amazon S3 Amazon Redshift Amazon Glacier
  • 19. What is the temperature of your data?
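  • 20. Data Characteristics: Hot, Warm, Cold
      – Volume: MB–GB (hot), GB–TB (warm), PB (cold)
      – Item size: B–KB (hot), KB–MB (warm), KB–TB (cold)
      – Latency: ms (hot); ms, sec (warm); min, hrs (cold)
      – Durability: Low–High (hot), High (warm), Very High (cold)
      – Request rate: Very High (hot), High (warm), Low (cold)
      – Cost/GB: $$-$ (hot), $-¢¢ (warm), ¢ (cold)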
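  • 22. What data store should I use? (services ordered from hot data to cold data)
      – Amazon ElastiCache: average latency ms; data volume GB; item size B-KB; request rate Very High; storage cost $$ per GB/month; durability Low–Moderate
      – Amazon DynamoDB: average latency ms; data volume GB–TBs (no limit); item size KB (64 KB max); request rate Very High; storage cost ¢¢; durability Very High
      – Amazon RDS: average latency ms, sec; data volume GB–TB (3 TB max); item size KB (~row size); request rate High; storage cost ¢¢; durability High
      – Amazon CloudSearch: average latency ms, sec; data volume GB–TB; item size KB (1 MB max); request rate High; storage cost $; durability High
      – Amazon Redshift: average latency sec, min; data volume TB–PB (1.6 PB max); item size KB (64 K max); request rate Low; storage cost ¢; durability High
      – Amazon EMR (Hive): average latency sec, min, hrs; data volume GB–PB (~nodes); item size KB-MB; request rate Low; storage cost ¢; durability High
      – Amazon S3: average latency ms, sec, min (~size); data volume GB–PB (no limit); item size KB-GB (5 TB max); request rate Low–Very High (no limit); storage cost ¢; durability Very High
      – Amazon Glacier: average latency hrs; data volume GB–PB (no limit); item size GB (40 TB max); request rate Very Low (no limit); storage cost ¢; durability Very High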
  • 23. AWS Data Tier Architecture - Use the right tool for the job! Data Tier Amazon ElastiCache Amazon CloudSearch Amazon Elastic MapReduce Amazon S3 Amazon Glacier Amazon DynamoDB Amazon RDS Amazon Redshift AWS Data Pipeline
  • 25. Cost Conscious Design Example: Should I use Amazon S3 or Amazon DynamoDB? “I’m currently scoping out a project that will greatly increase my team’s use of Amazon S3. Hoping you could answer some questions. The current iteration of the design calls for many small files, perhaps up to a billion during peak. The total size would be on the order of 1.5 TB per month…” Request rate: 300 writes/sec; object size: 2,048 bytes; total size: 1,483 GB/month; objects per month: 777,600,000
  • 26. Cost Conscious Design Example: Should I use Amazon S3 or Amazon DynamoDB?
  • 27. Amazon S3 or Amazon DynamoDB? Request rate: 300 writes/sec; object size: 2,048 bytes; total size: 1,483 GB/month; objects per month: 777,600,000
  • 28. Scenario 1: request rate 300 writes/sec; object size 2,048 bytes; total size 1,483 GB/month; objects per month 777,600,000; use Amazon DynamoDB. Scenario 2: request rate 300 writes/sec; object size 32,768 bytes; total size 23,730 GB/month; objects per month 777,600,000; use Amazon S3.
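The figures on slides 25 to 28 can be re-derived from the request rate and object size alone. The sketch below is only a check of that arithmetic, assuming a 30-day month and that the slides' "GB" means 1024^3 bytes (both assumptions, chosen because they reproduce the slide numbers); the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30          # assumption: a 30-day month matches the slide figures
BYTES_PER_GB = 1024 ** 3     # assumption: the slides' "GB" is a binary gigabyte


def objects_per_month(writes_per_sec: int) -> int:
    """Number of objects written in one month at a steady write rate."""
    return writes_per_sec * SECONDS_PER_DAY * DAYS_PER_MONTH


def total_gb_per_month(writes_per_sec: int, object_bytes: int) -> float:
    """Data volume added per month, in GB, for fixed-size objects."""
    return objects_per_month(writes_per_sec) * object_bytes / BYTES_PER_GB


if __name__ == "__main__":
    for name, size in (("Scenario 1", 2_048), ("Scenario 2", 32_768)):
        print(name, objects_per_month(300), round(total_gb_per_month(300, size)))
```

Both scenarios write the same 777,600,000 objects per month; only the object size, and therefore the stored volume, differs between them.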
  • 30. Amazon RDS When to use: • Transactions • Complex queries • Medium to high query/write rate – Up to 30 K IOPS (15 K reads + 15 K writes) • 100s of GB to low TBs • Workload can fit in a single node • High durability When not to use: • Massive read/write rates – Example: 150 K write requests per second • Data size or throughput demands sharding – Example: 10s or 100s of terabytes • Simple Get/Put and queries that a NoSQL store can handle • Complex analytics [Diagram: push-button scaling; Multi-AZ across AZ 1 and AZ 2 in a region; read replicas]
  • 31. Amazon RDS Best Practices • Use the right DB instance class • Use EBS-optimized instances – db.m1.large, db.m1.xlarge, db.m2.2xlarge, db.m2.4xlarge, db.cr1.8xlarge • Use provisioned IOPS • Use multi-AZ for high availability • Use read replicas for – Scaling reads – Schema changes – Additional failure recovery
  • 32. Amazon DynamoDB When to use: • Fast and predictable performance • Seamless/massive scale • Autosharding • Consistent/low latency • No size or throughput limits • Very high durability • Key-value or simple queries When not to use: • Need multi-item/row or cross table transactions • Need complex queries, joins • Need real-time analytics on historic data • Storing cold data
  • 33. Amazon DynamoDB Best Practices • Keep item size small • Store metadata in Amazon DynamoDB and large blobs in Amazon S3 • Use a table with a hash key for extremely high scale • Use table per day, week, month etc. for storing time series data • Use conditional/OCC updates • Use hash-range key to model – 1:N relationships – Multi-tenancy • Avoid hot keys and hot partitions [Diagram: tables Events_table_2012, Events_table_2012_05_week1, Events_table_2012_05_week2 and Events_table_2012_05_week3, each with Event_id (hash key), Timestamp (range key), Attribute1 … Attribute N]
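The table-per-period advice on slide 33 implies that the application routes each write to a table chosen from the event timestamp. A minimal sketch of that routing follows, assuming "week N" means the Nth block of seven days within the month (an assumption; the slide does not define it). `weekly_table_name` is a hypothetical helper, not part of any AWS SDK.

```python
from datetime import datetime


def weekly_table_name(ts: datetime, base: str = "Events_table") -> str:
    """Return a per-week table name in the style shown on the slide.

    Assumption: days 1-7 are week1, days 8-14 are week2, and so on.
    """
    week_of_month = (ts.day - 1) // 7 + 1
    return f"{base}_{ts.year}_{ts.month:02d}_week{week_of_month}"


if __name__ == "__main__":
    print(weekly_table_name(datetime(2012, 5, 3)))
    print(weekly_table_name(datetime(2012, 5, 17)))
```

Keeping the naming rule in one function makes it straightforward to find, and later drop or archive, the table for an old period.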
  • 34. Amazon ElastiCache (Memcached) When to use: • Transient key-value store • Need to speed up reads/writes • Caching frequent SQL, NoSQL or DW query results • Saving transient and frequently updated data – Increment/decrement game scores/counters – Web application session storage • Best effort deduplication When not to use: • Store infrequently used data • Need persistence
  • 35. Amazon ElastiCache (Memcached) Best Practices • Use autodiscovery • Share memcached client objects in application • Use TTLs • Consider memory for connections overhead • Use Amazon CloudWatch alarms / SNS alerts – Number of connections – Swap memory usage – Freeable memory
  • 36. Amazon ElastiCache (Redis) When to use: • Key-value store with advanced data structures – Strings, lists, sets, sorted sets, hashes • Caching • Leader boards • High-speed sorting • Atomic counters • Queuing systems • Activity streams When not to use: • Need “native” sharding or scale-out • Need “hard” persistence • Data won’t fit in memory • Need transaction rollback even under exceptions
  • 37. Amazon ElastiCache (Redis) Best Practices • Use TTL • Use the right instance types – Instances with high ECU/vCPU and network performance yield the highest throughput. Example: m2.4xlarge, m2.2xlarge • Use read replicas – Increase read throughput – AOF cannot protect against all failure modes – Promote read replicas to primary • Use RDB file snapshot for on-premises to Amazon ElastiCache migration • Key parameter group settings – Avoid “AOF with fsync always”: huge impact on performance – AOF (+ RDB) with fsync everysec: best durability + performance – Pub-sub: set client-output-buffer-limit-pubsub-hard-limit and client-output-buffer-limit-pubsub-soft-limit based on the workloads
  • 38. Amazon CloudSearch When to use: • No search expertise • Full-text search • Ranking • Relevance • Structured and unstructured data • Faceting – $0 to $10 (4 items) – $10 and above (3 items) When not to use: • Not as replacement for a database – Not as a system of record – Transient data – Nonatomic updates
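Slides 34 and 35 describe caching frequent query results and using TTLs. The sketch below illustrates that cache-aside pattern with an in-process dictionary standing in for a cache node; `TTLCache` and `get_or_load` are hypothetical names, and nothing here talks to ElastiCache or any Memcached client.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a cache node: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


def get_or_load(cache: TTLCache, key: str, loader: Callable[[str], Any]) -> Any:
    """Cache-aside read: serve from cache, else load from the source and cache it.

    A cached value of None is treated as a miss in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = loader(key)
        cache.set(key, value)
    return value
```

Here `loader` would be the expensive SQL, NoSQL or DW query; the TTL bounds how stale a cached result can get before the next read refreshes it.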
  • 39. Amazon CloudSearch Best Practices • Batch documents for uploading • Use Amazon CloudSearch for searching and another store for retrieving full records for the UI (i.e. don’t use return fields) • Include other data like popularity scores in documents • Use stop words to remove common terms • Use fielded queries to reduce match sets • Query latency is proportional to query specificity
  • 40. Amazon Redshift When to use: • Information analysis and reporting • Complex DW queries that summarize historical data • Batched large updates e.g. daily sales totals • 10s of concurrent queries • 100s GB to PB • Compression • Column based • Very high durability When not to use: • OLTP workloads – 1000s of concurrent users – Large number of singleton updates
  • 41. Amazon Redshift Best Practices • Use COPY command to load large data sets from Amazon S3, Amazon DynamoDB, Amazon EMR/EC2/Unix/Linux hosts – Split your data into multiple files – Use GZIP or LZOP compression – Use manifest file • Choose proper sort key – Range or equality on WHERE clause • Choose proper distribution key – Join column, foreign key or largest dimension, group by column – Avoid distribution key for denormalized data
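Slide 41 recommends splitting the load into multiple files and using a manifest file. A manifest is a small JSON document that lists each file, and the sketch below builds one in that shape. The bucket name and key pattern are made-up placeholders, and `build_copy_manifest` is a hypothetical helper rather than an AWS API.

```python
import json
from typing import Iterable


def build_copy_manifest(bucket: str, keys: Iterable[str]) -> str:
    """Build a COPY manifest listing every file part of one load.

    "mandatory": True asks the load to fail if a listed file is missing.
    """
    entries = [{"url": f"s3://{bucket}/{key}", "mandatory": True} for key in keys]
    return json.dumps({"entries": entries}, indent=2)


if __name__ == "__main__":
    # Hypothetical bucket and keys: four gzip-compressed parts of one data set.
    parts = [f"sales/2013-11-14/part-{i:04d}.gz" for i in range(4)]
    print(build_copy_manifest("example-bucket", parts))
```

The resulting document would be uploaded to Amazon S3 and referenced by a COPY statement that uses the MANIFEST and GZIP options, so that exactly the listed parts are loaded in parallel.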
  • 42. Amazon Elastic MapReduce When to use: • Batch analytics/processing – Answers in minutes or hours • Structured and unstructured data • Parallel scans of the entire dataset with uniform query performance • Supports Hive QL + other languages • GB, TB, or PB of data • Replicated data store (HDFS) for ad-hoc and real-time queries (HBase) When not to use: • Real-time analytics (DW) – Need answers in seconds • 1000s of concurrent users
  • 43. Amazon Elastic MapReduce Best Practices • Choose between transient and persistent clusters for best TCO • Leverage Amazon S3 integration for highly durable and interim storage • Right-size cluster instances based on each job – not one size fits all • Leverage resizing and spot to add and remove capacity cost-effectively • Tuning cluster instances can be easier than tuning Hadoop code [Figure: job flow duration of 14 hours compared with 7 hours]
  • 44. AWS Data Pipeline When to use: • Automate movement and transformation of data (ETL in the cloud) • Dependency management – Data – Control • Schedule management • Transient Amazon EMR clusters • Regular data move pattern – Every hour, day – Every 30 minutes • Amazon DynamoDB backups – Cross region When not to use: • Less than 15 minutes scheduling interval • Execution latency less than a minute • Event-based scheduling
  • 45. AWS Data Pipeline Best Practices • Use dependency rather than time based • Make your activities idempotent • Add in your tools using shell activity • Use Amazon S3 for staging
  • 46. Amazon S3 When to use: • Store large objects • Key-value store - Get/Put/List • Unlimited storage • Versioning • Very high durability – 99.999999999% • Very high throughput (via parallel clients) • Use for storing persistent data – Backups – Source/target for EMR – Blob store with metadata in SQL or NoSQL When not to use: • Complex queries • Very low latency (ms) • Search • Read-after-write consistency for overwrites • Need transactions
  • 47. Amazon S3 Best Practices • Use random hash prefix for keys • Ensure a random access pattern • Use Amazon CloudFront for high throughput GETs and PUTs • Leverage the high durability, high throughput design of Amazon S3 for backup and as a common storage sink – Durable sink between data services – Supports de-coupling and asynchronous delivery • Consider RRS for lower cost, lower durability storage of derivatives or copies • Consider parallel threads and multipart upload for faster writes • Consider parallel threads and range get for faster reads
  • 48. Amazon Glacier When to use: • Infrequently accessed data sets • Very low cost storage • Data retrieval times of several hours are acceptable • Encryption at rest • Very high durability – 99.999999999% • Unlimited amount of storage When not to use: • Frequent access • Low latency access
  • 49. Amazon Glacier Best Practices • Reduce request and storage costs with aggregation – Aggregate your files into bigger files before sending them to Amazon Glacier – Store checksums along with your files – Use a format that allows you to access files within your aggregate archive • Improve speed and reliability with multipart upload • Reduce costs with ranged retrievals • Maintain your own index in a highly durable store
  • 50. Amazon EC2 + Amazon EBS/Instance Storage When to use: • Alternate data store technologies • Hand-tuned performance needs • Direct/admin access required When not to use: • When a managed service will do the job • When operational experience is low
  • 51. Amazon EBS Best Practices • Pick the right EC2 instance type – Higher “network performance” instances for driving more Amazon EBS IOPS – EBS-Optimized EC2 instances for dedicated throughput between EC2 & Amazon EBS • Use provisioned IOPS volumes for database workloads requiring consistent IOPS • Use standard volumes for workloads requiring low to moderate IOPS & occasional bursts • Stripe multiple Amazon EBS volumes for higher IOPS or storage – RAID0 for higher I/O – RAID10 for highest local durability • Amazon EBS snapshots – Quiesce the file system and take a snapshot
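The "random hash prefix" advice on slide 47 can be met without true randomness: deriving the prefix from a hash of the natural key spreads keys across the key space while keeping the mapping reproducible. The sketch below shows one way to do that. SHA-256 and a four-character prefix are arbitrary choices made for this illustration, and `hashed_key` is a hypothetical helper.

```python
import hashlib


def hashed_key(natural_key: str, prefix_len: int = 4) -> str:
    """Prepend a short hex prefix derived from the key itself.

    The same natural key always maps to the same object key, so readers
    can recompute the full key without a lookup table.
    """
    digest = hashlib.sha256(natural_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{natural_key}"


if __name__ == "__main__":
    for name in ("2013/11/14/video-0001.mp4", "2013/11/14/video-0002.mp4"):
        print(hashed_key(name))
```

The trade-off is that listing objects in natural-key order no longer works, so any date-ordered index has to live elsewhere, for example in the SQL or NoSQL metadata store that slide 46 mentions.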
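Slide 49 combines three ideas: aggregate small files, keep checksums, and keep an index so that a single file can be pulled out of the aggregate. The sketch below shows them with a deliberately simple, hypothetical format (plain concatenation plus an offset index); it makes no Glacier calls and does not model any alignment rules the service applies to ranged retrievals.

```python
import hashlib
from typing import Dict, List, Tuple


def aggregate(files: Dict[str, bytes]) -> Tuple[bytes, List[dict]]:
    """Concatenate files into one blob and build an index of where each one sits."""
    blob = bytearray()
    index: List[dict] = []
    for name, data in files.items():
        index.append({
            "name": name,
            "offset": len(blob),                        # first byte within the blob
            "length": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),  # checksum stored with the file
        })
        blob.extend(data)
    return bytes(blob), index


def extract(blob: bytes, entry: dict) -> bytes:
    """Slice one file out of the blob and verify it against its checksum."""
    start = entry["offset"]
    data = blob[start:start + entry["length"]]
    if hashlib.sha256(data).hexdigest() != entry["sha256"]:
        raise ValueError(f"checksum mismatch for {entry['name']}")
    return data
```

The blob is what would be archived; the index is what slide 49 suggests keeping in a separate, highly durable store, because it says which byte range to request for a given file.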
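For the striping advice on slide 51, a rough volume count can be estimated from the per-volume limits quoted on slide 10 (1 TB and up to 4000 IOPS per provisioned IOPS volume). The sketch below assumes that a RAID0 stripe scales IOPS and capacity linearly and that the instance itself is not the bottleneck; both are idealizations, and `volumes_for_target` is a hypothetical helper.

```python
import math


def volumes_for_target(target_iops: int, target_gb: int,
                       iops_per_volume: int = 4000,
                       gb_per_volume: int = 1024) -> int:
    """Estimate how many volumes a RAID0 stripe needs.

    Defaults are the per-volume figures quoted on slide 10. Assumes ideal
    linear scaling, which real workloads may not reach.
    """
    by_iops = math.ceil(target_iops / iops_per_volume)
    by_size = math.ceil(target_gb / gb_per_volume)
    return max(by_iops, by_size, 1)


if __name__ == "__main__":
    print(volumes_for_target(target_iops=10_000, target_gb=500))
    print(volumes_for_target(target_iops=2_000, target_gb=3_000))
```

Whichever of the two limits is hit first sets the count: the first call is bound by IOPS and the second by capacity.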
  • 52. Amazon EC2 Best Practices [Chart labels: HI: best IOPS/$; HS: best GB/$; best vCPU/$; best memory GiB/$]
  • 54. Cloud Data Tier Architecture Anti-Pattern Data Tier
  • 55. AWS Data Tier Architecture - Use the right tool for the job! Data Tier Amazon ElastiCache Amazon CloudSearch Amazon Elastic MapReduce Amazon S3 Amazon Glacier Amazon DynamoDB Amazon RDS Amazon Redshift AWS Data Pipeline
  • 58. Please give us your feedback on this presentation DAT203 As a thank you, we will select prize winners daily for completed surveys!