Choosing The Right Database Service
with Japanese customer cases
Sangpil Kim, Solutions Architect, AWS Korea
Yutaka Hoshino, Solutions Architect, AWS Japan
Time : 13:00 – 13:30
AGENDA
Afternoon Sessions
1:00 pm – 1:30 pm   Choosing the Right Database Service   Sangpil Kim, Yutaka Hoshino
1:30 pm – 2:50 pm   Aurora Technical Deep Dive            KiWaon Kim
2:50 pm – 3:10 pm   Coffee Break
3:10 pm – 4:30 pm   DynamoDB for Developers               Andy Kim
4:30 pm – 5:50 pm   Redshift Deep Dive                    Kevin Kim
5:50 pm – 6:00 pm   Wrap-up and Closing
Agenda
• Amazon Managed Database Services
• Choosing the Right Database Service
• Useful insights from Japanese database service customer cases
  - Aurora
If you host your databases on-premises
You are responsible for:
• Power, HVAC, net
• Rack and stack
• Server maintenance
• OS installation
• OS patches
• DB s/w installs
• DB s/w patches
• Database backups
• Scaling
• High availability
• App optimization
If you host your databases in Amazon EC2
You are responsible for:
• OS patches
• DB s/w patches
• Database backups
• Scaling
• High availability
• DB s/w installs
• App optimization
AWS handles:
• Power, HVAC, net
• Rack and stack
• Server maintenance
• OS installation
If you choose a managed DB service
You are responsible for:
• App optimization
AWS handles:
• Power, HVAC, net
• Rack and stack
• Server maintenance
• OS installation
• OS patches
• DB s/w installs
• DB s/w patches
• Database backups
• High availability
• Scaling
Quick summary of the options
• Self-Managed - You are responsible for the hardware, OS, security, updates, backups, replication, etc., but you have full control over all of it.
• EC2 Instances - You only need to focus on database-level updates, patches, replication, backups, etc., and don’t have to worry about the hardware or the OS installation.
• Fully Managed - You get features such as backup and replication as a packaged service and don’t have to bother with patching and updates.
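The three options above can be read as a split of the same task list between you and AWS. The following is a minimal sketch of that split; the names `ALL_TASKS`, `AWS_HANDLES`, and `your_tasks` are made up for illustration, and the "managed" entry assumes that only app optimization stays with you, which is one reading of the slides rather than an official definition.

```python
# Tasks listed on the slides.
ALL_TASKS = [
    "Power, HVAC, net", "Rack and stack", "Server maintenance",
    "OS installation", "OS patches", "DB s/w installs", "DB s/w patches",
    "Database backups", "Scaling", "High availability", "App optimization",
]

# Tasks AWS takes over under each hosting option (assumed reading of the slides).
AWS_HANDLES = {
    "on-premises": set(),
    "ec2": {"Power, HVAC, net", "Rack and stack",
            "Server maintenance", "OS installation"},
    "managed": set(ALL_TASKS) - {"App optimization"},
}


def your_tasks(option):
    """Return the tasks that remain with you for a hosting option."""
    return [task for task in ALL_TASKS if task not in AWS_HANDLES[option]]


for option in AWS_HANDLES:
    remaining = your_tasks(option)
    print(f"{option}: {len(remaining)} task(s) left for you -> {remaining}")
```

Listing the remaining tasks per option makes the trade-off explicit: less control, but a shorter list of operational work.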
AWS Managed Database Options
Amazon DynamoDB
• NoSQL database
• Fully managed
• Single-digit millisecond latency
• Massive and seamless scalability
• Low cost
Amazon RDS
• Relational databases
• Fully managed
• Predictable performance
• Simple and fast to scale
• Low cost, pay for what you use
Aurora
Amazon ElastiCache
• In-memory key-value store
• High-performance
• Memcached and Redis
• Fully managed, zero admin
Amazon Redshift
• Relational data warehouse
• Massively parallel; petabyte scale
• Fully managed
• HDD and SSD platforms
• $1,000/TB/year; starts at $0.25/hour
Database + Search Tier Anti-pattern
[Diagram: Applications backed by a single RDBMS that serves as the combined Database + Search Tier]
Best Practice - Use the Right Tool for the Job
[Diagram: Applications on top of a Data Tier (Database + Search Tier) split into four kinds of stores]
• Search: Amazon Elasticsearch, Amazon CloudSearch
• Cache: Redis, Memcached
• SQL: Amazon Aurora, MySQL, PostgreSQL, Oracle, SQL Server
• NoSQL: Cassandra, Amazon DynamoDB, HBase, MongoDB
Materialized Views
What Data Store Should I Use?
• Data structure : Fixed schema / JSON / key-value
• Access patterns : Store data in the format you will access it
• Data / access characteristics : Hot / Warm / Cold
• Cost : Right cost
Data Structure and Access Patterns
Access Patterns: What to use?
• Put/Get (Key, Value): Cache, NoSQL
• Simple relationships → 1:N, M:N: NoSQL
• Cross table joins, transaction, SQL: SQL
• Faceting, Search: Search
Data Structure: What to use?
• Fixed schema: SQL, NoSQL
• Schema-free (JSON): NoSQL, Search
• Key, Value: Cache, NoSQL
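The two lookups above can be combined to narrow down candidate store types. This is a minimal sketch: the dictionary keys and the function `candidate_stores` are hypothetical names, and intersecting the two lists is an assumption made here for illustration, not a rule stated on the slides.

```python
# Store categories per access pattern, as listed on the slide.
ACCESS_PATTERN_TO_STORE = {
    "put/get": ["Cache", "NoSQL"],
    "simple relationships": ["NoSQL"],
    "joins/transactions/sql": ["SQL"],
    "faceting/search": ["Search"],
}

# Store categories per data structure, as listed on the slide.
DATA_STRUCTURE_TO_STORE = {
    "fixed schema": ["SQL", "NoSQL"],
    "schema-free": ["NoSQL", "Search"],
    "key-value": ["Cache", "NoSQL"],
}


def candidate_stores(access_pattern, data_structure):
    """Return store categories that fit both the access pattern and the data structure."""
    by_access = ACCESS_PATTERN_TO_STORE[access_pattern]
    by_structure = DATA_STRUCTURE_TO_STORE[data_structure]
    # Keep the order of the access-pattern list; drop anything the
    # data-structure list does not also allow.
    return [store for store in by_access if store in by_structure]


print(candidate_stores("put/get", "key-value"))
print(candidate_stores("joins/transactions/sql", "fixed schema"))
print(candidate_stores("faceting/search", "schema-free"))
```

An empty result (for example faceted search over a fixed schema) suggests the workload needs more than one store, which matches the "right tool for the job" advice.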
What Is the Temperature of Your Data / Access ?
Data / Access Characteristics: Hot, Warm, Cold
               Hot         Warm       Cold
Volume         MB–GB       GB–TB      PB
Item size      B–KB        KB–MB      KB–TB
Latency        ms          ms, sec    min, hrs
Durability     Low–High    High       Very High
Request rate   Very High   High       Low
Cost/GB        $$-$        $-¢¢       ¢
What Data Store Should I Use?
[Figure: Cache, NoSQL, SQL, Search, and Glacier placed along a spectrum from Hot Data through Warm Data to Cold Data. Axes: Request Rate (High to Low), Cost/GB (High to Low), Latency (Low to High), Data Volume (Low to High), Structure (Low / High).]
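What Data Store Should I Use?
(Services are ordered from Hot Data through Warm Data to Cold Data.)
Amazon ElastiCache
• Average latency: ms
• Data volume: GB
• Item size: B–KB
• Request rate: High – Very High
• Storage cost GB/month: $$
• Durability: Low – Moderate
Amazon DynamoDB
• Average latency: ms
• Data volume: GB–TBs (no limit)
• Item size: KB (400 KB max)
• Request rate: Very High (no limit)
• Storage cost GB/month: ¢¢
• Durability: Very High
Amazon Aurora
• Average latency: ms, sec
• Data volume: GB–TB (64 TB max)
• Item size: KB (64 KB)
• Request rate: High
• Storage cost GB/month: ¢¢
• Durability: Very High
Amazon Elasticsearch
• Average latency: ms, sec
• Data volume: GB–TB
• Item size: KB (1 MB max)
• Request rate: High
• Storage cost GB/month: ¢¢
• Durability: High
Amazon EMR (HDFS)
• Average latency: sec, min, hrs
• Data volume: GB–PB (~nodes)
• Item size: MB–GB
• Request rate: Low – Very High
• Storage cost GB/month: ¢
• Durability: High
Amazon S3
• Average latency: ms, sec, min (~ size)
• Data volume: MB–PB (no limit)
• Item size: KB–GB (5 TB max)
• Request rate: Low – Very High (no limit)
• Storage cost GB/month: ¢
• Durability: Very High
Amazon Glacier
• Average latency: hrs
• Data volume: GB–PB (no limit)
• Item size: GB (40 TB max)
• Request rate: Very Low
• Storage cost GB/month: ¢/10
• Durability: Very High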
Cost Conscious Design – Example : DynamoDB vs. S3?
“I’m currently scoping out a project that will greatly increase my team’s use of Amazon S3. Hoping you could answer some questions. The current iteration of the design calls for many small files, perhaps up to a billion during peak. The total size would be on the order of 1.5 TB per month…”
• Request rate (Writes/sec): 300
• Object size (Bytes): 2,048
• Total size (GB/month): 1,483
• Objects per month: 777,600,000
SIMPLE MONTHLY CALCULATOR: http://calculator.s3.amazonaws.com/index.html
Cost Conscious Design – Example : DynamoDB vs. S3?
             Request rate   Object size   Total size   Objects per month
             (Writes/sec)   (Bytes)       (GB/month)
Scenario 1   300            2,048         1,483        777,600,000
Scenario 2   300            32,768        23,730       777,600,000
[Figure: each scenario is labeled with the store to use, Amazon S3 or Amazon DynamoDB]
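The figures in the scenario table follow directly from the write rate and object size. The sketch below reproduces them, assuming a 30-day month and GB counted as 2^30 bytes (both assumptions that happen to match the slide's numbers); the function name `monthly_volume` is made up for illustration.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30          # assumption: the slide's numbers match a 30-day month
BYTES_PER_GB = 2 ** 30       # assumption: GB counted in binary units


def monthly_volume(writes_per_sec, object_bytes):
    """Return (objects per month, total size in GB per month)."""
    objects = writes_per_sec * SECONDS_PER_DAY * DAYS_PER_MONTH
    total_gb = objects * object_bytes / BYTES_PER_GB
    return objects, total_gb


for name, size in [("Scenario 1", 2_048), ("Scenario 2", 32_768)]:
    objects, total_gb = monthly_volume(300, size)
    print(f"{name}: {objects:,} objects/month, {total_gb:,.0f} GB/month")
```

With the object count fixed, only the object size changes between the scenarios, so the storage volume grows 16x while the number of requests stays the same. That shift between request-driven and storage-driven cost is what the DynamoDB vs. S3 comparison turns on.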
Amazon Aurora among Japanese Customers
~Lessons from Case Studies~
Yutaka Hoshino (Solutions Architect)
Amazon Web Services Japan K.K.
http://grani.jp/
Getting performance from Amazon Aurora
Before
• RDS for MySQL
• MultiAZ + Read Replica
• r3.4xlarge
After
• Amazon Aurora
• 2 nodes (1 writer / 1 reader)
• r3.4xlarge
Results
• Total: Aurora 3x faster than MySQL
• Update: 5x+ faster
• Insert: 2x+ faster
By consolidating databases, we could reduce the number of nodes and improve overall performance.
In Grani’s case, DB consolidation saved about $10.6K per year.
RDS (db.r3.4xlarge / gp2 / OnDemand)      Hourly                  Daily     Yearly
RDS for MySQL (MultiAZ + 1 ReadReplica)   $4.54 + $2.27 = $6.81   $163.44   $59,655.60
Aurora (+ 1 Replica)                      $2.8 * 2 = $5.6         $134.40   $49,056.00
Reduction                                 ▲$1.21                  ▲$29.04   ▲$10,599.60
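The cost table can be rechecked from the hourly prices alone. This is a minimal sketch using the hourly figures quoted on the slide; it assumes 24 hours per day and 365 days per year, and the helper name `costs` is made up for illustration.

```python
from decimal import Decimal

HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def costs(hourly):
    """Return (hourly, daily, yearly) cost for an hourly price."""
    daily = hourly * HOURS_PER_DAY
    yearly = daily * DAYS_PER_YEAR
    return hourly, daily, yearly


# Hourly prices as quoted on the slide (db.r3.4xlarge, on-demand).
mysql = costs(Decimal("4.54") + Decimal("2.27"))   # MultiAZ + 1 read replica
aurora = costs(Decimal("2.8") * 2)                 # writer + 1 replica

reduction = tuple(m - a for m, a in zip(mysql, aurora))

for label, row in [("RDS for MySQL", mysql), ("Aurora", aurora), ("Reduction", reduction)]:
    hourly, daily, yearly = row
    print(f"{label:14s} hourly ${hourly}  daily ${daily}  yearly ${yearly}")
```

`Decimal` is used instead of floats so the dollar amounts compare exactly with the table.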
http://reader.livedwango.com/
Consolidating complex databases to simple
• Live dwango reader (LDR) / LDR Pocket
  • Online RSS reader for PC and smartphone
  • 9-year-old service
  • Very old source code…
• A large amount of data in MySQL servers
• The old source code is difficult to touch
Consolidating complex databases to simple
[Diagram: 15 "DB on Instance" boxes consolidated into a single Aurora cluster]
• 5 shards to store article data
• Aurora will achieve 5x throughput
• Consolidate to a single Aurora cluster!!
• Before
  • Data size: 5 TB
  • EBS volumes: 500 GiB * 5, 800 GiB * 10
  • DBs: master * 7 instances, slave * 8 instances (all r3.large)
• After
  • Aurora (r3.xlarge) * 2 instances
  • No code change!!
• Cost reduction
  • $24,000/year (estimated)
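The before/after numbers can be tallied to show the scale of the consolidation. This is a small arithmetic sketch using only the counts on the slide; the variable names are made up for illustration.

```python
# "Before" figures from the slide.
ebs_volumes_gib = [500] * 5 + [800] * 10   # 500 GiB * 5, 800 GiB * 10
masters, slaves = 7, 8                     # all r3.large

# "After" figure from the slide.
aurora_instances = 2                       # r3.xlarge

instances_before = masters + slaves
total_ebs_gib = sum(ebs_volumes_gib)

print(f"Instances: {instances_before} -> {aurora_instances} "
      f"({instances_before - aurora_instances} fewer)")
print(f"Provisioned EBS before: {total_ebs_gib:,} GiB across "
      f"{len(ebs_volumes_gib)} volumes")
```

The 15 volumes line up with the 15 database instances, and the provisioned EBS capacity is roughly double the 5 TB of data the slide reports.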
The Mainichi Newspapers Co.,Ltd
http://mainichi.jp/english/
Wire compatibility with MySQL 5.6
• Full migration from on-prem to AWS
• All content data is stored in Aurora
• At first, they planned to use RDS MySQL
• Switched from RDS MySQL to Aurora two weeks before launch, without code or configuration changes
• Fast failover
• Stability
• Fault tolerance
• Cost reduction
[Figure: all content data is stored in Aurora]
• Good cost performance
• Reduced cost compared to RDS MySQL
  • RDS MySQL: # of instances * storage (provisioned IOPS and storage size set in advance)
  • Aurora: one Aurora cluster needs a single storage price (pay as you go)
• Wire compatibility with MySQL 5.6
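Wire compatibility is what makes a late switch like this possible: the application keeps the same MySQL client settings and only the endpoint changes. The sketch below illustrates that idea without opening a connection; the endpoint host names, user, and database name are hypothetical placeholders, and `connection_params` is a made-up helper rather than part of any driver.

```python
def connection_params(host):
    """Build MySQL client settings; only the host differs between targets."""
    return {
        "host": host,
        "port": 3306,            # default MySQL port
        "user": "app",           # hypothetical application user
        "database": "contents",  # hypothetical schema name
    }


# Hypothetical endpoints, for illustration only.
rds_mysql = connection_params("example-mysql.rds.example.com")
aurora = connection_params("example-aurora.cluster.example.com")

changed = [key for key in rds_mysql if rds_mysql[key] != aurora[key]]
print("Settings that change when switching to Aurora:", changed)
```

In a real application these settings would be passed to whichever MySQL 5.6 compatible driver is already in use.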
Thank you!

Choosing the Right Database Service (Sangpil Kim, Yutaka Hoshino) - AWS DB Day
