Cloud databases in Amazon
Web Services
Roman Gomolko
roman@userreport.com
October 2015
Ciklum Speakers Corner
Let’s get acquainted
UserReport
Developing products that help you learn about your audience
Started using AWS more than 5 years ago
Fully migrated to AWS more than 1.5 years ago
Processing 3 billion requests monthly
Generating batched reports based on 8 billion requests
Online reports on 300 million records
Using ~50% of the services provided by AWS
Totally happy with AWS
A database is an organized collection of data
RDS
Relational Databases hosted and maintained by Amazon
Different Engines & Editions & Versions
Captain Obvious’s notes
● RDS doesn’t host a particular database; it hosts an RDBMS
● Create your root user, then create separate users for each database/application
● Your instance is firewalled with security groups
● Advanced configuration is available through parameter groups
Multi-AZ deployments for production workloads
● SLA 99.95% monthly uptime
● Doubles the price
● Lets you maintain your database without downtime
○ Minor updates
○ Major updates
○ Disk resize
○ EC2 upgrade
● No support for MS SQL Web, Express, Standard
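To put the 99.95% figure in perspective, here is a small sketch that converts a monthly uptime percentage into allowed downtime. The 30-day month is an assumption made for the arithmetic, not part of the SLA wording.

```python
# Allowed downtime implied by a monthly uptime percentage.
# Assumption: a 30-day month; real months vary in length.

def allowed_downtime_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Return how many minutes of downtime still satisfy the given uptime percentage."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


multi_az = allowed_downtime_minutes(99.95)
print(f"99.95% uptime allows about {multi_az:.1f} minutes of downtime per month")
```

In other words, the SLA still leaves room for roughly twenty minutes of unavailability in a month.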
Pricing
RDS price = EC2 + EBS + license
On-Demand or Reserved purchases with up-front payment
Backups
● Automated with automated rotation
● Point-in-time restore
● A restore creates a new instance and deploys the desired version. It takes a while
● Manual backup via Snapshots
Advanced optimizations
● Read replicas
○ You can create highly available read-only copies of your data on the fly
● Use ElastiCache for a performance boost
○ Caching results in Memcached can massively speed up repeated queries
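One common way to use a cache in front of a relational database is the cache-aside pattern. The sketch below is only an illustration: a plain dictionary stands in for Memcached, and `CacheAside`, `slow_query`, and the 60-second TTL are hypothetical names and values, not anything from ElastiCache itself.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Check the cache first and fall back to the database only on a miss."""

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float = 60.0):
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        # A dict stands in for Memcached: key -> (expiry time, cached value).
        self._store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            self.hits += 1
            return entry[1]
        # Miss or expired entry: query the database and remember the result.
        self.misses += 1
        value = self._load_from_db(key)
        self._store[key] = (time.monotonic() + self._ttl, value)
        return value


db_calls = []


def slow_query(key: str) -> str:
    """Hypothetical expensive database query; records each call for the demo."""
    db_calls.append(key)
    return f"row-for-{key}"


cache = CacheAside(slow_query)
first = cache.get("user:1")
second = cache.get("user:1")
print(first, second, len(db_calls))
```

The second lookup is served from the cache, so the database is queried only once. The price is staleness: a cached value can lag behind the database until its TTL expires.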
Downsides
● No control over EC2 for very advanced optimizations
● Backups work at the instance level
○ One RDS per DB
○ Or custom backups
● No Active Directory integration
● No Cross-region replication
Aurora
MySQL-compatible database by Amazon, designed with the cloud in mind
Aurora
Available and Durable
Amazon Aurora is designed to offer greater than 99.99% availability,
replicating 6 copies of data across 3 Availability Zones and backing up
data continuously to Amazon S3. Recovery from physical storage failures
is transparent and instance restarts typically require less than a minute.
Aurora
Highly Scalable
You can use Amazon RDS to scale your Amazon Aurora database
instance up to 32 vCPUs and 244GiB Memory. You can also add up to 15
Amazon Aurora Replicas across three availability zones to further scale
read capacity. Amazon Aurora automatically grows storage as needed,
from 10GB up to 64TB.
DynamoDB
Document database with biscuits by Amazon
DynamoDB overview
● Operates with tables
● A table definition consists of
○ key (required)
○ sort (range) key (optional)
○ indexes (optional)
● Table contains items
● An item is described by
○ key
DynamoDB item overview
● Max 400 KB (the limit was 64 KB until October 2014)
● Unlimited number of attributes
● Attribute types
○ string
○ string set
○ number
○ number set
○ binary
DynamoDB operations
● Put - insert or update
● Get
● Delete
● Scan
● Query
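The difference between these operations, Query versus Scan in particular, can be sketched with a toy in-memory table. This is not the DynamoDB API: `ToyTable` and the `user_id`, `ts`, and `page` attributes are hypothetical, and a real table finds the items for a key through its index rather than by looping over a dictionary.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Item = Dict[str, Any]


class ToyTable:
    """In-memory stand-in for a table with a hash key and an optional range key."""

    def __init__(self, hash_key: str, range_key: Optional[str] = None):
        self.hash_key = hash_key
        self.range_key = range_key
        self._items: Dict[Tuple[Any, Any], Item] = {}

    def _primary_key(self, item: Item) -> Tuple[Any, Any]:
        range_value = item.get(self.range_key) if self.range_key else None
        return (item[self.hash_key], range_value)

    def put(self, item: Item) -> None:
        # Put inserts a new item or replaces an existing one with the same key.
        self._items[self._primary_key(item)] = dict(item)

    def get(self, hash_value: Any, range_value: Any = None) -> Optional[Item]:
        return self._items.get((hash_value, range_value))

    def delete(self, hash_value: Any, range_value: Any = None) -> None:
        self._items.pop((hash_value, range_value), None)

    def query(self, hash_value: Any) -> List[Item]:
        # Query returns only the items sharing one hash key, ordered by range key.
        matches = [i for (h, _), i in self._items.items() if h == hash_value]
        if self.range_key:
            matches.sort(key=lambda i: i[self.range_key])
        return matches

    def scan(self, predicate: Callable[[Item], bool] = lambda item: True) -> List[Item]:
        # Scan reads every item and filters afterwards, so its cost grows with table size.
        return [i for i in self._items.values() if predicate(i)]


visits = ToyTable(hash_key="user_id", range_key="ts")
visits.put({"user_id": "u1", "ts": 2, "page": "/pricing"})
visits.put({"user_id": "u1", "ts": 1, "page": "/"})
visits.put({"user_id": "u2", "ts": 1, "page": "/"})
visits.put({"user_id": "u2", "ts": 1, "page": "/docs"})  # same key, so this is an update

u1_pages = [i["page"] for i in visits.query("u1")]
home_views = visits.scan(lambda i: i["page"] == "/")
print(u1_pages, len(home_views))
```

The takeaway for table design: choose keys so that frequent reads can be expressed as Get or Query, and keep Scan for occasional jobs.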
Demo time
DynamoDB show-case
DynamoDB performance
● You provision read and write capacity
● A DynamoDB table is divided into partitions (shards). Each partition has the following limits:
○ 10 GB of data
○ 3000 Read Capacity Units
○ 1000 Write Capacity Units
● Your requests can be throttled (the AWS SDKs take care of retry logic in most cases)
● You can set up autoscaling for DynamoDB
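When a request is throttled, the usual client-side answer is to retry with exponential backoff and jitter. The sketch below shows the general technique, not the SDKs' actual implementation; `Throttled`, `call_with_backoff`, and the delay values are hypothetical.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class Throttled(Exception):
    """Hypothetical error standing in for a 'throughput exceeded' response."""


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.05,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Retry a throttled call, waiting a random time below an exponentially growing cap."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        try:
            return operation()
        except Throttled:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng.uniform(0.0, cap))  # jitter spreads out retries from many clients
    raise AssertionError("unreachable")


attempts = {"count": 0}


def flaky_put() -> str:
    """Fails twice with Throttled, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise Throttled()
    return "ok"


delays = []  # record the waits instead of actually sleeping in the demo
result = call_with_backoff(flaky_put, sleep=delays.append, rng=random.Random(0))
print(result, attempts["count"], len(delays))
```

Retries only smooth over short bursts. Sustained throttling means the provisioned capacity or the key distribution needs attention.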
DynamoDB Streams
● Triggers on data changes
● Cross-region replication
● Elasticsearch integration for searching your data
https://aws.amazon.com/blogs/aws/new-logstash-plugin-search-dynamodb-content-using-elasticsearch/
Backups and maintenance
● All data is replicated on three nodes, so no backups are needed for durability
● Changing provisioned throughput does not degrade performance
● You can set up autoscaling for DynamoDB
https://github.com/sebdah/dynamic-dynamodb
*hit happens
DynamoDB had a massive outage (high error rate on API requests) in N. Virginia
that affected:
● SQS
● CloudWatch
● AutoScale Groups
● SNS
https://aws.amazon.com/message/5467D2/
Application design best practices
ElastiCache
A key-value store is also a database
Redis
● Extremely fast in-memory database
● Different data structures
○ Sets
○ Lists
○ Sorted sets
○ HyperLogLog
○ Hashes
○ Geo data
Redis hosted in AWS
● Different versions supported
● Multi-AZ master/slave configuration maintained by Amazon
● Automated backups
● Monitoring with CloudWatch
● No way to patch Redis for your own needs (geeks like custom operations)
Example 1. Calculating unique visitors
PFADD visitors.20151001 xxx
PFCOUNT visitors.20151001
INCR pageviews.20151001
GET pageviews.20151001
Example 2. Working with sets
# users 1 and 2 add an item to the cart
SADD added_item_to_cart id1
SADD added_item_to_cart id2
SADD begin_checkout id1
# users who haven’t begun checkout
SDIFFSTORE no_checkout added_item_to_cart begin_checkout
# users with a known email who haven’t started checkout
SINTER known_email no_checkout
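The same set algebra can be checked with plain Python sets. This is only a model of what the Redis commands compute; the contents of `known_email` are made up here, because the slide never populates that set.

```python
# Python sets mirroring the Redis keys from the example above.
added_item_to_cart = {"id1", "id2"}
begin_checkout = {"id1"}
known_email = {"id2", "id3"}  # hypothetical contents, not from the slide

# SDIFFSTORE no_checkout added_item_to_cart begin_checkout
no_checkout = added_item_to_cart - begin_checkout

# SINTER known_email no_checkout
reachable = known_email & no_checkout

print(sorted(no_checkout), sorted(reachable))
```

With this data, user `id2` added an item but never started checkout and has a known email, so that user is the one to contact.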
Example 3. Top scored users
ZADD gamescore 1 user1
ZADD gamescore 4 user2
ZADD gamescore 2 user3
ZREVRANGE gamescore 0 9
user2
user3
user1
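To see why the output comes back in that order, here is a rough Python model of the two sorted-set commands. The `zadd` and `zrevrange` functions are hypothetical helpers that mimic only the behaviour used here; negative indexes and the other Redis options are not handled.

```python
from typing import Dict, List


def zadd(board: Dict[str, float], score: float, member: str) -> None:
    """Set the member's score, adding the member if it is new."""
    board[member] = score


def zrevrange(board: Dict[str, float], start: int, stop: int) -> List[str]:
    """Return members ranked from highest to lowest score; stop is inclusive."""
    ranked = sorted(board, key=lambda m: (board[m], m), reverse=True)
    return ranked[start:stop + 1]


gamescore: Dict[str, float] = {}
zadd(gamescore, 1, "user1")
zadd(gamescore, 4, "user2")
zadd(gamescore, 2, "user3")

top10 = zrevrange(gamescore, 0, 9)
print(top10)
```

Asking for ranks 0 to 9 returns at most ten members, so with three players the whole board comes back.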
Learn more
Redshift
It’s like PostgreSQL but for petabytes
Redshift
● Multiple-node cluster deployment that scales up to petabytes
● About $1000 per TB per year
● Good for data mining
● Queries can take minutes or hours
Table design
● Distribution key (DISTKEY) - how data is distributed across nodes
● Sort key (SORTKEY) - how data is sorted within a node
● Primary keys, foreign keys, and constraints are not enforced; they are only hints to the query optimizer
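The two keys can be pictured with a small sketch: rows are assigned to a node by hashing one column, then kept sorted by another column inside each node. The CRC32 hash, the `place_rows` helper, and the sample rows are assumptions for illustration; Redshift's internal hashing and storage layout are different.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, int, str]  # (user_id, event_time, page): hypothetical columns


def place_rows(rows: List[Row], node_count: int) -> Dict[int, List[Row]]:
    """Distribute rows by hashing user_id, then sort each node's rows by event_time."""
    nodes: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        # CRC32 is used only because it is deterministic across runs.
        node = zlib.crc32(row[0].encode("utf-8")) % node_count
        nodes[node].append(row)
    for node_rows in nodes.values():
        node_rows.sort(key=lambda r: r[1])
    return dict(nodes)


rows = [
    ("u1", 30, "/a"),
    ("u2", 10, "/b"),
    ("u1", 10, "/c"),
    ("u3", 20, "/d"),
    ("u2", 5, "/e"),
]
layout = place_rows(rows, node_count=2)
for node, node_rows in sorted(layout.items()):
    print(node, node_rows)
```

All rows for one user land on the same node, so a join or aggregation on `user_id` needs no data shuffling, and a time-range filter can skip most of a node's sorted rows. A skewed distribution key has the opposite effect, because one node ends up doing most of the work.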
Uploading data
● From CSV
● From DynamoDB
● From EMR
● Bulk insert
http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_command_examples.html
Loading data from S3
copy table
from 's3://mybucket/data/table.txt'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
csv [gzip] [delimiter '|'];
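Before a COPY like the one above can run, the file has to exist in S3. Below is a minimal sketch of producing a gzipped, pipe-delimited file with the Python standard library. The `to_gzipped_csv` helper and the sample rows are hypothetical, and the upload to S3 is left out.

```python
import csv
import gzip
import io
from typing import Iterable, Sequence


def to_gzipped_csv(rows: Iterable[Sequence[object]], delimiter: str = "|") -> bytes:
    """Serialize rows as delimiter-separated text and gzip the result."""
    text = io.StringIO()
    writer = csv.writer(text, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)  # fields containing the delimiter get double-quoted
    return gzip.compress(text.getvalue().encode("utf-8"))


payload = to_gzipped_csv([
    (1, "home", "2015-10-01"),
    (2, "pricing|eu", "2015-10-01"),  # contains the delimiter on purpose
])

# Round trip to check that the file reads back the way it was written.
restored = gzip.decompress(payload).decode("utf-8")
parsed = list(csv.reader(io.StringIO(restored), delimiter="|"))
print(restored)
```

Fields that contain the delimiter are wrapped in double quotes, which is why the COPY statement should use the csv option rather than a bare delimiter.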
Query Execution
● PostgreSQL compatible syntax with many disabled features
● No materialized views
● No stored procedures
● Scalar user-defined functions were added recently
● 10 parallel queries
Getting query results
unload ('select * from mytable')
to 's3://mybucket/unload/result/'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';
S3 + EMR
Why not query the files directly?
S3 as storage
● CSV
● JSON
● XML
● Parquet
EMR
EMR (Elastic MapReduce) can launch a cluster running
● Hadoop
● Spark
● Hive
● Presto
Distributed SQL Query Engine for Big Data
Demo time
The one-size-fits-all principle does not work here
"Cloud databases amazon web services" by Roman Gomolko
