© 2014 CLUSTRIX © 2016 CLUSTRIX
Scaling RDBMS on AWS:
Strategies, Challenges, &
A Better Solution
Dave A. Anselmi @AnselmiDave
Director of Product Management
Clustrix
Database Landscape
[Diagram: database landscape]
•  Workload axis: Transactional / OLTP (High Concurrency / Write heavy / Real Time Analytics) vs. Analytics / OLAP (Historical Analytics / Exploratory)
•  Architecture axis: Scale-Up vs. Scale-Out
•  Systems shown: Traditional RDBMS, DW/Analytical DBMS, Hadoop, NoSQL, Scale-Out RDBMS (NewSQL)
RDBMS Scale-Out Dimensions
Resiliency
Capacity
Elasticity
Enterprise
RDBMS Scale
RDBMS Scale-Out Considerations
Relational Database Scaling Is Very Hard (c.f. “SQL Databases Don’t Scale”, 2006)
•  Data Consistency
•  Read vs. Write Scale
•  ACID Properties
•  Throughput and Latency
•  Application Impact
RDBMS Scale-Out Dimensions
Resiliency
Capacity
Elasticity
SCALE
§  Data, Users, Session
THROUGHPUT
§  Concurrency, Transactions
LATENCY
§  Response Time
The ‘Promise of the Cloud’ – Scaling RDBMS Up/Down like a Web Node
RDBMS
SCALING STRATEGIES
Scaling-Up: Reads + Writes
•  Keep increasing the size of the (single) database server
•  Pros
–  Simple, no application changes needed. ‘Click to Scale-up’ on AWS console
–  Best solution for Capacity, if it can handle your workload
•  Cons
–  Capacity Limit. Most clouds provide at most 36 ‘vcpu’s for a single server
–  Leaving the cloud to get a bigger server is expensive. Soon you are often paying 5x for 2x the performance
Eventually you ‘hit the wall’, and you literally cannot scale-up anymore
Scaling Reads: Master/Slave
•  Add a ‘Slave’ read-server(s) to your ‘Master’ database server
•  Pros
–  Simple to implement, lots of automation available. AWS has ‘Read Replicas’
–  Read/write fan-out can be done at the proxy level
•  Cons
–  Only adds Read performance, so it is best suited to read-heavy workloads
–  Data consistency issues can occur, especially if the application isn’t coded to
ensure read-consistency between Master & Slave (not an issue with RDS)
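A minimal sketch of the proxy-level read/write fan-out mentioned above, assuming a hypothetical `ReadWriteRouter` class and placeholder server names (none of these come from the slides). It sends plain reads to replicas in round-robin order and keeps writes, and reads inside a transaction, on the master to avoid the read-consistency problem listed under Cons.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: reads go to replicas, everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Cycle through replicas round-robin; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        # Simplification: any statement starting with SELECT counts as a read.
        # A real proxy would also have to handle cases like SELECT ... FOR UPDATE.
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Reads inside a transaction stay on the master so they can see
        # that transaction's own writes.
        if is_read and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM orders"))
print(router.route("UPDATE orders SET paid = 1"))
```

The sketch only adds read capacity: every write still lands on the single master, which is the limitation the slide points out.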
Scaling Reads + Writes: Master/Master
•  Add additional ‘Master’(s) to your ‘Master’ database server
•  Pros
–  Adds Reads + Write scaling without needing to shard
–  Depending on workload (e.g. non-serialized), scaling can approach linear
•  Cons
–  Adds Write scaling at the cost of read-slaves, which would add even more latency
–  Application changes are required to ensure data consistency / conflict resolution
–  AWS: Not available on RDS console; ‘roll-your-own’ with EC2
Examples: Master/Master Replication Solutions
•  Replication-based synchronous COMMIT solutions:
–  Galera (open-source library)
–  Percona XtraDB Cluster (leverages Galera replication library)
–  Tungsten
•  Pros
–  Good for High-Availability
–  Good for Read scaling
•  Cons
–  Provides variable Write scale, depending on workload
–  Replication has inherent potential consistency and latency issues.
High-transaction workloads such as OLTP (e.g. E-Commerce) are exactly the
workloads that replication struggles the most with
Scaling Reads & Writes: Horizontal (‘Regular’) Sharding
•  Partitioning tables across separate database servers
•  Pros
–  Adds both Read and Write scaling, depending on well-chosen sharding keys and low skew
–  Most common way to scale-out both Reads and Writes
•  Cons
–  Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID;
Application must ‘re-invent the wheel’
–  Consistent backups across all the shards are very hard to manage
–  Data management (skew/hotness) is significant ongoing maintenance
–  AWS: Not available on RDS console; ‘roll-your-own’ with EC2
[Diagram: rows range-partitioned by key across four shards]
•  SHARD01: A - K
•  SHARD02: L - O
•  SHARD03: P - S
•  SHARD04: T - Z
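The range layout in the diagram can be sketched as a small routing function. This is an illustration under assumptions: the function name `shard_for`, the shard labels, and the choice to route on the first letter of the key are hypothetical, and real sharding layers route on whatever sharding key the application supplies.

```python
import bisect

# Inclusive upper bound of each shard's first-letter range, as in the diagram:
# A-K, L-O, P-S, T-Z.
SHARD_UPPER_BOUNDS = ["K", "O", "S", "Z"]
SHARD_NAMES = ["SHARD01", "SHARD02", "SHARD03", "SHARD04"]


def shard_for(key: str) -> str:
    """Return the shard that owns `key`, based on its first letter."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"key {key!r} does not start with a letter A-Z")
    # bisect_left finds the first range whose upper bound is >= the letter.
    return SHARD_NAMES[bisect.bisect_left(SHARD_UPPER_BOUNDS, first)]


for name in ["Anselmi", "Lee", "Patel", "Zhang"]:
    print(name, "->", shard_for(name))
```

The application has to call something like this on every query, and a JOIN or transaction that touches keys on two different shards can no longer be handled by a single database server, which is the 're-invent the wheel' cost listed above.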
Examples: Horizontal Sharding Solutions
MySQL Fabric
•  Pros
–  Elasticity: Can add nodes using Python scripts or OpenStack, etc
–  Resiliency: Automated load-balancing, auto slave promotion, & master/promotion-
aware routing, all transparent to the application
•  Cons
–  Application needs to provide sharding key per query
–  JOINs involving multiple shards not supported
–  Data rebalancing across shards is a manual operation
ScaleArc
•  Pros
–  Capacity: Rule-based range or key-based sharding. Automatic read-slave promotion
–  Resiliency: Automatically manages MySQL replication, managing Master/Master,
promotion, and fail-over
•  Cons
–  All queries need to route through ‘smart load balancer’ which manages shards
–  Data rebalancing across shards is a manual operation
Scaling Reads & Writes: Vertical Sharding
•  Separating tables across separate database servers (used by Magento eCommerce 2, etc)
•  Pros
–  Adds both write and read scaling, depending on well-chosen table distribution
–  Much less difficult than ‘regular’ sharding, and can have much of the gains
•  Cons
–  Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID;
Application must ‘re-invent the wheel’
–  Consistent backups across all the shards are very hard to manage
–  Data management (skew/hotness) is significant ongoing maintenance
–  AWS: Not available on RDS console; ‘roll-your-own’ with EC2
[Diagram: whole tables placed on separate shards]
•  SHARD01: Tables 1, 2
•  SHARD02: Tables 3, 4
•  SHARD03: Tables 5, 6
•  SHARD04: Tables 7, 8
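A short sketch of the table-to-shard layout in the diagram, assuming hypothetical table names `table1` to `table8` and shard labels. It shows why a query stays simple only while all of its tables live on the same server.

```python
# Hypothetical placement matching the diagram: two tables per shard.
TABLE_TO_SHARD = {
    "table1": "SHARD01", "table2": "SHARD01",
    "table3": "SHARD02", "table4": "SHARD02",
    "table5": "SHARD03", "table6": "SHARD03",
    "table7": "SHARD04", "table8": "SHARD04",
}


def shards_for_query(tables):
    """Return the set of shards a query over `tables` has to touch."""
    return {TABLE_TO_SHARD[t] for t in tables}


def is_single_shard(tables):
    """True when one database server can run the whole query on its own."""
    return len(shards_for_query(tables)) == 1


print(is_single_shard(["table1", "table2"]))  # both tables on SHARD01
print(is_single_shard(["table2", "table3"]))  # spans SHARD01 and SHARD02
```

A join or transaction across `table2` and `table3` spans two servers, so the application, not the RDBMS, has to provide the join logic and the transactional guarantees.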
Application Workload Partitioning
•  Partition entire application + RDBMS stack across several “pods”
•  Pros
–  Adds both Write and Read scaling
–  Flexible: can keep scaling with addition of pods
•  Cons
–  No data consistency across pods (only suited for cases
where it is not needed)
–  Queries / Reports across all pods can be very complex
–  Complex environment to setup and support
RDBMS Scale-Out Dimensions
Resiliency
Capacity
Elasticity
EASE & SPEED of ADDING and
REMOVING resources
Flex Up or Down
§  Capacity On-Demand
Adapt Resources to Price-
Performance Requirements
More ‘Promise of the Cloud’ – Pay for Only What you Need
Elasticity – Flexing Up and Down
Scaling Options: Flex UP / Flex DOWN
•  Application (for reference)
–  Flex UP: Easy: Add more web nodes
–  Flex DOWN: Easy: Drop web nodes
•  Scale-up
–  Flex UP: RDS: Easy. EC2: Expensive and awkward
–  Flex DOWN: RDS: Easy. EC2: Difficult and awkward
•  Master – Slave
–  Flex UP: Easy: add read Replicas or slave(s)
–  Flex DOWN: Easy: Drop read Replicas or slave(s)
•  Master – Master
–  Flex UP: Involved
–  Flex DOWN: Involved
•  Sharding
–  Flex UP: Expensive and complex
–  Flex DOWN: Infeasible &/or untenable
•  Application Partitioning
–  Flex UP: Expensive and complex
–  Flex DOWN: Expensive and complex
RDBMS Scale-Out Dimensions
Resiliency
TRANSPARENCY to Failures
§  Hardware or Software
Fault Tolerance and
High Availability
Capacity
Elasticity
Who Needs High-Availability? – How Far do you Want to Walk?
Resiliency – High-Availability and Fault Tolerance
Scaling Options: Resilience to failures
•  Application (for reference)
–  No single point failure – failed node bypassed
•  Scale-up
–  RDS: Easy if standby instance. EC2: One large machine → Single point failure
•  Master – Slave
–  RDS: Easy. EC2: Fail-over to Slave → Potential data consistency issue(s)
•  Master – Master
–  RDS: Unavailable. EC2: Resilient to one of the Masters failing
•  Sharding
–  RDS: Unavailable. EC2: Multiple points of failures, without redundant hardware
•  Application Partitioning
–  RDS: Unavailable. EC2: Multiple points of failures, without redundant hardware
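Summary: RDBMS Capacity, Elasticity and Resiliency
RDBMS Scaling: Capacity / Elasticity / Resiliency / Application Impact
•  Scale-up
–  Capacity: Many cores – expensive if exceed cloud instance sizes
–  Elasticity: RDS: Yes. EC2: No
–  Resiliency: Single Point Failure
–  Application Impact: None
•  Master – Slave
–  Capacity: Reads Only
–  Elasticity: RDS: Yes. EC2: Yes
–  Resiliency: Fail-over
–  Application Impact: Consistent reads requires coding
•  Master – Master
–  Capacity: Read / some Write
–  Elasticity: RDS: No. EC2: Yes
–  Resiliency: Yes
–  Application Impact: High – conflict resolution
•  Sharding
–  Capacity: Unbalanced Read/Writes
–  Elasticity: RDS: No. EC2: Yes
–  Resiliency: Multiple points of failure
–  Application Impact: Very High
•  ClustrixDB
–  Capacity: Scale-out Reads + Writes
–  Elasticity: Yes
–  Resiliency: Can lose node(s) without data loss or downtime
–  Application Impact: No application changes needed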
ANOTHER APPROACH:
§  MYSQL-COMPATIBLE CLUSTERED DATABASE
§  LINEAR SCALE-OUT OF BOTH WRITES & READS
§  HIGH-TRANSACTION, LOW-LATENCY
§  ARCHITECTED FROM THE GROUND-UP TO ADDRESS:
CAPACITY, ELASTICITY AND RESILIENCY
CLUSTRIXDB
ClustrixDB: Scale-Out, Fault-tolerant, MySQL-Compatible
ClustrixDB
ACID Compliant
Transactions & Joins
Optimized for OLTP
Built-In Fault Tolerance
Flex-Up and Flex-Down
Minimal DB Admin
Also runs GREAT in
the Data Center
Built to run
GREAT
in the Cloud
Linear Scale-Out: Sysbench OLTP 90:10 Mix (bare metal)
•  90% Reads + 10% Writes
–  Very typical workload mix
•  1 TPS = 10 SQL
–  9 SELECT + 1 UPDATE
–  a.k.a 10 operations/sec
•  Linearly scales TPS by
adding servers:
–  Oak4 = 4x 8core (32 cores)
–  Oak16 = 16x 8core (128 cores)
–  Oak28 = 28x 8core (224 cores)
800,000 SQL/sec
@ 20 ms
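The unit conversions on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's own numbers (10 SQL statements per transaction, 8 cores per server, 800,000 SQL/sec); the function names are illustrative.

```python
SQL_PER_TRANSACTION = 10  # from the slide: 9 SELECT + 1 UPDATE
CORES_PER_SERVER = 8      # from the slide: "8core" servers

# Cluster name -> number of servers, as listed on the slide.
CLUSTER_SERVERS = {"Oak4": 4, "Oak16": 16, "Oak28": 28}


def total_cores(servers):
    """Total cores in a cluster of identical 8-core servers."""
    return servers * CORES_PER_SERVER


def tps_from_sql_rate(sql_per_sec):
    """Convert a SQL statements/sec rate into transactions/sec."""
    return sql_per_sec / SQL_PER_TRANSACTION


for name, servers in CLUSTER_SERVERS.items():
    print(name, total_cores(servers), "cores")

print(tps_from_sql_rate(800_000), "TPS")
```

With this definition, the 800,000 SQL/sec figure corresponds to 80,000 transactions per second.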
ClustrixDB vs. RDS_db1 vs. RDS_db2 (AWS)
•  90% Reads + 10% Writes
–  Very typical workload mix
•  1 TPS = 10 SQL
–  9 SELECT + 1 UPDATE
–  a.k.a 10 operations/sec
•  Shows scaling TPS by
adding servers:
–  Aws4 = 4x 8vcpu ClustrixDB
–  Aws16 = 16x 8vcpu ClustrixDB
–  Aws20 = 20x 8vcpu ClustrixDB
ClustrixDB scaling TPS 4X past RDS_db2’s
largest instance (db.r3.8xlarge) at 20ms
RDS_db1
(8XL)
RDS_db2
(8XL)
ClustrixDB
>400,000 SQL/sec
@ 20 ms
ClustrixDB
(20x c3.2XL)
CLUSTRIX RDBMS
Production Customer Workload Examples
Example: Heavy Write Workload (AWS Deployment)
The Application
Inserts 254 million / day
Updates 1.35 million / day
Reads 252.3 million / day
Deletes 7,800 / day
The Database
Queries 5-9k per sec
CPU Load 45-65%
Nodes - Cores 10 nodes - 80 cores
Application Sees a Single RDBMS Instance
Example: Very Heavy Update Workload (Bare-Metal)
The Application
Inserts 31.4 million / day
Updates 3.7 billion / day
Reads 1 billion / day
Deletes 4,300 / day
The Database
Queries 35-55k per sec
CPU Load 25-35%
Nodes - Cores 8 nodes - 160 cores
Application Sees a Single RDBMS Instance
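As a sanity check, the daily operation counts in the two workload examples can be converted to an average per-second rate and compared with the quoted query rates. This sketch assumes the load is spread evenly over 24 hours and that the "Queries" figure counts inserts, updates, reads, and deletes together; neither assumption is stated on the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_ops_per_second(daily_counts):
    """Average operations per second, assuming an even spread over the day."""
    return sum(daily_counts.values()) / SECONDS_PER_DAY


# Daily counts copied from the two example slides.
heavy_write = {
    "inserts": 254_000_000,
    "updates": 1_350_000,
    "reads": 252_300_000,
    "deletes": 7_800,
}
heavy_update = {
    "inserts": 31_400_000,
    "updates": 3_700_000_000,
    "reads": 1_000_000_000,
    "deletes": 4_300,
}

print(round(average_ops_per_second(heavy_write)))
print(round(average_ops_per_second(heavy_update)))
```

The averages come out at roughly 5.9k and 54.8k operations per second, which fall inside the quoted ranges of 5-9k and 35-55k queries per second.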
CLUSTRIX RDBMS
§  MYSQL COMPATIBLE SHARED-NOTHING CLUSTERED RDBMS
§  FULL TRANSACTIONAL ACID COMPLIANCE ACROSS ALL NODES
§  ARCHITECTED FROM THE GROUND-UP TO ADDRESS:
CAPACITY, ELASTICITY AND RESILIENCY
TECHNICAL OVERVIEW
ClustrixDB Overview
Fully Distributed & Consistent Cluster
•  Fully Consistent, and ACID-compliant database
–  Cross-node Transactions & JOINs
–  Optimized for OLTP
–  But also supports reporting SQL
•  All servers are read + write
•  All servers accept client connections
•  Tables & Indexes distributed across all nodes
–  Fully automatic distribution, re-balancing
& re-protection
–  All Primary and Secondary Keys
Private Network
ClustrixDB on commodity/cloud servers
HW or SW Load
Balancer
SQL-based
Applications
High Concurrency
Custom:
PHP, Java, Ruby, etc
Packaged:
Magento, etc
ClustrixDB – Shared Nothing Symmetric Architecture
•  Database Engine:
–  all nodes can perform all database operations (no
leader, aggregator, leaf, data-only, special nodes)
•  Query Compiler:
–  distribute compiled partial query fragments to the
node containing the ranking replica
•  Data: Table Slices:
–  All table slices auto-redistributed by the
Rebalancer (default: replicas=2)
•  Data Map:
–  all nodes know where all replicas are
Each Node Contains: Compiler, Map, Engine, Data
[Diagram: database tables with billions of rows, split into slices S1 to S5 and spread across ClustrixDB nodes]
Intelligent Data Distribution
•  Tables auto-split into slices
•  Every slice has a replica on another server
–  Auto-distributed and auto-protected
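The "every slice has a replica on another server" rule can be illustrated with a simple placement function. This is a round-robin sketch under assumptions, not the product's placement algorithm: `place_slices`, the node names, and the round-robin strategy are all hypothetical.

```python
def place_slices(slices, nodes, replicas=2):
    """Assign each slice to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, slice_name in enumerate(slices):
        # Consecutive offsets modulo len(nodes) are distinct because
        # replicas <= len(nodes), so no node holds two copies of a slice.
        placement[slice_name] = [
            nodes[(i + r) % len(nodes)] for r in range(replicas)
        ]
    return placement


placement = place_slices(["S1", "S2", "S3", "S4", "S5"], ["node1", "node2", "node3"])
for slice_name, holders in placement.items():
    print(slice_name, holders)
```

Because the two copies of a slice never share a node, losing any single node still leaves one copy of every slice available.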
Database Capacity And Elasticity
•  Easy and simple Flex Up (and Flex Down)
–  Flex multiple nodes at the same time
•  Data is automatically rebalanced
across the cluster
Built-in Fault Tolerance
•  No Single Point-of-Failure
–  No Data Loss
–  No Downtime
•  Server node goes down…
–  Data is automatically rebalanced across
the remaining nodes
Distributed Query Processing
•  Queries are fielded by any peer node
–  Routed to node holding the data
•  Complex queries are split into fragments processed in parallel
–  Automatically distributed for optimized performance
Automatic Cluster Data Rebalancing
The ClustrixDB Rebalancer:
•  Initial Data: Distributes the data into even slices across nodes
•  Data Growth: Splits large slices into smaller slices
•  Failed Nodes: Re-protects slices to ensure proper replicas exist
•  Flex-Up/Flex-Down: Moves slices to leverage new nodes and/or evacuate nodes
•  Skewed Data: Re-distributes the data to even out across nodes
•  Hotness Detection: Finds hot slices and balances them across nodes
Patent 8,543,538 - Systems and methods for redistributing data in a relational database
Patent 8,554,726 - Systems and methods for reslicing data in a relational database
Replication and Disaster Recovery
Asynchronous multi-point MySQL 5.6 Replication
ClustrixDB
Parallel Backup
up to 10x faster
Replicate to any cloud, any datacenter, anywhere
Patent 9,348,883 - Systems and methods for replication replay in a relational database
FINAL THOUGHTS
ClustrixDB
Capacity
•  Massive read/write scalability
•  Very high concurrency
•  Linear throughput scale
Elasticity
•  Flex UP in minutes
•  Flex DOWN easily
•  Right-size resources on-demand
Resiliency
•  Automatic, 100% fault tolerance
•  No single point of failure
•  Battle-tested performance
Cloud
•  Cloud, VM, or bare-metal
•  Virtual Images available
•  Point/click Scale-out
Thank You.
facebook.com/clustrix
www.clustrix.com
@clustrix
linkedin.com/clustrix
SUPPLEMENTARY SLIDES
CLUSTRIX RDBMS
GRAPHICAL USER INTERFACE
New UI – Enhanced Dashboard
New UI – Workload Comparison
New UI – FLEX Administration
CLUSTRIX RDBMS
SCALE-OUT BENCHMARKS
Sysbench OLTP 100% Reads (bare metal)
•  100% Reads
–  Max throughput test
•  1 TPS = 10 SQL
–  10 SELECT
–  a.k.a 10 operations/sec
•  Linearly scales TPS by
adding servers:
–  Oak6 = 6 servers
–  Oak18 = 18 servers
–  Oak30 = 30 servers
>1 Million SQL/sec
@ 20 ms
Yahoo! Cloud Serving
Benchmark (YCSB) (AWS)
•  95% Reads + 5% Writes
–  1 Transaction/sec = 1 SQL
•  100% Reads
•  Over 1 Million TPS
–  With 3 ms query response
–  Using 50 ClustrixDB servers
> 1,000,000 TPS
@ 3 ms
ClustrixDB scaled to 50 nodes
(c3.2xl, 400 vcpu) in 1 day
CLUSTRIX RDBMS
UNDER THE HOOD
§  DISTRIBUTION STRATEGY
§  REBALANCER TASKS
§  QUERY OPTIMIZER
§  EVALUATION MODEL
§  CONCURRENCY CONTROL
ClustrixDB key components enabling Scale-Out
•  Shared-nothing architecture
–  Eliminates potential bottlenecks.
•  Independent Index Distribution
–  Hash each distribution key to a 64-bit number space divided into ranges with a specific
slice owning each range
•  Rebalancer
–  Ensures optimal data distribution across all nodes.
–  Rebalancer assigns slices to available nodes for data capacity and access balance
•  Query Optimizer
–  Distributed query planner, compiler, and distributed shared-nothing execution engine
–  Executes queries with max parallelism and many simultaneous queries concurrently.
•  Evaluation Model
–  Parallelizes queries, which are distributed to the node(s) with the relevant data.
•  Consistency and Concurrency Control
–  Using Multi-Version Concurrency Control (MVCC), 2 Phase Locking (2PL) on writes,
and Paxos Consensus Protocol
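The Independent Index Distribution bullet above (hash each distribution key into a 64-bit number space divided into ranges, with one slice owning each range) can be sketched as follows. The hash function (truncated SHA-256), the equal-width ranges, and all function names are assumptions for illustration; the slides do not say which hash or range boundaries are used.

```python
import bisect
import hashlib

HASH_SPACE = 2 ** 64  # size of the 64-bit number space


def hash_key(key) -> int:
    """Map a distribution key to a 64-bit integer (hash choice is an assumption)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def make_ranges(num_slices):
    """Split the hash space into equal ranges; return exclusive upper bounds."""
    return [HASH_SPACE * (i + 1) // num_slices for i in range(num_slices)]


def slice_for_hash(hash_value, upper_bounds):
    """Index of the slice whose range [lower, upper) contains hash_value."""
    return bisect.bisect_right(upper_bounds, hash_value)


def slice_for(key, upper_bounds):
    """Index of the slice that owns `key`."""
    return slice_for_hash(hash_key(key), upper_bounds)


bounds = make_ranges(4)
print(slice_for(15, bounds))
```

Because the lookup depends only on the key and the range table, any node holding the table can compute the owning slice locally, without asking a coordinator.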
Rebalancer Process
•  User tables are vertically partitioned into representations.
•  Representations are horizontally partitioned into slices.
•  Rebalancer ensures:
–  The representation has an appropriate number of slices.
–  Slices are well distributed around the cluster on storage devices
–  Slices are not placed on server(s) that are being flexed-down.
–  Reads from each representation are balanced across the nodes
ClustrixDB Rebalancer Tasks
•  Flex-UP
–  Re-distribute replicas to new nodes
•  Flex-DOWN
–  Move replicas from the flex-down nodes to other nodes in the cluster
•  Under-Protection – when a slice has fewer replicas than desired
–  Create a new copy of the slice on a different node.
•  Slice Too Big
–  Split the slice into several new slices and re-distribute them
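Two of the tasks above, Under-Protection and Slice Too Big, can be sketched as a planning function that inspects slice metadata and emits work items. The `Slice` record, the task tuples, the size threshold, and the "first available node" choice are all hypothetical; the slides do not describe how the Rebalancer chooses target nodes or split sizes.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    name: str
    size_bytes: int
    replica_nodes: list  # nodes currently holding a copy of this slice


def plan_tasks(slices, live_nodes, desired_replicas=2, max_slice_bytes=1_000):
    """Return a list of (task, slice_name, ...) tuples for the given cluster state."""
    tasks = []
    for s in slices:
        # Only replicas on nodes that are still alive count as protection.
        live = [n for n in s.replica_nodes if n in live_nodes]
        if len(live) < desired_replicas:
            # Under-protection: copy the slice to a node that lacks it.
            candidates = [n for n in live_nodes if n not in live]
            if candidates:
                tasks.append(("reprotect", s.name, candidates[0]))
        if s.size_bytes > max_slice_bytes:
            # Slice too big: split it so the pieces can be redistributed.
            tasks.append(("split", s.name))
    return tasks


slices = [
    Slice("S1", 10, ["node1", "node2"]),
    Slice("S2", 10, ["node2", "node3"]),
    Slice("S3", 5_000, ["node1", "node3"]),
]
# node2 has failed, so only node1 and node3 are live.
print(plan_tasks(slices, ["node1", "node3"]))
```

In this example S1 and S2 each lost a copy when node2 failed and get a re-protect task, while S3 is fully protected but exceeds the size threshold and gets a split task.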
ClustrixDB Query Optimizer
•  The ClustrixDB Query Optimizer is modeled on the Cascades optimization framework.
–  Other RDBMSs that leverage Cascades include Tandem's NonStop SQL and Microsoft's SQL Server.
–  Cost-driven - Extensible via a rule based mechanism
–  Top-down approach
•  Query Optimizer must answer the following, per SQL query:
–  In what order should the tables be joined?
–  Which indexes should be used?
–  Should the sort/aggregate be non-blocking?
ClustrixDB Evaluation Model
•  Parallel query evaluation
•  Massively Parallel Processing (MPP) for analytic queries
•  The Fair Scheduler ensures OLTP prioritized ahead of OLAP
•  Queries are broken into fragments (functions).
•  Joins require more data movement by their nature.
–  ClustrixDB is able to achieve minimal data movement
–  Each representation (table or index) has its own distribution map,
allowing direct look-ups for which node/slice to go to next, removing
broadcasts.
–  There is no central node orchestrating data motion. Data moves
directly to the next node it needs to go to. This reduces hops to the
minimum possible given the data distribution.
[Diagram: a query compiled into fragments]
Query:
SELECT id, amount
FROM donation
WHERE id=15
COMPILATION into FRAGMENTS:
•  FRAGMENT 1 (VM): Node := lookup id = 15; <forward to node>
•  FRAGMENT 2 (VM): SELECT id, amount; <return>
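The two-fragment plan for this query can be mimicked in a few lines: the first fragment looks up which node owns `id = 15` and forwards the work there, and the second fragment reads the row on that node and returns it. The in-memory `NODES` data, the node names, and the function names are hypothetical stand-ins, and the linear scan in `fragment_1` replaces what the deck describes as a direct distribution-map lookup.

```python
# Hypothetical cluster state: node name -> rows of `donation` as {id: amount}.
NODES = {
    "node1": {7: 50},
    "node2": {15: 100},
    "node3": {42: 25},
}


def fragment_1(donation_id):
    """Fragment 1: find the node that owns the row, i.e. where to forward."""
    for node, rows in NODES.items():
        if donation_id in rows:
            return node
    return None


def fragment_2(node, donation_id):
    """Fragment 2: runs on the owning node and returns (id, amount)."""
    return (donation_id, NODES[node][donation_id])


def run_query(donation_id):
    """SELECT id, amount FROM donation WHERE id = <donation_id>."""
    node = fragment_1(donation_id)
    if node is None:
        return None  # no matching row
    return fragment_2(node, donation_id)


print(run_query(15))
```

Only the node that owns the row does any work for this query, so there is no broadcast to the other nodes.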
Concurrency Control
•  Readers never interfere with writers (or vice-versa). Writers use explicit locking for updates
•  MVCC maintains a version of each row as writers modify rows
•  Readers have lock-free snapshot isolation while writers use 2PL to manage conflict
[Timeline diagram: readers and writers running concurrently over time. Row conflict: one writer blocked. No conflict: no blocking.]
Lock Conflict Matrix
           Reader   Writer
Reader     None     None
Writer     None     Row
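The lock conflict matrix can be written down as a lookup table. The sketch below encodes the four cells exactly as shown and adds a hypothetical helper, `blocks`, that decides whether one session has to wait for another; it models the matrix only, not MVCC snapshots or the 2PL implementation.

```python
# (holder, requester) -> conflict scope, copied from the matrix.
LOCK_CONFLICTS = {
    ("reader", "reader"): None,
    ("reader", "writer"): None,
    ("writer", "reader"): None,
    ("writer", "writer"): "row",
}


def blocks(holder, requester, same_row):
    """True if `requester` must wait for `holder`.

    Only two writers touching the same row conflict; readers work from
    a snapshot and never block or get blocked.
    """
    return LOCK_CONFLICTS[(holder, requester)] == "row" and same_row


print(blocks("writer", "writer", same_row=True))   # row conflict: one writer waits
print(blocks("writer", "writer", same_row=False))  # different rows: no blocking
print(blocks("writer", "reader", same_row=True))   # readers are never blocked
```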

More Related Content

What's hot

Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech TalksMigrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Amazon Web Services
 
Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData
FlyData Inc.
 
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon Web Services
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
Amazon Web Services
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
Amazon Web Services
 
Migrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMigrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for Oracle
Maris Elsins
 
What's New in Amazon Aurora
What's New in Amazon AuroraWhat's New in Amazon Aurora
What's New in Amazon Aurora
Amazon Web Services
 
Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013
Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013
Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013
Amazon Web Services
 
PASS 17 SQL Server on AWS Best Practices
PASS 17 SQL Server on AWS Best PracticesPASS 17 SQL Server on AWS Best Practices
PASS 17 SQL Server on AWS Best Practices
Amazon Web Services
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon Redshift
Amazon Web Services
 
Mining AWR V2 - Trend Analysis
Mining AWR V2 - Trend AnalysisMining AWR V2 - Trend Analysis
Mining AWR V2 - Trend Analysis
Maris Elsins
 
AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722
Amazon Web Services
 
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Amazon Web Services
 
SQLIO - measuring storage performance
SQLIO - measuring storage performanceSQLIO - measuring storage performance
SQLIO - measuring storage performance
valerian_ceaus
 
Best Practices running SQL Server on AWS
Best Practices running SQL Server on AWSBest Practices running SQL Server on AWS
Best Practices running SQL Server on AWS
Amazon Web Services
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
Kel Graham
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
Amazon Web Services
 
AWS Webcast - Migrating to RDS Oracle
AWS Webcast - Migrating to RDS OracleAWS Webcast - Migrating to RDS Oracle
AWS Webcast - Migrating to RDS Oracle
Amazon Web Services
 
Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...
Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...
Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...
Amazon Web Services
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentation
Volodymyr Rovetskiy
 

What's hot (20)

Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech TalksMigrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
 
Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData
 
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
 
Migrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMigrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for Oracle
 
What's New in Amazon Aurora
What's New in Amazon AuroraWhat's New in Amazon Aurora
What's New in Amazon Aurora
 
Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013
Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013
Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013
 
PASS 17 SQL Server on AWS Best Practices
PASS 17 SQL Server on AWS Best PracticesPASS 17 SQL Server on AWS Best Practices
PASS 17 SQL Server on AWS Best Practices
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon Redshift
 
Mining AWR V2 - Trend Analysis
Mining AWR V2 - Trend AnalysisMining AWR V2 - Trend Analysis
Mining AWR V2 - Trend Analysis
 
AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722
 
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
 
SQLIO - measuring storage performance
SQLIO - measuring storage performanceSQLIO - measuring storage performance
SQLIO - measuring storage performance
 
Best Practices running SQL Server on AWS
Best Practices running SQL Server on AWSBest Practices running SQL Server on AWS
Best Practices running SQL Server on AWS
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 
AWS Webcast - Migrating to RDS Oracle
AWS Webcast - Migrating to RDS OracleAWS Webcast - Migrating to RDS Oracle
AWS Webcast - Migrating to RDS Oracle
 
Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...
Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...
Design, Deploy, and Optimize Microsoft SQL Server on AWS - WIN306 - re:Invent...
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentation
 

Viewers also liked

High availability solution database mirroring
High availability solution database mirroringHigh availability solution database mirroring
High availability solution database mirroring
Mustafa EL-Masry
 
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
Clustrix
 
ClustrixDB 7.5 Announcement
ClustrixDB 7.5 AnnouncementClustrixDB 7.5 Announcement
ClustrixDB 7.5 Announcement
Clustrix
 
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Clustrix
 
C1 basic concepts of database
C1 basic concepts of databaseC1 basic concepts of database
C1 basic concepts of database
Wan Azni
 
AWS Summit 2011: High Availability Database Architectures in AWS Cloud
AWS Summit 2011: High Availability Database Architectures in AWS CloudAWS Summit 2011: High Availability Database Architectures in AWS Cloud
AWS Summit 2011: High Availability Database Architectures in AWS Cloud
Amazon Web Services
 

Viewers also liked (6)

High availability solution database mirroring
High availability solution database mirroringHigh availability solution database mirroring
High availability solution database mirroring
 
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
 
ClustrixDB 7.5 Announcement
ClustrixDB 7.5 AnnouncementClustrixDB 7.5 Announcement
ClustrixDB 7.5 Announcement
 
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
 
C1 basic concepts of database
C1 basic concepts of databaseC1 basic concepts of database
C1 basic concepts of database
 
AWS Summit 2011: High Availability Database Architectures in AWS Cloud
AWS Summit 2011: High Availability Database Architectures in AWS CloudAWS Summit 2011: High Availability Database Architectures in AWS Cloud
AWS Summit 2011: High Availability Database Architectures in AWS Cloud
 

Similar to Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711

Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Clustrix
 
Percona Live 2014 - Scaling MySQL in AWS
Percona Live 2014 - Scaling MySQL in AWSPercona Live 2014 - Scaling MySQL in AWS
Percona Live 2014 - Scaling MySQL in AWS
Pythian
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711

  • 1. Scaling RDBMS on AWS: Strategies, Challenges, & A Better Solution. Dave A. Anselmi (@AnselmiDave), Director of Product Management, Clustrix. © 2014 CLUSTRIX, © 2016 CLUSTRIX
  • 2. Database Landscape
      [Diagram (footer: Splice Machine Proprietary and Confidential). Workload axis: Transactional / OLTP (High Concurrency / Write heavy / Real Time Analytics) to Analytics / OLAP (Historical Analytics / Exploratory). Scaling axis: Scale-Up, Scale-Out. Systems shown: Traditional RDBMS, DW/Analytical DBMS, Hadoop, NoSQL, Scale-Out RDBMS (NewSQL)]
  • 4. RDBMS Scale-Out Considerations
      Relational Database Scaling Is Very Hard (cf. “SQL Databases Don’t Scale”, 2006)
      • Data Consistency
      • Read vs. Write Scale
      • ACID Properties
      • Throughput and Latency
      • Application Impact
  • 5. RDBMS Scale-Out Dimensions: Resiliency, Capacity, Elasticity
      • SCALE: Data, Users, Session
      • THROUGHPUT: Concurrency, Transactions
      • LATENCY: Response Time
      The ‘Promise of the Cloud’: Scaling RDBMS Up/Down like a Web Node
  • 7. Scaling-Up: Reads + Writes
      • Keep increasing the size of the (single) database server
      • Pros
          – Simple, no application changes needed. ‘Click to Scale-up’ on AWS console
          – Best solution for Capacity, if it can handle your workload
      • Cons
          – Capacity limit. Most clouds provide at most 36 vCPUs for a single server
          – Leaving the cloud = expensive. Soon, you’re often paying 5x for 2x the performance
      Eventually you ‘hit the wall’, and you literally cannot scale up anymore
  • 8. Scaling Reads: Master/Slave
      • Add ‘Slave’ read-server(s) to your ‘Master’ database server
      • Pros
          – Simple to implement, lots of automation available. AWS has ‘Read Replicas’
          – Read/write fan-out can be done at the proxy level
      • Cons
          – Best for read-heavy workloads: only adds Read performance
          – Data consistency issues can occur, especially if the application isn’t coded to ensure read-consistency between Master & Slave (not an issue with RDS)
  • 9. Scaling Reads + Writes: Master/Master
      • Add additional ‘Master’(s) to your ‘Master’ database server
      • Pros
          – Adds Read + Write scaling without needing to shard
          – Depending on workload (e.g. non-serialized), scaling can approach linear
      • Cons
          – Adds Write scaling at the cost of read-slaves, which would add even more latency
          – Application changes are required to ensure data consistency / conflict resolution
          – AWS: Not available on RDS console; ‘roll-your-own’ with EC2
  • 10. Examples: Master/Master Replication Solutions
      • Replication-based synchronous COMMIT solutions:
          – Galera (open-source library)
          – Percona XtraDB Cluster (leverages Galera replication library)
          – Tungsten
      • Pros
          – Good for High-Availability
          – Good for Read scaling
      • Cons
          – Provides variable Write scale, depending on workload
          – Replication has inherent potential consistency and latency issues. High-transaction workloads such as OLTP (e.g. E-Commerce) are exactly the workloads that replication struggles with the most
  • 11. Scaling Reads & Writes: Horizontal (‘Regular’) Sharding
      • Partitioning tables across separate database servers
      • Pros
          – Adds both Read and Write scaling, depending on well-chosen sharding keys and low skew
          – Most common way to scale-out both Reads and Writes
      • Cons
          – Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID; Application must ‘re-invent the wheel’
          – Consistent backups across all the shards are very hard to manage
          – Data management (skew/hotness) is ongoing significant maintenance
          – AWS: Not available on RDS console; ‘roll-your-own’ with EC2
      [Diagram: SHARD01 = A–K, SHARD02 = L–O, SHARD03 = P–S, SHARD04 = T–Z]
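The range layout in the slide 11 diagram (A–K, L–O, P–S, T–Z) can be sketched as a small routing function. This is only an illustration of range-based sharding in general, not code from the deck: the shard names, the first-letter keying, and the `shard_for` helper are all assumptions.

```python
import bisect

# Hypothetical layout mirroring the slide: first-letter ranges A-K, L-O, P-S, T-Z.
# Each entry is the inclusive upper bound of one shard's range.
UPPER_BOUNDS = ["K", "O", "S", "Z"]
SHARDS = ["shard01", "shard02", "shard03", "shard04"]  # illustrative names


def shard_for(key: str) -> str:
    """Return the shard that owns a key, based on its first letter."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"key {key!r} does not start with a letter A-Z")
    # bisect_left finds the first range whose upper bound is >= the letter.
    return SHARDS[bisect.bisect_left(UPPER_BOUNDS, first)]
```

Every query has to supply the sharding key so the application (or a proxy) can call something like `shard_for`. Skew shows up directly in this scheme: if most keys start with letters in one range, that shard becomes the hot one.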
  • 12. Examples: Horizontal Sharding Solutions
      MySQL Fabric
      • Pros
          – Elasticity: Can add nodes using Python scripts or OpenStack, etc.
          – Resiliency: Automated load-balancing, auto slave promotion, & master/promotion-aware routing, all transparent to the application
      • Cons
          – Application needs to provide sharding key per query
          – JOINs involving multiple shards not supported
          – Data rebalancing across shards is a manual operation
      ScaleArc
      • Pros
          – Capacity: Rule-based range or key-based sharding. Automatic read-slave promotion
          – Resiliency: Automatically manages MySQL replication, managing Master/Master, promotion, and fail-over
      • Cons
          – All queries need to route through a ‘smart load balancer’ which manages shards
          – Data rebalancing across shards is a manual operation
  • 13. Scaling Reads & Writes: Vertical Sharding
      • Separating tables across separate database servers (used by Magento eCommerce 2, etc.)
      • Pros
          – Adds both Write and Read scaling, depending on well-chosen table distribution
          – Much less difficult than ‘regular’ sharding, and can have much of the gains
      • Cons
          – Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID; Application must ‘re-invent the wheel’
          – Consistent backups across all the shards are very hard to manage
          – Data management (skew/hotness) is ongoing significant maintenance
          – AWS: Not available on RDS console; ‘roll-your-own’ with EC2
      [Diagram: SHARD01 = Tables 1, 2; SHARD02 = Tables 3, 4; SHARD03 = Tables 5, 6; SHARD04 = Tables 7, 8]
  • 14. Application Workload Partitioning
      • Partition entire application + RDBMS stack across several “pods”
      • Pros
          – Adds both Write and Read scaling
          – Flexible: can keep scaling with addition of pods
      • Cons
          – No data consistency across pods (only suited for cases where it is not needed)
          – Queries / Reports across all pods can be very complex
          – Complex environment to set up and support
      [Diagram: multiple APP pods]
  • 15. RDBMS Scale-Out Dimensions: Resiliency, Capacity, Elasticity
      • EASE & SPEED of ADDING and REMOVING resources
      • Flex Up or Down: Capacity On-Demand
      • Adapt Resources to Price-Performance Requirements
      More ‘Promise of the Cloud’: Pay for Only What You Need
  • 16. Elasticity – Flexing Up and Down
      Scaling options, with Flex UP and Flex DOWN for each:
      • Application (for reference): Flex UP: Easy, add more web nodes. Flex DOWN: Easy, drop web nodes
      • Scale-up: Flex UP: RDS: Easy. EC2: Expensive and awkward. Flex DOWN: RDS: Easy. EC2: Difficult and awkward
      • Master – Slave: Flex UP: Easy, add read Replicas or slave(s). Flex DOWN: Easy, drop read Replicas or slave(s)
      • Master – Master: Flex UP: Involved. Flex DOWN: Involved
      • Sharding: Flex UP: Expensive and complex. Flex DOWN: Infeasible &/or untenable
      • Application Partitioning: Flex UP: Expensive and complex. Flex DOWN: Expensive and complex
  • 17. RDBMS Scale-Out Dimensions: Resiliency, Capacity, Elasticity
      • TRANSPARENCY to Failures: Hardware or Software
      • Fault Tolerance and High Availability
      Who Needs High-Availability? How Far Do You Want to Walk?
  • 18. Resiliency – High-Availability and Fault Tolerance
      Scaling options, with resilience to failures for each:
      • Application (for reference): No single point of failure; failed node bypassed
      • Scale-up: RDS: Easy if standby instance. EC2: One large machine → single point of failure
      • Master – Slave: RDS: Easy. EC2: Fail-over to Slave → potential data consistency issue(s)
      • Master – Master: RDS: Unavailable. EC2: Resilient to one of the Masters failing
      • Sharding: RDS: Unavailable. EC2: Multiple points of failure, without redundant hardware
      • Application Partitioning: RDS: Unavailable. EC2: Multiple points of failure, without redundant hardware
  • 19. Summary: RDBMS Capacity, Elasticity and Resiliency
      • Scale-up
          – Capacity: Many cores; expensive if you exceed cloud instance sizes
          – Resiliency: Single point of failure
          – Elasticity: RDS: Yes. EC2: No
          – Application Impact: None
      • Master – Slave
          – Capacity: Reads only
          – Resiliency: Fail-over
          – Elasticity: RDS: Yes. EC2: Yes
          – Application Impact: Consistent reads require coding
      • Master – Master
          – Capacity: Read / some Write
          – Resiliency: Yes
          – Elasticity: RDS: No. EC2: Yes
          – Application Impact: High (conflict resolution)
      • Sharding
          – Capacity: Unbalanced Reads/Writes
          – Resiliency: Multiple points of failure
          – Elasticity: RDS: No. EC2: Yes
          – Application Impact: Very High
      • ClustrixDB
          – Capacity: Scale-out Reads + Writes
          – Resiliency: Can lose node(s) without data loss or downtime
          – Elasticity: Yes
          – Application Impact: No application changes needed
  • 20. ANOTHER APPROACH: CLUSTRIXDB
      • MYSQL-COMPATIBLE CLUSTERED DATABASE
      • LINEAR SCALE-OUT OF BOTH WRITES & READS
      • HIGH-TRANSACTION, LOW-LATENCY
      • ARCHITECTED FROM THE GROUND-UP TO ADDRESS: CAPACITY, ELASTICITY AND RESILIENCY
  • 21. ClustrixDB: Scale-Out, Fault-tolerant, MySQL-Compatible
      • ACID Compliant
      • Transactions & Joins
      • Optimized for OLTP
      • Built-In Fault Tolerance
      • Flex-Up and Flex-Down
      • Minimal DB Admin
      Built to run GREAT in the Cloud. Also runs GREAT in the Data Center
  • 22. Linear Scale-Out: Sysbench OLTP 90:10 Mix (bare metal)
      • 90% Reads + 10% Writes
          – Very typical workload mix
      • 1 TPS = 10 SQL
          – 9 SELECT + 1 UPDATE
          – a.k.a. 10 operations/sec
      • Linearly scales TPS by adding servers:
          – Oak4 = 4x 8-core (32 cores)
          – Oak16 = 16x 8-core (128 cores)
          – Oak28 = 28x 8-core (224 cores)
      [Chart callout: 800,000 SQL/sec @ 20 ms]
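The unit conversions on slide 22 are easy to mix up, so here is a small worked sketch. It assumes only what the slide states (1 transaction = 10 SQL statements, 8 cores per server); the function names are made up for illustration, and the "ideal speedup" is just the server ratio, not a measured result.

```python
SQL_PER_TRANSACTION = 10  # from the slide: 9 SELECT + 1 UPDATE
CORES_PER_SERVER = 8      # from the slide: each Oak server has 8 cores


def sql_rate_to_tps(sql_per_sec: float) -> float:
    """Convert a SQL statements/sec figure into transactions/sec."""
    return sql_per_sec / SQL_PER_TRANSACTION


def total_cores(servers: int) -> int:
    """Total cores in a cluster of identical servers."""
    return servers * CORES_PER_SERVER


def ideal_linear_speedup(servers_before: int, servers_after: int) -> float:
    """Throughput multiplier if scaling were perfectly linear in server count."""
    return servers_after / servers_before


print(sql_rate_to_tps(800_000))     # 80000.0 transactions/sec
print(total_cores(28))              # 224
print(ideal_linear_speedup(4, 28))  # 7.0
```

So the 800,000 SQL/sec callout corresponds to 80,000 TPS under the slide's own definition, and perfectly linear scaling from Oak4 to Oak28 would mean a 7x throughput increase.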
  • 23. ClustrixDB vs. RDS_db1 vs. RDS_db2 (AWS)
      • 90% Reads + 10% Writes
          – Very typical workload mix
      • 1 TPS = 10 SQL
          – 9 SELECT + 1 UPDATE
          – a.k.a. 10 operations/sec
      • Shows scaling TPS by adding servers:
          – Aws4 = 4x 8-vCPU ClustrixDB
          – Aws16 = 16x 8-vCPU ClustrixDB
          – Aws20 = 20x 8-vCPU ClustrixDB
      [Chart series: RDS_db1 (8XL), RDS_db2 (8XL), ClustrixDB (20x c3.2XL). Callouts: ClustrixDB scaling TPS 4X past RDS_db2’s largest instance (db.r3.8xlarge) at 20 ms; ClustrixDB >400,000 SQL/sec @ 20 ms]
  • 25. Example: Heavy Write Workload (AWS Deployment)
      • The Application
          – Inserts: 254 million / day
          – Updates: 1.35 million / day
          – Reads: 252.3 million / day
          – Deletes: 7,800 / day
      • The Database
          – Queries: 5-9k per sec
          – CPU Load: 45-65%
          – Nodes - Cores: 10 nodes - 80 cores
      Application Sees a Single RDBMS Instance
  • 26. Example: Very Heavy Update Workload (Bare-Metal)
      • The Application
          – Inserts: 31.4 million / day
          – Updates: 3.7 billion / day
          – Reads: 1 billion / day
          – Deletes: 4,300 / day
      • The Database
          – Queries: 35-55k per sec
          – CPU Load: 25-35%
          – Nodes - Cores: 8 nodes - 160 cores
      Application Sees a Single RDBMS Instance
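As a sanity check on slides 25 and 26, the daily statement counts can be turned into an average per-second rate. This sketch assumes the four daily counts are the complete statement mix and that load is averaged over a full 24 hours; neither is stated in the deck, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def avg_statements_per_sec(inserts: int, updates: int, reads: int, deletes: int) -> float:
    """Average statements/sec implied by daily counts (assumes a flat 24-hour day)."""
    return (inserts + updates + reads + deletes) / SECONDS_PER_DAY


# Slide 25 (AWS deployment): daily counts as given on the slide.
aws = avg_statements_per_sec(254_000_000, 1_350_000, 252_300_000, 7_800)

# Slide 26 (bare metal): daily counts as given on the slide.
bare_metal = avg_statements_per_sec(31_400_000, 3_700_000_000, 1_000_000_000, 4_300)

print(round(aws))         # 5876
print(round(bare_metal))  # 54762
```

Under those assumptions the averages come out near 5.9k and 54.8k statements/sec, which sit inside the 5-9k and 35-55k queries/sec ranges quoted on the two slides.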
  • 27. CLUSTRIX RDBMS: TECHNICAL OVERVIEW
      • MYSQL COMPATIBLE SHARED-NOTHING CLUSTERED RDBMS
      • FULL TRANSACTIONAL ACID COMPLIANCE ACROSS ALL NODES
      • ARCHITECTED FROM THE GROUND-UP TO ADDRESS: CAPACITY, ELASTICITY AND RESILIENCY
  • 28. ClustrixDB Overview: Fully Distributed & Consistent Cluster
      • Fully Consistent, and ACID-compliant database
          – Cross-node Transactions & JOINs
          – Optimized for OLTP
          – But also supports reporting SQL
      • All servers are read + write
      • All servers accept client connections
      • Tables & Indexes distributed across all nodes
          – Fully automatic distribution, re-balancing & re-protection
          – All Primary and Secondary Keys
      [Diagram labels: SQL-based Applications (High Concurrency; Custom: PHP, Java, Ruby, etc.; Packaged: Magento, etc.); HW or SW Load Balancer; ClustrixDB on commodity/cloud servers; Private Network]
  • 29. ClustrixDB – Shared Nothing Symmetric Architecture
      • Database Engine:
          – all nodes can perform all database operations (no leader, aggregator, leaf, data-only, or special nodes)
      • Query Compiler:
          – distributes compiled partial query fragments to the node containing the ranking replica
      • Data: Table Slices:
          – All table slices auto-redistributed by the Rebalancer (default: replicas=2)
      • Data Map:
          – all nodes know where all replicas are
      [Diagram: each ClustrixDB node contains Compiler, Map, Engine, and Data]
  • 30. Intelligent Data Distribution
      • Tables auto-split into slices
      • Every slice has a replica on another server
          – Auto-distributed and auto-protected
      [Diagram: database tables with billions of rows, split into slices S1–S5 spread across ClustrixDB nodes]
  • 31. Database Capacity And Elasticity
      • Easy and simple Flex Up (and Flex Down)
          – Flex multiple nodes at the same time
      • Data is automatically rebalanced across the cluster
      [Diagram: slices S1–S5 spread across ClustrixDB nodes]
  • 32. Built-in Fault Tolerance
      • No Single Point-of-Failure
          – No Data Loss
          – No Downtime
      • Server node goes down…
          – Data is automatically rebalanced across the remaining nodes
      [Diagram: slices S1–S5 spread across ClustrixDB nodes]
  • 33. Distributed Query Processing
      • Queries are fielded by any peer node
          – Routed to node holding the data
      • Complex queries are split into fragments processed in parallel
          – Automatically distributed for optimized performance
      [Diagram: queries and transactions (TRX) pass through a Load Balancer to ClustrixDB nodes]
  • 34. Automatic Cluster Data Rebalancing
      The ClustrixDB Rebalancer:
      • Initial Data: Distributes the data into even slices across nodes
      • Data Growth: Splits large slices into smaller slices
      • Failed Nodes: Re-protects slices to ensure proper replicas exist
      • Flex-Up/Flex-Down: Moves slices to leverage new nodes and/or evacuate nodes
      • Skewed Data: Re-distributes the data to even out across nodes
      • Hotness Detection: Finds hot slices and balances them across nodes
      Patent 8,543,538 - Systems and methods for redistributing data in a relational database
      Patent 8,554,726 - Systems and methods for reslicing data in a relational database
  • 35. Replication and Disaster Recovery
      • Asynchronous multi-point MySQL 5.6 Replication
      • ClustrixDB Parallel Backup: up to 10x faster
      • Replicate to any cloud, any datacenter, anywhere
      Patent 9,348,883 - Systems and methods for replication replay in a relational database
  • 37. ClustrixDB
      • Capacity
          – Massive read/write scalability
          – Very high concurrency
          – Linear throughput scale
      • Elasticity
          – Flex UP in minutes
          – Flex DOWN easily
          – Right-size resources on-demand
      • Resiliency
          – Automatic, 100% fault tolerance
          – No single point of failure
          – Battle-tested performance
      • Cloud
          – Cloud, VM, or bare-metal
          – Virtual Images available
          – Point/click Scale-out
  • 45. Sysbench OLTP 100% Reads (bare metal)
      • 100% Reads
          – Max throughput test
      • 1 TPS = 10 SQL
          – 10 SELECT
          – a.k.a. 10 operations/sec
      • Linearly scales TPS by adding servers:
          – Oak6 = 6 servers
          – Oak18 = 18 servers
          – Oak30 = 30 servers
      [Chart callout: >1 Million SQL/sec @ 20 ms]
  • 46. Yahoo! Cloud Serving Benchmark (YCSB) (AWS)
      • 95% Reads + 5% Writes
          – 1 Transaction/sec = 1 SQL
      • 100% Reads
      • Over 1 Million TPS
          – With 3 ms query response
          – Using 50 ClustrixDB servers
      [Chart callouts: > 1,000,000 TPS @ 3 ms; ClustrixDB scaled to 50 nodes (c3.2xl, 400 vCPU) in 1 day]
  • 47. CLUSTRIX RDBMS UNDER THE HOOD
      • DISTRIBUTION STRATEGY
      • REBALANCER TASKS
      • QUERY OPTIMIZER
      • EVALUATION MODEL
      • CONCURRENCY CONTROL
  • 48. ClustrixDB key components enabling Scale-Out
      • Shared-nothing architecture
          – Eliminates potential bottlenecks
      • Independent Index Distribution
          – Hash each distribution key to a 64-bit number space divided into ranges, with a specific slice owning each range
      • Rebalancer
          – Ensures optimal data distribution across all nodes
          – Rebalancer assigns slices to available nodes for data capacity and access balance
      • Query Optimizer
          – Distributed query planner, compiler, and distributed shared-nothing execution engine
          – Executes queries with max parallelism and many simultaneous queries concurrently
      • Evaluation Model
          – Parallelizes queries, which are distributed to the node(s) with the relevant data
      • Consistency and Concurrency Control
          – Using Multi-Version Concurrency Control (MVCC), Two-Phase Locking (2PL) on writes, and the Paxos Consensus Protocol
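The "Independent Index Distribution" bullet on slide 48 describes hashing each distribution key into a 64-bit space split into ranges, one slice per range. A minimal sketch of that idea follows. It is not ClustrixDB's implementation: the hash function (truncated SHA-256), the equal-width ranges, and every name here are assumptions chosen for illustration.

```python
import bisect
import hashlib

HASH_SPACE = 2**64  # size of the 64-bit number space named on the slide


def hash64(key: bytes) -> int:
    """Map a distribution key into [0, 2**64). The hash choice is an assumption."""
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def make_slice_bounds(num_slices: int) -> list[int]:
    """Split the space into contiguous ranges; return each range's exclusive upper bound."""
    return [(i + 1) * HASH_SPACE // num_slices for i in range(num_slices)]


def slice_for_hash(h: int, bounds: list[int]) -> int:
    """Index of the slice whose range contains hash value h."""
    # bisect_right counts the upper bounds that are <= h, i.e. the ranges h has passed.
    return bisect.bisect_right(bounds, h)


def slice_for_key(key: bytes, bounds: list[int]) -> int:
    """Index of the slice that owns a distribution key."""
    return slice_for_hash(hash64(key), bounds)
```

Because any node holding the range table can compute `slice_for_key` locally, a lookup can go straight to the owning slice without a broadcast. Splitting an oversized slice amounts to inserting a new bound into the table, which is one reason range-over-hash layouts are convenient to rebalance.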
  • 49. Rebalancer Process
      • User tables are vertically partitioned into representations
      • Representations are horizontally partitioned into slices
      • Rebalancer ensures:
          – The representation has an appropriate number of slices
          – Slices are well distributed around the cluster on storage devices
          – Slices are not placed on server(s) that are being flexed-down
          – Reads from each representation are balanced across the nodes
  • 50. ClustrixDB Rebalancer Tasks
      • Flex-UP
          – Re-distribute replicas to new nodes
      • Flex-DOWN
          – Move replicas from the flex-down nodes to other nodes in the cluster
      • Under-Protection (when a slice has fewer replicas than desired)
          – Create a new copy of the slice on a different node
      • Slice Too Big
          – Split the slice into several new slices and re-distribute them
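The task list on slide 50 can be read as a set of rules evaluated per slice. The sketch below shows one way such rules could be expressed; it is an illustration only, not the Rebalancer's actual logic. The `Slice` type, the task tuples, and the size threshold are hypothetical; only the default of two replicas comes from the deck (slide 29).

```python
from dataclasses import dataclass


@dataclass
class Slice:
    name: str
    size_bytes: int
    replica_nodes: list[str]  # nodes currently holding a replica of this slice


def plan_tasks(
    slices: list[Slice],
    desired_replicas: int = 2,          # default replica count per slide 29
    max_slice_bytes: int = 1 << 30,     # hypothetical "slice too big" threshold
    draining: frozenset[str] = frozenset(),  # nodes being flexed down
) -> list[tuple]:
    """Return the rebalancing tasks implied by the current slice layout."""
    tasks: list[tuple] = []
    for s in slices:
        # Flex-DOWN: move replicas off nodes that are leaving the cluster.
        for node in s.replica_nodes:
            if node in draining:
                tasks.append(("move", s.name, node))
        # Under-protection: fewer replicas than desired, so create the missing copies.
        missing = desired_replicas - len(s.replica_nodes)
        if missing > 0:
            tasks.append(("reprotect", s.name, missing))
        # Slice too big: split it so the pieces can be redistributed.
        if s.size_bytes > max_slice_bytes:
            tasks.append(("split", s.name))
    return tasks
```

For example, with `draining={"n3"}` and a slice stored on `["n2", "n3"]`, the plan contains a `("move", ..., "n3")` task; a slice stored on a single node yields a `("reprotect", ..., 1)` task. Choosing destination nodes (the Flex-UP case) is deliberately left out, since the deck does not describe the placement policy.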
  • 51. ClustrixDB Query Optimizer
      • The ClustrixDB Query Optimizer is modeled on the Cascades optimization framework
          – Other RDBMSs that leverage Cascades include Tandem's NonStop SQL and Microsoft's SQL Server
          – Cost-driven; extensible via a rule-based mechanism
          – Top-down approach
      • Query Optimizer must answer the following, per SQL query:
          – In what order should the tables be joined?
          – Which indexes should be used?
          – Should the sort/aggregate be non-blocking?
  • 52. ClustrixDB Evaluation Model
      • Parallel query evaluation
      • Massively Parallel Processing (MPP) for analytic queries
      • The Fair Scheduler ensures OLTP is prioritized ahead of OLAP
      • Queries are broken into fragments (functions)
      • Joins require more data movement by their nature
          – ClustrixDB is able to achieve minimal data movement
          – Each representation (table or index) has its own distribution map, allowing direct look-ups for which node/slice to go to next, removing broadcasts
          – There is no central node orchestrating data motion. Data moves directly to the next node it needs to go to. This reduces hops to the minimum possible given the data distribution
      [Diagram: compilation of "SELECT id, amount FROM donation WHERE id=15" into fragments. VM Fragment 1: Node := lookup id = 15, <forward to node>. VM Fragment 2: SELECT id, amount, <return>]
  • 53. Concurrency Control
      • Readers never interfere with writers (or vice-versa). Writers use explicit locking for updates
      • MVCC maintains a version of each row as writers modify rows
      • Readers have lock-free snapshot isolation while writers use 2PL to manage conflict
      [Diagram: timeline of readers and writers. Two writers on the same row: row conflict, one writer blocked. Readers alongside writers: no conflict, no blocking]
      Lock Conflict Matrix:
      • Reader vs. Reader: None
      • Reader vs. Writer: None
      • Writer vs. Reader: None
      • Writer vs. Writer: Row
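The lock conflict matrix on slide 53 is small enough to encode directly. The sketch below is a toy model of that matrix only, assuming integer row identifiers and a made-up `must_wait` helper; it does not model MVCC versions, 2PL lock lifetimes, or anything else about the real engine.

```python
# Lock conflict matrix from the slide: the only conflicting pair is
# writer vs. writer, and the conflict is at row granularity.
CONFLICT = {
    ("reader", "reader"): None,
    ("reader", "writer"): None,
    ("writer", "reader"): None,
    ("writer", "writer"): "row",
}


def must_wait(holder: str, requester: str, holder_row: int, requester_row: int) -> bool:
    """True if the requester has to block behind the current holder."""
    granularity = CONFLICT[(holder, requester)]
    # A row-level conflict only matters when both touch the same row.
    return granularity == "row" and holder_row == requester_row


print(must_wait("writer", "writer", 15, 15))  # True: same row, one writer blocks
print(must_wait("writer", "writer", 15, 16))  # False: different rows
print(must_wait("writer", "reader", 15, 15))  # False: readers use snapshots
```

Readers never appear on the blocking side because, per the slide, they read from an MVCC snapshot rather than taking locks.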

Editor's Notes

  1. Before we begin: (1) Much of today’s presentation comes from the presentation I did at Percona Live earlier this year. (2) In general I'd like to keep it generic, but will focus on AWS, because this is an AWS meetup :-D (3) But for reference, our database ClustrixDB does run on any cloud or datacenter, so if you'd like to discuss any other cloud, I'd be happy to answer your questions.
  2. Let’s start by positioning ‘RDBMS’ in the current Database Landscape. There’s a whole spectrum of DBs out there, & it can be confusing. We’re talking about OLTP, the stuff on the left. MySQL is a general-purpose RDBMS: it can be used for OLTP, & for OLAP… but like any general-purpose RDBMS it’s not ideal for either. This has created an explosion of specific databases, and we can see how they fit across OLTP-to-OLAP, & how they scale (up or out). Specifically, what we’re talking about today is OLTP/transactional workloads.
  3. When we talk about Scaling a general-purpose RDBMS like MySQL, there can be a lot of trade-offs. So let’s emphasize three dimensions which are critical to an Enterprise deployment… And for reference, when I say “MySQL”, I’m going to start with a sweeping generalization and club all the MySQL variants together: MySQL itself, Percona, MariaDB, Google Cloud SQL, Azure ClearDB, RDS MySQL, & RDS Aurora to some extent. In general, if your code-base leverages MySQL code, then we’re putting them in the same high-level ‘grouping’ for right now, and we’ll differentiate them further later.
  4. Now that we’ve introduced 3 Dimensions for Enterprise Scaling (Capacity, Elasticity, & Resiliency), it’s also very good to keep in mind some of the core Features of an RDBMS. These are critical for the Application, but they are often what gets ‘relaxed’ in search of Scale. For an application needing an RDBMS, especially for OLTP workloads, relaxing them is NOT an option, and they need to be addressed in any scaling strategy. CAP: Consistency; Availability; Partition Tolerance (CLX is CP). BASE: Basically Available; Soft State; Eventual Consistency.
  5. Latency, Response time- eg Reports for Larry
  6. Pinterest – does NOT WANT TO DEAL W/ READ LATENCY Each pod is MASTER/MASTER
  7. ACID properties still a challenge with cross-shard transactions, and additional complexity is now added with the management layer
  8. Marketo, Salesforce, etc
  9. Now that we’ve reviewed the main RDBMS scaling strategies from the standpoint of ‘Capacity’ (i.e., how much more hardware can you add?), let’s revisit each scaling strategy from the standpoint of how Elastic each is. How FAST can you scale each strategy?
  10. Rather than going thru the deck again, let’s do it as an overview:
  11. Now let’s review each scaling strategy from the standpoint of how Resilient each is. How fault-tolerant is each strategy? Staples, Best Buy
  12. Here’s a high-level overview…
  13. But the PROOF is in the pudding- let’s see some examples of how ClustrixDB can scale. Here’s a whole bunch of pretty lines- what’s important here, is how each line scales
  14. For example, at 20ms CLX is 4X Aurora. Let’s say you have an application that needs 20ms
  15. Simple queries: fielded by any node, routed to the data node. Complex queries: split into query fragments, with the fragments processed in parallel.
  16. Building a scalable distributed database requires two things: distributing the data intelligently, and moving the queries to the data.
  17. We've automated away a lot of the complexity in a distributed DB, so users and applications just see a single DB that looks like MySQL
  18. Clustrix supports MySQL replication both as master and slave, so you can replicate both ways. Within a cluster, we saw earlier that all data has multiple copies. For Disaster Recovery (when a whole region loses power) Clustrix has 2 options: Fast Parallel Backup (this is in addition to the slower mysqldump backup) and Fast Parallel Replication (this is asynchronous across two Clustrix clusters).
  19. "Imagine if you had to scale MySQL to 50 nodes - how many weeks it would take to get it all working? With Clustrix we did in one day."