© 2014 CLUSTRIX© 2015 CLUSTRIX
Database Scaling Strategies,
in the Cloud & on the Rack
Robbie Mihalyi
@Clustrix
SQL SCALE-OUT
ClustrixDB Overview2
Resiliency
Capacity
Elasticity
Cloud
o Commoditized hardware resources
 Rapid deployment and pay by the hour
o Access
 Publish your applications quickly
 Use existing services from provider
o Capacity
 Scale resources as you need them
[Diagram: cloud service layers: Utility Computing (bare metal), Platform as a Service (PaaS), SaaS]
o Virtualized (Shared) Resources
 You do not always get the performance
envelope you ask for
o Dedicated (Hardware) Resources
 Available but expensive
 Less flexible
E-Commerce Applications
Example of a Great Match for Cloud
o Need for capacity varies by seasonality and specific events
 Some events can generate 10x normal traffic & increased conversion rates
o Sensitive to performance characteristics
 Throughput and latency
o Up-time is most crucial at the busiest time
 Every minute of downtime can mean thousands of dollars in lost revenue
SQL SCALE-OUT
Resiliency
Capacity
Elasticity
SCALE
 Data, Users, Session
THROUGHPUT
 Concurrency, Transactions
LATENCY
 Response Time
Application Scaling (App Layer Only)
Easy Installation and Setup
o Load-Balancer
 HAProxy or equivalent
 Distributes incoming requests
o Scale out by adding servers
 All servers are the same – no master
o Redundant backend network
 Low-latency cluster intercommunication
[Diagram: a load balancer distributing requests across three identical APP servers on commodity hardware]
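A minimal sketch of how a load balancer can distribute incoming requests across identical application servers. It assumes a plain round-robin policy, which is only one of the policies a balancer such as HAProxy can apply; the class name and server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def route(self, request):
        # All servers are the same (no master), so any server can take any request.
        return next(self._rotation), request

    def add_server(self, server):
        # Scale out: add a server and restart the rotation over the larger pool.
        self._servers.append(server)
        self._rotation = cycle(self._servers)


balancer = RoundRobinBalancer(["app1", "app2", "app3"])
assignments = [balancer.route(f"req{i}")[0] for i in range(6)]
```

Because no server holds special state, scaling out is only a matter of calling `add_server`; the database layer discussed next does not have that luxury.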
Application Scaling (Database Layer)
Database Scaling Is Very Hard
o Data Consistency
o Read vs. Write Scale
o ACID Properties (if you care about them)
o Throughput and Latency
o Application Impact
Non-Relational (NoSQL) Database Architectures
o No imposed structure
o Relaxed or no ACID properties
 BASE – alternative to ACID
o Fast and Scalable
o Suited for specific applications
 IoT, click-stream, object store, document
 Good for Insert workload
 Not good for read / query apps
o RDBMS will provide fast non-structured data
store
RDBMS SCALING
Scaling-Up
o Keep increasing the size of the (single) database server
o Pros
 Simple, no application changes needed
o Cons
 Expensive. At some point, you’re paying 5x for 2x the performance
 ‘Exotic’ hardware (128 cores and above) becomes price-prohibitive
 Eventually you ‘hit the wall’, and you literally cannot scale-up anymore
Scaling Reads: Master/Slave
o Add a ‘Slave’ read-server(s) to your ‘Master’ database server
o Pros
 Reasonably simple to implement.
 Read/write fan-out can be done at the proxy level
o Cons
 Only adds Read performance
 Data consistency issues can occur, especially if the application isn’t coded to
ensure reads from the slave are consistent with reads from the master
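A minimal sketch of read/write fan-out at the proxy level. It assumes statements can be classified by their leading keyword, which is a simplification; the `ReadWriteRouter` class, the `require_fresh` flag, and the server names are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Sends writes to the master and spreads reads across the slaves."""

    READ_PREFIXES = ("SELECT", "SHOW")

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql, require_fresh=False):
        is_read = sql.lstrip().upper().startswith(self.READ_PREFIXES)
        # Writes always go to the master. Reads that must see the latest
        # committed data also go there, because a slave may lag behind.
        if not is_read or require_fresh or self._slaves is None:
            return self.master
        return next(self._slaves)


router = ReadWriteRouter("master", ["slave1", "slave2"])
```

The `require_fresh` flag is where the consistency con shows up: the application has to know which reads can tolerate a lagging slave. A keyword check like this would also misroute locking reads such as `SELECT ... FOR UPDATE`, which belong on the master.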
Scaling Writes: Master/Master
o Add additional ‘Master’(s) to your ‘Master’ database server
o Pros
 Adds Write scaling without needing to shard
o Cons
 Adds write scaling at the cost of read-slaves
 Adding read-slaves would add even more latency
 Application changes are required to ensure data consistency / conflict resolution
Scaling Reads & Writes: Sharding
SHARD01 SHARD02 SHARD03 SHARD04
o Partitioning tables across separate database servers
o Pros
 Adds both write and read scaling
o Cons
 Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID
 ACID compliance & transactionality must be managed at the application level
 Consistent backups across all the shards are very hard to manage
 Read and Writes can be skewed / unbalanced
 Application changes can be significant
A - K L - O P - S T - Z
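A minimal sketch of the range-based routing the four key ranges imply, assuming the shard key is the first letter of a name. The function name, shard labels, and sample names are illustrative only.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bound of each shard's key range: A-K, L-O, P-S, T-Z.
SHARD_UPPER_BOUNDS = ["K", "O", "S", "Z"]
SHARD_NAMES = ["SHARD01", "SHARD02", "SHARD03", "SHARD04"]


def shard_for(key: str) -> str:
    """Return the shard that owns a key, based on its first letter."""
    first = key.strip()[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"key {key!r} does not start with a letter A-Z")
    # bisect_left finds the first range whose upper bound is >= the letter.
    return SHARD_NAMES[bisect_left(SHARD_UPPER_BOUNDS, first)]


# Hypothetical customer names, used only to show how load can skew.
names = ["Alice", "Bob", "Carol", "Dave", "Mallory", "Trent"]
load = Counter(shard_for(n) for n in names)
```

With these sample names four of the six keys land on the first shard, which is the skew listed under Cons. Note also that every query in the application now has to call something like `shard_for` before it can touch the database.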
Scaling Reads & Writes: MySQL Cluster
o Provides shared-nothing clustering and auto-sharding for MySQL. (designed for Telco
deployments: minimal cross-node transactions, HA emphasis)
o Pros
 Distributed, multi-master model
 Provides high availability and high throughput
o Cons
 Only supports read-committed isolation
 Long-running transactions can block a node restart
 Statement-based replication (SBR) not supported
 Range scans are expensive and slower than in standard MySQL
 Unclear how it scales with many nodes
Application Workload Partitioning
o Partition entire application + RDBMS stack across
several “pods”
o Pros
 Adds both write and read scaling
 Flexible: can keep scaling with addition of pods
o Cons
 No data consistency across pods (only suited for cases
where it is not needed)
 High overhead in DBMS maintenance and upgrade
 Queries / Reports across all pods can be very complex
 Complex environment to setup and support
[Diagram: six APP instances split across pods]
SQL SCALE-OUT
Resiliency
Capacity
Elasticity
Ease of ADDING and REMOVING resources
Flex Up or Down
 Capacity On-Demand
Adapt Resources to Price-Performance Requirements
Elasticity – flexing up and down
Scaling Options               Flex UP                  Flex DOWN
o Application (only)          Easy                     Easy
o NoSQL databases             Easy                     Unclear if it is possible
o Scale-up                    Expensive                Not applicable
o Master – Slave              Reasonably simple        Turn off read slaves
o Master – Master             Involved                 Involved
o Sharding                    Expensive and complex    Not feasible
o MySQL Cluster               Involved                 Involved
o Application Partitioning    Expensive and complex    Expensive and complex
SQL SCALE-OUT
Resiliency
Resilience to Failures
 Hardware or Software
Fault Tolerance and High Availability
Capacity
Elasticity
Resiliency – high availability and fault tolerance
Scaling Options               Resilience to failures
o Application (only)          No single point of failure – failed node bypassed
o NoSQL databases             Support exists
o Scale-up                    One large machine: single point of failure
o Master – Slave              Fail-over to Slave
o Master – Master             Resilient to one of the Masters failing
o Sharding                    Multiple points of failure
o MySQL Cluster               No single point of failure
o Application Partitioning    Multiple points of failure
RDBMS Capacity, Elasticity and Resiliency
RDBMS Scaling      Capacity                      Elasticity   Resiliency                   Application Impact
Scale-up           Many cores – very expensive   No           Single point of failure      None
Master – Slave     Reads only                    No           Fail-over                    Yes – for read scale
Master – Master    Read / Write                  No           Yes                          High – update conflict
MySQL Cluster      Read / Write                  No           Yes                          None (or minor)
Sharding           Unbalanced Read / Writes      No           Multiple points of failure   Very high
CLUSTRIXDB
 FULLY ACID-COMPLIANT RDBMS
 MYSQL COMPATIBLE
 ARCHITECTED FROM THE GROUND-UP TO ADDRESS:
CAPACITY, ELASTICITY AND RESILIENCY.
ClustrixDB – Shared Nothing Symmetric Architecture
Each Node Contains
o Database Engine:
 all nodes can perform all database operations (no
leader, aggregator, leaf, data-only, special nodes)
o Query Compiler:
 distribute compiled partial query fragments to the
node containing the ranking replica
o Data: Table Slices:
 All table slices auto-redistributed by the
Rebalancer (default: replicas=2)
o Data Map:
 all nodes know where all replicas are
[Diagram: three ClustrixDB nodes, each with its own Compiler, Map, Engine, and Data]
Intelligent Data Distribution
o Tables auto-split into slices
o Every slice has a replica on another server
 Auto-distributed and auto-protected
[Diagram: database tables with billions of rows split into slices S1-S5; each slice and its replica sit on different ClustrixDB nodes]
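A minimal sketch of the placement rule stated above: every slice gets a replica on another server. The round-robin assignment is an assumption made for illustration; the slides only say that the Rebalancer distributes slices automatically. The function and node names are hypothetical.

```python
def place_slices(slices, nodes, replicas=2):
    """Assign each slice's replicas to distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, slice_name in enumerate(slices):
        # Consecutive node indexes (mod cluster size) are always distinct
        # as long as replicas <= len(nodes).
        placement[slice_name] = [
            nodes[(i + r) % len(nodes)] for r in range(replicas)
        ]
    return placement


placement = place_slices(
    ["S1", "S2", "S3", "S4", "S5"], ["node1", "node2", "node3"]
)
```

Since no slice keeps both replicas on one node, losing any single node still leaves one copy of every slice available.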
Database Capacity And Elasticity
o Easy and simple Flex Up (and Flex Down)
 Flex multiple nodes at the same time
o Data is automatically rebalanced
across the cluster
o All servers handle writes + reads
o Application always sees a single
Database instance
Built-in Fault Tolerance
o No Single Point-of-Failure
 No Data Loss
 No Downtime
o Server node goes down…
 Data is automatically rebalanced across
the remaining nodes
Distributed Query Processing
o Queries are fielded by any peer node
 Routed to node holding the data
o Complex queries are split into fragments processed in parallel
 Automatically distributed for optimized performance
[Diagram: a load balancer routing transactions (TRX) to the ClustrixDB nodes]
Replication and Disaster Recovery
o Asynchronous multi-point replication
 Replicate to any cloud, any datacenter, anywhere
o Parallel backup
 Up to 10x faster
CLUSTRIXDB
UNDER THE HOOD
o DISTRIBUTION STRATEGY
o REBALANCER TASKS
o QUERY OPTIMIZER
o EVALUATION MODEL
o CONCURRENCY CONTROL
ClustrixDB key components enabling Scale-Out
o Shared-nothing architecture
 Eliminates potential bottlenecks.
o Independent Index Distribution
 Hash each distribution key to a 64-bit number space divided into ranges with a specific slice
owning each range
o Rebalancer
 Ensures optimal data distribution across all nodes.
 Rebalancer assigns slices to available nodes for data capacity and access balance
o Query Optimizer
 Distributed query planner, compiler, and distributed shared-nothing execution engine
 Executes each query with maximum parallelism and runs many queries concurrently.
o Evaluation Model
 Parallelizes queries, which are distributed to the node(s) with the relevant data.
o Consistency and Concurrency Control
 Uses Multi-Version Concurrency Control (MVCC) and Two-Phase Locking (2PL)
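A minimal sketch of the hash-range lookup described under Independent Index Distribution: a distribution key is hashed into a 64-bit space, and the slice owning the matching range is found by binary search. The hash function (SHA-256 truncated to 8 bytes) and the equal-width ranges are assumptions; the slides do not say which hash ClustrixDB uses or how range boundaries are chosen.

```python
import hashlib
from bisect import bisect_right

HASH_SPACE = 2 ** 64


def hash64(key) -> int:
    """Map a distribution key to a 64-bit unsigned integer (stand-in hash)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def make_ranges(num_slices: int):
    """Split the 64-bit space into equal ranges; return each range's start."""
    return [i * HASH_SPACE // num_slices for i in range(num_slices)]


def slice_for(key, range_starts) -> int:
    """Return the index of the slice whose range contains hash64(key)."""
    # The first range starts at 0, so the result is never negative.
    return bisect_right(range_starts, hash64(key)) - 1


range_starts = make_ranges(4)
owner = slice_for(15, range_starts)
```

Under this reading, "independent" distribution means each table or index keeps its own `range_starts` list, so an index can be sliced on its own key rather than following the base table's layout.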
Rebalancer Process
o User tables are vertically partitioned into representations.
o Representations are horizontally partitioned into slices.
o Rebalancer ensures:
 The representation has an appropriate number of slices.
 Slices are well distributed around the cluster on storage devices
 Slices are not placed on server(s) that are being flexed-down.
 Reads from each representation are balanced across the nodes
ClustrixDB Rebalancer Tasks
o Flex-UP
 Re-distribute replicas to new nodes
o Flex-DOWN
 Move replicas from the flex-down nodes to other nodes in the cluster
o Under-Protection – when a slice has fewer replicas than desired
 Create a new copy of the slice on a different node.
o Slice Too Big
 Split the slice into several new slices and re-distribute them
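The four tasks above can be sketched as a single planning pass over the cluster state. This is an illustration under stated assumptions, not the actual Rebalancer: the `Slice` type, the task names, and the `max_slice_bytes` threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Slice:
    name: str
    size_bytes: int
    replica_nodes: list = field(default_factory=list)


def plan_tasks(slices, active_nodes, flex_down_nodes=(), replicas=2,
               max_slice_bytes=1_000_000):
    """Return the rebalancing tasks needed for the current cluster state."""
    tasks = []
    for s in slices:
        live = [n for n in s.replica_nodes if n in active_nodes]
        # Under-protection: the slice has fewer live replicas than desired.
        if len(live) < replicas:
            tasks.append(("copy_slice", s.name))
        # Flex-DOWN: move replicas off nodes that are being removed.
        for node in live:
            if node in flex_down_nodes:
                tasks.append(("move_replica", s.name, node))
        # Slice too big: split it into several new slices.
        if s.size_bytes > max_slice_bytes:
            tasks.append(("split_slice", s.name))
    # Flex-UP: nodes that hold no replicas yet should receive some.
    used = {n for s in slices for n in s.replica_nodes}
    for node in active_nodes:
        if node not in used and node not in flex_down_nodes:
            tasks.append(("redistribute_to", node))
    return tasks


tasks = plan_tasks(
    [
        Slice("S1", 10, ["n1", "n2"]),
        Slice("S2", 10, ["n2", "n9"]),         # n9 has failed
        Slice("S3", 5_000_000, ["n1", "n3"]),  # oversized, and n3 is leaving
    ],
    active_nodes=["n1", "n2", "n3", "n4"],     # n4 was just added
    flex_down_nodes=["n3"],
)
```

In this example every task type fires once: S2 is re-protected, S3's replica leaves n3 and the slice is split, and the new node n4 is queued to receive replicas.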
ClustrixDB Query Optimizer
o The ClustrixDB Query Optimizer is modeled on the Cascades optimization framework.
 Other RDBMSs that leverage Cascades include Tandem's NonStop SQL and Microsoft's SQL Server.
 Cost-driven; extensible via a rule-based mechanism
 Top-down approach
o Query Optimizer must answer the following, per SQL query:
 In what order should the tables be joined?
 Which indexes should be used?
 Should the sort/aggregate be non-blocking?
ClustrixDB Evaluation Model
o Parallel query evaluation
o Massively Parallel Processing (MPP) for analytic queries
o The Fair Scheduler ensures OLTP is prioritized ahead of OLAP
o Queries are broken into fragments (functions).
o Joins require more data movement by their nature.
 ClustrixDB is able to achieve minimal data movement
 Each representation (table or index) has its own distribution map,
allowing direct look-ups for which node/slice to go to next, removing
broadcasts.
 There is no central node orchestrating data motion. Data moves
directly to the next node it needs to go to. This reduces hops to the
minimum possible given the data distribution.
[Diagram: COMPILATION of the query below into two FRAGMENTS, each executed in a VM]
Query:
SELECT id, amount
FROM donation
WHERE id=15
FRAGMENT 1 (VM)
Node := lookup id = 15
<forward to node>
FRAGMENT 2 (VM)
SELECT id, amount
<return>
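The two fragments can be sketched as two small functions: the first runs on whichever node received the query and looks up the owner of id = 15, the second runs on the owning node and returns the selected columns. The node names, sample rows, and the dictionary scan standing in for the distribution map are all assumptions made for illustration.

```python
# Hypothetical three-node cluster: each node stores a disjoint set of
# `donation` rows, keyed by id.
NODE_DATA = {
    "node1": {3: {"id": 3, "amount": 20, "donor": "a"}},
    "node2": {15: {"id": 15, "amount": 75, "donor": "b"}},
    "node3": {42: {"id": 42, "amount": 5, "donor": "c"}},
}


def lookup_node(row_id):
    """Stand-in for the distribution map: which node owns this id?"""
    for node, rows in NODE_DATA.items():
        if row_id in rows:
            return node
    return None


def fragment_2(node, row_id):
    """Runs on the owning node: SELECT id, amount; <return>."""
    row = NODE_DATA[node][row_id]
    return {"id": row["id"], "amount": row["amount"]}


def fragment_1(receiving_node, row_id):
    """Runs on the node that received the query: lookup, then forward."""
    target = lookup_node(row_id)
    if target is None:
        return None  # no such row
    # <forward to node>: in this sketch forwarding is just a function call.
    hops = 0 if target == receiving_node else 1
    return {"served_by": target, "hops": hops, "row": fragment_2(target, row_id)}


result = fragment_1("node1", 15)
```

The query goes straight from the receiving node to the node holding the row, with no broadcast and no coordinator in between, so a point lookup costs at most one hop.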
Concurrency Control
[Diagram: timeline of two readers and three writers. Readers and writers proceed with no conflict and no blocking; on a row conflict between writers, one writer is blocked.]
o Readers never interfere with writers (or vice-versa). Writers use explicit locking for updates
o MVCC maintains a version of each row as writers modify rows
o Readers have lock-free snapshot isolation while writers use 2PL to manage conflict
Lock Conflict Matrix
          Reader   Writer
Reader    None     None
Writer    None     Row
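A toy sketch of the lock conflict matrix: readers work from a snapshot and never take locks, while two writers conflict only when they touch the same row. It is a single-process illustration with one-row transactions, not a description of ClustrixDB internals; the `Store` class and its method names are hypothetical.

```python
class Store:
    """Toy MVCC store: readers use snapshots, writers take row locks."""

    def __init__(self):
        self._versions = {}  # row key -> list of (commit_ts, value)
        self._locks = {}     # row key -> id of the writer holding the lock
        self._clock = 0

    def snapshot(self):
        # A reader remembers the commit timestamp at which it started.
        return self._clock

    def read(self, key, snapshot_ts):
        # Lock-free: newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def try_lock(self, key, writer_id):
        # Writer/writer conflict on the same row: the second writer is
        # refused here (a real engine would block it until release).
        holder = self._locks.setdefault(key, writer_id)
        return holder == writer_id

    def commit(self, key, value, writer_id):
        if self._locks.get(key) != writer_id:
            raise RuntimeError("writer must hold the row lock to commit")
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        del self._locks[key]


store = Store()
store.try_lock("row1", "w1")
store.commit("row1", "v1", "w1")
reader_snapshot = store.snapshot()
```

A reader holding `reader_snapshot` keeps seeing "v1" even while a later writer holds the lock on `row1` or commits a new version, which matches the Reader/Writer "None" cells; only a second writer on the same row is turned away.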
CLUSTRIXDB
DEPLOYMENT EXAMPLES
Example: Huge Write Workload (AWS Deployment)
The Application
Inserts 254 million / day
Updates 1.35 million / day
Reads 252.3 million / day
Deletes 7,800 / day
The Database
Queries 5-9k per sec
CPU Load 45-65%
Nodes - Cores 10 nodes - 80 cores
Example: Huge Update Workload (Bare-Metal Deployment)
The Application
Inserts 31.4 million / day
Updates 3.7 billion / day
Reads 1 billion / day
Deletes 4,300 / day
The Database
Queries 35-55k per sec
CPU Load 25-35%
Nodes - Cores 6 nodes - 120 cores
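As a rough cross-check, the daily operation counts from the two deployment examples can be converted into average operations per second. The sketch assumes each operation is one query and that load is spread evenly over the day, which real traffic is not; the dictionary and function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily operation counts copied from the two deployment examples.
AWS_DEPLOYMENT = {
    "inserts": 254_000_000,
    "updates": 1_350_000,
    "reads": 252_300_000,
    "deletes": 7_800,
}
BARE_METAL_DEPLOYMENT = {
    "inserts": 31_400_000,
    "updates": 3_700_000_000,
    "reads": 1_000_000_000,
    "deletes": 4_300,
}


def average_ops_per_second(daily_counts):
    """Average operations per second, assuming an evenly spread load."""
    return sum(daily_counts.values()) / SECONDS_PER_DAY


aws_avg = average_ops_per_second(AWS_DEPLOYMENT)
bare_metal_avg = average_ops_per_second(BARE_METAL_DEPLOYMENT)
```

The averages come out at roughly 5.9k operations per second for the AWS deployment and roughly 54.8k for the bare-metal deployment, both inside the query-rate ranges quoted for each database (5-9k and 35-55k per second).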
CLUSTRIXDB
IN DEVELOPMENT
Next Release
o Additional Performance Improvements
 Further improvements to read and write scaling
o Deployment and Provisioning Optimization
 Cloud templates and deployment scripts
 Instance testing and validation
o New Admin architecture and much improved Web UI
 Services based architecture with (RESTful) API
 Simplified single-click FLEX Management
 Significant Graphing and Reporting improvements
 Multi-Cluster topology view and management
New Web UI – Enhanced Dashboard
New Web UI – Historical Workload Comparison
New Web UI – FLEX Administration
FINAL THOUGHTS
Capacity
o Massive read/write scalability
o Very high concurrency
o Linear throughput scale
Elasticity
o Flex UP in minutes
o Flex DOWN easily
o Right-size resources on-demand
Resiliency
o Automatic, 100% fault tolerance
o No single point of failure
o Battle-tested performance
Flexible Deployment
o Cloud, VM, or bare-metal
o Virtual images available
o Point/click scale-out
ClustrixDB
Thank You.
facebook.com/clustrix
www.clustrix.com
@clustrix
linkedin.com/clustrix
Competitive Cluster Solutions
o Most MySQL clustering solutions leverage Master/Master via
replication:
 MySQL Cluster
 Galera (open-source library)
 Percona XtraDB Cluster (leverages Galera replication library)
 Tungsten
o ClustrixDB does NOT use replication to keep all the servers in
sync
 Replication cannot scale writes as highly as our own technology
 Replication has inherent potential consistency and latency issues
 Transactional workloads such as OLTP (e.g. E-Commerce) are
exactly the workloads that replication struggles the most with
MySQL Cluster
o Provides shared-nothing clustering and auto-sharding for MySQL (designed for
Telco deployments: minimal cross-node transactions, HA emphasis)
o Pros:
 Distributed, multi-master with no SPOF
 Designed to provide high availability and high throughput with low latency, while
allowing for near linear scalability
 Synchronous replication, 2-Phase Commit
o Cons:
 Global checkpoint is 2sec. “There are no guaranteed durable COMMITs to disk”
 Only supports read_committed isolation
 “MySQL cluster does not handle large transactions well”
 Long-running transactions can block a node restart
 Overflow of data in replication stream drops node from cluster, consistency loss
 ‘True’ HA requires multiple replication lines; “1 is not sufficient” for HA
 DELETEs release memory for same-table; full release requires cluster rolling restart
 Range scans are expensive and low(er) performance than MySQL
 No distributed table locks
Galera Cluster
o A multi-master topology using its own replication protocol (designed
primarily for High-Availability, and secondarily for scale)
o Pros:
 Writes to any master are replicated to the other master(s) in sync, ensuring all
masters have the same data.
 It is open source, and 24/7 Support can be purchased for $7,950/yr/server. Percona
also provides support, for a higher price.
o Cons:
 Write-scale is limited. Galera support recommends that writes go to one master,
rather than be distributed across the nodes. That helps with isolation issues, but
increases consistency and latency issues across the nodes.
 Snapshot isolation does NOT use first-committer-wins (and so fails Aphyr Jepsen
CAP tests). ClustrixDB does use first-committer wins for snapshot consistency
 Writesets are processed as a single memory-resident buffer and as a result,
extremely large transactions (e.g. LOAD DATA) may adversely affect node
performance.
 Locking is lax with DDL. E.g., if your DML transaction uses a table, and a parallel DDL
statement is started, Galera won’t wait for a metadata lock, causing potential
consistency issues
Percona XtraDB Cluster
o Is an active/active high availability and high scalability open source solution for
MySQL® clustering. It integrates Percona Server and Percona XtraBackup with the
Galera replication library
o Pros:
 Synchronous replication
 Multi-master replication support
 Parallel replication
 Automatic node provisioning
o Cons:
 Not designed for write scaling
 SELECT FOR UPDATE can easily create deadlocks
 Not true synchronous replication, but ‘virtually synchronous’: The data is committed on the
originating node and ack is sent to the application, but the other nodes are committed
asynchronously. This can lead to consistency issues for applications reading from the other
nodes
 “If multiple nodes are used, the ability to read your own writes is not guaranteed. In that case,
a certified transaction, which is already committed on the originating node can still sit in the
receive queue of the node the application is reading from, waiting to be applied.”
Tungsten Replicator
o Is an open source replication engine. Compatible with MySQL, Oracle, and
Amazon RDS; NoSQL stores such as MongoDB, and data warehouse
stores such as Vertica, InfiniDB, and Hadoop
o Pros:
 Allows data to be exchanged between different databases and different database
versions
 During replication, information can be filtered and modified, and deployment can
be between on-premise or cloud-based databases
 For performance, Tungsten Replicator includes support for parallel replication,
and advanced topologies such as fan-in, star and multi-master, and can be used
efficiently in cross-site deployments
o Cons:
 Very complicated to set up and maintain
 No automated management, automated failover, transparent connections, nor
built-in conflict resolution
 Only allows asynchronous replication
 Cannot suppress slave-side triggers. Need to alter each trigger to add an IF
statement that prevents the trigger from running on the slave.
ClustrixDB Overview52

More Related Content

What's hot

Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataChen Robert
 
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...DataStax
 
The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)
The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)
The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)Ontico
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Amazon Web Services
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...DataStax
 
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016DataStax
 
Scylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File Format
Scylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File FormatScylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File Format
Scylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File FormatScyllaDB
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseDataStax
 
How to power microservices with MariaDB
How to power microservices with MariaDBHow to power microservices with MariaDB
How to power microservices with MariaDBMariaDB plc
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introductionPooyan Mehrparvar
 
Introduction to NuoDB
Introduction to NuoDBIntroduction to NuoDB
Introduction to NuoDBSandun Perera
 
Streaming all over the world Real life use cases with Kafka Streams
Streaming all over the world  Real life use cases with Kafka StreamsStreaming all over the world  Real life use cases with Kafka Streams
Streaming all over the world Real life use cases with Kafka Streamsconfluent
 
Bi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stackBi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stackIvan Donev
 
Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019Ivan Donev
 
Discovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clustersDiscovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clustersIvan Donev
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Amazon Web Services
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016Łukasz Grala
 
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711Dave Anselmi
 

What's hot (20)

Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
 
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
 
The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)
The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)
The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
 
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
 
Scylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File Format
Scylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File FormatScylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File Format
Scylla Summit 2018: Scylla Feature Talks - SSTables 3.0 File Format
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud Database
 
How to power microservices with MariaDB
How to power microservices with MariaDBHow to power microservices with MariaDB
How to power microservices with MariaDB
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
Introduction to NuoDB
Introduction to NuoDBIntroduction to NuoDB
Introduction to NuoDB
 
Streaming all over the world Real life use cases with Kafka Streams
Streaming all over the world  Real life use cases with Kafka StreamsStreaming all over the world  Real life use cases with Kafka Streams
Streaming all over the world Real life use cases with Kafka Streams
 
NoSQL Seminer
NoSQL SeminerNoSQL Seminer
NoSQL Seminer
 
Bi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stackBi and AI updates in the Microsoft Data Platform stack
Bi and AI updates in the Microsoft Data Platform stack
 
Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019Discovery Day 2019 Sofia - What is new in SQL Server 2019
Discovery Day 2019 Sofia - What is new in SQL Server 2019
 
Discovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clustersDiscovery Day 2019 Sofia - Big data clusters
Discovery Day 2019 Sofia - Big data clusters
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016
 
Voldemort
VoldemortVoldemort
Voldemort
 
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
 

Viewers also liked

AWS Summit 2011: High Availability Database Architectures in AWS Cloud
AWS Summit 2011: High Availability Database Architectures in AWS CloudAWS Summit 2011: High Availability Database Architectures in AWS Cloud
AWS Summit 2011: High Availability Database Architectures in AWS CloudAmazon Web Services
 
High availability solution database mirroring
High availability solution database mirroringHigh availability solution database mirroring
High availability solution database mirroringMustafa EL-Masry
 
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.Clustrix
 
Clustrix Database Overview
Clustrix Database OverviewClustrix Database Overview
Clustrix Database OverviewClustrix
 
Scaling Techniques to Increase Magento Capacity
Scaling Techniques to Increase Magento CapacityScaling Techniques to Increase Magento Capacity
Scaling Techniques to Increase Magento CapacityClustrix
 
Db performance optimization with indexing
Db performance optimization with indexingDb performance optimization with indexing
Db performance optimization with indexingRajeev Kumar
 
Operating Consul as an Early Adopter
Operating Consul as an Early AdopterOperating Consul as an Early Adopter
Operating Consul as an Early AdopterNelson Elhage
 
Microsoft SQL High Availability and Scaling
Microsoft SQL High Availability and ScalingMicrosoft SQL High Availability and Scaling
Microsoft SQL High Availability and ScalingJustin Whyte
 
Clustrix Database Percona Ruby on Rails benchmark
Clustrix Database Percona Ruby on Rails benchmarkClustrix Database Percona Ruby on Rails benchmark
Clustrix Database Percona Ruby on Rails benchmarkClustrix
 
Why Traditional Databases Fail so Miserably to Scale with E-Commerce Site Growth
Why Traditional Databases Fail so Miserably to Scale with E-Commerce Site GrowthWhy Traditional Databases Fail so Miserably to Scale with E-Commerce Site Growth
Why Traditional Databases Fail so Miserably to Scale with E-Commerce Site GrowthClustrix
 
Moving an E-commerce Site to AWS. A Case Study
Moving an  E-commerce Site to AWS. A Case StudyMoving an  E-commerce Site to AWS. A Case Study
Moving an E-commerce Site to AWS. A Case StudyClustrix
 
Migrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraMigrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraDemi Ben-Ari
 
Database index by Reema Gajjar
Database index by Reema GajjarDatabase index by Reema Gajjar
Database index by Reema GajjarReema Gajjar
 
Cloud Databases in Research and Practice
Cloud Databases in Research and PracticeCloud Databases in Research and Practice
Cloud Databases in Research and PracticeFelix Gessert
 
NewSQL overview, Feb 2015
NewSQL overview, Feb 2015NewSQL overview, Feb 2015
NewSQL overview, Feb 2015Ivan Glushkov
 
Database index
Database indexDatabase index
Database indexRiteshkiit
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine kiran palaka
 
Data indexing presentation
Data indexing presentationData indexing presentation
Data indexing presentationgmbmanikandan
 
What Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database ScalabilityWhat Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database Scalabilityjbellis
 

Database Architecture & Scaling Strategies, in the Cloud & on the Rack

  • 1. Database Scaling Strategies, in the Cloud & on the Rack
      Robbie Mihalyi, @Clustrix
      © 2014 CLUSTRIX, © 2015 CLUSTRIX
  • 3. Cloud
      o Commoditized hardware resources
         - Rapid deployment and pay by the hour
      o Access
         - Publish your applications quickly
         - Use existing services from provider
      o Capacity
         - Scale resources as you need them
      o Virtualized (Shared) Resources
         - You do not always get the performance envelope you ask for
      o Dedicated (Hardware) Resources
         - Available but expensive
         - Less flexible
      Diagram labels: Utility Computing (bare metal), Platform as a Service (PaaS), SaaS
  • 4. E-Commerce Applications: Example of a Great Match for Cloud
      o Need for capacity varies by seasonality and specific events
         - Some events can generate 10x normal traffic & increased conversion rates
      o Sensitive to performance characteristics
         - Throughput and latency
      o Up-time is most crucial at the busiest time
         - Every minute of downtime can mean thousands of dollars in lost revenue
  • 6. SQL SCALE-OUT: Resiliency, Capacity, Elasticity
      o SCALE
         - Data, Users, Session
      o THROUGHPUT
         - Concurrency, Transactions
      o LATENCY
         - Response Time
  • 7. Application Scaling (App Layer Only): Easy Installation and Setup
      o Load-Balancer
         - HAProxy or equivalent
         - Distributes incoming requests
      o Scale out by adding servers
         - All servers are the same – no master
      o Redundant backend network
         - Low-latency cluster intercommunication
      Diagram labels: Load Balancer, Commodity servers, APP (three instances)
  • 8. Application Scaling (Database Layer): Database Scaling Is Very Hard
      o Data Consistency
      o Read vs. Write Scale
      o ACID Properties (if you care about them)
      o Throughput and Latency
      o Application Impact
  • 9. Non-Relational (NoSQL) Database Architectures
      o No imposed structure
      o Relaxed or no ACID properties
         - BASE – alternative to ACID
      o Fast and Scalable
      o Suited for specific applications
         - IoT, click-stream, object store, document
         - Good for insert workloads
         - Not good for read / query apps
      o RDBMS will provide fast non-structured data store
  • 11. Scaling-Up
      o Keep increasing the size of the (single) database server
      o Pros
         - Simple, no application changes needed
      o Cons
         - Expensive. At some point, you’re paying 5x for 2x the performance
         - ‘Exotic’ hardware (128 cores and above) becomes price-prohibitive
         - Eventually you ‘hit the wall’, and you literally cannot scale up anymore
  • 12. Scaling Reads: Master/Slave
      o Add one or more ‘Slave’ read-servers to your ‘Master’ database server
      o Pros
         - Reasonably simple to implement
         - Read/write fan-out can be done at the proxy level
      o Cons
         - Only adds Read performance
         - Data consistency issues can occur, especially if the application isn’t coded to ensure reads from the slave are consistent with reads from the master
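The proxy-level read/write fan-out mentioned on this slide can be sketched roughly as follows. This is a minimal illustration, not how HAProxy or any particular proxy does it: the class name `ReadWriteRouter`, the server labels, and the "statement starts with SELECT" heuristic are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Toy proxy-level router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the read replicas; None when there are no slaves.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql, in_transaction=False):
        # Naive classification (an assumption of this sketch): only plain
        # SELECT statements count as reads. SELECT ... FOR UPDATE would be
        # misrouted by this check.
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Statements inside a transaction stay on the master so that they
        # observe their own earlier writes.
        if in_transaction or not is_read or self._slaves is None:
            return self.master
        return next(self._slaves)


router = ReadWriteRouter("master", ["slave1", "slave2"])
print(router.route("INSERT INTO orders VALUES (1)"))
print(router.route("SELECT * FROM orders"))
```

A read sent to a slave immediately after a write may still see stale data because replication is asynchronous, which is the consistency issue the slide lists under Cons.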
  • 13. Scaling Writes: Master/Master
      o Add additional ‘Master’(s) to your ‘Master’ database server
      o Pros
         - Adds Write scaling without needing to shard
      o Cons
         - Adds write scaling at the cost of read-slaves
         - Adding read-slaves would add even more latency
         - Application changes are required to ensure data consistency / conflict resolution
  • 14. Scaling Reads & Writes: Sharding
      o Partitioning tables across separate database servers
      o Pros
         - Adds both write and read scaling
      o Cons
         - Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID
         - ACID compliance & transactionality must be managed at the application level
         - Consistent backups across all the shards are very hard to manage
         - Reads and writes can be skewed / unbalanced
         - Application changes can be significant
      Diagram: SHARD01 (A - K), SHARD02 (L - O), SHARD03 (P - S), SHARD04 (T - Z)
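As a rough sketch of the application-level routing that sharding requires, the function below maps a key to one of the four alphabetical ranges shown on the slide. Pairing the shard labels with the ranges in order, and routing on the first letter of the key, are assumptions made for illustration.

```python
import bisect

# Inclusive upper bound of each shard's key range (A-K, L-O, P-S, T-Z).
SHARD_UPPER_BOUNDS = ["K", "O", "S", "Z"]
SHARD_NAMES = ["SHARD01", "SHARD02", "SHARD03", "SHARD04"]


def shard_for(key):
    """Return the shard that owns `key`, based on its first letter."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"key {key!r} does not start with a letter A-Z")
    # bisect_left finds the first range whose upper bound is >= the letter.
    return SHARD_NAMES[bisect.bisect_left(SHARD_UPPER_BOUNDS, first)]


for name in ["Adams", "Lee", "Smith", "Turner"]:
    print(name, "->", shard_for(name))
```

Every query path in the application has to call something like `shard_for` before it can touch the database, and a join or transaction that spans two shards has no single server to run on. Real key distributions are also rarely uniform across letter ranges, which is one source of the skew listed under Cons.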
  • 15. Scaling Reads & Writes: MySQL Cluster
      o Provides shared-nothing clustering and auto-sharding for MySQL (designed for Telco deployments: minimal cross-node transactions, HA emphasis)
      o Pros
         - Distributed, multi-master model
         - Provides high availability and high throughput
      o Cons
         - Only supports read-committed isolation
         - Long-running transactions can block a node restart
         - SBR replication not supported
         - Range scans are expensive and lower performance than MySQL
         - Unclear how it scales with many nodes
  • 16. Application Workload Partitioning
      o Partition entire application + RDBMS stack across several “pods”
      o Pros
         - Adds both write and read scaling
         - Flexible: can keep scaling with addition of pods
      o Cons
         - No data consistency across pods (only suited for cases where it is not needed)
         - High overhead in DBMS maintenance and upgrade
         - Queries / Reports across all pods can be very complex
         - Complex environment to set up and support
      Diagram labels: APP (six instances)
  • 18. SQL SCALE-OUT: Resiliency, Capacity, Elasticity
      o Elasticity
         - Ease of ADDING and REMOVING resources
         - Flex Up or Down: Capacity On-Demand
         - Adapt Resources to Price-Performance Requirements
  • 19. Elasticity – flexing up and down
      Scaling option: Flex UP / Flex DOWN
      o Application (only): Easy / Easy
      o NoSQL databases: Easy / Unclear if it is possible
      o Scale-up: Expensive / Not Applicable
      o Master – Slave: Reasonably simple / Turn off read slaves
      o Master – Master: Involved / Involved
      o Sharding: Expensive and complex / Not feasible
      o MySQL Cluster: Involved / Involved
      o Application Partitioning: Expensive and complex / Expensive and complex
  • 20. SQL SCALE-OUT: Resiliency, Capacity, Elasticity
      o Resiliency
         - Resilience to Failures: Hardware or Software
         - Fault Tolerance and High Availability
  • 21. Resiliency – high availability and fault tolerance
      Scaling option: Resilience to failures
      o Application (only): No single point of failure – failed node bypassed
      o NoSQL databases: Support exists
      o Scale-up: One large machine, single point of failure
      o Master – Slave: Fail-over to Slave
      o Master – Master: Resilient to one of the Masters failing
      o Sharding: Multiple points of failure
      o MySQL Cluster: No single point of failure
      o Application Partitioning: Multiple points of failure
  • 22. RDBMS Capacity, Elasticity and Resiliency
      RDBMS Scaling: Capacity / Resiliency / Elasticity / Application Impact
      o Scale-up: Many cores, very expensive / Single Point Failure / No / None
      o Master – Slave: Reads Only / Fail-over / No / Yes, for read scale
      o Master – Master: Read / Write / Yes / No / High, update conflict
      o MySQL Cluster: Read / Write / Yes / No / None (or minor)
      o Sharding: Unbalanced Read/Writes / Multiple points of failure / No / Very High
  • 23. CLUSTRIXDB
      - FULL ACID COMPLIANT RDBMS
      - MYSQL COMPATIBLE
      - ARCHITECTED FROM THE GROUND-UP TO ADDRESS: CAPACITY, ELASTICITY AND RESILIENCY
  • 24. ClustrixDB – Shared Nothing Symmetric Architecture
      Each Node Contains
      o Database Engine:
         - all nodes can perform all database operations (no leader, aggregator, leaf, data-only, special nodes)
      o Query Compiler:
         - distribute compiled partial query fragments to the node containing the ranking replica
      o Data: Table Slices:
         - All table slices auto-redistributed by the Rebalancer (default: replicas=2)
      o Data Map:
         - all nodes know where all replicas are
      Diagram: three ClustrixDB nodes, each containing Compiler, Map, Engine, Data
  • 25. Intelligent Data Distribution
      o Tables auto-split into slices
      o Every slice has a replica on another server
         - Auto-distributed and auto-protected
      Diagram: database tables (billions of rows) split into slices S1 to S5, each slice shown twice across the ClustrixDB nodes
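The "every slice has a replica on another server" rule can be illustrated with a simple round-robin placement. This is only a sketch under assumed names (`place_replicas`, nodes `n1` to `n3`); the slide does not describe ClustrixDB's actual placement policy.

```python
def place_replicas(slice_names, nodes, replicas=2):
    """Assign each slice to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, name in enumerate(slice_names):
        # Consecutive positions in the node list are always distinct nodes
        # as long as replicas <= len(nodes).
        placement[name] = [nodes[(i + r) % len(nodes)] for r in range(replicas)]
    return placement


placement = place_replicas(["S1", "S2", "S3", "S4", "S5"], ["n1", "n2", "n3"])
for slice_name, owners in placement.items():
    print(slice_name, owners)
```

Because the two copies of a slice never share a node, losing any single node still leaves one copy of every slice available.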
  • 26. Database Capacity And Elasticity
      o Easy and simple Flex Up (and Flex Down)
         - Flex multiple nodes at the same time
      o Data is automatically rebalanced across the cluster
      o All servers handle writes + reads
      o Application always sees a single Database instance
      Diagram: slices S1 to S5 spread across the ClustrixDB nodes
  • 27. Built-in Fault Tolerance
      o No Single Point-of-Failure
         - No Data Loss
         - No Downtime
      o Server node goes down…
         - Data is automatically rebalanced across the remaining nodes
      Diagram: slices S1 to S5 spread across the ClustrixDB nodes
  • 28. Distributed Query Processing
      o Queries are fielded by any peer node
         - Routed to node holding the data
      o Complex queries are split into fragments processed in parallel
         - Automatically distributed for optimized performance
      Diagram labels: Query, Load Balancer, ClustrixDB, TRX (three instances)
  • 29. Replication and Disaster Recovery
      o Asynchronous multi-point Replication
      o Parallel Backup up to 10x faster
      o Replicate to any cloud, any datacenter, anywhere
  • 30. CLUSTRIXDB UNDER THE HOOD
      o DISTRIBUTION STRATEGY
      o REBALANCER TASKS
      o QUERY OPTIMIZER
      o EVALUATION MODEL
      o CONCURRENCY CONTROL
  • 31. ClustrixDB key components enabling Scale-Out
      o Shared-nothing architecture
         - Eliminates potential bottlenecks.
      o Independent Index Distribution
         - Hash each distribution key to a 64-bit number space divided into ranges with a specific slice owning each range
      o Rebalancer
         - Ensures optimal data distribution across all nodes.
         - Rebalancer assigns slices to available nodes for data capacity and access balance
      o Query Optimizer
         - Distributed query planner, compiler, and distributed shared-nothing execution engine
         - Executes queries with max parallelism and many simultaneous queries concurrently.
      o Evaluation Model
         - Parallelizes queries, which are distributed to the node(s) with the relevant data.
      o Consistency and Concurrency Control
         - Using Multi-Version Concurrency Control (MVCC) and Two-Phase Locking (2PL)
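The hash-range distribution described under Independent Index Distribution can be sketched as below. The slide only states that keys are hashed into a 64-bit space split into ranges owned by slices; the hash function (BLAKE2b truncated to 8 bytes), the equal-width ranges, and all function names here are assumptions for illustration.

```python
import bisect
import hashlib

HASH_SPACE = 2 ** 64


def hash64(key):
    """Map a distribution key to an integer in [0, 2**64)."""
    digest = hashlib.blake2b(repr(key).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def make_slice_starts(n_slices):
    """Split the 64-bit space into n contiguous ranges; return each range's start."""
    return [i * HASH_SPACE // n_slices for i in range(n_slices)]


def slice_for_hash(h, starts):
    """Index of the slice whose range [start, next_start) contains hash h."""
    return bisect.bisect_right(starts, h) - 1


def slice_for_key(key, starts):
    return slice_for_hash(hash64(key), starts)


starts = make_slice_starts(4)
for key in [15, "alice", ("order", 42)]:
    print(key, "-> slice", slice_for_key(key, starts))
```

Splitting a slice that has grown too big then amounts to inserting a new boundary into `starts`, which moves only the keys in that one range.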
  • 32. Rebalancer Process
      o User tables are vertically partitioned in representations.
      o Representations are horizontally partitioned into slices.
      o Rebalancer ensures:
         - The representation has an appropriate number of slices.
         - Slices are well distributed around the cluster on storage devices
         - Slices are not placed on server(s) that are being flexed-down.
         - Reads from each representation are balanced across the nodes
  • 33. ClustrixDB Rebalancer Tasks
      o Flex-UP
         - Re-distribute replicas to new nodes
      o Flex-DOWN
         - Move replicas from the flex-down nodes to other nodes in the cluster
      o Under-Protection – when a slice has fewer replicas than desired
         - Create a new copy of the slice on a different node.
      o Slice Too Big
         - Split the slice into several new slices and re-distribute them
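Three of the rebalancer tasks above (Flex-DOWN, Under-Protection, Slice Too Big) can be expressed as a small planning function. This is a hypothetical sketch: the `Slice` record, the task tuples, and the `max_bytes` threshold are invented for illustration and do not reflect ClustrixDB internals. Flex-UP redistribution is left out to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Slice:
    name: str
    size_bytes: int
    nodes: set = field(default_factory=set)  # nodes holding a replica


def plan_tasks(slices, draining=(), replicas=2, max_bytes=1 << 30):
    """Return a list of (task, slice_name, node) tuples for the rebalancer."""
    draining = set(draining)
    tasks = []
    for s in slices:
        # Under-protection: fewer replicas than desired, so copy the slice.
        if len(s.nodes) < replicas:
            tasks.append(("copy", s.name, None))
        # Flex-down: move replicas off nodes that are being removed.
        for node in sorted(s.nodes & draining):
            tasks.append(("move", s.name, node))
        # Slice too big: split it so the pieces can be redistributed.
        if s.size_bytes > max_bytes:
            tasks.append(("split", s.name, None))
    return tasks


cluster = [
    Slice("S1", 100, {"n1"}),
    Slice("S2", 5000, {"n1", "n2"}),
    Slice("S3", 100, {"n2", "n3"}),
]
for task in plan_tasks(cluster, draining={"n3"}, max_bytes=1000):
    print(task)
```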
  • 34. ClustrixDB Query Optimizer
      o The ClustrixDB Query Optimizer is modeled on the Cascades optimization framework.
         - Other RDBMSs that leverage Cascades are Tandem's NonStop SQL and Microsoft's SQL Server.
         - Cost-driven, extensible via a rule-based mechanism
         - Top-down approach
      o Query Optimizer must answer the following, per SQL query:
         - In what order should the tables be joined?
         - Which indexes should be used?
         - Should the sort/aggregate be non-blocking?
  • 35. ClustrixDB Evaluation Model
      o Parallel query evaluation
      o Massively Parallel Processing (MPP) for analytic queries
      o The Fair Scheduler ensures OLTP is prioritized ahead of OLAP
      o Queries are broken into fragments (functions).
      o Joins require more data movement by their nature.
         - ClustrixDB is able to achieve minimal data movement
         - Each representation (table or index) has its own distribution map, allowing direct look-ups for which node/slice to go to next, removing broadcasts.
         - There is no central node orchestrating data motion. Data moves directly to the next node it needs to go to. This reduces hops to the minimum possible given the data distribution.
      Diagram: compilation of SELECT id, amount FROM donation WHERE id=15 into fragments
         - VM FRAGMENT 1: Node := lookup id = 15 <forward to node>
         - VM FRAGMENT 2: SELECT id, amount <return>
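The two-fragment plan on this slide (look up the owning node, then read the row there) can be mimicked with a toy in-memory cluster. The node names, the sample rows, and the per-row dictionary used as a distribution map are all assumptions; a real system would locate the row through hash ranges rather than a per-row table.

```python
# Hypothetical three-node cluster: each node stores some rows of `donation`
# as {id: amount}.
NODE_ROWS = {
    "node1": {3: 25.0, 12: 10.0},
    "node2": {15: 50.0},
    "node3": {7: 5.0},
}

# Distribution map: every node holds a copy, so any node can find a row's
# owner directly without broadcasting to the others.
DISTRIBUTION_MAP = {
    row_id: node for node, rows in NODE_ROWS.items() for row_id in rows
}


def fragment_1(row_id):
    """Runs on whichever node received the query: look up the owning node."""
    return DISTRIBUTION_MAP.get(row_id)


def fragment_2(node, row_id):
    """Runs on the owning node: read the row and return the selected columns."""
    return {"id": row_id, "amount": NODE_ROWS[node][row_id]}


def run_query(row_id):
    """SELECT id, amount FROM donation WHERE id = row_id"""
    owner = fragment_1(row_id)
    if owner is None:
        return None  # no such row
    return fragment_2(owner, row_id)


print(run_query(15))
```

The query touches exactly one node besides the one that received it, which is the "minimum hops" property the slide describes.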
  • 36. Concurrency Control
      o Readers never interfere with writers (or vice-versa). Writers use explicit locking for updates
      o MVCC maintains a version of each row as writers modify rows
      o Readers have lock-free snapshot isolation while writers use 2PL to manage conflict
      o Lock Conflict Matrix
         - Reader vs. Reader: None
         - Reader vs. Writer: None
         - Writer vs. Reader: None
         - Writer vs. Writer: Row
      Diagram: timeline of readers and writers; writers on the same row conflict (one writer blocked), otherwise no conflict and no blocking
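The lock conflict matrix and the snapshot-read behaviour can be sketched together. This is a simplified illustration, not ClustrixDB's implementation: `conflicts`, `VersionedRow`, and the integer timestamps are assumptions, and versions are assumed to be written in ascending commit order.

```python
def conflicts(a, b, same_row):
    """Lock conflict matrix from the slide: only two writers on the same row block."""
    return a == "writer" and b == "writer" and same_row


class VersionedRow:
    """Keeps every committed version so readers never need a lock."""

    def __init__(self):
        self.versions = []  # list of (commit_ts, value), ascending by commit_ts

    def write(self, commit_ts, value):
        self.versions.append((commit_ts, value))

    def read(self, snapshot_ts):
        # A reader sees the newest version committed at or before its snapshot.
        visible = [value for ts, value in self.versions if ts <= snapshot_ts]
        return visible[-1] if visible else None


row = VersionedRow()
row.write(10, "amount=50")
row.write(20, "amount=75")
print(row.read(15))  # snapshot taken between the two commits
print(conflicts("reader", "writer", same_row=True))
```

A reader whose snapshot predates a write keeps seeing the older version, which is why readers and writers never block each other.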
  • 38. Example: Huge Write Workload (AWS Deployment) o The Application  Inserts: 254 million / day  Updates: 1.35 million / day  Reads: 252.3 million / day  Deletes: 7,800 / day o The Database  Queries: 5-9k per sec  CPU Load: 45-65%  Nodes - Cores: 10 nodes - 80 cores
  • 39. Example: Huge Update Workload (Bare-Metal Deployment) o The Application  Inserts: 31.4 million / day  Updates: 3.7 billion / day  Reads: 1 billion / day  Deletes: 4,300 / day o The Database  Queries: 35-55k per sec  CPU Load: 25-35%  Nodes - Cores: 6 nodes - 120 cores
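As a sanity check, the daily operation counts on the two example slides can be converted to an average per-second rate and compared with the quoted "Queries per sec" ranges. The calculation below assumes that each insert, update, read, and delete counts as one query and that load is spread evenly over the day; neither assumption is stated on the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(inserts, updates, reads, deletes):
    # Sum the daily operation counts and spread them evenly over one day.
    return (inserts + updates + reads + deletes) / SECONDS_PER_DAY


# Slide 38 (AWS deployment): quoted range is 5-9k queries per second.
aws_qps = average_per_second(254e6, 1.35e6, 252.3e6, 7_800)

# Slide 39 (bare-metal deployment): quoted range is 35-55k queries per second.
bare_metal_qps = average_per_second(31.4e6, 3.7e9, 1e9, 4_300)

print(f"AWS average:        {aws_qps:,.0f} queries/sec")
print(f"Bare-metal average: {bare_metal_qps:,.0f} queries/sec")
```

Under those assumptions the averages come to roughly 5.9k and 54.8k queries per second, both inside the ranges quoted on the slides.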
  • 41. Next Release o Additional Performance Improvements  Further improvements to read and write scaling o Deployment and Provisioning Optimization  Cloud templates and deployment scripts  Instance testing and validation o New Admin architecture and much improved Web UI  Services based architecture with (RESTful) API  Simplified single-click FLEX Management  Significant Graphing and Reporting improvements  Multi-Cluster topology view and management
  • 42. New Web UI – Enhanced Dashboard [screenshot; dashboard reads 482 tps]
  • 43. New Web UI – Historical Workload Comparison
  • 44. New Web UI – FLEX Administration
  • 46. ClustrixDB o Capacity  Massive read/write scalability  Very high concurrency  Linear throughput scale o Elasticity  Flex UP in minutes  Flex DOWN easily  Right-size resources on demand o Resiliency  Automatic, 100% fault tolerance  No single point of failure  Battle-tested performance o Flexible Deployment  Cloud, VM, or bare-metal  Virtual images available  Point-and-click scale-out
  • 48. Competitive Cluster Solutions o Most MySQL clustering solutions leverage Master/Master via replication:  MySQL Cluster  Galera (open-source library)  Percona XtraDB Cluster (leverages Galera replication library)  Tungsten o ClustrixDB does NOT use replication to keep all the servers in sync  Replication cannot scale writes as highly as our own technology  Replication has inherent potential consistency and latency issues  Transactional workloads such as OLTP (e.g. E-Commerce) are exactly the workloads that replication struggles the most with
  • 49. MySQL Cluster o Provides shared-nothing clustering and auto-sharding for MySQL (designed for Telco deployments: minimal cross-node transactions, HA emphasis) o Pros:  Distributed, multi-master with no SPOF  Designed to provide high availability and high throughput with low latency, while allowing for near-linear scalability  Synchronous replication, 2-Phase Commit o Cons:  Global checkpoint is 2 sec. “There are no guaranteed durable COMMITs to disk”  Only supports READ COMMITTED isolation  “MySQL cluster does not handle large transactions well”  Long-running transactions can block a node restart  Overflow of data in the replication stream drops the node from the cluster, with loss of consistency  ‘True’ HA requires multiple replication lines; “1 is not sufficient” for HA  DELETEs release memory for reuse by the same table only; full release requires a rolling restart of the cluster  Range scans are expensive and slower than in MySQL  No distributed table locks
  • 50. Galera Cluster o Is a multi-master topology using its own replication protocol (designed primarily for High-Availability, and secondarily for scale) o Pros:  Writes to any master are replicated to the other master(s) in sync, ensuring all masters have the same data.  It is open source, and 24/7 Support can be purchased for $7,950/yr/server. Percona also provides support, for a higher price. o Cons:  Write-scale is limited. Galera support recommends that writes go to one master, rather than be distributed across the nodes. That helps with isolation issues, but increases consistency and latency issues across the nodes.  Snapshot isolation does NOT use first-committer-wins (and so fails Aphyr Jepsen CAP tests). ClustrixDB does use first-committer-wins for snapshot consistency  Writesets are processed as a single memory-resident buffer and, as a result, extremely large transactions (e.g. LOAD DATA) may adversely affect node performance.  Locking is lax with DDL. E.g., if your DML transaction uses a table and a parallel DDL statement is started, Galera won’t wait for a metadata lock, causing potential consistency issues
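The first-committer-wins rule mentioned on this slide can be sketched in a few lines: a transaction may commit only if no key it wrote has been committed by another transaction since its snapshot was taken. This is the textbook form of the rule, shown with a hypothetical `SnapshotStore` class; it is not a description of how ClustrixDB or Galera implement commit validation.

```python
class SnapshotStore:
    """Toy snapshot-isolation store with first-committer-wins validation."""

    def __init__(self):
        self.data = {}   # key -> (commit_ts, value)
        self.clock = 0   # logical commit timestamp

    def begin(self):
        # A transaction remembers the clock value of its snapshot.
        return {"start_ts": self.clock, "writes": {}}

    def write(self, txn, key, value):
        # Writes are buffered privately until commit.
        txn["writes"][key] = value

    def commit(self, txn):
        # Abort if any written key was committed after this snapshot.
        for key in txn["writes"]:
            committed_ts, _ = self.data.get(key, (0, None))
            if committed_ts > txn["start_ts"]:
                return False
        self.clock += 1
        for key, value in txn["writes"].items():
            self.data[key] = (self.clock, value)
        return True


fcw = SnapshotStore()
t1, t2 = fcw.begin(), fcw.begin()      # two concurrent transactions
fcw.write(t1, "x", "from t1")
fcw.write(t2, "x", "from t2")
t1_committed = fcw.commit(t1)          # first committer wins
t2_committed = fcw.commit(t2)          # second committer must abort
t3 = fcw.begin()                       # starts after t1 committed
fcw.write(t3, "x", "from t3")
t3_committed = fcw.commit(t3)
```

Without the validation loop in `commit`, the second transaction would silently overwrite the first one's update, which is the lost-update anomaly this rule prevents.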
  • 51. Percona XtraDB Cluster o Is an active/active high availability and high scalability open source solution for MySQL® clustering. It integrates Percona Server and Percona XtraBackup with the Galera replication library o Pros:  Synchronous replication  Multi-master replication support  Parallel replication  Automatic node provisioning o Cons:  Not designed for write scaling  SELECT FOR UPDATE can easily create deadlocks  Not true synchronous replication, but ‘virtually synchronous’: the data is committed on the originating node and an ack is sent to the application, but the other nodes commit asynchronously. This can lead to consistency issues for applications reading from the other nodes  “If multiple nodes are used, the ability to read your own writes is not guaranteed. In that case, a certified transaction, which is already committed on the originating node can still sit in the receive queue of the node the application is reading from, waiting to be applied.”
  • 52. Tungsten Replicator o Is an open source replication engine. Compatible with MySQL, Oracle, and Amazon RDS; NoSQL stores such as MongoDB; and data warehouse stores such as Vertica, InfiniDB, and Hadoop o Pros:  Allows data to be exchanged between different databases and different database versions  During replication, information can be filtered and modified, and deployment can be between on-premises or cloud-based databases  For performance, Tungsten Replicator includes support for parallel replication and advanced topologies such as fan-in, star, and multi-master, and can be used efficiently in cross-site deployments o Cons:  Very complicated to set up and maintain  No automated management, automated failover, transparent connections, or built-in conflict resolution  Only allows asynchronous replication  Cannot suppress slave-side triggers. Each trigger must be altered to add an IF statement that prevents it from running on the slave.

Editor's Notes

  1. Purpose: Summarize the value of ClustrixDB to a technical audience. Cloud: Designed for seamless installation and scale-out on any cloud-based infrastructure. Capacity: Flex up and down, in minutes - If you need more capacity or performance, just connect and go. Massive, linear scalability - Readily handles massive volumes of customers, carts, orders, products, and business performance reporting. Extreme concurrency - Manages millions of concurrent actions without impacting site response time. Availability: Automatic, 100% fault tolerance – High availability architecture to meet always-on demands of business-critical operations. No single point of failure – No impact from hardware outages; zero downtime. Battle-tested performance – Proven rock-solid performance at some of the world’s fastest-growing companies. E.g., maintained superior service levels during 600% Cyber Monday sales spike (nomorerack). Productivity: Plug-in MySQL compatibility - Deploy in days, with few or no code changes. ClustrixDB is compatible with any application that uses MySQL, including the popular Magento platform for e-commerce and internally developed solutions. Eliminates re-architecting the database - Does away with complicated scaling strategies like sharding and replication, which are expensive, labor-intensive, and ultimately unsustainable. Self-managing operation – Virtually eliminates DBA operations tasks because the management is built into the database itself.
  4. CAP – Consistency ; Availability ; Partition Tolerance BASE – Basically Available ; Soft State ; Eventual Consistency
  5. https://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-replication-issues.html https://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-limitations-transactions.html https://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-limitations-unsupported.html
  9. Simple queries: fielded by any node; routed to the data node. Complex queries: split into query fragments; fragments processed in parallel.
  10. Building a scalable distributed database requires two things: distributing the data intelligently, and moving the queries to the data.
  11. Clustrix supports MySQL replication as both master and slave, so you can replicate both ways. Within a cluster, as we saw earlier, all data has multiple copies. For Disaster Recovery (when a whole region loses power), Clustrix has two options: Fast Parallel Backup, which is in addition to the slower mysqldump backup, and Fast Parallel Replication, which is asynchronous across two Clustrix clusters.
  14. https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/
  15. https://www.percona.com/blog/2014/09/11/openstack-users-shed-light-on-percona-xtradb-cluster-deadlock-issues/
  16. https://code.google.com/p/tungsten-replicator/