Why Abstract Away
the Underlying Database Infrastructure
MariaDB MaxScale: Database Proxy
Markus Mäkelä
Overview
• What is database cluster abstraction?
• Why is it important?
• How does MariaDB MaxScale do it?
The Idea of a Perfect Database
● Behaves like a single database
○ Simple to use
○ Easy to manage
● Performs like a cluster
○ Robust and failure tolerant
○ Near-linear scalability
What is Abstraction for Database Clusters?
The Database
Why is it Important?
● Isolates Complexity
○ One logical database → Build simpler applications
● Highly Available Database
○ Fault tolerant → Robust services
○ Dynamic clusters → Easier maintenance
● Load balancing
○ Read/Write splitting → Better performance
Database Abstraction Layer
Why is it Important?
Complexity isolation
● Simpler application development/configuration
○ No need to know where to send queries
● No user-visible infrastructure
○ Don’t need to detect servers that are in maintenance
○ No need to know the cluster topology
The Database
Why is it Important?
Highly Available Database
● Prevents downtime
○ Node failure is not cluster failure
● Easier Maintenance
○ Functionality not tied to physical nodes
○ Reduced capacity, not functionality
○ Easy node replacement
Database Abstraction Layer
Why is it Important?
Load Balancing
● Runtime Load Balancing
○ Maximized node utilization
● Horizontal Scalability
○ Cheaper
○ Easier to change
○ On-demand capacity
Database Abstraction Layer
How the Abstraction is Implemented
MariaDB MaxScale
MaxScale Overview
● Modular Database Proxy
○ Only use what is needed
○ Extendable
● Content-aware
○ Understands routed traffic
● Cluster-aware
○ Active cluster monitoring
○ Understands different cluster types
Configuration:
Defining Services instead of Servers
● “Database as a Service”
● Decouple clients from databases
● Describe what you want instead of what you have
○ This is a service that provides automated, highly available read-write splitting
Database Abstraction Layer
Monitors
Abstracting the Cluster Concept
● Classify servers
○ Up or Down?
○ Master or Slave?
○ In sync or not?
● Information used by routers
○ Masters used for writes
○ Slaves used for reads
● Detects events
○ Server went down
○ Slave is disconnected
Overview: Monitors
● Detects topology
○ Builds the replication tree
● Assigns correct labels
○ Root node for writes
○ Other nodes for reads
● Detects replication lag
○ Write timestamp on master
○ Read from slave
MariaDB Monitor:
Master-Slave Monitor
Master
Slave
Slave
This is a master
This is a slave
● Output of SHOW ALL SLAVES STATUS
○ Slave_IO_Running: Yes
○ Slave_SQL_Running: Yes
○ Master_Server_Id: 1234
● Number of configured slaves
● @@read_only
MariaDB Monitor:
Monitored Variables
Master
Slave
Slave
This is used to build
the replication tree
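The labeling described above can be sketched roughly as follows. This is a toy illustration in Python, not MaxScale's monitor code: the `ServerStatus` record and `label_servers` function are hypothetical names, and the rules are simplified to the variables listed on the slide (replication thread state, `Master_Server_Id`, and `@@read_only`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerStatus:
    """Hypothetical snapshot of the values a monitor reads from one server."""
    server_id: int
    master_server_id: Optional[int]  # None when the server replicates from nobody
    io_running: bool = False         # Slave_IO_Running
    sql_running: bool = False        # Slave_SQL_Running
    read_only: bool = False          # @@read_only


def label_servers(servers):
    """Return {server_id: "Master" | "Slave" | None} for a flat list of servers."""
    known_ids = {s.server_id for s in servers}
    labels = {}
    for s in servers:
        if s.master_server_id is None:
            # Root of the replication tree: usable for writes unless read-only.
            labels[s.server_id] = None if s.read_only else "Master"
        elif s.master_server_id in known_ids and s.io_running and s.sql_running:
            # Replicating from a monitored server with both threads running.
            labels[s.server_id] = "Slave"
        else:
            # Broken or unknown replication: not usable for routing.
            labels[s.server_id] = None
    return labels
```

With one root and two replicas, of which one has a stopped SQL thread, only the healthy replica is labeled as a slave and the broken one is left unlabeled.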
● Galera Clusters
○ Synchronous Cluster
○ Globally conflict free
○ Conflicting transaction → Error on commit
● Abstracted in MaxScale
○ One “master” node
■ Prevents conflicts
○ Rest labeled as “slaves”
■ Good for scale-out
Galera Cluster Monitor
Master
Master
Master
Use this for all writes...
…and these two for reads
● @@wsrep_local_state
○ 4 (Synced) → OK
○ Anything else → Not OK
● @@wsrep_local_index
○ Zero-indexed
○ Cluster-wide “rank”
● Optional:
○ Manual node ranking (priority)
○ Split-brain sanity checks
○ MariaBackup/XtraBackup SST detection
Galera Cluster Monitor:
Node Election
Master
Master
Master
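A minimal sketch of the election rule described above, assuming the only inputs are `wsrep_local_state` and `wsrep_local_index` for each node. The function name `elect_master` and the dictionary layout are illustrative assumptions; the optional priority ranking and split-brain checks are left out.

```python
SYNCED = 4  # wsrep_local_state value of a fully synced Galera node


def elect_master(nodes):
    """Pick one write node and the read nodes from a Galera cluster.

    nodes maps a node name to (wsrep_local_state, wsrep_local_index).
    Returns (master_name, [slave_names]); master_name is None if no node is usable.
    """
    # Only synced nodes are eligible for any traffic.
    synced = {name: index for name, (state, index) in nodes.items() if state == SYNCED}
    if not synced:
        return None, []
    # The synced node with the lowest cluster-wide index takes all writes.
    master = min(synced, key=synced.get)
    slaves = sorted(name for name in synced if name != master)
    return master, slaves
```

Because every monitor instance sees the same cluster-wide index, the same node is chosen as the "master" without any coordination between proxies.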
Routing & Query Classification
How the Load Balancing is Done
SELECT WHERE id = 1;
● Provides both abstract and detailed information
○ Read or write
■ Does the query modify the database?
○ Query components
■ Is the table `t1` used in this query?
■ What are the values for the functions in the query?
○ Query characteristics
■ Does the query have a WHERE clause?
○ State changes
■ Was the default character set changed?
■ Is there an open transaction?
Query Classifier:
The Brains of MaxScale
Read-only query
SELECT WHERE id = 1;
Query Classifier:
Details
● Based on a modified lightweight version of SQLite
○ Extended for MariaDB 10.3 syntax
○ Removed data storage and memory allocation
● Smart classification
○ First pass
■ Lightweight parsing
■ Resolves operation and query type
○ Second pass
■ Only for full syntactic classification
■ Column ↔ Function relationships
Read-only query
● Read/write splitting
○ Write to master, read from slaves
○ Performance improvement for read-heavy loads
○ Prevents conflicts (Galera)
● Session state tracking & propagation
○ Consistent session state
● Failure tolerant
○ Hides slave failures
● Multiple backend connections
○ Must-have for read/write splitting
○ Speeds up node failover
ReadWriteSplit:
The Routing Muscle
Based on server score
● Multiple algorithms
○ Active operation count → Default
■ MIN(operations)
○ Connection count
■ MIN(connections)
○ Replication delay
■ MIN(delay)
● Manually adjustable
○ Weight each server differently
■ MIN(score * weight)
ReadWriteSplit:
Load Balancing
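The selection rule `MIN(score * weight)` can be sketched as below. The dictionary keys and the `pick_server` name are assumptions made for illustration; note that with this formula a smaller weight makes a server more attractive.

```python
def pick_server(servers, criterion="operations"):
    """Return the name of the server with the lowest weighted score.

    servers is a list of dicts with a "name", the raw score under `criterion`
    (for example "operations", "connections" or "delay"), and an optional "weight".
    """
    best = min(servers, key=lambda s: s[criterion] * s.get("weight", 1.0))
    return best["name"]


servers = [
    {"name": "s1", "operations": 4, "connections": 10, "weight": 1.0},
    {"name": "s2", "operations": 6, "connections": 3, "weight": 1.0},
    {"name": "s3", "operations": 5, "connections": 8, "weight": 0.5},
]
```

Here `pick_server(servers)` picks `s3` (weighted score 2.5), while `pick_server(servers, "connections")` picks `s2` (weighted score 3).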
● Consistent state for all connections
○ State modifications propagated
○ Truly abstracted reads
● State modification history
○ Node replacement
ReadWriteSplit:
Session State
SET SQL_MODE='ANSI';
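One way to picture state propagation and the modification history is the sketch below. The `Session` class is hypothetical and records statements in lists instead of sending them to real servers; it only shows why keeping a history makes node replacement possible.

```python
class Session:
    """Toy model of one client session with several backend connections."""

    def __init__(self, backends):
        self.history = []  # every session-state statement seen so far
        # backend name -> statements that backend has executed
        self.backends = {name: [] for name in backends}

    def session_command(self, sql):
        """Record a state change and propagate it to every backend."""
        self.history.append(sql)
        for executed in self.backends.values():
            executed.append(sql)

    def replace_backend(self, old, new):
        """Swap a backend and replay the history so its state matches."""
        del self.backends[old]
        self.backends[new] = list(self.history)
```

After `SET SQL_MODE='ANSI'` has been propagated, a replacement backend receives the same statement on arrival, so reads routed to it behave like reads on any other node.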
START TRANSACTION;
SELECT name FROM accounts WHERE id = 1;
INSERT INTO logins VALUES ('john doe');
COMMIT;
ReadWriteSplit:
Transactions
Transactional behavior must be kept intact
● Executed only on one node
● Statements cannot be retried on other servers
● Cannot be load balanced
Read-write transaction
START TRANSACTION READ ONLY;
SELECT name FROM accounts WHERE id = 1;
COMMIT;
ReadWriteSplit:
Transactions
Same as read-write except:
● Can be load balanced
● Safe even with writes
○ Server returns an error
Read-only transaction
SELECT name FROM accounts WHERE id = 1;
INSERT INTO logins VALUES ('john doe');
SELECT LAST_INSERT_ID();
SET @@character_set_client=cp850;
ReadWriteSplit: Query classification
Read
Write
Dependent Query
Session State
Different queries require different behavior
● Writes to master
● Reads to slaves
● Dependent queries to previous server
● Session state modifications to all
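The four routing rules can be sketched with a deliberately naive classifier. This is an assumption-laden toy: MaxScale uses a real SQL parser, whereas `classify` below only inspects the statement text, and both function names are hypothetical.

```python
def classify(sql):
    """Very rough query classification based on the statement text alone."""
    text = sql.strip().upper()
    if text.startswith("SET "):
        return "session"
    if "LAST_INSERT_ID()" in text:
        return "dependent"  # result depends on an earlier statement
    if text.startswith("SELECT"):
        return "read"
    return "write"


def route(sql, master, slaves, previous):
    """Return the list of servers that should receive the statement."""
    kind = classify(sql)
    if kind == "session":
        return [master] + list(slaves)  # keep session state consistent everywhere
    if kind == "dependent":
        return [previous]               # same server as the previous statement
    if kind == "read":
        return [slaves[0]] if slaves else [master]
    return [master]
```

Applied to the four statements on the slide, the read goes to a slave, the insert goes to the master, `SELECT LAST_INSERT_ID()` follows the insert to the master, and the `SET` statement goes to every server.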
SELECT name FROM accounts WHERE id = ?;
INSERT INTO logins VALUES (?);
ReadWriteSplit: Query classification
Prepared statements
Observable behavior:
● None
Behind the scenes:
● Text protocol
○ Resolve query type
○ Map text identifier to query type
● Binary protocol
○ Resolve query type
○ Route preparation
○ Map returned identifier to query type
Handling Failures
How MaxScale Hides Node Failures
Monitors detect failures:
● Node no longer responsive
○ Response takes too long
○ Connection broken → Cannot reestablish
● Invalid state
○ Broken replication
○ Replication is lagging
○ Out-of-sync Galera node
Monitors:
Node Failure
Read retry
● Hides “trivial” failures
○ SELECT statement
○ autocommit=1
○ No open transaction
● Guaranteed reply
○ Try slaves first
○ Use master as last resort
ReadWriteSplit:
Hiding Node Failures
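A sketch of the read retry idea, under the assumption that a failed server surfaces as a `ConnectionError`. The `run` callback, `can_retry` and `execute_read` are hypothetical names used only for this illustration.

```python
def can_retry(sql, autocommit, in_transaction):
    """A read is safe to retry only outside transactions with autocommit on."""
    is_select = sql.lstrip().upper().startswith("SELECT")
    return is_select and autocommit and not in_transaction


def execute_read(sql, slaves, master, run):
    """Try each slave in turn and fall back to the master as a last resort.

    run(server, sql) must return a result or raise ConnectionError.
    Returns (server_used, result).
    """
    for server in list(slaves) + [master]:
        try:
            return server, run(server, sql)
        except ConnectionError:
            continue  # hide the failure and move on to the next candidate
    raise ConnectionError("no server could answer the read")
```

The client sees a failure only when every slave and the master are unreachable, which is the "guaranteed reply" behavior described above.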
● Triggered on master failure
○ Master server down
○ Lost connection to master
● Read-only queries and transactions allowed
○ For read-heavy traffic
● Configurable behavior
○ Close connection on master failure
○ Close connection on first write
○ Send error on all writes
ReadWriteSplit:
Read-only Mode
● Triggered on slave failure
○ Discard current slave
○ Pick a replacement
● Supplements read retry
○ Lower total connection count
● Configurable behavior
○ Close connection on master failure
○ Close connection on first write
○ Send error on all writes
ReadWriteSplit:
Slave Replacement
Filters
Extending MaxScale Functionality
● Between client and router module
○ Pre-processing
○ Analytics
○ Target hinting
● Chainable
○ Output pipes to input
● Easy to write
○ First community contribution
■ tpmfilter
Filter Overview
Cache:
TTL-based resultset caching
● Up to 3x read performance
● Configurable caching and storage
○ Specific users or applications
○ Matching SQL statements
○ Specific tables or databases
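TTL-based result caching can be illustrated with a small in-memory sketch. The `TTLCache` class and its `fetch` callback are assumptions for this example; a real cache would also apply the user, statement, and table rules listed above.

```python
import time


class TTLCache:
    """Cache query results by statement text for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the behavior can be tested
        self.entries = {}    # sql -> (time stored, result)

    def get(self, sql, fetch):
        """Return a cached result, or call fetch(sql) and cache what it returns."""
        now = self.clock()
        hit = self.entries.get(sql)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]    # still fresh: the database is not contacted
        result = fetch(sql)
        self.entries[sql] = (now, result)
        return result
```

A repeated statement inside the TTL window is answered from memory; once the entry is older than the TTL, the next request goes back to the database and refreshes it.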
● Non-transactional
○ Work on a single node
○ Fail when load balanced
● Depend on previous queries
○ Read inserted value
Critical Reads
INSERT INTO accounts VALUES ('john doe');
SELECT name FROM accounts WHERE name = 'john doe';
● Not compatible with load balancing
○ Can return a result without the inserted value
● Not the “correct way” to do it
○ Legacy application → hard to modify
○ Framework → impossible to modify
● Detects data modification
○ Writes “pin” the session to master
● Tags the query with a hint
○ Route to master
● Configurable
○ Number of queries
○ Time interval
CCRFilter:
Consistent Critical Reads
INSERT INTO accounts VALUES ('john doe');
SELECT name FROM accounts WHERE name = 'john doe';
Route this to the master!
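The pinning behavior can be sketched as a small state machine. `CriticalReadTracker` is a hypothetical class, and treating every non-SELECT statement as a write is a simplifying assumption; the two limits mirror the "number of queries" and "time interval" options on the slide.

```python
import time


class CriticalReadTracker:
    """Decide whether a statement should be hinted to the master."""

    def __init__(self, count=0, seconds=0.0, clock=time.monotonic):
        self.count = count            # reads to pin after each write
        self.seconds = seconds        # time window to pin after each write
        self.clock = clock
        self.remaining = 0
        self.deadline = float("-inf")

    def target(self, sql):
        now = self.clock()
        if not sql.lstrip().upper().startswith("SELECT"):
            # A write restarts both the query counter and the time window.
            self.remaining = self.count
            self.deadline = now + self.seconds
            return "master"
        if self.remaining > 0:
            self.remaining -= 1
            return "master"           # still within the pinned query count
        if now < self.deadline:
            return "master"           # still within the pinned time window
        return "slave"
```

With `count=1`, the `SELECT` that immediately follows the `INSERT` is sent to the master and therefore sees the inserted row; later reads are load balanced again.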
● Match-replace functionality
○ PCRE2 regular expressions
● Fix broken SQL
○ “Patching” after release
● Allows neat tricks
○ Append a LIMIT clause
○ Add optimizer hints
○ Change storage engine
Regexfilter:
sed for SQL
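Two of the tricks above, sketched with Python's `re` module as a stand-in for PCRE2. The patterns and function names are illustrative assumptions, not rules shipped with the filter, and they handle only simple single-statement input.

```python
import re


def append_limit(sql, limit=1000):
    """Add a LIMIT clause to a SELECT that does not already have one."""
    if re.search(r"\bLIMIT\b", sql, flags=re.IGNORECASE):
        return sql
    # Capture the statement without its trailing semicolon, then re-terminate it.
    return re.sub(
        r"^\s*(SELECT\b.*?)\s*;?\s*$",
        rf"\1 LIMIT {limit};",
        sql,
        flags=re.IGNORECASE | re.DOTALL,
    )


def change_engine(sql):
    """Rewrite ENGINE=MyISAM to ENGINE=InnoDB in DDL statements."""
    return re.sub(r"\bENGINE\s*=\s*MyISAM\b", "ENGINE=InnoDB", sql, flags=re.IGNORECASE)
```

Statements that do not match a pattern pass through unchanged, which is what makes this kind of rewriting usable as a patch for SQL that cannot be fixed in the application.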
Solution:
Use the right tool. Work smart, not hard.
Wrapping Up
Problem:
Database clusters are essential for performance and HA but are also hard to use properly.
Wrapping Up
MaxScale:
A Toolbox for the Database.
● Abstracts database clusters into services
● Truly understands traffic and environment
● Makes database clusters easy to use efficiently
Thank you

M|18 Why Abstract Away the Underlying Database Infrastructure
