MariaDB Paris Workshop 2023 - Performance Optimization

MariaDB Performance Optimization
Sebastien Giraud
Senior Solution Engineer
MariaDB plc

Agenda
● What is MariaDB
● Understanding MariaDB
● A brief overview of MariaDB's architecture
● Where to find performance
● Identifying slowdowns
● Other options

Understanding what MariaDB is
● Database
● Open source
● Multi-engine, plugin-based
● Highly tunable (800 variables)

● MariaDB is a modular solution
○ Storage engines
○ Plugins
● Linux-like architecture
○ Highly tunable and expandable
○ 800 configuration variables
● Green solution (C source code)
○ Package size is 100 MB
● Inter-engine replication
● Complete ecosystem
○ MaxScale
○ Mariabackup
○ MariaDB Shell
○ Connectors
○ Monitoring tools

● Do NOT use the default Community Edition configuration in production environments
● Understand the application's needs
○ Read/write ratio
○ Maximum number of connections
○ Cache hit ratio
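
The three indicators above can be estimated from server status counters. The following is a minimal sketch, assuming the counters were already read from SHOW GLOBAL STATUS into a dictionary; the function name workload_profile and the sample values are hypothetical and only illustrate the arithmetic.

```python
def workload_profile(status):
    """Derive rough workload indicators from SHOW GLOBAL STATUS counters."""
    reads = status["Com_select"]
    writes = status["Com_insert"] + status["Com_update"] + status["Com_delete"]
    requests = status["Innodb_buffer_pool_read_requests"]  # logical reads
    disk_reads = status["Innodb_buffer_pool_reads"]        # reads that missed the pool
    return {
        "read_write_ratio": reads / writes if writes else float("inf"),
        "buffer_pool_hit_ratio": 1 - disk_reads / requests if requests else None,
        "max_used_connections": status["Max_used_connections"],
    }


# Hypothetical counter values, for illustration only.
sample_status = {
    "Com_select": 9000,
    "Com_insert": 400,
    "Com_update": 500,
    "Com_delete": 100,
    "Innodb_buffer_pool_read_requests": 1_000_000,
    "Innodb_buffer_pool_reads": 10_000,
    "Max_used_connections": 120,
}
profile = workload_profile(sample_status)
print(profile)
```

Counters are cumulative since server start, so comparing two snapshots taken some time apart usually gives a more representative picture than a single reading.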

● Why MariaDB is magical
○ Inter-engine replication
○ Transparent federating proxy

Local performance: global memory
● User privileges
○ Stored in memory
○ Checked for every query
● vm.swappiness = 1
● Performance Schema
● MariaDB Shell
○ Easy diagnostics
● Every storage engine consumes memory

Local performance: global memory
● table_open_cache
● table_definition_cache
● query_cache_size
● thread_cache_size
● MyISAM
○ key_buffer_size
● InnoDB
○ innodb_buffer_pool_size
○ innodb_additional_mem_pool_size (deprecated, removed in recent MariaDB versions)
○ innodb_log_buffer_size

Local performance: per-session memory
● Per-client allocation
● Per-join and per-sort allocation
● Memory is freed when the session or connection ends
● Always keep the risk of OOM (out of memory) in mind
● Swap (again and again...)
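
To make the OOM concern concrete, here is a rough worst-case estimate: global buffers plus per-session buffers multiplied by the connection limit. This is a sketch under stated assumptions, not an exact model: the variable names are real MariaDB settings, but the sizes are hypothetical, and join and sort buffers may be allocated more than once per query.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def worst_case_memory(global_buffers, per_session_buffers, max_connections):
    """Rough upper bound in bytes: global buffers + every session at full allocation."""
    return sum(global_buffers.values()) + max_connections * sum(per_session_buffers.values())


# Hypothetical sizes, for illustration only.
global_buffers = {
    "innodb_buffer_pool_size": 8 * GIB,
    "innodb_log_buffer_size": 16 * MIB,
    "key_buffer_size": 128 * MIB,
}
per_session_buffers = {
    "sort_buffer_size": 2 * MIB,
    "join_buffer_size": 256 * KIB,
    "read_buffer_size": 128 * KIB,
    "read_rnd_buffer_size": 256 * KIB,
}

estimate = worst_case_memory(global_buffers, per_session_buffers, max_connections=500)
print(f"worst case: {estimate / GIB:.2f} GiB")
```

If the estimate approaches the physical RAM of the host, either the per-session buffers or the connection limit should come down before the kernel starts swapping or killing processes.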

Local performance: disk
● Persistence requires disk access
● Disk redundancy
● Adjust asynchronous replication parameters according to your application's needs
○ sync_binlog
○ sync_relay_log

And what if the performance gains were elsewhere?

Performance elsewhere?
● Adapting the architecture to performance requirements
● Understanding the use case
● Multiplying engines as needed
● A federating proxy to simplify everything

MaxScale distributes traffic
[Diagram: Applications & Tools connect to MaxScale at 172.20.0.6:4006, which provides automatic failover and a Read/Write Split service in front of the primary and replicas (172.20.0.2, 172.20.0.3, 172.20.0.4).]

MaxScale back on track ;)
[Diagram: MaxScale at 172.20.0.6:4006 (Read/Write Split service) in front of 172.20.0.2, 172.20.0.3 and 172.20.0.4. The primary fails and a replica is promoted to primary.]

MaxScale: uh, no, we didn't notice a thing!
[Diagram: MaxScale at 172.20.0.6:4006 (Read/Write Split service) in front of 172.20.0.2, 172.20.0.3 and 172.20.0.4. The failed server is reincorporated as a replica.]
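
The three slides above (normal operation, primary failure, rejoin) can be modelled with a toy router. This is only a sketch of the idea, not MaxScale's implementation: the class ReadWriteSplitter and its methods are hypothetical, and which of the three addresses is the primary is an assumption, since the diagram text does not say.

```python
class ReadWriteSplitter:
    """Toy read/write split: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next = 0  # round-robin position for reads

    def route(self, sql):
        verb = sql.split(None, 1)[0].upper()
        if verb != "SELECT" or not self.replicas:
            return self.primary  # writes, or reads when no replica is left
        target = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return target

    def fail_primary(self):
        """Primary failure: promote the first replica and return the failed server."""
        failed = self.primary
        self.primary = self.replicas.pop(0)
        return failed

    def rejoin(self, server):
        """A repaired server comes back as a replica."""
        self.replicas.append(server)


# Assumed roles: 172.20.0.2 starts as primary, the other two as replicas.
router = ReadWriteSplitter("172.20.0.2", ["172.20.0.3", "172.20.0.4"])
print(router.route("INSERT INTO t VALUES (1)"))  # goes to the primary
print(router.route("SELECT * FROM t"))           # goes to a replica
```

The application only ever talks to the router address, which is why a failover and a later rejoin stay invisible to it.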

In detail: what are the different needs?

Analytical workloads?
● Analytical workloads
○ Mainly read requirements
○ Write overhead
○ Very good solution for read requests on large datasets
● Limits of the B-tree model?

B-tree indexes: the good
• Well-known technology
• Works with most types of data
• Scales reasonably well
• Really good for OLTP transactional data

B-tree indexes: the bad
• Really bad for unbalanced data
• Index modifications can be really slow
• Index modifications are largely single-threaded
• Slows down as the amount of data grows
• Really not scalable with large amounts of data

In short, what does analytics need?
● Something that can compress a LOT of data
● Something that can be written to with fast, predictable performance
● Something that does not necessarily support transactions
○ It doesn't hurt, but performance is much more important
● A system capable of handling analytical queries
○ Ad hoc requests
○ Aggregated queries
○ Large datasets
● A system capable of adapting to data growth
● A system capable of ensuring a high level of availability
● Works with analysis tools such as Tableau, R, etc.

What's new in ColumnStore
https://mariadb.com/docs/server/whats-new/mariadb-enterprise-columnstore-6/
● Disk-based aggregation of query results
○ Result sets larger than available memory (>1 TB)
● DECIMAL precision increased from 18 to 38
● LZ4 and Snappy compression
● Update transactional data from ColumnStore data, in addition to cross-engine joins
UPDATE innodb_tab i
JOIN columnstore_tab c
ON i.col1 = c.col1
SET i.col2 = c.col2;

The transactional case
● What to do when performance no longer measures up?
○ Review the servers
○ Review the configurations
○ Optimize memory and CPU usage
○ Check query complexity
■ ANALYZE FORMAT=JSON is a mix of the EXPLAIN FORMAT=JSON and ANALYZE statement features. The
ANALYZE FORMAT=JSON $statement will execute $statement, and then print the output of EXPLAIN
FORMAT=JSON, amended with data from the query execution.
■ EXPLAIN FORMAT=JSON is a variant of EXPLAIN command that produces output in JSON form. The output
always has one row which has only one column titled "JSON". The contents are a JSON representation of
the query plan, formatted for readability:
EXPLAIN FORMAT=JSON SELECT * FROM t1 WHERE col1=1\G
● AND THEN?

Optimizing transactional workloads
https://mariadb.com/resources/blog/facebook-myrocks-at-mariadb/#sthash.ZlEr7kNq.dpuf
● MyRocks? Yes, Facebook's engine!
● LSM (log-structured merge) algorithm: fast indexing of large data volumes
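
Why an LSM structure absorbs writes quickly can be seen in a toy model: writes land in an in-memory table, which is flushed as a sorted run when full, and reads check the newest data first. This is a minimal sketch of the general LSM idea, not MyRocks or RocksDB code; the class TinyLSM and its memtable_limit parameter are hypothetical.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store: memtable plus sorted runs, newest first."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # each run is a sorted list of (key, value) pairs
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch memory; no existing on-disk structure is modified.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one new sorted run."""
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest run first, so the latest value wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping the newest value of each key."""
        merged = {}
        for run in reversed(self.runs):  # oldest first, newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(key, value)
print(db.get("a"), len(db.runs))
```

The trade-off is visible in get: a read may have to look through several runs, which is what background compaction keeps under control.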

And why not split the schemas?
● Sharding with Spider
CREATE TABLE s(
id INT NOT NULL AUTO_INCREMENT,
code VARCHAR(10),
PRIMARY KEY(id)
)
ENGINE=SPIDER
COMMENT 'host "127.0.0.1", user "msandbox",
password "msandbox", port "8607"';

And why not split the schemas?
● Sharding with MaxScale
[accounts_east]
type=server
address=192.168.56.102
port=3306
[accounts_west]
type=server
address=192.168.122.85
port=3306
[Sharded-Service]
type=service
router=schemarouter
servers=accounts_west,accounts_east
user=sharduser
password=YqztlYGDvZ8tVMe3GUm9XCwQi
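
The idea behind schema-based sharding can be sketched as a lookup from database name to backend server. This is a toy illustration, not the schemarouter implementation: the class SchemaShardRouter and the schema names east_db and west_db are hypothetical, while the server names come from the configuration above.

```python
class SchemaShardRouter:
    """Toy schema-based sharding: each database lives on exactly one server."""

    def __init__(self, schema_to_server):
        self.schema_to_server = dict(schema_to_server)

    def route(self, qualified_table):
        """Return the server holding the schema of a 'schema.table' name."""
        schema, dot, table = qualified_table.partition(".")
        if not dot or not table:
            raise ValueError(f"expected 'schema.table', got {qualified_table!r}")
        if schema not in self.schema_to_server:
            raise LookupError(f"no shard holds schema {schema!r}")
        return self.schema_to_server[schema]


# Hypothetical schema names mapped to the servers defined in the configuration.
shards = SchemaShardRouter({
    "east_db": "accounts_east",
    "west_db": "accounts_west",
})
print(shards.route("east_db.customers"))
print(shards.route("west_db.customers"))
```

The client sees a single endpoint, and each query is sent to whichever server owns the database it names.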

MaxScale overview
Advanced
● Performance and scalability
○ Read/write split
○ Adaptive load balancing
○ Causal reads
○ Query result caching with Redis
● HA
○ Automatic failover
○ Transaction replay
○ Parallel replication with Xpand
● Multiple monitors
○ Xpand, ColumnStore and replicated environments
● Cooperative locking
○ MaxScale HA
○ Multiple MaxScale monitors in one cluster
● Security
○ Database firewall
○ Dynamic data masking
○ Query throttling
○ Query result limiting
○ Performance statistics
○ Centralized query logging
Basics

Powerful Query Editor
● Formatted Results
● Visualization
● Data Preview
● Syntax Highlighting

In summary
Single-node performance
● Thread pool
● Memory
● Threads
Single-query performance
● Tools: EXPLAIN, ANALYZE, MariaDB Shell
In an async / semi-sync cluster
● MaxScale
● Parallel replication
● Load distribution with read/write split
Analytics
● ColumnStore
Transactional
● MyRocks
● Spider
● MaxScale
