
Configuring workload-based storage and topologies


MariaDB has multiple workload-optimized storage engines, including InnoDB for mixed workloads, MyRocks for write-intensive workloads, Spider for scalable workloads and ColumnStore for analytical workloads. In this session, Kenny Geiselhart discusses how to choose the right storage engine for individual tables, and how replication and asymmetric topologies can be used to further optimize MariaDB and the hardware it runs on for specific workloads.

Published in: Software


  1. Configuring workload-based storage and topologies. Kenneth Geiselhart, Enterprise Architect, MariaDB Corporation
  2. MariaDB workload-based storage engines. “We are running more than 25 billion queries an hour on MariaDB…the query patterns change every hour.” Tim Yim, Director of Operations, ServiceNow
  3. Microservices example: the Order service uses an Orders table on MyRocks (write-intensive) in Database #1; the Product service uses a Products table on InnoDB (mixed); the Cart service uses a Cart table on Spider (scalable), backed by Cart tables on InnoDB (mixed) in Databases #2, #3 and #4.
  5. Internals, query flow: Clients → Thread Cache / Threads (SQL) → Query Cache → SQL Parser / Pre-Processor → Parse Tree → Optimizer → Execution Plan → Query Execution Engine (handler) → Storage Engine API → storage engines (Spider, InnoDB, MyISAM, Memory, MyRocks, Aria) → Results.
  6. “Supported” purpose-built storage engines: InnoDB for mixed read/write workloads; MyRocks for write-intensive workloads; Aria, an improved MyISAM, crash-safe; Spider for distributed, scalable workloads; ColumnStore for analytics.
  7. InnoDB: multipurpose storage
  8. InnoDB, ACID compliant (Atomicity, Consistency, Isolation, Durability). Best known for online transaction processing: commit/rollback, reliable crash recovery, row-level locking, foreign key referential-integrity constraints, multi-version concurrency control, encryption.
  9. When to use InnoDB: anything transactional. Support for transactions (giving you the ACID properties), e.g. banking, e-commerce. Row-level locking: a finer-grained locking mechanism gives you higher concurrency than, for instance, MyISAM. Foreign key constraints: let the database ensure the integrity of the database state and the relationships between tables.
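A minimal sketch of those points, using hypothetical `customers`/`orders` tables (names and columns are illustrative, not from the deck):

```sql
-- Hypothetical e-commerce tables illustrating InnoDB's strengths:
-- transactions, row-level locking and foreign key constraints.
CREATE TABLE customers (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE orders (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  total       DECIMAL(10,2) NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES customers (id)
) ENGINE=InnoDB;

-- An ACID unit of work: both inserts commit together or not at all.
START TRANSACTION;
INSERT INTO customers (name) VALUES ('Alice');
INSERT INTO orders (customer_id, total) VALUES (LAST_INSERT_ID(), 19.99);
COMMIT;
```

The foreign key means the database itself rejects an order pointing at a nonexistent customer, rather than leaving that check to the application.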
  10. InnoDB sizing: innodb_buffer_pool_size = e.g. 256M-900G. Rule of thumb: ((Total RAM - (OS + apps) - (mysqld memory + query buffers) - caches for things like the binary log - other possible buffers and caches - MEMORY tables) / 105%), rounded down. Example: ROUND(192G - ~10G - ~4G - ~1G - ~2G - ~3G) ≈ 168G.
  11. InnoDB tuning: innodb_buffer_pool_size = e.g. 256M-900G; innodb_log_file_size = e.g. 2G-16G; innodb_stats_on_metadata = 0; innodb_flush_log_at_trx_commit = e.g. 0 or 2 where full per-commit durability can be relaxed (default 1).
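Collected into a my.cnf fragment, the settings above might look like this; the values are illustrative for a dedicated ~192G server, not recommendations:

```ini
# my.cnf (illustrative values for a dedicated ~192G MariaDB server)
[mysqld]
innodb_buffer_pool_size        = 168G  # from the sizing rule of thumb above
innodb_log_file_size           = 4G    # larger logs smooth write-heavy bursts
innodb_stats_on_metadata       = 0     # don't refresh statistics on metadata queries
# 1 = flush and sync on every commit (fully durable);
# 0 or 2 trade some crash durability for throughput
innodb_flush_log_at_trx_commit = 1
```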
  12. MyRocks: writes and compression
  13. MyRocks history: Google Bigtable → LevelDB (Google) → RocksDB (Facebook) → MyRocks (Facebook) → MyRocks (MariaDB); Apache HBase also descends from Bigtable.
  14. MyRocks: log-structured merge (LSM) tree. Modifications are buffered in an in-memory data structure and written to disk as log segments, which are later merged. (Diagram: values X and Y, their modifications, and the resulting X′ and Y′ written to disk.)
  15. Why MyRocks? Best space efficiency: uses about half the space of compressed InnoDB. Better write efficiency: about a quarter of the writes of uncompressed InnoDB. OK read efficiency: can be slower than InnoDB. Effective with SSDs or disks: works well on either. A good fit if the database is larger than RAM and you are currently using InnoDB. Good for web-based application workloads.
  16. MyRocks tuning: rocksdb_block_cache_size = e.g. 256M-900G; rocksdb_default_cf_options = "compaction_pri=kMinOverlappingRatio;level_compaction_dynamic_level_bytes=true"; rocksdb_max_background_jobs = (# CPU cores / 4).
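As a my.cnf fragment, with an illustrative cache size and a 16-core host assumed:

```ini
# my.cnf (illustrative MyRocks settings)
[mysqld]
rocksdb_block_cache_size    = 32G   # sized like a buffer pool, per available RAM
rocksdb_default_cf_options  = "compaction_pri=kMinOverlappingRatio;level_compaction_dynamic_level_bytes=true"
rocksdb_max_background_jobs = 4     # e.g. 16 cores / 4
```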
  17. MyRocks indexing notes: use binary collations (latin1_bin, utf8_bin, binary) for indexed string columns; assign a column family via INDEX index_name(col1, ...) COMMENT 'family_name'; rocksdb_max_open_files = -1.
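A sketch of a MyRocks table applying those notes; the table, column, and column-family names are assumptions for illustration:

```sql
-- Hypothetical write-heavy events table on MyRocks: binary collation on the
-- indexed string column, and a column family named in the index comment.
CREATE TABLE events (
  id       BIGINT AUTO_INCREMENT PRIMARY KEY,
  user_id  BIGINT NOT NULL,
  payload  VARCHAR(255) COLLATE latin1_bin,
  INDEX idx_user (user_id) COMMENT 'cf_events'
) ENGINE=RocksDB;
```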
  18. Aria: system and temp tables
  19. Aria: crash-safe; uses table-level locks; used for temp tables; good for full-text search; the default storage engine for system tables in MariaDB 10.4.
  20. Aria tuning: aria_pagecache_buffer_size = e.g. 256M-64G; aria_block_size = e.g. 2048-8192; aria_group_commit = "hard"; aria_group_commit_interval = 0.
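As a my.cnf fragment with illustrative values (aria_block_size takes effect only at server startup, before tables are created):

```ini
# my.cnf (illustrative Aria settings)
[mysqld]
aria_pagecache_buffer_size = 1G
aria_block_size            = 8192    # startup-only; applies to new tables
aria_group_commit          = "hard"
aria_group_commit_interval = 0
```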
  21. Spider: extreme scale
  22. Spider: distributed and scalable. Transparent sharding: the application doesn't need to know where the shards are, it just queries the Spider node. Scalability and concurrency: table partitioning (e.g. range, key, hash, list) and pushdown (e.g. condition, index, join and aggregate). High availability and consistency: two-phase commit.
  23. Spider node only:
CREATE TABLE test.spider (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(50)
) ENGINE=Spider
COMMENT='user "web", port "3306", password "myspider", table "sharding"'
PARTITION BY RANGE (id) (
  PARTITION p1 VALUES LESS THAN (500000) COMMENT 'host "DataNode 1"',
  PARTITION p2 VALUES LESS THAN (1000000) COMMENT 'host "DataNode 2"',
  ...
  24. Spider: distributed and scalable. Spider node: Table A (Spider), no data stored on the Spider node. Shared data node 1: Table A, partition 1 (InnoDB), rows 1-500,000. Shared data node 2: Table A, partition 2 (InnoDB), rows 500,001-1,000,000.
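The Spider DDL above references a remote table named "sharding"; a sketch of the backing table each data node would hold (database and column names assumed to mirror the Spider-node example):

```sql
-- On each data node: the actual storage for one shard of the Spider table.
-- The Spider node itself stores no rows; it routes queries here.
CREATE TABLE test.sharding (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(50)
) ENGINE=InnoDB;
```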
  25. Database #1: Table A (Spider); Database #5: Table A (Spider). Database #2: Table A, partition 1 (InnoDB), rows 1-500,000. Database #3: Table A, partition 2 (InnoDB), rows 500,001-1,000,000. Database #4: Table A, partition 3 (InnoDB), rows 1,000,001-1,500,000.
  26. ColumnStore: analytics
  27. ColumnStore: a high-performance columnar storage engine that supports a wide variety of analytical use cases with SQL in highly scalable distributed environments. Brings the power of SQL and the freedom of open source to big data analytics. A single SQL interface for OLTP and analytics. Parallel query processing for distributed environments.
  28. ColumnStore: handles large amounts of data being loaded; tremendous performance improvement for analytical queries using the columnar format; massively parallel processing with Performance Modules; hybrid transactional/analytical processing.
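A sketch of an analytical table on ColumnStore; the fact-table schema and query are hypothetical illustrations, not from the deck:

```sql
-- Hypothetical fact table for analytics. ColumnStore stores each column
-- separately and needs no indexes for analytical scans.
CREATE TABLE sales_facts (
  sale_date DATE,
  store_id  INT,
  sku       VARCHAR(32),
  amount    DECIMAL(12,2)
) ENGINE=ColumnStore;

-- A typical analytical query: only the referenced columns are read,
-- and the scan is parallelized across Performance Modules.
SELECT store_id, SUM(amount)
FROM sales_facts
WHERE sale_date BETWEEN '2019-01-01' AND '2019-03-31'
GROUP BY store_id;
```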
  29. ColumnStore configuration: DBRoots are the data file containers. Internal: local root-level storage on the server. External: e.g. EXTx-formatted SAN storage. Data redundancy: requires GlusterFS on all PM nodes. NumBlocksPct = percentage of memory used for block caching (e.g. 50%). TotalUmMemory = percentage of memory used for UM operations such as final consolidation and sorting (e.g. 25%).
  30. More storage engines. Memory: best used for read-only caches of data from other tables, or for temporary work areas; data is kept in memory, so it is vulnerable. MyISAM: a light, non-transactional engine with great performance, easy to copy between systems, with a small data footprint; it is all but dead, use Aria instead. Blackhole: accepts data but does not store it, and always returns an empty result.
  31. Microservices (recap): the Order service uses an Orders table on MyRocks (write-intensive) in Database #1; the Product service uses a Products table on InnoDB (mixed); the Cart service uses a Cart table on Spider (scalable), backed by Cart tables on InnoDB (mixed) in Databases #2, #3 and #4.
  32. Documentation: for further reading, see the MariaDB storage engine documentation.
  33. THANK YOU!