This document summarizes new features and enhancements in MariaDB MaxScale 2.5 and MariaDB ColumnStore 1.5. Some key points include:
- MaxScale 2.5 includes a new graphical user interface, improved binlog router, capability to stream binlogs to Kafka as JSON, and distributed caching between MaxScale servers.
- ColumnStore 1.5 features a new API, PowerBI direct query connector, improved replication from InnoDB, and multinode support in SkySQL.
- Configuration and installation of ColumnStore have been simplified, including a new Columnstore.xml utility and an S3 storage manager for redundant file storage in object storage.
3. What is MaxScale?
Database proxy
● Automatic failover
● Transaction replay
● Schema sharding
● Causal reads
● Query blocking
○ Database firewall
● Data masking
● Denial-of-service protection
○ Result limiting
Replication server
● Offload replication from the primary database
● Supports a very large number of read replicas
4. MariaDB MaxScale: New Features in 2.5
● New Graphical User Interface (Alternative to MaxCtrl)
● New and Improved Binlog Router
● Stream Binlogs (as JSON) to Kafka Broker
● Cooperative Monitoring and Locking (New HA Solution)
● Distributed Cache between MaxScale Servers (i.e. Redis, Memcached)
● Columnstore Orchestration
6. New Binlog Router
Allows MaxScale to serve replication traffic to any number of slaves while removing that load from the true master.
● Completely rewritten
● More efficient and performant
● Heartbeat and burst interval configured automatically
10. Columnstore Orchestration via API
[Diagram: MaxScale in front of three Columnstore nodes (Primary, Replica 1, Replica 2), with dbroot1, dbroot2, and dbroot3 shown below them. Legend: Normal Traffic, S3, API.]
13. Columnstore Orchestration via API
[Diagram: same layout as slide 10 (MaxScale, Columnstore Primary, Replica 1, Replica 2), with the dbroots shown in the order dbroot2, dbroot3, dbroot1. Legend: Normal Traffic, S3, API.]
15. Columnstore Orchestration via API
[Diagram: MaxScale in front of three Columnstore nodes, now labeled Primary, Primary, and Replica 2, with the dbroots shown in the order dbroot2, dbroot3, dbroot1. Legend: Normal Traffic, S3, API.]
17. THE PROBLEM OF BIG DATA
The analytic workload is not too complex (compared to scientific work), but as data grows, performance degrades merely because of its size.
Computers are not designed to scale well at the hardware level, so scaling needs to be done at the software/algorithm level.
18. WHAT IS MARIADB COLUMNSTORE?
● Distributed and columnar storage for analytics
○ Stores data by column rather than row (file per column)
○ Stores data on separate, dedicated storage nodes
○ Scale out by adding more storage nodes
○ Can use GlusterFS or S3 for high availability
● Compression: up to 90%
19. WHY USE MARIADB COLUMNSTORE?
Features
● High performance storage engine
● Fully compatible with standard SQL
● Easier big data enterprise analytics
○ Single interface for OLTP and analytics
○ Easy to integrate with third-party tools
● Fast import of data (cpimport)
● Direct injection of data (bypassing SQL layer)
○ No need for indexes and views
● Powerful in analytics (Star/Snowflake schemas)
● Scalability (just add a new node (PM) to the cluster; no data distribution is required)
● Concurrency
● File level backup in real-time
● Good compression, hence low disk space usage
● Faster, more efficient queries
○ Distributed parallel processing
20. ROW-ORIENTED VERSUS COLUMN-ORIENTED
ROW-ORIENTED
● Rows stored sequentially in a file
● Scans through every record row by row
COLUMN-ORIENTED
● Each column is stored in a separate file
● Scans only the relevant column
ID  FNAME     LNAME  STATE  ZIP    PHONE           AGE  GENDER
1   Bugs      Bunny  NY     11217  (718) 938-3235  34   M
2   Yosemite  Sam    CA     95389  (209) 375-6572  52   M
3   Daffy     Duck   NY     10013  (212) 227-1810  35   M
4   Elmer     Fudd   ME     04578  (207) 882-7323  43   M
5   Witch     Hazel  MA     01970  (978) 744-0991  57   F
> SELECT Fname FROM TABLE1 WHERE State = 'NY';
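The difference between the two layouts for the query above can be illustrated with a small toy model. This is only a sketch: the in-memory lists stand in for files, the "values read" counter is a rough proxy for I/O, and none of it reflects ColumnStore's actual on-disk format. Only four of the slide's columns are included to keep it short.

```python
# Toy model of the two layouts; not how ColumnStore stores data on disk.
ROWS = [
    (1, "Bugs", "Bunny", "NY"),
    (2, "Yosemite", "Sam", "CA"),
    (3, "Daffy", "Duck", "NY"),
    (4, "Elmer", "Fudd", "ME"),
    (5, "Witch", "Hazel", "MA"),
]
COLUMN_NAMES = ("id", "fname", "lname", "state")

# Column-oriented layout: one list per column (stand-in for one file per column).
COLUMNS = {name: [row[i] for row in ROWS] for i, name in enumerate(COLUMN_NAMES)}


def query_row_store(rows):
    """Scan every full row, then filter on state and project fname."""
    values_read = 0
    result = []
    for row in rows:
        values_read += len(row)  # the whole row is read, needed or not
        if row[3] == "NY":
            result.append(row[1])
    return result, values_read


def query_column_store(columns):
    """Read only the state column to filter, then fetch matching fname entries."""
    values_read = 0
    matches = []
    for pos, state in enumerate(columns["state"]):
        values_read += 1
        if state == "NY":
            matches.append(pos)
    result = []
    for pos in matches:
        values_read += 1
        result.append(columns["fname"][pos])
    return result, values_read


row_result, row_cost = query_row_store(ROWS)
col_result, col_cost = query_column_store(COLUMNS)
print(row_result, row_cost)
print(col_result, col_cost)
```

Both functions return the same names; the column-oriented version simply touches fewer values because it never reads the id or lname columns.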
21. Storage Architecture Reduced I/O
High Performance Query Processing
● Only touch column files that are in Filter, Projection, Group By, and Join Conditions
● Eliminate Disk Block touches to Partitions Outside Filter and Join Conditions
> SELECT item, SUM(qty) FROM orders WHERE ship_date BETWEEN '2016-01-01' AND '2016-01-31' GROUP BY item
Horizontal partitions (8 million rows each):
Extent 1: ShipDate 2016-01-12 - 2016-03-05
Extent 2: ShipDate 2016-03-05 - 2016-09-23 (ELIMINATED PARTITION)
Extent 3: ShipDate 2016-09-24 - 2017-01-06 (ELIMINATED PARTITION)
id     order_id  line  item     qty  price  supplier  ship_date   mode
1      1         1     Laptop   5    1000   Dell      2016-01-12  G
2      1         2     Monitor  5    200    LG        2016-01-13  G
3      2         1     Mouse    1    20     Logitech  2016-02-05  M
4      3         1     Laptop   3    1600   Apple     2016-01-31  P
...    ...       ...   ...      ...  ...    ...       ...         ...
8M                                                    2016-03-05
8M+1                                                  2016-03-05
...    ...       ...   ...      ...  ...    ...       ...         ...
16M                                                   2016-09-23
16M+1                                                 2016-09-24
...    ...       ...   ...      ...  ...    ...       ...         ...
24M                                                   2017-01-06
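The elimination step on this slide can be sketched as a range-overlap check against per-extent minimum and maximum values. The sketch below is an illustration under that assumption, using the ShipDate ranges from the slide; the function and variable names are hypothetical, and this is not ColumnStore's internal implementation.

```python
from datetime import date

# Per-extent (min, max) of ship_date, taken from the slide.
EXTENTS = {
    "Extent 1": (date(2016, 1, 12), date(2016, 3, 5)),
    "Extent 2": (date(2016, 3, 5), date(2016, 9, 23)),
    "Extent 3": (date(2016, 9, 24), date(2017, 1, 6)),
}


def extents_to_scan(extents, lo, hi):
    """Return the extents whose [min, max] range overlaps the filter [lo, hi]."""
    return [
        name
        for name, (ext_min, ext_max) in extents.items()
        if ext_min <= hi and ext_max >= lo
    ]


# Filter from the query: ship_date BETWEEN '2016-01-01' AND '2016-01-31'
scanned = extents_to_scan(EXTENTS, date(2016, 1, 1), date(2016, 1, 31))
eliminated = [name for name in EXTENTS if name not in scanned]
print("scan:", scanned)
print("eliminated:", eliminated)
```

Only Extent 1 overlaps the January range, so the other two extents can be skipped without reading any of their blocks.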
22. Sample Two Node ColumnStore Cluster
[Diagram: App and BI clients connect through MaxScale to two nodes (Node 1 and Node 2), each running MariaDB w/ ColumnStore. Legend: Normal Traffic, InnoDB BinLog Replication, ColumnStore Traffic.]
23. New Features in 1.5
Phase 1 (Summer 2020)
● New API (Replaces OAM)
● PowerBI Direct Query Connector
● Improved Collation and Character Sets
● Improved Write Cache (100x faster)
Phase 2 (Fall 2020)
● Multinode in SkySQL
● Wider Extents
● Extended Oracle Mode
● Dec(38)
Phase 3 (Winter 2020)
● Faster Replication From InnoDB
● Sequences
● Restart of Broken LDI (Checkpoints)
Phase 4 (Spring 2021)
● Extent Elimination for DOUBLE
● Improved redistributedata
● Dynamic Columns
27. Configure Columnstore
We provide a couple of Columnstore.xml utilities.
mcsGetConfig -a
shows the current Columnstore.xml settings.
To change a variable, run:
mcsSetConfig <group> <parameter> <value>
Example: mcsSetConfig CrossEngineSupport User "myuser"
Note: A restart of Columnstore is required.
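To make the group/parameter addressing concrete, here is a rough sketch of how such a pair could map onto an XML file. It assumes that a group is an element under the document root and a parameter is a child element of that group. The XML fragment, the Host value, and the helper names get_config and set_config are hypothetical stand-ins for illustration. On a real system, use mcsGetConfig and mcsSetConfig rather than editing the file directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for Columnstore.xml (assumed layout).
SAMPLE_XML = """<Columnstore>
  <CrossEngineSupport>
    <Host>127.0.0.1</Host>
    <User>root</User>
  </CrossEngineSupport>
</Columnstore>"""


def get_config(root, group, parameter):
    """Return the text of <group>/<parameter>, or None if it is absent."""
    node = root.find(f"{group}/{parameter}")
    return None if node is None else node.text


def set_config(root, group, parameter, value):
    """Overwrite the text of an existing <group>/<parameter> element."""
    node = root.find(f"{group}/{parameter}")
    if node is None:
        raise KeyError(f"{group}/{parameter} not found")
    node.text = value


root = ET.fromstring(SAMPLE_XML)
before = get_config(root, "CrossEngineSupport", "User")
set_config(root, "CrossEngineSupport", "User", "myuser")
after = get_config(root, "CrossEngineSupport", "User")
print(before, "->", after)
```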