MariaDB ColumnStore is a high performance columnar storage engine that supports analytical workloads through SQL. It uses a distributed, massively parallel architecture to provide faster and more efficient queries on large datasets. Key features include its use of a columnar data structure for compression and performance, distributed processing and parallel query execution, and integration with the standard MariaDB interface to allow SQL-based analytics.
4. MariaDB ColumnStore
• GPLv2 open source
• Columnar, massively parallel MariaDB storage engine
• Scalable, high-performance analytics platform
• Built-in redundancy and high availability
• Runs on premises or on AWS cloud
• Full SQL syntax and capabilities regardless of platform
[Diagram: Big Data Sources, Analytics, Insight. MariaDB ColumnStore runs on Node 1, Node 2, Node 3 ... Node N over Local / AWS® / GlusterFS® storage, with ELT tools and BI tools attached.]
5. MariaDB ColumnStore Architecture
[Diagram: user connections reach User Module 1 ... User Module n (MariaDB front end and query engine; processes SQL requests), which sit above Performance Module 1, 2 ... n (distributed processing engine) and the columnar distributed data storage.]
6. MariaDB ColumnStore
High-performance columnar storage engine that supports a wide variety of analytical use cases with SQL in highly scalable distributed environments
• Faster, more efficient queries: parallel query processing for distributed environments
• Easier enterprise analytics: single SQL interface for OLTP and analytics
• Better price performance: power of SQL and freedom of open source for big data analytics
7. Workload – Query Vision/Scope
[Chart: query scope in rows on a scale of 1, 100, 10,000, 1,000,000, 100,000,000, 10,000,000,000, with data-size bands of 10-100GB, 100-1,000GB, and 1-10TB; regions labeled OLTP/NoSQL Workloads and OLAP/Analytic/Reporting Workloads.]
Suited for reporting or analysis of millions to billions of rows from data sets containing millions to trillions of rows.
8. MariaDB ColumnStore
MariaDB Functions
• MariaDB Client
• MariaDB Connectivity (JDBC, ODBC)
• MariaDB Security
• Initial SQL Statement Parsing
• Initial SQL Optimization < Custom Handler Class >
• Execute final sort and final limit
• Display final results
ExeMgr Functions
• SQL Optimization
• Distribute work for scan, filter, join, functions, expressions, group by, aggregation, etc. to all available Performance Modules to be run in parallel
• Collect the results returned by the Performance Modules
• Return the final results to the MariaDB front end for display
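The ExeMgr role described above (fan work out, collect partial results, return one answer) can be illustrated with a toy scatter/gather in Python. This is only a conceptual sketch: the `partitions` data, the `scan_filter_aggregate` helper, and the thread pool are hypothetical stand-ins for Performance Modules, not ColumnStore APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for the rows held by one Performance Module.
partitions = [
    [5, 12, 7, 30],
    [18, 2, 25],
    [40, 9, 11, 3],
]

def scan_filter_aggregate(rows, threshold):
    """Work done on one 'PM': scan, filter, and return a partial (count, sum)."""
    kept = [r for r in rows if r > threshold]
    return len(kept), sum(kept)

def run_query(threshold):
    """Role of the 'ExeMgr': fan out to every partition, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(
            pool.map(lambda rows: scan_filter_aggregate(rows, threshold), partitions)
        )
    total_count = sum(count for count, _ in partials)
    total_sum = sum(partial_sum for _, partial_sum in partials)
    return total_count, total_sum

# Equivalent in spirit to: SELECT COUNT(*), SUM(x) FROM t WHERE x > 10
print(run_query(threshold=10))
```

Each worker returns only a small partial aggregate, so the merge step handles one tuple per partition rather than raw rows.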
[Diagram: MariaDB and the ColumnStore ExeMgr run in the User Module (MySQL front end; processes SQL requests); the Performance Modules form the MPP distributed processing engine that executes the queries over the columnar distributed data storage.]
11. User Module at a Glance
Process | Functionality | Value
MariaDB | Hosts MariaDB; connection management; SQL parsing & optimization | Familiar DBMS interface; leverages existing partner integrations; delivers rich SQL syntax support
Extent Map | Abstracts physical and logical storage; metadata store | Enables partition elimination
ExeMgr | Work distribution; final results management and aggregation | Multi-threaded to take advantage of multi-core HW platforms
12. Performance Module at a Glance
Process | Functionality | Value
PrimProc | Scale-out cache management; distributed scan, filter, join and aggregation operations; resource management | Independent scalability and tunable performance; multi-threaded to take advantage of multi-core HW platforms
Data | High-speed bulk load; transactional DML and DDL; online schema extensions | Non-blocking read enabled; multi-threaded to take advantage of multi-core HW platforms
13. Columnar General Best Practices
Not suited for OLTP
Micro-batch load allows for near real-time behavior
Infrequently used columns do not impact other queries
Columnar suitable for sparse columns (nulls compress nicely)
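The point about sparse columns can be seen with a rough experiment. The sketch below uses Python's `zlib` as a stand-in compressor and encodes NULL as four zero bytes; both choices are assumptions made for illustration, since the slides do not say how ColumnStore encodes NULLs or which codec it uses.

```python
import zlib

ROWS = 100_000

# Hypothetical sparse 4-byte column: one real value every 100 rows,
# NULL elsewhere (modelled here as four zero bytes).
sparse_column = b"".join(
    i.to_bytes(4, "little") if i % 100 == 0 else b"\x00" * 4
    for i in range(ROWS)
)

raw_size = len(sparse_column)
compressed_size = len(zlib.compress(sparse_column))

print(f"raw: {raw_size} bytes, compressed: {compressed_size} bytes")
print(f"ratio: {raw_size / compressed_size:.1f}x")
```

Because a columnar layout stores the column contiguously, long runs of identical NULL markers sit next to each other, which is the pattern general-purpose compressors handle best.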
14. Data Modeling Best Practices
Star-schema optimizations are generally a good idea
Conservative data typing is very important
Especially around fixed-length vs. dictionary boundary (8 bytes)
IP Address vs. IP Number
Break down compound fields into individual fields:
Trivializes searching for sub-fields
Can avoid dictionary overhead
Cost to re-assemble is generally small
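As an illustration of the "IP Address vs. IP Number" advice, the sketch below converts a dotted IPv4 string to an integer and back using Python's standard `ipaddress` module. The helper names are hypothetical. The motivation follows from the data-type slides: a dotted-quad string needs up to 15 characters, which falls in dictionary territory, while the numeric form fits in 4 bytes as an unsigned value; the exact column type to use is a schema decision not covered here.

```python
import ipaddress

def ip_to_number(ip_text: str) -> int:
    """Dotted-quad IPv4 text -> unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(ip_text))

def number_to_ip(ip_number: int) -> str:
    """Re-assemble the dotted-quad text from the integer form."""
    return str(ipaddress.IPv4Address(ip_number))

def split_octets(ip_text: str) -> tuple:
    """Break the compound field into four individually searchable sub-fields."""
    return tuple(int(part) for part in ip_text.split("."))

number = ip_to_number("192.168.0.1")
print(number)                       # integer form
print(number_to_ip(number))         # cheap to re-assemble for display
print(split_octets("192.168.0.1"))  # one small integer per sub-field
```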
15. Compression with Data Storage Layer
• Logical layer: blocks (8KB) grouped into extents (Extent 1: 8MB~64MB, 8 million rows)
• Physical layer: segment files (Segment File 1 maps to an extent), stored as compression chunks
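The 8MB~64MB extent range follows from the column width. A small sketch of the arithmetic, assuming "8 million rows" means 8 × 1024 × 1024 rows and that the sizes are before compression (both are assumptions; the slide does not spell them out):

```python
BLOCK_BYTES = 8 * 1024              # 8KB block, from the slide
ROWS_PER_EXTENT = 8 * 1024 * 1024   # assumption: "8 million" = 8 x 2^20 rows

def extent_footprint(width_bytes: int) -> tuple:
    """Return (size in MB, number of 8KB blocks) for one uncompressed extent."""
    size_bytes = ROWS_PER_EXTENT * width_bytes
    return size_bytes // (1024 * 1024), size_bytes // BLOCK_BYTES

for width in (1, 2, 4, 8):
    megabytes, blocks = extent_footprint(width)
    print(f"{width}-byte column: {megabytes} MB per extent, {blocks} blocks")
```

A 1-byte column gives the 8MB lower bound and an 8-byte column gives the 64MB upper bound.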
16. Data Load and Extents (local load)
Each full extent holds 8 million rows.
1st data load: CSV file, data range 1 ~ 200, 16 million rows
• Extent 1: Min 1, Max 200
• Extent 2: Min 1, Max 200
2nd data load: new CSV file, data range 150 ~ 210, 16 million + 8 rows
• Extent 3: Min 150, Max 210
• Extent 4: Min 150, Max 210
• Extent 5: Min 150, Max 210
17. Extent Map
Key meta-structure that powers MariaDB ColumnStore’s performance
A catalog of all extents:
• Minimum and maximum values for a column’s data within an extent
• Corresponding blocks for each extent
Master copy of the Extent Map is kept on the primary PM node
Upon system startup, it is copied to all other UM and PM nodes for disaster recovery and failover purposes
Extent Map is resident in memory for quick access at all nodes
As extents are modified, updates are broadcast to all participating nodes
Stores about 64 bytes for each 8-64 MB on disk
18. Extent Map
When performing queries:
• Eliminate extents by considering only the extents of the columns used in join and filter conditions
• Use the minimum and maximum values of each extent for join columns to rule out extents that cannot contain matching rows
Multiple columns can be used together for partition elimination
Transitive properties apply, i.e. a filter on a dimension column (date, for example) can allow for partition elimination on the fact table
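Extent elimination can be sketched as a range-overlap test against the per-extent minimum and maximum. The code below is a simplified model, not ColumnStore's implementation: the `Extent` class and `extents_to_scan` function are hypothetical, and the min/max values are taken from the two-load example on the "Data Load and Extents" slide.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    extent_id: int
    min_value: int
    max_value: int

# Min/max values mirror the five extents in the data-load example.
extent_map = [
    Extent(1, 1, 200),
    Extent(2, 1, 200),
    Extent(3, 150, 210),
    Extent(4, 150, 210),
    Extent(5, 150, 210),
]

def extents_to_scan(extents, low, high):
    """Keep only extents whose [min, max] range overlaps the filter range [low, high]."""
    return [e.extent_id for e in extents if e.max_value >= low and e.min_value <= high]

# WHERE col BETWEEN 205 AND 210: extents 1 and 2 cannot contain a match.
print(extents_to_scan(extent_map, 205, 210))
# WHERE col BETWEEN 1 AND 100: extents 3, 4 and 5 are skipped.
print(extents_to_scan(extent_map, 1, 100))
# WHERE col BETWEEN 160 AND 180: every extent overlaps, so nothing is eliminated.
print(extents_to_scan(extent_map, 160, 180))
```

The last case shows why load order matters: elimination only helps when the value ranges of different extents do not all overlap the filter.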
19. Data Types
At the physical layer, all columns are stored as:
• 1-byte field with 8192 values per 8k block
• 2-byte field with 4096 values per 8k block
• 4-byte field with 2048 values per 8k block
• 8-byte field with 1024 values per 8k block
• Dictionary structure made up of 2 files/extents with:
- 8-byte fixed-length token (pointer)
- A variable-length value stored at the location identified by the pointer
20. Data Types
At the physical layer, all columns are stored as:
• 1-byte field, examples: TinyInt, Char(1)
• 2-byte field, examples: SmallInt, Char(2)
• 4-byte field, examples: Int, Char(3), Char(4), date, float
• 8-byte field, examples: BigInt, Char(5-8), datetime, real/double
• Dictionary, examples: Varchar(8) or larger, Char(9) or larger
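The type-to-width examples can be turned into a small lookup to help with conservative data typing. This is a sketch built only from the examples on the slides; `physical_width` and `values_per_block` are hypothetical helper names, and VARCHAR lengths below 8 are deliberately rejected because the slides do not cover them.

```python
BLOCK_BYTES = 8192  # "8k block" from the slides

# Fixed-width examples copied from the slide, keyed by upper-cased type name.
FIXED_WIDTH_BYTES = {
    "TINYINT": 1, "SMALLINT": 2,
    "INT": 4, "DATE": 4, "FLOAT": 4,
    "BIGINT": 8, "DATETIME": 8, "DOUBLE": 8,
}

def physical_width(type_name, length=None):
    """Return the field width in bytes, or the string 'dictionary'."""
    name = type_name.upper()
    if name == "CHAR":
        if length is None or length < 1:
            raise ValueError("CHAR needs a positive length")
        if length >= 9:
            return "dictionary"
        # CHAR(1) -> 1, CHAR(2) -> 2, CHAR(3-4) -> 4, CHAR(5-8) -> 8
        return next(width for width in (1, 2, 4, 8) if length <= width)
    if name == "VARCHAR":
        if length is not None and length >= 8:
            return "dictionary"
        raise ValueError("the slides only cover VARCHAR(8) or larger")
    return FIXED_WIDTH_BYTES[name]

def values_per_block(width_bytes):
    """How many fixed-width values fit in one 8k block."""
    return BLOCK_BYTES // width_bytes

for spec in [("TinyInt",), ("Char", 3), ("Char", 8), ("Char", 9), ("Varchar", 20)]:
    print(spec, "->", physical_width(*spec))
print(values_per_block(physical_width("Int")))
```

The jump from CHAR(8) to CHAR(9) is the fixed-length vs. dictionary boundary mentioned under data modeling best practices.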
21. Sizing
Minimum spec
• UM: 4 core, 32G RAM
• PM: 4 core, 16G RAM
Typical server spec
• UM: 8 core, 264G RAM
• PM: 8 core, 64G RAM
Data storage
• External data volumes: maximum 2 data volumes per IO channel per PM node server; up to 2TB on disk per data volume ≈ max 4TB per PM node
• Local disk: up to 2TB on disk per PM node server
Detailed sizing guide: based on data size and workload
22. Sizing - Example
• MariaDB ColumnStore: 60TB uncompressed data = 6TB compressed data at 10x compression
• 2 UMs: 8 core, 512G (based on workload)
• 6TB compressed = 3 data volumes (at 2TB per volume)
- with 1 data volume per PM node: 3 PMs
• Data growth: 2TB per month; data retention: 2 years
- Plan for 2TB x 24 = 48TB additional
- 48TB = 4.8TB compressed ≈ 3 data volumes (at 2TB per volume)
- with 1 data volume per PM node: 3 additional PMs
• Total: 6 PMs, 2 UMs
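The sizing arithmetic above can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch using the slide's figures (10x compression, 2TB per data volume, 1 volume per PM) as defaults; the function name and parameters are hypothetical, and real sizing also depends on workload.

```python
import math

def pm_nodes_needed(uncompressed_tb, compression_ratio=10, tb_per_volume=2, volumes_per_pm=1):
    """Estimate PM nodes from uncompressed data size, rounding volumes up."""
    compressed_tb = uncompressed_tb / compression_ratio
    volumes = math.ceil(compressed_tb / tb_per_volume)
    return math.ceil(volumes / volumes_per_pm)

initial_pms = pm_nodes_needed(60)       # 60TB -> 6TB compressed -> 3 volumes
growth_pms = pm_nodes_needed(2 * 24)    # 2TB/month for 24 months -> 4.8TB compressed -> 3 volumes

print(initial_pms, growth_pms, initial_pms + growth_pms)
```

Rounding up at the volume step is what turns 4.8TB of compressed growth into 3 volumes rather than 2.4.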
24. SQL Features
• SELECT query
• Aggregation
• Windowing functions
• Cross-engine joins
• Disk-based joins
• UDF
• DML
• DDL
Source: InfiniDB SQL Syntax Guide
25. Windowing Functions
• Aggregate over a series of related rows
• Simplified function for complex statistical analytics over sliding window per row
- Cumulative, moving or centered aggregates
- Simple statistical functions like rank, max, min, average, median
- More complex functions such as distribution, percentile, lag, lead
- Without running complex sub-queries
Functions: MAX, MIN, COUNT, SUM, AVG, VARIANCE, VAR_POP, VAR_SAMP, STD, STDDEV, STDDEV_POP, STDDEV_SAMP, ROW_NUMBER, RANK, DENSE_RANK, PERCENT_RANK, NTH_VALUE, FIRST_VALUE, LAST_VALUE, CUME_DIST, LAG, LEAD, NTILE, PERCENTILE_CONT, PERCENTILE_DISC, MEDIAN
Source: InfiniDB SQL Syntax Guide
26. Window Function Example
Top N Visitors for Each Month
• Total for each visitor by month
• Top 1: Time_rank = 1
• Top 2: Time_rank <= 2
• Top N: Time_rank <= N
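The slide's query did not survive extraction, so the following is a reconstruction of the idea rather than the original statement. It runs a RANK() window query in SQLite through Python's `sqlite3` module as a stand-in for ColumnStore (window functions need SQLite 3.25 or later); the `visits` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visitor_id INTEGER, visit_month TEXT, minutes INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "2014-01", 30), (1, "2014-01", 15), (2, "2014-01", 50), (3, "2014-01", 10),
        (1, "2014-02", 5), (2, "2014-02", 20), (3, "2014-02", 40),
    ],
)

TOP_N = 2
query = """
WITH totals AS (            -- total for each visitor by month
    SELECT visit_month, visitor_id, SUM(minutes) AS total_minutes
    FROM visits
    GROUP BY visit_month, visitor_id
),
ranked AS (                 -- rank visitors within each month
    SELECT visit_month, visitor_id, total_minutes,
           RANK() OVER (PARTITION BY visit_month ORDER BY total_minutes DESC) AS time_rank
    FROM totals
)
SELECT visit_month, visitor_id, total_minutes, time_rank
FROM ranked
WHERE time_rank <= ?        -- Top N: time_rank <= N
ORDER BY visit_month, time_rank
"""
rows = conn.execute(query, (TOP_N,)).fetchall()
for row in rows:
    print(row)
```

Changing `TOP_N` to 1 gives the top visitor per month, matching the "Time_rank = 1" case on the slide.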
27. Complex Window Function Example
Website Visitor Order Table
Outlier Limits
Quartile_1 = 1750
Quartile_3 = 2837.5
Median = 2650
Max_Val = 5000 , Min_Val = 300
Inter Quartile Range = Q3 – Q1 = 1087.5
Higher Control Limit = MIN(M + IQR*1.5, Max_Val) = 4281.25
Lower Control Limit = MAX(M - IQR*1.5, Min_Val) = 1018.75
Visitor_Id Order_Month Orders_Amount
1 January-2014 5,000
2 January-2014 1,000
3 January-2014 3,040
4 January-2014 2,000
5 January-2014 2,770
6 January-2014 2,750
7 January-2014 2,550
8 January-2014 300
1 February-2014 1,410
2 February-2014 293
10 February-2014 304
12 February-2014 314
*Discard Outlier visitors by spending for each month
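The January figures can be checked with Python's `statistics` module. This sketch assumes linear-interpolation quartiles (`method="inclusive"`), which reproduces the slide's Quartile_1, Median, and Quartile_3; whether the original query used PERCENTILE_CONT or another function is not recoverable from the slide. The limits are computed as the median plus or minus 1.5 times the inter-quartile range, clamped to the observed maximum and minimum, which matches the 4281.25 and 1018.75 shown.

```python
import statistics

# January-2014 rows from the Website Visitor Order Table: visitor id -> order amount.
january_orders = {1: 5000, 2: 1000, 3: 3040, 4: 2000, 5: 2770, 6: 2750, 7: 2550, 8: 300}

amounts = sorted(january_orders.values())
quartile_1, median, quartile_3 = statistics.quantiles(amounts, n=4, method="inclusive")
iqr = quartile_3 - quartile_1

higher_limit = min(median + 1.5 * iqr, max(amounts))
lower_limit = max(median - 1.5 * iqr, min(amounts))

# Visitors whose January spending falls inside the control limits.
kept_visitors = sorted(
    visitor for visitor, amount in january_orders.items()
    if lower_limit <= amount <= higher_limit
)

print(quartile_1, median, quartile_3, iqr)
print(lower_limit, higher_limit)
print(kept_visitors)
```

Visitors 1, 2 and 8 fall outside the limits and would be discarded as outliers for January.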
28. Tuning Commands
mysql> select count(*) from part;
+-----------+
| count(*) |
+-----------+
| 200000000 |
+-----------+
1 row in set (0.48 sec)
mysql> select calgetstats();
+-------------------------------------------------------------------------------
| calgetstats()
--------------------------------------------------------------------------------
| Query Stats: MaxMemPct-0; NumTempFiles-0; TempFileSpace-0MB; PhyI/O-0; CacheI/
O-98039;
+-------------------------------------------------------------------------------
-------------------------------------------------------------------------------+
BlocksTouched-97658; CasPartBlks-0; MsgBytesIn-2MB; MsgBytesOut-0MB| 1242146662
640516 |
-------------------------------------------------------------------------------+
Calgetstats: Information On The Last Query Executed Within A Given Session
29. select 'BRAZIL', d_year, lo_tax, p_size, s_region, count(*)
from dateinfo, part, supplier, lineorder
where s_suppkey = lo_suppkey
and d_datekey = lo_orderdate
and p_partkey = lo_partkey
and lo_orderdate between 19980101 and 19981231
and s_nation = 'BRAZIL'
and p_size <> 23
group by 1,2,3,4,5
order by 1,2,3,4,5;
mysql> select calgettrace();
Tuning Commands
Calgettrace: Detailed distributed query execution plan
30. Tuning Commands
Query Statistics
Users can view the query statistics by selecting the rows from the
query stats table in the infinidb_querystats schema.
Example 1: List execution time and rows returned for all the SELECT queries within the past 12 hours
select queryid, query, endtime-starttime, rows from querystats where starttime >= now() - interval 12 hour and querytype = 'SELECT';
Example 2: List the average, min and max running time of all the INSERT SELECT queries within the past 12 hours
select min(endtime-starttime), max(endtime-starttime), avg(endtime-starttime) from querystats where querytype = 'INSERT SELECT' and starttime >= now() - interval 12 hour;
31. calpont> getActiveSQLStatements
getactivesqlstatements Wed Oct 7 08:38:32 2015
Get List of Active SQL Statements
=================================
Start Time Time (hh:mm:ss) Session ID SQL Statement
---------------- ---------------- -------------------- --------------------------------------------------
Oct 7 08:38:30 00:00:03 73 select c_name, sum(lo_revenue) from customer,
lineorder where lo_custkey = c_custkey and c_custkey = 6 group by c_name
getActiveSQLStatements: List Active SQL Statements within the System
mysql> show processlist;
+----+------+-----------+-------+---------+------+-------+--------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+-------+---------+------+-------+--------------+
| 73 | root | localhost | ssb10 | Query | 0 | NULL | show processlist
+----+------+-----------+-------+---------+------+-------+--------------+
1 row in set (0.01 se
Tuning Commands
33. Bulk Data Load
cpimport, LOAD DATA INFILE
Bulk Data Export
mysql client, odbc, jdbc
Integration with MariaDB
ColumnStore cpimport and sql
interface
34. Bulk Data Load: cpimport
• Fastest way to load data into MariaDB ColumnStore
• Load data from CSV file
cpimport dbName tblName [loadFile]
• Load data from Standard Input
mysql -e 'select * from source_table;' -N db2 | cpimport destination_db destination_tbl -s '\t'
• Load data from Binary Source file
cpimport -I1 mydb mytable sourcefile.bin
• Multiple tables can be loaded in parallel by launching multiple jobs
• Read queries continue without being blocked
• Successful cpimport is auto-committed
• In case of errors, entire load is rolled back
35. Bulk Data Load: cpimport mode 1
Single file, central input: data source at UM
cpimport -m1 mytest mytable mytable.tbl
[Diagram: the source file and cpimport sit on the UM node (name node); data is distributed to three PM nodes (data nodes).]
36. Bulk Data Load: cpimport mode 2
Distributed input: data source at PMs
Partitioned load file on each PM
cpimport -m2 testdb mytable /home/mydata/mytable.tbl
[Diagram: cpimport runs on the UM node (name node); each of the three PM nodes (data nodes) holds its own source file.]
37. Bulk Data Load: cpimport mode 3
Distributed input: data source at PMs
Partitioned load file on each PM
Bulk load command at one or more PMs
cpimport -m3 testdb mytable /home/mydata/mytable.tbl
[Diagram: each of the three PM nodes (data nodes) holds its own source file and runs its own cpimport; the UM node (name node) is shown above them.]
38. Bulk Data Load: LOAD DATA INFILE
• Traditional way of importing data into any MariaDB storage engine table
• Up to 2 times slower than cpimport for large imports
• The operation can be rolled back, whether it succeeds or fails
mysql> load data infile '/tmp/outfile1.txt' into table destinationTable;
Query OK, 9765625 rows affected (2 min 20.01 sec)
Records: 9765625 Deleted: 0 Skipped: 0 Warnings: 0
39. Bulk Data Export
Central Export
• Connect with ODBC, JDBC or mysql client to the UM
• Extract SQL query results to an output file on the UM
Distributed Export
• Fastest way to export
• Use the LOCAL PM query feature
• Connect ODBC, JDBC or mysql client to each PM
• Extract SQL query results to an output file on each PM
44. Technical Use Cases
• Data warehousing: selective column-based queries; large number of dimensions
• High-performance analytics on large volumes of data: reporting and analysis on millions or billions of rows, from datasets containing millions to trillions of rows; terabytes to petabytes of data
• Analytics that require complex joins and windowing functions
45. Customer Use Cases
Industry | Category | Use Case
Gaming | Behavior Analytics | Projecting and predicting user behavior based on past and current data
Advertising | Customer Analytics | Customer behavior data for market segmentation and predictive analytics
Advertising | Loyalty Analytics | Customer analytics focusing on a person’s commitment to a product, company, or brand
Web, E-commerce | Click Stream Analytics | Web activity analysis, software testing, and market research with analytics on data about the areas of web pages clicked while browsing [Deal News]
Marketing | Promotional Testing | Using marketing and campaign management data to identify the best criteria to be used for a particular marketing offer
Social Network | Network Analytics | Relationship analytics among network nodes
Financial | Fraud Analytics | Monitoring user financial transactions and identifying patterns of behaviour to predict and detect abnormal or fraudulent activity to prevent damage to user and institution
Healthcare | Patient Analytics | Analyzing patient medical records to identify patterns to be used for improved medical treatment
Healthcare | Clinical Analytics | Analyzing clinical data and its impact on patients to identify patterns to be used for improved medical treatment
Telco | Network and Application Performance Analytics | Streaming data from network devices and applications enriched with business operations data to uncover actionable insights for network planning, operations and marketing analytics
Aviation | Flight Analytics | Proactively project parts replacement, maintenance and airplane retirement based on real-time and historically collected flight parameter data [Boeing]