Big-Data-Analysen mit MariaDB ColumnStore

MariaDB
ColumnStore
Bruno Šimić

Overview Architecture
Columnar
Distributed
Storage

MariaDB ColumnStore
Powerful, open source distributed MPP processing data warehouse for
big data analytics
Parallel query
processing for distributed
environments
Faster, More
Efficient Queries
Single SQL Interface for
OLTP and analytics
Easy to Manage and
Scale
Easier Enterprise
Analytics
Power of SQL and
Freedom of Open
Source to Big Data
Analytics
Better Price
Performance

OLTP/NoSQL
Workloads
• OLAP/Analytic/
Reporting Workloads
Suited for reporting or analysis of millions-billions of rows from data sets containing millions-trillions of rows.
Workload – Query Vision/Scope
1 100 10,000
10-100GB
10,000,000,000
1-10TB
1,000,000 100,000,000
100-1,000GB

Why MariaDB ColumnStore?
Benefits of using it

2. Diagnostic Analytics
Why did it Happen?
1. Descriptive Analytics
What is Happening?
Traditional OLAP
3. Predictive Analytics
What is likely to happen?
4. Prescriptive Analytics
What should I do about it?
Big Data Analytics
Data Collection
MariaDB
ColumnStore
Data Processing
BI Tools, Data Science
Applications
Connectors,
SPARK Integration etc
MariaDB
MaxScale
Analytics
Insight

MariaDB ColumnStore
• GPLv2 Open Source
• Columnar, Massively Parallel
MariaDB Storage Engine
• Scalable, high-performance
analytics platform
• Built in redundancy and
high availability
• Runs on premise, on cloud, on commodity HW
• Full SQL syntax and capabilities
regardless of platform
• More data, less hardware
• Same MariaDB SQL platform
• No need to update application code
Big Data Sources Analytics Insight
MariaDB ColumnStore
. . .
Node 1 Node 2 Node 3 Node N
Local / Ceph / GlusterFS
®
ELT
Tools
BI
Tools

SQL Features
Cross Engine
Joins
UDF
DML
Aggregation
DDL
Disk Based
Joins
Windowing
Functions
SELECT
QUERY

MariaDB ColumnStore Architecture
• Massively parallel
architecture
– Linear scalability as
new nodes are added
• Horizontal scaling
– Add new data nodes
as your data grows
– Continue read queries
when adding new nodes
– Utilize MaxScale to load
balance and provide
single front end access
point.
Shared-Nothing Distributed Data Storage
Compressed (Local | SAN | EBS | Gluster FS | Ceph )
User
Module
(UM)
Performance
Module
(PM)
Data Storage
Load
Balancer
(MaxScale)

User Modules Query Processing – UM
• SQL Operations are translated into thousands of
Primitives
• Parallel/Distributed SQL
• Extensible with Parallel/Distributed UDFs
• Query is parsed by mysqld on UM node
• Parsed query handed over to ExeMgr on UM node
• ExecMgr breaks down the query in primitive operations
• mysqld - The MariaDB
server
• ExeMgr - MariaDB’s
interface to ColumnStore
• cpimport -
high-performance data
import
MariaDB SQL
Front End
User Module (UM)

Performance
Module Query Processing – PM
• Primitives are processed on PM
• One thread working on a range of rows
• Typically 1/2 million rows, stored in a few hundred
blocks of data
• Execute all column operations required (restriction and
projection)
• Execute any group by/aggregation against local data
• Each primitive executes in a fraction of a second
• Primitives are run in parallel and fully distributed
• PrimProc - Primitives
Processor
• WriteEngineServ -
Database file writing
processor
• DMLProc - DML writes
processor
• DDLProc - DDL processors
Distributed
Query Engine
Performance
Module (PM)

Storage Architecture
• Columnar storage
– Each column stored as separate file
– No index management for query
performance tuning
– Online Schema changes: Add new column
without impacting running queries
• Automatic horizontal partitioning
– Logical partition every 8 Million rows
– In memory metadata of partition min and max
– No partition management for query
performance tuning
• Compression
– Accelerate decompression rate
– Reduce I/O for compressed blocks
Column 1
Extent 1 (8 million rows, 8MB～64MB)
Extent 2 (8 million rows)
Extent M (8 million rows)
Column 2 Column 3 ... Column N
Data automatically arranged by
• Column – Acts as Vertical Partitioning
• Extents – Acts as horizontal partition
Vertical
Partition
Horizontal
Partition
...
Vertical
Partition
Vertical
Partition
Vertical
Partition
Horizontal
Partition
Horizontal
Partition

Horizontal
Partition:
8 Million Rows
Extent 2
Horizontal
Partition:
8 Million Rows
Extent 3
Horizontal
Partition:
8 Million Rows
Extent 1
Storage Architecture reduces I/O SELECT Fname FROM Table 1 WHERE State = ‘NY’
• Only touch column files
that are in projection, filter
and join conditions
• Eliminate disk block touches
to partitions outside filter
and join conditions
Extent 1:
Min State: CA, Max State: NY
Extent 2:
Min State: OR, Max State: WY
Extent 3:
Min State: IA, Max State: TN
High Performance Query Processing
ID
1
2
3
4
...
8M
8M+1
...
16M
16M+1
...
24M
Fname
Bugs
Yosemite
Daffy
Hazel
...
...
Jane
...
Elmer
Lname
Bunny
Sam
Duck
Fudd
...
...
...
State
NY
CA
NY
ME
...
MN
WY
TX
OR
...
VA
TN
IA
NY
...
PA
Zip
11217
95389
10013
04578
...
...
...
Phone
(718) 938-3235
(209) 375-6572
(212) 227-1810
(207) 882-7323
...
...
...
Age
34
52
35
43
...
...
...
Sex
M
M
M
F
...
...
...
Vertical
Partition
Vertical
Partition
Vertical
Partition
Vertical
Partition
Vertical
Partition
…
ELIMINATED PARTITION

Column-oriented storage
Differences to row-oriented storage

Row oriented:
rows stored
sequentially in a file.
Column oriented:
Each column is stored
in a separate file. Each
column for a given
row is at the same
offset.
Row-oriented vs. Column-oriented format
Key Fname Lname State Zip Phone Age Sex
1 Bugs Bunny NY 11217 (718) 938-3235 34 M
2 Yosemite Sam CA 95389 (209) 375-6572 52 M
3 Daffy Duck NY 10013 (212) 227-1810 35 M
4 Elmer Fudd ME 04578 (207) 882-7323 43 M
5 Witch Hazel MA 01970 (978) 744-0991 57 F
Key
1
2
3
4
5
Fname
Bugs
Yosemite
Daffy
Elmer
Witch
Lname
Bunny
Sam
Duck
Fudd
Hazel
State
NY
CA
NY
ME
MA
Zip
11217
95389
10013
04578
01970
Phone
(718) 938-3235
(209) 375-6572
(212) 227-1810
(207) 882-7323
(978) 744-0991
Age
34
52
35
43
57
Sex
M
M
M
M
F

Row oriented:
new rows appended to
the end.
Column oriented:
new value added to
each file.
Single-Row Operations - Insert
1 Bugs Bunny NY 11217 (718) 938-3235 34 M
2 Yosemite Sam CA 95389 (209) 375-6572 52 M
3 Daffy Duck NY 10013 (212) 227-1810 35 M
4 Elmer Fudd ME 04578 (207) 882-7323 43 M
5 Witch Hazel MA 01970 (978) 744-0991 57 F
6 Marvin Martian CA 91602 (818) 761-9964 26 M
Key
1
2
3
4
5
Fname
Bugs
Yosemite
Daffy
Elmer
Witch
Lname
Bunny
Sam
Duck
Fudd
Hazel
State
NY
CA
NY
ME
MA
Zip
11217
95389
10013
04578
01970
Phone
(718) 938-3235
(209) 375-6572
(212) 227-1810
(207) 882-7323
(978) 744-0991
Age
34
52
35
43
57
Sex
M
M
M
M
F
Columnar insert not efficient for singleton insertions (OLTP). Batch loads touches
row vs. column. Batch load on column-oriented is faster (compression, no indexes).

Row oriented:
new rows deleted
Column oriented:
value deleted from
each file
Single-Row Operations - Delete
1 Bugs Bunny NY 11217 (718) 938-3235 34 M
2 Yosemite Sam CA 95389 (209) 375-6572 52 M
3 Daffy Duck NY 10013 (212) 227-1810 35 M
4 Elmer Fudd ME 04578 (207) 882-7323 43 M
5 Witch Hazel MA 01970 (978) 744-0991 57 F
Key
1
2
3
4
5
Fname
Bugs
Yosemite
Daffy
Elmer
Witch
Lname
Bunny
Sam
Duck
Fudd
Hazel
State
NY
CA
NY
ME
MA
Zip
11217
95389
10013
04578
01970
Phone
(718) 938-3235
(209) 375-6572
(212) 227-1810
(207) 882-7323
(978) 744-0991
Age
34
52
35
43
57
Sex
M
M
M
M
F
Recommended Partition Drop to allow dropping columns in bulk.

Row oriented:
Update 100% of rows
means change 100%
of blocks on disk.
Column oriented:
Just update the blocks
needed to be updated
Single-Row Operations - Update
1 Bugs Bunny NY 11217 (718) 938-3235 34 M
2 Yosemite Sam CA 95389 (209) 375-6572 52 M
3 Daffy Duck NY 10013 (212) 227-1810 35 M
4 Elmer Fudd ME 04578 (207) 882-7323 43 M
5 Witch Hazel MA 01970 (978) 744-0991 57 F
Key
1
2
3
4
5
Fname
Bugs
Yosemite
Daffy
Elmer
Witch
Lname
Bunny
Sam
Duck
Fudd
Hazel
State
NY
CA
NY
ME
MA
Zip
11217
95389
10013
04578
01970
Phone
(718) 938-3235
(209) 375-6572
(212) 227-1810
(207) 882-7323
(978) 744-0991
Age
34
52
35
43
57
Sex
M
M
M
M
F

Row oriented:
requires rebuilding of
the whole table
Column oriented:
Create new file for the
new column
Changing the table structure
Key Fname Lname State Zip Phone Age Sex Active
1 Bugs Bunny NY 11217 (718) 938-3235 34 M Y
2 Yosemite Sam CA 95389 (209) 375-6572 52 M N
3 Daffy Duck NY 10013 (212) 227-1810 35 M N
4 Elmer Fudd ME 04578 (207) 882-7323 43 M Y
5 Witch Hazel MA 01970 (978) 744-0991 57 F N
Key
1
2
3
4
5
Fname
Bugs
Yosemite
Daffy
Elmer
Witch
Lname
Bunny
Sam
Duck
Fudd
Hazel
State
NY
CA
NY
ME
MA
Zip
11217
95389
10013
04578
01970
Phone
(718) 938-3235
(209) 375-6572
(212) 227-1810
(207) 882-7323
(978) 744-0991
Age
34
52
35
43
57
Sex
M
M
M
M
F
Active
Y
N
N
Y
N
Column-oriented is very flexible for adding columns, no need for a full rebuild
required with it.

Analytics
Daily Running Average Revenue for each item
– In-database distributed analytics with complex
join, aggregation, window functions
– Extensible UDF for custom analytics
– Cross Engine Join with other storage engines
Window functions
– PARTITION BY / ORDER BY
– Aggregate functions: MAX, MIN, COUNT,
SUM, AVG STD, STDDEV_SAMP,
STDDEV_POP, VAR_SAMP, VAR_POP
– Ranking: ROW_NUMBER, RANK,
DENSE_RANK, PERCENT_RANK
CUME_DIST, NTILE, PERCENTILE,
PERCENTILE_CONT
PERCENTILE_DISC, MEDIAN
SELECT item_id, server_date, daily_revenue,
AVG(revenue) OVER
(PARTITION BY item_id ORDER BY server_date
RANGE INTERVAL '1' DAY PRECEDING ) running_avg
FROM web_item_sales
Item ID Server_date Revenue
1 02-01-2014 20,000.00
1 02-02-2014 5,001.00
2 02-01-2014 15,000.00
2 02-04-2014 34,029.00
2 02-05-2014 7,138.00
3 02-01-2014 17,250.00
3 02-03-2014 25,010.00
3 02-04-2014 21,034.00
3 02-05-2014 4,120.00
Running Average
20,000.00
12,500.50
15,000.00
34,209.00
20,583.50
17,250.00
250,100.00
12,577.00
20,583.50

Best Practices
Importing and management

Data
Exportand
Data
Im
port
Bulk Data Load
cpimport, LOAD DATA INFILE
Bulk Data Export
mysql client, odbc, jdbc
Integration with MariaDB
ColumnStore cpimport and sql
interface

Bulk Data Load: cpimport
• Fastest way to load data into MariaDB ColumnStore
• Load data from CSV file
cpimport dbName tblName [loadFile]
• Load data from Standard Input
mysql -e 'select * from source_table;' -N db2 | cpimport destination_db
destination_tbl -s 't‘
• Load data from Binary Source file
cpimport -I1 mydb mytable sourcefile.bin
• Multiple tables in can be loaded in parallel by launching multiple jobs
• Read queries continue without being blocked
• Successful cpimport is auto-committed
• In case of errors, entire load is rolled back

Traditional way of
importing data into
any MariaDB storage
engine table
Bulk Data Load:
LOAD DATA INFILE
Up to 2 times slower
than cpimport for
large size imports
mysql> load data infile '/tmp/
outfile1.txt' into table destinationTable;
Query OK, 9765625 rows affected
(2 min 20.01 sec)
Records: 9765625 Deleted:
0 Skipped: 0 Warnings: 0
Either success or
error operation can
be rolled back

Columnar General Best Practices
Not suited for OLTP
Micro-batch load allows for near real-time behavior
Infrequently used columns do not impact other queries
Columnar suitable for sparse columns (nulls compress nicely)

Data Modeling Best Practices
Star-schema optimizations
Fact tables as InnoDB, dimension tables as ColumnStore
Conservative data typing
Especially around fixed-length vs. dictionary boundary (8 bytes)
IP Address vs. IP Number
Break down compound fields into individual fields
Trivializes searching for sub-fields
Can avoid dictionary overhead
Cost to re-assemble is generally small

HA at UM node
• When one UM node goes down, another
UM node takes over
HA at PM node
• SAN/AWS EBS - When a PM node
goes down, the data volumes
attached to the failed PM node gets
attached to another PM
• Local Disks - If a PM node goes
down, the data on its disks are not
available, though queries continue on
the remaining data set
High Availability
HA at Data Storage
• AWS EBS (Elastic Block Store)
• GlusterFS - Multiple copy of data block
across storage. If a disk on a PM node
fails, another PM node will have access
to the copy of the data
• Ceph - Ceph replicates data and makes
it fault-tolerant, using commodity
hardware and requiring no specific
hardware support. The system is both
self-healing and self-managing, aiming
to minimize administration time and
other costs

Visualization
IHME Visualizations library: http://www.healthdata.org/results/data-visualizations

Where to find MariaDB ColumnStore?
SOFTWARE DOWNLOAD https://mariadb.com/downloads/columnstore
SOURCE https://github.com/mariadb-corporation/mariadb-columnstore-engine
DOCUMENTATION https://mariadb.com/kb/en/mariadb/mariadb-columnstore/
BLOGS https://mariadb.com/blog-tags/columnstore
</>

New in MariaDB ColumnStore 1.1
Streaming
● Native API for ColumnStore files
○ Streaming Change Data Capture via
MaxScale & Kafka
○ Wrappers
● Streaming Insert via MaxScale
HA
● Built-in Data Redundancy
○ Integrated Glusterfs for Data HA Analytics
● User Defined Aggregate and Window Functions
Data Type
● Text/BLOB Data Type
Ease of Use/Management
● Non-root support Phase 1
● Backup/Restore tool
MariaDB Server Support
● MariaDB 10.2 support
● Re-implement window functions based on server code
Security
● Audit Plugin integration
Performance
● Improved string handling / memory utilization
● General performance improvements
Certification
● Tableau, Talend

Thank you
Bruno Šimić
bruno@mariadb.com

Big-Data-Analysen mit MariaDB ColumnStore

More Related Content

Similar to Big-Data-Analysen mit MariaDB ColumnStore

More from MariaDB plc

Recently uploaded

Big-Data-Analysen mit MariaDB ColumnStore