Real-time Analytics with the New Streaming Data Adapters
Dipti Joshi, Director of Product Management
Markus Mäkelä, Senior Software Engineer
Streamline and simplify the process of data ingestion
Motivation
Organizations need to make data available for analysis as soon as it arrives
Machine learning results need to be stored where other business/data analysts work with them
Time to insight and time to action are now competitive differentiators for businesses
Bulk data adapters
Applications can use the bulk data adapter SDK to collect and write data - on-demand data loading
No need to copy CSV files to the UM or PM - simpler
Bypass the SQL interface, parser and optimizer - faster writes
Supported SDK languages: C++, Python, Java
[Diagram: an application uses the Bulk Data Adapter to write through the Write API directly to the ColumnStore PMs, bypassing the MariaDB Server / ColumnStore UM]
1. For each row:
   a. For each column: bulkInsert->setColumn
   b. bulkInsert->writeRow
2. bulkInsert->commit
* Rows are buffered before being flushed - 100,000 rows by default
(a minimal C++ sketch of this loop follows)
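The sketch below uses the ColumnStore Bulk Write SDK (mcsapi); the test.t1 table (one INT column, matching the later CDC example) and the inserted values are assumptions for illustration:

#include <libmcsapi/mcsapi.h>
#include <iostream>

int main()
{
    try {
        // The driver locates the ColumnStore PMs from the cluster
        // configuration (Columnstore.xml).
        mcsapi::ColumnStoreDriver driver;

        // Open a bulk insert into the hypothetical table test.t1;
        // rows are buffered and flushed in batches.
        mcsapi::ColumnStoreBulkInsert* bulkInsert =
            driver.createBulkInsert("test", "t1", 0, 0);

        for (int i = 0; i < 1000; i++) {  // 1. for each row
            bulkInsert->setColumn(0, i);  //    a. for each column
            bulkInsert->writeRow();       //    b. queue the row
        }
        bulkInsert->commit();             // 2. make the batch visible
        delete bulkInsert;
    } catch (mcsapi::ColumnStoreError& e) {
        std::cerr << "Error: " << e.what() << std::endl;
        return 1;
    }
    return 0;
}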
Streaming data adapters – MaxScale CDC
Stream all writes from MariaDB TX to MariaDB AX automatically and continuously - ensure analytical data is up to date and not stale, with no need for batch jobs, manual processes or human intervention
[Diagram: writes to MariaDB Server (InnoDB) flow through MariaDB MaxScale, whose Binlog-Avro CDC Router feeds the Streaming Data Adapter (MaxScale CDC Client); the adapter writes through the Write API to the ColumnStore PMs behind the MariaDB Server / ColumnStore UM]
Inside MaxScale CDC Adapter
● Connects to MaxScale via the MaxScale CDC Connector
● Connects to ColumnStore via the ColumnStore API
● Set of CDC records → CS API mini-batch (see the sketch after this list)
● CDC Record
○ Timestamp
○ GTID
○ Type (insert, update, delete)
○ Changed data
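A minimal sketch of the MaxScale side of the adapter, using the C++ MaxScale CDC Connector; the address, port and credentials are assumptions, and test.t1 matches the example on the next slide. The ColumnStore mini-batch write is elided:

#include <maxscale/cdc_connector.h>
#include <iostream>

int main()
{
    // Connect to the MaxScale CDC service; host, port and
    // credentials are illustrative.
    CDC::Connection conn("127.0.0.1", 4001, "cdcuser", "cdcpassword");

    // Request the change stream for one table.
    if (conn.connect("test.t1")) {
        CDC::SRow row;
        while ((row = conn.read())) {
            // Each row is one CDC record (insert / update_before /
            // update_after / delete); the adapter converts a set of
            // these into a ColumnStore API mini-batch.
            for (size_t i = 0; i < row->length(); i++) {
                std::cout << row->value(i) << " ";
            }
            std::cout << std::endl;
        }
    } else {
        std::cerr << "Connection failed: " << conn.error() << std::endl;
        return 1;
    }
    return 0;
}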
Inside MaxScale CDC Adapter
CREATE TABLE test.t1 (id INT);
INSERT INTO test.t1 VALUES (1);
UPDATE test.t1 SET id = 2 WHERE id = 1;
DELETE FROM test.t1 WHERE id = 2;

These statements produce the following CDC records:

{"domain": 0, "server_id": 3000, "sequence": 19, "event_number": 1, "timestamp": 1519225339, "event_type": "insert", "id": 1}
{"domain": 0, "server_id": 3000, "sequence": 20, "event_number": 1, "timestamp": 1519225349, "event_type": "update_before", "id": 1}
{"domain": 0, "server_id": 3000, "sequence": 20, "event_number": 2, "timestamp": 1519225349, "event_type": "update_after", "id": 2}
{"domain": 0, "server_id": 3000, "sequence": 21, "event_number": 1, "timestamp": 1519225356, "event_type": "delete", "id": 2}

Streamed into ColumnStore, the full change history can then be queried:

MariaDB [test]> select * from t1 order by sequence;
+--------+--------------+---------------+------+----------+-----------+------------+
| domain | event_number | event_type | id | sequence | server_id | timestamp |
+--------+--------------+---------------+------+----------+-----------+------------+
| 0 | 1 | insert | 1 | 19 | 3000 | 1519225339 |
| 0 | 1 | update_before | 1 | 20 | 3000 | 1519225349 |
| 0 | 2 | update_after | 2 | 20 | 3000 | 1519225349 |
| 0 | 1 | delete | 2 | 21 | 3000 | 1519225356 |
+--------+--------------+---------------+------+----------+-----------+------------+
Streaming data adapters – Apache Kafka
Stream all messages published to Apache Kafka topics to MariaDB AX automatically and continuously - enable data from many sources to be streamed and collected for analysis without complex code
[Diagram: messages published to Apache Kafka topics are consumed by the Streaming Data Adapter (Kafka Client), which writes through the Write API to the ColumnStore PMs behind the MariaDB Server / ColumnStore UM]
Inside Apache Kafka Adapter
● Connects to Kafka
● Reads Avro-formatted data
○ Confluent KafkaAvroSerializer: https://docs.confluent.io/current/streams/developer-guide/write-streams.html
● Each topic is a stream
● Streams map to tables (see the sketch after this list)
○ One stream to multiple tables
○ Multiple streams to a single table
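A minimal sketch of the Kafka side, using the librdkafka C++ consumer API; the broker address, consumer group and topic name are assumptions, and the Avro decoding plus the ColumnStore mini-batch write are elided:

#include <librdkafka/rdkafkacpp.h>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::string errstr;

    // Consumer configuration; broker address and group id are illustrative.
    RdKafka::Conf* conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
    conf->set("bootstrap.servers", "localhost:9092", errstr);
    conf->set("group.id", "columnstore-adapter", errstr);

    RdKafka::KafkaConsumer* consumer =
        RdKafka::KafkaConsumer::create(conf, errstr);
    delete conf;
    if (!consumer) {
        std::cerr << "Failed to create consumer: " << errstr << std::endl;
        return 1;
    }

    // Each topic is a stream; here one hypothetical topic maps to one table.
    consumer->subscribe({"sensor-readings"});

    for (int polls = 0; polls < 100; polls++) {  // bounded for the sketch
        RdKafka::Message* msg = consumer->consume(1000 /* ms timeout */);
        if (msg->err() == RdKafka::ERR_NO_ERROR) {
            // msg->payload() holds the Avro-encoded record; the adapter
            // decodes it and appends the row to a ColumnStore mini-batch.
            std::cout << "Received " << msg->len() << " bytes" << std::endl;
        }
        delete msg;
    }

    consumer->close();
    delete consumer;
    return 0;
}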
Demo: Kafka Data Adapter
The big picture – putting it all together
[Diagram: Ingestion - Apache Kafka feeds the Streaming Data Adapters, while Data Services and Spark / Python / ML use the Bulk Data Adapters; Operations - Web/Mobile Services reach MariaDB Server (InnoDB) through MariaDB MaxScale for transactions (OLTP); Analytics - MariaDB MaxScale fronts MariaDB ColumnStore for analytics (OLAP)]
Resources

Documentation: https://mariadb.com/kb/en/library/mariadb-columnstore/
Blogs: https://mariadb.com/blog-tags/columnstore and https://mariadb.com/blog-tags/big-data
Reach me: dipti.joshi@mariadb.com

Downloads:
MariaDB AX: https://mariadb.com/mariadb-ax-download
MariaDB ColumnStore 1.1: https://mariadb.com/downloads/mariadb-ax
MariaDB MaxScale: https://mariadb.com/downloads/mariadb-ax/maxscale
Bulk Data Adapters and Streaming Data Adapters: https://mariadb.com/downloads/mariadb-ax/data-adapters
MariaDB ColumnStore Backup/Restore Tool: https://mariadb.com/downloads/mariadb-ax/tools-ax
Thank you!
