M|18 Analyzing Data with the MariaDB AX Platform

Published in: Data & Analytics

  1. What’s New in the MariaDB AX Platform. Dipti Joshi, Director, Product Management
  2. MariaDB AX: analytics made easy. Simple, fast, scalable… and open source
  3. MariaDB AX. Components: MariaDB Server, MariaDB MaxScale, MariaDB ColumnStore. ColumnStore characteristics: parallel queries, distributed storage, no indexes, automatic partitioning, read optimized, high compression, low disk IO. [Diagram: MariaDB MaxScale, MariaDB Server / ColumnStore UM nodes, and ColumnStore PM nodes on distributed shared-nothing storage]
  4. MariaDB AX: what was there. MariaDB ColumnStore 1.0: manual import, manual backup/restore, window functions, aggregate functions, user-defined functions, cross-engine joins. [Diagram: Applications / Spark, MariaDB MaxScale, MariaDB Server with ColumnStore UM and InnoDB, ColumnStore PM]
  5. Goals for the next MariaDB AX: (1) expand high availability/disaster recovery options; (2) make it easier to perform custom, complex analytics; (3) streamline and simplify the process of ingesting data
  6. MariaDB AX: what’s new. MariaDB ColumnStore 1.1: streaming data adapters; bulk data adapters; user-defined window functions; distributed aggregates; Spark support (read: JDBC, publish: data adapters); high availability with local storage (GlusterFS); parallel backup/restore. [Diagram: Applications / Spark, MariaDB MaxScale, MariaDB Server with ColumnStore UM and InnoDB, ColumnStore PM]
  7. What’s new in MariaDB AX. Ingestion: applications, Apache Kafka, MariaDB MaxScale. Analytics: user-defined aggregate and window functions. HA/DR: GlusterFS support, parallel backup/restore. Data types: text, BLOB columns. Security: auditing. BI certification: Tableau.
  8. Extend high availability and disaster recovery options
  9. High availability for local storage: GlusterFS volume replication. GlusterFS can replicate files within a volume: HA without the need for a SAN. ColumnStore storage nodes can read other files within a volume: simple, automatic failover. [Diagram: ColumnStore PM 1 (dbroot1), PM 2 (dbroot2) and PM 3 (dbroot3) with two MariaDB Server / ColumnStore UM nodes; /dbroot1, /dbroot2 and /dbroot3 each appear twice under GlusterFS volume replication]
  10. Parallel backup/restore. Parallel backup/restore using rsync: faster backup and restore. Support for incremental backup and restore: faster backup and restore. Consolidate data from multiple storage nodes in a single backup location: simplified, automatic backups and restores. [Diagram: the backup and restore tool runs rsync /data1/*, rsync /data2/* and rsync /data3/* for ColumnStore PM 1, PM 2 and PM 3, writing to /home/user/columnstoreBackupData/pm1dbroot1, /home/user/columnstoreBackupData/pm2dbroot2 and /home/user/columnstoreBackupData/pm3dbroot3; two MariaDB Server / ColumnStore UM nodes]
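
The parallel rsync idea on this slide can be sketched as follows. This is a minimal illustration, not the actual backup/restore tool: the host names `pm1`, `pm2`, `pm3` and the function names are assumptions, while the `/dataN` sources and the backup directory come from the slide. By default it only builds the commands, so nothing is copied unless `dry_run=False` is passed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

BACKUP_ROOT = "/home/user/columnstoreBackupData"  # backup location shown on the slide


def rsync_command(pm):
    """Build the rsync command for one PM (the host name pm<N> is hypothetical)."""
    source = f"pm{pm}:/data{pm}/"
    target = f"{BACKUP_ROOT}/pm{pm}dbroot{pm}/"
    # -a preserves file attributes; on repeated runs rsync skips files whose
    # size and modification time are unchanged, which makes them incremental.
    return ["rsync", "-a", source, target]


def backup_all(pms, dry_run=True):
    """Run one rsync per PM concurrently; with dry_run, only return the commands."""
    commands = [rsync_command(pm) for pm in pms]
    if dry_run:
        return commands
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        # check=True raises CalledProcessError if any rsync exits non-zero
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))
    return commands


if __name__ == "__main__":
    for cmd in backup_all([1, 2, 3]):
        print(" ".join(cmd))
```

Running one transfer per storage node is what lets the total backup time approach that of the slowest node rather than the sum of all nodes.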
  11. Make it easier to perform custom, complex analytics
  12. User-defined distributed aggregate and window functions. User-defined distributed aggregate functions: custom analytical functions and better performance. User-defined window functions. Example: calculate a weighted sum (revenue) with weights $1-10 (0.5), $11-100 (1.0), $100+ (1.5). Column → WSUM on each ColumnStore PM: first PM: $10 → $5, $100 → $100, $200 → $300 (WSUM = $405); second PM: $4 → $2, $8 → $4, $20 → $20 (WSUM = $26); third PM: $12 → $6, $60 → $60, $300 → $450 (WSUM = $516). Combined: WSUM = $947. [Diagram: two MariaDB Server / ColumnStore UM nodes above three ColumnStore PM nodes]
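
The two-phase shape of a distributed aggregate (a partial result per storage node, then a combine step) can be sketched in plain Python. This is an illustration of the pattern only, not ColumnStore's user-defined aggregate API; the function names are assumptions, and the bracket boundaries are read off the slide ($10 is weighted 0.5, $100 is weighted 1.0). With the brackets as stated, $12 is weighted 1.0, so the sketch yields 522 for the third partition where the slide shows $516.

```python
def weight(amount):
    """Bracket weights from the slide: $1-10 -> 0.5, $11-100 -> 1.0, above $100 -> 1.5."""
    if amount <= 10:
        return 0.5
    if amount <= 100:
        return 1.0
    return 1.5


def partial_wsum(values):
    """Per-node step: each storage node reduces its own rows to one partial sum."""
    return sum(v * weight(v) for v in values)


def combine(partials):
    """Final step: the partial sums from all nodes are added together."""
    return sum(partials)


# Column values per storage node, taken from the slide's example
partitions = [
    [10, 100, 200],
    [4, 8, 20],
    [12, 60, 300],
]

partials = [partial_wsum(p) for p in partitions]
total = combine(partials)
print(partials, total)
```

Because only one number per node travels to the combining step, the aggregate scales with the number of storage nodes rather than with the number of rows.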
  13. Streamline and simplify the process of data ingestion
  14. Motivation. Organizations need to make data available for analysis as soon as it arrives. Machine learning results need to be stored where other business/data analysts work with them. Time to insight and time to action are now competitive differentiators for businesses.
  15. Bulk data adapters. Applications can use the bulk data adapters SDK (C++, Python, Java) to collect and write data: on-demand data loading. No need to copy CSV to UM or PM: simpler. Bypasses the SQL interface, parser and optimizer: faster writes. Write loop: (1) for each row, (a) for each column call bulkInsert->setColumn, (b) call bulkInsert->writeRow; (2) call bulkInsert->commit. Buffers 100,000 rows by default. [Diagram: Application with Bulk Data Adapter, Write API to each ColumnStore PM, two MariaDB Server / ColumnStore UM nodes] Deep dive session: Ingesting Data with the New Bulk Data Adapters, today at 5 pm.
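
The set-column / write-row / commit loop on this slide can be sketched with a small in-memory stand-in. `SketchBulkInsert` and `load` are hypothetical names written for this illustration; they mimic the call pattern and the row buffering the slide describes, and are not the real adapter SDK.

```python
class SketchBulkInsert:
    """Hypothetical stand-in for a bulk insert handle; not the real adapter API."""

    def __init__(self, column_count, buffer_rows=100_000):
        self.column_count = column_count
        self.buffer_rows = buffer_rows  # slide: 100,000 rows buffered by default
        self._row = [None] * column_count
        self._buffer = []
        self.flushes = 0      # how many times the buffer was sent
        self.pending = []     # rows sent but not yet committed
        self.committed = []   # rows visible after commit

    def set_column(self, index, value):
        self._row[index] = value

    def write_row(self):
        self._buffer.append(tuple(self._row))
        self._row = [None] * self.column_count
        if len(self._buffer) >= self.buffer_rows:
            self._flush()

    def _flush(self):
        if self._buffer:
            self.pending.extend(self._buffer)
            self._buffer = []
            self.flushes += 1

    def commit(self):
        self._flush()
        self.committed.extend(self.pending)
        self.pending = []


def load(rows, bulk):
    for row in rows:                      # 1. for each row
        for i, value in enumerate(row):   #    a. for each column: set the column
            bulk.set_column(i, value)
        bulk.write_row()                  #    b. write the row
    bulk.commit()                         # 2. commit once at the end


bulk = SketchBulkInsert(column_count=2, buffer_rows=2)  # tiny buffer for the demo
load([(1, "a"), (2, "b"), (3, "c")], bulk)
print(bulk.committed, bulk.flushes)
```

Buffering rows before sending them is what keeps per-row overhead low; the single commit at the end makes the whole load visible at once.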
  16. Streaming data adapters: MaxScale CDC. Stream writes from MariaDB TX to MariaDB AX automatically and continuously: analytical data stays up to date and not stale, with no need for batch jobs, manual processes or human intervention. [Diagram: MariaDB Server (InnoDB), MariaDB MaxScale (Binlog-Avro CDC Router), Streaming Data Adapter (CDC Client), Write API to each ColumnStore PM, two MariaDB Server / ColumnStore UM nodes] Deep dive session: Real-time Analytics With The New Streaming Data Adapters, tomorrow at 8:40 am.
  17. Streaming data adapters: Apache Kafka. Stream all messages published to Apache Kafka topics to MariaDB AX automatically and continuously: data from many sources can be streamed and collected for analysis without complex code. [Diagram: Apache Kafka topics, Streaming Data Adapter (Kafka Client), Write API to each ColumnStore PM, two MariaDB Server / ColumnStore UM nodes] Deep dive session: Real-time Analytics With The New Streaming Data Adapters, tomorrow at 8:40 am.
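
The core loop of a streaming adapter (read messages, map each to a row, write rows in batches) can be sketched without any broker. Everything here is an assumption made for illustration: a `queue.Queue` stands in for a topic, messages are assumed to be JSON objects, and `stream_to_sink` is a hypothetical name; no Kafka or MaxScale client API is used.

```python
import json
import queue


def stream_to_sink(topic, sink, columns, batch_size=2):
    """Drain messages from `topic`, map each to a row, and write rows in batches."""
    batch = []
    written = 0
    while True:
        try:
            message = topic.get_nowait()
        except queue.Empty:
            break  # a real adapter would keep polling instead of stopping
        record = json.loads(message)
        batch.append(tuple(record.get(c) for c in columns))
        if len(batch) >= batch_size:
            sink(batch)
            written += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        sink(batch)
        written += len(batch)
    return written


# In-memory stand-in for a topic with three JSON messages
topic = queue.Queue()
for msg in ({"id": 1, "amount": 10}, {"id": 2, "amount": 100}, {"id": 3, "amount": 200}):
    topic.put(json.dumps(msg))

stored = []  # stand-in for the analytical table
count = stream_to_sink(topic, stored.extend, columns=("id", "amount"))
print(count, stored)
```

The same loop shape applies whether the messages come from a message broker or from a change-data-capture stream; only the source and the record format differ.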
  18. The big picture: putting it all together
  19. [Diagram in three areas. Operations: Web/Mobile Services, MariaDB MaxScale, Transaction (OLTP) on MariaDB Server with InnoDB. Ingestion: Apache Kafka (Streaming Data Adapters), Data Services (Bulk Data Adapters), Spark / Python / ML (Bulk Data Adapters), MariaDB MaxScale. Analytics (OLAP): MariaDB ColumnStore]
  20. Resources. Documentation: https://mariadb.com/kb/en/library/mariadb-columnstore/ ; Blogs: https://mariadb.com/blog-tags/columnstore and https://mariadb.com/blog-tags/big-data ; Reach me: dipti.joshi@mariadb.com ; Download: MariaDB ColumnStore 1.1 (https://mariadb.com/downloads/mariadb-ax), MariaDB MaxScale (https://mariadb.com/downloads/mariadb-ax/maxscale), Bulk Data Adapters and Streaming Data Adapters (https://mariadb.com/downloads/mariadb-ax/data-adapters), MariaDB ColumnStore Backup/Restore Tool (https://mariadb.com/downloads/mariadb-ax/tools-ax)
  21. Summary: what’s new in MariaDB AX. Complex, custom analytics: user-defined aggregate functions, user-defined window functions, text and binary columns. Spark integration: JDBC (SQL), direct (data adapter). Improved HA/DR: GlusterFS support, parallel backup/restore. Streamlined data ingestion: streaming data adapters, bulk data adapters.
  22. Thank you!
