Why use a columnar database
for analytical workloads
Shane Johnson
Senior Director of Product Marketing
Introducing MariaDB AX
MariaDB Server
InnoDB
MariaDB TX: optimized for OLTP, row-based storage
MariaDB AX: optimized for OLAP, columnar storage
MariaDB AX and columnar database use cases
– agenda –
MariaDB AX
Introduction
Architecture
Use cases
Financial services
Healthcare
Telecommunications
Digital advertising
A database platform for modern analytics and data warehousing.
– introduction –
Distributed data
Columnar storage
Parallel processing
Data adapters
Connectors (Spark & Kafka)
Open source
Standard SQL
MariaDB AX architecture
MariaDB Server (InnoDB)
customer  month  year  spend
1         01     2018  100.00
2         02     2018  50.00
3         03     2018  75.00
10000000  03     2018  75.00
SELECT AVG(spend)
FROM tbl_purchases
MariaDB Server (InnoDB, ColumnStore)
ColumnStore Storage
customer  month  year  spend
1         01     2018  100.00
2         02     2018  50.00
3         03     2018  75.00
10000000  03     2018  75.00
SELECT AVG(spend)
FROM tbl_purchases
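The difference between the two storage layouts for this query can be sketched in a few lines of Python. This is only an illustration: plain lists stand in for on-disk storage, the function names are hypothetical, and the "values read" counter is a rough proxy for I/O rather than a measurement of InnoDB or ColumnStore.

```python
# Illustrative only: plain Python lists stand in for on-disk storage.
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    (1, 1, 2018, 100.00),
    (2, 2, 2018, 50.00),
    (3, 3, 2018, 75.00),
]

# Column-oriented layout: each column is stored as its own sequence.
columns = {
    "customer": [1, 2, 3],
    "month": [1, 2, 3],
    "year": [2018, 2018, 2018],
    "spend": [100.00, 50.00, 75.00],
}


def avg_spend_row_store(records):
    """Read every field of every record, then use only spend (index 3)."""
    values_read = 0
    total = 0.0
    for record in records:
        values_read += len(record)
        total += record[3]
    return total / len(records), values_read


def avg_spend_column_store(cols):
    """Read only the spend column; the other columns are never touched."""
    spend = cols["spend"]
    return sum(spend) / len(spend), len(spend)


row_avg, row_values_read = avg_spend_row_store(rows)
col_avg, col_values_read = avg_spend_column_store(columns)
print(row_avg, row_values_read)
print(col_avg, col_values_read)
```

Both functions return the same average, but the column-oriented version reads one value per row instead of four.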
1. high compression (65-95%)
2. supports sparse columns (NULL)
3. supports many columns
4. no need for indexes
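One way to see why column-wise storage tends to compress well, and why sparse columns are cheap, is run-length encoding. The sketch below is an assumption made for illustration: the slides do not say which compression scheme ColumnStore uses, and the 65-95% figure above is not reproduced here. The function name is hypothetical.

```python
def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


# A column where every row holds the same year collapses to one run.
year_runs = run_length_encode([2018] * 6)

# A sparse column (mostly NULL, modelled as None) collapses almost as well.
sparse_runs = run_length_encode([None, None, None, None, None, 7])

print(year_runs)
print(sparse_runs)
```

Values from the same column sit next to each other, so long runs of repeated or missing values are common. In a row-based layout the same values are interleaved with the other fields of each record.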
1. columns stored as segments (file)
2. segments have extents (logical)
3. extents have 8 million rows
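The relationship between rows and extents can be sketched as simple arithmetic. The constant below takes the slide's "8 million" literally, which is an assumption; the exact row count per extent in a real deployment may differ. Both function names are hypothetical.

```python
ROWS_PER_EXTENT = 8_000_000  # assumption: the slide's "8 million" taken literally


def extent_index(row_number):
    """Return the zero-based extent that holds a one-based row number."""
    if row_number < 1:
        raise ValueError("row numbers start at 1")
    return (row_number - 1) // ROWS_PER_EXTENT


def extents_needed(row_count):
    """Return how many extents one column needs for row_count rows."""
    return -(-row_count // ROWS_PER_EXTENT)  # ceiling division


print(extent_index(1))
print(extent_index(10_000_000))
print(extents_needed(10_000_000))
```

Under this assumption, the 10,000,000-row example table would need two extents per column, with the last row falling in the second one.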
Performance Module (PM)
User Module (UM)
[Diagram: S1 (UM + PM), a single server running MariaDB Server (InnoDB, ColumnStore) and ColumnStore Storage]
[Diagram: S1 (UM) runs MariaDB Server (InnoDB, ColumnStore); S2 (PM) holds ColumnStore Storage with rows 1 to 300,000]
[Diagram: S1 (UM) runs MariaDB Server (InnoDB, ColumnStore); S2, S3 and S4 (PM) each hold ColumnStore Storage]
Rows 1 to 100,000
Rows 100,001 to 200,000
Rows 200,001 to 300,000
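The split shown in this diagram can be reproduced with a small range-partitioning helper. It assumes an even, contiguous split across PMs, which mirrors the diagram; the slides do not describe how ColumnStore actually assigns rows to PMs, and the function name is hypothetical.

```python
def partition_rows(total_rows, node_count):
    """Split rows 1..total_rows into contiguous, near-equal (first, last) ranges."""
    base, extra = divmod(total_rows, node_count)
    ranges = []
    first = 1
    for node in range(node_count):
        # The first `extra` nodes take one additional row each.
        size = base + (1 if node < extra else 0)
        last = first + size - 1
        ranges.append((first, last))
        first = last + 1
    return ranges


pm_ranges = partition_rows(300_000, 3)
print(pm_ranges)
```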
[Diagram: S1 (UM) runs MariaDB Server (InnoDB, ColumnStore); S2, S3 and S4 (PM) each hold ColumnStore Storage]
T1: 1 to 30,000
T2: 30,001 to 60,000
T4: 100,001 to 130,000
T5: 130,001 to 160,000
T7: 200,001 to 230,000
T8: 230,001 to 260,000
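The idea of scanning row ranges in parallel and combining the results can be sketched as follows. Everything here is a stand-in: `spend_for_row` is a hypothetical data source, the ranges are the ones labelled T1 to T8 above, and a Python thread pool only illustrates the structure of the computation. It does not show real speedups or how ColumnStore schedules its work.

```python
from concurrent.futures import ThreadPoolExecutor


def spend_for_row(row_number):
    """Hypothetical data source: a deterministic spend value for a row."""
    return float(row_number % 100)


def partial_aggregate(row_range):
    """Scan one range and return (sum, count), as a single worker would."""
    first, last = row_range
    total = 0.0
    for row_number in range(first, last + 1):
        total += spend_for_row(row_number)
    return total, last - first + 1


def parallel_avg(row_ranges, workers=4):
    """Combine per-range partial results into a single average."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_aggregate, row_ranges))
    total = sum(part[0] for part in partials)
    count = sum(part[1] for part in partials)
    return total / count


ranges = [
    (1, 30_000), (30_001, 60_000),
    (100_001, 130_000), (130_001, 160_000),
    (200_001, 230_000), (230_001, 260_000),
]
print(parallel_avg(ranges))
```

Each worker returns a sum and a count rather than its own average, because partial averages cannot be combined correctly when the ranges differ in size.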
[Diagram: S1, S2 and S3 (UM) each run MariaDB Server (InnoDB, ColumnStore); S4, S5 and S6 (PM) each hold ColumnStore Storage]
Query throughput
Query latency
[Diagram: SQL goes to S1 (UM), which runs MariaDB Server (InnoDB, ColumnStore); S2, S4 and S6 (PM) each hold ColumnStore Storage]
[Diagram: Import (CLI) of a file on S1 (UM) into ColumnStore Storage on S2, S4 and S6 (PM)]
[Diagram: three Import (CLI) runs, one per file part (1/3, 2/3, 3/3), loading into ColumnStore Storage on S2, S4 and S6 (PM)]
[Diagram: Application/Service/Script using the bulk data adapter to write to ColumnStore Storage on S2, S4 and S6 (PM)]
[Diagram: Spark job/task using the Spark Connector to write to ColumnStore Storage on S2, S4 and S6 (PM)]
[Diagram: data ingestion paths into S1 (UM), running MariaDB Server (InnoDB, ColumnStore), and S2 (PM), holding ColumnStore Storage]
Bulk adapters: C, Java, Python
Stream adapters: Kafka, MaxScale CDC
Spark
SQL
Import (CSV)
MariaDB AX
customer use cases
– financial services –
Drivers
Become customer-centric
Facilitate regulatory compliance
Create competitive advantages
Goals
Improve customer satisfaction
Mitigate financial risks
Predict market changes
Use cases
Fraud detection: identify patterns + detect anomalies in financial transactions
Compliance archiving: store financial trade history for long-term retention
Investment forecasting: analyze financial markets + securities to predict ROI
– OTC Markets Group –
Data
10TB of rolling data (5 years)
10,000 U.S. and global securities
100,000 trades
24 million quotes
Use cases
Subscribers analyze quote and trading data
Regulatory agencies build compliance reports on demand
– healthcare –
Drivers
Digital transformation
Electronic health records (EHRs)
Value-based care (VBC)
Goals
Improved population health
Better patient experiences
Reduced cost of care
Use cases
Population health mgt: analyze claims/surveys to recommend interventions
Evidence-based medicine: improve diagnostic accuracy by analyzing EHRs
Precision medicine: identify targeted treatments by analyzing genomes
– Institute for Health Metrics and Evaluation –
Data
30TB of data
100 billion data points
Multi-billion row tables
Use cases
Enable the public to analyze global health population data via online data visualization tools
– telecommunications –
Drivers
Improve sales, marketing and operational efficiency
Goals
High customer retention
Better network optimizations
New services and revenue
Use cases
Churn prevention: analyze customer plans/usage to create retention programs
Cross-selling: identify opportunities by analyzing call detail records (CDRs)
Network optimization: analyze traffic/cell tower data to optimize capacity
– Pinger, Inc. –
About
30 million texts
3 million phone calls
1.5 billion logs a day
24 months’ worth of data
Use cases
Support customer behavioral analysis based on historical data and usage
– digital advertising –
Drivers
Granularity of data
Demographic and behavioral
Social and location
Goals
Deliver the right ad to the right person at the right time, in the right location and through the right medium
Use cases
Audience segmentation: improve ad relevance via fine-grained visitor profiles
Ad placement: choose where to show ads based on click and conversion data
Real-time bidding: analyze big request/response history to optimize prices
– digital advertising vendor –
About
300 million impressions a month
70 million rows a day
60TB of uncompressed data
Use cases
Enable customers to create a custom report on up to 30 columns on demand
Questions?
Thank you

 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Data Con LA 2018 - Why use a columnar database for analytical workloads by Shane Johnson

  • 1. Why use a columnar database for analytical workloads Shane Johnson Senior Director of Product Marketing
  • 2. Introducing MariaDB AX Why use a columnar database for analytical workloads
  • 3. MariaDB Server InnoDB MariaDB TX MariaDB AX Optimized for OLTP Row-based storage Optimized for OLAP Columnar storage
  • 4. MariaDB AX and columnar database use cases – agenda – MariaDB AX Introduction Architecture Use cases Financial services Healthcare Telecommunications Digital advertising
  • 5. A database platform for modern analytics and data warehousing. MariaDB AX and columnar database use cases – introduction – Distributed data Columnar storage Parallel processing Data adapters Connectors (Spark & Kafka) Open source Standard SQL
  • 6. MariaDB AX architecture Why use a columnar database for analytical workloads
  • 7. MariaDB Server InnoDB customer month year spend 1 01 2018 100.00 2 02 2018 50.00 3 03 2018 75.00 10000000 03 2018 75.00
  • 8. MariaDB Server InnoDB customer month year spend 1 01 2018 100.00 2 02 2018 50.00 3 03 2018 75.00 10000000 03 2018 75.00 SELECT AVG(spend) FROM tbl_purchases
  • 9. MariaDB Server InnoDB ColumnStore ColumnStore Storage customer month year spend 1 01 2018 100.00 2 02 2018 50.00 3 03 2018 75.00 10000000 03 2018 75.00
  • 10. MariaDB Server InnoDB ColumnStore ColumnStore Storage customer month year spend 1 01 2018 100.00 2 02 2018 50.00 3 03 2018 75.00 10000000 03 2018 75.00 SELECT AVG(spend) FROM tbl_purchases
  • 11. MariaDB Server InnoDB ColumnStore ColumnStore Storage customer month year spend 1 01 2018 100.00 2 02 2018 50.00 3 03 2018 75.00 10000000 03 2018 75.00 1. high compression (65-95%) 2. supports sparse columns (NULL) 3. supports many columns 4. no need for indexes
  • 12. MariaDB Server InnoDB ColumnStore ColumnStore Storage customer month year spend 1 01 2018 100.00 2 02 2018 50.00 3 03 2018 75.00 10000000 03 2018 75.00 1. columns stored as segments (file) 2. segments have extents (logical) 3. extents have 8 million rows
  • 13.
  • 14. Performance Module (PM) User Module (UM) MariaDB Server InnoDB ColumnStore ColumnStore Storage
  • 15. S1 (UM + PM) MariaDB Server InnoDB ColumnStore ColumnStore Storage
  • 17. S4 (PM) S2 (PM) S1 (UM) MariaDB Server InnoDB ColumnStore ColumnStore Storage S3 (PM) ColumnStore Storage ColumnStore Storage Rows 1 to 100,000 Rows 100,001 to 200,000 Rows 200,001 to 300,000
  • 18. S4 (PM) S2 (PM) S1 (UM) MariaDB Server InnoDB ColumnStore ColumnStore Storage S3 (PM) ColumnStore Storage ColumnStore Storage T1: 00,001 to 30,000 T2: 30,001 to 60,000 T4: 100,001 to 130,000 T5: 130,001 to 160,000 T7: 200,001 to 230,000 T8: 230,001 to 260,000
  • 19. S6 (PM) S4 (PM) S2 (UM) MariaDB Server InnoDB ColumnStore ColumnStore Storage S5 (PM) ColumnStore Storage S1 (UM) MariaDB Server InnoDB ColumnStore S3 (UM) MariaDB Server InnoDB ColumnStore ColumnStore Storage Query throughput Query latency
  • 20. S2 (PM) S1 (UM) MariaDB Server InnoDB ColumnStore ColumnStore Storage SQL S4 (PM) ColumnStore Storage S6 (PM) ColumnStore Storage
  • 21. S1 (UM) S2 (PM) ColumnStore Storage Import (CLI) S4 (PM) ColumnStore Storage S6 (PM) ColumnStore Storage File
  • 22. S2 (PM) ColumnStore Storage S4 (PM) ColumnStore Storage S6 (PM) ColumnStore Storage Import (CLI) File (2/3) Import (CLI) File (1/3) Import (CLI) File (3/3)
  • 23. Application/Service/Script S2 (PM) ColumnStore Storage Bulk data adapter S4 (PM) ColumnStore Storage S6 (PM) ColumnStore Storage
  • 24. Spark job/task S2 (PM) ColumnStore Storage Spark Connector S4 (PM) ColumnStore Storage S6 (PM) ColumnStore Storage
  • 25. Bulk adapters S1 (UM) MariaDB Server InnoDB ColumnStore ColumnStore C Java Python Stream adapters Kafka MaxScale CDC Spark Storage SQL Import (CSV) S2 (PM)
  • 26. MariaDB AX customer use cases Why use a columnar database for analytical workloads
  • 27. MariaDB AX and columnar database use cases – financial services – Drivers Become customer-centric Facilitate regulatory compliance Create competitive advantages Goals Improve customer satisfaction Mitigate financial risks Predict market changes Use cases Fraud detection: identify patterns + detect anomalies in financial transactions Compliance archiving: store financial trade history for long-term retention Investment forecasting: analyze financial markets + securities to predict ROI
  • 28. MariaDB AX and columnar database use cases – OTC Markets Group – Data 10TB of rolling data (5 years) 10,000 U.S. and global securities 100,000 trades 24 million quotes Use cases Subscribers analyze quote and trading data Regulatory agencies build compliance reports on demand
  • 29. MariaDB AX and columnar database use cases – healthcare – Drivers Digital transformation Electronic health records (EHRs) Value-based care (VBC) Goals Improved population health Better patient experiences Reduced cost of care Use cases Population health management: analyze claims/surveys to recommend interventions Evidence-based medicine: improve diagnostic accuracy by analyzing EHRs Precision medicine: identify targeted treatments by analyzing genomes
  • 30. MariaDB AX and columnar database use cases – Institute for Health Metrics and Evaluation – Data 30TB of data 100 billion data points Multi-billion row tables Use cases Enable the public to analyze global health population data via online data visualization tools
  • 31.
  • 32. MariaDB AX and columnar database use cases – telecommunications – Drivers Improve sales, marketing and operational efficiency Goals High customer retention Better network optimizations New services and revenue Use cases Churn prevention: analyze customer plans/usage to create retention programs Cross-selling: identify opportunities by analyzing call detail records (CDRs) Network optimization: analyze traffic/cell tower data to optimize capacity
  • 33. MariaDB AX and columnar database use cases – Pinger, Inc. – About 30 million texts 3 million phone calls 1.5 billion logs a day 24 months’ worth of data Use cases To support customer behavioural analysis based on historical data and usage
  • 34. MariaDB AX and columnar database use cases – digital advertising – Drivers Granularity of data Demographic and behavioral Social and location Goals Deliver the right ad to the right person at the right time, in the right location and through the right medium Use cases Audience segmentation: improve ad relevance via fine-grained visitor profiles Ad placement: choose where to show ads based on click and conversion data Real-time bidding: analyze big request/response history to optimize prices
  • 35. MariaDB AX and columnar database use cases – digital advertising vendor – About 300 million impressions a month 70 million rows a day 60TB of uncompressed data Use cases Enable customers to create a custom report on up to 30 columns on demand
  • 36. Why use a columnar database for analytical workloads Questions?
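
The comparison in slides 7 to 10 and the row-range distribution in slide 17 can be sketched in a few lines of Python. This is only an illustration of the idea, not MariaDB ColumnStore code: the in-memory lists, the function names and the "partial sum and count per node" combining step are all assumptions made for the example. It uses the first three sample rows from the slides and answers the same question as `SELECT AVG(spend) FROM tbl_purchases`.

```python
# Row-oriented layout (InnoDB-style): each record keeps all four fields together.
rows = [
    (1, 1, 2018, 100.00),
    (2, 2, 2018, 50.00),
    (3, 3, 2018, 75.00),
]

# Column-oriented layout (hypothetical stand-in for columnar storage):
# one sequence per column, so a query can read just the column it needs.
columns = {
    "customer": [1, 2, 3],
    "month": [1, 2, 3],
    "year": [2018, 2018, 2018],
    "spend": [100.00, 50.00, 75.00],
}


def avg_spend_row_store(rows):
    """Visit every full record, even though only `spend` is used."""
    total = 0.0
    for _customer, _month, _year, spend in rows:
        total += spend
    return total / len(rows)


def avg_spend_column_store(columns):
    """Read only the `spend` column; the other columns are never touched."""
    spend = columns["spend"]
    return sum(spend) / len(spend)


def avg_spend_distributed(spend, nodes):
    """Assumed scheme: split the column into contiguous row ranges, one per
    node, let each node return (sum, count), then combine the partials."""
    size = -(-len(spend) // nodes)  # ceiling division
    partials = [
        (sum(spend[i:i + size]), len(spend[i:i + size]))
        for i in range(0, len(spend), size)
    ]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


print(avg_spend_row_store(rows))
print(avg_spend_column_store(columns))
print(avg_spend_distributed(columns["spend"], nodes=2))
```

All three functions return the same average. The difference is how much data each one has to touch, which is the point the slides make about analytical queries over wide tables with many rows.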