SlideShare a Scribd company logo
Building High-Performance MySQL Query Systems and Analytic Applications Robin Schumacher
Agenda ,[object Object],[object Object],[object Object],[object Object]
What are we talking about? ,[object Object],[object Object],[object Object]
Reporting and Business Intelligence DB’s ,[object Object],[object Object],[object Object]
Data Warehouses/Marts/Analytic DB’s OLTP Files/XML Log Files Operational Source Data Staging  or ODS ETL Final  ETL Reporting, BI, Notification Layer Ad-Hoc Dashboards Reports Notifications Users Staging Area Data Warehouse Warehouse Archive Purge/Archive Data Warehouse and Metadata Management
Reporting Databases OLTP Database Read Shard One Reporting Database Application Servers End Users ETL Data Archiving Link Replication
Application Sharding / Partitioning ,[object Object],[object Object],[object Object]
Read Sharding / Partitioning
What are the core rules to follow in order to avoid anxiety over building fast read-intensive, reporting, and analytic databases?
#1 Only Read the Data You Need ,[object Object],[object Object],[object Object],[object Object]
#2 Exploit Modern Hardware ,[object Object],[object Object],[object Object],[object Object]
#3 Divide and Conquer ,[object Object],[object Object],[object Object],[object Object]
#4 Scale both I/O and User Connections ,[object Object],[object Object],[object Object]
#5 Provide Transparent Expansion and Failover ,[object Object],[object Object],[object Object]
#6 Load New Data with Minimal Impact ,[object Object],[object Object],[object Object],[object Object]
#7 Quickly Troubleshoot Poor Read Performance ,[object Object],[object Object],[object Object]
Good suggestions, but how can I practically do all these things…?
What is Calpont’s InfiniDB? InfiniDB is an open source, column-oriented database architected to handle data warehouses, data marts, analytic/BI systems, and other read-intensive applications. It delivers true scale up (more CPU’s/cores, RAM) and massive parallel processing (MPP) scale out capabilities for MySQL users. Linear performance gains are achieved when adding either more capabilities to one box or using commodity machines in a scale out configuration.  Scale up Scale Out
#1 Only Read the Data You Need ,[object Object],[object Object],[object Object],[object Object],[object Object],Recommendation : Start using a column-oriented database Caveat : if you are reading all (select *) or most of the columns in a table, then a column database may not be right for your application.
Column vs. Row Orientation  A column-oriented architecture looks the same on the surface, but stores data differently than legacy/row-based databases…
#2 Exploit Modern Hardware ,[object Object],[object Object],[object Object],[object Object],Recommendation : Use databases/storage engines that scale up (i.e. use available CPU’s/cores)
InfiniDB Community – Scale Up InfiniDB Community edition is a FOSS, multi-threaded database server that is capable of using a machine’s CPUs/cores to process queries 87% 22.14 164.12 Q3.2 83% 55.04 316.79 Q3.1 87% 15.94 121.33 Q2.3 87% 19.70 151.20 Q2.2 79% 44.65 210.21 Q2.1 Overall Percent Reduction with additional cores InfiniDB 8 cores (elapsed time in seconds) InfiniDB 1 Core (elapsed time in seconds) SSB Query  (@100 scale)
#3 Divide and Conquer ,[object Object],[object Object],[object Object],[object Object],Recommendation : Use Scale-Out in addition to Scale-up
InfiniDB Enterprise – Scale Up and Out User Connections User Module 1 User Module n Performance Module 1 Performance Module n Performance Module 2 Shared Storage Database files, System Catalog
#3 Divide and Conquer ,[object Object],[object Object],[object Object],87% 77.74 148.49 297.46 597.97 Q3.2 84% 134.21 316.50 425.25 848.79 Q3.1 87% 51.36 96.03 192.03 386.66 Q2.3 87% 56.41 106.37 214.87 430.25 Q2.2 87% 68.21 129.90 261.35 531.34 Q2.1 Overall Percent Reduction from 1 – 8PM’s 8PM (elapsed time in seconds) 4PM (elapsed time in seconds) 2PM (elapsed time in seconds) 1PM (elapsed time in seconds) SSB Query @1000
#4 Scale both I/O and User Connections Recommendation : Use modular architecture  User Connections User Module 1 User Module n Performance Module 1 Performance Module n Performance Module 2 Shared Storage Database files, System Catalog Add more Performance Modules to scale I/O Add more User Modules to scale concurrency
#5 Provide Transparent Expansion and Failover ,[object Object],[object Object],[object Object],[object Object],Recommendation : Use either replication or MPP
#5 Provide Transparent Expansion and Failover Cust_id 1-999 Cust_id 1000-1999 Cust_id 2000-2999 Sharding Architecture MySQL Replication Web/App Servers Browsers
#5 Provide Transparent Expansion and Failover User Connections User Module 1 User Module n Performance Module 1 Performance Module n Performance Module 2 Shared Storage Database files, System Catalog If one Performance Module fails, traffic resumes with the remaining nodes User queries can be redirected to other User Modules if one fails
#6 Load New Data with Minimal Impact ,[object Object],[object Object],[object Object],[object Object],Recommendation : Use two-step ETL feed with non-blocking load utilities and/or MVCC database engine
#6 Load New Data with Minimal Impact OLTP Files/XML Log Files Operational Source Data Staging  or ODS ETL High-speed Load Utility Ad-Hoc Dashboards Reports Notifications Users Staging Area Data Warehouse Data Warehouse and Metadata Management
#7 Quickly Troubleshoot Poor Read Performance ,[object Object],[object Object],[object Object],[object Object],Recommendation : Proactively use load testing; reactively use SQL analysis and tracing
InfiniDB Extent Map – No Indexing Needed If a column WHERE filter of “COL1 BETWEEN 220 AND 250 AND COL2 < 10000” is specified, InfiniDB will eliminate extents 1, 2 and 4 from the first column filter, then, looking at just the matching extents for COL2 (i.e. just extent 3), it will determine that no extents match and return zero rows without doing any I/O at all. … Extent Map Also enables logical range partitioning of data… Ext 2 Min 101 Max 200 Ext 3 Min 201 Max 300 Ext 4 Min 301 Max 400 Col1 Ext 1 Min 1 Max 100 Ext 2 Min 10100 Max 20000 Ext 3 Min 20100 Max 30000 Ext 4 Min 30100 Max 40000 Col2 Ext 1 Min 100 Max 10000
Summary Provides both diagnostic and tracing tools; no major design tuning efforts Use load testing and SQL analysis tools Method for troubleshooting poor read performance Has high-speed loader with no blocking and MVCC Use two-step ETL and bulk load process Load data with minimal impact Does transparent failover for I/O and manual for connectivity Use replication and load balancers Provide transparent expansion and failover Modular architecture for scaling both concurrency and I/O Application partition Scale concurrency and I/O Supports MPP scale out Spread load via replication or MPP Divide and Conquer Is multi-threaded and uses multiple CPUs / Cores Use DB’s/storage engines that are multi-threaded Exploit modern hardware Is column-oriented Use column database Only read the data you need InfiniDB General Technique Recommendation
Calpont Solutions Calpont Analytic Database Server Editions Calpont Analytic Database Solutions InfiniDB  Community Server Column-Oriented Multi-threaded Terabyte Capable Single Server InfiniDB Enterprise Server Scale out / Parallel Processing Automatic Failover InfiniDB Enterprise Solution Monitoring 24x7 Support Auto Patch Management Alerts & SNMP Notifications Hot Fix Builds Consultative Help
InfiniDB Community & Enterprise Server Comparison Yes No Multi-Node, MPP scale out capable w/ failover Formal Production Support Forums Only Support Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes InfiniDB Community Yes INSERT/UPDATE/DELETE (DML) support Yes Transaction support (ACID compliant) Yes MySQL front end Yes Logical data compression Yes High-Speed bulk loader w/ no blocking queries while loading Yes Multi-threaded engine (queries/writes will use all CPU’s/cores on box) Yes Crash-recovery Yes Terabyte database capable Yes High concurrency supported Yes Alter Table with online add column capability  Yes MVCC support – snapshot read (readers don’t block writers) Yes Automatic vertical (column) and logical horizontal partitioning of data Yes No indexing necessary Yes Column-oriented InfiniDB Enterprise Core Database Server Features
For More Information ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],www.infinidb.org
Building High-Performance MySQL Query Systems and Analytic Applications Thanks…!

More Related Content

What's hot

Breaking data
Breaking dataBreaking data
Breaking data
Terry Bunio
 
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Continuent
 
Sql server tips from the field
Sql server tips from the fieldSql server tips from the field
Sql server tips from the field
JoAnna Cheshire
 
Replication in Distributed Database
Replication in Distributed DatabaseReplication in Distributed Database
Replication in Distributed Database
Abhilasha Lahigude
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?
Deepak Shankar
 
Making Postgres Central in Your Data Center
Making Postgres Central in Your Data CenterMaking Postgres Central in Your Data Center
Making Postgres Central in Your Data Center
EDB
 
Replication in Distributed Real Time Database
Replication in Distributed Real Time DatabaseReplication in Distributed Real Time Database
Replication in Distributed Real Time Database
Ghanshyam Yadav
 
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
Performance Tuning -  Memory leaks, Thread deadlocks, JDK toolsPerformance Tuning -  Memory leaks, Thread deadlocks, JDK tools
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
Haribabu Nandyal Padmanaban
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen
Steve Feldman
 
try
trytry
Cloud computing
Cloud computingCloud computing
Cloud computing
Aaron Tushabe
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File Systemtutchiio
 
TimesTen Overview
TimesTen OverviewTimesTen Overview
TimesTen Overview
Rex Wang
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
Ruben Badaró
 
Design principles of scalable, distributed systems
Design principles of scalable, distributed systemsDesign principles of scalable, distributed systems
Design principles of scalable, distributed systems
Tinniam V Ganesh (TV)
 
NoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application EnablementNoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application Enablement
DATAVERSITY
 
5 Postgres DBA Tips
5 Postgres DBA Tips5 Postgres DBA Tips
5 Postgres DBA Tips
EDB
 
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Global Business Events
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
MongoDB
 

What's hot (19)

Breaking data
Breaking dataBreaking data
Breaking data
 
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
 
Sql server tips from the field
Sql server tips from the fieldSql server tips from the field
Sql server tips from the field
 
Replication in Distributed Database
Replication in Distributed DatabaseReplication in Distributed Database
Replication in Distributed Database
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?
 
Making Postgres Central in Your Data Center
Making Postgres Central in Your Data CenterMaking Postgres Central in Your Data Center
Making Postgres Central in Your Data Center
 
Replication in Distributed Real Time Database
Replication in Distributed Real Time DatabaseReplication in Distributed Real Time Database
Replication in Distributed Real Time Database
 
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
Performance Tuning -  Memory leaks, Thread deadlocks, JDK toolsPerformance Tuning -  Memory leaks, Thread deadlocks, JDK tools
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen
 
try
trytry
try
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File System
 
TimesTen Overview
TimesTen OverviewTimesTen Overview
TimesTen Overview
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
 
Design principles of scalable, distributed systems
Design principles of scalable, distributed systemsDesign principles of scalable, distributed systems
Design principles of scalable, distributed systems
 
NoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application EnablementNoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application Enablement
 
5 Postgres DBA Tips
5 Postgres DBA Tips5 Postgres DBA Tips
5 Postgres DBA Tips
 
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 

Viewers also liked

Advanced MySQL Query Tuning
Advanced MySQL Query TuningAdvanced MySQL Query Tuning
Advanced MySQL Query Tuning
Alexander Rubin
 
Zurich2007 MySQL Query Optimization
Zurich2007 MySQL Query OptimizationZurich2007 MySQL Query Optimization
Zurich2007 MySQL Query Optimization
Hiệp Lê Tuấn
 
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
Dave Stokes
 
56 Query Optimization
56 Query Optimization56 Query Optimization
56 Query OptimizationMYXPLAIN
 
Mysql query optimization
Mysql query optimizationMysql query optimization
Mysql query optimizationBaohua Cai
 
MYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to SphinxMYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to Sphinx
Pythian
 
Query Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New TricksQuery Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New Tricks
MYXPLAIN
 
Tunning sql query
Tunning sql queryTunning sql query
Tunning sql query
vuhaininh88
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101
Sveta Smirnova
 
Advanced MySQL Query and Schema Tuning
Advanced MySQL Query and Schema TuningAdvanced MySQL Query and Schema Tuning
Advanced MySQL Query and Schema Tuning
MYXPLAIN
 
MySQL Query Optimization
MySQL Query OptimizationMySQL Query Optimization
MySQL Query Optimization
Morgan Tocker
 
Webinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuningWebinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuning
晓 周
 
MySQL Query Optimization.
MySQL Query Optimization.MySQL Query Optimization.
MySQL Query Optimization.
Remote MySQL DBA
 
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Jaime Crespo
 
Query Optimization with MySQL 5.7 and MariaDB 10: Even newer tricks
Query Optimization with MySQL 5.7 and MariaDB 10: Even newer tricksQuery Optimization with MySQL 5.7 and MariaDB 10: Even newer tricks
Query Optimization with MySQL 5.7 and MariaDB 10: Even newer tricks
Jaime Crespo
 
MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)Karthik .P.R
 
Sql query patterns, optimized
Sql query patterns, optimizedSql query patterns, optimized
Sql query patterns, optimized
Karwin Software Solutions LLC
 
MySQL Query And Index Tuning
MySQL Query And Index TuningMySQL Query And Index Tuning
MySQL Query And Index TuningManikanda kumar
 
Percona Live 2012PPT: MySQL Query optimization
Percona Live 2012PPT: MySQL Query optimizationPercona Live 2012PPT: MySQL Query optimization
Percona Live 2012PPT: MySQL Query optimization
mysqlops
 

Viewers also liked (20)

Advanced MySQL Query Tuning
Advanced MySQL Query TuningAdvanced MySQL Query Tuning
Advanced MySQL Query Tuning
 
Zurich2007 MySQL Query Optimization
Zurich2007 MySQL Query OptimizationZurich2007 MySQL Query Optimization
Zurich2007 MySQL Query Optimization
 
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
 
56 Query Optimization
56 Query Optimization56 Query Optimization
56 Query Optimization
 
Mysql query optimization
Mysql query optimizationMysql query optimization
Mysql query optimization
 
MYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to SphinxMYSQL Query Anti-Patterns That Can Be Moved to Sphinx
MYSQL Query Anti-Patterns That Can Be Moved to Sphinx
 
Query Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New TricksQuery Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New Tricks
 
Tunning sql query
Tunning sql queryTunning sql query
Tunning sql query
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101
 
Advanced MySQL Query and Schema Tuning
Advanced MySQL Query and Schema TuningAdvanced MySQL Query and Schema Tuning
Advanced MySQL Query and Schema Tuning
 
MySQL Query Optimization
MySQL Query OptimizationMySQL Query Optimization
MySQL Query Optimization
 
My sql optimization
My sql optimizationMy sql optimization
My sql optimization
 
Webinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuningWebinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuning
 
MySQL Query Optimization.
MySQL Query Optimization.MySQL Query Optimization.
MySQL Query Optimization.
 
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
 
Query Optimization with MySQL 5.7 and MariaDB 10: Even newer tricks
Query Optimization with MySQL 5.7 and MariaDB 10: Even newer tricksQuery Optimization with MySQL 5.7 and MariaDB 10: Even newer tricks
Query Optimization with MySQL 5.7 and MariaDB 10: Even newer tricks
 
MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)
 
Sql query patterns, optimized
Sql query patterns, optimizedSql query patterns, optimized
Sql query patterns, optimized
 
MySQL Query And Index Tuning
MySQL Query And Index TuningMySQL Query And Index Tuning
MySQL Query And Index Tuning
 
Percona Live 2012PPT: MySQL Query optimization
Percona Live 2012PPT: MySQL Query optimizationPercona Live 2012PPT: MySQL Query optimization
Percona Live 2012PPT: MySQL Query optimization
 

Similar to Building High Performance MySql Query Systems And Analytic Applications

EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperDavid Walker
 
Parallel processing in data warehousing and big data
Parallel processing in data warehousing and big dataParallel processing in data warehousing and big data
Parallel processing in data warehousing and big data
Abhishek Sharma
 
Applications of parellel computing
Applications of parellel computingApplications of parellel computing
Applications of parellel computingpbhopi
 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replicationShahzad
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
ijcsit
 
Query Optimization for Big Data Analytics
Query Optimization for Big Data AnalyticsQuery Optimization for Big Data Analytics
Query Optimization for Big Data Analytics
AIRCC Publishing Corporation
 
Veritas Failover3
Veritas Failover3Veritas Failover3
Veritas Failover3grogers1124
 
Azure BI Cloud Architectural Guidelines.pdf
Azure BI Cloud Architectural Guidelines.pdfAzure BI Cloud Architectural Guidelines.pdf
Azure BI Cloud Architectural Guidelines.pdf
pbonillo1
 
Vargas polyglot-persistence-cloud-edbt
Vargas polyglot-persistence-cloud-edbtVargas polyglot-persistence-cloud-edbt
Vargas polyglot-persistence-cloud-edbtGenoveva Vargas-Solar
 
data mining
data miningdata mining
data mining
renukarenuka9
 
data mining
data miningdata mining
data mining
renukarenuka9
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Bhupesh Bansal
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedIn
Hadoop User Group
 
Scalability Considerations
Scalability ConsiderationsScalability Considerations
Scalability Considerations
Navid Malek
 
Open world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learnedOpen world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learnedchet justice
 
Sql interview question part 5
Sql interview question part 5Sql interview question part 5
Sql interview question part 5kaashiv1
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
sudhakara st
 
Database project edi
Database project ediDatabase project edi
Database project ediRey Jefferson
 

Similar to Building High Performance MySql Query Systems And Analytic Applications (20)

EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - Paper
 
Parallel processing in data warehousing and big data
Parallel processing in data warehousing and big dataParallel processing in data warehousing and big data
Parallel processing in data warehousing and big data
 
Applications of parellel computing
Applications of parellel computingApplications of parellel computing
Applications of parellel computing
 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replication
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
 
Query Optimization for Big Data Analytics
Query Optimization for Big Data AnalyticsQuery Optimization for Big Data Analytics
Query Optimization for Big Data Analytics
 
Veritas Failover3
Veritas Failover3Veritas Failover3
Veritas Failover3
 
Azure BI Cloud Architectural Guidelines.pdf
Azure BI Cloud Architectural Guidelines.pdfAzure BI Cloud Architectural Guidelines.pdf
Azure BI Cloud Architectural Guidelines.pdf
 
Vargas polyglot-persistence-cloud-edbt
Vargas polyglot-persistence-cloud-edbtVargas polyglot-persistence-cloud-edbt
Vargas polyglot-persistence-cloud-edbt
 
data mining
data miningdata mining
data mining
 
data mining
data miningdata mining
data mining
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedIn
 
Scalability Considerations
Scalability ConsiderationsScalability Considerations
Scalability Considerations
 
Open world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learnedOpen world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learned
 
Ebook5
Ebook5Ebook5
Ebook5
 
Sql interview question part 5
Sql interview question part 5Sql interview question part 5
Sql interview question part 5
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
 
Database project
Database projectDatabase project
Database project
 
Database project edi
Database project ediDatabase project edi
Database project edi
 

Recently uploaded

Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 

Recently uploaded (20)

Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 

Building High Performance MySql Query Systems And Analytic Applications

  • 1. Building High-Performance MySQL Query Systems and Analytic Applications Robin Schumacher
  • 2.
  • 3.
  • 4.
  • 5. Data Warehouses/Marts/Analytic DB’s OLTP Files/XML Log Files Operational Source Data Staging or ODS ETL Final ETL Reporting, BI, Notification Layer Ad-Hoc Dashboards Reports Notifications Users Staging Area Data Warehouse Warehouse Archive Purge/Archive Data Warehouse and Metadata Management
  • 6. Reporting Databases OLTP Database Read Shard One Reporting Database Application Servers End Users ETL Data Archiving Link Replication
  • 7.
  • 8. Read Sharding / Partitioning
  • 9. What are the core rules to follow in order to avoid anxiety over building fast read-intensive, reporting, and analytic databases?
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17. Good suggestions, but how can I practically do all these things…?
  • 18. What is Calpont’s InfiniDB? InfiniDB is an open source, column-oriented database architected to handle data warehouses, data marts, analytic/BI systems, and other read-intensive applications. It delivers true scale up (more CPU’s/cores, RAM) and massive parallel processing (MPP) scale out capabilities for MySQL users. Linear performance gains are achieved when adding either more capabilities to one box or using commodity machines in a scale out configuration. Scale up Scale Out
  • 19.
  • 20. Column vs. Row Orientation A column-oriented architecture looks the same on the surface, but stores data differently than legacy/row-based databases…
  • 21.
  • 22. InfiniDB Community – Scale Up InfiniDB Community edition is a FOSS, multi-threaded database server that is capable of using a machine’s CPUs/cores to process queries 87% 22.14 164.12 Q3.2 83% 55.04 316.79 Q3.1 87% 15.94 121.33 Q2.3 87% 19.70 151.20 Q2.2 79% 44.65 210.21 Q2.1 Overall Percent Reduction with additional cores InfiniDB 8 cores (elapsed time in seconds) InfiniDB 1 Core (elapsed time in seconds) SSB Query (@100 scale)
  • 23.
  • 24. InfiniDB Enterprise – Scale Up and Out User Connections User Module 1 User Module n Performance Module 1 Performance Module n Performance Module 2 Shared Storage Database files, System Catalog
  • 25.
  • 26. #4 Scale both I/O and User Connections Recommendation : Use modular architecture User Connections User Module 1 User Module n Performance Module 1 Performance Module n Performance Module 2 Shared Storage Database files, System Catalog Add more Performance Modules to scale I/O Add more User Modules to scale concurrency
  • 27.
  • 28. #5 Provide Transparent Expansion and Failover Cust_id 1-999 Cust_id 1000-1999 Cust_id 2000-2999 Sharding Architecture MySQL Replication Web/App Servers Browsers
  • 29. #5 Provide Transparent Expansion and Failover User Connections User Module 1 User Module n Performance Module 1 Performance Module n Performance Module 2 Shared Storage Database files, System Catalog If one Performance Module fails, traffic resumes with the remaining nodes User queries can be redirected to other User Modules if one fails
  • 30.
  • 31. #6 Load New Data with Minimal Impact OLTP Files/XML Log Files Operational Source Data Staging or ODS ETL High-speed Load Utility Ad-Hoc Dashboards Reports Notifications Users Staging Area Data Warehouse Data Warehouse and Metadata Management
  • 32.
  • 33. InfiniDB Extent Map – No Indexing Needed If a column WHERE filter of “COL1 BETWEEN 220 AND 250 AND COL2 < 10000” is specified, InfiniDB will eliminate extents 1, 2 and 4 from the first column filter, then, looking at just the matching extents for COL2 (i.e. just extent 3), it will determine that no extents match and return zero rows without doing any I/O at all. … Extent Map Also enables logical range partitioning of data… Ext 2 Min 101 Max 200 Ext 3 Min 201 Max 300 Ext 4 Min 301 Max 400 Col1 Ext 1 Min 1 Max 100 Ext 2 Min 10100 Max 20000 Ext 3 Min 20100 Max 30000 Ext 4 Min 30100 Max 40000 Col2 Ext 1 Min 100 Max 10000
  • 34. Summary Provides both diagnostic and tracing tools; no major design tuning efforts Use load testing and SQL analysis tools Method for troubleshooting poor read performance Has high-speed loader with no blocking and MVCC Use two-step ETL and bulk load process Load data with minimal impact Does transparent failover for I/O and manual for connectivity Use replication and load balancers Provide transparent expansion and failover Modular architecture for scaling both concurrency and I/O Application partition Scale concurrency and I/O Supports MPP scale out Spread load via replication or MPP Divide and Conquer Is multi-threaded and uses multiple CPUs / Cores Use DB’s/storage engines that are multi-threaded Exploit modern hardware Is column-oriented Use column database Only read the data you need InfiniDB General Technique Recommendation
  • 35. Calpont Solutions Calpont Analytic Database Server Editions Calpont Analytic Database Solutions InfiniDB Community Server Column-Oriented Multi-threaded Terabyte Capable Single Server InfiniDB Enterprise Server Scale out / Parallel Processing Automatic Failover InfiniDB Enterprise Solution Monitoring 24x7 Support Auto Patch Management Alerts & SNMP Notifications Hot Fix Builds Consultative Help
  • 36. InfiniDB Community & Enterprise Server Comparison Yes No Multi-Node, MPP scale out capable w/ failover Formal Production Support Forums Only Support Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes InfiniDB Community Yes INSERT/UPDATE/DELETE (DML) support Yes Transaction support (ACID compliant) Yes MySQL front end Yes Logical data compression Yes High-Speed bulk loader w/ no blocking queries while loading Yes Multi-threaded engine (queries/writes will use all CPU’s/cores on box) Yes Crash-recovery Yes Terabyte database capable Yes High concurrency supported Yes Alter Table with online add column capability Yes MVCC support – snapshot read (readers don’t block writers) Yes Automatic vertical (column) and logical horizontal partitioning of data Yes No indexing necessary Yes Column-oriented InfiniDB Enterprise Core Database Server Features
  • 37.
  • 38. Building High-Performance MySQL Query Systems and Analytic Applications Thanks…!