Shard-Query
AN MPP DATABASE FOR THE CLOUD USING THE
LAMP STACK
Introduction
Presenter
• Justin Swanhart
• Principal Support Engineer at Percona
• Previously a trainer and consultant at Percona too
Developer
• Swanhart-tools
• Shard-Query – MPP sharding middleware for MySQL
• Flexviews – Materialized views (fast refresh) for MySQL
• bcmath UDF – arbitrary precision math for MySQL
Intended Audience
• MySQL users with data too large to query efficiently using a single
machine
• Big Data
• Analytics / OLAP
• User generated content analysis
• People interested in distributed database processing
Terms
MPP – Massively Parallel Processing
• An MPP system is a system that can process a SQL statement in
parallel on a single machine or even many machines
• A collection of machines is often called a Grid
• MPP is also sometimes called Grid Computing
MPP (cont)
• Not many open source databases (none?) support MPP
• Community editions of closed source offerings are limited
• Some closed source databases include Vertica, Greenplum, Redshift
The Cloud
• Managed collection of virtual servers
• Easy to add servers on demand
• Ideal for a federated, distributed database grid
• Easy to “scale up” by moving to a VM with more cores
• Easy to “scale out” by adding machines
• Amazon is one of the most popular cloud environments
LAMP stack
• Linux
• Amazon Linux
• RHEL
• Ubuntu LTS, etc.
• Apache Web Server
• Most popular web server on the planet
• MySQL
• The world’s most popular open source database
• PHP
• High level language makes development easier
Database Middleware
• A piece of software that sits between an end-user application and
the database
• Operates on the queries submitted by the application, then
returns the results to the application
• Usually a proxy of some sort
• MySQL Proxy is an open source, user-configurable proxy for MySQL
• Supports Lua scripts which intercept queries
• Shard-Query can use MySQL Proxy out of the box
Message Queue / Job Server
• Accepts jobs or messages and places them in a queue
• A worker reads jobs/messages from the queue and acts on them
• Offers support for asynchronous jobs
• Gearman
• My job server of choice for PHP
• Has two different PHP interfaces (PEAR and PECL)
• SQ comes bundled with a modified version* of the PEAR interface
• Excellent integration with MySQL as well (UDF)
* Removes warnings triggered by modern PHP strict mode
Sharding
• Sharding is a shared-nothing approach to partitioning data
• Means splitting up your data onto more than one machine
• Tables that are split up are called sharded tables
• Lookup tables are not sharded. In other words, they must be
duplicated on all nodes
• Shard-Query supports directory based or hash based sharding
Shard mapper
• Shard-Query supports DIRECTORY and HASH mapping out of the
box
• DIRECTORY based sharding allows you to add or remove shards
from the system, but lookups may go over the network, reducing
performance* compared to HASH mapping
• HASH based sharding uses a hash algorithm to balance rows over
the sharded database. However, since a HASH algorithm is used,
the number of database shards can not change after initial data
loading.
* But only for queries like “select count(*) from table where customer_id = 50”
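The trade-off between the two mappers can be sketched in a few lines of Python. This is only an illustration: the function and class names, the use of CRC32, and the round-robin assignment of new keys are assumptions, not Shard-Query's actual implementation.

```python
import zlib


def hash_shard(shard_key, shard_count):
    """HASH mapping: the shard is computed from the key, so no lookup is needed."""
    return zlib.crc32(str(shard_key).encode("utf-8")) % shard_count


class DirectoryMapper:
    """DIRECTORY mapping: an explicit key -> shard table (hypothetical class)."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.directory = {}   # a real system would keep this in a database
        self.next_slot = 0

    def add_shard(self, name):
        # New shards can be added at any time; existing keys do not move.
        self.shards.append(name)

    def lookup(self, shard_key):
        if shard_key not in self.directory:
            # Assign unseen keys round-robin over the current shards.
            slot = self.next_slot % len(self.shards)
            self.directory[shard_key] = self.shards[slot]
            self.next_slot += 1
        return self.directory[shard_key]


# Changing the shard count under HASH mapping relocates existing keys.
moved = [k for k in range(1000) if hash_shard(k, 4) != hash_shard(k, 5)]

mapper = DirectoryMapper(["shard1", "shard2"])
before = mapper.lookup(50)
mapper.add_shard("shard3")
after = mapper.lookup(50)   # same shard as before the new shard was added
```

The `moved` list shows why the shard count is fixed under HASH mapping: most keys would land on a different shard if the modulus changed. The directory keeps each key where it was first placed, at the cost of a lookup.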
What is “big data”
Most machine generated data
• Line order information for a large organization like Wal-Mart™
• Any data so large that you can’t effectively operate on it on one
machine
• For example, an important query that needs to run daily executes in
greater than 24 hours. It is impossible to meet the daily goal unless
you can find a way to make the query execute faster.
• These kinds of problems can happen on relatively small amounts of
data (tens of gigabytes)
Analytics (OLAP) versus OLTP
• OLTP is focused on short lived small transactions that read or
write small amounts of data
• OLAP is focused on bulk loading and reading large amounts of
data in a single query.
• Aggregation queries are OLAP queries
• Shard-Query is designed for analytics (OLAP) not OLTP
• Must parse all commands sent to it (and make multiple round trips)
• Minimum query time of around 20ms
PROBLEM: Single
Threaded Queries
THE BIGGEST BOTTLENECK IN ANALYTICAL QUERIES
IS THE SPEED OF A SINGLE CORE
Single thread queries in the database
• MySQL, PostgreSQL, Firebird and all other major open source
databases have single threaded queries
• This means that a single query can only ever utilize the resources
of a single core
• As the data size grows, analytical queries get slower and slower
• Even in memory, speed decreases as the data grows because all of the
data is accessed by a single query
• As the number of rows to be examined increases, performance
decreases
Why single threaded
• MySQL is optimized for getting small amounts of data
quickly (OLTP)
• It was created at a time when having more than one CPU was not
common
• Adding parallelism now is a very complex task, particularly since
MySQL supports multiple storage engines
• So adding parallel query is not a high priority (not even on the
roadmap)
• Designed to run LOTS of small queries simultaneously, not one
big query
Single Threading – bad for IO
• If the data set is significantly larger than memory, single threaded
queries often cause the buffer pool to "churn"
• For example, small lookup tables can easily be pushed out of the buffer
pool, resulting in frequent IO to look up values
• While an SSD may help somewhat, one database thread cannot read
from an SSD at maximum device capacity
• While the disk may be capable of 1000+ MB/sec, a single thread is
generally limited to <100MB/sec (usually 30-40)
• This is because a single thread shares doing IO AND running the query
on one CPU (MySQL does not use read threads for queries)
The OLAP Example
• A large company maintains a star schema of their sales history for
analytics purposes
• This company likes to present a sum total of orders for all time on
the dashboard
• In the beginning the query is very fast
• It gets slower, though, as months of data are added and the business
grows
• Eventually the query takes more than 24 hours to run, which means it
can no longer be updated daily
• “Drill down” gets slower as data increases
What can be done?
• Caching?
• Materialized views?
• Partitioning?
• Sharding?
Making OLAP more like OLTP!
• Shard-Query breaks one big query up into smaller queries that can
access the database in parallel
• Partitioning and sharding are used to keep data size for any single
query to a minimum
• If your table has 16 partitions, you can get up to 16 way parallelism
• If you also have 2 nodes, you get 32 way parallelism, and so on
• You can use multiple database schemas on a single server instead (a
form of sharding) if you don’t partition your data
Shard-Query
ADDING PARALLELISM TO QUERIES
Sharding Reviewed
• A sharded database contains multiple nodes or databases called
shards
• One physical machine might host many shards
• Each shard has identical schema
Sharding Reviewed (cont)
• The multiple shards function together as one RDBMS system.
• You can think of the shards as a big UNION ALL of the data, with
only a portion of the data on any one machine
• A mechanism must control which server holds each particular piece
of data.
• In Shard-Query a particular column controls data placement – this
is called the shard key
Sharding – Data distribution
• There are usually one or two large tables that are sharded
• These are usually called FACT tables
• An example might be blogs, blog_posts and blog_comments. All three
share a “shard key” of blog_id
• Most common case is one big table with smaller lookup tables
Sharding Reviewed (cont)
• The shard key is very important!
• Since a specific column acts as the “shard key”, all sharded tables must
contain the shard key.
• For example: blog_id might be the shard key.
• The rows for a specific blog_id are then located on the same shard in
any table that has the blog_id column
Optimization - Shard Elimination
• When Shard-Query sees an expression on the shard key it looks up*
the shard that contains the appropriate data and only sends queries to
the necessary shards.
• Equality lookup is most efficient, but IN, BETWEEN and other operators are allowed
as well
• Lookups may not use subqueries (i.e., blog_id IN (1,2,3) is okay, but blog_id IN (select
…) is not)
• This is called “shard elimination”
• Shard elimination is analogous to partition elimination.
• where blog_id = 10, for example
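Shard elimination can be illustrated with a small Python sketch. It assumes a hash-style mapper over four shards; the shard names, the CRC32 mapping, and the helper functions are hypothetical and only show the idea of sending a query to the necessary shards.

```python
import zlib

ALL_SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key):
    # Hypothetical HASH mapper: CRC32 of the key modulo the shard count.
    index = zlib.crc32(str(key).encode("utf-8")) % len(ALL_SHARDS)
    return ALL_SHARDS[index]


def shards_to_query(shard_key_values=None):
    """Return the shards a query must visit.

    shard_key_values holds the literal values from an equality or IN
    predicate on the shard key, or None when there is no such predicate.
    """
    if shard_key_values is None:
        return list(ALL_SHARDS)   # no elimination possible
    return sorted({shard_for(v) for v in shard_key_values})


one = shards_to_query([10])          # WHERE blog_id = 10
some = shards_to_query([1, 2, 3])    # WHERE blog_id IN (1, 2, 3)
every = shards_to_query()            # no predicate on blog_id
```

An equality predicate always resolves to exactly one shard, an IN list to at most as many shards as it has values, and a query without a shard-key predicate goes to every shard.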
Can Shard-Query help on 1 machine?
• Yes! - Use MySQL partitioning on a single machine
• Shard-Query can access the partitions of a table in parallel!
• This means that if you have many partitions, then Shard-Query can
utilize many cores to answer the query
Use partitions for
parallelism
How does that work?
• Shard-Query executes an EXPLAIN PLAN on the query
• This EXPLAIN PLAN shows the partitions that MySQL will access
when running the query
• Shard-Query uses the 5.6 PARTITION hint to generate one query
per partition
• These queries can execute in parallel
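The rewrite step can be sketched as follows. This is a simplified illustration rather than Shard-Query's code: the function name is hypothetical, the partition names are hard-coded instead of being read from EXPLAIN output, and only the MySQL 5.6 PARTITION(...) table hint is taken from the slides.

```python
def partition_queries(select_list, table, partitions, where="1=1", group_by=None):
    """Build one SELECT per partition using the PARTITION(...) hint."""
    queries = []
    for name in partitions:
        sql = (f"SELECT {select_list} FROM {table} PARTITION({name}) "
               f"AS `{table}` WHERE {where}")
        if group_by:
            sql += f" GROUP BY {group_by}"
        queries.append(sql)
    return queries


# In practice the partition list would come from the EXPLAIN output.
queries = partition_queries("COUNT(*) AS cnt", "lineorder",
                            ["p0", "p1", "p2", "p3"])
for q in queries:
    print(q)
```

Each generated statement touches a single partition, so the four statements can be handed to four workers and run at the same time.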
Sharding can help too
• How?
• Shard-Query adds parallelism to queries by spreading them over nodes
in parallel
• Spread the data over four nodes and queries can run up to 4x faster
[Figure: Shard-Query and the MySQL database shards]
Sharding + Partitioning is best
• Why?
• Partition the tables to add parallelism to each node
• Use sharding to have multiple nodes working together
• 4 nodes with 3 partitions each = 12 way parallelism
[Figure: Shard-Query, the MySQL database shards, and their partitions]
Shard-Query
ARCHITECTURE
Configuration Repository
• Shard-Query stores all configuration information in a MySQL
database called the configuration repository
• This should be a highly available replication pair (or XtraDB
cluster) for HA
• Web interface can change the settings
• Manual settings changes can be done via SQL
• schemata_config table in Shard-Query repository
• Makes using Shard-Query easier, especially when using more than
one node
[Figure: Shard-Query Architecture. Layers: Interfaces (PHP OO, Apache Web Interface, MySQL Proxy); Communication (Gearman Message Queue); Workers; Storage (MySQL database shards); Configuration Management (Config Repository).]
Gearman job server
• Provides the parallel mechanism
for Shard-Query
• Multiple Gearman servers are
supported for HA
• Enables Shard-Query to use a
map/reduce like architecture
• Sends jobs to workers when they
arrive at the queue
• If all workers are busy the job
waits
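The queue-and-worker pattern described above can be sketched with Python's standard library. This does not use Gearman; `queue.Queue` and threads stand in for the job server and its workers, and summing a list stands in for running SQL on a shard.

```python
import queue
import threading

jobs = queue.Queue()
results = queue.Queue()


def worker():
    # Each worker blocks until a job is available, then processes it.
    while True:
        job = jobs.get()
        if job is None:           # sentinel: shut this worker down
            break
        shard, rows = job
        results.put((shard, sum(rows)))   # stand-in for "run SQL on a shard"


workers = [threading.Thread(target=worker) for _ in range(2)]
for t in workers:
    t.start()

# Four jobs, two workers: extra jobs wait in the queue until a worker is free.
for shard, rows in enumerate([[1, 2], [3, 4], [5, 6], [7, 8]]):
    jobs.put((shard, rows))
for _ in workers:
    jobs.put(None)
for t in workers:
    t.join()

# Collect the partial results ("map" output) and combine them ("reduce").
partials = {}
while not results.empty():
    shard, value = results.get()
    partials[shard] = value
total = sum(partials.values())
```

The caller does not care which worker ran which job; it only waits for all partial results and then combines them, which is the map/reduce-like shape the slides describe.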
[Figure: Gearman at a glance, showing Shard-Query OO, the store-resultset worker, the loader worker, and the SQ run SQL worker]
Three kinds of workers
• loader_worker – Listens for
loader jobs and executes them.
Used by parallel loader.
• shard_query_worker – Listens
for SQL jobs, runs the job via
Shard-Query and returns the
results as JSON. Used by web
and proxy interfaces.
• store_resultset_worker – Main
worker used by Shard-Query. It
runs SQL and stores the result
in a table.
PHP Object Oriented Interface
• Very simple to use
• Constructor parameters are
usually not even needed
• Just one function to run a SQL
query and get results back
• Complete example comes with
Shard-Query as:
bin/run_query
PHP OO Example (from bin/run_query):
// $sql holds the SQL text to run (not shown in this excerpt)
$shard_query = new ShardQuery();
$stime = microtime(true);
// Run the query
$stmt = $shard_query->query($sql);
$endtime = microtime(true);
// Errors are returned as a member of the object
if(!empty($shard_query->errors)) {
    echo "ERRORS RETURNED BY OPERATION:\n";
    print_r($shard_query->errors);
}
if(is_resource($stmt) || is_object($stmt)) {
    $count=0;
    // A simple data access layer (DAL) comes with Shard-Query
    while($row = $shard_query->DAL->my_fetch_assoc($stmt)) {
        print_r($row);
        ++$count;
    }
    echo "$count rows returned\n";
    $shard_query->DAL->my_free_result($stmt);
} else {
    if(!empty($shard_query->info)) print_r($shard_query->info);
    echo "no query results\n";
}
echo "Exec time: " . ($endtime - $stime) . "\n";
Apache web interface
• GUI
• Easy to set up
• Run queries and get results
• Serves as an example of using
Shard-Query in a web app with
asynchronous queries
• Submits queries via Gearman
• Simple HTTP authentication
MySQL Proxy Interface
• Lua script for MySQL Proxy
• Supports most SHOW
commands
• Intercepts queries, and sends
them to Shard-Query using the
MySQL Gearman UDF
• Serves as another example of
using Gearman to execute
queries.
• Behaves slightly differently than
MySQL for some commands
Shard-Query Data Flow
Map/reduce-like workflow:
• Query submitted
• SQL is parsed
• Query rewrite for parallelism yields multiple queries
• Gearman jobs (map/combine)
• Final aggregation (reduce)
• Return result
SQL Parser
• Find it at http://github.com/greenlion/php-sql-parser
• Supports
• SELECT/INSERT/UPDATE/DELETE
• REPLACE
• RENAME
• SHOW/SET
• DROP/CREATE INDEX/CREATE TABLE
• EXPLAIN/DESCRIBE
Used by SugarCRM too, as
well as other open source
projects.
Query Rewrite for parallelism
• Shard-Query has to manipulate the SQL statement so that it can
be executed over more than one partition or machine
• COUNT() turns into SUM of COUNTs from each query
• AVG turns into SUM and COUNT
• SEMI-JOIN is turned into a materialized join
• STDDEV/VARIANCE are rewritten as well, using the sum of squares
method
• Push down LIMIT when possible
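The aggregate rewrites listed above can be checked with a short Python sketch. The per-shard values are made up for illustration, and the population forms of variance and standard deviation are assumed (this matches MySQL's STDDEV and VARIANCE, which are population statistics).

```python
import math
import statistics

# Rows held by each shard (made-up values for illustration).
shards = [[2.0, 4.0, 4.0], [4.0, 5.0], [5.0, 7.0, 9.0]]


def map_shard(rows):
    # What each shard returns: COUNT(x), SUM(x), SUM(x*x).
    return len(rows), sum(rows), sum(x * x for x in rows)


partials = [map_shard(rows) for rows in shards]

# Reduce step: every partial is combined with SUM.
n = sum(p[0] for p in partials)      # COUNT -> SUM of COUNTs
s = sum(p[1] for p in partials)      # SUM   -> SUM of SUMs
ss = sum(p[2] for p in partials)     # sum of squares

avg = s / n                          # AVG = SUM / COUNT
variance = ss / n - avg * avg        # population variance
stddev = math.sqrt(variance)

# Cross-check against a single-node computation over all rows.
all_rows = [x for rows in shards for x in rows]
reference = statistics.pstdev(all_rows)
```

Averaging the per-shard averages would give the wrong answer when shards hold different row counts, which is why AVG is decomposed into SUM and COUNT. The sum of squares form can lose precision on data with a large mean and small spread, so it is a sketch of the idea rather than a numerically robust method.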
Query Rewrite for parallelism (cont)
• Because lookup tables are duplicated on all shards, the query
executes in a shared-nothing way
• All joins, filtering and aggregation are pushed down
• Means very little data must flow between nodes in most cases
• High performance
• Meets or beats Amazon Redshift in testing at 200GB of data
Map/Combine
• The store_resultset gearman worker runs SQL and stores the result
in a table
• To keep the number of rows in the table (and the time it takes to
aggregate results in the end) small, an INSERT … ON DUPLICATE
KEY UPDATE (ODKU) statement is used when inserting the rows
• There is a UNIQUE KEY over the GROUP BY attributes to facilitate
the upsert
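The combine step can be tried out with Python's built-in sqlite3 module. This is a stand-in, not what Shard-Query runs: SQLite's INSERT ... ON CONFLICT DO UPDATE (available in SQLite 3.24 and later) plays the role of MySQL's INSERT ... ON DUPLICATE KEY UPDATE, and the table name, column names, and per-partition rows are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# The result table has a unique key over the GROUP BY column.
db.execute(
    "CREATE TABLE aggregation_tmp (order_date INTEGER PRIMARY KEY, cnt INTEGER)"
)

# Per-partition results of:
#   SELECT order_date, COUNT(*) ... GROUP BY order_date
partition_results = [
    [(19930101, 10), (19930102, 5)],
    [(19930101, 7), (19930103, 2)],
    [(19930102, 1)],
]

for rows in partition_results:
    # Rows with a new key are inserted; rows with an existing key are merged.
    db.executemany(
        "INSERT INTO aggregation_tmp (order_date, cnt) VALUES (?, ?) "
        "ON CONFLICT(order_date) DO UPDATE SET cnt = cnt + excluded.cnt",
        rows,
    )

stored_rows = db.execute("SELECT COUNT(*) FROM aggregation_tmp").fetchone()[0]

# Final aggregation (reduce) over the already-combined table.
final = db.execute(
    "SELECT order_date, SUM(cnt) FROM aggregation_tmp "
    "GROUP BY order_date ORDER BY order_date"
).fetchall()
```

Five partial rows arrive but only three are stored, one per distinct group, so the final aggregation has less work to do.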
Final aggregation
• Shard-Query has to return a proper result, combining the results
in the result table together to return the correct answer
• Again, for example COUNT must be rewritten as SUM to combine
all the counts (from each shard) in the result table
• Aggregated result is returned to the client
Shard-Query Flow as SQL
[justin@localhost bin]$ ./run_query --verbose
select count(*) from lineorder;
Shard-Query optimizer messages:
SQL TO SEND TO SHARDS:
Array
(
[0] => SELECT COUNT(*) AS expr_2913896658
FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1
[1] => SELECT COUNT(*) AS expr_2913896658
FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1
[2] => SELECT COUNT(*) AS expr_2913896658
FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1
[3] => SELECT COUNT(*) AS expr_2913896658
FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1
)
SQL TO SEND TO COORDINATOR NODE:
SELECT SUM(expr_2913896658) AS ` count `
FROM `aggregation_tmp_58392079`
Array
(
[count ] => 0
)
1 rows returned
Exec time: 0.03083610534668
[Callouts: initial query; query rewrite / map; final aggregation / reduce; final result]
Map/Combine example
select LO_OrderDateKey, count(*) from lineorder group by LO_OrderDateKey;
Shard-Query optimizer messages:
* The following projections may be selected for a UNIQUE CHECK on the storage node operation:
expr$0
* storage node result set merge optimization enabled:
ON DUPLICATE KEY UPDATE
expr_2445085448=expr_2445085448 + VALUES(expr_2445085448)
SQL TO SEND TO SHARDS:
Array
(
[0] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448
FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1 GROUP BY expr$0
[1] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448
FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1 GROUP BY expr$0
[2] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448
FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1 GROUP BY expr$0
[3] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448
FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1 GROUP BY expr$0
)
SQL TO SEND TO COORDINATOR NODE:
SELECT expr$0 AS `LO_OrderDateKey`,SUM(expr_2445085448) AS ` count `
FROM `aggregation_tmp_12033903` GROUP BY expr$0
[Callouts: combine; reduce]
Use cases
Machine generated data
• Sensor readings
• Metrics
• Logs
• Any large table with short lookup tables
Star schemas are ideal
Call detail records
• Shard-Query is used in the billing system of a large cellular provider
• CDRs generate a lot of data
• Shard-Query includes a fast PERCENTILE function
Green energy meter processing
• High volume of data means sharding is necessary
• With Shard-Query, reporting is possible over all the shards,
making queries possible that would not work with Fabric or other
sharding solutions
• Used in India for reporting on a green power grid
Log analysis
• Performance logs from a web application for example
• Aggregate many different statistics and shard if log volumes are
high enough
• Search text logs with regular expressions
Performance
Star Schema Benchmark – SF 20
• 119 million rows of data (12GB)
• Infobright Community Database
• Only 1st query from each “flight” selected
• Unsharded compared to four shards (box has 4 CPUs - Amazon
m1.xlarge)
COLD
• MySQL – 35.39s
• Shard-Query – 11.62s
HOT
• MySQL – 10.99s
• Shard-Query – 2.95s
Query 1
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder join dim_date on lo_orderdatekey = d_datekey
where d_year = 1993
and lo_discount between 1 and 3
and lo_quantity < 25;
COLD
• MySQL – 34.24s
• Shard-Query – 12.74s
HOT
• MySQL – 12.74s
• Shard-Query – 3.26s
Query 2
select sum(lo_revenue), d_year, p_brand
from lineorder
join dim_date on lo_orderdatekey = d_datekey
join part on lo_partkey = p_partkey
join supplier on lo_suppkey = s_suppkey
where p_category = 'MFGR#12'
and s_region = 'AMERICA'
group by d_year, p_brand
order by d_year, p_brand;
COLD
• MySQL – 27.29s
• Shard-Query – 7.97s
HOT
• MySQL – 18.89s
• Shard-Query – 5.06s
Query 3
select c_nation, s_nation, d_year, sum(lo_revenue) as revenue
from customer join lineorder
on lo_custkey = c_customerkey
join supplier on lo_suppkey = s_suppkey
join dim_date on lo_orderdatekey = d_datekey
where c_region = 'ASIA'
and s_region = 'ASIA'
and d_year >= 1992 and d_year <= 1997
group by c_nation, s_nation, d_year
order by d_year asc, revenue desc;
COLD
• MySQL – 23.02s
• Shard-Query – 8.48s
HOT
• MySQL – 14.77s
• Shard-Query – 4.29s
Query 4
select d_year, c_nation, sum(lo_revenue - lo_supplycost) as profit
from lineorder join dim_date on lo_orderdatekey = d_datekey
join customer on lo_custkey = c_customerkey
join supplier on lo_suppkey = s_suppkey
join part on lo_partkey = p_partkey
where c_region = 'AMERICA'
and s_region = 'AMERICA'
and (p_mfgr = 'MFGR#1'
or p_mfgr = 'MFGR#2')
group by d_year, c_nation
order by d_year, c_nation;
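The speedups implied by the timings above can be computed with a few lines of Python. The numbers are copied from the slides; pairing each COLD/HOT block with the query that follows it in this transcript is an assumption about the slide layout.

```python
# (MySQL seconds, Shard-Query seconds) as reported on the slides.
timings = {
    "Query 1": {"cold": (35.39, 11.62), "hot": (10.99, 2.95)},
    "Query 2": {"cold": (34.24, 12.74), "hot": (12.74, 3.26)},
    "Query 3": {"cold": (27.29, 7.97), "hot": (18.89, 5.06)},
    "Query 4": {"cold": (23.02, 8.48), "hot": (14.77, 4.29)},
}

# Speedup = unsharded MySQL time divided by Shard-Query time on four shards.
speedups = {
    query: {state: round(mysql / sq, 2) for state, (mysql, sq) in runs.items()}
    for query, runs in timings.items()
}

for query, runs in speedups.items():
    print(query, runs)
```

With four shards on a four-CPU box, every speedup stays below the theoretical 4x limit.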

More Related Content

What's hot

Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
markgrover
 
Cloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache HadoopCloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera, Inc.
 
Impala presentation
Impala presentationImpala presentation
Impala presentation
trihug
 
Cloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for HadoopCloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for Hadoop
Cloudera, Inc.
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 
Cloudera impala
Cloudera impalaCloudera impala
Cloudera impala
Swiss Big Data User Group
 
Citus Architecture: Extending Postgres to Build a Distributed Database
Citus Architecture: Extending Postgres to Build a Distributed DatabaseCitus Architecture: Extending Postgres to Build a Distributed Database
Citus Architecture: Extending Postgres to Build a Distributed Database
Ozgun Erdogan
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera, Inc.
 
Performance evaluation of cloudera impala (with Comparison to Hive)
Performance evaluation of cloudera impala (with Comparison to Hive)Performance evaluation of cloudera impala (with Comparison to Hive)
Performance evaluation of cloudera impala (with Comparison to Hive)Yukinori Suda
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoop
markgrover
 
Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop
Guy Harrison
 
Simple Works Best
 Simple Works Best Simple Works Best
Simple Works Best
EDB
 
A brave new world in mutable big data relational storage (Strata NYC 2017)
A brave new world in mutable big data  relational storage (Strata NYC 2017)A brave new world in mutable big data  relational storage (Strata NYC 2017)
A brave new world in mutable big data relational storage (Strata NYC 2017)
Todd Lipcon
 
Hadoop databases for oracle DBAs
Hadoop databases for oracle DBAsHadoop databases for oracle DBAs
Hadoop databases for oracle DBAs
Maxym Kharchenko
 
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Dataconomy Media
 
Cloudera Impala technical deep dive
Cloudera Impala technical deep diveCloudera Impala technical deep dive
Cloudera Impala technical deep dive
huguk
 
Etu Solution Day 2014 Track-D: 掌握Impala和Spark
Etu Solution Day 2014 Track-D: 掌握Impala和SparkEtu Solution Day 2014 Track-D: 掌握Impala和Spark
Etu Solution Day 2014 Track-D: 掌握Impala和Spark
James Chen
 
Moving Data Between Exadata and Hadoop
Moving Data Between Exadata and HadoopMoving Data Between Exadata and Hadoop
Moving Data Between Exadata and HadoopEnkitec
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databases
guestdfd1ec
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
larsgeorge
 

What's hot (20)

Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
 
Cloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache HadoopCloudera Impala: A Modern SQL Engine for Apache Hadoop
Cloudera Impala: A Modern SQL Engine for Apache Hadoop
 
Impala presentation
Impala presentationImpala presentation
Impala presentation
 
Cloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for HadoopCloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for Hadoop
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Cloudera impala
Cloudera impalaCloudera impala
Cloudera impala
 
Citus Architecture: Extending Postgres to Build a Distributed Database
Citus Architecture: Extending Postgres to Build a Distributed DatabaseCitus Architecture: Extending Postgres to Build a Distributed Database
Citus Architecture: Extending Postgres to Build a Distributed Database
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 
Performance evaluation of cloudera impala (with Comparison to Hive)
Performance evaluation of cloudera impala (with Comparison to Hive)Performance evaluation of cloudera impala (with Comparison to Hive)
Performance evaluation of cloudera impala (with Comparison to Hive)
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoop
 
Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop Hadoop and rdbms with sqoop
Hadoop and rdbms with sqoop
 
Simple Works Best
 Simple Works Best Simple Works Best
Simple Works Best
 
A brave new world in mutable big data relational storage (Strata NYC 2017)
A brave new world in mutable big data  relational storage (Strata NYC 2017)A brave new world in mutable big data  relational storage (Strata NYC 2017)
A brave new world in mutable big data relational storage (Strata NYC 2017)
 
Hadoop databases for oracle DBAs
Hadoop databases for oracle DBAsHadoop databases for oracle DBAs
Hadoop databases for oracle DBAs
 
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
 
Cloudera Impala technical deep dive
Cloudera Impala technical deep diveCloudera Impala technical deep dive
Cloudera Impala technical deep dive
 
Etu Solution Day 2014 Track-D: 掌握Impala和Spark
Etu Solution Day 2014 Track-D: 掌握Impala和SparkEtu Solution Day 2014 Track-D: 掌握Impala和Spark
Etu Solution Day 2014 Track-D: 掌握Impala和Spark
 
Moving Data Between Exadata and Hadoop
Moving Data Between Exadata and HadoopMoving Data Between Exadata and Hadoop
Moving Data Between Exadata and Hadoop
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databases
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
 

Similar to Shard-Query, an MPP database for the cloud using the LAMP stack

HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
larsgeorge
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
DataWorks Summit/Hadoop Summit
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
PritamKathar
 
No sql databases
No sql databasesNo sql databases
No sql databases
swathika rajan
 
Comparative study of modern databases
Comparative study of modern databasesComparative study of modern databases
Comparative study of modern databases
Anirban Konar
 
Elasticsearch - Scalability and Multitenancy
Elasticsearch - Scalability and MultitenancyElasticsearch - Scalability and Multitenancy
Elasticsearch - Scalability and Multitenancy
Bozhidar Bozhanov
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
Rahul Borate
 
Data warehouse 26 exploiting parallel technologies
Data warehouse  26 exploiting parallel technologiesData warehouse  26 exploiting parallel technologies
Data warehouse 26 exploiting parallel technologies
Vaibhav Khanna
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
Andriy Zabavskyy
 
Apache drill
Apache drillApache drill
Apache drill
MapR Technologies
 
Breaking data
Breaking dataBreaking data
Breaking data
Terry Bunio
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data Analytics
Keeyong Han
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
Rahul Borate
 
Intro to Big Data
Intro to Big DataIntro to Big Data
Intro to Big Data
Zohar Elkayam
 
Redshift deep dive
Redshift deep diveRedshift deep dive
Redshift deep dive
Amazon Web Services LATAM
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics Platform
N Masahiro
 
Massively sharded my sql at tumblr presentation
Massively sharded my sql at tumblr presentationMassively sharded my sql at tumblr presentation
Massively sharded my sql at tumblr presentation
kriptonium
 
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
raghdooosh
 

Shard-Query, an MPP database for the cloud using the LAMP stack

  • 1. Shard-Query AN MPP DATABASE FOR THE CLOUD USING THE LAMP STACK
  • 2. Introduction Presenter • Justin Swanhart • Principal Support Engineer at Percona • Previously a trainer and consultant at Percona too Developer • Swanhart-tools • Shard-Query – MPP sharding middleware for MySQL • Flexviews – Materialized views (fast refresh) for MySQL • bcmath UDF – arbitrary precision math for MySQL
  • 3. Intended Audience • MySQL users with data too large to query efficiently using a single machine • Big Data • Analytics / OLAP • User generated content analysis • People interested in distributed database processing
  • 5. MPP – Massively Parallel Processing • An MPP system is a system that can process a SQL statement in parallel on a single machine or even many machines • A collection of machines is often called a Grid • MPP is also sometimes called Grid Computing
  • 6. MPP (cont) • Not many open source databases (none?) support MPP • Community editions of closed source offerings are limited • Some closed source databases include Vertica, Greenplum, Redshift
  • 7. The Cloud • Managed collection of virtual servers • Easy to add servers on demand • Ideal for a federated, distributed database grid • Easy to “scale up” by moving to a VM with more cores • Easy to “scale out” by adding machines • Amazon is one of the most popular cloud environments
  • 8. LAMP stack • Linux • Amazon Linux • RHEL • Ubuntu LTS, etc. • Apache Web Server • Most popular web server on the planet • MySQL • The world’s most popular open source database • PHP • High level language makes development easier
  • 9. Database Middleware • A piece of software that sits between an end-user application and the database • Operates on the queries submitted by the application, then returns the results to the application • Usually a proxy of some sort • MySQL proxy is the open source user configurable proxy for MySQL • Supports Lua scripts which intercept queries • Shard-Query can use MySQL Proxy out of the box
  • 10. Message Queue / Job Server • Accepts jobs or messages and places them in a queue • A worker reads jobs/messages from the queue and acts on them • Offers support for asynchronous jobs • Gearman • My job server of choice for PHP • Has two different PHP interfaces (PEAR and PECL) • SQ comes bundled with a modified version of the PEAR interface (the modification removes warnings triggered by modern PHP strict mode) • Excellent integration with MySQL as well (UDF)
  • 11. Sharding • A shared-nothing approach to distributing data • Means splitting up your data onto more than one machine • Tables that are split up are called sharded tables • Lookup tables are not sharded; they must be duplicated on all nodes • Shard-Query supports directory-based or hash-based sharding
  • 12. Shard mapper • Shard-Query supports DIRECTORY and HASH mapping out of the box • DIRECTORY-based sharding allows you to add or remove shards from the system, but lookups may go over the network, reducing performance compared to HASH mapping (but only for queries like “select count(*) from table where customer_id = 50”) • HASH-based sharding uses a hash algorithm to balance rows over the sharded database. However, since a hash algorithm is used, the number of database shards cannot change after initial data loading.
  • 13. What is “big data”? • Most machine generated data • Line order information for a large organization like Wal-Mart™ • Any data so large that you can’t effectively operate on it on one machine • For example, an important query that needs to run daily takes more than 24 hours to execute. It is impossible to meet the daily goal unless you can find a way to make the query execute faster. • These kinds of problems can happen on relatively small amounts of data (tens of gigabytes)
  • 14. Analytics (OLAP) versus OLTP • OLTP is focused on short-lived small transactions that read or write small amounts of data • OLAP is focused on bulk loading and reading large amounts of data in a single query • Aggregation queries are OLAP queries • Shard-Query is designed for analytics (OLAP), not OLTP • It must parse all commands sent to it (and make multiple round trips) • Minimum query time of around 20ms
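The trade-off between the two mappers can be sketched in a few lines of Python. This is an illustration only, not Shard-Query's implementation: the function and class names, the use of CRC32, and the "assign new keys to the least-loaded shard" policy are all assumptions made for the example.

```python
import zlib
from collections import Counter


def hash_shard(shard_key, num_shards):
    """HASH mapping: the shard is computed from the key, with no lookup.

    Changing num_shards changes the result for most keys, which is why
    the shard count is fixed once data has been loaded.
    """
    return zlib.crc32(str(shard_key).encode()) % num_shards


class DirectoryMapper:
    """DIRECTORY mapping: an explicit key -> shard table (hypothetical).

    In a real deployment the directory lives in a database, so a lookup
    may cost a network round trip; here it is just a dict.
    """

    def __init__(self, shards):
        self.shards = list(shards)
        self.directory = {}

    def add_shard(self, name):
        # Existing keys keep their shard; only new keys can land here.
        self.shards.append(name)

    def lookup(self, shard_key):
        if shard_key not in self.directory:
            # Assumed policy: place a new key on the least-loaded shard.
            load = Counter(self.directory.values())
            self.directory[shard_key] = min(self.shards, key=lambda s: load[s])
        return self.directory[shard_key]


if __name__ == "__main__":
    mapper = DirectoryMapper(["shard1", "shard2"])
    print(mapper.lookup(50), mapper.lookup(51))
    mapper.add_shard("shard3")
    print(mapper.lookup(52), mapper.lookup(50))
    print(hash_shard(50, 4))
```

The directory version can grow because existing keys never move, while the hash version needs no lookup at all.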
  • 15. PROBLEM: Single Threaded Queries THE BIGGEST BOTTLENECK IN ANALYTICAL QUERIES IS THE SPEED OF A SINGLE CORE
  • 16. Single-threaded queries in the database • MySQL, PostgreSQL, Firebird and all other major open source databases have single-threaded queries • This means that a single query can only ever utilize the resources of a single core • As the data size grows, analytical queries get slower and slower • Even when the data fits in memory, speed decreases as the data grows, because a single query processes all of it on one core • As the number of rows to be examined increases, performance decreases
  • 17. Why single threaded • MySQL is optimized for getting small amounts of data quickly (OLTP) • It was created at a time when having more than one CPU was not common • Adding parallelism now is a very complex task, particularly since MySQL supports multiple storage engines • So adding parallel query is not a high priority (not even on the roadmap) • Designed to run LOTS of small queries simultaneously, not one big query
  • 18. Single Threading: bad for IO • If the data set is significantly larger than memory, single-threaded queries often cause the buffer pool to “churn” • For example, small lookup tables can easily be pushed out of the buffer pool, resulting in frequent IO to look up values • While an SSD may help somewhat, one database thread cannot read from an SSD at maximum device capacity • While the disk may be capable of 1000+ MB/sec, a single thread is generally limited to <100MB/sec (usually 30-40) • This is because a single thread both does the IO AND runs the query on one CPU (MySQL does not use read threads for queries)
  • 19. The OLAP Example • A large company maintains a star schema of their sales history for analytics purposes • This company likes to present a sum total of orders for all time on the dashboard • In the beginning the query is very fast • It gets slower as months of data are added and as the business grows • Eventually the query takes more than 24 hours to run, which means it can no longer be updated daily • “Drill down” gets slower as data increases
  • 20. What can be done? • Caching? • Materialized views? • Partitioning? • Sharding?
  • 21. Making OLAP more like OLTP! • Shard-Query breaks one big query up into smaller queries that can access the database in parallel • Partitioning and sharding are used to keep data size for any single query to a minimum • If your table has 16 partitions, you can get up to 16-way parallelism • If you also have 2 nodes, you get 32-way parallelism, and so on • If you don’t partition your data, you can use multiple database schemas on a single server instead (a form of sharding)
  • 23. Sharding Reviewed • A sharded database contains multiple nodes or databases called shards • One physical machine might host many shards • Each shard has identical schema
  • 24. Sharding Reviewed (cont) • The multiple shards function together as one RDBMS • You can think of the shards as a big UNION ALL of the data, with only a portion of the data on any one machine • A mechanism must control which server holds each piece of data • In Shard-Query a particular column controls data placement; this column is called the shard key
  • 25. Sharding – Data distribution • There are usually one or two large tables that are sharded • These are usually called FACT tables • An example might be blogs, blog_posts and blog_comments. All three share a “shard key” of blog_id • Most common case is one big table with smaller lookup tables
  • 26. Sharding Reviewed (cont) • The shard key is very important! • Since a specific column acts as the “shard key”, all sharded tables must contain the shard key. • For example: blog_id might be the shard key. • The rows for a specific blog_id are then located on the same shard in any table that has the blog_id column
  • 27. Optimization - Shard Elimination • When Shard-Query sees an expression on the shard key (where blog_id = 10, for example), it looks up the shard that contains the appropriate data and only sends queries to the necessary shards • Equality lookup is most efficient, but IN, BETWEEN and other operators are allowed as well • Lookups may not use subqueries (i.e., blog_id IN (1,2,3) is okay, but blog_id IN (select …) is not) • This is called “shard elimination” • Shard elimination is analogous to partition elimination
  • 28. Can Shard-Query help on 1 machine? • Yes! Use MySQL partitioning on a single machine • Shard-Query can access the partitions of a table in parallel! • This means that if you have many partitions, then Shard-Query can utilize many cores to answer the query • Use partitions for parallelism
  • 29. How does that work? • Shard-Query runs EXPLAIN on the query • The EXPLAIN output shows the partitions that MySQL will access when running the query • Shard-Query uses the MySQL 5.6 PARTITION clause (explicit partition selection) to generate one query per partition • These queries can execute in parallel
  • 30. Sharding can help too • How? • Shard-Query adds parallelism to queries by spreading them over nodes • Spread the data over four nodes and queries can run up to 4x faster (diagram: Shard-Query in front of the MySQL database shards)
  • 31. Sharding + Partitioning is best • Why? • Partition the tables to add parallelism to each node • Use sharding to have multiple nodes working together • 4 nodes with 3 partitions each = 12-way parallelism (diagram: Shard-Query in front of the MySQL database shards, each split into partitions)
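A rough Python sketch of shard elimination follows. It is not Shard-Query's code: the function name, the `(operator, operands)` predicate shape, and the `shard_of` callback are hypothetical, and BETWEEN is handled by expanding an integer range purely for illustration.

```python
def shards_for_predicate(op, operands, shard_of, all_shards):
    """Return the shards a query must be sent to.

    op / operands describe a predicate on the shard key, e.g.
    ("=", [10]), ("IN", [1, 2, 3]) or ("BETWEEN", (8, 9)).
    shard_of maps one shard-key value to a shard name.
    Any other predicate (or none at all) falls back to every shard.
    """
    if op == "=":
        keys = [operands[0]]
    elif op == "IN":
        keys = list(operands)
    elif op == "BETWEEN":
        low, high = operands
        keys = range(low, high + 1)  # assumes small integer ranges
    else:
        return sorted(all_shards)
    return sorted({shard_of(k) for k in keys})


if __name__ == "__main__":
    shards = ["shard0", "shard1", "shard2", "shard3"]

    def lookup(blog_id):
        # Toy mapping: blog_id modulo the number of shards.
        return "shard%d" % (blog_id % 4)

    print(shards_for_predicate("=", [10], lookup, shards))
    print(shards_for_predicate("IN", [1, 2, 3], lookup, shards))
    print(shards_for_predicate(None, (), lookup, shards))
```

An equality predicate touches exactly one shard, which is why it is the cheapest case; a query with no usable predicate on the shard key has to go to all of them.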
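The per-partition fan-out can be sketched as below. The helper name is hypothetical, and the partition list is passed in directly rather than read from the database; the statement shape loosely follows the verbose output shown elsewhere in this deck.

```python
def per_partition_queries(table, select_list, partitions, where="1=1"):
    """Build one SELECT per partition using MySQL's PARTITION(...) clause.

    In practice the partition names would come from inspecting the query
    plan; here they are simply supplied by the caller.
    """
    return [
        "SELECT {cols} FROM {t} PARTITION({p}) AS `{t}` WHERE {w}".format(
            cols=select_list, t=table, p=partition, w=where
        )
        for partition in partitions
    ]


if __name__ == "__main__":
    queries = per_partition_queries(
        "lineorder", "COUNT(*) AS cnt", ["p0", "p1", "p2", "p3"]
    )
    for q in queries:
        print(q)
```

Each generated statement reads a single partition, so the statements are independent and can be handed to separate workers.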
  • 33. Configuration Repository • Shard-Query stores all configuration information in a MySQL database called the configuration repository • This should be a highly available replication pair (or XtraDB cluster) for HA • Web interface can change the settings • Manual settings changes can be done via SQL • schemata_config table in Shard-Query repository • Makes using Shard-Query easier, especially when using more than one node
  • 34. Shard-Query Architecture (diagram) • Interfaces: PHP OO, Apache Web Interface, MySQL Proxy • Communication: Gearman Message Queue • Workers: four workers • Storage: MySQL database shards • Configuration Management: Config Repository
  • 35. Shard-Query Architecture: Gearman job server • Provides the parallel mechanism for Shard-Query • Multiple Gearman servers are supported for HA • Enables Shard-Query to use a map/reduce-like architecture • Sends jobs to workers when they arrive at the queue • If all workers are busy the job waits
  • 36. Gearman at a glance (diagram labels: Shard-Query OO, Store-resultset, Loader worker, SQ run SQL worker)
  • 37. Shard-Query Architecture: Three kinds of workers • loader_worker: listens for loader jobs and executes them. Used by the parallel loader. • shard_query_worker: listens for SQL jobs, runs the job via Shard-Query and returns the results as JSON. Used by the web and proxy interfaces. • store_resultset_worker: the main worker used by Shard-Query. It runs SQL and stores the result in a table.
  • 38. Shard-Query Architecture: PHP Object Oriented Interface • Very simple to use • Constructor parameters are usually not even needed • Just one function to run a SQL query and get results back • A complete example comes with Shard-Query as bin/run_query
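The queue-and-workers pattern described on this slide can be illustrated with Python's standard library. This is a stand-in for the idea only: it uses an in-process `queue.Queue` and threads, not Gearman or its API, and `run_jobs` is a hypothetical helper.

```python
import queue
import threading


def run_jobs(jobs, handler, num_workers=4):
    """Push jobs onto a queue and let a fixed pool of workers drain it.

    A job waits in the queue while all workers are busy, mirroring the
    behaviour described for the job server.
    """
    job_queue = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()
            if job is None:  # sentinel: no more work for this worker
                job_queue.task_done()
                return
            outcome = handler(job)
            with lock:
                results.append(outcome)
            job_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        job_queue.put(job)
    for _ in threads:
        job_queue.put(None)
    job_queue.join()
    for t in threads:
        t.join()
    return results  # completion order, not submission order


if __name__ == "__main__":
    print(sorted(run_jobs(range(10), lambda x: x * x, num_workers=3)))
```

Results come back in completion order, so a caller that needs a deterministic order has to sort or key them itself.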
  • 39. PHP OO Example (from bin/run_query):
        // $sql holds the SQL statement to run
        $shard_query = new ShardQuery();
        $stime = microtime(true);
        // Run the query
        $stmt = $shard_query->query($sql);
        $endtime = microtime(true);
        // Errors are returned as a member of the object
        if(!empty($shard_query->errors)) {
          echo "ERRORS RETURNED BY OPERATION:\n";
          print_r($shard_query->errors);
        }
        if(is_resource($stmt) || is_object($stmt)) {
          $count = 0;
          // A simple data access layer (DAL) comes with Shard-Query
          while($row = $shard_query->DAL->my_fetch_assoc($stmt)) {
            print_r($row);
            ++$count;
          }
          echo "$count rows returned\n";
          $shard_query->DAL->my_free_result($stmt);
        } else {
          if(!empty($shard_query->info)) print_r($shard_query->info);
          echo "no query results\n";
        }
        echo "Exec time: " . ($endtime - $stime) . "\n";
  • 40. Shard-Query Architecture: Apache web interface • GUI • Easy to set up • Run queries and get results • Serves as an example of using Shard-Query in a web app with asynchronous queries • Submits queries via Gearman • Simple HTTP authentication
  • 41. Shard-Query Architecture: MySQL Proxy Interface • Lua script for MySQL Proxy • Supports most SHOW commands • Intercepts queries and sends them to Shard-Query using the MySQL Gearman UDF • Serves as another example of using Gearman to execute queries • Behaves slightly differently than MySQL for some commands
  • 42. Shard-Query Data Flow (map/reduce-like workflow): Query submitted → SQL is parsed → Query rewrite for parallelism yields multiple queries → Gearman jobs (map/combine) → Final aggregation (reduce) → Return result
  • 44. SQL Parser • Find it at http://github.com/greenlion/php-sql-parser • Supports: SELECT/INSERT/UPDATE/DELETE, REPLACE, RENAME, SHOW/SET, DROP/CREATE INDEX/CREATE TABLE, EXPLAIN/DESCRIBE • Used by SugarCRM too, as well as other open source projects
  • 46. Query Rewrite for parallelism • Shard-Query has to manipulate the SQL statement so that it can be executed over more than one partition or machine • COUNT() turns into a SUM of the COUNTs from each query • AVG turns into SUM and COUNT • SEMI-JOIN is turned into a materialized join • STDDEV/VARIANCE are rewritten as well, using the sum of squares method • LIMIT is pushed down when possible
  • 47. Query Rewrite for parallelism (cont) • Because lookup tables are duplicated on all shards, the query executes in a shared-nothing way • All joins, filtering and aggregation are pushed down • This means very little data must flow between nodes in most cases • High performance • Meets or beats Amazon Redshift in testing at 200GB of data
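The aggregate rewrites listed above rely on the fact that COUNT, SUM and a sum of squares can be computed per shard and added together afterwards. A minimal Python sketch of that arithmetic, assuming population variance and at least one input row (the function names are illustrative, not Shard-Query's):

```python
import math


def partial_aggregate(values):
    """What each shard computes: (COUNT, SUM, SUM of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))


def combine(partials):
    """Merge per-shard partials into the final aggregates.

    COUNT   -> sum of the per-shard counts
    AVG     -> total SUM / total COUNT
    VARIANCE (population) -> E[x^2] - (E[x])^2 via the sum of squares
    """
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / n
    variance = total_sq / n - mean * mean
    return {
        "count": n,
        "avg": mean,
        "variance": variance,
        # Guard against tiny negative values caused by rounding.
        "stddev": math.sqrt(max(variance, 0.0)),
    }


if __name__ == "__main__":
    shard_a = partial_aggregate([1, 2])
    shard_b = partial_aggregate([3, 4, 5])
    print(combine([shard_a, shard_b]))
```

Averaging the per-shard averages would give the wrong answer whenever shards hold different numbers of rows, which is why AVG is shipped as SUM and COUNT instead.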
  • 49. Map/Combine • The store_resultset gearman worker runs SQL and stores the result in a table • To keep the number of rows in the table (and the time it takes to aggregate results in the end) small, an INSERT … ON DUPLICATE KEY UPDATE (ODKU) statement is used when inserting the rows • There is a UNIQUE KEY over the GROUP BY attributes to facilitate the upsert
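The combine step behaves like an upsert keyed on the GROUP BY columns: the first row for a group is inserted, and later rows for the same group are added to it. The sketch below imitates that with a Python dict standing in for the result table; it illustrates the effect of INSERT … ON DUPLICATE KEY UPDATE and is not the SQL Shard-Query runs.

```python
def combine_into(result_table, shard_rows):
    """Merge one shard's (group_key, count) rows into the result table.

    The dict key plays the role of the UNIQUE KEY over the GROUP BY
    attributes: a new key is inserted, an existing key is updated in
    place, so the table holds one row per group no matter how many
    shards report.
    """
    for group_key, count in shard_rows:
        result_table[group_key] = result_table.get(group_key, 0) + count
    return result_table


if __name__ == "__main__":
    table = {}
    combine_into(table, [(19930101, 5), (19930102, 2)])  # rows from shard 1
    combine_into(table, [(19930101, 3)])                 # rows from shard 2
    print(table)
```

Because rows are merged as they arrive, the final aggregation only has to read one row per group rather than one row per group per shard.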
  • 51. Final aggregation • Shard-Query has to return a proper result, combining the results in the result table together to return the correct answer • Again, for example COUNT must be rewritten as SUM to combine all the counts (from each shard) in the result table • Aggregated result is returned to the client
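To make the COUNT-becomes-SUM point concrete, the sketch below loads per-shard counts into a temporary table and aggregates them. It uses Python's built-in sqlite3 module as a stand-in for the coordinator's MySQL instance; the table and column names are made up for the example.

```python
import sqlite3


def final_count(per_shard_counts):
    """Reduce step: SUM the partial counts stored in the result table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE aggregation_tmp (expr_count INTEGER)")
    conn.executemany(
        "INSERT INTO aggregation_tmp VALUES (?)",
        [(c,) for c in per_shard_counts],
    )
    # SUM, not COUNT: COUNT(*) here would return the number of shards
    # that reported, not the number of rows they counted.
    (total,) = conn.execute(
        "SELECT SUM(expr_count) FROM aggregation_tmp"
    ).fetchone()
    conn.close()
    return total


if __name__ == "__main__":
    print(final_count([10, 0, 7, 3]))
```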
  • 52. Shard-Query Flow as SQL
        Initial query:
          [justin@localhost bin]$ ./run_query --verbose
          select count(*) from lineorder;
        Query rewrite / map:
          Shard-Query optimizer messages:
          SQL TO SEND TO SHARDS:
          Array
          (
            [0] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1
            [1] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1
            [2] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1
            [3] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1
          )
        Final aggregation / reduce:
          SQL TO SEND TO COORDINATOR NODE:
          SELECT SUM(expr_2913896658) AS ` count ` FROM `aggregation_tmp_58392079`
        Final result:
          Array
          (
            [count ] => 0
          )
          1 rows returned
          Exec time: 0.03083610534668
  • 53. Map/Combine example
        select LO_OrderDateKey, count(*) from lineorder group by LO_OrderDateKey;
        Shard-Query optimizer messages:
          * The following projections may be selected for a UNIQUE CHECK on the storage node operation: expr$0
          * storage node result set merge optimization enabled (combine):
            ON DUPLICATE KEY UPDATE expr_2445085448=expr_2445085448 + VALUES(expr_2445085448)
        SQL TO SEND TO SHARDS:
          Array
          (
            [0] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1 GROUP BY expr$0
            [1] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1 GROUP BY expr$0
            [2] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1 GROUP BY expr$0
            [3] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1 GROUP BY expr$0
          )
        SQL TO SEND TO COORDINATOR NODE (reduce):
          SELECT expr$0 AS `LO_OrderDateKey`,SUM(expr_2445085448) AS ` count ` FROM `aggregation_tmp_12033903` GROUP BY expr$0
  • 55. Machine generated data • Sensor readings • Metrics • Logs • Any large table with short lookup tables • Star schemas are ideal
  • 56. Call detail records • Shard-Query is used in the billing system of a large cellular provider • CDRs generate a lot of data • Shard-Query includes a fast PERCENTILE function
  • 57. Green energy meter processing • High volume of data means sharding is necessary • With Shard-Query, reporting is possible over all the shards, making queries possible that would not work with Fabric or other sharding solutions • Used in India for reporting on a green power grid
  • 58. Log analysis • Performance logs from a web application for example • Aggregate many different statistics and shard if log volumes are high enough • Search text logs with regular expressions
  • 60. Star Schema Benchmark, SF 20 • 119 million rows of data (12GB) • Infobright Community Database • Only the 1st query from each “flight” selected • Unsharded compared to four shards (the box has 4 CPUs, Amazon m1.xlarge)
  • 61. Query 1 • COLD: MySQL 35.39s, Shard-Query 11.62s • HOT: MySQL 10.99s, Shard-Query 2.95s • select sum(lo_extendedprice*lo_discount) as revenue from lineorder join dim_date on lo_orderdatekey = d_datekey where d_year = 1993 and lo_discount between 1 and 3 and lo_quantity < 25;
  • 62. Query 2 • COLD: MySQL 34.24s, Shard-Query 12.74s • HOT: MySQL 12.74s, Shard-Query 3.26s • select sum(lo_revenue), d_year, p_brand from lineorder join dim_date on lo_orderdatekey = d_datekey join part on lo_partkey = p_partkey join supplier on lo_suppkey = s_suppkey where p_category = 'MFGR#12' and s_region = 'AMERICA' group by d_year, p_brand order by d_year, p_brand;
  • 63. Query 3 • COLD: MySQL 27.29s, Shard-Query 7.97s • HOT: MySQL 18.89s, Shard-Query 5.06s • select c_nation, s_nation, d_year, sum(lo_revenue) as revenue from customer join lineorder on lo_custkey = c_customerkey join supplier on lo_suppkey = s_suppkey join dim_date on lo_orderdatekey = d_datekey where c_region = 'ASIA' and s_region = 'ASIA' and d_year >= 1992 and d_year <= 1997 group by c_nation, s_nation, d_year order by d_year asc, revenue desc;
  • 64. Query 4 • COLD: MySQL 23.02s, Shard-Query 8.48s • HOT: MySQL 14.77s, Shard-Query 4.29s • select d_year, c_nation, sum(lo_revenue - lo_supplycost) as profit from lineorder join dim_date on lo_orderdatekey = d_datekey join customer on lo_custkey = c_customerkey join supplier on lo_suppkey = s_suppkey join part on lo_partkey = p_partkey where c_region = 'AMERICA' and s_region = 'AMERICA' and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2') group by d_year, c_nation order by d_year, c_nation;
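The timings for Queries 1 to 4 are easier to compare as speedup ratios (MySQL time divided by Shard-Query time). The short script below does that arithmetic on the figures quoted on the slides; the dictionary layout and function name are just for this example.

```python
# (MySQL seconds, Shard-Query seconds) as quoted on the benchmark slides.
TIMINGS = {
    "Query 1": {"cold": (35.39, 11.62), "hot": (10.99, 2.95)},
    "Query 2": {"cold": (34.24, 12.74), "hot": (12.74, 3.26)},
    "Query 3": {"cold": (27.29, 7.97), "hot": (18.89, 5.06)},
    "Query 4": {"cold": (23.02, 8.48), "hot": (14.77, 4.29)},
}


def speedups(timings):
    """Return MySQL time / Shard-Query time for every query and cache state."""
    return {
        query: {
            state: mysql_s / shard_query_s
            for state, (mysql_s, shard_query_s) in states.items()
        }
        for query, states in timings.items()
    }


if __name__ == "__main__":
    for query, ratios in speedups(TIMINGS).items():
        print(
            "%s: cold %.2fx, hot %.2fx" % (query, ratios["cold"], ratios["hot"])
        )
```

On these numbers every ratio falls between roughly 2.7x and 3.9x, which is in line with spreading the work over four shards on a 4-CPU box.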