Analytics in the Real World,
Case Studies and Use Cases
Amy Krishnamohan
Director of Product Marketing
GOTO Satoru
Customer Solutions Engineer
Finance
● Identify trade patterns
● Detect fraud and anomalies
● Predict trading outcomes
Manufacturing
● Simulations to improve design/yield
● Detect production anomalies
● Predict machine failures (sensor data)
Telecom
● Behavioral analysis of customer calls
● Network analysis (perf and reliability)
Healthcare
● Find genetic profiles/matches
● Analyze health vs spending
● Predict viral outbreaks
CIM Inc.
MariaDB AX Use Case
1. Find genetic
mates for cattle
2. Predict meat
production
3. Gene/DNA
analysis
Had to convert to CSV files and schedule
import jobs (cron)
Always receiving new genetic data
Migrated to data adapter (Python)
● streamline import process
● remove steps / possible error
● remove delays
● import data on demand
● immediate customer access
Life Science industry
Industry
biotechnology
(genetics)
Data
genotypes
Use Case
genetic profiling
Details
1. Identify trends
and patterns
2. Determine
population
cohorts
3. Predict health
outcomes
4. Anticipate
funding / capacity
5. Recommend
intervention
Can’t do complex queries on current hardware
with Oracle and snowflake schemas
Limited to optimizing for simple, known queries
(2-3 columns)
Replaced with ColumnStore
● a single table
● 2.5 million rows, 248 columns > complex,
ad-hoc queries
● query 20+ columns in seconds
Healthcare industry
Industry
healthcare
(Medicaid)
Data
surveys
Use Case
decision support
system
Details
1. Import log
2. Analyze customer
behavior
a. Website
click
b. Keyword
search
3. Optimize ad
performance
4. Manage dynamic
pricing based on
the KPI
Needs real-time analytics to optimize
advertisement
Replaced with ColumnStore
● fast data ingestion
● optimizes Ad performance
● A/B testing
● targets ads by geography and demographic
● provides automated monitoring
● adjusts traffic based on real-time performance
● manages dynamic pricing
Advertisement industry
Industry
Digital
Advertisement
Data
Log
Use Case
Ad Analytics
Details
1. Collect asset
tracking data
2. Analyze and
monitor
a. Contract
b. Performance
3. Proactive service
Needs to ingest text type data and integration
with BI tool
Replaced with ColumnStore
● faster data ingestion
● Time series analysis with Window
function
● real-time asset monitoring with Tableau
● predictive asset maintenance
High tech industry
Industry
High tech
Data
Asset tracking
time series data
Use Case
Asset
Management
Details
1. Receive sensor data
from different parts
2. Real-time
monitoring
3. Analyze historical
data to uncover
machine failure
pattern
4. Predict machine
failure
5. Schedule proactive
maintenance
Need real time data ingestion
Needs integration with Spark to run Machine
Learning algorithm
Replaced with ColumnStore
● faster data ingestion
● leverage Spark ML
● real-time monitoring
● reduce production downtime
Manufacturing industry
Industry
Manufacturing
/Automobile
Data
Sensor data
Use Case
Predictive
Maintenance
Details
1. Collect asset
tracking data
2. Analyze and
monitor
a. Contract
b. Performance
3. Proactive service
Needs big data analytics solution to analyze
over 25 million quote records and 100,000
trading records per day
Replaced with ColumnStore
● archive large set of data to comply with
regulations
● provide self-service analytics to
sales/marketing team
● time series analysis with Window
function
Finance industry
Industry
Finance
Data
Trading records
Use Case
Trading analysis
Details
Time Series Data Analysis
with ColumnStore
Forex historical data
Free currency historical data from HistData.com
•GBPUSD M1 (1 minute) historical data in 2016
http://www.histdata.com/download-free-forex-historical-data/?/ascii/1-minute-bar-quotes/gbpusd/2016
•download HISTDATA_COM_ASCII_GBPUSD_M1_2016.zip
Free GBPUSD historical data (2016)
•1st column: timestamp
•need to convert the format in order to fit with DATETIME data type
MariaDB ColumnStore Data Types
• INT types - range is 2 less than the usual maximum (unsigned) or 2 more than the usual minimum (signed)
• CHAR - max 255 bytes
• VARCHAR - max 8000 bytes
• DECIMAL - max 18 digits
• DOUBLE/FLOAT
• DATETIME - no sub-seconds yyyy-mm-dd hh:mm:ss
• DATE
• BLOB/TEXT
Convert timestamp w/ Ruby script
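# Reads semicolon-separated rows whose first field is "YYYYMMDD HHMMSS"
# (as implied by the unpack pattern) and writes
# "id,YYYY-MM-DD HH:MM,open,high,low,close".
id = 0
while line = gets
  timestamp, open, high, low, close = line.chomp.split(";")
  year, month, day, hour, minute, second =
    timestamp.unpack("a4a2a2xa2a2a2")
  id += 1
  print "#{id},#{year}-#{month}-#{day} #{hour}:#{minute},"
  puts [open, high, low, close].join(",")
end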
Converted CSV
1,2016-01-03 17:00,1.473350,1.473350,1.473290,1.473290
2,2016-01-03 17:01,1.473280,1.473360,1.473260,1.473350
3,2016-01-03 17:02,1.473350,1.473350,1.473290,1.473290
4,2016-01-03 17:03,1.473300,1.473330,1.473290,1.473320
5,2016-01-03 17:04,1.473320,1.473340,1.473320,1.473320
6,2016-01-03 17:05,1.473340,1.473370,1.473300,1.473320
7,2016-01-03 17:06,1.473320,1.473320,1.473310,1.473310
8,2016-01-03 17:07,1.473310,1.473310,1.473300,1.473310
9,2016-01-03 17:08,1.473310,1.474010,1.473300,1.474010
• DATETIME - no sub-seconds yyyy-mm-dd hh:mm:ss
Populate sample data
CREATE DATABASE/TABLE
MariaDB [(none)]> create database forex;
MariaDB [(none)]> use forex;
MariaDB [forex]> CREATE TABLE gbpusd(
id int,
time datetime,
open double,
high double,
low double,
close double)
engine=columnstore default character set=utf8;
DESC TABLE
MariaDB [forex]> desc gbpusd;
+-----------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+----------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| time | datetime | YES | | NULL | |
| open | double | YES | | NULL | |
| high | double | YES | | NULL | |
| low | double | YES | | NULL | |
| close | double | YES | | NULL | |
+-----------+----------+------+-----+---------+-------+
import CSV into ColumnStore using cpimport
# cpimport -s ',' forex gbpusd gbpusd2016.csv
Locale is : C
Column delimiter : ,
Using table OID 3163 as the default JOB ID
Input file(s) will be read from : /home/vagrant/histdata
Job description file :
/usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T103843_S950145_Job_3163.xml
Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3163.log
2017-06-24 10:38:43 (29756) INFO : successfully loaded job file
/usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T103843_S950145_Job_3163.xml
2017-06-24 10:38:43 (29756) INFO : Job file loaded, run time for this step : 0.0321331 seconds
2017-06-24 10:38:43 (29756) INFO : PreProcessing check starts
2017-06-24 10:38:43 (29756) INFO : input data file /home/vagrant/histdata/gbpusd2016.csv
2017-06-24 10:38:43 (29756) INFO : PreProcessing check completed
2017-06-24 10:38:43 (29756) INFO : preProcess completed, run time for this step : 0.0329528 seconds
2017-06-24 10:38:43 (29756) INFO : No of Read Threads Spawned = 1
2017-06-24 10:38:43 (29756) INFO : No of Parse Threads Spawned = 3
2017-06-24 10:38:45 (29756) INFO : For table forex.gbpusd: 372,480 rows processed and 372480 rows inserted.
2017-06-24 10:38:46 (29756) INFO : Bulk load completed, total run time : 2.11976 seconds
DB table
if cpimport failed...
# cpimport forex gbpusd gbpusd2016.csv
Locale is : C
Using table OID 3163 as the default JOB ID
Input file(s) will be read from : /home/vagrant/histdata
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T104034_S269473_Job_3163.xml
Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3163.log
2017-06-24 10:40:34 (30209) INFO : successfully loaded job file
/usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T104034_S269473_Job_3163.xml
2017-06-24 10:40:34 (30209) INFO : Job file loaded, run time for this step : 0.0253589 seconds
2017-06-24 10:40:34 (30209) INFO : PreProcessing check starts
2017-06-24 10:40:34 (30209) INFO : input data file /home/vagrant/histdata/gbpusd2016.csv
2017-06-24 10:40:34 (30209) INFO : PreProcessing check completed
2017-06-24 10:40:34 (30209) INFO : preProcess completed, run time for this step : 0.065531 seconds
2017-06-24 10:40:34 (30209) INFO : No of Read Threads Spawned = 1
2017-06-24 10:40:34 (30209) INFO : No of Parse Threads Spawned = 3
2017-06-24 10:40:34 (30209) INFO : Number of rows with errors = 11. Row numbers with error reasons are listed in file
/home/vagrant/histdata/gbpusd2016.csv.Job_3163_30209.err
2017-06-24 10:40:34 (30209) INFO : Number of rows with errors = 11. Exact error rows are listed in file
/home/vagrant/histdata/gbpusd2016.csv.Job_3163_30209.bad
2017-06-24 10:40:34 (30209) ERR : Actual error row count(11) exceeds the max error rows(10) allowed for table forex.gbpusd [1451]
2017-06-24 10:40:34 (30209) CRIT : Bulkload Read (thread 0) Failed for Table forex.gbpusd. Terminating this job. [1451]
2017-06-24 10:40:34 (30209) INFO : Bulkload Parse (thread 2) Stopped parsing Tables. BulkLoad::parse() responding to job termination
2017-06-24 10:40:34 (30209) INFO : Bulkload Parse (thread 1) Stopped parsing Tables. BulkLoad::parse() responding to job termination
2017-06-24 10:40:34 (30209) INFO : Bulkload Parse (thread 0) Stopped parsing Tables. BulkLoad::parse() responding to job termination
2017-06-24 10:40:34 (30209) INFO : Table forex.gbpusd (OID-3163) was not successfully loaded. Rolling back.
2017-06-24 10:40:34 (30209) INFO : Bulk load completed, total run time : 0.638649 seconds
verify your Job_xxxx_xxxxx.err
gbpusd2016.csv.Job_3163_30209.err :
Line number 1; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 2; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 3; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 4; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 5; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 6; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 7; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 8; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 9; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 10; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
Line number 11; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
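The errors above are what a delimiter mismatch looks like: cpimport's default field delimiter is the pipe character (|), so without -s ',' each comma-separated line is read as a single field. A minimal Ruby sketch of a pre-flight check follows. The helper name field_count_errors and the two sample rows are illustrative assumptions and not part of cpimport.

```ruby
# Hypothetical pre-flight check (not a cpimport feature): list the lines whose
# field count differs from the number of table columns for a given delimiter.
def field_count_errors(lines, delimiter, expected)
  errors = []
  lines.each_with_index do |line, index|
    # A limit of -1 keeps trailing empty fields, so "a,b," counts as 3 fields.
    found = line.chomp.split(delimiter, -1).size
    errors << [index + 1, found] unless found == expected
  end
  errors
end

sample = [
  "1,2016-01-03 17:00,1.473350,1.473350,1.473290,1.473290\n",
  "2,2016-01-03 17:01,1.473280,1.473360,1.473260,1.473350\n"
]

# With the default '|' delimiter every line looks like one field.
p field_count_errors(sample, "|", 6)
# With the delimiter that matches the file, no line is flagged.
p field_count_errors(sample, ",", 6)
```

Each entry in the result is a pair of line number and number of fields found, the same two values the .err file reports.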
Bulk import performance
performance LOAD DATA LOCAL INFILE
# mcsmysql --local-infile=1 forex
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 38
Server version: 10.1.23-MariaDB Columnstore 1.0.9-1
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [forex]> LOAD DATA LOCAL INFILE 'gbpusd2016.csv' INTO TABLE gbpusd FIELDS
TERMINATED BY ',';
Query OK, 372480 rows affected (1.52 sec)
Records: 372480 Deleted: 0 Skipped: 0 Warnings: 0
Performance ColumnStore : cpimport
# cpimport -s ',' forex gbpusd gbpusd2016.csv
2017-06-24 10:38:45 (29756) INFO : For table forex.gbpusd: 372480
rows processed and 372480 rows inserted.
2017-06-24 10:38:46 (29756) INFO : Bulk load completed, total run
time : 2.11976 seconds
-s: field separator
cpimport
• 2 sec. for 372,480 rows
LOAD DATA LOCAL INFILE
• 372480 rows affected
(1.52 sec)
CSV import: cpimport vs. LOAD DATA LOCAL INFILE
Performance ColumnStore : INSERT INTO
INSERT INTO gbpusd_idb(id, time, open, high, low, close) VALUES('1',
'2016-01-03 17:00', '1.473350', '1.473350', '1.473290', '1.473290');
INSERT INTO gbpusd_idb(id, time, open, high, low, close) VALUES('2',
'2016-01-03 17:01', '1.473280', '1.473360', '1.473260', '1.473350');
INSERT INTO gbpusd_idb(id, time, open, high, low, close) VALUES('3',
'2016-01-03 17:02', '1.473350', '1.473350', '1.473290', '1.473290');
...
MariaDB [forex]> source gbpusd2016.sql
...
MariaDB [forex]> Bye
real 18m16.178s
user 0m28.330s
sys 0m23.551s
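To compare the three load paths on one scale, the timings shown in this section can be converted to rows per second. The sketch below only does that arithmetic. The figures are the single-run timings from the slides (the INSERT run uses the "real" time, which includes client overhead), so the result is an indication, not a benchmark.

```ruby
# Converts a row count and an elapsed time into a rounded rows-per-second rate.
def rows_per_second(rows, seconds)
  (rows / seconds.to_f).round
end

rows = 372_480
elapsed_seconds = {
  "cpimport"                        => 2.11976,
  "LOAD DATA LOCAL INFILE"          => 1.52,
  "INSERT INTO (one row per query)" => 18 * 60 + 16.178 # real 18m16.178s
}

elapsed_seconds.each do |method, seconds|
  puts format("%-34s %8d rows/s", method, rows_per_second(rows, seconds))
end
```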
Simply query w/ SQLPad
SQLPad - https://github.com/rickbergfalk/sqlpad
SQLPad installation / launch
# yum -y install npm (EPEL repository required)
# npm install sqlpad -g
$ sqlpad -ip 0.0.0.0 --port 3000
Launching server WITHOUT SSL
Welcome to SQLPad!. Visit http://localhost:3000 to get started
UK votes to leave
EU after dramatic
night divides nation
• https://www.theguardian.com/politics/2016/jun/24/britain-votes-for-brexit-eu-referendum-david-cameron
The value of the pound
swung wildly on currency
markets as initial
confidence among
investors expecting a
remain vote was dented
by some of the early
referendum results,
triggering falls of close
to 10% and its biggest
one-day fall ever.
simple query for time period before/after vote
Query Results Visualization
MariaDB ColumnStore
Window Functions
Supported Window Functions
Function Description
AVG() The average of all input values.
COUNT() Number of input rows.
CUME_DIST() Calculates the cumulative distribution, or relative rank, of the current row to other rows in
the same partition. Number of peer or preceding rows / number of rows in partition.
DENSE_RANK() Ranks items in a group leaving no gaps in ranking sequence when there are ties.
FIRST_VALUE() The value evaluated at the row that is the first row of the window frame (counting from 1);
null if no such row.
Supported Window Functions (cont’d)
Function Description
LAG() The value evaluated at the row that is offset rows before the current row within the
partition; if there is no such row, instead return default. Both offset and default are
evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.
LAG provides access to more than one row of a table at the same time without a self-join.
Given a series of rows returned from a query and a position of the cursor, LAG provides
access to a row at a given physical offset prior to that position.
LAST_VALUE() The value evaluated at the row that is the last row of the window frame (counting from 1);
null if no such row.
LEAD() Provides access to a row at a given physical offset beyond that position. Returns value
evaluated at the row that is offset rows after the current row within the partition; if there is
no such row, instead return default. Both offset and default are evaluated with respect to
the current row. If omitted, offset defaults to 1 and default to null.
MAX() Maximum value of expression across all input values.
Supported Window Functions (cont’d)
Function Description
MEDIAN() An inverse distribution function that assumes a continuous distribution model. It takes a
numeric or datetime value and returns the middle value or an interpolated value that
would be the middle value once the values are sorted. Nulls are ignored in the
calculation.
MIN() Minimum value of expression across all input values.
NTH_VALUE() The value evaluated at the row that is the nth row of the window frame (counting from
1); null if no such row.
NTILE() Divides an ordered data set into a number of buckets indicated by expr and assigns the
appropriate bucket number to each row. The buckets are numbered 1 through expr. The
expr value must resolve to a positive constant for each partition. Integer ranging from 1
to the argument value, dividing the partition as equally as possible.
PERCENT_RANK() relative rank of the current row: (rank - 1) / (total rows - 1).
Supported Window Functions (cont’d)
Function Description
PERCENTILE_CONT() An inverse distribution function that assumes a continuous distribution model. It
takes a percentile value and a sort specification, and returns an interpolated value
that would fall into that percentile value with respect to the sort specification. Nulls
are ignored in the calculation.
PERCENTILE_DISC() An inverse distribution function that assumes a discrete distribution model. It takes a
percentile value and a sort specification and returns an element from the set. Nulls
are ignored in the calculation.
RANK() rank of the current row with gaps; same as row_number of its first peer.
ROW_NUMBER() number of the current row within its partition, counting from 1
STDDEV()
STDDEV_POP()
Computes the population standard deviation and returns the square root of the
population variance.
Supported Window Functions (cont’d)
Function Description
STDDEV_SAMP() Computes the cumulative sample standard deviation and returns the square root of
the sample variance.
SUM() Sum of expression across all input values.
VARIANCE()
VAR_POP()
Population variance of the input values (square of the population standard
deviation).
VAR_SAMP() Sample variance of the input values (square of the sample standard deviation).
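The difference between ROW_NUMBER(), RANK() and DENSE_RANK() in the tables above is easiest to see with ties. The following Ruby sketch emulates the three functions over a single partition ordered by value, descending. The input values are made up for illustration, and the code is not how ColumnStore implements them.

```ruby
# Emulates ROW_NUMBER(), RANK() and DENSE_RANK() for one partition,
# ordered by value descending.
def ranking(values)
  rank = 0
  dense = 0
  previous = nil
  values.sort.reverse.each_with_index.map do |value, index|
    if value != previous
      rank = index + 1 # RANK: row number of the first peer, so ties leave gaps
      dense += 1       # DENSE_RANK: advances by one per distinct value, no gaps
      previous = value
    end
    { value: value, row_number: index + 1, rank: rank, dense_rank: dense }
  end
end

ranking([7, 10, 8, 10, 7, 5]).each { |row| puts row }
```

For the sorted input 10, 10, 8, 7, 7, 5 the ranks are 1, 1, 3, 4, 4, 6 and the dense ranks are 1, 1, 2, 3, 3, 4.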
MariaDB ColumnStore
Aggregate Functions
MAX GBPUSD 23rd - 25th June 2016
MIN GBPUSD 23rd - 25th June 2016
Drop-off rate GBPUSD 23rd - 25th June 2016
Drop of about 13% in a few hours
Correlation GBPUSD - USDJPY
Correlation GBPUSD - USDJPY @ Brexit
Scatter plot (normalized axes: GBPUSD*100-130, USDJPY-110)
Correlation GBPUSD - USDJPY @ Brexit
SELECT
( AVG( gbpusd.close * usdjpy.close ) - AVG( gbpusd.close ) * AVG( usdjpy.close ) ) /
( STDDEV(gbpusd.close) * STDDEV(usdjpy.close) )
AS correlation_coefficient_population
FROM usdjpy
INNER JOIN gbpusd ON gbpusd.time = usdjpy.time
WHERE
gbpusd.time BETWEEN TIMESTAMP ( '2016-06-22' )
AND TIMESTAMP ( '2016-06-26' );
Pearson correlation coefficient
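The query computes the population form of the coefficient, (E[xy] - E[x]E[y]) / (sigma_x * sigma_y), which matches STDDEV() being the population standard deviation. A small Ruby sketch of the same formula is below. It is useful for checking the SQL against a handful of rows exported from the table. The two input series are made up and are not forex data.

```ruby
# Arithmetic mean of a list of numbers.
def mean(values)
  values.sum.to_f / values.size
end

# Population standard deviation (divides by n, not n - 1).
def population_stddev(values)
  m = mean(values)
  Math.sqrt(mean(values.map { |v| (v - m)**2 }))
end

# Population Pearson correlation, term by term as in the SQL:
# (AVG(x*y) - AVG(x)*AVG(y)) / (STDDEV(x) * STDDEV(y))
def pearson(xs, ys)
  raise ArgumentError, "series must have equal length" unless xs.size == ys.size
  covariance = mean(xs.zip(ys).map { |x, y| x * y }) - mean(xs) * mean(ys)
  covariance / (population_stddev(xs) * population_stddev(ys))
end

puts pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]) # perfectly correlated
puts pearson([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0]) # perfectly anti-correlated
```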
Correlation GBPUSD - USDJPY @ Brexit
Scatter plot (normalized axis: GBPUSD*100-130)
Correlation coefficient 94.4%: highly correlated
Correlation GBPUSD - USDJPY 2016 (Jan. - Dec.)
Scatter plot (normalized axes: GBPUSD*100-130, USDJPY-110)
Correlation coefficient 36%: low correlation
Performance - ColumnStore vs. InnoDB
ColumnStore storage engine:
+------------------------------------+
| correlation_coefficient_population |
+------------------------------------+
| 0.9648375371071727 |
+------------------------------------+
1 row in set (0.43 sec)
> 1000 times faster than InnoDB
SELECT
(AVG(gbpusd.close*usdjpy.close) - AVG(gbpusd.close)*AVG(usdjpy.close)) /
(STDDEV(gbpusd.close) * STDDEV(usdjpy.close))
AS correlation_coefficient_population
FROM gbpusd
JOIN usdjpy ON gbpusd.time = usdjpy.time
WHERE gbpusd.time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25');
InnoDB storage engine:
+------------------------------------+
| correlation_coefficient_population |
+------------------------------------+
| 0.964837537107134 |
+------------------------------------+
1 row in set (8 min 11.21 sec)
Moving Average w/ Window Functions
Moving Average GBPUSD
SELECT
time, close,
AVG(close) OVER (
ORDER BY time ASC
ROWS BETWEEN
6 PRECEDING AND
6 FOLLOWING ) AS MA13,
COUNT(close) OVER (
ORDER BY time ASC
ROWS BETWEEN
6 PRECEDING AND
6 FOLLOWING ) AS row_count
FROM gbpusd
WHERE time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25');
Moving Average GBPUSD
AVG(close) OVER (
ORDER BY time ASC
ROWS BETWEEN
6 PRECEDING AND
6 FOLLOWING )
AS MA13
time             close   MA13    row_count  frame of the 00:06 row
6/23/2016 00:00  1.4797  1.4797   7         preceding 6
6/23/2016 00:01  1.4798  1.4797   8         preceding 5
6/23/2016 00:02  1.4798  1.4796   9         preceding 4
6/23/2016 00:03  1.4797  1.4796  10         preceding 3
6/23/2016 00:04  1.4796  1.4796  11         preceding 2
6/23/2016 00:05  1.4796  1.4796  12         preceding 1
6/23/2016 00:06  1.4796  1.4796  13         current row
6/23/2016 00:07  1.4796  1.4796  13         following 1
6/23/2016 00:08  1.4796  1.4796  13         following 2
6/23/2016 00:09  1.4796  1.4796  13         following 3
6/23/2016 00:10  1.4796  1.4797  13         following 4
6/23/2016 00:11  1.4796  1.4797  13         following 5
6/23/2016 00:12  1.4797  1.4797  13         following 6
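The row counts 7, 8, ... 13 in the table come from the window frame being clipped at the start of the result set. A Ruby sketch of the same centered frame (ROWS BETWEEN 6 PRECEDING AND 6 FOLLOWING) shows the effect. The function name is an assumption for illustration, and the close prices are the 13 rounded values from the table.

```ruby
# Centered moving average over a frame of `preceding` rows before and
# `following` rows after the current row. The frame is clipped at both ends
# of the input, so edge rows average fewer values.
def centered_moving_average(values, preceding: 6, following: 6)
  values.each_index.map do |i|
    first = [i - preceding, 0].max
    last = [i + following, values.size - 1].min
    frame = values[first..last]
    { average: frame.sum / frame.size.to_f, row_count: frame.size }
  end
end

closes = [1.4797, 1.4798, 1.4798, 1.4797, 1.4796, 1.4796, 1.4796,
          1.4796, 1.4796, 1.4796, 1.4796, 1.4796, 1.4797]

centered_moving_average(closes).each do |row|
  puts format("%.4f %2d", row[:average], row[:row_count])
end
```

Because this sample stops after 13 rows, the counts also shrink again toward the end (13, 12, ... 7). In the query the later rows keep a full frame of 13 because more data follows.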
raw data GBPUSD M1
Moving Average 13 GBPUSD
summary
• Free Forex time series history data analyzed with :
– Simple analytic queries(aggregate functions) w/
SQLPad
– Moving average using Window Function
Thank you!
Appendix
MariaDB Partners w/ Global Visual Analytics Leader Tableau
https://mariadb.com/about-us/newsroom/press-releases/fastest-growing-open-source-database-mariadb-partners-global
Fastest Growing Open Source Database MariaDB Partners With Global Visual
Analytics Leader Tableau
Combination of ubiquitous database and visual analytics technologies accelerates delivery of
business insights
MENLO PARK, Calif. and HELSINKI – December 12, 2017 – MariaDB® Corporation, the
company behind the fastest growing open source database, today announced Tableau Software,
the global leader in visual analytics, has certified MariaDB integration with Tableau’s business
intelligence (BI) and visual analytics platform. Bringing together the highly popular data
management products and the renowned visualization technologies means businesses globally
can confidently use these preferred solutions for reliable, fast, data-driven business decisions.
Analyzing Queries in ColumnStore
Analyzing Queries : select calGetStats();
https://mariadb.com/kb/en/library/analyzing-queries-in-columnstore/
MariaDB [forex]> select calGetStats();
Query Stats: MaxMemPct-1; NumTempFiles-0; TempFileSpace-0B; ApproxPhyI/O-0;
CacheI/O-6298; BlocksTouched-6298; PartitionBlocksEliminated-0; MsgBytesIn-4MB;
MsgBytesOut-11MB; Mode-Distributed
Analyzing Queries : select calGetStats();
• MaxMemPct - Peak memory utilization on the User Module, likely in support of a large (User Module)
based hash join operation.
• NumTempFiles - Report on any temporary files created in support of query operations larger than
available memory, typically for unusual join operations where the smaller table join cardinality exceeds
some configurable threshold.
• TempFileSpace - Report on space used by temporary files created in support of query operations larger
than available memory, typically for unusual join operations where the smaller table join cardinality
exceeds some configurable threshold.
• PhyI/O - Number of 8k blocks read from disk, SSD, or other persistent storage.
• CacheI/O - Approximate number of 8k blocks processed in memory, adjusted down by the number of
discrete PhyI/O calls required.
• BlocksTouched - Approximate number of 8k blocks processed in memory.
• PartitionBlocksEliminated - The number of block touches eliminated via the Extent Map elimination
behavior.
• MsgBytesIn, MsgBytesOut - Message size in MB sent between nodes in support of the query.
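When these counters are collected for many queries, it helps to turn the one-line result of calGetStats() into key/value pairs. The Ruby sketch below parses the sample output shown earlier. The "Name-value; Name-value" layout is assumed from that sample only and is not a documented, stable format, and parse_query_stats is a hypothetical helper name.

```ruby
# Parses a calGetStats() result string into a Hash of counter name => value.
# Values are kept as strings because some carry units (for example "4MB").
def parse_query_stats(text)
  body = text.sub(/\AQuery Stats:\s*/, "")
  body.split(";").each_with_object({}) do |pair, stats|
    # Split at the last "-" so names such as "ApproxPhyI/O" stay intact.
    name, _, value = pair.strip.rpartition("-")
    stats[name] = value unless name.empty?
  end
end

stats = parse_query_stats(
  "Query Stats: MaxMemPct-1; NumTempFiles-0; TempFileSpace-0B; " \
  "ApproxPhyI/O-0; CacheI/O-6298; BlocksTouched-6298; " \
  "PartitionBlocksEliminated-0; MsgBytesIn-4MB; MsgBytesOut-11MB; " \
  "Mode-Distributed"
)
puts stats["BlocksTouched"]
puts stats["Mode"]
```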
Analyzing Queries : calSetTrace(1); calGetTrace();
MariaDB [test]> select calSetTrace(1);
MariaDB [test]> select c_name, sum(o_totalprice) from customer, orders where o_custkey =
c_custkey and c_custkey = 5 group by c_name;
+--------------------+-------------------+
| c_name | sum(o_totalprice) |
+--------------------+-------------------+
| Customer#000000005 | 684965.28 |
+--------------------+-------------------+
1 row in set, 1 warning (0.34 sec)
MariaDB [test]> select calGetTrace();
Desc Mode Table           TableOID ReferencedColumns        PIO LIO PBE Elapsed Rows
BPS  PM   customer        3024     (c_custkey,c_name)       0   43  36  0.006   1
BPS  PM   orders          3038     (o_custkey,o_totalprice) 0   766 0   0.032   3
HJS  PM   orders-customer 3038     -                        -   -   -   -----   -
TAS  UM   -               -        -                        -   -   -   0.021   1
Analyzing Queries : calSetTrace(1); calGetTrace();
Desc – Operation being executed. Possible values:
● BPS - Batch Primitive Step : scanning or projecting the column blocks.
● CES - Cross Engine Step: Performing Cross engine join
● DSS - Dictionary Structure Step : a dictionary scan for a particular variable length string value.
● HJS - Hash Join Step : Performing a hash join between 2 tables
● HVS - Having Step: Performing the having clause on the result set
● SQS - Sub Query Step: Performing a sub query
● TAS - Tuple Aggregation step : the process of receiving intermediate aggregation results at the UM from the PM
nodes.
● TNS - Tuple Annexation Step : Query result finishing, e.g. filling in constant columns, limit, order by and final
distinct cases.
● TUS - Tuple Union Step : Performing a SQL union of 2 sub queries.
● TCS - Tuple Constant Step: Process Constant Value Columns
● WFS - Window Function Step: Performing a window function.
Analyzing Queries : calSetTrace(1); calGetTrace();
• Mode – Where the operation was performed: UM or PM
• Table – Table for which columns may be scanned/projected.
• TableOID – ObjectID for the table being scanned.
• ReferencedOIDs – ObjectIDs for the columns required by the query.
• PIO – Physical I/O (reads from storage) executed for the query.
• LIO – Logical I/O executed for the query, also known as Blocks Touched.
• PBE – Partition Blocks Eliminated identifies blocks eliminated by Extent Map min/max.
• Elapsed – Elapsed time for a given step.
• Rows – Intermediate rows returned
ColumnStore Architecture
MariaDB ColumnStore Architecture
Columnar Distributed Data Storage
Local Storage | SAN | EBS | Gluster FS
BI Tool SQL Client Custom
Big Data App
Application
MariaDB SQL
Front End
Distributed
Query Engine
Data
Storage
User Module (UM)
Performance
Module (PM)
Row-oriented vs. Column-oriented format
•Row oriented
–Rows stored
sequentially in a file
–Scans through every
record row by row
•Column oriented:
–Each column is stored
in a separate file
–Scans only the
relevant columns
ID Fname Lname State Zip Phone Age Sex
1 Bugs Bunny NY 11217 (718) 938-3235 34 M
2 Yosemite Sam CA 95389 (209) 375-6572 52 M
3 Daffy Duck NY 10013 (212) 227-1810 35 M
4 Elmer Fudd ME 04578 (207) 882-7323 43 M
5 Witch Hazel MA 01970 (978) 744-0991 57 F
ID: 1, 2, 3, 4, 5
Fname: Bugs, Yosemite, Daffy, Elmer, Witch
Lname: Bunny, Sam, Duck, Fudd, Hazel
State: NY, CA, NY, ME, MA
Zip: 11217, 95389, 10013, 04578, 01970
Phone: (718) 938-3235, (209) 375-6572, (212) 227-1810, (207) 882-7323, (978) 744-0991
Age: 34, 52, 35, 43, 57
Sex: M, M, M, M, F
SELECT Fname FROM Table 1 WHERE State = 'NY'
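The query above can be traced against both layouts. In the Ruby sketch below, the row store is a list of full records and the column store is one array per column, standing in for one file per column. This is a toy model of the idea under those assumptions, not ColumnStore's on-disk format, and the helper names are illustrative.

```ruby
# The sample table from the slide, as full records (row-oriented).
def sample_rows
  [
    [1, "Bugs",     "Bunny", "NY", "11217", "(718) 938-3235", 34, "M"],
    [2, "Yosemite", "Sam",   "CA", "95389", "(209) 375-6572", 52, "M"],
    [3, "Daffy",    "Duck",  "NY", "10013", "(212) 227-1810", 35, "M"],
    [4, "Elmer",    "Fudd",  "ME", "04578", "(207) 882-7323", 43, "M"],
    [5, "Witch",    "Hazel", "MA", "01970", "(978) 744-0991", 57, "F"]
  ]
end

# The same data as one array per column (column-oriented).
def sample_columns
  names = %i[id fname lname state zip phone age sex]
  names.zip(sample_rows.transpose).to_h
end

# Row-oriented scan: every full record is read to test one field.
def fnames_from_rows(rows, state)
  rows.select { |row| row[3] == state }.map { |row| row[1] }
end

# Column-oriented scan: only the State and Fname columns are touched.
def fnames_from_columns(columns, state)
  matches = columns[:state].each_index.select { |i| columns[:state][i] == state }
  columns[:fname].values_at(*matches)
end

p fnames_from_rows(sample_rows, "NY")
p fnames_from_columns(sample_columns, "NY")
```

Both calls return the same names. The difference is that the column version never reads Lname, Zip, Phone, Age or Sex.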
High Performance Query Processing
Horizontal Partitions: Extent 1, Extent 2, Extent 3 (8 Million Rows each)
Storage Architecture reduces I/O
• Only touch column files
that are in projection, filter
and join conditions
• Eliminate disk block touches
to partitions outside filter
and join conditions
Extent 1:
Min State: CA, Max State: NY
Extent 2:
Min State: OR, Max State: WY
Extent 3:
Min State: IA, Max State: TN
SELECT Fname FROM Table 1 WHERE State = 'NY'
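Extent elimination for this query can be sketched with the three min/max pairs listed above: an extent needs to be scanned only if the filter value can fall between its minimum and maximum. The Ruby code below is a simplified model of that check with hypothetical helper names, not the actual Extent Map implementation.

```ruby
# Min/max values of the State column per extent, as listed above.
def sample_extents
  [
    { id: 1, min: "CA", max: "NY" },
    { id: 2, min: "OR", max: "WY" },
    { id: 3, min: "IA", max: "TN" }
  ]
end

# Extents whose [min, max] range can contain the value must be scanned.
def extents_to_scan(extents, value)
  extents.select { |e| e[:min] <= value && value <= e[:max] }.map { |e| e[:id] }
end

# All other extents are skipped without reading any of their blocks.
def eliminated_extents(extents, value)
  extents.map { |e| e[:id] } - extents_to_scan(extents, value)
end

p extents_to_scan(sample_extents, "NY")
p eliminated_extents(sample_extents, "NY")
```

For State = 'NY' only Extent 2 (OR to WY) is eliminated, because NY sorts before OR. Extents 1 and 3 still have to be scanned.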
[Figure: the columns ID, Fname, Lname, State, Zip, Phone, Age and Sex are each stored as a vertical partition. Rows 1 to 8M, 8M+1 to 16M and 16M+1 to 24M form the three horizontal partitions (extents). For State = 'NY', the extent with Min State OR and Max State WY is marked ELIMINATED PARTITION.]
sizing ColumnStore
Sizing
Minimum Spec
• UM: 4 core, 32 GB RAM
• PM: 4 core, 16 GB RAM
Typical Server Spec
• UM: 8 core, 264 GB RAM
• PM: 8 core, 64 GB RAM
Data Storage
External Data Volumes
• Maximum 2 data volumes per I/O
channel per PM node server
• Up to 2 TB on disk per data
volume ≈ max 4 TB per PM node
Local disk
Up to 2TB on the disk per
PM node server
DETAILED SIZING GUIDE
based on data size
and workload
MariaDB ColumnStore Sizing - Example
• 60 TB uncompressed data = 6 TB compressed data at 10x compression
• 2 UMs - 8 core, 512 GB (based on workload)
• 6 TB compressed = 3 data volumes (at 2 TB per volume)
  - with 1 data volume per PM node: 3 PMs
• Data growth - 2 TB per month, data retention - 2 years
  - Plan for 2 TB x 24 = 48 TB additional
  - 48 TB = 4.8 TB compressed ≈ 3 data volumes (at 2 TB per volume)
  - with 1 data volume per PM node: 3 additional PMs
• Total: 6 PMs, 2 UMs
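The arithmetic of this example can be written down as a small calculator. The Ruby sketch below follows the same steps: compress, divide into 2 TB volumes, round up, one volume per PM. The function name and its defaults (10x compression, 2 TB per volume, 1 volume per PM) are taken from this example as assumptions. Real compression ratios and UM sizing depend on the data and workload.

```ruby
# Number of PM nodes needed for a given amount of uncompressed data.
def pm_nodes_for(uncompressed_tb, compression_ratio: 10, volume_tb: 2, volumes_per_pm: 1)
  compressed_tb = uncompressed_tb.to_f / compression_ratio
  volumes = (compressed_tb / volume_tb).ceil # partial volumes round up
  (volumes.to_f / volumes_per_pm).ceil
end

initial = pm_nodes_for(60)     # 60 TB of existing data
growth  = pm_nodes_for(2 * 24) # 2 TB per month, retained for 24 months
puts "initial PMs: #{initial}, additional PMs: #{growth}, total: #{initial + growth}"
```

Rounding up at the volume step is why 4.8 TB of compressed growth still needs 3 volumes, and therefore 3 additional PMs.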
using ColumnStore
via SSL/TLS connection
SSL variables w/o SSL
MariaDB [(none)]> SHOW VARIABLES LIKE '%ssl%';
+---------------------+---------------------------------+
| Variable_name | Value |
+---------------------+---------------------------------+
| have_openssl | YES |
| have_ssl | DISABLED |
| ssl_ca | |
| ssl_capath | |
| ssl_cert | |
| ssl_cipher | |
| ssl_crl | |
| ssl_crlpath | |
| ssl_key | |
| version_ssl_library | OpenSSL 1.0.1e-fips 11 Feb 2013 |
+---------------------+---------------------------------+
/usr/local/mariadb/columnstore/mysql/my.cnf
[client]
ssl-ca = /etc/pki/tls/mariadb/certs/ca-cert.pem
ssl-cert = /etc/pki/tls/mariadb/certs/client-cert.pem
ssl-key = /etc/pki/tls/mariadb/private/client-key.pem
[mysqld]
ssl-ca = /etc/pki/tls/mariadb/certs/ca-cert.pem
ssl-cert = /etc/pki/tls/mariadb/certs/server-cert.pem
ssl-key = /etc/pki/tls/mariadb/private/server-key.pem
SSL variables : SSL enabled
MariaDB [(none)]> SHOW VARIABLES LIKE '%ssl%';
+---------------------+---------------------------------------------+
| Variable_name | Value |
+---------------------+---------------------------------------------+
| have_openssl | YES |
| have_ssl | YES |
| ssl_ca | /etc/pki/tls/mariadb/certs/ca-cert.pem |
| ssl_capath | |
| ssl_cert | /etc/pki/tls/mariadb/certs/server-cert.pem |
| ssl_cipher | |
| ssl_crl | |
| ssl_crlpath | |
| ssl_key | /etc/pki/tls/mariadb/private/server-key.pem |
| version_ssl_library | OpenSSL 1.0.1e-fips 11 Feb 2013 |
+---------------------+---------------------------------------------+
status : SSL enabled
MariaDB [(none)]> status
--------------
/usr/local/mariadb/columnstore/mysql/bin/mysql Ver 15.1 Distrib 10.1.23-MariaDB, for Linux
(x86_64) using readline 5.1
Connection id: 5
Current database:
Current user: root@localhost
SSL: Cipher in use is DHE-RSA-AES256-GCM-SHA384
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server: MariaDB
Server version: 10.1.23-MariaDB Columnstore 1.0.9-1
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: latin1
Db characterset: latin1
Client characterset: utf8
Conn. characterset: utf8
UNIX socket: /usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock

Webinar 2017. Supercharge your analytics with ClickHouse. Vadim TkachenkoAltinity Ltd
 
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOx
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOxInfluxDB IOx Tech Talks: Query Processing in InfluxDB IOx
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOxInfluxData
 
Big size meteorological data processing and mobile displaying system using ...
Big size meteorological data processing and mobile displaying system using ...Big size meteorological data processing and mobile displaying system using ...
Big size meteorological data processing and mobile displaying system using ...BJ Jang
 

Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
 

M|18 Analytics in the Real World, Case Studies and Use Cases

  • 1. Analytics in the Real World, Case Studies and Use Cases Amy Krishnamohan Director of Product Marketing GOTO Satoru Customer Solutions Engineer
  • 2. Finance ● Identify trade patterns ● Detect fraud and anomalies ● Predict trading outcomes Manufacturing ● Simulations to improve design/yield ● Detect production anomalies ● Predict machine failures (sensor data) Telecom ● Behavioral analysis of customer calls ● Network analysis (perf and reliability) Healthcare ● Find genetic profiles/matches ● Analyze health vs spending ● Predict viral outbreaks CIM Inc. MariaDB AX Use Case
  • 3. 1. Find genetic mates for cattle 2. Predict meat production 3. Gene/DNA analysis Had to convert to CSV files and schedule import jobs (cron) Always receiving new genetic data Migrated to data adapter (Python) ● streamline import process ● remove steps / possible error ● remove delays ● import data on demand ● immediate customer access Life Science industry Industry biotechnology (genetics) Data genotypes Use Case genetic profiling Details
  • 4. 1. Identify trends and patterns 2. Determine population cohorts 3. Predict health outcomes 4. Anticipate funding / capacity 5. Recommend intervention Can’t do complex queries on current hardware with Oracle and snowflake schemas Limited to optimizing for simple, known queries (2-3 columns) Replaced with ColumnStore ● a single table ● 2.5 million rows, 248 columns > complex, ad-hoc queries ● query 20+ columns in seconds Healthcare industry Industry healthcare (Medicaid) Data surveys Use Case decision support system Details
  • 5. 1. Import log 2. Analyze customer behavior a. Website click b. Keyword search 3. Optimize ad performance 4. Manage dynamic pricing based on the KPI Needs real-time analytics to optimize advertisement Replaced with ColumnStore ● fast data ingestion ● optimizes Ad performance ● A/B testing ● target ad by geography and demographic provide automated monitoring, ● adjusts traffic based on real-time performance manages dynamic pricing. Advertisement industry Industry Digital Advertisement Data Log Use Case Ad Analytics Details
  • 6. 1. Collect asset tracking data 2. Analyze and monitor a. Contract b. Performance 3. Proactive service Needs to ingest text type data and integration with BI tool Replaced with ColumnStore ● faster data ingestion ● Time series analysis with Window function ● real-time asset monitoring with Tableau ● predictive asset maintenance High tech industry Industry High tech Data Asset tracking time series data Use Case Asset Management Details
  • 7. 1. Receive sensor data from different parts 2. Real-time monitoring 3. Analyze historical data to uncover machine failure pattern 4. Predict machine failure 5. Schedule proactive maintenance Need real time data ingestion Needs integration with Spark to run Machine Learning algorithm Replaced with ColumnStore ● faster data ingestion ● leverage Spark ML ● real-time monitoring ● reduce production downtime Manufacturing industry Industry Manufacturing /Automobile Data Sensor data Use Case Predictive Maintenance Details
  • 8. 1. Collect asset tracking data 2. Analyze and monitor a. Contract b. Performance 3. Proactive service Needs big data analytics solution to analyze over 25 million quote records and 100,000 trading records per day Replaced with ColumnStore ● archive large set of data to comply with regulations ● provide self-service analytics to sales/marketing team ● time series analysis with Window function Finance industry Industry Finance Data Trading records Use Case Trading analysis Details
  • 9. Time Series Data Analysis with ColumnStore
  • 11. Free currency historical data from HistData.com • GBPUSD M1 (1 minute) historical data in 2016 http://www.histdata.com/download-free-forex-historical-data/?/ascii/1-minute-bar-quotes/gbpusd/2016 • download HISTDATA_COM_ASCII_GBPUSD_M1_2016.zip
  • 12. Free GBPUSD historical data (2016) •1st column: timestamp •need to convert the format in order to fit with DATETIME data type
  • 13. MariaDB ColumnStore Data Types • INT types - range is 2 less from max unsigned or min unsigned • CHAR† - max 255 bytes • VARCHAR† - max 8000 bytes • DECIMAL - max 18 digits • DOUBLE/FLOAT • DATETIME - no sub-seconds yyyy-mm-dd hh:mm:ss • DATE • BLOB/TEXT
  • 14. Convert timestamp w/ Ruby script
      id = 0
      while line = gets
        timestamp, open, high, low, close = line.split(";")
        # a4/a2 read fixed-width fields; x skips the space between date and time
        year, month, day, hour, minute, second = timestamp.unpack("a4a2a2xa2a2a2")
        id += 1
        print "#{id},#{year}-#{month}-#{day} #{hour}:#{minute},"
        puts [open, high, low, close].join(",")
      end
  • 15. Converted CSV
      1,2016-01-03 17:00,1.473350,1.473350,1.473290,1.473290
      2,2016-01-03 17:01,1.473280,1.473360,1.473260,1.473350
      3,2016-01-03 17:02,1.473350,1.473350,1.473290,1.473290
      4,2016-01-03 17:03,1.473300,1.473330,1.473290,1.473320
      5,2016-01-03 17:04,1.473320,1.473340,1.473320,1.473320
      6,2016-01-03 17:05,1.473340,1.473370,1.473300,1.473320
      7,2016-01-03 17:06,1.473320,1.473320,1.473310,1.473310
      8,2016-01-03 17:07,1.473310,1.473310,1.473300,1.473310
      9,2016-01-03 17:08,1.473310,1.474010,1.473300,1.474010
      • DATETIME - no sub-seconds yyyy-mm-dd hh:mm:ss
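The conversion step can also be written as a small function that is easy to check on a single line. This is only a sketch: `convert_line` is a hypothetical helper name, the input layout `YYYYMMDD HHMMSS;open;high;low;close` is inferred from the unpack template in the script, and the sample line is reconstructed from the first converted row rather than copied from the download. Unlike the slide's script, it also writes the seconds field.

```ruby
# Convert one HistData M1 line ("YYYYMMDD HHMMSS;open;high;low;close")
# into a CSV row whose second field fits a DATETIME column.
# convert_line is a hypothetical helper name, not part of any library.
def convert_line(line, id)
  timestamp, open, high, low, close = line.chomp.split(";")
  # a4/a2 take fixed-width fields; x skips the space between date and time.
  year, month, day, hour, minute, second = timestamp.unpack("a4a2a2xa2a2a2")
  datetime = "#{year}-#{month}-#{day} #{hour}:#{minute}:#{second}"
  [id, datetime, open, high, low, close].join(",")
end

# Sample line is an assumption reconstructed from row 1 of the converted CSV.
puts convert_line("20160103 170000;1.473350;1.473350;1.473290;1.473290", 1)
```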
  • 17. CREATE DATABASE/TABLE MariaDB [(none)]> create database forex; MariaDB [(none)]> use forex; MariaDB [forex]> CREATE TABLE gbpusd( id int, time datetime, open double, high double, low double, close double) engine=columnstore default character set=utf8;
  • 18. DESC TABLE
      MariaDB [forex]> desc gbpusd;
      +-------+----------+------+-----+---------+-------+
      | Field | Type     | Null | Key | Default | Extra |
      +-------+----------+------+-----+---------+-------+
      | id    | int(11)  | YES  |     | NULL    |       |
      | time  | datetime | YES  |     | NULL    |       |
      | open  | double   | YES  |     | NULL    |       |
      | high  | double   | YES  |     | NULL    |       |
      | low   | double   | YES  |     | NULL    |       |
      | close | double   | YES  |     | NULL    |       |
      +-------+----------+------+-----+---------+-------+
  • 19. import CSV into ColumnStore using cpimport # cpimport -s ',' forex gbpusd gbpusd2016.csv Locale is : C Column delimiter : , Using table OID 3163 as the default JOB ID Input file(s) will be read from : /home/vagrant/histdata Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T103843_S950145_Job_3163.xml Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3163.log 2017-06-24 10:38:43 (29756) INFO : successfully loaded job file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T103843_S950145_Job_3163.xml 2017-06-24 10:38:43 (29756) INFO : Job file loaded, run time for this step : 0.0321331 seconds 2017-06-24 10:38:43 (29756) INFO : PreProcessing check starts 2017-06-24 10:38:43 (29756) INFO : input data file /home/vagrant/histdata/gbpusd2016.csv 2017-06-24 10:38:43 (29756) INFO : PreProcessing check completed 2017-06-24 10:38:43 (29756) INFO : preProcess completed, run time for this step : 0.0329528 seconds 2017-06-24 10:38:43 (29756) INFO : No of Read Threads Spawned = 1 2017-06-24 10:38:43 (29756) INFO : No of Parse Threads Spawned = 3 2017-06-24 10:38:45 (29756) INFO : For table forex.gbpusd: 372,480 rows processed and 372480 rows inserted. 2017-06-24 10:38:46 (29756) INFO : Bulk load completed, total run time : 2.11976 seconds DB table
  • 20. if cpimport failed... # cpimport forex gbpusd gbpusd2016.csv Locale is : C Using table OID 3163 as the default JOB ID Input file(s) will be read from : /home/vagrant/histdata Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T104034_S269473_Job_3163.xml Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3163.log 2017-06-24 10:40:34 (30209) INFO : successfully loaded job file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T104034_S269473_Job_3163.xml 2017-06-24 10:40:34 (30209) INFO : Job file loaded, run time for this step : 0.0253589 seconds 2017-06-24 10:40:34 (30209) INFO : PreProcessing check starts 2017-06-24 10:40:34 (30209) INFO : input data file /home/vagrant/histdata/gbpusd2016.csv 2017-06-24 10:40:34 (30209) INFO : PreProcessing check completed 2017-06-24 10:40:34 (30209) INFO : preProcess completed, run time for this step : 0.065531 seconds 2017-06-24 10:40:34 (30209) INFO : No of Read Threads Spawned = 1 2017-06-24 10:40:34 (30209) INFO : No of Parse Threads Spawned = 3 2017-06-24 10:40:34 (30209) INFO : Number of rows with errors = 11. Row numbers with error reasons are listed in file /home/vagrant/histdata/gbpusd2016.csv.Job_3163_30209.err 2017-06-24 10:40:34 (30209) INFO : Number of rows with errors = 11. Exact error rows are listed in file /home/vagrant/histdata/gbpusd2016.csv.Job_3163_30209.bad 2017-06-24 10:40:34 (30209) ERR : Actual error row count(11) exceeds the max error rows(10) allowed for table forex.gbpusd [1451] 2017-06-24 10:40:34 (30209) CRIT : Bulkload Read (thread 0) Failed for Table forex.gbpusd. Terminating this job. [1451] 2017-06-24 10:40:34 (30209) INFO : Bulkload Parse (thread 2) Stopped parsing Tables. BulkLoad::parse() responding to job termination 2017-06-24 10:40:34 (30209) INFO : Bulkload Parse (thread 1) Stopped parsing Tables. 
BulkLoad::parse() responding to job termination 2017-06-24 10:40:34 (30209) INFO : Bulkload Parse (thread 0) Stopped parsing Tables. BulkLoad::parse() responding to job termination 2017-06-24 10:40:34 (30209) INFO : Table forex.gbpusd (OID-3163) was not successfully loaded. Rolling back. 2017-06-24 10:40:34 (30209) INFO : Bulk load completed, total run time : 0.638649 seconds
  • 21. verify your Job_xxxx_xxxxx.err gbpusd2016.csv.Job_3163_30209.err : Line number 1; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 2; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 3; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 4; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 5; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 6; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 7; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 8; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 9; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 10; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1 Line number 11; Error: Data contains wrong number of columns; num fields expected-6; num fields found-1
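The failing run omits `-s ','`, so cpimport falls back to its default field delimiter, the pipe character, and each CSV line is read as a single field ("num fields found-1"). A quick pre-check along these lines can catch the mismatch before a bulk load. It is a sketch only: `check_field_counts` is a hypothetical helper and is not part of cpimport.

```ruby
# Count fields per line for a given delimiter and report mismatches as
# [line_number, fields_found] pairs.
# check_field_counts is a hypothetical helper, not part of cpimport.
def check_field_counts(lines, delimiter, expected)
  bad = []
  lines.each_with_index do |line, index|
    found = line.chomp.split(delimiter, -1).length
    bad << [index + 1, found] unless found == expected
  end
  bad
end

rows = [
  "1,2016-01-03 17:00,1.473350,1.473350,1.473290,1.473290",
  "2,2016-01-03 17:01,1.473280,1.473360,1.473260,1.473350"
]

p check_field_counts(rows, ",", 6)  # comma matches the file
p check_field_counts(rows, "|", 6)  # pipe: every line is one field
```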
  • 23. performance LOAD DATA LOCAL INFILE
      # mcsmysql --local-infile=1 forex
      Welcome to the MariaDB monitor. Commands end with ; or \g.
      Your MariaDB connection id is 38
      Server version: 10.1.23-MariaDB Columnstore 1.0.9-1
      Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
      MariaDB [forex]> LOAD DATA LOCAL INFILE 'gbpusd2016.csv' INTO TABLE gbpusd FIELDS TERMINATED BY ',';
      Query OK, 372480 rows affected (1.52 sec)
      Records: 372480 Deleted: 0 Skipped: 0 Warnings: 0
  • 24. Performance ColumnStore : cpimport # cpimport -s ',' forex gbpusd gbpusd2016.csv 2017-06-24 10:38:45 (29756) INFO : For table forex.gbpusd: 372480 rows processed and 372480 rows inserted. 2017-06-24 10:38:46 (29756) INFO : Bulk load completed, total run time : 2.11976 seconds -s: field separator
  • 25. cpimport • 2 sec. for 372,480 rows LOAD DATA LOCAL INFILE • 372480 rows affected (1.52 sec) CSV import: cpimport vs. LOAD DATA LOCAL INFILE
  • 26. Performance ColumnStore : INSERT INTO INSERT INTO gbpusd_idb(id, time, open, high, low, close) VALUES('1', '2016-01-03 17:00', '1.473350', '1.473350', '1.473290', '1.473290'); INSERT INTO gbpusd_idb(id, time, open, high, low, close) VALUES('2', '2016-01-03 17:01', '1.473280', '1.473360', '1.473260', '1.473350'); INSERT INTO gbpusd_idb(id, time, open, high, low, close) VALUES('3', '2016-01-03 17:02', '1.473350', '1.473350', '1.473290', '1.473290'); ... MariaDB [forex]> source gbpusd2016.sql ... MariaDB [forex]> Bye real 18m16.178s user 0m28.330s sys 0m23.551s
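To compare the three load paths on the same scale, the reported timings can be turned into rows per second. The sketch below uses only the figures printed on the slides (372,480 rows; 2.11976 s, 1.52 s, and 18m16.178s wall-clock); the labels are informal.

```ruby
row_count = 372_480

# Elapsed seconds as reported in the outputs on the slides.
timings = {
  "cpimport"               => 2.11976,
  "LOAD DATA LOCAL INFILE" => 1.52,
  "INSERT INTO (source)"   => 18 * 60 + 16.178
}

# Throughput in rows per second for each load method.
rates = timings.transform_values { |seconds| row_count / seconds }
rates.each { |name, rate| puts format("%-24s %10.0f rows/s", name, rate) }
```

On these numbers, row-by-row INSERT is roughly three orders of magnitude slower than either bulk path.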
  • 27. Simply query w/ SQLPad
  • 29. SQLPad installation / launch # yum -y install npm (EPEL repository required) # npm install sqlpad -g $ sqlpad -ip 0.0.0.0 --port 3000 Launching server WITHOUT SSL Welcome to SQLPad!. Visit http://localhost:3000 to get started
  • 30. UK votes to leave EU after dramatic night divides nation • https://www.theguardian.com/politics/2016/jun/24/britain-votes-for-brexit-eu-referendum-david-cameron The value of the pound swung wildly on currency markets as initial confidence among investors expecting a remain vote was dented by some of the early referendum results, triggering falls of close to 10% and its biggest one-day fall ever.
  • 31. simple query for time period before/after vote
  • 32. simple query for time period before/after vote
  • 35. Supported Window Functions
      Function: Description
      AVG(): The average of all input values.
      COUNT(): Number of input rows.
      CUME_DIST(): Calculates the cumulative distribution, or relative rank, of the current row to other rows in the same partition. Number of peer or preceding rows / number of rows in partition.
      DENSE_RANK(): Ranks items in a group leaving no gaps in ranking sequence when there are ties.
      FIRST_VALUE(): The value evaluated at the row that is the first row of the window frame (counting from 1); null if no such row.
  • 36. Supported Window Functions (cont’d)
      Function: Description
      LAG(): The value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead return default. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null. LAG provides access to more than one row of a table at the same time without a self-join. Given a series of rows returned from a query and a position of the cursor, LAG provides access to a row at a given physical offset prior to that position.
      LAST_VALUE(): The value evaluated at the row that is the last row of the window frame (counting from 1); null if no such row.
      LEAD(): Provides access to a row at a given physical offset beyond that position. Returns value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead return default. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.
      MAX(): Maximum value of expression across all input values.
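The LAG and LEAD semantics described above (offset defaults to 1, default defaults to null) can be imitated over an ordered array. This is an illustration of the behavior, not ColumnStore code: `lag` and `lead` are hypothetical helpers, and the sample closes are the first four values from the moving-average table later in the deck.

```ruby
# Emulate LAG(value, offset, default) and LEAD(value, offset, default)
# over an already ordered array. Helper names are hypothetical.
def lag(values, index, offset = 1, default = nil)
  target = index - offset
  target >= 0 ? values[target] : default
end

def lead(values, index, offset = 1, default = nil)
  target = index + offset
  target < values.length ? values[target] : default
end

closes = [1.4797, 1.4798, 1.4798, 1.4797]
closes.each_index do |i|
  # Prints: current close, previous close (LAG), next close (LEAD).
  puts [closes[i], lag(closes, i).inspect, lead(closes, i).inspect].join("  ")
end
```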
  • 37. Supported Window Functions (cont’d)
      Function: Description
      MEDIAN(): An inverse distribution function that assumes a continuous distribution model. It takes a numeric or datetime value and returns the middle value or an interpolated value that would be the middle value once the values are sorted. Nulls are ignored in the calculation.
      MIN(): Minimum value of expression across all input values.
      NTH_VALUE(): The value evaluated at the row that is the nth row of the window frame (counting from 1); null if no such row.
      NTILE(): Divides an ordered data set into a number of buckets indicated by expr and assigns the appropriate bucket number to each row. The buckets are numbered 1 through expr. The expr value must resolve to a positive constant for each partition. Integer ranging from 1 to the argument value, dividing the partition as equally as possible.
      PERCENT_RANK(): Relative rank of the current row: (rank - 1) / (total rows - 1).
  • 38. Supported Window Functions (cont’d)
      Function: Description
      PERCENTILE_CONT(): An inverse distribution function that assumes a continuous distribution model. It takes a percentile value and a sort specification, and returns an interpolated value that would fall into that percentile value with respect to the sort specification. Nulls are ignored in the calculation.
      PERCENTILE_DISC(): An inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. Nulls are ignored in the calculation.
      RANK(): Rank of the current row with gaps; same as row_number of its first peer.
      ROW_NUMBER(): Number of the current row within its partition, counting from 1.
      STDDEV(), STDDEV_POP(): Computes the population standard deviation and returns the square root of the population variance.
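The difference between ROW_NUMBER, RANK, DENSE_RANK and PERCENT_RANK is easiest to see on a partition with a tie. The sketch below applies the definitions from the tables to one ascending-ordered partition; `ranking_rows` is a hypothetical helper and the input values are made up.

```ruby
# Compute ROW_NUMBER, RANK, DENSE_RANK and PERCENT_RANK for one partition
# ordered ascending by value. ranking_rows is a hypothetical helper.
def ranking_rows(values)
  sorted = values.sort
  distinct = sorted.uniq
  sorted.each_with_index.map do |value, i|
    rank = sorted.index(value) + 1            # row number of the first peer
    {
      value: value,
      row_number: i + 1,
      rank: rank,
      dense_rank: distinct.index(value) + 1,  # no gaps after ties
      percent_rank: sorted.length > 1 ? (rank - 1).to_f / (sorted.length - 1) : 0.0
    }
  end
end

ranking_rows([10, 20, 20, 30]).each { |row| p row }
```

With the tie on 20, RANK skips from 2 to 4 while DENSE_RANK continues with 3.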
  • 39. Supported Window Functions (cont’d)
      Function: Description
      STDDEV_SAMP(): Computes the cumulative sample standard deviation and returns the square root of the sample variance.
      SUM(): Sum of expression across all input values.
      VARIANCE(), VAR_POP(): Population variance of the input values (square of the population standard deviation).
      VAR_SAMP(): Sample variance of the input values (square of the sample standard deviation).
  • 41. MAX GBPUSD 23rd - 25th June 2016
  • 42. MIN GBPUSD 23rd - 25th June 2016
  • 43. Drop off rate GBPUSD 23rd - 25th June 2016: a drop of about 13% within a few hours
  • 45. Correlation GBPUSD - USDJPY @ Brexit scatter plot (normalized) GBPUSD*100-130 USDJPY-110
  • 46. Correlation GBPUSD - USDJPY @ Brexit SELECT ( AVG( gbpusd.close * usdjpy.close ) - AVG( gbpusd.close ) * AVG( usdjpy.close ) ) / ( STDDEV(gbpusd.close) * STDDEV(usdjpy.close) ) AS correlation_coefficient_population FROM usdjpy INNER JOIN gbpusd ON gbpusd.time = usdjpy.time WHERE gbpusd.time BETWEEN TIMESTAMP ( '2016-06-22' ) AND TIMESTAMP ( '2016-06-26' ); Pearson correlation coefficient
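The query computes the population Pearson coefficient as (E[xy] - E[x]E[y]) / (σx σy), with STDDEV() acting as the population standard deviation. The same formula is sketched below in Ruby so it can be checked on a tiny input. The helper names are hypothetical and the sample values are made up, not market data.

```ruby
# Population Pearson correlation, using the same form as the SQL query:
# (E[xy] - E[x]E[y]) / (sigma_x * sigma_y)
def mean(values)
  values.sum.to_f / values.length
end

# Population standard deviation (divides by n, not n - 1).
def stddev_pop(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / values.length)
end

def correlation(xs, ys)
  products = xs.zip(ys).map { |x, y| x * y }
  (mean(products) - mean(xs) * mean(ys)) / (stddev_pop(xs) * stddev_pop(ys))
end

xs = [1.0, 2.0, 3.0, 4.0]   # made-up values, not market data
ys = [2.0, 4.0, 6.0, 8.0]   # exactly 2 * xs, so perfectly correlated
puts correlation(xs, ys)
```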
  • 47. Correlation GBPUSD - USDJPY @ Brexit Scatter Plot (normalized) correlation coeff. 94.4 % : highly correlated GBPUSD*100-130
  • 48. Correlation GBPUSD - USDJPY 2016 (Jan. - Dec.) Scatter Plot (normalized) correlation coeff. 36%: low correlation GBPUSD*100-130 USDJPY-110
  • 49. Performance - ColumnStore vs. InnoDB ColumnStore storage engine: +------------------------------------+ | correlation_coefficient_population | +------------------------------------+ | 0.9648375371071727 | +------------------------------------+ 1 row in set (0.43 sec) > 1000 times faster than InnoDB SELECT (AVG(gbpusd.close*usdjpy.close) - AVG(gbpusd.close)*AVG(usdjpy.close)) / (STDDEV(gbpusd.close) * STDDEV(usdjpy.close)) AS correlation_coefficient_population FROM gbpusd JOIN usdjpy ON gbpusd.time = usdjpy.time WHERE gbpusd.time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25'); InnoDB storage engine: +------------------------------------+ | correlation_coefficient_population | +------------------------------------+ | 0.964837537107134 | +------------------------------------+ 1 row in set (8 min 11.21 sec)
  • 50. Moving Average w/ Window Functions
  • 51. Moving Average GBPUSD SELECT time, close, AVG(close) OVER ( ORDER BY time ASC ROWS BETWEEN 6 PRECEDING AND 6 FOLLOWING ) AS MA13, COUNT(close) OVER ( ORDER BY time ASC ROWS BETWEEN 6 PRECEDING AND 6 FOLLOWING ) AS row_count FROM gbpusd WHERE time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25');
  • 52. Moving Average GBPUSD
      AVG(close) OVER ( ORDER BY time ASC ROWS BETWEEN 6 PRECEDING AND 6 FOLLOWING ) AS MA13
      time             close   MA13    row count  window position (for the 00:06 row)
      6/23/2016 00:00  1.4797  1.4797  7          preceding 6
      6/23/2016 00:01  1.4798  1.4797  8          preceding 5
      6/23/2016 00:02  1.4798  1.4796  9          preceding 4
      6/23/2016 00:03  1.4797  1.4796  10         preceding 3
      6/23/2016 00:04  1.4796  1.4796  11         preceding 2
      6/23/2016 00:05  1.4796  1.4796  12         preceding 1
      6/23/2016 00:06  1.4796  1.4796  13         current row
      6/23/2016 00:07  1.4796  1.4796  13         following 1
      6/23/2016 00:08  1.4796  1.4796  13         following 2
      6/23/2016 00:09  1.4796  1.4796  13         following 3
      6/23/2016 00:10  1.4796  1.4797  13         following 4
      6/23/2016 00:11  1.4796  1.4797  13         following 5
      6/23/2016 00:12  1.4797  1.4797  13         following 6
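The row counts of 7 through 13 in the first rows come from the frame being clipped at the start of the result set: the WHERE clause removes earlier rows before the window is applied. A minimal Ruby sketch of the same centered window shows the effect; `moving_average` is a hypothetical helper, and the closes are the 13 rounded values shown in the table.

```ruby
# Centered moving average with `radius` rows on each side, mirroring
# ROWS BETWEEN 6 PRECEDING AND 6 FOLLOWING. The window shrinks at the
# edges of the result set. moving_average is a hypothetical helper.
def moving_average(values, radius)
  values.each_index.map do |i|
    first = [i - radius, 0].max
    last = [i + radius, values.length - 1].min
    window = values[first..last]
    { average: window.sum / window.length, row_count: window.length }
  end
end

closes = [1.4797, 1.4798, 1.4798, 1.4797, 1.4796, 1.4796, 1.4796,
          1.4796, 1.4796, 1.4796, 1.4796, 1.4796, 1.4797]
result = moving_average(closes, 6)
puts result.map { |row| row[:row_count] }.inspect
```

Because this sample stops after 13 rows, the counts also shrink again toward the end, which the slide does not show since its result set continues past 00:12.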
  • 55. summary • Free Forex time series history data analyzed with : – Simple analytic queries(aggregate functions) w/ SQLPad – Moving average using Window Function
  • 58. MariaDB Partners w/ Global Visual Analytics Leader Tableau https://mariadb.com/about-us/newsroom/press-releases/fastest-growing-open-source-database-mariadb-partners-global Fastest Growing Open Source Database MariaDB Partners With Global Visual Analytics Leader Tableau Combination of ubiquitous database and visual analytics technologies accelerates delivery of business insights MENLO PARK, Calif. and HELSINKI – December 12, 2017 – MariaDB® Corporation, the company behind the fastest growing open source database, today announced Tableau Software, the global leader in visual analytics, has certified MariaDB integration with Tableau’s business intelligence (BI) and visual analytics platform. Bringing together the highly popular data management products and the renowned visualization technologies means businesses globally can confidently use these preferred solutions for reliable, fast, data-driven business decisions.
  • 59. Analyzing Queries in ColumnStore
  • 60. Analyzing Queries : select calGetStats(); https://mariadb.com/kb/en/library/analyzing-queries-in-columnstore/ MariaDB [forex]> select calGetStats(); Query Stats: MaxMemPct-1; NumTempFiles-0; TempFileSpace-0B; ApproxPhyI/O-0; CacheI/O-6298; BlocksTouched-6298; PartitionBlocksEliminated-0; MsgBytesIn-4MB; MsgBytesOut-11MB; Mode-Distributed
  • 61. Analyzing Queries : select calGetStats();
      • MaxMemPct - Peak memory utilization on the User Module, likely in support of a large (User Module) based hash join operation.
      • NumTempFiles - Report on any temporary files created in support of query operations larger than available memory, typically for unusual join operations where the smaller table join cardinality exceeds some configurable threshold.
      • TempFileSpace - Report on space used by temporary files created in support of query operations larger than available memory, typically for unusual join operations where the smaller table join cardinality exceeds some configurable threshold.
      • PhyI/O - Number of 8k blocks read from disk, SSD, or other persistent storage.
      • CacheI/O - Approximate number of 8k blocks processed in memory, adjusted down by the number of discrete PhyI/O calls required.
      • BlocksTouched - Approximate number of 8k blocks processed in memory.
      • PartitionBlocksEliminated - The number of block touches eliminated via the Extent Map elimination behavior.
      • MsgBytesIn, MsgBytesOut - Message size in MB sent between nodes in support of the query.
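When these counters are collected across many queries, it can help to split the stats string into key/value pairs. The sketch below parses the exact text shown on the previous slide; `parse_query_stats` is a hypothetical helper, and the "Key-Value; Key-Value" layout is assumed from that one sample rather than from a documented format.

```ruby
# Parse the "Key-Value; Key-Value" text returned by calGetStats() into a Hash.
# parse_query_stats is a hypothetical helper; values are kept as strings.
def parse_query_stats(text)
  body = text.sub(/\AQuery Stats:\s*/, "")
  body.split(";").each_with_object({}) do |pair, stats|
    # Split on the first hyphen only, so values may contain hyphens.
    key, value = pair.strip.split("-", 2)
    stats[key] = value unless key.nil? || key.empty?
  end
end

stats = parse_query_stats(
  "Query Stats: MaxMemPct-1; NumTempFiles-0; TempFileSpace-0B; " \
  "ApproxPhyI/O-0; CacheI/O-6298; BlocksTouched-6298; " \
  "PartitionBlocksEliminated-0; MsgBytesIn-4MB; MsgBytesOut-11MB; Mode-Distributed"
)
puts stats["BlocksTouched"]
```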
  • 62. Analyzing Queries : calSetTrace(1); calGetTrace();
      MariaDB [test]> select calSetTrace(1);
      MariaDB [test]> select c_name, sum(o_totalprice) from customer, orders where o_custkey = c_custkey and c_custkey = 5 group by c_name;
      +--------------------+-------------------+
      | c_name             | sum(o_totalprice) |
      +--------------------+-------------------+
      | Customer#000000005 | 684965.28         |
      +--------------------+-------------------+
      1 row in set, 1 warning (0.34 sec)
      MariaDB [test]> select calGetTrace();
      Desc Mode Table           TableOID ReferencedColumns        PIO LIO PBE Elapsed Rows
      BPS  PM   customer        3024     (c_custkey,c_name)       0   43  36  0.006   1
      BPS  PM   orders          3038     (o_custkey,o_totalprice) 0   766 0   0.032   3
      HJS  PM   orders-customer 3038     -                        -   -   -   -----   -
      TAS  UM   -               -        -                        -   -   -   0.021   1
  • 63. Analyzing Queries : calSetTrace(1); calGetTrace();
      Desc – Operation being executed. Possible values:
      ● BPS - Batch Primitive Step: scanning or projecting the column blocks.
      ● CES - Cross Engine Step: performing a cross-engine join.
      ● DSS - Dictionary Structure Step: a dictionary scan for a particular variable-length string value.
      ● HJS - Hash Join Step: performing a hash join between 2 tables.
      ● HVS - Having Step: performing the HAVING clause on the result set.
      ● SQS - Sub Query Step: performing a sub query.
      ● TAS - Tuple Aggregation Step: the process of receiving intermediate aggregation results at the UM from the PM nodes.
      ● TNS - Tuple Annexation Step: query result finishing, e.g. filling in constant columns, limit, order by and final distinct cases.
      ● TUS - Tuple Union Step: performing a SQL union of 2 sub queries.
      ● TCS - Tuple Constant Step: processing constant value columns.
      ● WFS - Window Function Step: performing a window function.
  • 64. Analyzing Queries : calSetTrace(1); calGetTrace();
      • Mode – Where the operation was performed: UM or PM
      • Table – Table for which columns may be scanned/projected.
      • TableOID – ObjectID for the table being scanned.
      • ReferencedOIDs – ObjectIDs for the columns required by the query.
      • PIO – Physical I/O (reads from storage) executed for the query.
      • LIO – Logical I/O executed for the query, also known as Blocks Touched.
      • PBE – Partition Blocks Eliminated identifies blocks eliminated by Extent Map min/max.
      • Elapsed – Elapsed time for a given step.
      • Rows – Intermediate rows returned
  • 66. MariaDB ColumnStore Architecture Columnar Distributed Data Storage Local Storage | SAN | EBS | Gluster FS BI Tool SQL Client Custom Big Data App Application MariaDB SQL Front End Distributed Query Engine Data Storage User Module (UM) Performance Module (PM)
  • 67. Row-oriented vs. Column-oriented format
      • Row oriented
        – Rows stored sequentially in a file
        – Scans through every record row by row
      • Column oriented
        – Each column is stored in a separate file
        – Scans only the relevant columns
      Row-oriented layout:
        ID  Fname     Lname  State  Zip    Phone           Age  Sex
        1   Bugs      Bunny  NY     11217  (718) 938-3235  34   M
        2   Yosemite  Sam    CA     95389  (209) 375-6572  52   M
        3   Daffy     Duck   NY     10013  (212) 227-1810  35   M
        4   Elmer     Fudd   ME     04578  (207) 882-7323  43   M
        5   Witch     Hazel  MA     01970  (978) 744-0991  57   F
      Column-oriented layout:
        ID:    1 2 3 4 5
        Fname: Bugs Yosemite Daffy Elmer Witch
        Lname: Bunny Sam Duck Fudd Hazel
        State: NY CA NY ME MA
        Zip:   11217 95389 10013 04578 01970
        Phone: (718) 938-3235, (209) 375-6572, (212) 227-1810, (207) 882-7323, (978) 744-0991
        Age:   34 52 35 43 57
        Sex:   M M M M F
      SELECT Fname FROM Table 1 WHERE State = 'NY'
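The two layouts can be mimicked in memory to show why the column form reads less data for this query. This is a toy illustration using three of the slide's records, not a model of ColumnStore's on-disk format.

```ruby
# Contrast row-oriented and column-oriented layouts for the same records.
rows = [
  { id: 1, fname: "Bugs",     state: "NY" },
  { id: 2, fname: "Yosemite", state: "CA" },
  { id: 3, fname: "Daffy",    state: "NY" }
]

# Column layout: one array per column, values aligned by position.
columns = {
  id:    rows.map { |r| r[:id] },
  fname: rows.map { |r| r[:fname] },
  state: rows.map { |r| r[:state] }
}

# Row scan: every full record is visited.
row_result = rows.select { |r| r[:state] == "NY" }.map { |r| r[:fname] }

# Column scan: only the state and fname columns are read.
positions = columns[:state].each_index.select { |i| columns[:state][i] == "NY" }
column_result = positions.map { |i| columns[:fname][i] }

p row_result, column_result
```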
  • 68. High Performance Query Processing Horizontal Partition: 8 Million Rows Extent 2 Horizontal Partition: 8 Million Rows Extent 3 Horizontal Partition: 8 Million Rows Extent 1 Storage Architecture reduces I/O • Only touch column files that are in projection, filter and join conditions • Eliminate disk block touches to partitions outside filter and join conditions Extent 1: Min State: CA, Max State: NY Extent 2: Min State: OR, Max State: WY Extent 3: Min State: IA, Max State: TN SELECT Fname FROM Table 1 WHERE State = ‘NY’ ID 1 2 3 4 ... 8M 8M+1 ... 16M 16M+1 ... 24M Fname Bugs Yosemite Daffy Hazel ... ... Jane ... Elmer Lname Bunny Sam Duck Fudd ... ... ... State NY CA NY ME ... MN WY TX OR ... VA TN IA NY ... PA Zip 11217 95389 10013 04578 ... ... ... Phone (718) 938-3235 (209) 375-6572 (212) 227-1810 (207) 882-7323 ... ... ... Age 34 52 35 43 ... ... ... Sex M M M F ... ... ... Vertical Partition Vertical Partition Vertical Partition Vertical Partition Vertical Partition … ELIMINATED PARTITION
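Extent elimination amounts to a range check against each extent's stored minimum and maximum. The sketch below applies that check to the three State ranges listed on the slide for the predicate State = 'NY'. The struct and the helper `extents_to_scan` are hypothetical names, and plain string comparison stands in for whatever comparison the engine actually uses.

```ruby
# Extent min/max values for the State column, taken from the slide.
extent_type = Struct.new(:name, :min, :max)
extents = [
  extent_type.new("Extent 1", "CA", "NY"),
  extent_type.new("Extent 2", "OR", "WY"),
  extent_type.new("Extent 3", "IA", "TN")
]

# An extent has to be scanned only if the value lies within [min, max].
# extents_to_scan is a hypothetical helper name.
def extents_to_scan(extents, value)
  extents.select { |e| e.min <= value && value <= e.max }
end

scanned = extents_to_scan(extents, "NY").map(&:name)
puts "scan: #{scanned.join(', ')}"
```

Extent 2 is skipped because 'NY' sorts before its minimum 'OR', which matches the eliminated partition on the slide.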
  • 70. Sizing Minimum Spec UM 4 core, 32 G RAM PM 4 core, 16 G RAM Typical Server spec PM 8 core 64G RAM UM 8 core, 264G RAM Data Storage External Data Volumes • Maximum 2 data volume per IO channel per PM node server • up to 2TB on the disk per data volume ≈ Max 4 TB per PM node Local disk Up to 2TB on the disk per PM node server DETAILED SIZING GUIDE based on data size and workload
  • 71. MariaDB ColumnStore Sizing - Example • 60TB uncompressed data = 6TB compressed data at 10x compression • 2UM - 8 core 512GB(based on workload) • 6 TB compressed = 3 data volume (at 2TB per volume) -with 1 data volume per PM node - 3PMs • Data growth - 2TB per month, Data retention - 2 years -Plan for 2TB X24 = 48 TB additional -48 TB = 4.8TB compressed ≈ 3 data volume(at 2TB per volume) with 1 data volume per PM node - 3 additional PMs • Total 6 PMs, 2 UMs
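The sizing arithmetic in this example can be reproduced step by step. The sketch assumes the slide's figures (10x compression, 2 TB per data volume, one data volume per PM node); `pm_nodes` and its parameter names are hypothetical.

```ruby
# Reproduce the sizing arithmetic from the example slide.
# pm_nodes and its parameter names are hypothetical.
def pm_nodes(uncompressed_tb, compression_ratio: 10.0, volume_tb: 2.0, volumes_per_pm: 1)
  compressed_tb = uncompressed_tb / compression_ratio
  volumes = (compressed_tb / volume_tb).ceil   # round up to whole volumes
  (volumes.to_f / volumes_per_pm).ceil
end

initial = pm_nodes(60)        # 60 TB -> 6 TB compressed -> 3 volumes
growth  = pm_nodes(2 * 24)    # 2 TB/month for 24 months -> 4.8 TB -> 3 volumes
puts "PM nodes: #{initial} + #{growth} = #{initial + growth}"
```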
  • 73. SSL variables w/o SSL
      MariaDB [(none)]> SHOW VARIABLES LIKE '%ssl%';
      +---------------------+---------------------------------+
      | Variable_name       | Value                           |
      +---------------------+---------------------------------+
      | have_openssl        | YES                             |
      | have_ssl            | DISABLED                        |
      | ssl_ca              |                                 |
      | ssl_capath          |                                 |
      | ssl_cert            |                                 |
      | ssl_cipher          |                                 |
      | ssl_crl             |                                 |
      | ssl_crlpath         |                                 |
      | ssl_key             |                                 |
      | version_ssl_library | OpenSSL 1.0.1e-fips 11 Feb 2013 |
      +---------------------+---------------------------------+
  • 74. /usr/local/mariadb/columnstore/mysql/my.cnf
      [client]
      ssl-ca = /etc/pki/tls/mariadb/certs/ca-cert.pem
      ssl-cert = /etc/pki/tls/mariadb/certs/client-cert.pem
      ssl-key = /etc/pki/tls/mariadb/private/client-key.pem
      [mysqld]
      ssl-ca = /etc/pki/tls/mariadb/certs/ca-cert.pem
      ssl-cert = /etc/pki/tls/mariadb/certs/server-cert.pem
      ssl-key = /etc/pki/tls/mariadb/private/server-key.pem
  • 75. SSL variables : SSL enabled
      MariaDB [(none)]> SHOW VARIABLES LIKE '%ssl%';
      +---------------------+---------------------------------------------+
      | Variable_name       | Value                                       |
      +---------------------+---------------------------------------------+
      | have_openssl        | YES                                         |
      | have_ssl            | YES                                         |
      | ssl_ca              | /etc/pki/tls/mariadb/certs/ca-cert.pem      |
      | ssl_capath          |                                             |
      | ssl_cert            | /etc/pki/tls/mariadb/certs/server-cert.pem  |
      | ssl_cipher          |                                             |
      | ssl_crl             |                                             |
      | ssl_crlpath         |                                             |
      | ssl_key             | /etc/pki/tls/mariadb/private/server-key.pem |
      | version_ssl_library | OpenSSL 1.0.1e-fips 11 Feb 2013             |
      +---------------------+---------------------------------------------+
  • 76. status : SSL enabled MariaDB [(none)]> status -------------- /usr/local/mariadb/columnstore/mysql/bin/mysql Ver 15.1 Distrib 10.1.23-MariaDB, for Linux (x86_64) using readline 5.1 Connection id: 5 Current database: Current user: root@localhost SSL: Cipher in use is DHE-RSA-AES256-GCM-SHA384 Current pager: stdout Using outfile: '' Using delimiter: ; Server: MariaDB Server version: 10.1.23-MariaDB Columnstore 1.0.9-1 Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: latin1 Db characterset: latin1 Client characterset: utf8 Conn. characterset: utf8 UNIX socket: /usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock