Ivan Zoratti
Big Data with MySQL
Percona Live Santa Clara 2013
V1304.01
Friday, 3 May 13
Who is Ivan?
SkySQL
•Leading provider of open source databases, services and solutions
•Home for the founders and the original developers of the core of MySQL
•The creators of MariaDB, the drop-in, innovative replacement for MySQL
What is Big Data?
http://marketingblogged.marketingmagazine.co.uk/files/Big-Data-3.jpg
Big Data!
Big data is a collection of data
sets so large and complex that it
becomes difficult to process
using on-hand database
management tools or traditional
data processing applications.
http://readwrite.com/files/styles/800_450sc/public/files/fields/shutterstock_bigdata.jpg
Big Data By Structure
Unstructured
•Store everything you have/you find
•In any format and shape
•You do not know how to use it, but it may come in handy
•Storing unstructured data is usually cheaper than storing it in a more structured datastore
•Does not fit well in a relational database
•Examples:
  •Text: plain text, documents, web content, messages
  •Bitmap: image, audio, video
•Typical approach:
  •Mining, pattern recognition, tagging
  •Usually batch analysis
Structured
•Store only what you need
•In a good format, ready to be used
•You should already know how to use it, or at least what it means
•Storing structured data is quite expensive
  •Raw data, indexing, denormalisation, aggregation
•A relational database is still the best choice
•Examples:
  •Machine-Generated Data (MGD)
  •Tags, counters, sales
•Typical approach:
  •BI tools, reporting
  •Real-time analysis, change data capture
How “Big” is Big Data?
•Data Factors
•Size
•Speed to collect/generate
•Variety
•Resources
•Administrators
•Developers
•Infrastructure
•Growth
•Collection
•Processing
•Availability
•To whom?
•For how long?
•In which format?
•Aggregated
•Detailed
How to manage Big Data
•Collection - Storage - Archive
•Load - Transform - Analyze
•Access - Explore - Utilize
http://www.futuresmag.com/2012/07/01/big-data-manage-it-dont-drown-in-it
Big Data with MySQL
http://news.mydosti.com/newsphotos/tech/BigDataV1Dec22012.jpg
Technologies to Use / Consider / Watch
•MyISAM and MyISAM compression
•InnoDB compression
•MySQL 5.6 Partitioning
•MariaDB Optimizer
•MariaDB Virtual & Dynamic Columns
•Cassandra Storage Engine
•Connect Storage Engine
•Columnar Databases
  •InfiniDB
  •Infobright
•TokuDB Storage Engine
Columnar Databases
•Automatic compression
•Automatic column storage
•Data distribution
•Map/Reduce approach
•MPP / Parallel loading
•No indexes
•On public clouds, HW or SW appliances
TokuDB
•Increased Performance
•Increased Compression
•Online administration
•No Index rebuild
MyISAM
•Static, dynamic and compressed formats
•Multiple key caches, CACHE INDEX and LOAD INDEX
•Compressed tables
•Horizontal partitioning (manual)
•External locking
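A minimal sketch of the multiple key cache bullet above, using the MyISAM statements it names. The table `hits` and the cache name `hot_cache` are hypothetical, and the 128 MB size is an arbitrary illustration rather than a recommendation.

```sql
-- Hypothetical MyISAM table, used only for this illustration.
CREATE TABLE hits (
  id  INT NOT NULL,
  url VARCHAR(255),
  PRIMARY KEY (id),
  KEY idx_url (url)
) ENGINE = MyISAM;

-- A named key cache is created by giving it a size (128 MB, chosen arbitrarily).
SET GLOBAL hot_cache.key_buffer_size = 128 * 1024 * 1024;

-- Assign the table's indexes to the named cache instead of the default one.
CACHE INDEX hits IN hot_cache;

-- Preload the index blocks so the first queries do not pay for cold reads.
LOAD INDEX INTO CACHE hits;
```

Compressed MyISAM tables are read-only and are produced offline with the myisampack utility, typically followed by myisamchk -rq to rebuild the indexes.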
InnoDB/XtraDB
•Data Load
  •Pre-order data
  •Split data into chunks
  •unique_checks = 0;
  •foreign_key_checks = 0;
  •sql_log_bin = 0;
  •innodb_autoinc_lock_mode = 2;
•Compression and block size
•Persistent optimizer stats
  •innodb_stats_persistent
  •innodb_stats_auto_recalc
SET GLOBAL innodb_file_per_table = 1;
SET GLOBAL innodb_file_format = Barracuda;
CREATE TABLE t1 (
  c1 INT PRIMARY KEY,
  c2 VARCHAR(255) )
ROW_FORMAT = COMPRESSED
KEY_BLOCK_SIZE = 8;
LOAD DATA LOCAL INFILE '/usr2/t1_01_simple' INTO TABLE t1;
Query OK, 134217728 rows affected (1 hour 34 min 7.49 sec)
Records: 134217728  Deleted: 0  Skipped: 0  Warnings: 0
LOAD DATA LOCAL INFILE '/usr2/t1_01_simple' INTO TABLE t2;
Query OK, 134217728 rows affected (25 min 20.75 sec)
Records: 134217728  Deleted: 0  Skipped: 0  Warnings: 0
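The data load bullets on this slide could be combined roughly as follows. This is a sketch under assumptions: it reuses table t1 from the slide, the chunk file names are hypothetical, and the input is assumed to be already sorted by primary key. Note that innodb_autoinc_lock_mode is a startup option, so it belongs in the server configuration file rather than in a SET statement.

```sql
-- Session-level settings for a bulk load; restore them afterwards.
SET SESSION unique_checks = 0;       -- relax uniqueness checks on secondary indexes
SET SESSION foreign_key_checks = 0;  -- skip foreign key validation
SET SESSION sql_log_bin = 0;         -- keep the load out of the binary log (needs SUPER)

-- Load the pre-sorted chunks one at a time (hypothetical file names).
LOAD DATA LOCAL INFILE '/usr2/t1_chunk_01' INTO TABLE t1;
LOAD DATA LOCAL INFILE '/usr2/t1_chunk_02' INTO TABLE t1;

SET SESSION sql_log_bin = 1;
SET SESSION foreign_key_checks = 1;
SET SESSION unique_checks = 1;

-- Persistent optimizer statistics (MySQL 5.6): keep them on disk and
-- refresh them explicitly once the load has finished.
ALTER TABLE t1 STATS_PERSISTENT = 1, STATS_AUTO_RECALC = 0;
ANALYZE TABLE t1;
```

Disabling the checks is only safe when the input is known to be free of duplicates and orphaned references, since nothing verifies the rows while the settings are off.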
Partitioning (MySQL 5.6)
•Partitioning Types
  •RANGE, LIST, RANGE COLUMNS, HASH, LINEAR HASH, KEY, LINEAR KEY, sub-partitions
•Partition and lock pruning
•Use of INDEX and DATA DIRECTORY
•PARTITION ADD, DROP, REORGANIZE, COALESCE, TRUNCATE, EXCHANGE, REBUILD, OPTIMIZE, CHECK, ANALYZE, REPAIR
CREATE TABLE t1 ( c1 INT, c2 DATE )
PARTITION BY RANGE ( YEAR( c2 ) )
SUBPARTITION BY HASH ( TO_DAYS( c2 ) )
( PARTITION p0 VALUES LESS THAN (1990) (
    SUBPARTITION s0
      DATA DIRECTORY = '/disk0/data'
      INDEX DIRECTORY = '/disk0/idx',
    SUBPARTITION s1
      DATA DIRECTORY = '/disk1/data'
      INDEX DIRECTORY = '/disk1/idx' ),...
ALTER TABLE t1
  EXCHANGE PARTITION p3 WITH TABLE t2;
-- Range and List partitions
ALTER TABLE t1 REORGANIZE PARTITION
  p0,p1,p2,p3 INTO (
  PARTITION m0 VALUES LESS THAN (1980),
  PARTITION m1 VALUES LESS THAN (2000));
-- Hash and Key partitions
ALTER TABLE t1 COALESCE PARTITION 10;
ALTER TABLE t1 ADD PARTITION PARTITIONS 5;
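Partition pruning, listed among the bullets above, can be checked with EXPLAIN PARTITIONS in MySQL 5.6. The sketch below uses a hypothetical table sales_by_year, not one from the talk.

```sql
-- Hypothetical table, partitioned by year of sale.
CREATE TABLE sales_by_year (
  id      INT  NOT NULL,
  sold_on DATE NOT NULL
)
PARTITION BY RANGE ( YEAR(sold_on) ) (
  PARTITION p2011 VALUES LESS THAN (2012),
  PARTITION p2012 VALUES LESS THAN (2013),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- The "partitions" column of the plan shows which partitions will be read.
-- With a date range that falls inside 2012, it should list only p2012.
EXPLAIN PARTITIONS
SELECT COUNT(*)
FROM sales_by_year
WHERE sold_on BETWEEN '2012-01-01' AND '2012-12-31';

-- MySQL 5.6 also accepts explicit partition selection.
SELECT COUNT(*) FROM sales_by_year PARTITION (p2012);
```

Pruning depends on the WHERE clause referring to the partitioning column directly; wrapping sold_on in another expression in the predicate may force a scan of every partition.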
MariaDB Optimizer
•Multi-Range Read (MRR)*
•Index Merge / Sort intersection
•Batch Key Access*
•Block hash join
•Cost-based choice of range vs. index_merge
•ORDER BY ... LIMIT <limit>*
•MariaDB 10
•Subqueries
  •Semi-join*
  •Materialization*
  •Subquery cache
•LIMIT ... ROWS EXAMINED <limit>
(*) - Available in MySQL 5.6
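A short sketch of the LIMIT ... ROWS EXAMINED clause from the list above. It is MariaDB-specific syntax, and the tables customers and orders are hypothetical, introduced only for the illustration.

```sql
-- Hypothetical tables, used only to illustrate the clause.
CREATE TABLE customers (
  id   INT PRIMARY KEY,
  name VARCHAR(64)
);
CREATE TABLE orders (
  id          INT PRIMARY KEY,
  customer_id INT,
  total       DECIMAL(10,2)
);

-- Return at most 10 rows, and stop executing once roughly 10000 rows
-- have been examined, whichever limit is reached first.
SELECT c.name, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.id
LIMIT 10 ROWS EXAMINED 10000;
```

When the examined-rows limit is hit, MariaDB ends the query with a warning and returns whatever rows it had produced so far, which makes the clause a guard against runaway exploratory queries on large tables.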
Virtual & Dynamic Columns
VIRTUAL COLUMNS
•For InnoDB, MyISAM and Aria
•PERSISTENT (stored) or VIRTUAL (generated)
CREATE TABLE t1 (
  c1 INT NOT NULL,
  c2 VARCHAR(32),
  c3 INT AS ( c1 MOD 10 ) VIRTUAL,
  c4 VARCHAR(5) AS ( LEFT(c2,5) ) PERSISTENT );
DYNAMIC COLUMNS
•Implement a schemaless document store
•COLUMN_ functions: CREATE, ADD, GET, LIST, JSON, EXISTS, CHECK, DELETE
•Nested columns are allowed
•Main datatypes are allowed
•Max 1GB documents
CREATE TABLE assets (
  item_name VARCHAR(32) PRIMARY KEY,
  dynamic_cols BLOB );
INSERT INTO assets VALUES (
  'MariaDB T-shirt',
  COLUMN_CREATE( 'color', 'blue',
                 'size', 'XL' ) );
INSERT INTO assets VALUES (
  'Thinkpad Laptop',
  COLUMN_CREATE( 'color', 'black',
                 'price', 500 ) );
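The rows inserted above can be read back and modified with the other COLUMN_ functions. A minimal sketch, assuming the assets table from the slide and a MariaDB 10.0 server (named dynamic columns are not available in earlier releases); the 'warranty' attribute is a made-up example.

```sql
-- Read one attribute; COLUMN_GET needs an explicit target type.
SELECT item_name,
       COLUMN_GET(dynamic_cols, 'color' AS CHAR) AS color
FROM assets;

-- List the attribute names stored in each row.
SELECT item_name, COLUMN_LIST(dynamic_cols) FROM assets;

-- Render the whole attribute set as JSON.
SELECT item_name, COLUMN_JSON(dynamic_cols) FROM assets;

-- Add (or overwrite) an attribute on a single row.
UPDATE assets
SET dynamic_cols = COLUMN_ADD(dynamic_cols, 'warranty', '3 years')
WHERE item_name = 'Thinkpad Laptop';

-- Keep only the rows that carry a given attribute.
SELECT item_name
FROM assets
WHERE COLUMN_EXISTS(dynamic_cols, 'price');
```

Rows without the requested attribute return NULL from COLUMN_GET, so queries over sparse attributes do not need a separate existence check unless NULL is itself a meaningful value.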
Cassandra Storage Engine
•Column Family == Table
•Rowkey, static and dynamic columns allowed
•Batch key access support
SET GLOBAL cassandra_default_thrift_host = '192.168.0.10';
CREATE TABLE cassandra_tbl (
  rowkey INT PRIMARY KEY,
  col1 VARCHAR(25),
  col2 BIGINT,
  dyn_cols BLOB DYNAMIC_COLUMN_STORAGE = yes )
ENGINE = cassandra
KEYSPACE = 'cassandra_key_space'
COLUMN_FAMILY = 'column_family_name';
Connect Storage Engine
•Any file format as MySQL TABLE:
•ODBC
•Text, XML, *ML
•Excel, Access etc.
•MariaDB CREATE TABLE options
•Multi-file table
•Table Autocreation
•Condition push down
•Read/Write and Multi Storage Engine Join
•CREATE INDEX
CREATE TABLE handout
ENGINE = CONNECT
TABLE_TYPE = XML
FILE_NAME = 'handout.htm'
HEADER = yes OPTION_LIST =
'name = TABLE,
coltype = HTML,
attribute =
(border=1;cellpadding=5)';
Starting Your Big Data Project
Why would you use MySQL?
• Time
• Knowledge
• Infrastructure
• Costs
• Simplified Integration
• Not so “big” data
Apache Hadoop & Friends
HDFS
MapReduce
PIG HIVE
HCatalog
HBASE
ZooKeeper
•Mahout
•Ambari, Ganglia, Nagios
•Sqoop
•Cascading
•Oozie
•Flume
•Protobuf, Avro, Thrift
•Fuse-DFS
•Chukwa
•Cassandra
MySQL & Friends
MySQL/MariaDB/Storage Engines
SQL Optimizer
Scripts
Stored Procedures DML
DB Schema / DDL
MySQL/MariaDB
SkySQL DS
•Mahout
•SDS, Ganglia, Nagios
•mysqlimport
•Cascading
•Talend, Pentaho
•Connect
Join us at the Solutions Day
•Cassandra and Connect Storage Engine
•Map/Reduce approach - Proxy optimisation
•Multiple protocols and more
Thank You!
ivan@skysql.com
izoratti.blogspot.com
www.slideshare.net/izoratti
www.skysql.com
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Big Data with MySQL

  • 1. Ivan Zoratti Big Data with MySQL Percona Live Santa Clara 2013 V1304.01 Friday, 3 May 13
  • 3. SkySQL •Leading provider of open source databases, services and solutions •Home for the founders and the original developers of the core of MySQL •The creators of MariaDB, the drop-in, innovative replacement of MySQL
  • 4. What is Big Data? http://marketingblogged.marketingmagazine.co.uk/files/Big-Data-3.jpg
  • 5. Big Data! Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. http://readwrite.com/files/styles/800_450sc/public/files/fields/shutterstock_bigdata.jpg
  • 6. Big Data By Structure Unstructured •Store everything you have/you find •In any format and shape •You do not know how to use it, but it may come in handy •Storing unstructured data is usually cheaper than storing it in a more structured datastore •Does not fit well in a relational database •Examples: •Text: Plain text, documents, web content, messages •Bitmap: Image, audio, video •Typical approach: •Mining, pattern recognition, tagging •Usually batch analysis Structured •Store only what you need •In a good format, ready to be used •You should already know how to use it, or at least what it means •Storing structured data is quite expensive •Raw data, indexing, denormalisation, aggregation •A relational database is still the best choice •Examples: •Machine-Generated Data (MGD) •Tags, counters, sales •Typical approach: •BI tools, reporting •Real-time analysis, change data capture
  • 8. How “Big” is Big Data? •Data Factors •Size •Speed to collect/generate •Variety •Resources •Administrators •Developers •Infrastructure •Growth •Collection •Processing •Availability •To whom? •For how long? •In which format? •Aggregated •Detailed
  • 9. How to manage Big Data •Collection - Storage - Archive •Load - Transform - Analyze •Access - Explore - Utilize http://www.futuresmag.com/2012/07/01/big-data-manage-it-dont-drown-in-it
  • 10. Big Data with MySQL http://news.mydosti.com/newsphotos/tech/BigDataV1Dec22012.jpg
  • 11. Technologies to Use / Consider / Watch •MyISAM and MyISAM compression •InnoDB compression •MySQL 5.6 Partitioning •MariaDB Optimizer •MariaDB Virtual & Dynamic Columns •Cassandra Storage Engine •Connect Storage Engine •Columnar Databases •InfiniDB •Infobright •TokuDB Storage Engine
  • 12. Columnar Databases •Automatic compression •Automatic column storage •Data distribution •Map/Reduce approach •MPP / Parallel loading •No indexes •On public clouds, HW or SW appliances
  • 13. TokuDB •Increased Performance •Increased Compression •Online administration •No Index rebuild
  • 14. MyISAM •Static, dynamic and compressed format •Multiple key cache, CACHE INDEX and LOAD INDEX •Compressed tables •Horizontal partitioning (manual) •External locking
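
The "multiple key cache, CACHE INDEX and LOAD INDEX" bullet on the MyISAM slide could look roughly like this in practice. This is only a sketch: the cache name `hot_cache`, its size, and the table name `sales_archive` are hypothetical and do not come from the slides.

```sql
-- Create a dedicated 128 MB key cache (the name hot_cache is hypothetical).
SET GLOBAL hot_cache.key_buffer_size = 128 * 1024 * 1024;

-- Assign all indexes of a MyISAM table (hypothetical name) to that cache,
-- so that they no longer compete with other tables in the default key cache.
CACHE INDEX sales_archive IN hot_cache;

-- Preload the index blocks; IGNORE LEAVES loads only the non-leaf blocks.
LOAD INDEX INTO CACHE sales_archive IGNORE LEAVES;
```

Key cache assignments are not persistent across server restarts, so statements like these are typically placed in an init file.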
  • 15. InnoDB/XtraDB •Data Load •Pre-order data •Split data into chunks •unique_checks = 0; •foreign_key_checks = 0; •sql_log_bin = 0; •innodb_autoinc_lock_mode = 2; •Compression and block size •Persistent optimizer stats •innodb_stats_persistent •innodb_stats_auto_recalc SET GLOBAL innodb_file_per_table = 1; SET GLOBAL innodb_file_format = Barracuda; CREATE TABLE t1 ( c1 INT PRIMARY KEY, c2 VARCHAR(255) ) ROW_FORMAT = COMPRESSED KEY_BLOCK_SIZE = 8; LOAD DATA LOCAL INFILE '/usr2/t1_01_simple' INTO TABLE t1; Query OK, 134217728 rows affected (1 hour 34 min 7.49 sec) Records: 134217728 Deleted: 0 Skipped: 0 Warnings: 0 LOAD DATA LOCAL INFILE '/usr2/t1_01_simple' INTO TABLE t2; Query OK, 134217728 rows affected (25 min 20.75 sec) Records: 134217728 Deleted: 0 Skipped: 0 Warnings: 0
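
A minimal sketch of how the data-load settings listed on the InnoDB/XtraDB slide might be applied around a bulk load. The chunk file names are hypothetical, and the table `t1` is assumed to be the one created on the slide. `innodb_autoinc_lock_mode` cannot be changed at runtime, so it appears only as a comment.

```sql
-- Relax per-row checks for this session only.
-- With unique_checks = 0 the input must not contain duplicate keys.
SET SESSION unique_checks = 0;
SET SESSION foreign_key_checks = 0;
-- Skip binary logging for this session (requires the SUPER privilege).
SET SESSION sql_log_bin = 0;
-- innodb_autoinc_lock_mode = 2 is a startup option: set it in my.cnf.

-- Load chunks that were pre-sorted in primary-key order
-- (file names are hypothetical).
LOAD DATA LOCAL INFILE '/usr2/t1_chunk_01' INTO TABLE t1;
LOAD DATA LOCAL INFILE '/usr2/t1_chunk_02' INTO TABLE t1;

-- Restore the defaults once the load has finished.
SET SESSION unique_checks = 1;
SET SESSION foreign_key_checks = 1;
SET SESSION sql_log_bin = 1;

-- Persistent optimizer statistics (MySQL 5.6): keep them on disk,
-- disable automatic recalculation, and refresh them once after the load.
ALTER TABLE t1 STATS_PERSISTENT = 1, STATS_AUTO_RECALC = 0;
ANALYZE TABLE t1;
```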
  • 16. Partitioning (MySQL 5.6) •Partitioning Types •RANGE, LIST, RANGE COLUMNS, HASH, LINEAR HASH, KEY, LINEAR KEY, sub-partitions •Partition and lock pruning •Use of INDEX and DATA DIRECTORY •PARTITION ADD, DROP, REORGANIZE, COALESCE, TRUNCATE, EXCHANGE, REBUILD, OPTIMIZE, CHECK, ANALYZE, REPAIR CREATE TABLE t1 ( c1 INT, c2 DATE ) PARTITION BY RANGE( YEAR( c2 ) ) SUBPARTITION BY HASH ( TO_DAYS( c2 ) ) ( PARTITION p0 VALUES LESS THAN (1990) ( SUBPARTITION s0 DATA DIRECTORY = '/disk0/data' INDEX DIRECTORY = '/disk0/idx', SUBPARTITION s1 DATA DIRECTORY = '/disk1/data' INDEX DIRECTORY = '/disk1/idx' ),... ALTER TABLE t1 EXCHANGE PARTITION p3 WITH TABLE t2; -- Range and List partitions ALTER TABLE t1 REORGANIZE PARTITION p0,p1,p2,p3 INTO ( PARTITION m0 VALUES LESS THAN (1980), PARTITION m1 VALUES LESS THAN (2000)); -- Hash and Key partitions ALTER TABLE t1 COALESCE PARTITION 10; ALTER TABLE t1 ADD PARTITION PARTITIONS 5;
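
To illustrate the "partition pruning" bullet, here is a small sketch under stated assumptions: the table `events`, its columns, and the partition names are hypothetical and not taken from the slides. `EXPLAIN PARTITIONS` shows which partitions the optimizer intends to read, and MySQL 5.6 also allows naming partitions explicitly in a query.

```sql
-- Hypothetical table partitioned by year.
CREATE TABLE events (
  id      INT  NOT NULL,
  created DATE NOT NULL
)
PARTITION BY RANGE ( YEAR(created) ) (
  PARTITION p2011 VALUES LESS THAN (2012),
  PARTITION p2012 VALUES LESS THAN (2013),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- The "partitions" column of the plan should list only the partitions
-- that can contain matching rows (here, ideally just p2012).
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM events
WHERE created BETWEEN '2012-01-01' AND '2012-12-31';

-- Explicit partition selection (MySQL 5.6).
SELECT COUNT(*) FROM events PARTITION (p2012);

-- Discard a whole year without a row-by-row DELETE.
ALTER TABLE events TRUNCATE PARTITION p2011;
```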
  • 17. MariaDB Optimizer •Multi-Range Read (MRR)* •Index Merge / Sort intersection •Batch Key Access* •Block hash join •Cost-based choice of range vs. index_merge •ORDER BY ... LIMIT <limit>* •MariaDB 10 •Subqueries •Semi-join* •Materialization* •Subquery cache •LIMIT ... ROWS EXAMINED <limit> (*) - Available in MySQL 5.6
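
A brief sketch of two items from the optimizer slide. The table `t1` with columns `c1` and `c2` is assumed for illustration only. `LIMIT ... ROWS EXAMINED` is MariaDB-specific; `optimizer_switch` exists in both MariaDB and MySQL 5.6, although the set of available flags differs between them.

```sql
-- MariaDB only: return at most 10 rows and stop executing once roughly
-- 10000 rows have been examined. When the limit is hit, the result may
-- be incomplete and a warning is raised.
SELECT c1, c2
FROM t1
WHERE c2 LIKE '%big%'
LIMIT 10 ROWS EXAMINED 10000;

-- Inspect the optimizer features currently enabled for this session.
SELECT @@optimizer_switch;

-- Enable Multi-Range Read for this session.
SET SESSION optimizer_switch = 'mrr=on';
```

The row limit is a safety net for ad hoc queries against large tables rather than a tuning tool: it bounds the work done by a query whose plan turns out to be poor.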
  • 18. Virtual & Dynamic Columns VIRTUAL COLUMNS •For InnoDB, MyISAM and Aria •PERSISTENT (stored) or VIRTUAL (generated) CREATE TABLE t1 ( c1 INT NOT NULL, c2 VARCHAR(32), c3 INT AS ( c1 MOD 10 ) VIRTUAL, c4 VARCHAR(5) AS ( LEFT(c2,5) ) PERSISTENT); DYNAMIC COLUMNS •Implement a schemaless, document store •COLUMN_CREATE, ADD, GET, LIST, JSON, EXISTS, CHECK, DELETE •Nested columns are allowed •Main datatypes are allowed •Max 1GB documents CREATE TABLE assets ( item_name VARCHAR(32) PRIMARY KEY, dynamic_cols BLOB ); INSERT INTO assets VALUES ( 'MariaDB T-shirt', COLUMN_CREATE( 'color', 'blue', 'size', 'XL' ) ); INSERT INTO assets VALUES ( 'Thinkpad Laptop', COLUMN_CREATE( 'color', 'black', 'price', 500 ) );
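
The slide shows how dynamic columns are written; reading them back could look like the sketch below, which reuses the `assets` table from the slide. The added `warranty` attribute is hypothetical. Named dynamic columns, as used here and on the slide, assume MariaDB 10.0 or later.

```sql
-- Read one attribute; COLUMN_GET needs the target type to be stated.
SELECT item_name,
       COLUMN_GET(dynamic_cols, 'color' AS CHAR) AS color
FROM assets;

-- Add (or overwrite) an attribute on a single row.
UPDATE assets
SET dynamic_cols = COLUMN_ADD(dynamic_cols, 'warranty', '3 years')
WHERE item_name = 'Thinkpad Laptop';

-- Find the rows that carry a given attribute.
SELECT item_name
FROM assets
WHERE COLUMN_EXISTS(dynamic_cols, 'price');

-- Dump the whole document as JSON, or just list the attribute names.
SELECT item_name, COLUMN_JSON(dynamic_cols) FROM assets;
SELECT item_name, COLUMN_LIST(dynamic_cols) FROM assets;
```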
  • 19. Cassandra Storage Engine •Column Family == Table •Rowkey, static and dynamic columns allowed •Batch key access support SET cassandra_default_thrift_host = '192.168.0.10'; CREATE TABLE cassandra_tbl ( rowkey INT PRIMARY KEY, col1 VARCHAR(25), col2 BIGINT, dyn_cols BLOB DYNAMIC_COLUMN_STORAGE = yes ) ENGINE = cassandra KEYSPACE = 'cassandra_key_space' COLUMN_FAMILY = 'column_family_name';
  • 20. Connect Storage Engine •Any file format as MySQL TABLE: •ODBC •Text, XML, *ML •Excel, Access etc. •MariaDB CREATE TABLE options •Multi-file table •Table Autocreation •Condition push down •Read/Write and Multi Storage Engine Join •CREATE INDEX CREATE TABLE handout ENGINE = CONNECT TABLE_TYPE = XML FILE_NAME = 'handout.htm' HEADER = yes OPTION_LIST = 'name = TABLE, coltype = HTML, attribute = (border=1;cellpadding=5)';
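
As a companion to the XML example on the slide, the sketch below exposes a delimited text file as a table and joins it with a regular table ("Multi Storage Engine Join"). The file path, the column layout, and the `customers` table are all hypothetical, and the CONNECT engine is assumed to be installed on a MariaDB server.

```sql
-- Map a comma-separated file with a header row onto a table.
-- File path and column layout are hypothetical.
CREATE TABLE sales_csv (
  sale_id   INT         NOT NULL,
  customer  VARCHAR(32) NOT NULL,
  amount    DOUBLE      NOT NULL
)
ENGINE = CONNECT
TABLE_TYPE = CSV
FILE_NAME = '/data/sales.csv'
HEADER = 1
SEP_CHAR = ',';

-- The file can then be queried, and joined with tables stored in
-- other engines (customers is a hypothetical InnoDB table).
SELECT c.country, SUM(s.amount) AS total
FROM sales_csv AS s
JOIN customers AS c ON c.name = s.customer
GROUP BY c.country;
```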
  • 21. Starting Your Big Data Project
  • 22. Why would you use MySQL? • Time • Knowledge • Infrastructure • Costs • Simplified Integration • Not so “big” data
  • 23. Apache Hadoop & Friends HDFS MapReduce PIG HIVE HCatalog HBASE ZooKeeper •Mahout •Ambari, Ganglia, Nagios •Sqoop •Cascading •Oozie •Flume •Protobuf, Avro, Thrift •Fuse-DFS •Chukwa •Cassandra
  • 24. MySQL & Friends MySQL/MariaDB/Storage Engines SQL Optimizer Scripts Stored Procedures DML DB Schema / DDL MySQL/MariaDB SkySQLDS •Mahout •SDS, Ganglia, Nagios •mysqlimport •Cascading •Talend, Pentaho •Connect
  • 25. Join us at the Solutions Day •Cassandra and Connect Storage Engine •Map/Reduce approach - Proxy optimisation •Multiple protocols and more