Bay Area Hadoop Users Group
Turning the Tables with InfiniDB for Hadoop
December 18, 2013
Agenda
• InfiniDB Background
• InfiniDB Technical Foundations
  o Parallelism
  o Partitioning Model
  o Additional I/O Efficiencies
• (My)SQL for Hadoop
• When to use Columnar/InfiniDB for Hadoop
• InfiniDB Benchmarks

Copyright © 2013 Calpont. All Rights Reserved.
InfiniDB Background
Platforms
• InfiniDB
• InfiniDB for the Cloud
• InfiniDB for Hadoop

Versions
• InfiniDB launched Feb 2010
• InfiniDB 4, the latest release, available October 2013
• Added InfiniDB for Hadoop

Source code at https://github.com/infinidb
• GPL v2
• No restrictions on syntax, scale, or performance

InfiniDB Background - Customer Base

InfiniDB Background
Platforms
• InfiniDB: Local Disk, GlusterFS, Windows*
  http://www.calpont.com/products/tryinfinidb
• InfiniDB for Hadoop: CDH or HDP
  http://www.calpont.com/products/tryinfinidb
• InfiniDB for the Cloud: any availability zone

InfiniDB Background – InfiniDB for Hadoop
• InfiniDB is a non-MapReduce engine
• Reads and writes natively to HDFS
[Diagram: Pig/Hive, HBase, Map Reduce, and InfiniDB for Hadoop shown above the Hadoop Distributed File System]
InfiniDB Background - InfiniDB for Hadoop
Is InfiniDB a Database?
… not a General Purpose DBMS.

Is InfiniDB NoSQL?
… only in the sense that we discarded traditional DBMS architectures.

Is InfiniDB an SQL for Hadoop technology?
… Yes, but not general purpose SQL. InfiniDB is highly optimized for analytic workloads/queries.

“InfiniDB turns SQL developers into Big Data developers. We deployed it quickly and easily for our online sales analytics. Something we couldn’t do with Hadoop, Mongo, or Teradata”
InfiniDB Foundation - Parallelism
• User Module – Processes SQL Requests
• Performance Module – Executes the Queries
[Diagram: Single Server or MPP; storage on Local disk / EBS or GlusterFS / HDFS]
InfiniDB Foundation - Parallelism
• Purpose-built C++ engine
• Parallelism is at the thread level
• Example: 12 Performance Module (PM) servers with 8 cores each yield 96 parallel processing engines.
• SQL is translated into thousands or tens of thousands of discrete jobs, or “primitives”.
• The User Module (UM) sends primitives to the processing engines.
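The sizing example above (12 PMs with 8 cores each) and the idea of cutting one SQL statement into many primitives can be sketched in Python. This is only an illustrative model: the function names, the 8,192-row primitive size, the 80-million-row table, and the round-robin hand-off are assumptions made for the sketch, not InfiniDB's actual scheduler.

```python
from collections import Counter

def parallel_engines(pm_servers, cores_per_pm):
    """Thread-level parallelism: one processing engine per core on every PM."""
    return pm_servers * cores_per_pm

def split_into_primitives(total_rows, rows_per_primitive):
    """Cut a scan over total_rows into (start, end) row ranges, one per primitive."""
    return [(start, min(start + rows_per_primitive, total_rows))
            for start in range(0, total_rows, rows_per_primitive)]

def dispatch(primitives, engines):
    """Hypothetical round-robin hand-off of primitives to engines (an assumption)."""
    return Counter(index % engines for index, _ in enumerate(primitives))

engines = parallel_engines(pm_servers=12, cores_per_pm=8)
# Assumed workload: an 80-million-row scan in primitives of 8,192 rows each.
primitives = split_into_primitives(total_rows=80_000_000, rows_per_primitive=8_192)
load = dispatch(primitives, engines)

print("engines:", engines)
print("primitives:", len(primitives))
print("primitives per engine:", min(load.values()), "to", max(load.values()))
```

With these assumed sizes a single scan already produces several thousand primitives, so every engine has work queued even though the thread count per PM is fixed.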
InfiniDB Foundation - Parallelism
•User Module – Processes SQL Requests
•Performance Module – Executes the Queries
• Primitives are issued to a thread queue within the PM
• Fixed thread count at the PM
[Diagram: Single Server or MPP; storage on Local disk / EBS or GlusterFS / HDFS]
Fully Parallel SQL + Full SQL Syntax

[Diagram: Distribution of Work (DoW) phase, then Reduce phase]

SQL operations are translated into thousands of jobs via custom Distribution of Work:
• Parallel/Distributed Data Access
• Parallel/Distributed Joins (Inner, Outer)
• Parallel/Distributed Sub-queries (From, Where, Select)
• Parallel/Distributed Group By, Distinct, and Aggregation
• Extensible with Parallel/Distributed User Defined Functions
Results are returned to the User Module in the Reduce phase.
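As a rough illustration of a distributed Group By followed by a Reduce phase, the sketch below aggregates locally on each Performance Module and merges the partial results on the User Module. The function names, the three-PM layout, and the sample rows are hypothetical, and the real engine's primitives are not shown in the slides.

```python
from collections import defaultdict

def pm_partial_aggregate(rows):
    """Work done on one Performance Module: local GROUP BY with SUM and COUNT."""
    partial = defaultdict(lambda: [0, 0])  # key -> [sum, count]
    for key, value in rows:
        partial[key][0] += value
        partial[key][1] += 1
    return dict(partial)

def um_reduce(partials):
    """Reduce phase on the User Module: merge the per-PM partial results."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    # AVG is derived only after merging, so it stays correct across PMs.
    return {key: {"sum": s, "count": c, "avg": s / c}
            for key, (s, c) in merged.items()}

# Hypothetical (platform, sales) rows already distributed across three PMs.
pm_data = [
    [("pc", 10), ("console", 5)],
    [("pc", 30), ("mobile", 7)],
    [("console", 15), ("pc", 20)],
]
result = um_reduce(pm_partial_aggregate(rows) for rows in pm_data)
print(result)
```

Shipping sums and counts rather than per-PM averages is the design point: partial averages cannot be merged correctly, while sums and counts can.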
InfiniDB Data Partitioning
2-Dimensional Partitioning Model
• Vertical Partitioning by Column
  o Not Column-Family (no relation to HBase)
  o Only do I/O for columns requested
• Horizontal Partitioning by range of rows
  o Meta-data stored within in-memory structure
InfiniDB Data Partitioning
• Partition elimination can occur based on:
  o Columns not included in the SQL statement.
  o A filter expressed within the query.
  o A filter expressed on a joined table: a Table1 filter can drive Table2 I/O elimination.
  o The intersection between filters: Filter1 AND Filter2 does I/O only on the intersection.
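One common way to implement this kind of elimination is to keep a minimum and maximum value per horizontal partition (extent) in memory and skip any extent whose range cannot match the filter. The sketch below assumes that approach; the `Extent` class, the sample ranges, and the column names are hypothetical, and the slides do not spell out the exact metadata InfiniDB keeps.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Extent:
    extent_id: int   # identifies a range of rows shared by all columns
    min_value: int   # smallest value of this column within the extent (assumed metadata)
    max_value: int   # largest value of this column within the extent (assumed metadata)

def surviving_extents(extents, low, high):
    """Keep only extents whose [min, max] range can overlap the filter [low, high]."""
    return {e.extent_id for e in extents
            if e.max_value >= low and e.min_value <= high}

# Hypothetical in-memory metadata for two columns of the same table.
date_extents = [Extent(0, 20130101, 20130331), Extent(1, 20130401, 20130630),
                Extent(2, 20130701, 20130930), Extent(3, 20131001, 20131231)]
price_extents = [Extent(0, 5, 40), Extent(1, 10, 90),
                 Extent(2, 60, 120), Extent(3, 1, 30)]

by_date = surviving_extents(date_extents, 20130501, 20130815)   # Filter1
by_price = surviving_extents(price_extents, 100, 200)           # Filter2
to_read = by_date & by_price                                    # Filter1 AND Filter2

print("Filter1 keeps extents:", sorted(by_date))
print("Filter2 keeps extents:", sorted(by_price))
print("I/O is needed only for extents:", sorted(to_read))
```

No data blocks are touched while deciding what to skip; only the small metadata lists are consulted.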
Column Restriction and Projection
[Diagram labels: Column # Four, Column # Six, Column # Seventeen; Filter 1, Filter 2, Filter 3; Extent # 5, Extent # 27; Projection]
• Automatic Vertical Partitioning + Horizontal Partitioning
• Just-In-Time Materialization
Additional I/O Efficiency
Techniques to Avoid Unnecessary I/O
• Vertical Partitioning: read only the columns required
• Horizontal Partitioning: focus on the rows required
• Just-in-time materialization

Techniques for Efficient I/O
• Columnar compression reduces I/O from disk
• Global data buffer cache can reduce disk I/O (in-memory)
• Avoidance of random I/O
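Just-in-time (late) materialization can be pictured as filtering on one column first, carrying only row positions forward, and fetching the projected columns at the very end. The following is a minimal sketch under that reading; the table contents, the column names, and the `read_column` bookkeeping are hypothetical and are not taken from InfiniDB.

```python
# Hypothetical column-oriented table: each column is stored as its own list.
table = {
    "id":        [1, 2, 3, 4, 5, 6],
    "platform":  ["pc", "console", "pc", "mobile", "pc", "console"],
    "sales":     [10, 5, 30, 7, 20, 15],
    "publisher": ["A", "B", "A", "C", "B", "A"],
}
columns_read = []  # records which columns the query actually touched

def read_column(name):
    """Stand-in for column I/O; only requested columns are ever read."""
    columns_read.append(name)
    return table[name]

def filter_positions(column_name, predicate):
    """Scan one column and return only the matching row positions."""
    return [pos for pos, value in enumerate(read_column(column_name))
            if predicate(value)]

def materialize(positions, projected_columns):
    """Build result rows late, taking projected values only at surviving positions."""
    fetched = {name: read_column(name) for name in projected_columns}
    return [tuple(fetched[name][pos] for name in projected_columns)
            for pos in positions]

# Roughly: SELECT id, sales FROM table WHERE platform = 'pc'
positions = filter_positions("platform", lambda value: value == "pc")
rows = materialize(positions, ["id", "sales"])

print(rows)
print("columns read:", columns_read)
```

The `publisher` column is never read, which is the vertical-partitioning saving, and result rows are assembled only for the positions that passed the filter.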
InfiniDB Design Principles
• Scalable
• Fast
• Simple
(My)SQL for Hadoop - Engine=InfiniDB
InfiniDB uses standard “Engine=InfiniDB” syntax:

CREATE TABLE `game_warehouse`.`dim_title` (
`id` INT,
`name` VARCHAR(45),
`publisher` VARCHAR(45),
`release_date` DATE,
`language` INT,
`platform_name` VARCHAR(45),
`version` VARCHAR(45)
) ENGINE=InfiniDB;

(My)SQL for Hadoop
• Leverage existing tools that connect to MySQL: MicroStrategy, JasperSoft, Pentaho
• Expose structured data to the business
• Familiar user privilege administration

MySQL ease of use + Hadoop Scale + Columnar Performance
Syntax Support

[Diagram: InfiniDB Supported Syntax]
• Broad MySQL SQL syntax
• Analytic/windowing functions included with InfiniDB 4
• No indexing needed. Partitioning is automatic.
When to Use InfiniDB for Hadoop

Query Size (Vision/Scope) defines workloads:
[Chart: Query Size/Vision/Scope axis with ticks at 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000. Regions labeled OLTP/NoSQL Workloads and ROLAP/Analytic/Reporting Workloads]

General purpose DBMS missed the target (dated database technology is generally not optimal).
What is your typical query?
[Chart: Query Vision/Scope axis with ticks at 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000. Regions labeled OLTP/NoSQL Workloads and Analytic Workloads]
• There is no “average” query.
• The challenges are at the extremes:
  o The challenge of high concurrency levels with small queries.
  o The challenge of latency for very large queries.
• Most use cases imply multiple data technologies.
Columnar Appropriate Workloads
[Chart: Query Vision/Scope axis with ticks at 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000]
• OLTP/NoSQL Workloads: pure columnar has about 10x worse I/O for single record lookups.
• ROLAP/Analytic/Reporting Workloads: pure columnar has about 10x better I/O for large data access patterns.
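A back-of-the-envelope model shows where a factor of about 10 in each direction can come from. Everything here is an assumption chosen for illustration: a table of ten equally wide columns, 100 complete rows per row-store block, no compression, and no caching. The slides do not state how the 10x figures were derived.

```python
from math import ceil

N_COLUMNS = 10            # assumption: ten equally wide columns
ROWS_PER_ROW_BLOCK = 100  # assumption: a row-store block holds 100 complete rows

def row_store_blocks(rows_touched):
    """A row store reads whole rows, whichever columns the query needs."""
    return ceil(rows_touched / ROWS_PER_ROW_BLOCK)

def column_store_blocks(rows_touched, columns_needed):
    """A column store reads one block stream per requested column."""
    values_per_block = ROWS_PER_ROW_BLOCK * N_COLUMNS  # same block size, one column
    return columns_needed * ceil(rows_touched / values_per_block)

# Single record lookup that returns every column of one row.
lookup_row = row_store_blocks(1)
lookup_col = column_store_blocks(1, columns_needed=N_COLUMNS)

# Large scan that aggregates one column over a million rows.
scan_row = row_store_blocks(1_000_000)
scan_col = column_store_blocks(1_000_000, columns_needed=1)

print("lookup: columnar reads", lookup_col // lookup_row, "x the blocks")
print("scan:   row store reads", scan_row // scan_col, "x the blocks")
```

In this model the penalty and the gain both scale with the number of columns, so wider tables push the two extremes further apart.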
Columnar Appropriate Workloads
Data Dimensions and InfiniDB for Hadoop
[Diagram: data dimensions shown as opposing pairs]
• Unstructured Data vs. Structured
• Schema on read vs. Schema on write
• Small Queries vs. Large Queries
• Transform (ETL) vs. Targeted Extract
• Pre-defined queries vs. Ad-hoc queries
InfiniDB Query Performance – Percona Star Schema Benchmark (SSB)
[Chart panels:]
• Q1 Series: 2 table joins
• Q2 Series: 3 table joins
• Q3 Series: 4 table joins
• Q5 Series: 5 table joins
1000 Genomes Data Set – 289 Billion Rows
• Fast load rate
  o Millions of rows/sec
  o Billions of rows/hour
• Scalable load rate
[Chart: 1000 Genomes data set on AWS]

1000 Genomes Data Set – ~24 trillion base nucleotide values
Scaling: 4 -> 8 -> 16 Performance Modules
• Fast analytics
  o Millions of rows/second per core
• Scalable analytics
• Automatic parallelism
[Chart axes: Seconds; Performance Modules (PMs) Active]

Figure 2 - TATA Binding Protein
Source: http://en.wikipedia.org/wiki/TATA_binding_protein
Impala-InfiniDB Benchmark (Piwik Data Set)

Figure 1 - Piwik Standard Query Performance
Figure 2 - Piwik Ad-Hoc Query Performance

• Piwik is an open source alternative to Google Analytics
• Queries 1-6 are Piwik production queries
• Queries 7-9 are additional ad-hoc queries covering all data
• Amazon 5-node cluster
Columnar Appropriate Workloads
Data Dimensions and InfiniDB for Hadoop
[Diagram: data dimensions with InfiniDB marked: Structured; Schema on read vs. Schema on write; Small Queries vs. Large Queries; Transform (ETL) vs. Targeted Extract; Ad-hoc queries]
Download Today
InfiniDB and InfiniDB for Hadoop: www.calpont.com
InfiniDB for the Cloud: InfiniDB AMI in any AWS Availability Zone/Region
Services Inquiries: sales@calpont.com
Twitter: @InfiniDB, @jtommaney

© 2013 Calpont Corporation. Calpont, the Calpont logo, InfiniDB, and the InfiniDB logo are trademarks of Calpont Corporation. AWS is a trademark of Amazon.com,
Inc., and Apache Hadoop is a trademark of the Apache Software Foundation. Other product names and logos may be trademarks of their respective owners.


More Related Content

What's hot

OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
Ganesan Narayanasamy
 
POWER9 for AI & HPC
POWER9 for AI & HPCPOWER9 for AI & HPC
POWER9 for AI & HPC
inside-BigData.com
 
IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016
Patrick Bouillaud
 
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the CloudSpeed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
gluent.
 
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
xKinAnx
 
Ac922 cdac webinar
Ac922 cdac webinarAc922 cdac webinar
Ac922 cdac webinar
Ganesan Narayanasamy
 
Data organization: hive meetup
Data organization: hive meetupData organization: hive meetup
Data organization: hive meetup
t3rmin4t0r
 
IBM Platform Computing Elastic Storage
IBM Platform Computing  Elastic StorageIBM Platform Computing  Elastic Storage
IBM Platform Computing Elastic Storage
Patrick Bouillaud
 
OpenPOWER Webinar
OpenPOWER Webinar OpenPOWER Webinar
OpenPOWER Webinar
Ganesan Narayanasamy
 
Llap: Locality is Dead
Llap: Locality is DeadLlap: Locality is Dead
Llap: Locality is Dead
t3rmin4t0r
 
Ceph Day Seoul - Ceph on Arm Scaleable and Efficient
Ceph Day Seoul - Ceph on Arm Scaleable and Efficient Ceph Day Seoul - Ceph on Arm Scaleable and Efficient
Ceph Day Seoul - Ceph on Arm Scaleable and Efficient
Ceph Community
 
POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI
Anand Haridass
 
A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with Presto
Yu Liu
 
Gummadi-47-Shadowbase-Technical-Overview.Final
Gummadi-47-Shadowbase-Technical-Overview.FinalGummadi-47-Shadowbase-Technical-Overview.Final
Gummadi-47-Shadowbase-Technical-Overview.Finalajaya gummadi
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
Filipe Miranda
 
AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD
 
00 opencapi acceleration framework yonglu_ver2
00 opencapi acceleration framework yonglu_ver200 opencapi acceleration framework yonglu_ver2
00 opencapi acceleration framework yonglu_ver2
Yutaka Kawai
 
Using a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceUsing a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application Performance
Odinot Stanislas
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
DataWorks Summit/Hadoop Summit
 
Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...
Hitachi Vantara
 

What's hot (20)

OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
OpenCAPI-based Image Analysis Pipeline for 18 GB/s kilohertz-framerate X-ray ...
 
POWER9 for AI & HPC
POWER9 for AI & HPCPOWER9 for AI & HPC
POWER9 for AI & HPC
 
IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016
 
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the CloudSpeed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
 
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
 
Ac922 cdac webinar
Ac922 cdac webinarAc922 cdac webinar
Ac922 cdac webinar
 
Data organization: hive meetup
Data organization: hive meetupData organization: hive meetup
Data organization: hive meetup
 
IBM Platform Computing Elastic Storage
IBM Platform Computing  Elastic StorageIBM Platform Computing  Elastic Storage
IBM Platform Computing Elastic Storage
 
OpenPOWER Webinar
OpenPOWER Webinar OpenPOWER Webinar
OpenPOWER Webinar
 
Llap: Locality is Dead
Llap: Locality is DeadLlap: Locality is Dead
Llap: Locality is Dead
 
Ceph Day Seoul - Ceph on Arm Scaleable and Efficient
Ceph Day Seoul - Ceph on Arm Scaleable and Efficient Ceph Day Seoul - Ceph on Arm Scaleable and Efficient
Ceph Day Seoul - Ceph on Arm Scaleable and Efficient
 
POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI
 
A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with Presto
 
Gummadi-47-Shadowbase-Technical-Overview.Final
Gummadi-47-Shadowbase-Technical-Overview.FinalGummadi-47-Shadowbase-Technical-Overview.Final
Gummadi-47-Shadowbase-Technical-Overview.Final
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
 
AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center
 
00 opencapi acceleration framework yonglu_ver2
00 opencapi acceleration framework yonglu_ver200 opencapi acceleration framework yonglu_ver2
00 opencapi acceleration framework yonglu_ver2
 
Using a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceUsing a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application Performance
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...
 

Similar to December 2013 HUG: InfiniDB for Hadoop

MySQL conference 2010 ignite talk on InfiniDB
MySQL conference 2010 ignite talk on InfiniDBMySQL conference 2010 ignite talk on InfiniDB
MySQL conference 2010 ignite talk on InfiniDBCalpont
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 
Informix warehouse accelerator update
Informix warehouse accelerator updateInformix warehouse accelerator update
Informix warehouse accelerator update
IBM Sverige
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
Travis Oliphant
 
Severalnines Training: MySQL® Cluster - Part IX
Severalnines Training: MySQL® Cluster - Part IXSeveralnines Training: MySQL® Cluster - Part IX
Severalnines Training: MySQL® Cluster - Part IX
Severalnines
 
Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)
Cloudera, Inc.
 
MySQL 5.7: Focus on InnoDB
MySQL 5.7: Focus on InnoDBMySQL 5.7: Focus on InnoDB
MySQL 5.7: Focus on InnoDB
Mario Beck
 
Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01Arunkumar Shanmugam
 
MySQL 5.6, news in 5.7 and our HA options
MySQL 5.6, news in 5.7 and our HA optionsMySQL 5.6, news in 5.7 and our HA options
MySQL 5.6, news in 5.7 and our HA options
Ted Wennmark
 
The Anywhere Enterprise – How a Flexible Foundation Opens Doors
The Anywhere Enterprise – How a Flexible Foundation Opens DoorsThe Anywhere Enterprise – How a Flexible Foundation Opens Doors
The Anywhere Enterprise – How a Flexible Foundation Opens Doors
Inside Analysis
 
IBM DB2 Analytics Accelerator Trends & Directions by Namik Hrle
IBM DB2 Analytics Accelerator  Trends & Directions by Namik Hrle IBM DB2 Analytics Accelerator  Trends & Directions by Namik Hrle
IBM DB2 Analytics Accelerator Trends & Directions by Namik Hrle
Surekha Parekh
 
IBM Analytics Accelerator Trends & Directions Namk Hrle
IBM Analytics Accelerator  Trends & Directions Namk Hrle IBM Analytics Accelerator  Trends & Directions Namk Hrle
IBM Analytics Accelerator Trends & Directions Namk Hrle
Surekha Parekh
 
SDAccel Design Contest: SDAccel and F1 Instances
SDAccel Design Contest: SDAccel and F1 InstancesSDAccel Design Contest: SDAccel and F1 Instances
SDAccel Design Contest: SDAccel and F1 Instances
NECST Lab @ Politecnico di Milano
 
Software Variability Management
Software Variability ManagementSoftware Variability Management
Software Variability Management
XavierDevroey
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
Tyrone Systems
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community
 
MySQL Performance Metrics that Matter
MySQL Performance Metrics that MatterMySQL Performance Metrics that Matter
MySQL Performance Metrics that Matter
Morgan Tocker
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIBM Switzerland
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency Database
ScyllaDB
 
Db2 analytics accelerator on ibm integrated analytics system technical over...
Db2 analytics accelerator on ibm integrated analytics system   technical over...Db2 analytics accelerator on ibm integrated analytics system   technical over...
Db2 analytics accelerator on ibm integrated analytics system technical over...
Daniel Martin
 

Similar to December 2013 HUG: InfiniDB for Hadoop (20)

MySQL conference 2010 ignite talk on InfiniDB
MySQL conference 2010 ignite talk on InfiniDBMySQL conference 2010 ignite talk on InfiniDB
MySQL conference 2010 ignite talk on InfiniDB
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Informix warehouse accelerator update
Informix warehouse accelerator updateInformix warehouse accelerator update
Informix warehouse accelerator update
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
Severalnines Training: MySQL® Cluster - Part IX
Severalnines Training: MySQL® Cluster - Part IXSeveralnines Training: MySQL® Cluster - Part IX
Severalnines Training: MySQL® Cluster - Part IX
 
Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)
 
MySQL 5.7: Focus on InnoDB
MySQL 5.7: Focus on InnoDBMySQL 5.7: Focus on InnoDB
MySQL 5.7: Focus on InnoDB
 
Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01
 
MySQL 5.6, news in 5.7 and our HA options
MySQL 5.6, news in 5.7 and our HA optionsMySQL 5.6, news in 5.7 and our HA options
MySQL 5.6, news in 5.7 and our HA options
 
The Anywhere Enterprise – How a Flexible Foundation Opens Doors
The Anywhere Enterprise – How a Flexible Foundation Opens DoorsThe Anywhere Enterprise – How a Flexible Foundation Opens Doors
The Anywhere Enterprise – How a Flexible Foundation Opens Doors
 
IBM DB2 Analytics Accelerator Trends & Directions by Namik Hrle
IBM DB2 Analytics Accelerator  Trends & Directions by Namik Hrle IBM DB2 Analytics Accelerator  Trends & Directions by Namik Hrle
IBM DB2 Analytics Accelerator Trends & Directions by Namik Hrle
 
IBM Analytics Accelerator Trends & Directions Namk Hrle
IBM Analytics Accelerator  Trends & Directions Namk Hrle IBM Analytics Accelerator  Trends & Directions Namk Hrle
IBM Analytics Accelerator Trends & Directions Namk Hrle
 
SDAccel Design Contest: SDAccel and F1 Instances
SDAccel Design Contest: SDAccel and F1 InstancesSDAccel Design Contest: SDAccel and F1 Instances
SDAccel Design Contest: SDAccel and F1 Instances
 
Software Variability Management
Software Variability ManagementSoftware Variability Management
Software Variability Management
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
MySQL Performance Metrics that Matter
MySQL Performance Metrics that MatterMySQL Performance Metrics that Matter
MySQL Performance Metrics that Matter
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency Database
 
Db2 analytics accelerator on ibm integrated analytics system technical over...
Db2 analytics accelerator on ibm integrated analytics system   technical over...Db2 analytics accelerator on ibm integrated analytics system   technical over...
Db2 analytics accelerator on ibm integrated analytics system technical over...
 

More from Yahoo Developer Network

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Yahoo Developer Network
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Yahoo Developer Network
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Yahoo Developer Network
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Yahoo Developer Network
 
CICD at Oath using Screwdriver
CICD at Oath using ScrewdriverCICD at Oath using Screwdriver
CICD at Oath using Screwdriver
Yahoo Developer Network
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Yahoo Developer Network
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
Yahoo Developer Network
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
Yahoo Developer Network
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Yahoo Developer Network
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Yahoo Developer Network
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
Yahoo Developer Network
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Yahoo Developer Network
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, Oath
Yahoo Developer Network
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI Applications
Yahoo Developer Network
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Yahoo Developer Network
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Yahoo Developer Network
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Yahoo Developer Network
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
Yahoo Developer Network
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
Yahoo Developer Network
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
Yahoo Developer Network
 

December 2013 HUG: InfiniDB for Hadoop

  • 1. Bay Area Hadoop Users Group: Turning the Tables with InfiniDB for Hadoop. December 18, 2013
  • 2. Agenda
      • InfiniDB Background
      • InfiniDB Technical Foundations: Parallelism; Partitioning Model; Additional I/O Efficiencies
      • (My)SQL for Hadoop
      • When to use Columnar/InfiniDB for Hadoop
      • InfiniDB Benchmarks
      Copyright © 2013 Calpont. All Rights Reserved.
  • 3. InfiniDB Background
      Platforms:
      • InfiniDB
      • InfiniDB for the Cloud
      • InfiniDB for Hadoop
      Versions:
      • InfiniDB launched Feb 2010
      • InfiniDB 4 – latest release, available October 2013
      • Added InfiniDB for Hadoop
      • Source code at https://github.com/infinidb
      • GPL v2
      • No restrictions on syntax, scale, or performance
  • 4. InfiniDB Background - Customer Base
  • 5. InfiniDB Background: Platforms
      • InfiniDB: Local Disk, GlusterFS, Windows* (http://www.calpont.com/products/tryinfinidb)
      • InfiniDB for Hadoop: CDH or HDP (http://www.calpont.com/products/tryinfinidb)
      • InfiniDB for the Cloud: Any availability zone
  • 6. InfiniDB Background – InfiniDB for Hadoop
      • InfiniDB is a non-map/reduce engine
      • Reads and writes natively to HDFS
      [Diagram labels: Pig/Hive; HBase; Map Reduce; InfiniDB for Hadoop; Hadoop Distributed File System]
  • 7. InfiniDB Background - InfiniDB for Hadoop
      Is InfiniDB a Database? … not a General Purpose DBMS.
      Is InfiniDB NoSQL? … only in the sense that we discarded traditional DBMS architectures.
      Is InfiniDB an SQL for Hadoop technology? … Yes, but not general purpose SQL. InfiniDB is highly optimized for analytic workloads/queries.
      Customer quote: “InfiniDB turns SQL developers into Big Data developers. We deployed it quickly and easily for our online sales analytics. Something we couldn’t do with Hadoop, Mongo, or Teradata”
  • 8. InfiniDB Foundation - Parallelism
      • User Module (UM) – Processes SQL Requests
      • Performance Module (PM) – Executes the Queries
      [Diagram labels: Single Server or MPP; Local disk / EBS; GlusterFS / HDFS]
  • 9. InfiniDB Foundation - Parallelism
      • Purpose-built C++ engine
      • Parallelism is at the thread level
      • Example: 12 PM servers with 8 cores each yield 96 parallel processing engines.
      • SQL is translated into thousands or tens of thousands of discrete jobs, or “primitives”.
      • The UM sends primitives to the processing engines.
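The arithmetic in the slide 9 example, and the general idea of a fixed pool of threads working through a queue of small jobs, can be sketched in Python as follows. This is only an illustration: the names `pm_servers`, `cores_per_pm`, and `run_primitives`, the toy "sum one block" primitive, and the use of a Python thread pool are all assumptions made for the sketch, not InfiniDB's C++ implementation.

```python
from concurrent.futures import ThreadPoolExecutor

pm_servers = 12      # Performance Module servers (figure from the slide's example)
cores_per_pm = 8     # cores per PM server (figure from the slide's example)
parallel_engines = pm_servers * cores_per_pm
print(parallel_engines)  # 96

def run_primitives(primitives, threads):
    """Run small independent jobs on a fixed-size thread pool (hypothetical helper)."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(lambda job: job(), primitives))

# Hypothetical primitives: each one sums a single small block of values.
blocks = [range(start, start + 1000) for start in range(0, 10_000, 1000)]
primitives = [lambda b=b: sum(b) for b in blocks]

# One PM's worth of threads drains the queue; partial results are combined afterwards.
partials = run_primitives(primitives, threads=cores_per_pm)
total = sum(partials)
```

The thread count is fixed up front and the number of jobs is much larger than the number of threads, which is the shape the slides describe for primitives queued within a PM.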
  • 10. InfiniDB Foundation - Parallelism
      • User Module – Processes SQL Requests
      • Performance Module – Executes the Queries
      • Primitives are issued to a thread queue within the PM
      • Fixed thread count at the PM
      [Diagram labels: Single Server; MPP; Local disk / EBS; GlusterFS / HDFS]
  • 11. Fully Parallel SQL + Full SQL Syntax
      SQL operations are translated into thousands of jobs via a custom Distribution of Work (DoW):
      • Parallel/Distributed Data Access
      • Parallel/Distributed Joins (Inner, Outer)
      • Parallel/Distributed Sub-queries (From, Where, Select)
      • Parallel/Distributed Group By, Distinct, and Aggregation
      • Extensible with Parallel/Distributed User Defined Functions
      Results are returned to the User Module in the Reduce phase.
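The distribute-then-reduce pattern on slide 11 can be illustrated with a grouped sum, roughly what a `SELECT region, SUM(amount) ... GROUP BY region` would need. The sketch below is a simplified model under stated assumptions: `pm_partial_group_by` and `um_reduce` are hypothetical names, the partitions are made-up sample data, and the real engine's job format is not shown in the slides.

```python
from collections import Counter

def pm_partial_group_by(rows):
    """Hypothetical PM-side step: aggregate one partition's (key, value) rows locally."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial

def um_reduce(partials):
    """Hypothetical UM-side step: merge the partial aggregates into the final result."""
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds values for matching keys
    return dict(result)

# Made-up data, split into three partitions that could be processed independently.
partitions = [
    [("east", 10), ("west", 5)],
    [("east", 1), ("north", 7)],
    [("west", 4)],
]
totals = um_reduce(pm_partial_group_by(p) for p in partitions)
```

Each partition is aggregated without reference to the others, so the per-partition work can run in parallel and only the small partial results travel back for the final merge.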
  • 12. InfiniDB Data Partitioning: 2-Dimensional Partitioning Model
      • Vertical partitioning by column
          o Not Column-Family (no relation to HBase)
          o Only do I/O for columns requested
      • Horizontal partitioning by range of rows
          o Meta-data stored within in-memory structure
  • 13. InfiniDB Data Partitioning
      Partition elimination can occur based on:
          o Columns not included in the SQL.
          o A filter expressed within the query.
          o A filter expressed on a joined table: a Table1 filter can drive Table2 I/O elimination.
          o The intersection between filters: Filter1 and Filter2 together do I/O only on the intersection.
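Filter-based partition elimination and the intersection of two filters can be sketched as below. The slides only say that partition meta-data is held in an in-memory structure; modelling that meta-data as a per-extent minimum and maximum value is an assumption of this sketch, as are the `Extent` class, the `extents_to_read` helper, and the sample values.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    extent_id: int   # identifies a horizontal range of rows (same id across columns)
    min_value: int   # assumed meta-data: smallest value stored in this extent
    max_value: int   # assumed meta-data: largest value stored in this extent

def extents_to_read(extents, low, high):
    """Keep only extents whose [min, max] range overlaps the filter range [low, high]."""
    return [e.extent_id for e in extents if e.max_value >= low and e.min_value <= high]

# Made-up meta-data for two columns over the same three row ranges.
date_extents = [Extent(0, 20130101, 20130331),
                Extent(1, 20130401, 20130630),
                Extent(2, 20130701, 20130930)]
price_extents = [Extent(0, 1, 50), Extent(1, 40, 90), Extent(2, 60, 200)]

by_date = extents_to_read(date_extents, 20130401, 20130930)   # filter 1
by_price = extents_to_read(price_extents, 100, 500)           # filter 2

# Only row ranges that survive both filters need any I/O.
to_scan = sorted(set(by_date) & set(by_price))
```

In this toy data the date filter alone keeps two row ranges, the price filter keeps one, and the combined query touches a single range.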
  • 14. Column Restriction and Projection
      [Figure labels: Column # Four; Column # Six; Column # Seventeen; Filter 1; Filter 2; Filter 3; Extent # 5; Extent # 27; Projection]
      • Automatic Vertical Partitioning + Horizontal Partitioning
      • Just-In-Time Materialization
  • 15. Additional I/O Efficiency
      Techniques to avoid unnecessary I/O:
      • Vertical partitioning: read only the columns required
      • Horizontal partitioning: focus on the rows required
      • Just-in-time materialization
      Techniques for efficient I/O:
      • Columnar compression reduces I/O from disk
      • Global data buffer cache can reduce disk I/O (in-memory)
      • Avoidance of random I/O
  • 17. (My)SQL for Hadoop - Engine=InfiniDB
      InfiniDB uses standard “Engine=InfiniDB” syntax:
        CREATE TABLE `game_warehouse`.`dim_title` (
          `id` INT,
          `name` VARCHAR(45),
          `publisher` VARCHAR(45),
          `release_date` DATE,
          `language` INT,
          `platform_name` VARCHAR(45),
          `version` VARCHAR(45)
        ) ENGINE=InfiniDB;
  • 18. (My)SQL for Hadoop
      • Leverage existing tools that connect to MySQL (MicroStrategy, JasperSoft, Pentaho)
      • Expose structured data to the business
      • Familiar user privilege administration
      MySQL ease of use + Hadoop scale + columnar performance
  • 19. Syntax Support: InfiniDB Supported Syntax
      • Broad MySQL SQL syntax
      • Analytic/windowing functions included with InfiniDB 4
      • No indexing needed. Partitioning is automatic.
  • 20. When to Use InfiniDB for Hadoop
      Query Size (Vision/Scope) defines workloads.
      [Chart axis, Query Size/Vision/Scope: 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000. Regions: OLTP/NoSQL Workloads; ROLAP/Analytic/Reporting Workloads]
      General-purpose DBMS missed the target (dated database technology generally not optimal).
  • 21. What is your typical query?
      [Chart axis, Query Vision/Scope: 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000. Regions: OLTP/NoSQL Workloads; Analytic Workloads]
      • There is no “average” query.
      • The challenges are at the extremes:
          o The challenge of high concurrency levels with small queries.
          o The challenge of latency for very large queries.
      • Most use cases imply multiple data technologies.
  • 22. Columnar Appropriate Workloads
      [Chart axis, Query Vision/Scope: 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000]
      • OLTP/NoSQL Workloads: pure columnar has about 10x worse I/O for single-record lookups
      • ROLAP/Analytic/Reporting Workloads: pure columnar has about 10x better I/O for large data access patterns
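One way to see where factors of this size can come from is a back-of-the-envelope I/O count. The sketch below is not taken from the deck: the 10-column table, the one-block-per-column lookup cost, and the function names are assumptions chosen so that the arithmetic is easy to follow.

```python
TOTAL_COLUMNS = 10   # assumed table width, chosen for illustration only

def scan_values_read(rows, columns_needed, columnar):
    """Values read by a large scan that needs only some of the columns."""
    # A row store reads whole rows; a column store reads only the requested columns.
    columns_read = columns_needed if columnar else TOTAL_COLUMNS
    return rows * columns_read

def lookup_blocks_read(columnar):
    """Blocks touched to fetch one complete record (simplified model)."""
    # A row store finds the record in one block; a column store visits one block per column.
    return TOTAL_COLUMNS if columnar else 1

# Large analytic scan over one column: the row store reads 10x more values.
scan_ratio = (scan_values_read(1_000_000, 1, columnar=False)
              / scan_values_read(1_000_000, 1, columnar=True))

# Single-record lookup of a full row: the column store touches 10x more blocks.
lookup_ratio = lookup_blocks_read(columnar=True) / lookup_blocks_read(columnar=False)
```

Under these assumptions both ratios come out at 10, but the actual factor depends on table width, the columns a query touches, compression, and caching.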
  • 23. Columnar Appropriate Workloads: Data Dimensions and InfiniDB for Hadoop
      [Diagram labels: Unstructured Data / Structured; Schema on read / Schema on write; Small Queries / Large Queries; Transform (ETL) / Targeted Extract; Pre-defined queries / Ad-hoc queries]
  • 24. InfiniDB Query Performance – Percona Star Schema Benchmark (SSB)
      [Chart labels: Q1 Series, 2-table joins; Q2 Series, 3-table joins; Q3 Series, 4-table joins; Q5 Series, 5-table joins]
  • 25. 1000 Genomes Data Set – 289 Billion Rows
      • Fast load rate: millions of rows/sec, billions of rows/hour
      • Scalable load rate
      [Chart: 1000 Genomes data set on AWS]
  • 26. 1000 Genomes Data Set – ~24 trillion base nucleotide values
      Scaling: 4 -> 8 -> 16 Performance Modules
      • Fast analytics: millions of rows/second
      • Scalable analytics
      • Automatic parallelism
      [Chart labels: Seconds per core; Performance Modules (PMs) Active]
      Figure 2 - TATA Binding Protein. Source: http://en.wikipedia.org/wiki/TATA_binding_protein
  • 27. Impala-InfiniDB Benchmark (Piwik Data Set)
      Figure 1 - Piwik Standard Query Performance
      Figure 2 - Piwik Ad-Hoc Query Performance
      • Piwik is an open-source alternative to Google Analytics
      • Queries 1-6 are Piwik production queries
      • Queries 7-9 are additional ad-hoc queries covering all data
      • Amazon 5-node cluster
  • 28. Columnar Appropriate Workloads: Data Dimensions and InfiniDB for Hadoop
      [Diagram labels: Structured; Schema on read / Schema on write; Small Queries / Large Queries; Transform (ETL) / Targeted Extract; Ad-hoc queries; InfiniDB]
  • 29. Download Today
      InfiniDB and InfiniDB for Hadoop: www.calpont.com
      InfiniDB for the Cloud: InfiniDB AMI in any AWS Availability Zone/Region
      Services inquiries: sales@calpont.com
      Twitter: @InfiniDB @jtommaney
      © 2013 Calpont Corporation. Calpont, the Calpont logo, InfiniDB, and the InfiniDB logo are trademarks of Calpont Corporation. AWS is a trademark of Amazon.com, Inc., and Apache Hadoop is a trademark of the Apache Software Foundation. Other product names and logos may be trademarks of their respective owners.