SlideShare a Scribd company logo

Where Does Big Data Meet Big Database - QCon 2012

1 of 75
Download to read offline
Welcome
It used to be easy…
they all looked pretty much alike
NoSQL       BigData    MapReduce     Graph      Document


             Shared      Column                   Eventual
BigTable                               CAP
             Nothing     Oriented                Consistency


  ACID        BASE       Mongo       Coudera      Hadoop



Voldemort   Cassandra    Dynamo      Marklogic     Redis



 Velocity    Hbase      Hypertable     Riak         BDB
Now it’s downright
c0nfuZ1nG!
What Happened?
Ad

Recommended

MySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Cluster no PayPal
MySQL Cluster no PayPalMySQL Brasil
 
Introduction of Big data, NoSQL & Hadoop
Introduction of Big data, NoSQL & HadoopIntroduction of Big data, NoSQL & Hadoop
Introduction of Big data, NoSQL & HadoopSavvycom Savvycom
 
Introduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLIntroduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLTushar Shende
 
NoSQL – Back to the Future or Yet Another DB Feature?
NoSQL – Back to the Future or Yet Another DB Feature?NoSQL – Back to the Future or Yet Another DB Feature?
NoSQL – Back to the Future or Yet Another DB Feature?Martin Scholl
 
Big Data - An Overview
Big Data -  An OverviewBig Data -  An Overview
Big Data - An OverviewArvind Kalyan
 
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Cloudera, Inc.
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Mahantesh Angadi
 

More Related Content

What's hot

Scaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBaseScaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBaseAge Mooij
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014MapR Technologies
 
An introduction to Big Data
An introduction to Big DataAn introduction to Big Data
An introduction to Big DataForwardSprint
 
Emergent Distributed Data Storage
Emergent Distributed Data StorageEmergent Distributed Data Storage
Emergent Distributed Data Storagehybrid cloud
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An OverviewC. Scyphers
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceMahantesh Angadi
 
Long Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC DatacenterLong Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC Datacenterinside-BigData.com
 
Building Big Data Applications
Building Big Data ApplicationsBuilding Big Data Applications
Building Big Data ApplicationsRichard McDougall
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousingDataWorks Summit
 
Considerations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT projectConsiderations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT projectAkmal Chaudhri
 
Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12mark madsen
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolutionmark madsen
 
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012Gigaom
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012InfiniteGraph
 
Processing Drone data @Scale
Processing Drone data @ScaleProcessing Drone data @Scale
Processing Drone data @ScaleDr Hajji Hicham
 
Big data with Hadoop - Introduction
Big data with Hadoop - IntroductionBig data with Hadoop - Introduction
Big data with Hadoop - IntroductionTomy Rhymond
 
Next Generation Hadoop Introduction
Next Generation Hadoop IntroductionNext Generation Hadoop Introduction
Next Generation Hadoop IntroductionAdam Muise
 
Hadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of HadoopHadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of HadoopAdam Muise
 

What's hot (20)

Scaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBaseScaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBase
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
 
An introduction to Big Data
An introduction to Big DataAn introduction to Big Data
An introduction to Big Data
 
Flexible Design
Flexible DesignFlexible Design
Flexible Design
 
Emergent Distributed Data Storage
Emergent Distributed Data StorageEmergent Distributed Data Storage
Emergent Distributed Data Storage
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An Overview
 
On nosql
On nosqlOn nosql
On nosql
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
 
Long Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC DatacenterLong Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC Datacenter
 
Building Big Data Applications
Building Big Data ApplicationsBuilding Big Data Applications
Building Big Data Applications
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousing
 
Considerations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT projectConsiderations for using NoSQL technology on your next IT project
Considerations for using NoSQL technology on your next IT project
 
Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolution
 
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012
 
Processing Drone data @Scale
Processing Drone data @ScaleProcessing Drone data @Scale
Processing Drone data @Scale
 
Big data with Hadoop - Introduction
Big data with Hadoop - IntroductionBig data with Hadoop - Introduction
Big data with Hadoop - Introduction
 
Next Generation Hadoop Introduction
Next Generation Hadoop IntroductionNext Generation Hadoop Introduction
Next Generation Hadoop Introduction
 
Hadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of HadoopHadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of Hadoop
 

Viewers also liked

A little bit of clojure
A little bit of clojureA little bit of clojure
A little bit of clojureBen Stopford
 
Big Data & the Enterprise
Big Data & the EnterpriseBig Data & the Enterprise
Big Data & the EnterpriseBen Stopford
 
The return of big iron?
The return of big iron?The return of big iron?
The return of big iron?Ben Stopford
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)Ben Stopford
 
Reducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive StreamsReducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive Streamsjimriecken
 
Streaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideStreaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideBen Stopford
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Michael Noll
 
Microservices for a Streaming World
Microservices for a Streaming WorldMicroservices for a Streaming World
Microservices for a Streaming WorldBen Stopford
 
Data Pipelines with Apache Kafka
Data Pipelines with Apache KafkaData Pipelines with Apache Kafka
Data Pipelines with Apache KafkaBen Stopford
 
Microservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka EcosystemMicroservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka Ecosystemconfluent
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the LogBen Stopford
 

Viewers also liked (12)

A little bit of clojure
A little bit of clojureA little bit of clojure
A little bit of clojure
 
Big Data & the Enterprise
Big Data & the EnterpriseBig Data & the Enterprise
Big Data & the Enterprise
 
The return of big iron?
The return of big iron?The return of big iron?
The return of big iron?
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)
 
Reducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive StreamsReducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive Streams
 
JAX London Slides
JAX London SlidesJAX London Slides
JAX London Slides
 
Streaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideStreaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the Divide
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
 
Microservices for a Streaming World
Microservices for a Streaming WorldMicroservices for a Streaming World
Microservices for a Streaming World
 
Data Pipelines with Apache Kafka
Data Pipelines with Apache KafkaData Pipelines with Apache Kafka
Data Pipelines with Apache Kafka
 
Microservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka EcosystemMicroservices in the Apache Kafka Ecosystem
Microservices in the Apache Kafka Ecosystem
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the Log
 

Similar to Where Does Big Data Meet Big Database - QCon 2012

NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, HowIgor Moochnick
 
Database Revolution - Exploratory Webcast
Database Revolution - Exploratory WebcastDatabase Revolution - Exploratory Webcast
Database Revolution - Exploratory WebcastInside Analysis
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBen Stopford
 
NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...
NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...
NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...Felix Gessert
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
Next Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon ThomasNext Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon ThomasThoughtworks
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseJoe Alex
 
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBig Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBigDataCloud
 
JasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonJasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonMongoDB
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 
Learning from google megastore (Part-1)
Learning from google megastore (Part-1)Learning from google megastore (Part-1)
Learning from google megastore (Part-1)Schubert Zhang
 
Minnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraMinnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraJeff Bollinger
 
NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]Huy Do
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
 
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,..."Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...lisapaglia
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBjhugg
 
NoSQL and NewSQL: Tradeoffs between Scalable Performance & Consistency
NoSQL and NewSQL: Tradeoffs between Scalable Performance & ConsistencyNoSQL and NewSQL: Tradeoffs between Scalable Performance & Consistency
NoSQL and NewSQL: Tradeoffs between Scalable Performance & ConsistencyScyllaDB
 

Similar to Where Does Big Data Meet Big Database - QCon 2012 (20)

NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, How
 
Database Revolution - Exploratory Webcast
Database Revolution - Exploratory WebcastDatabase Revolution - Exploratory Webcast
Database Revolution - Exploratory Webcast
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
 
No sql
No sqlNo sql
No sql
 
NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...
NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...
NoSQL Data Stores in Research and Practice - ICDE 2016 Tutorial - Extended Ve...
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Next Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon ThomasNext Generation Data Platforms - Deon Thomas
Next Generation Data Platforms - Deon Thomas
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
NoSQL
NoSQLNoSQL
NoSQL
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed Database
 
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBig Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
 
JasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonJasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max Schireson
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
Learning from google megastore (Part-1)
Learning from google megastore (Part-1)Learning from google megastore (Part-1)
Learning from google megastore (Part-1)
 
Minnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraMinnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with Cassandra
 
NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,..."Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDB
 
NoSQL and NewSQL: Tradeoffs between Scalable Performance & Consistency
NoSQL and NewSQL: Tradeoffs between Scalable Performance & ConsistencyNoSQL and NewSQL: Tradeoffs between Scalable Performance & Consistency
NoSQL and NewSQL: Tradeoffs between Scalable Performance & Consistency
 

More from Ben Stopford

10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache KafkaBen Stopford
 
10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven MicroservicesBen Stopford
 
The Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessThe Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessBen Stopford
 
A Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationA Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationBen Stopford
 
Building Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBuilding Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBen Stopford
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with StreamsBen Stopford
 
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Ben Stopford
 
Building Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBuilding Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBen Stopford
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsBen Stopford
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache KafkaBen Stopford
 
Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Ben Stopford
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Ben Stopford
 
Strata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyStrata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyBen Stopford
 
Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopfordBen Stopford
 
Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Ben Stopford
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...Ben Stopford
 
Balancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java DatabaseBalancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java DatabaseBen Stopford
 
Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Ben Stopford
 
Ideas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideIdeas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideBen Stopford
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 

More from Ben Stopford (20)

10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka
 
10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices
 
The Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessThe Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and Serverless
 
A Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationA Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices Generation
 
Building Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBuilding Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka Streams
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
 
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
 
Building Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBuilding Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful Streams
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful Streams
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
 
Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...
 
Strata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyStrata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data Dichotomy
 
Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopford
 
Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
 
Balancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java DatabaseBalancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java Database
 
Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?
 
Ideas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideIdeas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental Divide
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 

Recently uploaded

Power of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdfPower of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdfkatalinjordans1
 
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...MarcovanHurne2
 
Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?MENGSAYLOEM1
 
Bit N Build Poland
Bit N Build PolandBit N Build Poland
Bit N Build PolandGDSC PJATK
 
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...htrindia
 
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions..."How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...Fwdays
 
My sample product research idea for you!
My sample product research idea for you!My sample product research idea for you!
My sample product research idea for you!KivenRaySarsaba
 
Introducing the New FME Community Webinar - Feb 21, 2024 (2).pdf
Introducing the New FME Community Webinar - Feb 21, 2024 (2).pdfIntroducing the New FME Community Webinar - Feb 21, 2024 (2).pdf
Introducing the New FME Community Webinar - Feb 21, 2024 (2).pdfSafe Software
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxInfosec
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxNeo4j
 
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...Adrian Sanabria
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceVijayananda Mohire
 
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro KozhevinFwdays
 
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...Product School
 
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Umar Saif
 
From Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+PluginsFrom Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+PluginsInflectra
 
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Product School
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...Neo4j
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...DianaGray10
 

Recently uploaded (20)

Power of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdfPower of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdf
 
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
 
Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?
 
Bit N Build Poland
Bit N Build PolandBit N Build Poland
Bit N Build Poland
 
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
 
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions..."How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
 
My sample product research idea for you!
My sample product research idea for you!My sample product research idea for you!
My sample product research idea for you!
 
Introducing the New FME Community Webinar - Feb 21, 2024 (2).pdf
Introducing the New FME Community Webinar - Feb 21, 2024 (2).pdfIntroducing the New FME Community Webinar - Feb 21, 2024 (2).pdf
Introducing the New FME Community Webinar - Feb 21, 2024 (2).pdf
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptx
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
 
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial Intelligence
 
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
 
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
 
In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...
 
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
 
From Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+PluginsFrom Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
 
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
 

Where Does Big Data Meet Big Database - QCon 2012

  • 2. It used to be easy…
  • 3. they all looked pretty much alike
  • 4. NoSQL BigData MapReduce Graph Document Shared Column Eventual BigTable CAP Nothing Oriented Consistency ACID BASE Mongo Coudera Hadoop Voldemort Cassandra Dynamo Marklogic Redis Velocity Hbase Hypertable Riak BDB
  • 7. we changed scale
  • 8. tack ge d we ch an
  • 9. so where does big data meet big database?
  • 10. The world’s largest NoSQL database?
  • 12. So how Big is Big? Everything (500,000) Web Pages (40) 0.01% Words (0.6) 1% Sizes in Petabytes
  • 13. Many more Big Sources mobile weather sensors Social Logs data audio video
  • 14. But it is pretty useful Marketing Fraud detection Tax Evasion Intelligence Advertising Scientific research
  • 15. Gartner 80% of business is conducted on unstructured information
  • 16. Big Data is now a new class of economic asset* *World economic forum 2012
  • 17. Yet 80% Enterprise Databases < 1TB
  • 18. Along came the Big Data Movement
  • 19. MapReduce (2004) •  Large, distributed, ordered map •  Fault-tolerant file system •  Petabyte scaling
  • 20. Disruptive Simple Pragmatic Solved an insoluble problem Unencumbered by tradition (good & bad) Hacker rather than Enterprise culture
  • 21. A Different Focus Tradition The new wave •  Global consistency •  Local consistency •  Schema driven •  Schemaless / Last •  Reliable Network •  Unreliable Network •  Highly Structured •  Semi-structured/ Unstructured
  • 22. Novel? Possibly better put as: A timely and elegant combination of existing ideas, placed together to solve a previously unsolved problem.
  • 23. Backlash (2009) Not novel (dates back to the 80’s) Physical level not the logical level (messy?) Incompatible with tooling Lack of integrity (referential) & ACID MR is brute force ignoring indexing, scew
  • 24. All points are reasonable
  • 25. And they proved it too! “A comparison of Approaches to Large Scale Data Analysis” – Sigmod 2009 •  Vertica vs. DBMSX vs. Hadoop •  Vertica up to 7 x faster than Hadoop over benchmarks Databases faster than Hadoop
  • 26. But possibly missed the point?
  • 27. MapReduce was not supposed to be a Data Warehousing tool?
  • 28. If you need more, layer it on top For example Tensing & Magastore @ Google
  • 29. So MapReduce represents a bottom-up approach to accessing very large data sets that is unencumbered by the past.
  • 30. …and the Database Field knew it had Problems
  • 31. We Lose: Joe Hellerstein (Berkeley) 2001 “Databases are commoditised and cornered to slow-moving, evolving, structure intensive, applications that require schema evolution.“ … “The internet companies are lost and we will remain in the doldrums of the enterprise space.” … “As databases are black boxes which require a lot of coaxing to get maximum performance”
  • 32. Yet they do some very cool stuff Statistically based optimisers, Compression, indexing structures, distributed optimisers, their own declarative language
  • 34. They are an Awesome Tool
  • 35. They Don’t talk our Language
  • 36. They Default to Constraint
  • 37. So NoSurprise with NoSQL then Simpler Contract Shared nothing No joins / ACID No impedance mismatch No slow schema evolution Simple code paths Just works
  • 38. The NoSQL Approach Simple, flexible storage over a diverse range of data structures that will scale almost indefinitely.
  • 40. Two Ways In: Key Based Access Client
  • 41. Two Ways In: Broadcast to Every Node Client
  • 42. So.. A simple bottom up approach to data storage that scales almost indefinitely. •  No relations •  No joins •  No SQL •  No Transactions •  No sluggish schema evolution
  • 44. The ‘Relational Camp’ had been busy too Realisation that the traditional architecture was insufficient for various modern workloads
  • 45. End of an Era Paper - 2007 “Because RDBMSs can be beaten by more than an order of magnitude on the standard OLTP benchmark, then there is no market where they are competitive. As such, they should be considered as legacy technology more than a quarter of a century in age, for which a complete redesign and re-architecting is the appropriate next step.” – Michael Stonebraker
  • 46. No Longer a One-Size-Fits-All
  • 47. Architecting for Different Non- Functionals Shared In-Memory Nothing / Disk Fast Column Network/ Orientation SSD
  • 50. Shared Disk Architecture Single node Cache can handle sits any query above whole dataset All machines see all data
  • 51. Shared Nothing Architecture Cache Queries hit every node over just the shard •  Autonomy over a shard •  Divide and conqueror (non-key hit every node)
  • 52. Vendors polarise over this issue Shared Nothing Shared Everything •  TerraData (Aster Data) •  Oracle RAC/Exadata •  Netezza (IBM) •  IBM purescale •  ParAccel •  Sybase IQ •  Vertica •  Microsoft SQL Server •  Greenplumb (there is some blurring)
  • 53. Column Oriented Storage Columns laid contiguously 2-10x compression typical Indexing becomes less important. Pinpoint I/O slow (tuple construction) Bulk read/write faster Compression >> row-based alternatives
  • 54. Solid State Drives 1ms 1µs HDD Seek Time SSD Drive •  Traditional databases are designed for sequential access over magnetic drives, not random access over SSD. •  Weakens the columnar/row argument
  • 55. Faster Networking RAM 10Gigabit Ethernet RDMA Gigabit Ethernet 1ms 1µs 1ns HDD Seek Time SSD Drive
  • 56. The best technologies of the moment are leveraging many of these factors
  • 57. There is a new and impressive breed •  Products < 5 years old •  Shared nothing with SSD’s over shards •  Large address spaces (256GB+) •  No indexes (column oriented) •  No referential integrity •  Surprisingly quick for big queries when compared with incumbent technologies.
  • 58. TPC-H Benchmarks Several new contenders with good scores: –  Exasol –  ParAccel –  Vectorwise
  • 59. TPC-H Benchmarks •  Exasol has 100GB -> 10TB benchmarks •  Up to 20x faster than nearest rivals (But take benchmarks with a pinch of salt)
  • 60. Relational Approach Solid data from every angle, bounded in terms of scale, but with a boundary that is rapidly expanding.
  • 62. At the extreme MapReduce has it TB 0 1 10 100 1000 10,000
  • 63. But there is massive overlap TB 0 1 10 100 1000 10,000
  • 64. It’s not just data volume/velocity
  • 65. The Dimensions of Data •  Volume (pure physical size) •  Velocity (rate of change) •  Variety (number of different types of data, formats and sources) •  Static & Dynamic Complexity
  • 66. Consider the characteristics of data to be integrated, and how that equates to cost
  • 67. Ability to model data is much more of a gating factor than raw size, particularly when considering new forms of data Dave Campbell (Microsoft – VLDB Keynote)
  • 68. It becomes about your data and you want to do with it Do you need to more than just SQL to process your data? Does your data change rapidly? Are you ok with some degree of eventual consistency? Do isolation and consistency matter Do you need to answer questions absolutely or within a tolerance? Do you want to keep your data in its natural form? Do you prefer to work bottom up or top down? How risk averse are you? Are you willing to pay big vendor prices?
  • 69. Composite Offerings Hadoop has Pig & Hbase Mongo offers Query Language, atomaticity & MR Oracle have BigData appliance with Cloudera IBM have a Map Reduce offering Sybase (now part of SAP) provides MR natively EMC acquired Greenplum which has MR support
  • 71. Relational world has focused on keeping data consistent and well structured so it can be sliced and diced at will
  • 72. Big data technologies focus on executing code next to data, where that data is held in a more natural form.
  • 73. So •  NoSQL has disrupted the database market, questioning the need for constraint and highlighting the power of simple solutions. •  DB startups are providing some surprisingly fast solutions that drop some traditional database tenets and cleverly leverage new hardware advances. •  Your problem (and budget) is likely a better guide than the size of the data •  The market is converging on both sides towards a middle ground and integrated suites of complementary tools.
  • 74. The right tool for the job “Attempting to force one technology or tool to satisfy a particular need for which another tool is more effective and efficient is like attempting to drive a screw into a wall with a hammer when a screwdriver is at hand: the screw may eventually enter the wall but at what cost?” E.F. Codd, 1993