SlideShare a Scribd company logo

Balancing Replication and Partitioning in a Distributed Java Database

This talk, presented at JavaOne 2011, describes the ODC, a distributed, in-memory database built in Java that holds objects in a normalized form in a way that alleviates the traditional degradation in performance associated with joins in shared-nothing architectures. The presentation describes the two patterns that lie at the core of this model. The first is an adaptation of the Star Schema model used to hold data either replicated or partitioned data, depending on whether the data is a fact or a dimension. In the second pattern, the data store tracks arcs on the object graph to ensure that only the minimum amount of data is replicated. Through these mechanisms, almost any join can be performed across the various entities stored in the grid, without the need for key shipping or iterative wire calls.

1 of 99
Download to read offline
Balancing Replication and Partitioning in a Distributed Java Database
In-memory databases offer
                     significant gains in               But three issues
                     performance as all data is         have stunted their
                     freely available. There is no      uptake: Address
Traditional disk-                                       spaces only being
                     need to page to and from disk.
oriented database                                       large enough for a
architecture is                                         subset of a typical
showing its age.    This makes joins a                  user’s data. The ‘one
                    problem. When data                  more bit’ problem
                    must be joined across               and durability.
                    multiple machines
                    performance degradation
 Snowflake          is inevitable.                    Distributed in-
 Schemas allow                                        memory databases
 us to mix             But this model only goes       solve these three
 Partitioning          so far. “Connected             problems but at the
 and Replication       Replication” takes us a        price of loosing the
 so joins never        step further allowing us       single address space.
 hit the wire.         to make the best possible
                       use of replication.
Balancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java Database
The lay of the land: 
  The main architectural
constructs in the database
         industry
Balancing Replication and Partitioning in a Distributed Java Database

Recommended

Streaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideStreaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideBen Stopford
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the LogBen Stopford
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBen Stopford
 
Strata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyStrata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyBen Stopford
 
Microservices for a Streaming World
Microservices for a Streaming WorldMicroservices for a Streaming World
Microservices for a Streaming WorldBen Stopford
 
Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Ben Stopford
 
The return of big iron?
The return of big iron?The return of big iron?
The return of big iron?Ben Stopford
 

More Related Content

What's hot

Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentationsharonyb
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedFlyData Inc.
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentationadvaitdeo
 
Getting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex QueriesGetting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex Queriestimonk
 
Stream Application Development with Apache Kafka
Stream Application Development with Apache KafkaStream Application Development with Apache Kafka
Stream Application Development with Apache KafkaMatthias J. Sax
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAmazon Web Services
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon RedshiftKel Graham
 
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013Amazon Web Services
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseAmazon Web Services
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesAmazon Web Services
 
NoSql presentation
NoSql presentationNoSql presentation
NoSql presentationMat Wall
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbaseDipti Borkar
 

What's hot (20)

Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentation
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentation
 
Getting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex QueriesGetting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex Queries
 
Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討
 
Stream Application Development with Apache Kafka
Stream Application Development with Apache KafkaStream Application Development with Apache Kafka
Stream Application Development with Apache Kafka
 
Couchbase 101
Couchbase 101 Couchbase 101
Couchbase 101
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
 
Redshift deep dive
Redshift deep diveRedshift deep dive
Redshift deep dive
 
Couchbase Day
Couchbase DayCouchbase Day
Couchbase Day
 
NoSQL and Couchbase
NoSQL and CouchbaseNoSQL and Couchbase
NoSQL and Couchbase
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS Updates
 
NoSql presentation
NoSql presentationNoSql presentation
NoSql presentation
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbase
 

Similar to Balancing Replication and Partitioning in a Distributed Java Database

Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopfordBen Stopford
 
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...JAX London
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...Ben Stopford
 
Basho and Riak at GOTO Stockholm: "Don't Use My Database."
Basho and Riak at GOTO Stockholm:  "Don't Use My Database."Basho and Riak at GOTO Stockholm:  "Don't Use My Database."
Basho and Riak at GOTO Stockholm: "Don't Use My Database."Basho Technologies
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure javaRoman Elizarov
 
Top Technology Trends
Top Technology Trends Top Technology Trends
Top Technology Trends InnoTech
 
Databases for Storage Engineers
Databases for Storage EngineersDatabases for Storage Engineers
Databases for Storage EngineersThomas Kejser
 
Memory-Based Cloud Architectures
Memory-Based Cloud ArchitecturesMemory-Based Cloud Architectures
Memory-Based Cloud Architectures小新 制造
 
Performance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsPerformance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsMichael Kopp
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosBrent Salisbury
 
NoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non RelazionaliNoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non RelazionaliSteve Maraspin
 
Seattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / CassandraSeattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / Cassandraclive boulton
 
Top Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallasTop Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallasInnoTech
 
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase
 
Re-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw AwayRe-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw AwayDATAVERSITY
 
There is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky VersionThere is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky VersionOpen Spectrum Inc
 
Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Brent Salisbury
 
Acunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu
 
Benchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBaseBenchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBaseChristopher Choi
 

Similar to Balancing Replication and Partitioning in a Distributed Java Database (20)

Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopford
 
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
 
Basho and Riak at GOTO Stockholm: "Don't Use My Database."
Basho and Riak at GOTO Stockholm:  "Don't Use My Database."Basho and Riak at GOTO Stockholm:  "Don't Use My Database."
Basho and Riak at GOTO Stockholm: "Don't Use My Database."
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure java
 
Top Technology Trends
Top Technology Trends Top Technology Trends
Top Technology Trends
 
Databases for Storage Engineers
Databases for Storage EngineersDatabases for Storage Engineers
Databases for Storage Engineers
 
Memory-Based Cloud Architectures
Memory-Based Cloud ArchitecturesMemory-Based Cloud Architectures
Memory-Based Cloud Architectures
 
NoSQL
NoSQLNoSQL
NoSQL
 
Performance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsPerformance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ Applications
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow Demos
 
NoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non RelazionaliNoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
 
Seattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / CassandraSeattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / Cassandra
 
Top Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallasTop Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallas
 
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
 
Re-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw AwayRe-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw Away
 
There is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky VersionThere is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky Version
 
Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012
 
Acunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFP
 
Benchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBaseBenchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBase
 

More from Ben Stopford

10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache KafkaBen Stopford
 
10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven MicroservicesBen Stopford
 
The Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessThe Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessBen Stopford
 
A Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationA Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationBen Stopford
 
Building Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBuilding Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBen Stopford
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with StreamsBen Stopford
 
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Ben Stopford
 
Building Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBuilding Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBen Stopford
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsBen Stopford
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache KafkaBen Stopford
 
Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Ben Stopford
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Ben Stopford
 
Data Pipelines with Apache Kafka
Data Pipelines with Apache KafkaData Pipelines with Apache Kafka
Data Pipelines with Apache KafkaBen Stopford
 
A little bit of clojure
A little bit of clojureA little bit of clojure
A little bit of clojureBen Stopford
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)Ben Stopford
 
Big Data & the Enterprise
Big Data & the EnterpriseBig Data & the Enterprise
Big Data & the EnterpriseBen Stopford
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Ben Stopford
 
Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Ben Stopford
 
Ideas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideIdeas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideBen Stopford
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 

More from Ben Stopford (20)

10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka
 
10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices
 
The Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessThe Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and Serverless
 
A Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationA Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices Generation
 
Building Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBuilding Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka Streams
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
 
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
 
Building Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBuilding Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful Streams
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful Streams
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
 
Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...
 
Data Pipelines with Apache Kafka
Data Pipelines with Apache KafkaData Pipelines with Apache Kafka
Data Pipelines with Apache Kafka
 
A little bit of clojure
A little bit of clojureA little bit of clojure
A little bit of clojure
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)
 
Big Data & the Enterprise
Big Data & the EnterpriseBig Data & the Enterprise
Big Data & the Enterprise
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012
 
Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?
 
Ideas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideIdeas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental Divide
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 

Recently uploaded

New ThousandEyes Product Features and Release Highlights: February 2024
New ThousandEyes Product Features and Release Highlights: February 2024New ThousandEyes Product Features and Release Highlights: February 2024
New ThousandEyes Product Features and Release Highlights: February 2024ThousandEyes
 
AI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficientAI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficientKari Kakkonen
 
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024BookNet Canada
 
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...Product School
 
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Umar Saif
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceVijayananda Mohire
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxInfosec
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxNeo4j
 
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...UiPathCommunity
 
Enterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewEnterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewAshraf Fouad
 
Act Like an Owner, Challenge Like a VC by former CPO, Tripadvisor
Act Like an Owner,  Challenge Like a VC by former CPO, TripadvisorAct Like an Owner,  Challenge Like a VC by former CPO, Tripadvisor
Act Like an Owner, Challenge Like a VC by former CPO, TripadvisorProduct School
 
Confoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceConfoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceSusan Ibach
 
Artificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human JusticeArtificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human JusticeJosh Gellers
 
National Institute of Standards and Technology (NIST) Cybersecurity Framework...
National Institute of Standards and Technology (NIST) Cybersecurity Framework...National Institute of Standards and Technology (NIST) Cybersecurity Framework...
National Institute of Standards and Technology (NIST) Cybersecurity Framework...MichaelBenis1
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...DianaGray10
 
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...ShapeBlue
 
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlueCloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlueShapeBlue
 
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Product School
 
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)Jay Zhao
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxNeo4j
 

Recently uploaded (20)

New ThousandEyes Product Features and Release Highlights: February 2024
New ThousandEyes Product Features and Release Highlights: February 2024New ThousandEyes Product Features and Release Highlights: February 2024
New ThousandEyes Product Features and Release Highlights: February 2024
 
AI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficientAI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficient
 
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024
 
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
Synergy in Leadership and Product Excellence: A Blueprint for Growth by CPO, ...
 
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial Intelligence
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptx
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
 
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
 
Enterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewEnterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book Review
 
Act Like an Owner, Challenge Like a VC by former CPO, Tripadvisor
Act Like an Owner,  Challenge Like a VC by former CPO, TripadvisorAct Like an Owner,  Challenge Like a VC by former CPO, Tripadvisor
Act Like an Owner, Challenge Like a VC by former CPO, Tripadvisor
 
Confoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceConfoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data science
 
Artificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human JusticeArtificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human Justice
 
National Institute of Standards and Technology (NIST) Cybersecurity Framework...
National Institute of Standards and Technology (NIST) Cybersecurity Framework...National Institute of Standards and Technology (NIST) Cybersecurity Framework...
National Institute of Standards and Technology (NIST) Cybersecurity Framework...
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
 
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
 
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlueCloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
 
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
 
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
 

Balancing Replication and Partitioning in a Distributed Java Database

  • 2. In-memory databases offer significant gains in But three issues performance as all data is have stunted their freely available. There is no uptake: Address Traditional disk- spaces only being need to page to and from disk. oriented database large enough for a architecture is subset of a typical showing its age. This makes joins a user’s data. The ‘one problem. When data more bit’ problem must be joined across and durability. multiple machines performance degradation Snowflake is inevitable. Distributed in- Schemas allow memory databases us to mix But this model only goes solve these three Partitioning so far. “Connected problems but at the and Replication Replication” takes us a price of loosing the so joins never step further allowing us single address space. hit the wire. to make the best possible use of replication.
  • 5. The lay of the land: The main architectural constructs in the database industry
  • 10. ms μs ns ps 1MB Disk/Network 1MB Main Memory 0.000,000,000,000 Cross Continental Main Memory L1 Cache Ref Round Trip Ref Cross Network L2 Cache Ref Round Trip * L1 ref is about 2 clock cycles or 0.7ns. This is the time it takes light to travel 20cm
  • 21. Taken from “OLTP Through the Looking Glass, and What We Found There” Harizopoulos et al
  • 23. Shared Nothing Teradata, Vertica, Greenplumb… SN Regular In-Memory In-Memory Database Database Drop Disk Exasol, VoltDB, Oracle, Sybase, MySql Times Ten, HSQL, KDB Distribute Hana ODC Distributed Caching Coherence, Gemfire, Gigaspaces
  • 24. Distributed Architecture Simplify the Contract. Stick to RAM
  • 25. 450 processes 2TB of RAM Oracle Coherence Messaging (Topic Based) as a system of record (persistence)
  • 26. Access Layer Java Java client client API API Query Layer Transactions Data Layer Mtms Cashflows Persistence Layer
  • 27. Indexing Partitioning Replication
  • 28. But your storage is limited by the memory on a node
  • 29. Keys Fs-Fz Keys Xa-Yd Scalable storage, bandwidth and processing
  • 33. Trader Party Version 1 Trade Trader Party Version 2 Trade Trader Party Version 3 Trade Trader Party Version 4 Trade …and you need versioning to do MVCC
  • 34. Trade Trader Party Party Trader Trade Party Trader Trade Party
  • 35. So better to use partitioning, spreading data around the cluster.
  • 36. Trader Party Trade Trade Trader Party
  • 37. Trader Party Trade Trade Trader Party
  • 41. ! This is what using Snowflake Schemas and the Connected Replication pattern is all about!
  • 44. Crosscutting Keys Common Keys
  • 45. Replicated Trader Party Trade Partitioned
  • 52. Valuation Legs Valuations art Transaction Mapping Cashflow Mapping Facts: Party Alias Transaction =>Big, Cashflows common Legs Parties keys Ledger Book Source Book Dimensions Cost Centre Product =>Small, Risk Organisation Unit Business Unit crosscutting HCS Entity Keys Set of Books 0 37,500,000 75,000,000 112,500,000 150,000,000
  • 54. Coherence’s KeyAssociation gives us this Trades MTMs Common Key
  • 55. Replicated Trader Party Trade Partitioned (
  • 56. Query Layer Trader Party Trade Transactions Data Layer Mtms Cashflows Fact Storage (Partitioned)
  • 57. Dimensions (repliacte) Transactions Mtms Facts Cashflows (distribute/ partition) Fact Storage (Partitioned)
  • 58. Valuation Legs Valuations Facts: art Transaction Mapping Cashflow Mapping Party Alias =>Big =>Distribute Transaction Cashflows Legs Parties Ledger Book Source Book Dimensions =>Small Cost Centre Product Risk Organisation Unit => Replicate Business Unit HCS Entity Set of Books 0 37,500,000 75,000,000 112,500,000 150,000,000
  • 59. We use a variant on a Snowflake Schema to partition big stuff, that has the same key and replicate small stuff that has crosscutting keys.
  • 61. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’
  • 63. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’ LBs[]=getLedgerBooksFor(CC1) SBs[]=getSourceBooksFor(LBs[]) So we have all the bottom level dimensions needed to query facts Transactions Mtms Cashflows Partitioned
  • 64. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’ LBs[]=getLedgerBooksFor(CC1) SBs[]=getSourceBooksFor(LBs[]) So we have all the bottom level dimensions needed to query facts Transactions Get all Transactions and Mtms MTMs (cluster side join) for the passed Source Books Cashflows Partitioned
  • 66. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’ Populate raw facts LBs[]=getLedgerBooksFor(CC1) (Transactions) with SBs[]=getSourceBooksFor(LBs[]) dimension data So we have all the bottom level before returning to dimensions needed to query facts client. Transactions Get all Transactions and Mtms MTMs (cluster side join) for the passed Source Books Cashflows Partitioned
  • 68. Replicated Partitioned Java client Dimensions Facts API We never have to do a distributed join!
  • 69. So all the big stuff is held paritioned And we can join without shipping keys around and having intermediate results
  • 70. Trader Party Trade Trade Trader Party
  • 71. Trader Party Version 1 Trade Trader Party Version 2 Trade Trader Party Version 3 Trade Trader Party Version 4 Trade
  • 72. Trade Trader Party Party Trader Trade Party Trader Trade Party
  • 78. Valuation Legs Valuations rt Transaction Mapping Cashflow Mapping Party Alias Facts Transaction Cashflows Legs Parties This is a dimension Ledger Book •  It has a different Source Book Cost Centre key to the Facts. Dimensions Product •  And it’s BIG Risk Organisation Unit Business Unit HCS Entity Set of Books 0 125,000,000
  • 83. Party Alias Parties Ledger Book Source Book Cost Centre Product Risk Organisation Unit Business Unit HCS Entity Set of Books 0 1,250,000 2,500,000 3,750,000 5,000,000
  • 84. Party Alias Parties Ledger Book Source Book Cost Centre Product Risk Organisation Unit Business Unit HCS Entity Set of Books 20 1,250,015 2,500,010 3,750,005 5,000,000
  • 86. So we only replicate ‘Connected’ or ‘Used’ dimensions
  • 87. Processing Layer Dimension Caches (Replicated) Transactions Data Layer As new Facts are added Mtms relevant Dimensions that they reference are moved Cashflows to processing layer caches Fact Storage (Partitioned)
  • 89. Query Layer Save Trade (With connected dimension Caches) Data Layer Cache Trade (All Normalised) Store Partitioned Trigger Source Cache Party Ccy Alias Book
  • 90. Query Layer (With connected dimension Caches) Data Layer Trade (All Normalised) Party Source Ccy Alias Book
  • 91. Query Layer (With connected dimension Caches) Data Layer Trade (All Normalised) Party Source Ccy Alias Book Party Ledger Book
  • 92. ‘Connected Replication’ A simple pattern which recurses through the foreign keys in the domain model, ensuring only ‘Connected’ dimensions are replicated
  • 94. Java client Java schema API Java ‘Stored Procedures’ and ‘Triggers’

Editor's Notes

  1. Big data sets are held distributed and only joined on the grid to collocated objects. Small data sets are held in replicated caches so they can be joined in process (only ‘active’ data is held)
  2. Big data sets are held distributed and only joined on the grid to collocated objects. Small data sets are held in replicated caches so they can be joined in process (only ‘active’ data is held)