SlideShare a Scribd company logo
Overhauling a database
engine in 2 months
Max Neunhöffer
Move Fast and Break Things, 12 March 2015
www.arangodb.com
Max Neunhöffer
I am a mathematician
“Earlier life”: Research in Computer Algebra
(Computational Group Theory)
Always juggled with big data
Now: working in database development, NoSQL, ArangoDB
I like:
research,
hacking,
teaching,
tickling the highest performance out of computer systems.
1
ArangoDB GmbH
triAGENS GmbH offers consulting services since 2004:
software architecture
project management
software development
business analysis
a lot of experience with specialised database systems.
have done NoSQL, before the term was coined at all
2011/2012, an idea emerged:
to build the database one had wished to have all those years!
development of ArangoDB as open source software since 2012
ArangoDB GmbH: spin-off to take care of ArangoDB (2014)
2
is a multi-model database (document store & graph database),
is open source and free (Apache 2 license),
offers convenient queries (via HTTP/REST and AQL),
including joins between different collections,
configurable consistency guarantees using transactions
is memory efficient by shape detection,
uses JavaScript throughout (Google’s V8 built into server),
API extensible by JS code in the Foxx Microservice Framework,
offers many drivers for a wide range of languages,
is easy to use with web front end and good documentation,
and enjoys good community as well as professional support.
3
Architecture
DB Engine
Transactions Sharding
Cluster
Infrastructure
V8
JavaScript
libev
zliblibICU
Unicode
Lib: TCP server, HTTP, JSON,
OS−dep., V8 helpers
API
CRUD
high level API in JavaScript Foxx
etcd
(Go)
Client
Scheduler
RequestsHTTP
Dispatcher
HTTP Responses
4
ArangoDB in numbers
DB engine written in C++
embeds Google’s V8 (∼ 130 000 lines of code)
mostly in memory, using memory mapped files
processes JSON data, schema-less but “shapes”
library: ∼ 128 000 lines (C++)
DB engine: ∼ 210 000 lines (C++, including 12 000 for utilities)
JavaScript layer: ∼ 1 232 000 lines of code
∼ 85 000 standard API implementation
∼ 592 000 Foxx apps (API extensions, web front end)
∼ 298 000 unit tests
∼ 327 000 node.js modules
further unit tests: ∼ 10 000 C++ and ∼ 24 000 Ruby for HTTP
plus documentation
and drivers (in other repositories)
5
The Task
It is March 2014, we have just released V2.0.
V2.1 is scheduled for end of May, V2.2 is scheduled for July
V2.1 is incremental, V2.2 is “Write-Ahead-Log”
work for V2.2 started in March
unfortunately, introducing a WAL is akin to open heart surgery,
→ essentially need to reengineer the database engine
do not have the capacity to assign 10 developers to the job
6
The old setup
New data:
{ name: "watch", price: 99 }
Data files:
Collection: products
Collection: sales
(append only)
(append only)
For transactions:
Locks and commit markers on all collections necessary.
7
The new setup
"Collector" (later)
Collection: sales
Collection: products (append only)
(append only)
Data files:
New data:
Write Ahead Log (WAL) (append only)
{ name: "watch", price: 99 }
For transactions:
Less locks and commit markers only in WAL.
8
Advantages of a WAL
have a single history of events
have a single place to note the commit of a transaction
easy asynchronous replication
efficient sync to disk
uncommitted stuff does not hit the data files at all
better support for transactions → much higher performance
better crash recovery
deterministic, well defined behaviour
9
Challenges
need fundamental change in the storage engine
the collector changes stuff that is potentially being read
need to be careful not to create a bottleneck
need to get locking right
crashes hard to test
if possible, users must not notice the change
(except better performance)
10
Our testing setup
We do continuous tests after every push to github and nightly.
We have separate test suites for single server and cluster.
Different types of test for good coverage:
low-level C++ library unit tests (10000 LOC C++)
JS tests (separately on server and JS shell, 290000 LOC)
AQL query engine (230000 LOC of the above)
HTTP interface (TCP and SSL, 24000 LOC Ruby)
dump/restore and bulk import
benchmarks (3000 LOC C++)
user interface (phantomjs, comparing screen shots)
run tests with valgrind
check coverage
11
Test methodologies
Tests can have different aims:
Unit tests:
ensure that individual components work according to
specifications
Integration tests:
ensure that multiple components work together correctly
Benchmark tests:
ensure performance
End to End tests:
ensure that complex systems as a whole do their job
User interface tests:
ensure that the user interface works and behaves as
documented/specified
12
Test characteristics needed for our task
Characteristics of our task
well-defined, deterministic behaviour
if possible, no observable change in functionality
changes relatively far down in the software stack
but reach wide
temporary breakage expected and accepted
=⇒ For our task, we needed:
unit tests,
integration tests and
benchmarks.
Fortunately, we had all these in place!
13
Approach — preparation phase
1. design work:
WAL, markers, collection, compaction, where lock what
2. implement infrastructure for WAL:
mmapped files, append op, marker format
3. add “write to WAL” to write operations
4. test filling of WAL
14
Approach — breaking phase
4. remove old write operations
5. implement collector thread and adjust compactor thread
6. repair by adjusting read operations for WAL/data file
15
Approach — repairing phase
7. fix transaction management
8. implement startup with non-empty WAL: crash recovery
9. fix/simplify replication
10. fix dump/restore
11. fix cluster
12. tune performance (legends)
13. Hurray, tests work again!
14. release beta version
15. fix, fix, fix and tune
16. publish V2.2
16

More Related Content

What's hot

Microservice-based software architecture
Microservice-based software architectureMicroservice-based software architecture
Microservice-based software architecture
ArangoDB Database
 
guacamole: an Object Document Mapper for ArangoDB
guacamole: an Object Document Mapper for ArangoDBguacamole: an Object Document Mapper for ArangoDB
guacamole: an Object Document Mapper for ArangoDB
Max Neunhöffer
 
An E-commerce App in action built on top of a Multi-model Database
An E-commerce App in action built on top of a Multi-model DatabaseAn E-commerce App in action built on top of a Multi-model Database
An E-commerce App in action built on top of a Multi-model Database
ArangoDB Database
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
ArangoDB Database
 
The CIOs Guide to NoSQL
The CIOs Guide to NoSQLThe CIOs Guide to NoSQL
The CIOs Guide to NoSQLDATAVERSITY
 
Performance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jPerformance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4j
ArangoDB Database
 
Couch db
Couch dbCouch db
Couch db
Rashmi Agale
 
Hugfr SPARK & RIAK -20160114_hug_france
Hugfr  SPARK & RIAK -20160114_hug_franceHugfr  SPARK & RIAK -20160114_hug_france
Hugfr SPARK & RIAK -20160114_hug_france
Modern Data Stack France
 
Query Languages for Document Stores
Query Languages for Document StoresQuery Languages for Document Stores
Query Languages for Document StoresInteractiveCologne
 
IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...
IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...
IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...
In-Memory Computing Summit
 
Lokijs
LokijsLokijs
NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
J Singh
 
ELK - Stack - Munich .net UG
ELK - Stack - Munich .net UGELK - Stack - Munich .net UG
ELK - Stack - Munich .net UG
Steve Behrendt
 
Hands-On: Managing Slowly Changing Dimensions Using TD Workflow
Hands-On: Managing Slowly Changing Dimensions Using TD WorkflowHands-On: Managing Slowly Changing Dimensions Using TD Workflow
Hands-On: Managing Slowly Changing Dimensions Using TD Workflow
Treasure Data, Inc.
 
Introduction to NoSQL Database
Introduction to NoSQL DatabaseIntroduction to NoSQL Database
Introduction to NoSQL Database
Mohammad Alghanem
 
Couch db
Couch dbCouch db
Couch db
amini gazar
 
Data Pipeline at Tapad
Data Pipeline at TapadData Pipeline at Tapad
Data Pipeline at Tapad
Toby Matejovsky
 
CouchDB
CouchDBCouchDB
CouchDB
Rashmi Agale
 
How to integrate your database with kafka & CDC
How to integrate your database with kafka & CDCHow to integrate your database with kafka & CDC
How to integrate your database with kafka & CDC
Abdallah Mahmoud
 

What's hot (20)

Microservice-based software architecture
Microservice-based software architectureMicroservice-based software architecture
Microservice-based software architecture
 
guacamole: an Object Document Mapper for ArangoDB
guacamole: an Object Document Mapper for ArangoDBguacamole: an Object Document Mapper for ArangoDB
guacamole: an Object Document Mapper for ArangoDB
 
An E-commerce App in action built on top of a Multi-model Database
An E-commerce App in action built on top of a Multi-model DatabaseAn E-commerce App in action built on top of a Multi-model Database
An E-commerce App in action built on top of a Multi-model Database
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
 
The CIOs Guide to NoSQL
The CIOs Guide to NoSQLThe CIOs Guide to NoSQL
The CIOs Guide to NoSQL
 
CouchDB
CouchDBCouchDB
CouchDB
 
Performance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jPerformance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4j
 
Couch db
Couch dbCouch db
Couch db
 
Hugfr SPARK & RIAK -20160114_hug_france
Hugfr  SPARK & RIAK -20160114_hug_franceHugfr  SPARK & RIAK -20160114_hug_france
Hugfr SPARK & RIAK -20160114_hug_france
 
Query Languages for Document Stores
Query Languages for Document StoresQuery Languages for Document Stores
Query Languages for Document Stores
 
IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...
IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...
IMC Summit 2016 Breakout - William Bain - Implementing Extensible Data Struct...
 
Lokijs
LokijsLokijs
Lokijs
 
NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
 
ELK - Stack - Munich .net UG
ELK - Stack - Munich .net UGELK - Stack - Munich .net UG
ELK - Stack - Munich .net UG
 
Hands-On: Managing Slowly Changing Dimensions Using TD Workflow
Hands-On: Managing Slowly Changing Dimensions Using TD WorkflowHands-On: Managing Slowly Changing Dimensions Using TD Workflow
Hands-On: Managing Slowly Changing Dimensions Using TD Workflow
 
Introduction to NoSQL Database
Introduction to NoSQL DatabaseIntroduction to NoSQL Database
Introduction to NoSQL Database
 
Couch db
Couch dbCouch db
Couch db
 
Data Pipeline at Tapad
Data Pipeline at TapadData Pipeline at Tapad
Data Pipeline at Tapad
 
CouchDB
CouchDBCouchDB
CouchDB
 
How to integrate your database with kafka & CDC
How to integrate your database with kafka & CDCHow to integrate your database with kafka & CDC
How to integrate your database with kafka & CDC
 

Viewers also liked

03 engine bottom end
03 engine bottom end03 engine bottom end
03 engine bottom endEmmanuel Kolo
 
Complex queries in a distributed multi-model database
Complex queries in a distributed multi-model databaseComplex queries in a distributed multi-model database
Complex queries in a distributed multi-model database
Max Neunhöffer
 
GraphDatabases and what we can use them for
GraphDatabases and what we can use them forGraphDatabases and what we can use them for
GraphDatabases and what we can use them for
Michael Hackstein
 
Hotcode 2013: Javascript in a database (Part 1)
Hotcode 2013: Javascript in a database (Part 1)Hotcode 2013: Javascript in a database (Part 1)
Hotcode 2013: Javascript in a database (Part 1)
ArangoDB Database
 
Running MRuby in a Database - ArangoDB - RuPy 2012
Running MRuby in a Database - ArangoDB - RuPy 2012 Running MRuby in a Database - ArangoDB - RuPy 2012
Running MRuby in a Database - ArangoDB - RuPy 2012
ArangoDB Database
 
Hotcode 2013: Javascript in a database (Part 2)
Hotcode 2013: Javascript in a database (Part 2)Hotcode 2013: Javascript in a database (Part 2)
Hotcode 2013: Javascript in a database (Part 2)
ArangoDB Database
 
Domain Driven Design & NoSQL
Domain Driven Design & NoSQLDomain Driven Design & NoSQL
Domain Driven Design & NoSQL
ArangoDB Database
 
Hydraulic cylinder piston
Hydraulic cylinder pistonHydraulic cylinder piston
Hydraulic cylinder piston
Rumeel Ahmad
 
ArangoDB – Persistência Poliglota e Banco de Dados Multi-Modelos
ArangoDB – Persistência Poliglota e Banco de Dados Multi-ModelosArangoDB – Persistência Poliglota e Banco de Dados Multi-Modelos
ArangoDB – Persistência Poliglota e Banco de Dados Multi-Modelos
Helder Santana
 
Domain Driven Design & NoSQL
Domain Driven Design & NoSQLDomain Driven Design & NoSQL
Domain Driven Design & NoSQL
ArangoDB Database
 
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...
ArangoDB Database
 
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at... Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
Big Data Spain
 
Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?
Max Neunhöffer
 
ArangoDB - Using JavaScript in the database
ArangoDB - Using JavaScript in the databaseArangoDB - Using JavaScript in the database
ArangoDB - Using JavaScript in the database
ArangoDB Database
 
Domain driven design @FrOSCon
Domain driven design @FrOSConDomain driven design @FrOSCon
Domain driven design @FrOSCon
ArangoDB Database
 
Rupy2012 ArangoDB Workshop Part1
Rupy2012 ArangoDB Workshop Part1Rupy2012 ArangoDB Workshop Part1
Rupy2012 ArangoDB Workshop Part1ArangoDB Database
 
Wir sind aber nicht Twitter
Wir sind aber nicht TwitterWir sind aber nicht Twitter
Wir sind aber nicht Twitter
ArangoDB Database
 
Domain Driven Design and NoSQL TLV
Domain Driven Design and NoSQL TLVDomain Driven Design and NoSQL TLV
Domain Driven Design and NoSQL TLV
ArangoDB Database
 
ArangoDB
ArangoDBArangoDB
FOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBFOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDB
ArangoDB Database
 

Viewers also liked (20)

03 engine bottom end
03 engine bottom end03 engine bottom end
03 engine bottom end
 
Complex queries in a distributed multi-model database
Complex queries in a distributed multi-model databaseComplex queries in a distributed multi-model database
Complex queries in a distributed multi-model database
 
GraphDatabases and what we can use them for
GraphDatabases and what we can use them forGraphDatabases and what we can use them for
GraphDatabases and what we can use them for
 
Hotcode 2013: Javascript in a database (Part 1)
Hotcode 2013: Javascript in a database (Part 1)Hotcode 2013: Javascript in a database (Part 1)
Hotcode 2013: Javascript in a database (Part 1)
 
Running MRuby in a Database - ArangoDB - RuPy 2012
Running MRuby in a Database - ArangoDB - RuPy 2012 Running MRuby in a Database - ArangoDB - RuPy 2012
Running MRuby in a Database - ArangoDB - RuPy 2012
 
Hotcode 2013: Javascript in a database (Part 2)
Hotcode 2013: Javascript in a database (Part 2)Hotcode 2013: Javascript in a database (Part 2)
Hotcode 2013: Javascript in a database (Part 2)
 
Domain Driven Design & NoSQL
Domain Driven Design & NoSQLDomain Driven Design & NoSQL
Domain Driven Design & NoSQL
 
Hydraulic cylinder piston
Hydraulic cylinder pistonHydraulic cylinder piston
Hydraulic cylinder piston
 
ArangoDB – Persistência Poliglota e Banco de Dados Multi-Modelos
ArangoDB – Persistência Poliglota e Banco de Dados Multi-ModelosArangoDB – Persistência Poliglota e Banco de Dados Multi-Modelos
ArangoDB – Persistência Poliglota e Banco de Dados Multi-Modelos
 
Domain Driven Design & NoSQL
Domain Driven Design & NoSQLDomain Driven Design & NoSQL
Domain Driven Design & NoSQL
 
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...Jan Steemann: Modelling data in a schema free world  (Talk held at Froscon, 2...
Jan Steemann: Modelling data in a schema free world (Talk held at Froscon, 2...
 
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at... Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 
Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?
 
ArangoDB - Using JavaScript in the database
ArangoDB - Using JavaScript in the databaseArangoDB - Using JavaScript in the database
ArangoDB - Using JavaScript in the database
 
Domain driven design @FrOSCon
Domain driven design @FrOSConDomain driven design @FrOSCon
Domain driven design @FrOSCon
 
Rupy2012 ArangoDB Workshop Part1
Rupy2012 ArangoDB Workshop Part1Rupy2012 ArangoDB Workshop Part1
Rupy2012 ArangoDB Workshop Part1
 
Wir sind aber nicht Twitter
Wir sind aber nicht TwitterWir sind aber nicht Twitter
Wir sind aber nicht Twitter
 
Domain Driven Design and NoSQL TLV
Domain Driven Design and NoSQL TLVDomain Driven Design and NoSQL TLV
Domain Driven Design and NoSQL TLV
 
ArangoDB
ArangoDBArangoDB
ArangoDB
 
FOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBFOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDB
 

Similar to Overhauling a database engine in 2 months

Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes
Nicola Ferraro
 
HTTP Plugin for MySQL!
HTTP Plugin for MySQL!HTTP Plugin for MySQL!
HTTP Plugin for MySQL!
Ulf Wendel
 
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
devopsdaysaustin
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructure
rhirschfeld
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014
Hojoong Kim
 
Productionalizing ML : Real Experience
Productionalizing ML : Real ExperienceProductionalizing ML : Real Experience
Productionalizing ML : Real Experience
Ihor Bobak
 
AKS: k8s e azure
AKS: k8s e azureAKS: k8s e azure
AKS: k8s e azure
Alessandro Melchiori
 
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Fwdays
 
Serverless Compose vs hurtownia danych
Serverless Compose vs hurtownia danychServerless Compose vs hurtownia danych
Serverless Compose vs hurtownia danych
The Software House
 
Gluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container InfrastructureGluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container Infrastructure
rhirschfeld
 
Beginning MEAN Stack
Beginning MEAN StackBeginning MEAN Stack
Beginning MEAN StackRob Davarnia
 
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
DevOps.com
 
Fighting Against Chaotically Separated Values with Embulk
Fighting Against Chaotically Separated Values with EmbulkFighting Against Chaotically Separated Values with Embulk
Fighting Against Chaotically Separated Values with Embulk
Sadayuki Furuhashi
 
A fresh look at Google’s Cloud by Mandy Waite
A fresh look at Google’s Cloud by Mandy Waite A fresh look at Google’s Cloud by Mandy Waite
A fresh look at Google’s Cloud by Mandy Waite
Codemotion
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
Nordic APIs
 
StrongLoop Overview
StrongLoop OverviewStrongLoop Overview
StrongLoop Overview
Shubhra Kar
 
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Iulian Pintoiu
 
Enterprise Data Science
Enterprise Data ScienceEnterprise Data Science
Enterprise Data Science
Misha Lisovich
 
Red Hat Forum Benelux 2015
Red Hat Forum Benelux 2015Red Hat Forum Benelux 2015
Red Hat Forum Benelux 2015
Microsoft
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Bhupesh Bansal
 

Similar to Overhauling a database engine in 2 months (20)

Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes
 
HTTP Plugin for MySQL!
HTTP Plugin for MySQL!HTTP Plugin for MySQL!
HTTP Plugin for MySQL!
 
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructure
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014
 
Productionalizing ML : Real Experience
Productionalizing ML : Real ExperienceProductionalizing ML : Real Experience
Productionalizing ML : Real Experience
 
AKS: k8s e azure
AKS: k8s e azureAKS: k8s e azure
AKS: k8s e azure
 
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
 
Serverless Compose vs hurtownia danych
Serverless Compose vs hurtownia danychServerless Compose vs hurtownia danych
Serverless Compose vs hurtownia danych
 
Gluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container InfrastructureGluecon Preso: Hybrid Container Infrastructure
Gluecon Preso: Hybrid Container Infrastructure
 
Beginning MEAN Stack
Beginning MEAN StackBeginning MEAN Stack
Beginning MEAN Stack
 
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
How the Automation of a Benchmark Famework Keeps Pace with the Dev Cycle at I...
 
Fighting Against Chaotically Separated Values with Embulk
Fighting Against Chaotically Separated Values with EmbulkFighting Against Chaotically Separated Values with Embulk
Fighting Against Chaotically Separated Values with Embulk
 
A fresh look at Google’s Cloud by Mandy Waite
A fresh look at Google’s Cloud by Mandy Waite A fresh look at Google’s Cloud by Mandy Waite
A fresh look at Google’s Cloud by Mandy Waite
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
 
StrongLoop Overview
StrongLoop OverviewStrongLoop Overview
StrongLoop Overview
 
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019
 
Enterprise Data Science
Enterprise Data ScienceEnterprise Data Science
Enterprise Data Science
 
Red Hat Forum Benelux 2015
Red Hat Forum Benelux 2015Red Hat Forum Benelux 2015
Red Hat Forum Benelux 2015
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
 

Recently uploaded

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 

Recently uploaded (20)

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 

Overhauling a database engine in 2 months

  • 1. Overhauling a database engine in 2 months Max Neunhöffer Move Fast and Break Things, 12 March 2015 www.arangodb.com
  • 2. Max Neunhöffer I am a mathematician “Earlier life”: Research in Computer Algebra (Computational Group Theory) Always juggled with big data Now: working in database development, NoSQL, ArangoDB I like: research, hacking, teaching, tickling the highest performance out of computer systems. 1
  • 3. ArangoDB GmbH triAGENS GmbH offers consulting services since 2004: software architecture project management software development business analysis a lot of experience with specialised database systems. have done NoSQL, before the term was coined at all 2011/2012, an idea emerged: to build the database one had wished to have all those years! development of ArangoDB as open source software since 2012 ArangoDB GmbH: spin-off to take care of ArangoDB (2014) 2
  • 4. is a multi-model database (document store & graph database), is open source and free (Apache 2 license), offers convenient queries (via HTTP/REST and AQL), including joins between different collections, configurable consistency guarantees using transactions is memory efficient by shape detection, uses JavaScript throughout (Google’s V8 built into server), API extensible by JS code in the Foxx Microservice Framework, offers many drivers for a wide range of languages, is easy to use with web front end and good documentation, and enjoys good community as well as professional support. 3
  • 5. Architecture DB Engine Transactions Sharding Cluster Infrastructure V8 JavaScript libev zliblibICU Unicode Lib: TCP server, HTTP, JSON, OS−dep., V8 helpers API CRUD high level API in JavaScript Foxx etcd (Go) Client Scheduler RequestsHTTP Dispatcher HTTP Responses 4
  • 6. ArangoDB in numbers DB engine written in C++ embeds Google’s V8 (∼ 130 000 lines of code) mostly in memory, using memory mapped files processes JSON data, schema-less but “shapes” library: ∼ 128 000 lines (C++) DB engine: ∼ 210 000 lines (C++, including 12 000 for utilities) JavaScript layer: ∼ 1 232 000 lines of code ∼ 85 000 standard API implementation ∼ 592 000 Foxx apps (API extensions, web front end) ∼ 298 000 unit tests ∼ 327 000 node.js modules further unit tests: ∼ 10 000 C++ and ∼ 24 000 Ruby for HTTP plus documentation and drivers (in other repositories) 5
  • 7. The Task It is March 2014, we have just released V2.0. V2.1 is scheduled for end of May, V2.2 is scheduled for July V2.1 is incremental, V2.2 is “Write-Ahead-Log” work for V2.2 started in March unfortunately, introducing a WAL is akin to open heart surgery, → essentially need to reengineer the database engine do not have the capacity to assign 10 developers to the job 6
  • 8. The old setup New data: { name: "watch", price: 99 } Data files: Collection: products Collection: sales (append only) (append only) For transactions: Locks and commit markers on all collections necessary. 7
  • 9. The new setup "Collector" (later) Collection: sales Collection: products (append only) (append only) Data files: New data: Write Ahead Log (WAL) (append only) { name: "watch", price: 99 } For transactions: Less locks and commit markers only in WAL. 8
  • 10. Advantages of a WAL have a single history of events have a single place to note the commit of a transaction easy asynchronous replication efficient sync to disk uncommitted stuff does not hit the data files at all better support for transactions → much higher performance better crash recovery deterministic, well defined behaviour 9
  • 11. Challenges need fundamental change in the storage engine the collector changes stuff that is potentially being read need to be careful not to create a bottleneck need to get locking right crashes hard to test if possible, users must not notice the change (except better performance) 10
  • 12. Our testing setup We do continuous tests after every push to github and nightly. We have separate test suites for single server and cluster. Different types of test for good coverage: low-level C++ library unit tests (10000 LOC C++) JS tests (separately on server and JS shell, 290000 LOC) AQL query engine (230000 LOC of the above) HTTP interface (TCP and SSL, 24000 LOC Ruby) dump/restore and bulk import benchmarks (3000 LOC C++) user interface (phantomjs, comparing screen shots) run tests with valgrind check coverage 11
  • 13. Test methodologies Tests can have different aims: Unit tests: ensure that individual components work according to specifications Integration tests: ensure that multiple components work together correctly Benchmark tests: ensure performance End to End tests: ensure that complex systems as a whole do their job User interface tests: ensure that the user interface works and behaves as documented/specified 12
  • 14. Test characteristics needed for our task Characteristics of our task well-defined, deterministic behaviour if possible, no observable change in functionality changes relatively far down in the software stack but reach wide temporary breakage expected and accepted =⇒ For our task, we needed: unit tests, integration tests and benchmarks. Fortunately, we had all these in place! 13
  • 15. Approach — preparation phase 1. design work: WAL, markers, collection, compaction, where lock what 2. implement infrastructure for WAL: mmapped files, append op, marker format 3. add “write to WAL” to write operations 4. test filling of WAL 14
  • 16. Approach — breaking phase 4. remove old write operations 5. implement collector thread and adjust compactor thread 6. repair by adjusting read operations for WAL/data file 15
  • 17. Approach — repairing phase 7. fix transaction management 8. implement startup with non-empty WAL: crash recovery 9. fix/simplify replication 10. fix dump/restore 11. fix cluster 12. tune performance (legends) 13. Hurray, tests work again! 14. release beta version 15. fix, fix, fix and tune 16. publish V2.2 16