Native multi-model is competitive
Performance comparison between
ArangoDB, MongoDB and Neo4j
Claudius Weinberger (ArangoDB)
June, 2015
Native Multi-Model DB
“A native multi-model database is from my perspective a document
store (JSON documents), a key/value store and a graph database, all
in one database engine and with a unifying query language and API
that covers all three data models and even allows to mix them in a
single query.” — Claudius Weinberger
Data-Set
Our test data is a snapshot of a social
network in Slovakia, provided by the
Stanford University SNAP collection. It
contains 1 632 803 vertices describing
people and 30 622 564 edges describing
their friendship relations. We use the vertex
data for the document tests and the combined
vertex and edge data for the graph tests. We
wanted to study the standard client/server
setup rather than embedding the database
into the application, because this model is
used more widely and is therefore more relevant
in practice. We used JavaScript/node.js as
the client language and environment.
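As a rough illustration of how an edge list splits into vertex data (for the document tests) and edge data (for the graph tests), here is a minimal sketch. It assumes each input line holds two whitespace-separated vertex ids; the function name `parseEdgeList`, the `from`/`to` field names and the sample rows are hypothetical and not taken from the benchmark code.

```javascript
// Sketch only. Assumption: one edge per line, written as "<fromId> <toId>"
// (tab or space separated); lines starting with '#' are comments.
function parseEdgeList(text) {
  const edges = [];
  const vertexIds = new Set();
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue; // skip blanks and comments
    const [from, to] = trimmed.split(/\s+/);
    edges.push({ from, to });   // edge data: used by the graph tests
    vertexIds.add(from);        // vertex ids: keys of the profile documents
    vertexIds.add(to);
  }
  return { edges, vertexIds };
}

// Hypothetical three-edge sample, not real data from the snapshot.
const sample = '# hypothetical sample\n1\t2\n1\t3\n3\t1\n';
const parsed = parseEdgeList(sample);
console.log(parsed.edges.length, parsed.vertexIds.size);
```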
Use-Cases
• single read: single document reads of profiles (100 000 documents)
• single write: single document writes of profiles (100 000 documents)
• aggregation: aggregation over a single collection (1 632 803 documents)
  Here, we compute statistics about the age distribution for everyone in the network,
  simply counting how often each age occurs.
• neighbors: finding direct neighbors plus the neighbors of the neighbors (for 500
  vertices)
• shortest path: finding 19 shortest paths (in a highly connected social graph)
  This answers the question of how close two people are to each other in the social
  network.
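To make the last three use cases concrete, here is an in-memory sketch of what each one computes. In the benchmark these run as queries inside each database; the code below only illustrates the logic on a tiny, hypothetical data set. All names (`profiles`, `friendships`, `ageDistribution`, `neighborsOfNeighbors`, `shortestPath`) are made up for illustration, and following edges only in their stored direction is an assumption.

```javascript
// Hypothetical toy data; the real tests use the full snapshot.
const profiles = [
  { id: 'P1', age: 25 }, { id: 'P2', age: 31 }, { id: 'P3', age: 25 },
  { id: 'P4', age: 40 }, { id: 'P5', age: 31 },
];
const friendships = [['P1', 'P2'], ['P2', 'P3'], ['P3', 'P4'], ['P4', 'P5']];

// aggregation: count how often each age occurs.
function ageDistribution(docs) {
  const counts = new Map();
  for (const doc of docs) counts.set(doc.age, (counts.get(doc.age) || 0) + 1);
  return counts;
}

// Adjacency list that follows each edge in its stored direction (assumption).
function buildAdjacency(edges) {
  const adj = new Map();
  for (const [from, to] of edges) {
    if (!adj.has(from)) adj.set(from, []);
    adj.get(from).push(to);
  }
  return adj;
}

// neighbors: distinct direct neighbors plus their neighbors, excluding the start.
function neighborsOfNeighbors(adj, start) {
  const result = new Set();
  for (const first of adj.get(start) || []) {
    result.add(first);
    for (const second of adj.get(first) || []) result.add(second);
  }
  result.delete(start);
  return result;
}

// shortest path: breadth-first search on the unweighted graph.
// Returns the list of vertex ids on one shortest path, or null if unreachable.
function shortestPath(adj, from, to) {
  const previous = new Map([[from, null]]);
  const queue = [from];
  for (let i = 0; i < queue.length; i++) {
    const current = queue[i];
    if (current === to) {
      const path = [];
      for (let v = to; v !== null; v = previous.get(v)) path.unshift(v);
      return path;
    }
    for (const next of adj.get(current) || []) {
      if (!previous.has(next)) {
        previous.set(next, current);
        queue.push(next);
      }
    }
  }
  return null;
}

const adjacency = buildAdjacency(friendships);
console.log(ageDistribution(profiles));
console.log(neighborsOfNeighbors(adjacency, 'P1'));
console.log(shortestPath(adjacency, 'P1', 'P4'));
```

Breadth-first search is used here because friendship edges carry no weights, so the first time the target is reached the path has the fewest hops.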
The measurements for ArangoDB on the test machine define the baseline (100%)
for the comparisons. Lower percentages indicate higher throughput, so lower is better.
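One way to read this normalization is sketched below, assuming the underlying measurement is elapsed time per test, so that a lower percentage means the same work finished faster. The function name `relativeToBaseline`, the system names and the timings are hypothetical and are not results from the benchmark.

```javascript
// Sketch only. Assumption: each system reports the elapsed time (ms) for the
// same workload, and the baseline system is defined as 100%.
function relativeToBaseline(elapsedMs, baselineName) {
  const base = elapsedMs[baselineName];
  const percentages = {};
  for (const [name, ms] of Object.entries(elapsedMs)) {
    percentages[name] = Math.round((ms / base) * 100);
  }
  return percentages;
}

// Hypothetical timings for three unnamed systems.
const example = relativeToBaseline({ dbA: 2000, dbB: 3000, dbC: 1000 }, 'dbA');
console.log(example); // dbA is the baseline at 100; dbB took longer, dbC was faster
```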
Test Setup
• Hardware:
• virtual machine of type n1-standard-16 in Google Compute Engine with 16 virtual cores (on these, a virtual
core is implemented as a single hardware hyper-thread on a 2.3 GHz Intel Xeon E5 v3) and altogether 60
GB of RAM.
• the client was an n1-standard-8 (8 vCPU, 30 GB RAM) in the same network.
• Software:
• ArangoDB V2.6.0 alpha3 (pre-release) for x86_64
  driver: arangojs in version 3.8.0
• MongoDB V3.0.3 for x86_64, using the WiredTiger storage engine
  driver: mongodb in v2.0.33, which builds on top of mongodb-core in v1.1.32
• Neo4j Enterprise 2.3 SNAPSHOT running on JDK 1.7.0_79
  driver: node-neo4j v2.0.0 RC1
more details in the blog post
feedback welcome
Don’t trust benchmarks. Make your own.
github.com/weinberger/nosql-test
