Performance comparison: Multi-Model vs. MongoDB and Neo4j


Native multi-model databases combine different data models, such as documents and graphs, in one tool and even allow mixing them in a single query. How can this concept compete with a pure document store like MongoDB or a graph database like Neo4j? Many people in the community, myself included, have asked that question.

So here are some benchmark results.

Published in: Data & Analytics

  1. Native multi-model is competitive. Performance comparison between ArangoDB, MongoDB and Neo4j. Claudius Weinberger (ArangoDB), June 2015
  2. Native Multi-Model DB: “A native multi-model database is from my perspective a document store (JSON documents), a key/value store and a graph database, all in one database engine and with a unifying query language and API that covers all three data models and even allows to mix them in a single query.” (Claudius Weinberger)
  3. Data-Set: Our test data is a snapshot of a social network in Slovakia, provided by the Stanford University SNAP collection. It contains 1 632 803 vertices describing people and 30 622 564 edges describing their friendship relations. We use the vertex data for the document tests and the combined vertex and edge data for the graph tests. We wanted to study the standard client/server setup rather than embedding the database into the application, because this model is more widely used and therefore more relevant in practice. We used JavaScript/node.js as the client language and environment.
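
To make the vertex/edge split concrete, here is a minimal sketch of turning an edge-list text file into edge objects in node.js. It assumes one whitespace-separated "from to" id pair per line and `#` comment lines; that file layout, and the names `parseEdgeList` and `vertexIds`, are assumptions for illustration and are not taken from the slides.

```javascript
// Sketch: parse an edge list (assumed format: one "fromId<whitespace>toId"
// pair per line) into edge objects. Not the import code used in the benchmark.
function parseEdgeList(text) {
  const edges = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines starting with '#'.
    if (line === '' || line.startsWith('#')) continue;
    const [from, to] = line.split(/\s+/);
    // Ignore malformed lines that do not contain two ids.
    if (from === undefined || to === undefined) continue;
    edges.push({ from, to });
  }
  return edges;
}

// Collect the distinct vertex ids that appear in any edge.
function vertexIds(edges) {
  const ids = new Set();
  for (const { from, to } of edges) {
    ids.add(from);
    ids.add(to);
  }
  return ids;
}

// Hypothetical three-edge sample, only to show the call shape.
const sampleEdges = parseEdgeList('# sample\n1\t2\n1\t3\n2\t3\n');
console.log(sampleEdges.length, vertexIds(sampleEdges).size);
```

Ids are kept as strings here, which avoids any assumption about their numeric range.
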
  4. Use-Cases
     • single read: single document reads of profiles (100 000 documents)
     • single write: single document writes of profiles (100 000 documents)
     • aggregation: aggregation over a single collection (1 632 803 documents). Here, we compute statistics about the age distribution for everyone in the network, simply counting how often each age occurs.
     • neighbors: finding direct neighbors plus the neighbors of the neighbors (for 500 vertices)
     • shortest path: finding 19 shortest paths (in a highly connected social graph). This answers the question of how close two people are to each other in the social network.
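
The aggregation, neighbors and shortest path use cases can be illustrated with small in-memory analogues. This is only a sketch of what each use case computes; it is not the set of queries sent to ArangoDB, MongoDB or Neo4j in the benchmark. The data shapes (`profile.age`, `edge.from`, `edge.to`), the function names, and the choice to treat edges as directed and to exclude the start vertex from its own neighborhood are all assumptions.

```javascript
// aggregation: count how often each age occurs.
function ageDistribution(profiles) {
  const counts = new Map();
  for (const profile of profiles) {
    counts.set(profile.age, (counts.get(profile.age) || 0) + 1);
  }
  return counts;
}

// Build an adjacency map (vertex id -> set of successor ids) from directed edges.
function buildAdjacency(edges) {
  const adjacency = new Map();
  for (const { from, to } of edges) {
    if (!adjacency.has(from)) adjacency.set(from, new Set());
    adjacency.get(from).add(to);
  }
  return adjacency;
}

// neighbors: direct neighbors plus the neighbors of those neighbors.
function neighborsOfNeighbors(adjacency, start) {
  const result = new Set();
  for (const first of adjacency.get(start) || []) {
    result.add(first);
    for (const second of adjacency.get(first) || []) result.add(second);
  }
  result.delete(start); // assumption: the start vertex is not counted
  return result;
}

// shortest path: breadth-first search on an unweighted graph.
// Returns the list of vertex ids from source to target, or null.
function shortestPath(adjacency, source, target) {
  const previous = new Map([[source, null]]);
  const queue = [source];
  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    if (current === target) {
      const path = [];
      for (let v = target; v !== null; v = previous.get(v)) path.unshift(v);
      return path;
    }
    for (const next of adjacency.get(current) || []) {
      if (!previous.has(next)) {
        previous.set(next, current);
        queue.push(next);
      }
    }
  }
  return null; // target is not reachable from source
}

// Hypothetical toy graph, only to show the call shape.
const toyGraph = buildAdjacency([
  { from: 'a', to: 'b' }, { from: 'b', to: 'c' },
  { from: 'c', to: 'd' }, { from: 'a', to: 'd' },
]);
console.log(shortestPath(toyGraph, 'a', 'd')); // [ 'a', 'd' ]
```

A database would answer these from indexes and its own traversal engine; the sketch only pins down the expected results for a given input.
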
  5. The throughput measurements for ArangoDB on the test machine define the baseline (100%) for the comparisons. A lower percentage indicates higher throughput, so lower is better.
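
The normalization described on this slide can be sketched as follows. The sketch assumes that the raw measurement is an elapsed time per system, which is what makes a lower percentage correspond to higher throughput. The function name `relativeToBaseline`, the system names and the numbers are hypothetical and are not benchmark results.

```javascript
// Express each measurement as a percentage of the baseline system's value.
// Assumption: values are elapsed times, so a lower percentage is better.
function relativeToBaseline(elapsedBySystem, baselineName) {
  const baseline = elapsedBySystem[baselineName];
  if (!(baseline > 0)) {
    throw new Error('no positive baseline measurement for ' + baselineName);
  }
  const percentages = {};
  for (const [system, elapsed] of Object.entries(elapsedBySystem)) {
    percentages[system] = (elapsed / baseline) * 100;
  }
  return percentages;
}

// Made-up numbers purely for illustration.
const examplePercentages = relativeToBaseline({ systemA: 2.0, systemB: 3.0 }, 'systemA');
console.log(examplePercentages); // { systemA: 100, systemB: 150 }
```

In this made-up example, `systemB` at 150% took one and a half times as long as the baseline for the same work.
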
  6. Test Setup
     • Hardware:
        • server: a virtual machine of type n1-standard-16 in Google Compute Engine with 16 virtual cores (on these, a virtual core is implemented as a single hardware hyper-thread on a 2.3 GHz Intel Xeon E5 v3) and altogether 60 GB of RAM
        • client: an n1-standard-8 (8 vCPU, 30 GB RAM) in the same network
     • Software:
        • ArangoDB V2.6.0 alpha3 (pre-release) for x86_64; driver: arangojs in version 3.8.0
        • MongoDB V3.0.3 for x86_64, using the WiredTiger storage engine; driver: mongodb in v2.0.33, which builds on top of mongodb-core in v1.1.32
        • Neo4j Enterprise 2.3 SNAPSHOT running on JDK 1.7.0_79; driver: node-neo4j v2.0.0 RC1
     More details in the blog post.
  7. Feedback welcome. Don’t trust benchmarks. Make your own.