Analysis on the performance of graph query languages: Comparative study of Cypher, Gremlin and native access in Neo4j
Athiq Ahamed, ITIS, TU-Braunschweig
Supervised by: Dr. Lena Wiese
Georg-August-University Göttingen
Original paper: Prof. Dr. René Peinl, Florian Holzschuher, "Performance of graph query languages: Comparison of Cypher, Gremlin and native access in Neo4j"
Agenda
• RDBMS
• Reason for NoSQL
• Categories of NoSQL databases
• Comparison of popular NoSQL databases
• Motivation
• Neo4j and Query Languages
• Comparison of Neo4j to other databases
• Testing (importance of benchmarking, different suites)
• Results
• Limitations and Future work
Introduction 1
RDBMS
• For decades, relational databases have been the dominant choice
• Structured Query Language (SQL) retrieves data with ease
• Today, very large volumes of dynamic data are being generated
• Relational databases impose strict schemas and need to join several tables to answer queries
• This makes them a poor fit for the current situation
• So we require dynamic schemas, high scalability, high performance and so on
Introduction 2
• NoSQL databases are now often the first choice and solve most of these problems
• Graph databases are best suited for storing networks of data (e.g. social networks)
Features
– NoSQL databases provide their own query languages
– NoSQL databases trade either availability or consistency in favor of partition tolerance (CAP theorem)
– Examples: Neo4j, Cassandra, MongoDB and BigTable, to name a few
• NoSQL is an ideal choice for Web 2.0 applications
NoSQL databases
• Four important categories of NoSQL databases
Key-value stores | Column family stores | Document stores | Graph databases
Simplest and easiest to implement: a hash table in which a unique key points to a value | Widely used for data distribution; keys point to multiple columns | Used for semi-structured data, stored in JSON format; similar to key-value stores | Used for storing graph-like data, e.g. social networks
Redis, Oracle BDB, Voldemort | Google's BigTable model | MongoDB | Neo4j
Comparison Between Popular NoSQL Databases
MongoDB (document-oriented), rank no. 1 | Cassandra (wide column), rank no. 2 | Neo4j (graph), rank no. 5
Replication and failover for high availability | Trades consistency for high availability | Very similar to MongoDB, with blocking replication and a cluster setup for high availability
Consistent by default; auto-sharding to ease scalability; replication; full index support | Incremental scalability, high availability, eventually consistent | Scalable clustering support, runtime failover, live backup support
Different types of DBs and Languages
Databases | Languages
Relational databases | SQL
XML databases | XPath, XQuery
RDF | RQL, SPARQL
Object-oriented | OQL
Multidimensional | MDX
Graph | Cypher, Gremlin
Motivation
• Measure the performance of different graph query languages and native access in Neo4j
• Compare the ease of understanding, code readability and maintainability of the languages
• Test the performance and correctness of these graph databases
• Use Apache Shindig, which hosts OpenSocial applications, as the test platform
• Compare the performance of different back-ends on Neo4j
Neo4j and Query Languages
• Neo4j is an open-source NoSQL graph database
• It implements the property graph data model
• Neo4j has a native Java API with a traversal framework
• Features
– Supports ACID properties
– Runtime failover
– High performance
– Scalability
– Very good documentation
– Very good query language, Cypher
• Cypher: a declarative query language similar to SQL
• Gremlin: a Groovy-based query language
Comparison of Neo4j to other DBs
Existing Work
Neo4j and MySQL | Neo4j and other graph databases
Neo4j retrieved results faster than relational databases | Data used for testing performance: 1k, 32k and 1M nodes, ranging from 9k relationships to 8.4 million relationships
More flexible than MySQL | Jena and HypergraphDB were not able to load the database in a specified time
Query times are 2-5 times lower than MySQL for their 500-object data set | DEX and Neo4j were able to load the largest benchmark sizes
Neo4j performed better than SQL at structural queries | Jena could load the graph with 1M nodes faster than Neo4j, but it could not scale
Neo4j was slower than MySQL with integer data | Neo4j is faster than DEX for the large data set, and the reverse holds for the small data set
So Neo4j is used for queries such as friendship, movie favorites and more complicated commercial queries | DEX is able to scale better, whereas Neo4j obtained a good throughput
Setup
• Apache Shindig 2.5, for hosting OpenSocial applications
• Neo4j has a native Java API with methods for retrieval and traversal
• The API is directly accessible when Neo4j runs in embedded mode
• Neo4j also offers a RESTful (Representational State Transfer) web service interface
• Several wrappers exist for various programming languages such as Python and Java
• Cypher supports all CRUD (create, read, update and delete) operations
• Gremlin supports both imperative and declarative querying
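As an illustration of how remote access differs from embedded access, a Cypher query sent over REST is wrapped in a JSON request body, whereas embedded code calls the Java API in-process. The Python sketch below only builds such a request and does not send it. The host, the port, the `/db/data/cypher` path and the `query`/`params` keys are assumptions based on the legacy Neo4j REST API and should be checked against the documentation of the Neo4j version in use. The helper `build_cypher_request` and the example id are hypothetical.

```python
import json

# Assumed legacy endpoint; verify against the Neo4j version in use.
CYPHER_ENDPOINT = "http://localhost:7474/db/data/cypher"


def build_cypher_request(query, params):
    """Return (url, headers, body) for a parameterized Cypher call over REST."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    # Parameters travel separately from the query text, so the same query
    # string can be reused with different ids.
    body = json.dumps({"query": query, "params": params})
    return CYPHER_ENDPOINT, headers, body


url, headers, body = build_cypher_request(
    "START person=node:people(id = {id}) RETURN person",
    {"id": "john.doe"},  # hypothetical person id
)
print(url)
print(body)
```

Every such call adds serialization and a network round trip, which is one plausible reason for remote access being slower than in-process access.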
Data Used for testing
• 2,011 people
• 26,982 messages
• 24,365 activities
• 2,000 addresses
• 200 groups
• 100 organizations
• The authors also tested on a larger data set of 10,003 people
• Each person had at least 1 and at most 667 friends, out of 25,0000 friendship relationships
• In the larger data set of 10,003 people, there were 137,000 friendships in total and at most 1,448 friends for one person
Suites used for testing
• Neo4j embedded
• Neo4j REST
• Neo4j Cypher embedded
• Neo4j Cypher REST
• Neo4j Gremlin REST
• MySQL JPA
• These suites retrieve profiles, friends, group recommendations and other social networking features
Results 1
Comparison of query languages and native access
Native object access | Cypher | Gremlin | SQL
Retrieval and traversal methods, with a traversal framework | Declarative query language that covers all CRUD operations | Groovy-based query language with a compact syntax | Structured query language, simple to understand
Difficult to learn | Easy to learn | Difficult to learn | Easy to learn
Several lines of code for simple retrieval | Simple and easy to understand | Compact syntax, difficult to understand | Several lines of code
Comparable | Good for complex retrieval | Good for small retrieval | Slows down for complicated queries
Results 2 - Gremlin vs. Cypher
Friend suggestion for a person
Cypher
// Look up the person by id in the 'people' index
START person=node:people(id = {id})
// Follow two FRIEND_OF hops to reach friends of friends
MATCH person-[:FRIEND_OF]->friend-[:FRIEND_OF]->friend_of_friend
// Exclude people who are already direct friends
WHERE not (friend_of_friend<-[:FRIEND_OF]-person)
RETURN friend_of_friend, COUNT(*)
ORDER BY COUNT(*) DESC
Gremlin
t = new Table();
x = [];
// Collect the person's direct friends in x
g.idx('persons')[[id:id_param]].out('FRIEND_OF').fill(x);
// Two hops out, without duplicates and without direct friends
g.idx('persons')[[id:id_param]].out('FRIEND_OF').
out('FRIEND_OF').dedup().except(x).id.as('ID').
back(1).displayName.as('name').
table(t,['ID','name']){it}{it}.iterate();
t
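To make explicit what the friend-suggestion queries on this slide compute, here is a minimal Python sketch over an in-memory adjacency map. It is not the code used in the benchmark: the `friends` dictionary, the `suggest_friends` function and the sample names are assumptions made for illustration. The sketch follows directed FRIEND_OF edges for two hops, skips people who are already direct friends, and ranks candidates by the number of connecting paths, as the Cypher query does with COUNT(*). Skipping the person themself is an additional choice of this sketch.

```python
from collections import Counter


def suggest_friends(friends, person):
    """Rank friends-of-friends of `person` by number of connecting paths."""
    direct = friends.get(person, set())
    counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the person and anyone who is already a direct friend.
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most paths first; ties are broken by name for deterministic output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample graph: each key maps to the set of outgoing FRIEND_OF edges.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": set(),
    "erin": set(),
}
suggestions = suggest_friends(friends, "alice")
print(suggestions)
```

In this sample, dave is reachable through both bob and carol and therefore ranks above erin, who is reachable only through bob.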
Results 3 - Gremlin vs. Cypher
Queries | Cypher and Gremlin performance
Friend queries (simple) | Gremlin is a bit faster than Cypher
People queries | Gremlin is slower than Cypher
Message queries | Gremlin is on par with Cypher
FOAF queries (complicated) | Cypher is better than Gremlin
• Gremlin is slower when complicated pattern matching is involved
• For complex queries with many properties and relationships, Cypher outperformed Gremlin
• Gremlin is better for simple cases
Results 4 - from Original Paper
Figure 1: 2000 people (times in ms)
Figure 2: Gremlin vs Cypher (times in ms)
Results 5
• An embedded instance is much faster than a DBMS accessed over the network
• Neo4j query languages outperform JPA for friend queries
• Remote access with REST is slower than embedded Neo4j native object access
• The comparison of JPA with RESTful Cypher and Gremlin is particularly interesting
– For person profiles, the JPA back-end performs as well as RESTful Cypher
Results 6
• Friend queries are more than one order of magnitude slower for JPA
• Neo4j showed constant performance when the data set grew from 2,000 to 10,000 persons
• MySQL performance drops by a factor of 5 for people queries
• MySQL performance drops by a factor of 7-9 for people's friends queries
• The RESTful case is slower than JPA in most cases
Limitations
• The data used was realistic only to an extent
• Results always showed some fluctuations
• Because of these fluctuations, the results are of limited use for benchmarking and further research
• Different Cypher queries were used for the embedded and REST benchmarks
• Neo4j's standard server settings were used
• Neo4j's advanced version with load balancing was not tested
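One common way to reduce the effect of such fluctuations is to discard warm-up runs and report a robust statistic, such as the median, over many repetitions. The Python sketch below illustrates the idea. It is not the harness used in the original paper, and `measure`, `warmup`, `repeats` and the sample workload are hypothetical names chosen for this example.

```python
import statistics
import time


def measure(query, warmup=3, repeats=20):
    """Time `query` (a zero-argument callable) and return summary stats in ms."""
    for _ in range(warmup):
        query()  # warm caches; these runs are not recorded
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "runs": len(samples),
    }


# Stand-in workload; a real benchmark would call a database query here.
stats = measure(lambda: sum(range(10_000)))
print(stats)
```

Reporting the minimum and maximum next to the median shows how large the spread is, so a reader can judge how far individual numbers can be trusted.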
Conclusion and Future work
• Analyzed the performance and programming effort for
different back-ends
• Compared a JPA back-end using MySQL with Cypher and Gremlin
• Neo4j with Cypher had better performance overall
• Gremlin performed better with simple queries
• Cypher performed better with complicated queries
• Neo4j is a good replacement for a traditional RDBMS in Web 2.0 applications
• Future work: implement and test the approach with Spring Data Neo4j
Performance of graph query languages

  • 1. Analysis on the performance of graph query languages: Comparative study of Cypher, Gremlin and native access in Neo4j
      – Athiq Ahamed, ITIS, TU-Braunschweig
      – Supervised by: Dr. Lena Wiese, Georg-August-University Göttingen
      – Original paper: Prof. Dr. René Peinl, Florian Holzschuher, "Performance of graph query languages: Comparison of Cypher, Gremlin and native access in Neo4j"
  • 2. Agenda
      – RDBMS
      – Reason for NoSQL
      – Categories of NoSQL databases
      – Comparison of popular NoSQL databases
      – Motivation
      – Neo4j and query languages
      – Comparison of Neo4j to other databases
      – Testing (importance of benchmarking, different suites)
      – Results
      – Limitations and future work
  • 3. Introduction 1: RDBMS
      – For decades, relational databases have been the dominant choice
      – Structured Query Language (SQL) retrieves data with ease
      – Today, very large volumes of dynamic data are being generated
      – Relational databases impose strict schemas and must join several tables to answer queries
      – This makes them a poor fit for current workloads
      – What is required instead: dynamic schemas, high scalability, high performance, and so on
  • 4. Introduction 2
      – NoSQL databases are now the first choice because they solve most of these problems
      – Graph databases are best suited for storing networks of data (e.g. social networks)
      – Features
          – A NoSQL database has a proper query language
          – NoSQL databases trade either availability or consistency in favor of partition tolerance (CAP)
          – Examples: Neo4j, Cassandra, MongoDB, and BigTable, to name a few
      – NoSQL databases are an ideal choice for Web 2.0
  • 5. NoSQL databases: four important categories
      – Key-value stores: the simplest and easiest to implement; a hash table in which a unique key points to a value. Examples: Redis, Oracle BDB, Voldemort
      – Column family stores: widely used for data distribution; keys point to multiple columns. Example: Google's BigTable model
      – Document stores: used for semi-structured data, stored in JSON format; similar to key-value stores. Example: MongoDB
      – Graph databases: used for storing graph-like data, e.g. social networks. Example: Neo4j
  • 6. Comparison between popular NoSQL databases
      – MongoDB (document-oriented, rank no. 1)
          – Replication and failover for high availability
          – Consistent by default, auto-sharding to ease scalability, replication, full index support
      – Cassandra (wide column, rank no. 2)
          – Trades consistency for high availability
          – Incremental scalability, high availability, eventually consistent
      – Neo4j (graph, rank no. 5)
          – Very similar to MongoDB, with blocking replication and a cluster setup for high availability
          – Scalable clustering support, runtime failover, live backup support
  • 7. Different types of databases and their languages
      – Relational databases: SQL
      – XML databases: XPath, XQuery
      – RDF: RQL, SPARQL
      – Object-oriented: OQL
      – Multidimensional: MDX
      – Graph: Cypher, Gremlin
  • 8. Motivation
      – Measure the performance of different graph query languages and native access in Neo4j
      – Compare the ease of understanding, code readability, and maintainability of the languages
      – Test the performance and correctness of these graph databases
      – Apache Shindig is used for hosting OpenSocial applications
      – Compare the performance of different back-ends on Neo4j
  • 9. Neo4j and query languages
      – Neo4j is an open-source NoSQL graph database that implements the property graph data model
      – Neo4j has a native Java API with a traversal framework
      – Features
          – Supports ACID properties
          – Runtime failover
          – High performance
          – Scalability
          – Very good documentation
          – Very good query language, Cypher
      – Cypher: a declarative query language similar to SQL
      – Gremlin: a Groovy-based query language
  • 10. Comparison of Neo4j to other databases (existing work)
      – Neo4j and MySQL
          – Neo4j retrieved results faster than relational databases
          – Neo4j is more flexible than MySQL
          – Query times are 2-5 times lower than MySQL's for their data set of 500 objects
          – Neo4j performed better than SQL on structural queries
          – Neo4j was slower than MySQL with integer data
          – So Neo4j is used for queries on friendships, movie favorites, and more complicated commercial use cases
      – Neo4j and other graph databases
          – Data used for testing performance: 1k, 32k, and 1M nodes, ranging from 9k to 8.4 million relationships
          – Jena and HypergraphDB were not able to load the database in the specified time
          – DEX and Neo4j were able to load the largest benchmark sizes
          – Jena could load the graph with 1M nodes faster than Neo4j, but it could not scale
          – Neo4j is faster than DEX for the large dataset, and the reverse holds for the small dataset
          – DEX is able to scale better, whereas Neo4j achieved good throughput
  • 11. Setup
      – Apache Shindig 2.5 is used for hosting OpenSocial applications
      – Neo4j has a native Java API with retrieval and traversal methods
      – This API is directly accessible when Neo4j runs in embedded mode
      – Neo4j also offers a RESTful (Representational State Transfer) web service interface
      – Several wrappers exist for programming languages such as Python and Java
      – Cypher is used for all CRUD (create, read, update, delete) operations
      – Gremlin supports both imperative and declarative querying
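To make the difference between embedded access and REST access more concrete, here is a minimal Python sketch of how a parameterized Cypher query might be packaged for an HTTP call. It assumes the legacy `/db/data/cypher` endpoint of older Neo4j releases (later releases replaced it), a server at `http://localhost:7474`, and an index named `people` as in the query on slide 15. The function name `build_cypher_request` and the example id are hypothetical, and the slides do not say how the benchmark client issued its requests. The request is only built, not sent.

```python
import json
import urllib.request

# Legacy-syntax Cypher, modeled on the query style shown on slide 15.
# The index name "people" and the relationship type are taken from that slide.
FRIENDS_QUERY = (
    "START person=node:people(id = {id}) "
    "MATCH person-[:FRIEND_OF]->friend "
    "RETURN friend"
)


def build_cypher_request(base_url, query, params):
    """Build (but do not send) a POST request carrying a Cypher query.

    Assumption: the server exposes the legacy /db/data/cypher endpoint,
    which accepts a JSON body with "query" and "params" keys.
    """
    body = json.dumps({"query": query, "params": params}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/db/data/cypher",
        data=body,
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


# Hypothetical usage; sending it would require a running server:
# urllib.request.urlopen(request)
request = build_cypher_request(
    "http://localhost:7474", FRIENDS_QUERY, {"id": "john.doe"}
)
print(request.full_url)
```

Every REST query pays for serialization and a network round trip on top of query execution, which fits the later observation that remote access is slower than the embedded instance.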
  • 12. Data used for testing
      – 2,011 people
      – 26,982 messages
      – 24,365 activities
      – 2,000 addresses
      – 200 groups
      – 100 organizations
      – Each person had at least 1 friend and at most 667 friends, out of 25,0000 friendship relationships
      – A bigger dataset with 10,003 people was also tested; it had 137,000 friendships in total and a maximum of 1,448 friends for one person
  • 13. Suites used for testing
      – Neo4j embedded
      – Neo4j REST
      – Neo4j Cypher embedded
      – Neo4j Cypher REST
      – Neo4j Gremlin REST
      – MySQL JPA
      – These suites retrieve profiles, friends, group recommendations, and other social networking features
  • 14. Results 1: Comparison of query languages and native access
      – Native object access
          – Retrieval and traversal methods, with a traversal framework
          – Difficult to learn
          – Several lines of code for simple retrieval
          – Comparable performance
      – Cypher
          – Declarative query language that covers all CRUD operations
          – Easy to learn
          – Simple and easy to understand
          – Good for complex retrieval
      – Gremlin
          – Groovy-based query language with a compact syntax
          – Difficult to learn
          – Compact syntax, difficult to understand
          – Good for small retrieval
      – SQL
          – Structured query language, simple to understand
          – Easy to learn
          – Several lines of code
          – Slows down for complicated queries
  • 15. Results 2: Gremlin vs. Cypher (friend suggestion for a person)
      – Cypher
          START person = node:people(id = {id})
          MATCH person-[:FRIEND_OF]->friend-[:FRIEND_OF]->friend_of_friend
          WHERE NOT (friend_of_friend<-[:FRIEND_OF]-person)
          RETURN friend_of_friend, COUNT(*)
          ORDER BY COUNT(*) DESC
      – Gremlin
          t = new Table(); x = [];
          g.idx('persons')[[id:id_param]].out('FRIEND_OF').fill(x);
          g.idx('persons')[[id:id_param]].out('FRIEND_OF').
            out('FRIEND_OF').dedup().except(x).id.as('ID').
            back(1).displayName.as('name').
            table(t, ['ID','name']){it}{it}.iterate();
          t
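The logic behind the friend-suggestion queries on slide 15 can be illustrated outside any graph database. The following Python sketch works on a plain adjacency dictionary: it walks two `FRIEND_OF` hops, drops people who are already direct friends, and ranks the rest by how many two-hop paths lead to them, as the Cypher `COUNT(*) ... DESC` clause does. The function name `suggest_friends`, the sample data, the alphabetical tie-break, and the exclusion of the person themselves are assumptions made for illustration; this is not the code used in the paper.

```python
from collections import Counter


def suggest_friends(friends, person):
    """Rank friends-of-friends of `person` by number of connecting paths.

    `friends` maps each person to the set of people they point to with
    a FRIEND_OF relationship.
    """
    direct = friends.get(person, set())
    counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the person themselves (an assumption of this sketch)
            # and anyone who is already a direct friend.
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most paths first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample graph.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
    "erin": {"bob"},
}

print(suggest_friends(friends, "alice"))  # [('dave', 2), ('erin', 1)]
```

Dave is reachable through both Bob and Carol, so he ranks above Erin, who is reachable only through Bob.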
  • 16. Results 3: Gremlin vs. Cypher performance by query type
      – Friend queries (simple): Gremlin is a bit faster than Cypher
      – People queries: Gremlin is slower than Cypher
      – Message queries: Gremlin is on par with Cypher
      – FOAF queries (complicated): Cypher is better than Gremlin
      – Gremlin is slower when a query involves complicated pattern matching
      – For complex queries with many properties and relationships, Cypher outperformed Gremlin
      – Gremlin is better for simple cases
  • 17. Results 4 (from the original paper)
      – Figure 1: 2000 people, in ms
      – Figure 2: Gremlin vs. Cypher, in ms
  • 18. Results 5
      – An embedded instance is far faster than a DBMS accessed over the network
      – Neo4j query languages outperform JPA for friend queries
      – Remote access via REST is slower than embedded Neo4j native object access
      – The comparison of JPA with RESTful Cypher and Gremlin is very interesting
          – For person profiles, the JPA back-end performs as well as RESTful Cypher
  • 19. Results 6
      – Friend queries are more than one order of magnitude slower with JPA
      – Neo4j showed constant performance when the dataset grew from 2,000 to 10,000 persons
      – MySQL performance drops by a factor of 5 for people queries
      – MySQL performance drops by a factor of 7-9 for people's friends queries
      – The RESTful case is slower than JPA in most cases
  • 20. Limitations
      – The data used was realistic only to an extent
      – The results always showed some fluctuations
      – Because of these fluctuations, the results are not well suited for benchmarking or as a basis for further research
      – Different Cypher queries were used for the embedded and REST benchmarks
      – Neo4j's default server settings were used
      – Neo4j's advanced version with load balancing was not tested
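Since fluctuating measurements are named as a limitation, one common way to make timings more robust is to repeat each query several times after a warm-up phase and report the median together with the spread. The Python sketch below illustrates that idea only; the function name `benchmark`, the run counts, and the stand-in workload are hypothetical, and the slides do not describe how the original measurements were aggregated.

```python
import statistics
import time


def benchmark(query_fn, runs=30, warmup=5):
    """Time `query_fn` repeatedly and summarize the samples in milliseconds."""
    if runs < 2:
        raise ValueError("need at least two runs to compute a spread")
    # Warm-up calls let caches and lazy initialization settle first.
    for _ in range(warmup):
        query_fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        query_fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "stdev_ms": statistics.stdev(samples),
    }


# Stand-in workload; a real benchmark would execute a database query here.
result = benchmark(lambda: sum(range(10_000)), runs=10, warmup=2)
print(result)
```

The median is less sensitive to occasional outliers than the mean, and reporting the minimum, maximum, and standard deviation makes the size of the fluctuations visible.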
  • 21. Conclusion and future work
      – Analyzed the performance and programming effort for different back-ends
      – Compared a JPA back-end using MySQL with Cypher and Gremlin
      – Neo4j with Cypher had better performance overall
      – Gremlin performed better with simple queries
      – Cypher performed better with complicated queries
      – Neo4j is a good replacement for a traditional RDBMS for Web 2.0
      – Future work: implement and test the approach with Spring Data Neo4j