AURELIUS
THINKAURELIUS.COM
TITAN
Scalable Graph Database
Matthias Broecheler
@mbroecheler
April 12th, MMXIV
Database
real time
high throughput
transactional
Graph Database
scalable
integrated
open source
name: Newton
type: user
name: Hercules
type: user
bought
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05
time:09
title: “How to deal with Father issues”
type: book
title: “Muscle building for beginners”
type: book
title: “Dancing with the Stars”
type: DVD
title: “Friends forever bracelet”
type: Accessory
1. Home-grown solution
2. Relational Database
3. Graph Database
Home-grown Solution
•  Start with your favorite NoSQL database
•  Cassandra, MongoDB, HBase, etc.
1.  Error-prone
2.  Data model moves into application code
3.  Maintainability hazard
4.  No query language support
5.  No performance optimization
Relational Database
•  Relationship tables, SQL and joins
1.  Join processing is expensive
2.  Join processing on large tables does not scale
3.  Cumbersome query language
4.  Inflexible data model
SELECT P.title
FROM
  User U1 JOIN Purchase P1 ON P1.buyerid = U1.userid
  JOIN Purchase P2 ON P1.productid = P2.productid
  JOIN Purchase P3 ON P2.buyerid = P3.buyerid
  JOIN Product P ON P3.productid = P.productid
WHERE
  U1.name = 'xyz' AND P1.time > T1 AND P2.time > T1
name: Newton
type: user
name: Hercules
type: user
bought
friends
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05
duration: 60
time:09
name: Saturn
type: author
author
author
title: “How to deal with Father issues”
type: book
title: “Muscle building for beginners”
type: book
title: “Dancing with the Stars”
type: DVD
title: “Friends forever bracelet”
type: Accessory
1. Home-grown solution
2. Relational Database
3. Graph Database
UML
Entity Relationship Model
name: Hercules
type: user
bought
time:24
Vertex
Edge Label
Edge
Property
key: value
title: “Muscle building for beginners”
type: book
g.V.has('name','xyz').outE('bought').has('time',gt,T1).inV
   .inE('bought').has('time',gt,T1).outV
   .out('bought').title
http://gremlindocs.com/
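To make the three hops of the traversal above concrete, here is a minimal Python sketch over an in-memory edge list. It is not Titan or Gremlin code: the `Edge` tuple, the `also_bought` function, and the sample data are assumptions made for illustration, and which user bought which item is invented rather than read off the slides.

```python
from collections import namedtuple

# Hypothetical in-memory edge record: (source vertex, label, target vertex, time).
Edge = namedtuple("Edge", ["out_v", "label", "in_v", "time"])

# Illustrative data only; the edge endpoints are assumed, not taken from the slides.
EDGES = [
    Edge("Newton", "bought", "Muscle building for beginners", 24),
    Edge("Hercules", "bought", "Muscle building for beginners", 22),
    Edge("Hercules", "bought", "Dancing with the Stars", 20),
    Edge("Hercules", "viewed", "Friends forever bracelet", 5),
]


def also_bought(user, t1, edges=EDGES):
    """Mirror the three hops of the Gremlin query for one starting user."""
    # Hop 1: products the user bought after t1.
    products = [e.in_v for e in edges
                if e.out_v == user and e.label == "bought" and e.time > t1]
    # Hop 2: buyers of those products after t1 (includes the starting user).
    buyers = [e.out_v for e in edges for p in products
              if e.in_v == p and e.label == "bought" and e.time > t1]
    # Hop 3: titles of everything those buyers bought, with no time filter.
    return [e.in_v for e in edges for b in buyers
            if e.out_v == b and e.label == "bought"]


titles = also_bought("Newton", 10)
print(sorted(set(titles)))
```

As in the Gremlin version, the result keeps duplicates and includes the starting user's own purchases; the order of results in this sketch is not meant to match a real traversal engine.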
Architecture Analogy
MyISAM
Flexible Persistence
Partitionability
Availability
Consistency
Vertex-Centric Indices
•  Sort and index edges per vertex by sort key
•  Sort key can be composite
•  Enables efficient focused traversals
•  Only retrieve edges that matter
•  Uses push-down predicates for quick, index-driven retrieval
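The idea behind a vertex-centric index can be sketched in a few lines of Python: keep each vertex's edges sorted by a composite key, then answer a range predicate with a binary search instead of a scan. This is a toy model under stated assumptions, not Titan's storage layout; the class name `VertexCentricIndex`, the `(label, time)` sort key, and the sample vertex names are all hypothetical.

```python
import bisect


class VertexCentricIndex:
    """Toy per-vertex edge index sorted by the composite key (label, time)."""

    def __init__(self):
        self._keys = {}       # vertex -> sorted list of (label, time)
        self._neighbors = {}  # vertex -> neighbors, aligned with _keys

    def add_edge(self, vertex, label, time, neighbor):
        keys = self._keys.setdefault(vertex, [])
        neighbors = self._neighbors.setdefault(vertex, [])
        # Insert at the position that keeps the key list sorted.
        pos = bisect.bisect_right(keys, (label, time))
        keys.insert(pos, (label, time))
        neighbors.insert(pos, neighbor)

    def edges_after(self, vertex, label, t1):
        """Neighbors over `label` edges with time > t1, found by binary search."""
        keys = self._keys.get(vertex, [])
        lo = bisect.bisect_right(keys, (label, t1))            # skip time <= t1
        hi = bisect.bisect_right(keys, (label, float("inf")))  # end of this label
        return self._neighbors.get(vertex, [])[lo:hi]


index = VertexCentricIndex()
index.add_edge("Hercules", "bought", 24, "book-1")
index.add_edge("Hercules", "bought", 20, "dvd-1")
index.add_edge("Hercules", "viewed", 5, "accessory-1")
index.add_edge("Hercules", "bought", 22, "book-2")
recent = index.edges_after("Hercules", "bought", 21)
print(recent)
```

Only the slice of edges matching the label and time predicate is touched, which is the effect the slide describes as pushing the predicate down into the index.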
Token Ring
Graph Partitioning
assigns ids to map vertices into “optimal” token range
Lots of interesting questions for future work
uses BOP
Educating the Planet
Person
Person
Student
Teacher
Course
Institution
Concept
Discussion
Comment
Share
enrolledIn
teaches
relatesTo
hasCourse
belongsTo
follows
author
references
hasComment
 relatesTo
author
partOf
relatesTo
121 Billion Edges
6.2 Billion Vertices
1 Million Universities
3.5 Billion Students
Placement Group
hi1.4xl
Setup
1.1 million edges / sec
using batch mode
Data Ingestion
0 m1.medium
10,200 transactions / sec
16 randomly chosen complex traversal templates
Throughput
Transaction Description | Avg (ms) | Stdev (ms)
Student retrieves all content for a single course in their course list | 279.32 | 81.83
Student follows another student | 193.72 | 22.77
Student is recommended people to follow | 241.33 | 256.48
Student reads their stream and shares an item with followers | 284.07 | 68.20
Student retrieves their profile | 53.740 | 22.61
Student reads the most recent comments for their courses | 211.07 | 45.56
x = [] as Set; m = [:]
m = user.out('follows').aggregate(x)[0..(num*2)]
        .out('follows').except(x)[0..limit]
        .groupCount(m);

m.sort{-it.value}[0..num]._()
        .transform{ [userid: it.key.id,
                     points: it.value]};
Follow Recommendation
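The follow recommendation above can be read as: collect the users someone already follows, walk one more `follows` hop, drop anyone already followed, and rank the rest by how often they were reached. A minimal Python sketch of that logic follows. The `FOLLOWS` dictionary, the user names, and the `recommend` function are hypothetical, and the slicing only approximates Gremlin's inclusive `[0..n]` ranges.

```python
from collections import Counter
from itertools import islice

# Hypothetical follower graph: user -> list of users they follow.
FOLLOWS = {
    "ann": ["bob", "cat"],
    "bob": ["cat", "dan", "eve"],
    "cat": ["dan"],
    "dan": [],
    "eve": [],
}


def recommend(user, num, limit, follows=FOLLOWS):
    """Rank friends-of-friends by how many followed users lead to them."""
    followed = follows.get(user, [])
    already = set(followed)                  # plays the role of aggregate(x)
    seeds = followed[: num * 2 + 1]          # [0..(num*2)] is inclusive in Groovy
    candidates = (fof for s in seeds for fof in follows.get(s, [])
                  if fof not in already)     # out('follows').except(x)
    counts = Counter(islice(candidates, limit + 1))  # [0..limit].groupCount(m)
    # Highest count first; ties broken by name so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"userid": uid, "points": pts} for uid, pts in ranked[: num + 1]]


top = recommend("ann", num=1, limit=10)
print(top)
```

Like the original query, this sketch does not filter out the starting user, so a real data set with mutual follows could recommend a user to themselves unless that case is handled separately.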
Faunus
Batch Graph Analytics
•  Hadoop-based Graph Computing Framework
•  Graph Analytics
•  Breadth-first Traversals
•  Global Graph Computations
•  Batch Big Graph Data
Faunus Features
Faunus Architecture
g._()
Faunus Work Flow
hdfs://user/ubuntu/
output/job-0/
output/job-1/
output/job-2/ {
graph*
sideeffect*
g.V.out .out .count()
Compressed HDFS Graphs
•  stored in sequence files
•  variable length encoding
•  prefix compression
Degree Distribution
GitHub Network
g.V.sideEffect{
  it.degree = it.out('follows').count()
}.degree.groupCount
Degree Distribution
P(k) ~ k^(-γ)
γ = 2.2
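The degree-distribution job amounts to two group-counts: first count outgoing `follows` edges per vertex, then count how many vertices share each degree. The Python sketch below shows that computation on a tiny, made-up edge list. The names `FOLLOW_EDGES`, `VERTICES`, and `degree_distribution` are assumptions for illustration and say nothing about how Faunus executes the job on Hadoop.

```python
from collections import Counter

# Hypothetical 'follows' edges as (follower, followed) pairs.
FOLLOW_EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("c", "a")]
VERTICES = ["a", "b", "c", "d"]


def degree_distribution(vertices, edges):
    """Map each out-degree k to the number of vertices with that degree."""
    # Step 1: out-degree per vertex (the sideEffect that sets it.degree).
    out_degree = Counter(src for src, _ in edges)
    # Step 2: group-count the degrees; a Counter yields 0 for vertices with no edges.
    return Counter(out_degree[v] for v in vertices)


dist = degree_distribution(VERTICES, FOLLOW_EDGES)
print(dict(dist))
```

Plotting the resulting counts against k on log-log axes is the usual way to read off an exponent such as the γ reported on the slide.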
Global Recommendations
gremlin> g.E.has('label','pushed','to').keep.
           V.out('pushed').out('to').
           in('to').in('pushed').
           sideEffect('{it.score =it.pathCounter}').
           score.order(F.decr,'name')

# Top 5:
Jippi            60892182927
garbear          30095282886
FakeHeal         30038040349
brianchandotcom  24684133382
nyarla           15230275746
Aurelius Graph Cluster
OLTP
 OLAP
Hadoop
MapReduce
Analysis results
back into Titan
Apache 2
g.V.label.groupCount
g.v(101).out
titan.thinkaurelius.com
 faunus.thinkaurelius.com
aureliusgraphs@googlegroups.com
@AURELIUSGRAPHS

Titan @ Gitpro Conference 2014