AURELIUS
THINKAURELIUS.COM
TITAN
Scalable Graph Database
Matthias Broecheler
@mbroecheler
April 12th, MMXIV
Database
Real Time
High Throughput
Transactional
Graph Database
Scalable
Integrated
Open Source
name: Newton
type: user
name: Hercules
type: user
title: “How to deal with Father issues”
type: book
title: “Muscle building for beginners”
type: book
title: “Dancing with the Stars”
type: DVD
title: “Friends forever bracelet”
type: Accessory
name: Newton
type: user
name: Hercules
type: user
bought
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05
time:09
title: “How to deal with Father issues”
type: book
title: “Muscle building for beginners”
type: book
title: “Dancing with the Stars”
type: DVD
title: “Friends forever bracelet”
type: Accessory
1. Home-grown solution
2. Relational Database
3. Graph Database
Home-grown Solution
- Start with your favorite NoSQL database
- Cassandra, MongoDB, HBase, etc.
1.  Error-prone
2.  Data model moves into application code
3.  Maintainability hazard
4.  No query language support
5.  No performance optimization
Relational Database
- Relationship tables, SQL, and joins
1.  Join processing is expensive
2.  Join processing on large tables does not scale
3.  Cumbersome query language
4.  Inflexible data model
SELECT P.title
FROM
  User U1 JOIN Purchase P1 ON P1.buyerid = U1.userid
  JOIN Purchase P2 ON P1.productid = P2.productid
  JOIN Purchase P3 ON P2.buyerid = P3.buyerid
  JOIN Product P ON P3.productid = P.productid
WHERE
  U1.name = 'xyz' AND P1.time > T1 AND P2.time > T1
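The join chain in this query can be tried on a toy dataset. The following is a minimal sketch using Python's built-in sqlite3 module. The table layouts, the sample rows, and the value chosen for T1 are assumptions made for illustration, since the slide shows only the query itself.

```python
import sqlite3

# In-memory database with a hypothetical schema matching the column
# names used in the slide's query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User     (userid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Product  (productid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Purchase (buyerid INTEGER, productid INTEGER, time INTEGER);
""")

# Made-up sample data: user "xyz" bought Book A; user "abc" bought both books.
conn.executemany("INSERT INTO User VALUES (?, ?)", [(1, "xyz"), (2, "abc")])
conn.executemany("INSERT INTO Product VALUES (?, ?)",
                 [(10, "Book A"), (11, "Book B")])
conn.executemany("INSERT INTO Purchase VALUES (?, ?, ?)",
                 [(1, 10, 24), (2, 10, 22), (2, 11, 20)])

T1 = 15  # assumed time threshold

QUERY = """
SELECT P.title
FROM User U1
  JOIN Purchase P1 ON P1.buyerid = U1.userid
  JOIN Purchase P2 ON P1.productid = P2.productid
  JOIN Purchase P3 ON P2.buyerid = P3.buyerid
  JOIN Product  P  ON P3.productid = P.productid
WHERE U1.name = ? AND P1.time > ? AND P2.time > ?
"""

# Each row is one join path, so a title appears once per path.
titles = sorted(row[0] for row in conn.execute(QUERY, ("xyz", T1, T1)))
print(titles)
```

Every additional hop in the recommendation adds another self-join on the Purchase table, which is the cost the slide points at.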
name: Newton
type: user
name: Hercules
type: user
bought
friends
time:24
bought
bought
time:22
time:20
viewed
in-Cart
time:05
duration: 60
time:09
name: Saturn
type: author
author
author
title: “How to deal with Father issues”
type: book
title: “Muscle building for beginners”
type: book
title: “Dancing with the Stars”
type: DVD
title: “Friends forever bracelet”
type: Accessory
1. Home-grown solution
2. Relational Database
3. Graph Database
UML
Entity Relationship Model
name: Hercules
type: user
bought
time:24
Vertex
Edge Label
Edge
Property
key → value
title: “Muscle building for beginners”
type: book
g.V.has('name','xyz').outE('bought').has('time',gt,T1).inV
   .inE('bought').has('time',gt,T1).outV
   .out('bought').title
http://gremlindocs.com/
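To make the steps of this traversal concrete, here is a rough Python sketch that walks the same path over a tiny in-memory property graph. The `Edge` tuple, the helper functions, and the sample data are all hypothetical. This is not how Titan or Gremlin execute the query; it only mirrors the order of the steps.

```python
from collections import namedtuple

# Hypothetical in-memory property graph: vertices carry key/value
# properties, edges carry a label plus their own properties.
Edge = namedtuple("Edge", "src label dst props")

vertices = {
    1: {"name": "xyz", "type": "user"},
    2: {"name": "abc", "type": "user"},
    10: {"title": "Book A", "type": "book"},
    11: {"title": "Book B", "type": "book"},
}
edges = [
    Edge(1, "bought", 10, {"time": 24}),
    Edge(2, "bought", 10, {"time": 22}),
    Edge(2, "bought", 11, {"time": 20}),
]

def out_e(v, label):
    """Outgoing edges of vertex v with the given label."""
    return [e for e in edges if e.src == v and e.label == label]

def in_e(v, label):
    """Incoming edges of vertex v with the given label."""
    return [e for e in edges if e.dst == v and e.label == label]

def also_bought(name, t1):
    """Titles bought by users who bought what `name` bought after t1."""
    titles = []
    starts = [v for v, p in vertices.items() if p.get("name") == name]
    for v in starts:                                  # g.V.has('name', ...)
        for e1 in out_e(v, "bought"):                 # .outE('bought')
            if e1.props["time"] <= t1:                # .has('time', gt, T1)
                continue
            for e2 in in_e(e1.dst, "bought"):         # .inV.inE('bought')
                if e2.props["time"] <= t1:            # .has('time', gt, T1)
                    continue
                for e3 in out_e(e2.src, "bought"):    # .outV.out('bought')
                    titles.append(vertices[e3.dst]["title"])  # .title
    return titles

result = also_bought("xyz", 15)
print(result)
```

Each step only touches the edges of the vertices reached so far, which is the contrast with the table-wide joins of the relational version.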
Architecture Analogy
MyISAM
Flexible Persistence
Partitionability
Availability
Consistency
Vertex-Centric Indices
- Sort and index edges per vertex by sort key
- Sort key can be composite
- Enables efficient focused traversals
- Only retrieve edges that matter
- Uses push-down predicates for quick, index-driven retrieval
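The idea behind a vertex-centric index can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Titan's storage layout: the class name, the composite sort key of (label, time), and the integer vertex ids are all made up for the example.

```python
import bisect
from collections import defaultdict

class VertexCentricIndex:
    """Per-vertex edge lists kept sorted by a composite key (label, time)."""

    def __init__(self):
        # vertex id -> sorted list of (label, time, neighbor id)
        self._edges = defaultdict(list)

    def add_edge(self, src, label, time, dst):
        # insort keeps the row ordered by the composite sort key
        bisect.insort(self._edges[src], (label, time, dst))

    def edges_after(self, src, label, min_time):
        """Neighbors reached over `label` edges with time > min_time."""
        row = self._edges[src]
        # Two binary searches bound the matching slice, so edges with
        # other labels or older timestamps are never scanned.
        lo = bisect.bisect_right(row, (label, min_time, float("inf")))
        hi = bisect.bisect_left(row, (label, float("inf")))
        return [dst for _, _, dst in row[lo:hi]]

idx = VertexCentricIndex()
idx.add_edge(1, "bought", 24, 10)
idx.add_edge(1, "bought", 20, 11)
idx.add_edge(1, "viewed", 5, 12)
idx.add_edge(1, "bought", 22, 13)

recent = idx.edges_after(1, "bought", 21)
print(recent)
```

The predicate (label and time bound) is pushed into the lookup itself, so a vertex with millions of edges still answers a focused query by reading only the matching slice.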
Token Ring
Graph Partitioning
- Assigns ids to map vertices into “optimal” token range
- Lots of interesting questions for future work
- Uses BOP
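One way to picture id-based partitioning is to pack a partition number into the high bits of each vertex id, so that an order-preserving partitioner places all ids of a partition in one contiguous range. The sketch below is purely illustrative. The bit widths and function names are hypothetical and do not describe Titan's actual id layout.

```python
PARTITION_BITS = 8    # hypothetical: up to 256 partitions
COUNTER_BITS = 56     # hypothetical: per-partition counter width

def make_vertex_id(partition, counter):
    """Pack the partition into the high bits so ids sort by partition first."""
    if not 0 <= partition < (1 << PARTITION_BITS):
        raise ValueError("partition out of range")
    if not 0 <= counter < (1 << COUNTER_BITS):
        raise ValueError("counter out of range")
    return (partition << COUNTER_BITS) | counter

def partition_of(vertex_id):
    """Recover the partition from the high bits of an id."""
    return vertex_id >> COUNTER_BITS

def id_range(partition):
    """Inclusive id range that an order-preserving layout gives a partition."""
    start = partition << COUNTER_BITS
    return start, start + (1 << COUNTER_BITS) - 1

a = make_vertex_id(3, 1)
b = make_vertex_id(3, 999)
c = make_vertex_id(4, 0)
low, high = id_range(3)
print(partition_of(a), low <= a <= b <= high, c > high)
```

Vertices that are often traversed together can then be given the same partition number, which keeps them next to each other in the key space.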
Educating the Planet
Vertex types: Person, Student, Teacher, Course, Institution, Concept, Discussion, Comment, Share
Edge labels: enrolledIn, teaches, relatesTo, hasCourse, belongsTo, follows, author, references, hasComment, partOf
121 Billion Edges
6.2 Billion Vertices
1 Million Universities
3.5 Billion Students
Placement Group
hi1.4xl
Setup
1.1 million edges / sec
using batch mode
Data Ingestion
[?] m1.medium
10,200 transactions / sec
16 randomly chosen complex traversal templates
Throughput
Transaction Description                                                   Avg (ms)   Stdev (ms)
Student retrieves all content for a single course in their course list    279.32     81.83
Student follows another student                                           193.72     22.77
Student is recommended people to follow                                   241.33     256.48
Student reads their stream and shares an item with followers              284.07     68.20
Student retrieves their profile                                           53.740     22.61
Student reads the most recent comments for their courses                  211.07     45.56
x = [] as Set; m = [:]
m = user.out('follows').aggregate(x)[0..(num*2)]
        .out('follows').except(x)[0..limit]
        .groupCount(m);

m.sort{-it.value}[0..num]._()
 .transform{ [userid: it.key.id,
              points: it.value] };
Follow Recommendation
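The logic of the follow recommendation can be restated in plain Python: collect who the user already follows, walk one more "follows" hop, skip anyone already followed, and rank the rest by how often they were reached. The sketch below is a simplified stand-in, not the benchmark code. The `follows` dictionary and the function name are hypothetical, the range caps of the traversal are left out, and excluding the user themself is an addition of this sketch.

```python
from collections import Counter

# Hypothetical follow graph: user -> list of users they follow.
follows = {
    "ann": ["bob", "cat"],
    "bob": ["cat", "dan", "eve"],
    "cat": ["dan", "ann"],
    "dan": [],
    "eve": [],
}

def recommend_follows(user, num):
    """Rank users followed by the people `user` follows."""
    already = set(follows.get(user, []))   # plays the role of x
    counts = Counter()                     # plays the role of m
    for friend in follows.get(user, []):
        for candidate in follows.get(friend, []):
            # Skip people already followed (and, in this sketch, the user).
            if candidate not in already and candidate != user:
                counts[candidate] += 1
    # Highest count first, at most `num` entries.
    return [{"userid": u, "points": p} for u, p in counts.most_common(num)]

recs = recommend_follows("ann", 2)
print(recs)
```

Here "dan" scores 2 because two of ann's follows also follow dan, while "eve" is reached only once.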
Faunus
Batch Graph Analytics
- Hadoop-based Graph Computing Framework
- Graph Analytics
- Breadth-first Traversals
- Global Graph Computations
- Batch Big Graph Data
Faunus Features
Faunus Architecture
g._()
Faunus Work Flow
hdfs://user/ubuntu/
output/job-0/
output/job-1/
output/job-2/ {
graph*
sideeffect*
g.V.out.out.count()
Compressed HDFS Graphs
- stored in sequence files
- variable length encoding
- prefix compression
Degree Distribution
GitHub Network
g.V.sideEffect{
  it.degree = it.out('follows').count()
}.degree.groupCount
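The degree distribution query does two things: it computes the out-degree of every vertex, then counts how many vertices share each degree. A small Python sketch of the same two steps follows. The edge list and names are made up for illustration, and nothing here reflects how Faunus distributes the work across Hadoop jobs.

```python
from collections import Counter

# Hypothetical "follows" edges as (follower, followed) pairs.
users = ["a", "b", "c", "d", "e"]
follows_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "a")]

def degree_distribution(vertices, edges):
    """Map each out-degree k to the number of vertices with that degree."""
    # Step 1: out-degree per vertex (the sideEffect step on the slide).
    out_degree = Counter(src for src, _ in edges)
    # Step 2: group vertices by degree and count them (groupCount).
    return Counter(out_degree[v] for v in vertices)

dist = degree_distribution(users, follows_edges)
print(dict(dist))
```

Plotting the resulting counts against k on log-log axes is the usual way to check for a power-law shape.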
Degree Distribution
P(k) ~ k^(-γ)
γ = 2.2
Global
Recommendations
gremlin> g.E.has('label','pushed','to').keep.
             V.out('pushed').out('to').
             in('to').in('pushed').
             sideEffect('{it.score = it.pathCounter}').
             score.order(F.decr,'name')

# Top 5:
Jippi            60892182927
garbear          30095282886
FakeHeal         30038040349
brianchandotcom  24684133382
nyarla           15230275746
Aurelius Graph Cluster
OLTP
 OLAP
Hadoop
MapReduce
Analysis results
back into Titan
Apache 2
g.V.label.groupCount
g.v(101).out
titan.thinkaurelius.com
 faunus.thinkaurelius.com
aureliusgraphs@googlegroups.com
AURELIUS
THINKAURELIUS.COM
@AURELIUSGRAPHS

Titan @ Gitpro Conference 2014