"A Thorough Introduction to GraphDB": From Understanding Its Structure and Mechanisms to Use Cases and a Broad Comparison of Graph DBs
Takahiro Inoue
1. Graph DB
GraphDB
doryokujin
+WEB ( Tokyo.Webmining #9-2)
2. [Me]
doryokujin
2
2 33
[Company]
1
3. [ ]
MongoDB JP
TokyoWebMining MongoDB
[ ]
MongoDB
MongoDB GraphDB
4. #1
[MongoTokyo]
MongoDB Conference in Japan
2011/03/01
10gen 3
…
http://www.10gen.com/conferences/mongotokyo2011
5. #2
[gihyo ]
gihyo.jp
2
DocumentDB GraphDB
NoSQL
6. Graph
Graph
Graph DB
Graph Traversal
Graph DB
Neo4j, Sones, InfoGrid, OrientDB, InfiniteGraph
Tinker Pop
Gremlin, Blueprints, Pipes, Rexster, Mutant
7. Graph
8. Graph: Graph
Graph DB
Graph
9. Graph
[Graph]
A set of Dots (vertices) and Lines (edges)
One relationship is expressed with Dots and Lines
Graph
11. Directed Graph
[ (Directed) Graph]
Vertices:
Edges:
(relationship)
(asymmetric)
12. Directed / Undirected Graph
[Facebook]: "friends" relationships form an "Undirected Graph"
[Twitter]: "follow" relationships form a "Directed Graph"
14. Single-Relational Graph
[Facebook]: an "Undirected Graph"; vertices are "Facebook" users, every edge is "friends"
[Twitter]: a "Directed Graph"; vertices are "Twitter" users, every edge is "follow"
15. Single-Relational
[Twitter]
[Figure: a "Directed Graph" of "Twitter" users; the edges are "Reply", "RT", "DM", and "Block", each annotated with a count such as num:5, num:2, or num:1]
16. [Figure: one graph mixing Undirected and Directed relationships, with edges labeled friend, follow, share*, lives_in, and is]
*Facebook Flickr
18. Multi-Relational
[Twitter]
[Figure: a graph of "Twitter" users connected by several kinds of edges: "Reply", "RT", "DM", "Block"]
19. Multi-Relational
[Multi-Relational Graph]
[Figure: one graph with edges labeled friend, follow, share*, lives_in, and has]
*Facebook Flickr
20. Property Graph
Property Graph
Multi-Relational Graph (Property)
Graph DB Graph
1
key/value
Example:
  vertex { id: id_A, follow: 100, follower: 200 }
    --follow { date: 2011/01/23 }-->
  vertex { id: id_B, follow: 500, follower: 1000 }
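The property graph model on this slide can be sketched in plain Java. This is only an illustrative sketch, not the API of any particular Graph DB: the class names `PropertyGraphSketch`, `Vertex`, and `Edge` and the helper methods `connect` and `buildSlideExample` are assumptions introduced here. It shows that both vertices and labeled, directed edges carry their own key/value properties.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative property graph: vertices and labeled, directed edges both hold key/value properties. */
final class PropertyGraphSketch {

    static final class Vertex {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Edge> outEdges = new ArrayList<>(); // edges leaving this vertex
        final List<Edge> inEdges = new ArrayList<>();  // edges arriving at this vertex

        /** Sets one property and returns this vertex so calls can be chained. */
        Vertex set(String key, Object value) {
            properties.put(key, value);
            return this;
        }
    }

    static final class Edge {
        final String label;     // relationship type, e.g. "follow"
        final Vertex outVertex; // tail: where the edge starts
        final Vertex inVertex;  // head: where the edge ends
        final Map<String, Object> properties = new LinkedHashMap<>();

        Edge(String label, Vertex outVertex, Vertex inVertex) {
            this.label = label;
            this.outVertex = outVertex;
            this.inVertex = inVertex;
        }
    }

    /** Creates a directed, labeled edge and registers it on both endpoints. */
    static Edge connect(Vertex from, String label, Vertex to) {
        Edge edge = new Edge(label, from, to);
        from.outEdges.add(edge);
        to.inEdges.add(edge);
        return edge;
    }

    /** Builds the two-vertex "follow" example from the slide and returns its edge. */
    static Edge buildSlideExample() {
        Vertex a = new Vertex().set("id", "id_A").set("follow", 100).set("follower", 200);
        Vertex b = new Vertex().set("id", "id_B").set("follow", 500).set("follower", 1000);
        Edge follow = connect(a, "follow", b);
        follow.properties.put("date", "2011/01/23");
        return follow;
    }

    public static void main(String[] args) {
        Edge follow = buildSlideExample();
        System.out.println(follow.outVertex.properties + " -" + follow.label
                + follow.properties + "-> " + follow.inVertex.properties);
    }
}
```

Keeping in/out edge lists on each vertex is one possible design; it is what later makes traversal from a vertex possible without consulting a separate lookup structure.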
21. Property Graph
[Figure: the "Twitter" graph as a "Property Graph"; the edges "Reply", "RT", "DM", and "Block" each carry a "num" property such as num:5, num:2, or num:1]
22. Property Graph
[Figure: a Property Graph with edges labeled lives_in, has, friend, and follow; some edges carry a date property (2010/03/23, 2011/01/23). Vertex properties shown in the figure:
  { name: doryokujin, sex: man, birth: 1985/05/14 }
  { name: full name, mail: xxx@yyy, address: zzz }
  { id: id_A, follow: 100, follower: 200 }
  { id: id_B, follow: 1000, follower: 2000 }]
23. Graph
The Graph Traversal Pattern
24. Property Graph
Property Graph Graph
Property Graph Graph DB
Tinker Pop
Hyper Graph
25. Graph DB
27. Graph DB
[ DB ≠ Graph DB]
Graph
DB
DB Graph DB
28. RDB Graph
[Relational Database]
outV  inV
A     B
A     C
C     D
D     A
[Figure: directed graph A→B, A→C, C→D, D→A]
29. Document DB Graph
[Document Database]
{
  "A": { "out": ["B", "C"], "in": ["D"] },
  "B": { "in": ["A"] },
  "C": { "out": ["D"], "in": ["A"] },
  "D": { "out": ["A"], "in": ["C"] }
}
30. XML DB Graph
[XML Database]
<graphml>
  <graph>
    <node id="A" />
    <node id="B" />
    <node id="C" />
    <node id="D" />
    <edge source="A" target="B" />
    <edge source="A" target="C" />
    <edge source="C" target="D" />
    <edge source="D" target="A" />
  </graph>
</graphml>
31. Graph DB
[ ]
“A graph database is any storage system that provides index-free adjacency”
(The Graph Traversal Programming Pattern)
(“adjacent”)
( “index-free” )
32. Non-Graph DB and Index-Based Adjacency
[Figure: adjacent vertices are found through a global index. Starting from A, the index is searched to obtain (B,C); each index search has a log_2(n) time cost]
33. Graph DB and Index-Free Adjacency
‣ Each vertex acts as a "Mini-Index" of its adjacent vertices
[Figure: starting from A, the adjacent vertices (B,C) are reached in 1 step, without searching a global index]
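The difference between index-based and index-free adjacency can be sketched as follows. This is a hedged, simplified illustration rather than how any specific database is implemented: `IndexedStore`, `Vertex`, and the edges A→B, A→C, B→E, C→D are assumptions chosen for the example. The index-based store searches one global sorted map on every hop, while the index-free vertex simply follows the references it holds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class AdjacencySketch {

    /** Index-based adjacency: every hop searches one global sorted index (O(log n) per lookup). */
    static final class IndexedStore {
        private final TreeMap<String, List<String>> index = new TreeMap<>();

        void addEdge(String from, String to) {
            List<String> targets = index.get(from);
            if (targets == null) {
                targets = new ArrayList<>();
                index.put(from, targets);
            }
            targets.add(to);
        }

        List<String> neighbors(String id) {
            List<String> targets = index.get(id); // tree search on every call
            return targets == null ? new ArrayList<String>() : targets;
        }
    }

    /** Index-free adjacency: a vertex holds direct references to its neighbors. */
    static final class Vertex {
        final String id;
        final List<Vertex> out = new ArrayList<>();

        Vertex(String id) {
            this.id = id;
        }
    }

    /** Two hops through the global index: one index search per visited vertex. */
    static List<String> twoHopsIndexed(IndexedStore store, String start) {
        List<String> result = new ArrayList<>();
        for (String first : store.neighbors(start)) {
            result.addAll(store.neighbors(first));
        }
        return result;
    }

    /** Two hops by following references only: no global lookup is involved. */
    static List<String> twoHopsDirect(Vertex start) {
        List<String> result = new ArrayList<>();
        for (Vertex first : start.out) {
            for (Vertex second : first.out) {
                result.add(second.id);
            }
        }
        return result;
    }

    /** Hypothetical example graph (A→B, A→C, B→E, C→D) in the index-based store. */
    static List<String> demoIndexed() {
        IndexedStore store = new IndexedStore();
        store.addEdge("A", "B");
        store.addEdge("A", "C");
        store.addEdge("B", "E");
        store.addEdge("C", "D");
        return twoHopsIndexed(store, "A");
    }

    /** The same hypothetical graph built from directly linked vertices. */
    static List<String> demoDirect() {
        Vertex a = new Vertex("A");
        Vertex b = new Vertex("B");
        Vertex c = new Vertex("C");
        Vertex d = new Vertex("D");
        Vertex e = new Vertex("E");
        a.out.add(b);
        a.out.add(c);
        b.out.add(e);
        c.out.add(d);
        return twoHopsDirect(a);
    }

    public static void main(String[] args) {
        System.out.println("index-based: " + demoIndexed());
        System.out.println("index-free:  " + demoDirect());
    }
}
```

Both variants return the same vertices; the point of the sketch is only where the cost of each hop comes from.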
36. Graph DB Query
Graph Query = Graph Traversal
Traversal =
Root
Graph
Graph Traversal (Root)
Index-Free Adjacency
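The idea that a graph query is a traversal starting from a Root can be sketched without any Graph DB library. The following is a minimal breadth-first traversal, assuming the small graph used on the RDB / Document DB / XML DB slides (A→B, A→C, C→D, D→A); `TraversalSketch`, `depthsFrom`, and `slideGraph` are hypothetical names introduced for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TraversalSketch {

    /** Breadth-first traversal from root; returns each reached vertex with its depth, root excluded. */
    static Map<String, Integer> depthsFrom(Map<String, List<String>> outgoing, String root) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(root, 0);
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            List<String> next = outgoing.get(current);
            if (next == null) {
                continue; // vertex without outgoing edges
            }
            for (String neighbor : next) {
                if (!depth.containsKey(neighbor)) { // visit each vertex once
                    depth.put(neighbor, depth.get(current) + 1);
                    queue.add(neighbor);
                }
            }
        }
        depth.remove(root); // report everything except the start vertex
        return depth;
    }

    /** Outgoing adjacency lists for the graph A→B, A→C, C→D, D→A. */
    static Map<String, List<String>> slideGraph() {
        Map<String, List<String>> outgoing = new HashMap<>();
        outgoing.put("A", Arrays.asList("B", "C"));
        outgoing.put("C", Arrays.asList("D"));
        outgoing.put("D", Arrays.asList("A"));
        return outgoing;
    }

    public static void main(String[] args) {
        // prints {B=1, C=1, D=2}
        System.out.println(depthsFrom(slideGraph(), "A"));
    }
}
```

The cycle D→A is handled by the visited check, so the traversal ends once every reachable vertex has been seen.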
37.
private void printFriends( Node person )
{
    Traverser traverser = person.traverse(
        Order.BREADTH_FIRST,                     // breadth-first search
        StopEvaluator.END_OF_GRAPH,              // continue to the end of the Graph
        ReturnableEvaluator.ALL_BUT_START_NODE,  // return every Node except the Root Node
        MyRelationshipTypes.KNOWS,               // follow "KNOWS" relationships
        Direction.OUTGOING );                    // in the outgoing direction
    for ( Node friend : traverser )
    {
        // print the "name" property of each Node
        System.out.println( friend.getProperty( "name" ) );
    }
}
(Neo4j Wiki)
38. [Figure: the example graph from the Neo4j Wiki, with the nodes Trinity, Morpheus, Cypher, and Agent Smith labeled with depths 1 to 3]
(Neo4j Wiki)
39.
private void findHackers( Node startNode )
{
    Traverser traverser = startNode.traverse(
        Order.BREADTH_FIRST,          // breadth-first search
        StopEvaluator.END_OF_GRAPH,   // continue to the end of the Graph
        new ReturnableEvaluator()     // custom rule for which Nodes are returned
        {
            public boolean isReturnableNode( TraversalPosition currentPosition )
            {
                Relationship rel = currentPosition.lastRelationshipTraversed();
                if ( rel != null && rel.isType( MyRelationshipTypes.CODED_BY ) )
                {
                    return true;      // reached through a "CODED_BY" relationship
                }
                return false;         // everything else is skipped
            }
        },
        // 2 relationship types are followed
        MyRelationshipTypes.CODED_BY, Direction.OUTGOING,
        MyRelationshipTypes.KNOWS, Direction.OUTGOING );
    for ( Node hacker : traverser )
    {
        TraversalPosition position = traverser.currentPosition();
        System.out.println( "At depth " + position.depth() + " => " + hacker.getProperty( "name" ) );
    }
}
∴ At depth 4 => The Architect
(Neo4j Wiki)
42. Graph DB
” ”
10
”Knows”
Tables, Documents, Key/Value Model
GraphDB Union,
Intersection, Join
43. Graph DB
[ ]
Property Graph
Index-Free Adjacency
Graph Query = Graph Traversal
Data Locality
44. Graph DB
46. Neo4j
[ ] HP
Java
AGPLv3
2003 24 8
2009 VC
ACID
Property Graph Model / Gremlin
Lucene
47. Neo4j
[Language Binding - Framework]
Python - Django
Ruby - Ruby on Rails
Clojure
Scala
Groovy - Griffin / Grails
Java - Spring Framework
Ruby
Ruby Java
48. Neo4j
[Tools]
Shell
Shell Graph Traverse Indexing
neo4j-server
Neo4j REST API
Admin tools
Online BackUp
Neoclipse
Neo4j ↑
Batch Insert
49. Neo4j
[ver. 1.2] 1.2
Neo4j Server
REST API
Admin Interface
High Availability
Kernel
51. sones
[ ] HP
C#
AGPLv3
2011 VC
ACID
REST Interface
Property Graph Model / Gremlin
: Property Hyper Graph
Graph Query Language(GQL)
52. sones
[GQL]
SQL Traversal
Cheat Sheet
Query
• FROM User SELECT User.Friends.Friends.Name
// aggregation
• SELECT COUNT(User.Friends)
• SELECT User.Friends.Random(2)
• SELECT User.Friends.Name.Substring(2,5)
54. Orient DB
[ ] HP
Java
Apache2.0
1997 C++ → Java
Document-Graph DB
ACID
Shell / REST Interface
Property Graph Model / Gremlin
55. Orient DB
[Document-Graph DB]
[ ] Orient DB Object DB Key/Value
Server Document DB
// DATABASE OPEN
ODatabaseDocumentTx db = new ODatabaseDocumentTx("remote:localhost/petshop").open("admin", "admin");
// Document
ODocument doc = new ODocument(db, "Person");
doc.field( "name", "Luke" );
doc.field( "surname", "Skywalker" );
doc.field( "city", new ODocument(db, "City").field("name","Rome").field("country", "Italy") );
// Transaction
doc.save();
db.close();
56. Orient DB
[Document-Graph DB]
OGraphVertex
OGraphEdge
OGraphElement
ODocumentWrapper
Document
SQL
SELECT FROM OGraphVertex WHERE outEdges CONTAINS ( label = 'knows' )
// traverse up to depth 7 along "knows" edges
SELECT FROM OGraphVertex WHERE outEdges TRAVERSE(0,7,'out,outEdges') ( @class = 'OGraphEdge' and label = 'knows' )
57. Orient DB
[Language Binding Using Binary Protocol]
Java
C
PHP
JRuby (Ruby: soon)
[Language Binding Using REST Protocol]
Python
JavaScript
59. InfoGrid
[ ] HP
Java
AGPLv3
ACID
REST Interface
MeshObject Graph
MeshBase _GDB = StoreMeshBase.create(_MySQLStore);
MeshObject _xkcd = _GDB.getMeshObjectLifecycleManager().createMeshObject();
_xkcd.setProperty("Name", "xkcd");
_xkcd.setProperty("Url", "http://www.xkcd.com");
// _good is another MeshObject, created the same way (not shown on the slide)
_xkcd.relate(_good);
61. Infinite Graph
[ ] HP
C++
Academic and Start Up
2010 6
Distributed Graph DB
↑Objectivity/DB: distributed database server
63. Graph DB:
GraphDB         License    Language  Protocol         Data Model                    Gremlin  Binding                     SQL-Like Query
Neo4j           AGPLv3     Java      REST/JSON        Property Graph                Yes      Ruby, Python, Scala,...     -
sones           AGPLv3     C#        REST/JSON (XML)  Property Graph (+Extend)      Yes      -                           Yes
OrientDB        Apache2.0  Java      REST/JSON        Property Graph                Yes      PHP, JRuby, Python, JS,...  Yes
InfoGrid        AGPLv3     Java      REST/JSON        Property Graph? (MeshObject)  -        -                           -
Infinite Graph  Product    C++       -                Property Graph                -        -                           -
64. Graph DB
[ ]
Graph DB
Neo4j
Open Source Social Graph Software Not Ready Yet
Graph DB
Hypergraph: PropertyGraph HyperGraph
Pregel: bulk synchronous parallel model, Distributed DB (Google)
FlockDB: Distributed DB for storing adjacency lists (Twitter)
65. Tinker Pop
67. Tinker Pop
[Tinker Pop] HP
Property Graph Model
GraphDB
Blueprints: A Property Graph Model Interface
Gremlin: A Graph Traversal Language
Pipes: A Data Flow Framework using Process Graphs
Rexster: A RESTful Graph Shell
Mutant: A Poly-ScriptEngine ScriptEngine
70. BluePrints
[ ] HP
GraphDB ”JDBC”
Property Graph Model GraphDB
[Now]
Tinker Graph: in-memory property graph model
Sail: Open RDF
Neo4j, Orient DB, sones, ...
[Future]
Redis
Infinite Graph, Dex
71. BluePrints
GraphDB
Graph graph = new Neo4jGraph("/tmp/graph/neo4j");
// Graph graph = new OrientGraph("/tmp/graph/orientdb");
Vertex a = graph.addVertex(null);
Vertex b = graph.addVertex(null);
a.setProperty("name","marko");
b.setProperty("name","aaron");
Edge e = graph.addEdge(null,a,b,"knows");
e.setProperty("since",2010);
graph.shutdown();
72. BluePrints
Transaction
graph.startTransaction();
try{
Vertex luca = graph.addVertex(null);
luca.setProperty( "name", "Luca" );
Vertex marko = graph.addVertex(null);
marko.setProperty( "name", "Marko" );
Edge lucaKnowsMarko = graph.addEdge(null, luca, marko,"knows");
graph.stopTransaction(Conclusion.SUCCESS);
} catch( Exception e ) {
graph.stopTransaction(Conclusion.FAILURE);
}
74. Gremlin
[ ] HP
Gremlin = Graph Programming Language
Blueprints GraphDB
Shell
GraphDB Query
Java + Groovy
76.
doryokujin$ ./gremlin.sh
         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
// 6 vertices, 6 edges
gremlin> g.class
==>class com.tinkerpop.blueprints.pgm.impls.tg.TinkerGraph
gremlin> // all vertices
gremlin> g.V
==>v[3]
==>v[2]
...
gremlin> // all edges
gremlin> g.E
==>e[10][4-created->5]
==>e[7][1-knows->2]
==>e[9][1-created->3]
...
(Getting Started)
77.
gremlin> v = g.v(1)  // the vertex with id=1
==>v[1]
gremlin> v.keys()  // property keys
==>age
==>name
gremlin> v.values()  // property values
==>29
==>marko
gremlin> v.name + ' is ' + v.age + ' years old.'
==>marko is 29 years old.
gremlin> // outgoing edges of the vertex id=1, name=marko
gremlin> v.outE
==>e[7][1-knows->2]
==>e[9][1-created->3]
==>e[8][1-knows->4]
gremlin> // weight of each outgoing edge
gremlin> v.outE.weight
==>0.5
==>0.4
==>1.0
(Getting Started)
78.
gremlin> // vertices reached from id=1 through edges whose weight is less than 1.0
gremlin> v.outE{it.weight < 1.0}.inV
==>v[2]
==>v[3]
gremlin> // store the result in a list
gremlin> list = []
gremlin> v.outE{it.weight < 1.0}.inV >> list
==>v[2]
==>v[3]
gremlin> // property maps of the vertices in list
gremlin> list.collect{ it.map() }
==>{name=vadas, age=27}
==>{name=lop, lang=java}
gremlin> // incoming edges of the vertices in list
gremlin> list.inE()
==>e[7][1-knows->2]
==>e[9][1-created->3]
...
(Getting Started)
79.
gremlin> list.inE{it.label=='knows'}  // only the 'knows' edges
==>e[7][1-knows->2]
gremlin> list.inE()[[label:'knows']]  // the same filter written as a property filter
==>e[7][1-knows->2]
gremlin> list.inE()[[label:'knows']].outV.name  // :name of the source vertex
==>marko
(Getting Started)
~20000ms:
g.V.outE{it['label']=='followed_by'}.inV.outE{it['label']=='followed_by'}.inV.outE{it['label']=='followed_by'}.inV >> -1
~9000ms:
g.V.outE{it.label=='followed_by'}.inV.outE{it.label=='followed_by'}.inV.outE{it.label=='followed_by'}.inV >> -1
~8500ms:
g.V.outE{it.getLabel()=='followed_by'}.inV.outE{it.getLabel()=='followed_by'}.inV.outE{it.getLabel()=='followed_by'}.inV >> -1
~6000ms:
g.V.outE[[label:'followed_by']].inV.outE[[label:'followed_by']].inV.outE[[label:'followed_by']].inV >> -1
ClosureFilterPipe vs. PropertyFilterPipe
81. Pipes
[ ] HP
Pipes = Data Flow Framework
Pipes Graph Traversal 1 1
Pipes filtering, splitting, merging, traversing,...
82. Gremlin
g:id-v('a')/outE[@label='knows']/inV/outE[@label='develops']/inV/@name
Pipe pipe1 = new VertexEdgePipe(Step.OUT_EDGES);
Pipe pipe2 = new LabelFilterPipe("knows", Filter.NOT_EQUALS);
Pipe pipe3 = new EdgeVertexPipe(Step.IN_VERTEX);
Pipe pipe4 = new VertexEdgePipe(Step.OUT_EDGES);
Pipe pipe5 = new LabelFilterPipe("develops", Filter.NOT_EQUALS);
Pipe pipe6 = new EdgeVertexPipe(Step.IN_VERTEX);
Pipe pipe7 = new PropertyPipe("name");
// chain the 7 pipes: vertex "a" goes in, names come out
Pipe<Vertex,String> pipeline = new Pipeline<Vertex,String>(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);
pipeline.setStarts(new SingleIterator<Vertex>(graph.getVertex("a")));
for(String name : pipeline) {
  System.out.println(name);
}
(A Graph Processing Stack)
83. Pipes
Pipes
public class NumCharsPipe extends AbstractPipe<String,Integer> {
  // consume the next String from the upstream pipe and emit its length
  public Integer processNextStart() {
    String word = this.starts.next();
    return word.length();
  }
}
(A Graph Processing Stack)
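The data-flow idea behind Pipes, where each stage lazily consumes the output of the previous one, can be sketched with plain Java iterators. This is not the Pipes library itself: `PipeSketch`, `filter`, `transform`, and `numCharsOfLongWords` are hypothetical names, and the word list is made up for the illustration. It chains a filtering stage and a length-computing stage similar in spirit to NumCharsPipe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

final class PipeSketch {

    /** A transforming stage: pulls one element from upstream and emits the mapped value. */
    static <S, E> Iterator<E> transform(final Iterator<S> starts, final Function<S, E> step) {
        return new Iterator<E>() {
            public boolean hasNext() {
                return starts.hasNext();
            }

            public E next() {
                return step.apply(starts.next());
            }
        };
    }

    /** A filtering stage: pulls from upstream until an element passes the predicate. */
    static <S> Iterator<S> filter(final Iterator<S> starts, final Predicate<S> keep) {
        return new Iterator<S>() {
            private S buffered;
            private boolean ready;

            public boolean hasNext() {
                while (!ready && starts.hasNext()) {
                    S candidate = starts.next();
                    if (keep.test(candidate)) {
                        buffered = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            public S next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return buffered;
            }
        };
    }

    /** Chains the two stages: keep words longer than 3 characters, then emit their lengths. */
    static List<Integer> numCharsOfLongWords(List<String> words) {
        Iterator<String> longWords = filter(words.iterator(), w -> w.length() > 3);
        Iterator<Integer> lengths = transform(longWords, String::length);
        List<Integer> result = new ArrayList<>();
        while (lengths.hasNext()) {
            result.add(lengths.next());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [5, 5]
        System.out.println(numCharsOfLongWords(Arrays.asList("marko", "aaron", "lop")));
    }
}
```

Nothing is computed until the final loop pulls elements, which is the property that lets stages be composed freely.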
85. Rexster
[ ] HP
Rexster = A RESTful Graph Shell
Blueprints GraphDB RESTful
API (JSON)
Gremlin
86. > http://localhost:8182/examplegraph/vertices/b
{
"version":"0.1",
"results": {
"_type":"vertex",
"_id":"b",
"name":"aaron",
"type":"person"
},
"query_time":0.1537
}
(A Graph Processing Stack)
// g:key-v('name','DARK STAR')[0]: Using Gremlin code
> http://localhost:8182/gratefulgraph/traversals/gremlin?script=g:key-v%28%27name%27,%27DARK%20STAR%27%29[0]
{
"results":
[{
"_type":"vertex",
"_id":"89",
"name":"DARK STAR",
"song_type":"original",
"performances":219,
"type":"song"}
],
"query_time":6.753024,
"success":true,
"version"
}
(Using Gremlin)
88. Mutant
[ ] HP
Mutant = A Poly-ScriptEngine ScriptEngine
JVM
Script Engine
89. Mutant Console
marko:~/software/mutant$ ./mutant.sh
      //
     oO ~~-_
___m(___m___~.___  MuTanT 0.1-SNAPSHOT
_|__|__|__|__|__|  [ ?h = help ]
[gremlin] gremlin 0.6-SNAPSHOT
[Groovy] Groovy Scripting Engine 2.0
[ruby] JSR 223 JRuby Engine 1.5.5
[ECMAScript] Mozilla Rhino 1.6 release 2
[AppleScript] AppleScriptEngine 1.0
mutant[gremlin]> $x := 12
[12]
mutant[gremlin]> ?x
mutant[AppleScript]> ?x
mutant[Groovy]> $x
12
mutant[Groovy]> ?x
mutant[ruby]> $x
12
mutant[ruby]> ?x
mutant[ECMAScript]> $x
12
(Basic Examples)
90. [ ]
Graph DB
Graph DB
Graph Partitioning
Pregel Neo4j
91. …
※
Graph DB
http://snap.stanford.edu/data/index.html