"A Thorough Introduction to GraphDB": from structure and internals to use cases and a broad comparison of GraphDB products


Published in: Technology

  1. 1. Graph DB GraphDB doryokujin+WEB ( Tokyo.Webmining #9-2)
  2. 2. [Me] doryokujin 2 2 33[Company] 1
  3. 3. [ ] MongoDB JP TokyoWebMining MongoDB[ ] MongoDB MongoDB GraphDB
  4. 4. #1 [MongoTokyo] MongoDB Conference in Japan, 2011/03/01, 10gen 3 … http://www.10gen.com/conferences/mongotokyo2011
  5. 5. #2[gihyo ] gihyo.jp 2 DocumentDB GraphDB NoSQL
  6. 6. [Agenda] Graph / Graph DB / Graph Traversal / Graph DB products: Neo4j, Sones, InfoGrid, OrientDB, InfiniteGraph / Tinker Pop: Gremlin, Blueprints, Pipes, Rexster, Mutant
  7. 7. [Section: Graph]
  8. 8. Graph: GraphGraph DB Graph
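  9. 9. Graph [Graph]: Dots and Lines, formally vertices and edges. A Line expresses 1 relationship between two Dots; the Dots and Lines together form a Graph.
  10. 10. Undirected Graph [Undirected Graph]: Vertices: the Dots. Edges: Lines without direction, i.e. a symmetric relationship.
  11. 11. Directed Graph [Directed Graph]: Vertices: the Dots. Edges: Lines with direction, i.e. an asymmetric relationship.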
  12. 12. Directed / Undirected Graph: [Facebook] "friend" edges form an "Undirected Graph"; [Twitter] "follow" edges form a "Directed Graph".
  13. 13. Single-Relational Graph (Single-Relational Structures) → Undirected / Directed Graph. Single-Relational: the Graph holds only 1 kind of relationship.
  14. 14. Single-Relational Graph: [Facebook] an "Undirected Graph" whose vertices are "Facebook" users and whose edges are "friends"; [Twitter] a "Directed Graph" whose vertices are "Twitter" users and whose edges are "follow".
  15. 15. Single-Relational: [Twitter] a "Directed Graph" of "Twitter" users drawn separately for each edge type ("Reply", "RT", "DM", "Block"), with a count on each edge (for example num:5, num:2, num:1).
  16. 16. [Figure: a graph mixing Facebook and Flickr data, with edge labels is, lives_in, follow, friend and share; some edges are Undirected and some Directed]
  17. 17. Multi-Relational Graph (Multi-Relational Structures): several kinds of relationship in one Graph, e.g. lives_in: User → Country; share: Facebook → Flickr
  18. 18. Multi-Relational: [Twitter] a single Graph of "Twitter" users whose edges are typed "Reply", "RT", "DM" or "Block"
  19. 19. Multi-Relational [Multi-Relational Graph]: [Figure: the Facebook and Flickr graph with edge labels has, lives_in, follow, friend and share]
  20. 20. Property Graph: a Multi-Relational Graph whose vertices and edges each carry properties (key/value pairs); this is the model used by Graph DBs. Example: vertex { id: id_A, follow: 100, follower: 200 } connected by a "follow" edge { date: 2011/01/23 } to vertex { id: id_B, follow: 500, follower: 1000 }
  21. 21. Property Graph: the "Twitter" graph as a "Property Graph": edges typed "Reply", "RT", "DM", "Block", each carrying a "num" property (for example num:5, num:2, num:1)
  22. 22. Property Graph: [Figure: user vertices with properties such as name: doryokujin, sex: man, birth: 1985/05/14, mail: xxx@yyy, address: zzz and id / follow / follower counts, connected by has, lives_in, friend and follow edges that carry a date property]
  23. 23. Graph The Graph Traversal Pattern
  24. 24. Property Graph Property Graph Graph Property Graph Graph DB Tinker Pop Hyper Graph
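The Property Graph model from slides 20 to 22 can be sketched in a few lines of Java. This is only an illustrative sketch: the class and method names (`PropertyGraphSketch`, `Vertex`, `Edge`, `vertex`, `connect`) are hypothetical and are not the API of any GraphDB named in this deck. The property values are the ones shown on slide 20.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal property graph; not the API of any real GraphDB. */
final class PropertyGraphSketch {

    /** A vertex holds key/value properties plus direct references to its edges. */
    static final class Vertex {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Edge> outEdges = new ArrayList<>();
        final List<Edge> inEdges = new ArrayList<>();
    }

    /** A directed, labeled edge that carries its own key/value properties. */
    static final class Edge {
        final Vertex outVertex; // tail (source) of the edge
        final Vertex inVertex;  // head (target) of the edge
        final String label;     // relationship type, e.g. "follow"
        final Map<String, Object> properties = new LinkedHashMap<>();

        Edge(Vertex outVertex, String label, Vertex inVertex) {
            this.outVertex = outVertex;
            this.label = label;
            this.inVertex = inVertex;
        }
    }

    /** Creates a vertex with an "id" property. */
    static Vertex vertex(String id) {
        Vertex v = new Vertex();
        v.properties.put("id", id);
        return v;
    }

    /** Creates a labeled edge and registers it on both endpoints. */
    static Edge connect(Vertex from, String label, Vertex to) {
        Edge e = new Edge(from, label, to);
        from.outEdges.add(e);
        to.inEdges.add(e);
        return e;
    }

    public static void main(String[] args) {
        Vertex a = vertex("id_A");
        a.properties.put("follow", 100);
        a.properties.put("follower", 200);

        Vertex b = vertex("id_B");
        b.properties.put("follow", 500);
        b.properties.put("follower", 1000);

        Edge follow = connect(a, "follow", b);
        follow.properties.put("date", "2011/01/23");

        // prints: id_A -follow-> id_B {date=2011/01/23}
        for (Edge e : a.outEdges) {
            System.out.println(e.outVertex.properties.get("id") + " -" + e.label + "-> "
                    + e.inVertex.properties.get("id") + " " + e.properties);
        }
    }
}
```

Because every edge has a label, the same structure also covers the Multi-Relational case; a Single-Relational Graph is simply one where all labels are equal.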
  25. 25. [Section: Graph DB]
  26. 26. Graph DB: Property Graph DB“Graph DB”
  27. 27. Graph DB [a DB that stores a Graph ≠ Graph DB]: other kinds of DB can also hold a Graph, but that alone does not make them a Graph DB
  28. 28. Graph in an RDB [Relational Database]: an edge table with columns outV and inV, holding the rows (A, B), (A, C), (C, D), (D, A)
  29. 29. Graph in a Document DB [Document Database]: { A : { out : [B, C], in : [D] }, B : { in : [A] }, C : { out : [D], in : [A] }, D : { out : [A], in : [C] } }
  30. 30. Graph in an XML DB [XML Database]: <graphml><graph><node id=A /><node id=B /><node id=C /><node id=D /><edge source=A target=B /><edge source=A target=C /><edge source=C target=D /><edge source=D target=A /></graph></graphml>
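  31. 31. Graph DB [ ]: "A graph database is any storage system that provides index-free adjacency" (The Graph Traversal Programming Pattern). That is, neighboring ("adjacent") elements are reached directly, without going through an index ("index-free").
  32. 32. Non-Graph DB and Index-Based Adjacency: to move from A to its neighbors, A is looked up in a global index, which returns (B, C); B and C are then looked up in the same index to reach D and E. Every index lookup has a log_2(n) time cost.
  33. 33. Graph DB and Index-Free Adjacency: ‣ each vertex acts as a "Mini-Index" of its own adjacent elements ‣ from A, (B, C) are reached in 1 step ‣ vertices still carry their properties, e.g. id: id_B, follow: 1000, follower: 2000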
  34. 34. Property (key/value) The Graph Traversal Programming Pattern
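The contrast between slides 32 and 33 can be illustrated with a small Java sketch. It assumes a sorted in-memory map (`TreeMap`) as a stand-in for the global index of a non-graph DB, and plain object references as a stand-in for index-free adjacency; all class names are hypothetical, and the example edges (A→B, A→C, B→E, C→D) are an assumption loosely based on the figure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch contrasting index-based and index-free adjacency. */
final class AdjacencySketch {

    /** Index-based: one global sorted index maps a vertex id to its neighbor ids. */
    static final class IndexedStore {
        private final TreeMap<String, List<String>> index = new TreeMap<>();

        void addEdge(String from, String to) {
            index.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
            index.computeIfAbsent(to, k -> new ArrayList<>());
        }

        /** Every call is a tree lookup: O(log n) in the total number of vertices. */
        List<String> neighbors(String id) {
            return index.getOrDefault(id, List.of());
        }
    }

    /** Index-free: each vertex holds direct references to its adjacent vertices. */
    static final class Node {
        final String id;
        final List<Node> out = new ArrayList<>();

        Node(String id) {
            this.id = id;
        }
    }

    /** Two hops through the global index: one index lookup per visited vertex. */
    static List<String> twoHopsIndexed(IndexedStore store, String start) {
        List<String> result = new ArrayList<>();
        for (String mid : store.neighbors(start)) {
            result.addAll(store.neighbors(mid));
        }
        return result;
    }

    /** Two hops by following references: cost depends only on the local neighborhood. */
    static List<String> twoHopsIndexFree(Node start) {
        List<String> result = new ArrayList<>();
        for (Node mid : start.out) {
            for (Node end : mid.out) {
                result.add(end.id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        IndexedStore store = new IndexedStore();
        store.addEdge("A", "B");
        store.addEdge("A", "C");
        store.addEdge("B", "E");
        store.addEdge("C", "D");

        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        Node d = new Node("D"), e = new Node("E");
        a.out.add(b);
        a.out.add(c);
        b.out.add(e);
        c.out.add(d);

        System.out.println(twoHopsIndexed(store, "A")); // [E, D]
        System.out.println(twoHopsIndexFree(a));        // [E, D]
    }
}
```

Both calls return the same vertices; the difference is that the indexed version pays a lookup that grows with the size of the whole graph on every hop, while the index-free version only touches the vertices it actually visits.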
  35. 35. GraphDB: Graph Traversal Graph DBGraph DB Query
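  36. 36. Graph DB Query: Graph Query = Graph Traversal. A Traversal starts at a Root vertex and walks the Graph from there; thanks to Index-Free Adjacency, a Graph Traversal from the Root only touches the elements it visits.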
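  37. 37. Neo4j traversal example (source: Neo4j Wiki):

          // Neo4j 1.x types, from org.neo4j.graphdb: Node, Direction, Traverser,
          // Traverser.Order, StopEvaluator, ReturnableEvaluator
          private void printFriends( Node person )
          {
              Traverser traverser = person.traverse(
                  Order.BREADTH_FIRST,                     // breadth-first search
                  StopEvaluator.END_OF_GRAPH,              // keep going until the end of the Graph
                  ReturnableEvaluator.ALL_BUT_START_NODE,  // return every node except the Root Node
                  MyRelationshipTypes.KNOWS,               // follow only "KNOWS" relationships
                  Direction.OUTGOING );                    // in the outgoing direction
              for ( Node friend : traverser )
              {
                  // print the "name" property of each returned Node
                  System.out.println( friend.getProperty( "name" ) );
              }
          }

  38. 38. [Figure: the example graph with the nodes Trinity, Morpheus, Cypher and Agent Smith and the depth at which each is reached; source: Neo4j Wiki]
  39. 39. Neo4j traversal example with a custom evaluator (source: Neo4j Wiki):

          private void findHackers( Node startNode )
          {
              Traverser traverser = startNode.traverse(
                  Order.BREADTH_FIRST,         // breadth-first search
                  StopEvaluator.END_OF_GRAPH,  // keep going until the end of the Graph
                  new ReturnableEvaluator()    // custom rule deciding which nodes are returned
                  {
                      public boolean isReturnableNode( TraversalPosition currentPosition )
                      {
                          Relationship rel = currentPosition.lastRelationshipTraversed();
                          if ( rel != null && rel.isType( MyRelationshipTypes.CODED_BY ) )
                          {
                              return true;   // reached over a "CODED_BY" relationship
                          }
                          return false;      // everything else is skipped
                      }
                  },
                  // 2 relationship types are followed, both outgoing
                  MyRelationshipTypes.CODED_BY, Direction.OUTGOING,
                  MyRelationshipTypes.KNOWS, Direction.OUTGOING );
              for ( Node hacker : traverser )
              {
                  TraversalPosition position = traverser.currentPosition();
                  System.out.println( "At depth " + position.depth() + " => "
                      + hacker.getProperty( "name" ) );
              }
          }

          ∴ At depth 4 => The Architect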
  40. 40. Graph DB[Data Locality] [Local Search, Social Network] 2 [Transition] Web [Recommendation]
  41. 41. [Graph Problems] [Shortest Path] 2GraphDB Traversal Neo4jrb
  42. 42. Graph DB ” ” 10 ”Knows” Tables, Documents, Key/Value Model GraphDB Union, Intersection, Join
  43. 43. Graph DB [ ]: Property Graph; Index-Free Adjacency; Graph Query = Graph Traversal; Data Locality
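"Graph Query = Graph Traversal" can be made concrete without any database. The sketch below is a plain-Java breadth-first traversal from a Root vertex along one edge label, roughly mirroring what the Neo4j example on slide 37 asks for (breadth first, to the end of the graph, everything but the start node, outgoing "KNOWS" edges). The class `TraversalSketch`, its methods, the start vertex name "Neo" and the exact edges are assumptions made for illustration; only the names Trinity, Morpheus, Cypher, Agent Smith and The Architect come from the slides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory graph with a label-filtered breadth-first traversal. */
final class TraversalSketch {

    /** Directed, labeled adjacency: vertex -> (edge label -> target vertices). */
    private final Map<String, Map<String, List<String>>> out = new LinkedHashMap<>();

    void addEdge(String from, String label, String to) {
        out.computeIfAbsent(from, k -> new LinkedHashMap<>())
           .computeIfAbsent(label, k -> new ArrayList<>())
           .add(to);
    }

    /**
     * Breadth-first traversal from root along outgoing edges with the given label.
     * Returns every reached vertex except the root, mapped to its depth. In an
     * unweighted graph that depth is also the length of the shortest path.
     */
    Map<String, Integer> breadthFirst(String root, String label) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        depth.put(root, 0);
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            List<String> targets =
                out.getOrDefault(current, Map.of()).getOrDefault(label, List.of());
            for (String next : targets) {
                if (!depth.containsKey(next)) { // visit each vertex once
                    depth.put(next, depth.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        depth.remove(root); // "all but start node"
        return depth;
    }

    /** Assumed edges, modeled on the Matrix example referenced in the slides. */
    static TraversalSketch matrixExample() {
        TraversalSketch g = new TraversalSketch();
        g.addEdge("Neo", "KNOWS", "Trinity");
        g.addEdge("Neo", "KNOWS", "Morpheus");
        g.addEdge("Morpheus", "KNOWS", "Trinity");
        g.addEdge("Morpheus", "KNOWS", "Cypher");
        g.addEdge("Cypher", "KNOWS", "Agent Smith");
        g.addEdge("Agent Smith", "CODED_BY", "The Architect");
        return g;
    }

    public static void main(String[] args) {
        Map<String, Integer> friends = matrixExample().breadthFirst("Neo", "KNOWS");
        // prints: {Trinity=1, Morpheus=1, Cypher=2, Agent Smith=3}
        System.out.println(friends);
    }
}
```

The traversal never consults a global structure to decide where to go next; it only expands the neighbors of vertices it has already reached, which is the Data Locality point of the summary slide.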
  44. 44. [Section: Graph DB products (Neo4j, Sones, InfoGrid, OrientDB, InfiniteGraph)]
  45. 45. Neo4j
  46. 46. Neo4j [ ]: HP; Java; AGPLv3; 2003 24 8; 2009 VC; ACID; Property Graph Model / Gremlin; Lucene
  47. 47. Neo4j [Language Binding - Framework]: Python - Django; Ruby - Ruby on Rails; Clojure; Scala; Groovy - Griffon / Grails; Java - Spring Framework; Ruby Ruby Java
  48. 48. Neo4j [Tools]: Shell: Shell, Graph Traverse; Indexing; neo4j-server: Neo4j REST API, Admin tools; Online Backup; Neoclipse: Neo4j ↑; Batch Insert
  49. 49. Neo4j [ver. 1.2]: in 1.2: Neo4j Server; REST API; Admin Interface; High Availability; Kernel
  50. 50. sones
  51. 51. sones [ ]: HP; C#; AGPLv3; 2011 VC; ACID; REST Interface; Property Graph Model / Gremlin; Property Hyper Graph; Graph Query Language (GQL)
  52. 52. sones [GQL]: SQL-like Traversal Query (Cheat Sheet):
          • FROM User SELECT User.Friends.Friends.Name
          // aggregation
          • SELECT COUNT(User.Friends)
          • SELECT User.Friends.Random(2)
          • SELECT User.Friends.Name.Substring(2,5)
  53. 53. Orient DB
  54. 54. Orient DB [ ]: HP; Java; Apache2.0; 1997 C++ → Java; Document-Graph DB; ACID; Shell / REST Interface; Property Graph Model / Gremlin
  55. 55. Orient DB [Document-Graph DB] [ ]: Orient DB: Object DB, Key/Value Server, Document DB

          // OPEN THE DATABASE
          ODatabaseDocumentTx db = new ODatabaseDocumentTx("remote:localhost/petshop").open("admin", "admin");
          // CREATE A Document
          ODocument doc = new ODocument(db, "Person");
          doc.field( "name", "Luke" );
          doc.field( "surname", "Skywalker" );
          doc.field( "city", new ODocument(db, "City").field("name","Rome").field("country","Italy") );
          // Transaction: save the Document, then close
          doc.save();
          db.close();

  56. 56. Orient DB [Document-Graph DB]: OGraphVertex, OGraphEdge, OGraphElement, ODocumentWrapper; Documents queried with SQL:

          SELECT FROM OGraphVertex WHERE outEdges CONTAINS ( label = 'knows' )
          // up to depth 7 along "knows" edges
          SELECT FROM OGraphVertex WHERE outEdges TRAVERSE(0,7,out,outEdges) ( @class = 'OGraphEdge' and label = 'knows' )

  57. 57. Orient DB [Language Binding Using Binary Protocol]: Java, C, PHP, JRuby (Ruby: soon). [Language Binding Using REST Protocol]: Python, JavaScript
  58. 58. InfoGrid
  59. 59. InfoGrid [ ]: HP; Java; AGPLv3; ACID; REST Interface; MeshObject Graph

          MeshBase _GDB = StoreMeshBase.create(_MySQLStore);
          MeshObject _xkcd = _GDB.getMeshObjectLifecycleManager().createMeshObject();
          _xkcd.setProperty("Name", "xkcd");
          _xkcd.setProperty("Url", "http://www.xkcd.com");
          _xkcd.relate(_good);
  60. 60. Infinite Graph
  61. 61. Infinite Graph [ ]: HP; C++; Academic and Start Up; 2010 6; Distributed Graph DB ↑ Objectivity/DB: distributed database server
  62. 62. Graph DB comparison:

          GraphDB         License    Language  Protocol         Data Model                    Gremlin  Binding                     SQL-Like Query
          Neo4j           AGPLv3     Java      REST/JSON        Property Graph                Yes      Ruby, Python, Scala,...     -
          sones           AGPLv3     C#        REST/JSON (XML)  Property Graph (+Extend)      Yes      -                           Yes
          OrientDB        Apache2.0  Java      REST/JSON        Property Graph                Yes      PHP, JRuby, Python, JS,...  Yes
          InfoGrid        AGPLv3     Java      REST/JSON        Property Graph? (MeshObject)  -        -                           -
          Infinite Graph  Product    C++       -                Property Graph                -        -                           -
  63. 63. Graph DB [ ]: Graph DB: Neo4j; Open Source Social Graph Software Not Ready Yet. Beyond Graph DB: HyperGraph: Property Graph → HyperGraph; Pregel: bulk synchronous parallel model, Distributed DB (Google); FlockDB: Distributed DB for storing adjacency lists (Twitter)
  64. 64. [Section: Tinker Pop (Gremlin, Blueprints, Pipes, Rexster, Mutant)]
  65. 65. Tinker Pop
  66. 66. Tinker Pop [Tinker Pop]: HP; Property Graph Model GraphDB
          • Blueprints: A Property Graph Model Interface
          • Gremlin: A Graph Traversal Language
          • Pipes: A Data Flow Framework using Process Graphs
          • Rexster: A RESTful Graph Shell
          • Mutant: A Poly-ScriptEngine ScriptEngine
  67. 67. Tinker Pop
  68. 68. Tinker Pop: BluePrints
  69. 69. BluePrints [ ]: HP; the "JDBC" of GraphDBs: one interface over Property Graph Model GraphDBs. [Now] Tinker Graph: in-memory property graph model; Sail: Open RDF; Neo4j, Orient DB, sones, ... [Future] Redis, Infinite Graph, Dex
  70. 70. BluePrints: the same code against different GraphDBs:

          Graph graph = new Neo4jGraph("/tmp/graph/neo4j");
          // Graph graph = new OrientGraph("/tmp/graph/orientdb");
          Vertex a = graph.addVertex(null);
          Vertex b = graph.addVertex(null);
          a.setProperty("name","marko");
          b.setProperty("name","aaron");
          Edge e = graph.addEdge(null,a,b,"knows");
          e.setProperty("since",2010);
          graph.shutdown();

  71. 71. BluePrints: Transaction:

          graph.startTransaction();
          try {
            Vertex luca = graph.addVertex(null);
            luca.setProperty( "name", "Luca" );
            Vertex marko = graph.addVertex(null);
            marko.setProperty( "name", "Marko" );
            Edge lucaKnowsMarko = graph.addEdge(null, luca, marko, "knows");
            graph.stopTransaction(Conclusion.SUCCESS);
          } catch( Exception e ) {
            graph.stopTransaction(Conclusion.FAILURE);
          }
  72. 72. Tinker Pop: Gremlin
  73. 73. Gremlin [ ]: HP; Gremlin = Graph Programming Language; runs on Blueprints-enabled GraphDBs; Query a GraphDB from the Shell; Java + Groovy
  74. 74. GremlinProperty Graph Basic Graph Traversals
  75. 75. Gremlin session (source: Getting Started):

          doryokujin$ ./gremlin.sh
                   \,,,/
                   (o o)
          -----oOOo-(_)-oOOo-----
          gremlin> g = TinkerGraphFactory.createTinkerGraph()
          ==>tinkergraph[vertices:6 edges:6]   // 6 vertices, 6 edges
          gremlin> g.class
          ==>class com.tinkerpop.blueprints.pgm.impls.tg.TinkerGraph
          gremlin> // all vertices
          gremlin> g.V
          ==>v[3]
          ==>v[2]
          ...
          gremlin> // all edges
          gremlin> g.E
          ==>e[10][4-created->5]
          ==>e[7][1-knows->2]
          ==>e[9][1-created->3]
          ...

  76. 76. Gremlin session, continued (source: Getting Started):

          gremlin> v = g.v(1)    // the vertex with id=1
          ==>v[1]
          gremlin> v.keys()      // property keys
          ==>age
          ==>name
          gremlin> v.values()    // property values
          ==>29
          ==>marko
          gremlin> v.name + ' is ' + v.age + ' years old.'
          ==>marko is 29 years old.
          gremlin> // outgoing edges of the vertex with id=1, name=marko
          gremlin> v.outE
          ==>e[7][1-knows->2]
          ==>e[9][1-created->3]
          ==>e[8][1-knows->4]
          gremlin> // weight property of those edges
          gremlin> v.outE.weight
          ==>0.5
          ==>0.4
          ==>1.0

  77. 77. Gremlin session, continued (source: Getting Started):

          gremlin> // vertices reached from id=1 over edges with weight below 1.0
          gremlin> v.outE{it.weight < 1.0}.inV
          ==>v[2]
          ==>v[3]
          gremlin> // collect the result into a list
          gremlin> list = []
          gremlin> v.outE{it.weight < 1.0}.inV >> list
          ==>v[2]
          ==>v[3]
          gremlin> // property maps of the vertices in list
          gremlin> list.collect{ it.map() }
          ==>{name=vadas, age=27}
          ==>{name=lop, lang=java}
          gremlin> // incoming edges of the vertices in list
          gremlin> list.inE()
          ==>e[7][1-knows->2]
          ==>e[9][1-created->3]
          ...

  78. 78. Gremlin session, continued (source: Getting Started):

          gremlin> list.inE{it.label=='knows'}    // keep only "knows" edges
          ==>e[7][1-knows->2]
          gremlin> list.inE()[[label:'knows']]    // the same filter written as a property filter
          ==>e[7][1-knows->2]
          gremlin> list.inE()[[label:'knows']].outV.name   // name of the vertex at the other end
          ==>marko

          ClosureFilterPipe vs. PropertyFilterPipe:

          ~20000ms: g.V.outE{it['label']=='followed_by'}.inV.outE{it['label']=='followed_by'}.inV.outE{it['label']=='followed_by'}.inV >> -1
          ~9000ms:  g.V.outE{it.label=='followed_by'}.inV.outE{it.label=='followed_by'}.inV.outE{it.label=='followed_by'}.inV >> -1
          ~8500ms:  g.V.outE{it.getLabel()=='followed_by'}.inV.outE{it.getLabel()=='followed_by'}.inV.outE{it.getLabel()=='followed_by'}.inV >> -1
          ~6000ms:  g.V.outE[[label:'followed_by']].inV.outE[[label:'followed_by']].inV.outE[[label:'followed_by']].inV >> -1
  79. 79. Tinker Pop: Pipes
  80. 80. Pipes [ ]: HP; Pipes = Data Flow Framework; Pipes Graph Traversal 1 1; Pipes: filtering, splitting, merging, traversing, ...
  81. 81. Gremlin: g:id-v('a')/outE[@label='knows']/inV/outE[@label='develops']/inV/@name. The same traversal written with Pipes (source: A Graph Processing Stack):

          Pipe pipe1 = new VertexEdgePipe(Step.OUT_EDGES);
          Pipe pipe2 = new LabelFilterPipe("knows", Filter.NOT_EQUALS);
          Pipe pipe3 = new EdgeVertexPipe(Step.IN_VERTEX);
          Pipe pipe4 = new VertexEdgePipe(Step.OUT_EDGES);
          Pipe pipe5 = new LabelFilterPipe("develops", Filter.NOT_EQUALS);
          Pipe pipe6 = new EdgeVertexPipe(Step.IN_VERTEX);
          Pipe pipe7 = new PropertyPipe("name");
          // typed so that the for-each below can iterate over String results
          Pipe<Vertex,String> pipeline = new Pipeline<Vertex,String>(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);
          pipeline.setStarts(new SingleIterator<Vertex>(graph.getVertex("a")));
          for(String name : pipeline) {
            System.out.println(name);
          }

  82. 82. Pipes: a custom Pipe (source: A Graph Processing Stack):

          public class NumCharsPipe extends AbstractPipe<String,Integer> {
            public Integer processNextStart() {
              String word = this.starts.next();
              return word.length();
            }
          }
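The idea behind Pipes, a chain of lazy steps where each step pulls from the one before it, can be sketched with nothing but `java.util.Iterator`. This is not the Pipes library: `PipeSketch`, `map`, `filter` and `lengthsOfLongWords` are hypothetical names, and the word-length step simply echoes the NumCharsPipe example from slide 82.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch of a lazy, iterator-based data flow; not the Pipes API. */
final class PipeSketch {

    /** A transforming step: each element is converted only when it is pulled. */
    static <S, E> Iterator<E> map(Iterator<S> starts, Function<S, E> f) {
        return new Iterator<E>() {
            public boolean hasNext() {
                return starts.hasNext();
            }

            public E next() {
                return f.apply(starts.next());
            }
        };
    }

    /** A filtering step: pulls from upstream until an element passes the test. */
    static <S> Iterator<S> filter(Iterator<S> starts, Predicate<S> keep) {
        return new Iterator<S>() {
            private S buffered;
            private boolean ready;

            public boolean hasNext() {
                while (!ready && starts.hasNext()) {
                    S candidate = starts.next();
                    if (keep.test(candidate)) {
                        buffered = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            public S next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                S result = buffered;
                buffered = null;
                return result;
            }
        };
    }

    /** Composes two steps: keep words of at least minLength, then emit their lengths. */
    static List<Integer> lengthsOfLongWords(List<String> words, int minLength) {
        Iterator<String> longWords = filter(words.iterator(), w -> w.length() >= minLength);
        Iterator<Integer> lengths = map(longWords, String::length);
        List<Integer> result = new ArrayList<>();
        lengths.forEachRemaining(result::add);
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("marko", "aaron", "lop", "vadas");
        System.out.println(lengthsOfLongWords(names, 4)); // [5, 5, 5]
    }
}
```

Nothing is computed until the final consumer pulls an element, so a long chain of steps never materializes intermediate collections; the same shape fits the traversal steps (vertex to edges, filter by label, edge to vertex) shown on slide 81.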
  83. 83. Tinker Pop: Rexster
  84. 84. Rexster [ ]: HP; Rexster = A RESTful Graph Shell; exposes Blueprints-enabled GraphDBs through a RESTful API (JSON); Gremlin
  85. 85. Rexster request and response (source: A Graph Processing Stack):

          > http://localhost:8182/examplegraph/vertices/b
          {
            "version":"0.1",
            "results": {
              "_type":"vertex",
              "_id":"b",
              "name":"aaron",
              "type":"person"
            },
            "query_time":0.1537
          }

          Using Gremlin Code: g:key-v('name','DARK STAR')[0] (source: Using Gremlin):

          > http://localhost:8182/gratefulgraph/traversals/gremlin?script=g:key-v%28%27name%27,%27DARK%20STAR%27%29[0]
          {
            "results": [{
              "_type":"vertex",
              "_id":"89",
              "name":"DARK STAR",
              "song_type":"original",
              "performances":219,
              "type":"song"}
            ],
            "query_time":6.753024,
            "success":true,
            "version"
          }
  86. 86. Tinker Pop: Mutant
  87. 87. Mutant [ ]: HP; Mutant = A Poly-ScriptEngine ScriptEngine; JVM Script Engine
  88. 88. Mutant Console (source: Basic Examples):

          marko:~/software/mutant$ ./mutant.sh
                      //
                    oO ~~-_
          ___m(___m___~.___    MuTanT 0.1-SNAPSHOT
          _|__|__|__|__|__|    [ ?h = help ]
          [gremlin] gremlin 0.6-SNAPSHOT
          [Groovy] Groovy Scripting Engine 2.0
          [ruby] JSR 223 JRuby Engine 1.5.5
          [ECMAScript] Mozilla Rhino 1.6 release 2
          [AppleScript] AppleScriptEngine 1.0
          mutant[gremlin]> $x := 12
          [12]
          mutant[gremlin]> ?x
          mutant[AppleScript]> ?x
          mutant[Groovy]> $x
          12
          mutant[Groovy]> ?x
          mutant[ruby]> $x
          12
          mutant[ruby]> ?x
          mutant[ECMAScript]> $x
          12
  89. 89. [ ] Graph DB Graph DB Graph Partitioning Pregel Neo4j
  90. 90. …※ Graph DB http://snap.stanford.edu/data/index.html
