11   8   5
•                  (   )

                 •   @kimuras

             •                          G(2007   )

Agenda

             • Introduction
             • The past work
             • Introduction to GraphDB
             • Introduction to Neo4j
             • Introduction to analysis sample

Introduction



Motivation for social graph analysis




mixi
         [Figure: number of mixi member IDs by year, 2007 to 2011; y-axis from 0 to 30,000,000]

What is Social Graph?

Feed Back




Approach for SG analysis


                    Feed Back




The past work



Relational Databases

 from_id      to_id         id   name     age
 1            2             1    Kimura   18
 1            3             2    Kato     45
 2            3             3    Ito      21

                       Dump & Denormalization

 Key      Value
 From:1   2,3
 From:2   3
 Prof:1   Kimura,18
 Prof:2   Kato,45

 Issues: reimplementation, maintenance cost, scalability
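
The dump-and-denormalize step above can be sketched roughly as follows. This is only an illustration in plain Java, not the code used in the past work: the class name `Denormalizer` and the method `denormalizeEdges` are hypothetical, and the key format simply copies the `From:<id>` keys shown on the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: turn (from_id, to_id) rows into key-value pairs. */
final class Denormalizer {

    /** Groups edge rows by from_id into "From:<id>" -> "to,to,..." entries. */
    static Map<String, String> denormalizeEdges(int[][] edgeRows) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (int[] row : edgeRows) {
            grouped.computeIfAbsent("From:" + row[0], k -> new ArrayList<>())
                   .add(String.valueOf(row[1]));
        }
        Map<String, String> keyValue = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            keyValue.put(entry.getKey(), String.join(",", entry.getValue()));
        }
        return keyValue;
    }

    public static void main(String[] args) {
        // The three edge rows from the slide's relational table.
        int[][] edges = { {1, 2}, {1, 3}, {2, 3} };
        Map<String, String> keyValue = denormalizeEdges(edges);
        // Profile rows are stored under a separate key prefix.
        keyValue.put("Prof:1", "Kimura,18");
        keyValue.put("Prof:2", "Kato,45");
        keyValue.forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Every new kind of query needs another hand-written key layout like this one, which is one way to read the reimplementation and maintenance cost issues listed on the slide.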


Introduction to GraphDB




What is graph

             • Vertex (node)
             • Edge
             • Undirected graph
             • Directed graph

What is GraphDB

             Vertex (node)
                 ID:   1
                 NAME: kimura
                 PROP: Male
                 AGE:  18

             Vertex (node)
                 ID:   2
                 NAME: ITO
                 PROP: Female
                 AGE:  21

             Edge
                 ID:       3
                 LABEL:    Like
                 Since:    2011/08/06
                 OutGoing: 2
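
The model on this slide, where both vertices and edges carry an ID and properties, can be sketched in plain Java as below. This is a minimal in-memory illustration and not how Neo4j stores data; the class `PropertyGraph` and its methods are hypothetical names, and sharing one ID counter between vertices and edges is an assumption made only so the IDs match the slide (1, 2, 3).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a property graph: vertices and edges both hold properties. */
final class PropertyGraph {

    static final class Vertex {
        final long id;
        final Map<String, Object> props = new LinkedHashMap<>();
        final List<Edge> outgoing = new ArrayList<>();
        Vertex(long id) { this.id = id; }
    }

    static final class Edge {
        final long id;
        final String label;
        final Vertex from;
        final Vertex to;
        final Map<String, Object> props = new LinkedHashMap<>();
        Edge(long id, String label, Vertex from, Vertex to) {
            this.id = id;
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    private final Map<Long, Vertex> vertices = new LinkedHashMap<>();
    private long nextId = 1; // one counter for vertices and edges (assumption)

    Vertex addVertex() {
        Vertex v = new Vertex(nextId++);
        vertices.put(v.id, v);
        return v;
    }

    /** Adds a directed edge and registers it on the source vertex. */
    Edge addEdge(Vertex from, Vertex to, String label) {
        Edge e = new Edge(nextId++, label, from, to);
        from.outgoing.add(e);
        return e;
    }

    public static void main(String[] args) {
        PropertyGraph g = new PropertyGraph();
        Vertex kimura = g.addVertex();
        kimura.props.put("NAME", "kimura");
        kimura.props.put("AGE", 18);
        Vertex ito = g.addVertex();
        ito.props.put("NAME", "ITO");
        ito.props.put("AGE", 21);
        Edge like = g.addEdge(kimura, ito, "Like");
        like.props.put("Since", "2011/08/06");
        System.out.println(kimura.props.get("NAME") + " -[" + like.label + "]-> "
                + like.to.props.get("NAME"));
    }
}
```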




The implementations for GraphDB




               http://en.wikipedia.org/wiki/GraphDB

Introduction to Neo4j




GraphDB Neo4j
                •     True ACID transactions
                •     High availability
                •     Scales to billions of nodes and relationships
                •     High speed querying through traversals


                        Single instance (GPLv3)   Multiple instances (AGPLv3)
         Embedded       EmbeddedGraphDatabase     HighlyAvailableGraphDatabase
         Standalone     Neo4j Server              Neo4j Server high availability mode


                                                            http://neo4j.org/

My other favorite features of Neo4j
         • RESTful APIs
         • Query language (Cypher)
         • Full indexing
             – Lucene
         • Implemented graph algorithms
             – A*, Dijkstra
             – High-speed traversal
         • Gremlin supported
             – Works like a query language

                    http://www.tinkerpop.com/post/4633229547/tinkerpop-graph-stack


Introduction to simple Neo4j use cases
                Single node           Multi node
     Embedded   Analysis system       Analysis system
     Server     Analysis system       Analysis system

Introduction to simple embedded Neo4j

             • Insert vertices & make relationships
                 • Single node & embedded
             • Traversal sample

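Insert vertices, make relationship
         import org.neo4j.graphdb.DynamicRelationshipType;
         import org.neo4j.graphdb.GraphDatabaseService;
         import org.neo4j.graphdb.Node;
         import org.neo4j.graphdb.Transaction;
         import org.neo4j.kernel.EmbeddedGraphDatabase;

         public final class InputVertex {
             public static void main(final String[] args) {
                 // Open (or create) an embedded database stored under /tmp/neo4j.
                 GraphDatabaseService graphDb = new EmbeddedGraphDatabase("/tmp/neo4j");
                 Transaction tx = graphDb.beginTx();
                 try {
                     // Two vertices, each with a "Name" property.
                     Node firstNode = graphDb.createNode();
                     firstNode.setProperty("Name", "Kimura");
                     Node secondNode = graphDb.createNode();
                     secondNode.setProperty("Name", "Kato");
                     // Directed edge: Kimura -[LIKE]-> Kato.
                     firstNode.createRelationshipTo(secondNode,
                          DynamicRelationshipType.withName("LIKE"));
                     tx.success();   // mark the transaction as successful
                 } finally {
                     tx.finish();    // commits if success() was called, otherwise rolls back
                 }
                 graphDb.shutdown();
             }
         }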




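Batch Insert
                 • Not thread-safe, non-transactional
                 • But very fast!
             import java.util.HashMap;
             import java.util.Map;
             import org.neo4j.graphdb.DynamicRelationshipType;
             import org.neo4j.kernel.impl.batchinsert.BatchInserter;
             import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

             public final class Batch {
                 public static void main(final String[] args) {
                     // Open the store directly, bypassing transactions.
                     BatchInserter inserter = new BatchInserterImpl("/tmp/neo4j",
                             BatchInserterImpl.loadProperties("/tmp/neo4j.props"));
                     Map<String, Object> prop = new HashMap<String, Object>();
                     prop.put("Name", "Kimura");
                     prop.put("Age", 21);
                     long node1 = inserter.createNode(prop);   // returns the new node ID

                     prop.put("Name", "Kato");
                     prop.put("Age", 21);
                     long node2 = inserter.createNode(prop);
                     // Relationship node1 -[LIKE]-> node2 with no properties.
                     inserter.createRelationship(node1, node2,
                             DynamicRelationshipType.withName("LIKE"), null);
                     inserter.shutdown();   // required to flush the data to disk
                 }
             }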

Traversal sample
             import org.neo4j.graphdb.*;
             import org.neo4j.graphdb.Traverser.Order;
             import org.neo4j.kernel.EmbeddedGraphDatabase;

             public final class TraversalSample {
                 public static void main(final String[] args) {
                     GraphDatabaseService graphDB = new EmbeddedGraphDatabase(args[0]);
                     Node node = graphDB.getNodeById(1);
                     Traverser friends = node.traverse(
                       // Traversal order: DEPTH_FIRST or BREADTH_FIRST
                       Order.DEPTH_FIRST,
                       // Where to stop: END_OF_GRAPH or DEPTH_ONE
                       StopEvaluator.END_OF_GRAPH,
                       // Which nodes to return: ALL_BUT_START_NODE, ALL, or a custom isReturnableNode()
                       ReturnableEvaluator.ALL_BUT_START_NODE,
                       // Relationship type to follow
                       DynamicRelationshipType.withName("LIKE"),
                       // Direction to follow: OUTGOING, INCOMING, or BOTH
                       Direction.OUTGOING);
                     for (Node nodeBuf : friends) {
                         // The position exposes, for example, the current depth().
                         TraversalPosition currentPosition = friends.currentPosition();
                     }
                     graphDB.shutdown();
                 }
             }

Traversal sample
                 Order.BREADTH_FIRST
                 [Figure: step-by-step breadth-first traversal of a sample graph]

Traversal sample
                 Order.DEPTH_FIRST
                 [Figure: step-by-step depth-first traversal of a sample graph]
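
The difference between the two orders can be reproduced without Neo4j. The sketch below is plain Java over a small adjacency list; the class `TraversalOrder` and the sample graph are hypothetical, and unlike `ReturnableEvaluator.ALL_BUT_START_NODE` it includes the start vertex in the result.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: visiting order of breadth-first vs. depth-first traversal. */
final class TraversalOrder {

    /** Visits vertices level by level using a FIFO queue. */
    static List<Integer> breadthFirst(Map<Integer, List<Integer>> adj, int start) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            order.add(v);
            for (int w : adj.getOrDefault(v, List.of())) {
                if (seen.add(w)) {
                    queue.add(w);
                }
            }
        }
        return order;
    }

    /** Follows each branch to its end before backtracking. */
    static List<Integer> depthFirst(Map<Integer, List<Integer>> adj, int start) {
        List<Integer> order = new ArrayList<>();
        visit(adj, start, new HashSet<>(), order);
        return order;
    }

    private static void visit(Map<Integer, List<Integer>> adj, int v,
                              Set<Integer> seen, List<Integer> order) {
        if (!seen.add(v)) {
            return;
        }
        order.add(v);
        for (int w : adj.getOrDefault(v, List.of())) {
            visit(adj, w, seen, order);
        }
    }

    public static void main(String[] args) {
        // Outgoing edges: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 5
        Map<Integer, List<Integer>> adj =
                Map.of(1, List.of(2, 3), 2, List.of(4), 3, List.of(5));
        System.out.println("BFS: " + breadthFirst(adj, 1)); // [1, 2, 3, 4, 5]
        System.out.println("DFS: " + depthFirst(adj, 1));   // [1, 2, 4, 3, 5]
    }
}
```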




Neoclipse sample




                    http://wiki.neo4j.org/content/Neoclipse

Experiment
             •   mixi             Neo4j

             •   Machine: 24-core CPU, 65 GB memory

             •   Neo4j: BatchInsert, community, embedded

             •   Data

                 •                1.5                    60

                                513m17sec (about 8.6h)

Network Dataset
             •   Stanford Large Network Dataset Collection

                 •    SNAP has a wide variety of graph data!
                          Social networks                     Communication networks
                          Citation networks                   Collaboration networks
                          Web graphs                          Product co-purchasing networks
                          Internet peer-to-peer networks      Road networks
                          Autonomous systems graphs           Signed networks
                          Wikipedia networks and metadata     Memetracker and Twitter


                                         http://snap.stanford.edu/data/index.html

Introduction to Analysis Sample



Architecture

             Service (Social Graph)     Database     Analyses     Visualization




Introduction to Analysis Sample


             • Centrality

             • Clustering coefficient




Centrality

             [Figure: centrality illustrated step by step on a sample graph]

                         PageRank
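
PageRank is named here as one centrality measure. A minimal power-iteration sketch in plain Java is shown below; it is not the implementation used in the talk. The class `PageRankSketch`, the damping factor 0.85, the iteration count, and the three-vertex sample graph are all assumptions chosen for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch: PageRank by power iteration on a small directed graph. */
final class PageRankSketch {

    /**
     * out[i] lists the targets of vertex i's outgoing edges.
     * Rank held by vertices without outgoing edges is spread evenly over all vertices.
     */
    static double[] pageRank(int[][] out, double damping, int iterations) {
        int n = out.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double dangling = 0.0;
            for (int i = 0; i < n; i++) {
                if (out[i].length == 0) {
                    dangling += rank[i];
                } else {
                    // Each vertex splits its rank equally among its targets.
                    double share = rank[i] / out[i].length;
                    for (int j : out[i]) {
                        next[j] += share;
                    }
                }
            }
            for (int j = 0; j < n; j++) {
                next[j] = (1.0 - damping) / n + damping * (next[j] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Edges: 0 -> 2, 1 -> 2, 2 -> 0
        int[][] out = { {2}, {2}, {0} };
        double[] rank = pageRank(out, 0.85, 50);
        for (int i = 0; i < rank.length; i++) {
            System.out.printf("vertex %d: %.4f%n", i, rank[i]);
        }
    }
}
```

In this sample, vertex 2 receives links from both other vertices and ends up with the highest rank, while vertex 1, which nobody links to, keeps only the baseline share.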




•   Degree
             •   = number of edges of a Vertex

             Figure: example graph with vertices labeled 2, 1, 1, 4, 2, 1, 2
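
Assuming the numbers on these slides are vertex degrees (the number of edges touching each vertex), the count can be sketched as below. The class name `DegreeSketch` and the toy edge list are hypothetical; the graph is not the one drawn on the slide.

```java
import java.util.Arrays;

/** Counts the degree of every vertex in an undirected graph (illustrative only). */
final class DegreeSketch {

    /**
     * @param vertexCount number of vertices, identified as 0..vertexCount-1
     * @param edges       undirected edges given as {a, b} pairs
     * @return degree[v] = number of edges touching v
     */
    static int[] degrees(int vertexCount, int[][] edges) {
        int[] degree = new int[vertexCount];
        for (int[] e : edges) {
            degree[e[0]]++; // the edge touches both of its endpoints
            degree[e[1]]++;
        }
        return degree;
    }

    public static void main(String[] args) {
        // Hypothetical toy graph: vertex 0 has four friends; 1 and 2 also know each other.
        int[][] edges = { {0, 1}, {0, 2}, {0, 3}, {0, 4}, {1, 2} };
        System.out.println(Arrays.toString(degrees(5, edges))); // prints [4, 2, 2, 1, 1]
    }
}
```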

mixi

              •     1000

              •             summary

             Min.    1st Qu.   Median    Mean    3rd Qu.     Max.
             1.00     3.00     10.00    25.69     30.00    903.00


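•   Clustering coefficient
                 •   ≒ (links among a vertex's neighbors) / (possible links among them)

                 Example: a vertex with 3 neighbors (3 possible links among them)
                         0 links: 0/3 = 0
                         1 link:  1/3
                         2 links: 2/3
                         3 links: 3/3 = 1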
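
The fractions above (0/3 up to 3/3) match the usual local clustering coefficient: the number of links that exist among a vertex's neighbors, divided by the number of links that could exist among them. Here is a minimal sketch under that reading. The class name `ClusteringSketch`, the toy edges, and the convention of returning 0 for vertices with fewer than two neighbors are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Local clustering coefficient on an undirected graph (illustrative only). */
final class ClusteringSketch {

    /** Builds an undirected adjacency map from {a, b} pairs. */
    static Map<Integer, Set<Integer>> adjacency(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return adj;
    }

    /** Links among v's neighbors divided by the possible links among them. */
    static double localClustering(Map<Integer, Set<Integer>> adj, int v) {
        List<Integer> neighbors =
                new ArrayList<>(adj.getOrDefault(v, Collections.<Integer>emptySet()));
        int k = neighbors.size();
        if (k < 2) {
            return 0.0; // assumed convention: undefined cases count as 0
        }
        int links = 0;
        for (int i = 0; i < k; i++) {
            Set<Integer> friendsOfI =
                    adj.getOrDefault(neighbors.get(i), Collections.<Integer>emptySet());
            for (int j = i + 1; j < k; j++) {
                if (friendsOfI.contains(neighbors.get(j))) {
                    links++; // these two neighbors of v are connected to each other
                }
            }
        }
        return links / (k * (k - 1) / 2.0); // k neighbors allow k(k-1)/2 links
    }

    public static void main(String[] args) {
        // Vertex 0 has neighbors 1, 2, 3; only 1 and 2 are linked, i.e. the 1/3 case.
        int[][] edges = { {0, 1}, {0, 2}, {0, 3}, {1, 2} };
        System.out.println(localClustering(adjacency(edges), 0));
    }
}
```

Adding the edges {2, 3} and {1, 3} to the toy graph walks through the 2/3 and 3/3 cases from the slide.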
•   1000

             •                     summary

             Min.    1st Qu.   Median    Mean    3rd Qu.    Max.
             0.00     0.00     0.1157   0.2071   0.2667    1.000


•   25   0.08
•   14   0.17
•   10   0.68
•   4    1




Visualization Sample

•   2hop Social Graph

             •   Edge

             •   Vertex

             •   Gephi
                 http://gephi.org/
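
The slides do not show how the 2hop graph was prepared for Gephi. One possible approach, sketched here under assumptions, is a breadth-first search limited to two hops followed by writing an edge list as CSV with `Source` and `Target` columns, a layout that Gephi's spreadsheet import accepts. The class name `TwoHopExport`, the method names and the toy graph are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Extracts the neighborhood within N hops of a vertex and writes it as an edge-list CSV. */
final class TwoHopExport {

    /** Builds an undirected adjacency map from {a, b} pairs. */
    static Map<Integer, Set<Integer>> adjacency(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return adj;
    }

    /** Breadth-first search that stops expanding once maxHops is reached. */
    static Set<Integer> withinHops(Map<Integer, Set<Integer>> adj, int start, int maxHops) {
        Map<Integer, Integer> distance = new LinkedHashMap<>();
        Deque<Integer> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            int d = distance.get(v);
            if (d == maxHops) {
                continue; // do not walk past the hop limit
            }
            for (int w : adj.getOrDefault(v, Collections.<Integer>emptySet())) {
                if (!distance.containsKey(w)) {
                    distance.put(w, d + 1);
                    queue.add(w);
                }
            }
        }
        return distance.keySet();
    }

    /** Writes each undirected edge between kept vertices once, smaller id first. */
    static String toEdgeCsv(Map<Integer, Set<Integer>> adj, Set<Integer> keep) {
        StringBuilder csv = new StringBuilder("Source,Target\n");
        for (int v : new TreeSet<>(keep)) {
            for (int w : new TreeSet<>(adj.getOrDefault(v, Collections.<Integer>emptySet()))) {
                if (v < w && keep.contains(w)) {
                    csv.append(v).append(',').append(w).append('\n');
                }
            }
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        // Hypothetical toy graph: chain 1-2-3-4 plus the edge 1-5.
        Map<Integer, Set<Integer>> adj = adjacency(new int[][] { {1, 2}, {2, 3}, {3, 4}, {1, 5} });
        System.out.print(toEdgeCsv(adj, withinHops(adj, 1, 2)));
    }
}
```

For the toy graph, vertex 4 lies three hops from vertex 1, so it and the edge 3-4 are left out of the exported list.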

•   Social Graph

             •   GraphDB

             •   Neo4j

             •   R

             •   Visualization

Thanks!




Social Graph Data Analysis
