
Analyzing DBpedia data with a graph DB + G2GML


Presented at PGX User Study Meeting #13

Published in: Engineering

  1. [Title slide; text lost in extraction]
  2. G2GML (https://github.com/g2glab/g2g): a mapping language for converting RDF data into property graphs. G2G Mapper, its implementation, runs on Node.js.
  3. G2G Mapper [remaining slide text lost in extraction]
  4. [Bulleted slide; text lost in extraction]
  5. G2GML mapping (node and edge definitions)

         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
         PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
         PREFIX prop-ja: <http://ja.dbpedia.org/property/>
         PREFIX schema: <http://schema.org/>
         PREFIX dbo: <http://dbpedia.org/ontology/>
         PREFIX foaf: <http://xmlns.com/foaf/0.1/>

         (sci:Scientist {name: nam})
             ?sci a dbo:Scientist;
                  rdfs:label ?nam.

         (sci1:Scientist)-[:related {rel:p}]->(sci2:Scientist)
             ?sci1 ?p ?sci2.
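
To make the mapping on slide 5 concrete, here is a minimal Python sketch of what the two rules express: resources typed `dbo:Scientist` with an `rdfs:label` become `Scientist` nodes, and any triple linking two such resources becomes a `related` edge that keeps the predicate as its `rel` property. This is an illustration only, not how G2G Mapper works internally (the tool queries a SPARQL endpoint). The sample triples, the `ex:` names, and the function `map_triples` are hypothetical.

```python
# Toy RDF triples (hypothetical sample data, not taken from DBpedia).
RDF_TYPE = "rdf:type"
RDFS_LABEL = "rdfs:label"
SCIENTIST = "dbo:Scientist"

triples = [
    ("ex:Alice", RDF_TYPE, SCIENTIST),
    ("ex:Alice", RDFS_LABEL, "Alice"),
    ("ex:Bob", RDF_TYPE, SCIENTIST),
    ("ex:Bob", RDFS_LABEL, "Bob"),
    ("ex:Alice", "dbo:influenced", "ex:Bob"),
    ("ex:Alice", "dbo:birthPlace", "ex:Paris"),  # ex:Paris is not a Scientist
]


def map_triples(triples):
    """Return (nodes, edges) for the Scientist / related mapping."""
    # Node rule: ?sci a dbo:Scientist ; rdfs:label ?nam
    scientists = {s for s, p, o in triples if p == RDF_TYPE and o == SCIENTIST}
    nodes = {}
    for s, p, o in triples:
        if s in scientists and p == RDFS_LABEL:
            # Simplification: if a resource has several labels, the last one wins.
            nodes[s] = {"label": "Scientist", "name": o}

    # Edge rule: ?sci1 ?p ?sci2, where both ends are Scientist nodes
    edges = [
        (s, o, {"type": "related", "rel": p})
        for s, p, o in triples
        if s in nodes and o in nodes
    ]
    return nodes, edges


nodes, edges = map_triples(triples)
print(nodes)
print(edges)
```

The `dbo:birthPlace` triple produces no edge because its object is not a `Scientist` node, which mirrors the way both ends of the edge pattern are restricted to `Scientist`.
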
  6. PGX: converting with G2G Mapper and loading the result into PGX

         $ g2g -f pgx examples/dbp_sci/dbp_sci.g2g http://dbpedia.org/sparql
         ...
         "output/dbp_sci/pgx/dbp_sci.pgx.nodes" has been created.
         "output/dbp_sci/pgx/dbp_sci.pgx.edges" has been created.
         "output/dbp_sci/pgx/dbp_sci.pgx.json" has been created.
         $ pgx
         pgx> G = session.readGraphWithProperties("output/dbp_sci/pgx/dbp_sci.pgx.json")
         ==> PgxGraph[name=scientists.pgx,N=16409,E=43420,created=1563203418550]
  7. PGX: listing vertices with PGQL

         pgx> G.queryPgql("SELECT x.id(), x.name WHERE (x)").print(5)
         +-----------------------------------------------------------------------+
         | x.id()                                          | x.name              |
         +-----------------------------------------------------------------------+
         | http://dbpedia.org/resource/A._Carl_Helmholz    | A. Carl Helmholz    |
         | http://dbpedia.org/resource/A._David_Buckingham | A. David Buckingham |
         | http://dbpedia.org/resource/A._E._Douglass      | A. E. Douglass      |
         | http://dbpedia.org/resource/A.F.P._Hulsewé      | A.F.P. Hulsewé      |
         | http://dbpedia.org/resource/A._Hari_Reddi       | A. Hari Reddi       |
         +-----------------------------------------------------------------------+
  8. PGX: PageRank

         pgx> analyst.pagerank(G, 0.001, 0.85, 100)
         ==> VertexProperty[name=pagerank,type=double,graph=scientists.pgx]
         pgx> G.queryPgql("SELECT n.name, n.pagerank WHERE (n) ORDER BY n.pagerank DESC ").print(5)
         +-----------------------------------------+
         | n.name          | n.pagerank            |
         +-----------------------------------------+
         | Charles Darwin  | 0.0024586422817409128 |
         | Albert Einstein | 0.0021817969144535304 |
         | Sigmund Freud   | 0.0013643043106761389 |
         | Isaac Newton    | 0.0012581582598198172 |
         | Electron        | 0.0012360638838511695 |
         +-----------------------------------------+
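
The three numbers passed to `analyst.pagerank` on slide 8 appear to be an error tolerance (0.001), a damping factor (0.85), and an iteration cap (100). The following is a minimal power-iteration sketch in Python using the same three values as defaults. It is a textbook formulation, not PGX's implementation; the function name, the dangling-node handling, and the toy edge list are assumptions made for illustration.

```python
def pagerank(edges, tol=0.001, damping=0.85, max_iter=100):
    """Power-iteration PageRank over a list of directed (src, dst) edges."""
    nodes = sorted({v for edge in edges for v in edge})
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Assumption: rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        # Stop once the total change between two iterations is below the tolerance.
        diff = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if diff < tol:
            break
    return rank


# Hypothetical toy graph: "a" and "b" both point to "c", and "c" points back to "a".
toy_edges = [("a", "c"), ("b", "c"), ("c", "a")]
for name, score in sorted(pagerank(toy_edges).items(), key=lambda kv: -kv[1]):
    print(name, score)
```

As in the query on slide 8, sorting by score in descending order puts the most heavily linked vertex first.
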
  9. Deep Walk (1)
     https://docs.oracle.com/cd/E56133_01/latest/prog-guides/mllib/deepwalk.html

         pgx> model = analyst.deepWalkModelBuilder().
                setWindowSize(3).
                setWalksPerVertex(6).
                setWalkLength(4).
                build()
         ==> oracle.pgx.api.beta.mllib.DeepWalkModel@4bdf
         pgx> model.fit(G)
         ==> null
         pgx> similars = model.computeSimilars("http://dbpedia.org/resource/Edsger_W._Dijkstra", 10)
  10. Deep Walk (2)

          pgx> similars.print()
          +---------------------------------------------------------------------+
          | dstVertex                                      | similarity         |
          +---------------------------------------------------------------------+
          | http://dbpedia.org/resource/Edsger_W._Dijkstra | 1.0                |
          | http://dbpedia.org/resource/Gabriel_Lippmann   | 0.9730496406555176 |
          | http://dbpedia.org/resource/Grigori_Perelman   | 0.9713725447654724 |
          | http://dbpedia.org/resource/Philipp_Lenard     | 0.9697780609130859 |
          | http://dbpedia.org/resource/Richard_P._Brent   | 0.9696416854858398 |
          | http://dbpedia.org/resource/Warren_Winkelstein | 0.966417670249939  |
          | http://dbpedia.org/resource/Fiona_Stanley      | 0.9648353457450867 |
          | http://dbpedia.org/resource/Gerard_J._Holzmann | 0.9636005163192749 |
          | http://dbpedia.org/resource/Alexander_Wetmore  | 0.9633997678756714 |
          | http://dbpedia.org/resource/Martin_vom_Brocke  | 0.9627030491828918 |
          +---------------------------------------------------------------------+
          ==> oracle.pgx.api.beta.frames.internal.PgxFrameImpl@64f9f455
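
The builder settings on slide 9 (window size 3, 6 walks per vertex, walk length 4) control the sampling stage of DeepWalk. The Python sketch below covers only that stage: it generates truncated random walks and the (center, context) pairs a skip-gram model would then be trained on. It does not train embeddings or compute similarities, and it is not PGX's implementation. The function names, the toy graph, and the reading of "walk length" as a number of vertices are assumptions.

```python
import random


def random_walks(adjacency, walks_per_vertex=6, walk_length=4, seed=0):
    """Generate uniform random walks starting from every vertex."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_vertex):
        for start in adjacency:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency.get(walk[-1], [])
                if not neighbors:
                    break  # dead end: keep the shorter walk
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def context_pairs(walks, window_size=3):
    """List (center, context) pairs within window_size positions of each other."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window_size)
            hi = min(len(walk), i + window_size + 1)
            pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Hypothetical toy graph given as adjacency lists.
toy_graph = {"a": ["b"], "b": ["c"], "c": ["a"]}
walks = random_walks(toy_graph)
pairs = context_pairs(walks)
print(len(walks), "walks,", len(pairs), "training pairs")
```

Vertices that often share walks end up with many common context pairs, which is what pushes their learned vectors together and yields similarity scores like those on slide 10.
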
  11. Neo4j: converting with G2G Mapper and importing into Neo4j

          $ g2g -f neo examples/dbp_sci/dbp_sci.g2g http://dbpedia.org/sparql
          ...
          "output/dbp_sci/neo/dbp_sci.neo.nodes" has been created.
          "output/dbp_sci/neo/dbp_sci.neo.edges" has been created.
          $ rm -r $NEO4J_HOME/libexec/data/databases/graph.db
          $ neo4j-admin import --database=graph.db \
              --nodes=output/dbp_sci/neo/dbp_sci.neo.nodes \
              --relationships=output/dbp_sci/neo/dbp_sci.neo.edges \
              --delimiter='\t'
  12. Neo4j graph algorithms (https://github.com/neo4j-contrib/neo4j-graph-algorithms)

          CALL algo.labelPropagation('Scientist', 'related',
            {iterations:5000, writeProperty:'partition', write:true, direction: 'OUTGOING'})
          YIELD nodes, iterations, loadMillis, computeMillis, writeMillis, write, writeProperty;

          CALL algo.pageRank('Scientist', 'related',
            {iterations:20, dampingFactor:0.85, write: true, writeProperty:"pagerank"})
          YIELD nodes, iterations, loadMillis, computeMillis, writeMillis, dampingFactor, write, writeProperty
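
The `algo.labelPropagation` call on slide 12 writes a community id to each node's `partition` property. As a rough illustration of the idea behind label propagation, the Python sketch below starts every node with its own label and repeatedly lets each node adopt the most frequent label among its neighbors. It is a simplified, undirected variant and not the Neo4j implementation: the `direction: 'OUTGOING'` setting is not modeled, and the function name, the tie-breaking rule, and the toy graph are assumptions.

```python
import random
from collections import Counter


def label_propagation(edges, max_iterations=5000, seed=0):
    """Assign a community label to every node of an undirected edge list."""
    rng = random.Random(seed)
    nodes = sorted({v for edge in edges for v in edge})
    neighbors = {v: [] for v in nodes}
    for src, dst in edges:
        neighbors[src].append(dst)
        neighbors[dst].append(src)

    labels = {v: v for v in nodes}  # every node starts in its own community
    for _ in range(max_iterations):
        changed = False
        order = list(nodes)
        rng.shuffle(order)  # visit nodes in random order on each pass
        for v in order:
            counts = Counter(labels[u] for u in neighbors[v])
            top = max(counts.values())
            # Assumption: ties are broken by taking the smallest label.
            new_label = min(lab for lab, c in counts.items() if c == top)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break  # no label changed during a full pass
    return labels


# Hypothetical toy graph: two separate triangles.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("x", "y"), ("y", "z"), ("x", "z")]
print(label_propagation(toy_edges))
```

The resulting label plays the role of the `partition` property, which the Neovis.js configuration on slide 13 uses to group nodes by community.
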
  13. Neovis.js (1)

          var config = {
            ...
            labels: {
              "Scientist": {
                "caption": "vis_label",
                "size": "pagerank",
                "community": "partition"
              }
            },
            relationships: {
              "related": {
                "thickness": "weight",
                "caption": false
              }
            },
            initial_cypher: "MATCH (n)-[r]->(m) RETURN n,r,m LIMIT 100"
          };
  14. Neovis.js (2)
  15. [Closing slide; text lost in extraction] https://github.com/g2glab/g2g
