Data Day Texas 2013

An introduction to graph databases and graph computing frameworks in general, and an overview of the Aurelius Graph Cluster in particular. The talk discusses Titan and Faunus and demonstrates how to build a knowledge graph using the cluster.

This presentation was given at Data Day Texas in 2013. http://datadaytexas.com/

  • Health care: cancer, personalized medicine; social systems; economy
  • Source: http://www.digitaltrends.com/mobile/inside-knowledge-graph-googles-deep-diving-semantic-search/
  • http://socialmediatoday.com/larry-weintraub/1171711/facebook-seo-comes-life-graph-search-launches

    1. Graph Databases: Analyzing Relationships at Scale. #DDTX13. Matthias Broecheler, CTO, @mbroecheler. AURELIUS, THINKAURELIUS.COM. March XXX, MMXIII
    2. THE BRAIN
    3. EMERGENCE
    4. Graph (figure: a property graph; callouts: Vertex, Property). Vertices: Saturn (type: titan, age: 10000), Jupiter (type: god), Hercules (type: demigod), Hydra (type: monster), Neptune (type: god, age: 4500), Alcmene (type: human, age: 45), Pluto (type: god, age: 4000), Cerberus (type: monster)
    5. Graph (figure: the same graph with edges; callouts: Edge, Edge Label, Edge Property). Edge labels: father (x2), mother, brother (x2), battled (time: 2), battled (time: 12), pet
    6. Path (figure: the same graph)
    7. Degree (figure: the same graph)
    8. Shortest Paths (figure: the same graph)
    9. Centrality (figure: the same graph)
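The concepts on slides 4 to 9 can be sketched in plain Python. This is an illustrative model, not TinkerPop or Titan code. The names `VERTICES`, `EDGES`, `neighbors`, `degree`, and `shortest_path` are hypothetical, and the edge endpoints are assumptions (the slide text lists only edge labels), chosen to agree with the Gremlin expressions later in the deck. Degree also serves here as the simplest centrality measure.

```python
from collections import deque

# Vertex properties as listed on the slides (ages are shown only for some).
VERTICES = {
    "Saturn": {"type": "titan", "age": 10000},
    "Jupiter": {"type": "god"},
    "Neptune": {"type": "god", "age": 4500},
    "Pluto": {"type": "god", "age": 4000},
    "Hercules": {"type": "demigod"},
    "Alcmene": {"type": "human", "age": 45},
    "Hydra": {"type": "monster"},
    "Cerberus": {"type": "monster"},
}

# (source, label, target, edge properties). Endpoints are ASSUMED.
EDGES = [
    ("Hercules", "father", "Jupiter", {}),
    ("Jupiter", "father", "Saturn", {}),
    ("Hercules", "mother", "Alcmene", {}),
    ("Jupiter", "brother", "Neptune", {}),
    ("Jupiter", "brother", "Pluto", {}),
    ("Hercules", "battled", "Hydra", {"time": 2}),
    ("Hercules", "battled", "Cerberus", {"time": 12}),
    ("Pluto", "pet", "Cerberus", {}),
]


def neighbors(name):
    """Adjacent vertices, ignoring edge direction."""
    result = []
    for src, _label, dst, _props in EDGES:
        if src == name:
            result.append(dst)
        elif dst == name:
            result.append(src)
    return result


def degree(name):
    """Number of edges incident to a vertex (incoming plus outgoing)."""
    return len(neighbors(name))


def shortest_path(start, goal):
    """Breadth-first search; returns one shortest undirected path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(degree("Hercules"))
print(shortest_path("Saturn", "Cerberus"))
```

Under these assumed endpoints, Hercules and Jupiter have the highest degree, and every shortest path from Saturn to Cerberus passes through Jupiter.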
    10. TinkerPop Graph Stack: Graph Server, Graph Algorithms, Object-Graph Mapper, Traversal Language, Dataflow Processing, Generic Graph API
    11. g.V and g.E (figure: the same graph)
    12. v = g.V('name','Hercules') (figure: the same graph, with Hercules marked v)
    13. v.out('father','mother') (figure: the same graph)
    14. v.out('father').out('brother').name (figure: the same graph)
    15. v.outE('battled').has('time',T.gt,5).inV.name (figure: the same graph)
    16. v.out('father').out('brother').has('age',T.lt,4200).name (figure: the same graph)
    17. g.query().has('age',T.gt,4200).vertices() (figure: the same graph)
    18. g.query().has('time',T.lt,5).edges() (figure: the same graph)
    19. saturn.as('x').in('father').loop('x'){it.loops < 3}.next() (figure: the same graph)
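To make the Gremlin expressions on slides 12 to 19 concrete, here is a minimal plain-Python sketch of what each one selects. It is not Gremlin and does not use any TinkerPop API. The helpers `out_e`, `out`, and `in_` are hypothetical stand-ins, the edge endpoints are assumptions inferred from the queries, and the loop on slide 19 is read as two `in('father')` steps.

```python
# (source, label, target, edge properties). Endpoints are ASSUMED.
EDGES = [
    ("Hercules", "father", "Jupiter", {}),
    ("Jupiter", "father", "Saturn", {}),
    ("Hercules", "mother", "Alcmene", {}),
    ("Jupiter", "brother", "Neptune", {}),
    ("Jupiter", "brother", "Pluto", {}),
    ("Hercules", "battled", "Hydra", {"time": 2}),
    ("Hercules", "battled", "Cerberus", {"time": 12}),
    ("Pluto", "pet", "Cerberus", {}),
]
# Ages exactly as shown in the figure; vertices without an age are omitted.
AGE = {"Saturn": 10000, "Neptune": 4500, "Pluto": 4000, "Alcmene": 45}


def out_e(names, *labels):
    """Outgoing edges of the given vertices, optionally filtered by label."""
    return [e for n in names for e in EDGES
            if e[0] == n and (not labels or e[1] in labels)]


def out(names, *labels):
    """Vertices reached over outgoing edges."""
    return [e[2] for e in out_e(names, *labels)]


def in_(names, *labels):
    """Vertices reached by following edges backwards."""
    return [e[0] for n in names for e in EDGES
            if e[2] == n and (not labels or e[1] in labels)]


v = ["Hercules"]                                   # slide 12
parents = out(v, "father", "mother")               # slide 13
uncles = out(out(v, "father"), "brother")          # slide 14
late_foes = [e[2] for e in out_e(v, "battled")     # slide 15
             if e[3]["time"] > 5]
young_uncles = [n for n in uncles                  # slide 16
                if n in AGE and AGE[n] < 4200]
older = sorted(n for n, a in AGE.items() if a > 4200)        # slide 17
early_edges = [(e[0], e[2]) for e in EDGES                   # slide 18
               if "time" in e[3] and e[3]["time"] < 5]

frontier = ["Saturn"]                              # slide 19
for _ in range(2):                                 # two in('father') steps
    frontier = in_(frontier, "father")
grandchild = frontier[0]

print(parents, uncles, late_foes, young_uncles, older, early_edges, grandchild)
```

A vertex with no `age` value is dropped by the age filters, which is why Jupiter (whose age is not shown in this figure) does not appear in `older`.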
    20. Gremlin (figure: the same graph):
        g.V.sideEffect{
          it.rank = it.both.both.both.count()
        }
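The rank on slide 20 counts, for each vertex, the three-step walks that start there when edge direction is ignored. A minimal Python sketch of that count follows. The helper names `both` and `walks` are hypothetical, and the edge endpoints are assumptions, since the slide text lists only edge labels.

```python
# Undirected view of the example graph. Endpoints are ASSUMED.
EDGES = [
    ("Hercules", "Jupiter"),   # father
    ("Jupiter", "Saturn"),     # father
    ("Hercules", "Alcmene"),   # mother
    ("Jupiter", "Neptune"),    # brother
    ("Jupiter", "Pluto"),      # brother
    ("Hercules", "Hydra"),     # battled
    ("Hercules", "Cerberus"),  # battled
    ("Pluto", "Cerberus"),     # pet
]
VERTICES = sorted({v for edge in EDGES for v in edge})


def both(name):
    """Neighbors over incoming and outgoing edges."""
    return ([dst for src, dst in EDGES if src == name]
            + [src for src, dst in EDGES if dst == name])


def walks(name, steps):
    """Number of walks of the given length that start at name.

    Walks may revisit vertices, so repeated endpoints are counted each time.
    """
    frontier = [name]
    for _ in range(steps):
        frontier = [nbr for v in frontier for nbr in both(v)]
    return len(frontier)


rank = {v: walks(v, 3) for v in VERTICES}
for name, value in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(name, value)
```

Well-connected vertices such as Hercules and Jupiter receive the highest rank, which is why this walk count works as a crude centrality score.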
    21. The Graph Landscape (figure; illustration only, not to scale). Axes: Speed of Traversal/Process, Size of Graph
    22. Aurelius Graph Cluster (Apache 2)
          • TITAN: Stores a massive-scale property graph allowing real-time traversals and updates
          • FAUNUS: Batch processing of large graphs with Hadoop
          • FULGORA: Runs global graph algorithms on large, compressed, in-memory graphs
          • Arrow labels: Map/Reduce Load, Bulk Load, Analysis results back into Titan
    23. Titan Features
          • Numerous Concurrent Users
          • Many Short Transactions (read/write)
          • Real-time Traversals (OLTP)
          • High Availability
          • Dynamic Scalability
          • Variable Consistency Model (ACID or eventual consistency)
          • Real-time Big Graph Data
    24. Storage Backends (figure). Labels: Partitionability, Consistency, Availability
    25. Gremlin console session:
        $ ./titan-0.2.0/bin/gremlin.sh
                 \,,,/
                 (o o)
        -----oOOo-(_)-oOOo-----
        gremlin> g = TitanFactory.open('/tmp/titan')
        ==>titangraph[local:/tmp/titan]
        gremlin> v = g.V('name','Hercules')
        ==>v[4]
        gremlin> v.out('father').out('brother').name
    26. Vertex-Centric Indices
          • Sort and index edges per vertex by primary key
              • Primary key can be composite
          • Enables efficient focused traversals
              • Only retrieve edges that matter
          • Uses push down predicates for quick, index-driven retrieval
    27. v.query() (figure: vertex v with edges battled (time: 1), battled (time: 3), battled (time: 5), battled (time: 9), mother, father, fought, fought)
    28. v.query().direction(OUT) (figure: remaining edges are battled (time: 1, 3, 5, 9), mother, father)
    29. v.query().direction(OUT).labels('battled') (figure: remaining edges are battled (time: 1, 3, 5, 9))
    30. v.query().direction(OUT).labels('battled').has('time',T.lt,5) (figure: remaining edges are battled (time: 1), battled (time: 3))
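The narrowing on slides 27 to 30 can be sketched as a per-vertex index whose edges are grouped by direction and label and kept sorted by a key, so that a range predicate becomes a binary search instead of a scan. This is a sketch under stated assumptions, not Titan's storage layout: the class `VertexCentricIndex`, its methods, and the placeholder endpoint names are all hypothetical.

```python
from bisect import bisect_left, insort


class VertexCentricIndex:
    """Edges of ONE vertex, bucketed by (direction, label), sorted by a key."""

    def __init__(self):
        self._buckets = {}  # (direction, label) -> sorted list of (key, other)

    def add(self, direction, label, sort_key, other):
        bucket = self._buckets.setdefault((direction, label), [])
        insort(bucket, (sort_key, other))  # keep the bucket sorted on insert

    def query(self, direction, label, below=None):
        """Edges with the given direction and label, optionally key < below."""
        bucket = self._buckets.get((direction, label), [])
        if below is None:
            return list(bucket)
        # (below,) sorts before every (below, other), so this position is the
        # first entry whose key is >= below; everything before it qualifies.
        return bucket[:bisect_left(bucket, (below,))]


# Edges of v from slide 27. Endpoint names are hypothetical placeholders, and
# edges without a time property use 0 as a placeholder sort key.
index = VertexCentricIndex()
for time, other in [(9, "foe-d"), (1, "foe-a"), (5, "foe-c"), (3, "foe-b")]:
    index.add("OUT", "battled", time, other)
index.add("OUT", "mother", 0, "mother-vertex")
index.add("OUT", "father", 0, "father-vertex")
index.add("IN", "fought", 0, "rival-1")
index.add("IN", "fought", 0, "rival-2")

all_battles = index.query("OUT", "battled")             # slide 29
early_battles = index.query("OUT", "battled", below=5)  # slide 30
print([t for t, _ in all_battles], [t for t, _ in early_battles])
```

Only the matching bucket is touched and the range cut is logarithmic in the bucket size, which is the "only retrieve edges that matter" idea from slide 26.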
    31. Titan Server (REST, REXPRO)
        $ wget http://s3.thinkaurelius.com/downloads/titan/titan-cassandra-0.3.0.zip
        $ unzip titan-cassandra-0.3.0.zip
        $ cd titan-cassandra-0.3.0
        $ sudo bin/titan.sh config/titan-server-rexster.xml config/titan-server-cassandra.properties
    32. Graph Indexing
          • Vertex and Edge indexing
          • Pluggable index provider
              • ElasticSearch
              • Lucene
          • Full-text search
          • Numeric range search
          • Geographic search
    33. Figure: the graph extended with title properties and battle locations. Vertices: Neptune (age: 4500, title: God of the earth and ocean), Alcmene (type: human, age: 45), Jupiter (age: 4800, title: God of the heaven and skies), Saturn (type: titan, age: 10000), Hercules (title: Divine hero), Hydra (type: monster), Pluto (age: 4000, title: God of the underworld), Cerberus (title: Ugly beast of the underworld). Edges: father (x2), mother, brother (x2), pet, battled (time: 2, location: [37.7,23.9]), battled (time: 12, location: [39,22])
    34. g.query().has('title',Txt.CONTAINS,'god').vertices() (figure: the same extended graph)
    35. g.query().has('age',GREATER_THAN,4500).has('title',CONTAINS,'god').vertices() (figure: the same extended graph)
    36. g.query().has('location',WITHIN,Geoshape.circle(38,24,50)).edges() (figure: the same extended graph)
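The three index queries on slides 34 to 36 can be illustrated with a plain-Python sketch that evaluates the same predicates by brute force. It does not use Titan, ElasticSearch, or Lucene. The helpers `contains` and `haversine_km` are hypothetical, the circle is read as (latitude, longitude, radius in km), which is an assumption, and the pairing of each location with a battle time follows the order of the extracted slide text.

```python
import math

# Properties from the extended graph on slide 33.
VERTICES = {
    "Neptune": {"age": 4500, "title": "God of the earth and ocean"},
    "Jupiter": {"age": 4800, "title": "God of the heaven and skies"},
    "Pluto": {"age": 4000, "title": "God of the underworld"},
    "Hercules": {"title": "Divine hero"},
    "Cerberus": {"title": "Ugly beast of the underworld"},
    "Saturn": {"age": 10000},
    "Alcmene": {"age": 45},
    "Hydra": {},
}
# The two battled edges; the time/location pairing is ASSUMED.
BATTLES = [
    {"time": 2, "location": (37.7, 23.9)},
    {"time": 12, "location": (39.0, 22.0)},
]


def contains(text, word):
    """Case-insensitive whole-word match, a stand-in for full-text search."""
    return word.lower() in text.lower().split()


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


# Slide 34: full-text search on title.
gods = sorted(n for n, p in VERTICES.items()
              if contains(p.get("title", ""), "god"))

# Slide 35: numeric range combined with full-text search.
elder_gods = sorted(n for n, p in VERTICES.items()
                    if p.get("age", 0) > 4500
                    and contains(p.get("title", ""), "god"))

# Slide 36: edges whose location lies inside the circle.
center, radius_km = (38.0, 24.0), 50.0
nearby = [b for b in BATTLES
          if haversine_km(b["location"], center) <= radius_km]

print(gods, elder_gods, [b["time"] for b in nearby])
```

A real index provider answers these predicates from inverted, range, and spatial indexes rather than scanning every element as this sketch does.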
    37. Faunus Features
          • Hadoop-based Graph Computing Framework
          • Graph Analytics
          • Breadth-first Traversals
          • Global Graph Computations
          • Batch Big Graph Data
    38. Faunus Architecture (figure). Gremlin: g._()
    39. Faunus Work Flow
        g.V.out.out.count()
        (figure: output under hdfs://user/ubuntu/ in output/job-0/, output/job-1/, output/job-2/ { graph*, sideeffect* })
        Compressed HDFS Graphs
          • stored in sequence files
          • variable length encoding
          • prefix compression
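One way to picture how a breadth-first traversal such as g.V.out.out.count() can run as batch jobs is to treat each out step as a map phase that emits counts along edges, followed by a reduce phase that sums them per vertex. The Python sketch below illustrates that idea only. The deck does not show how Faunus compiles the expression, so the function `step` and the small example graph (with assumed edge directions) are hypothetical.

```python
from collections import Counter

# Directed edges of the small example graph. Directions are ASSUMED.
OUT_EDGES = [
    ("Hercules", "Jupiter"), ("Jupiter", "Saturn"), ("Hercules", "Alcmene"),
    ("Jupiter", "Neptune"), ("Jupiter", "Pluto"), ("Hercules", "Hydra"),
    ("Hercules", "Cerberus"), ("Pluto", "Cerberus"),
]
VERTICES = sorted({v for edge in OUT_EDGES for v in edge})


def step(counts):
    """One out step: map emits (target, count) per edge, reduce sums them."""
    # Map phase: every vertex holding traversers forwards its count.
    emitted = [(dst, counts[src]) for src, dst in OUT_EDGES if counts[src]]
    # Reduce phase: group by target vertex and add up the counts.
    reduced = Counter()
    for vertex, n in emitted:
        reduced[vertex] += n
    return reduced


counts = Counter({v: 1 for v in VERTICES})  # g.V: one traverser per vertex
for _ in range(2):                          # .out.out
    counts = step(counts)
total = sum(counts.values())                # .count()
print(total)
```

Carrying a count per vertex instead of one record per path keeps the intermediate data proportional to the number of vertices, which matters when the frontier of a breadth-first traversal grows quickly.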
    40. Faunus Setup
        $ bin/gremlin.sh
                 \,,,/
                 (o o)
        -----oOOo-(_)-oOOo-----
        gremlin> g = FaunusFactory.open('bin/titan-hbase.properties')
        ==>faunusgraph[titanhbaseinputformat]
        gremlin> g.getProperties()
        ==>faunus.graph.input.format=com.thinkaurelius.faunus.formats.titan.hbase.TitanHBaseInputFormat
        ==>faunus.graph.output.format=org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
        ==>faunus.sideeffect.output.format=org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
        ==>faunus.output.location=dbpedia
        ==>faunus.output.location.overwrite=true
        gremlin> g._()
        12/11/09 15:17:45 INFO mapreduce.FaunusCompiler: Compiled to 1 MapReduce job(s)
        12/11/09 15:17:45 INFO mapreduce.FaunusCompiler: Executing job 1 out of 1: MapSequence[com.thinkaurelius.faunus.mapreduce.transform.IdentityMap.Map]
        12/11/09 15:17:50 INFO mapred.JobClient: Running job: job_201211081058_0003
    41. Build a Knowledge Graph
          • Based on DBpedia
              • Graph version of Wikipedia
              • ~290 million edges (~1B triples)
          1. Bulk load RDF into Faunus (6 m1.xlarge)
          2. Convert to property graph
          3. Bulk load into Titan (3 m1.xlarge with Cassandra)
          4. OLTP+OLAP
          • Total Time: ~2 hours
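Step 2 above, converting RDF to a property graph, can be sketched as follows. The deck does not describe the conversion rules used, so the mapping here is an assumption based on one common convention: a triple with a literal object becomes a vertex property, and a triple whose object is another resource becomes an edge. The function name `triples_to_property_graph` and the sample triples are hypothetical and are not taken from DBpedia.

```python
def triples_to_property_graph(triples):
    """Convert (subject, predicate, object, object_is_literal) tuples.

    Returns (vertices, edges): vertices maps a name to its property dict,
    and edges is a list of (source, label, target) tuples.
    """
    vertices, edges = {}, []
    for subject, predicate, obj, is_literal in triples:
        props = vertices.setdefault(subject, {})
        if is_literal:
            # Literal object: store it as a property on the subject vertex.
            # A repeated predicate overwrites the earlier value in this sketch.
            props[predicate] = obj
        else:
            # Resource object: make sure the target exists, then add an edge.
            vertices.setdefault(obj, {})
            edges.append((subject, predicate, obj))
    return vertices, edges


# Made-up sample triples, for illustration only.
sample = [
    ("Random_walk", "label", "Random walk", True),
    ("Random_walker_algorithm", "link", "Random_walk", False),
    ("Random_walker_algorithm", "link", "Laplacian_matrix", False),
]
vertices, edges = triples_to_property_graph(sample)
print(len(vertices), len(edges))
```

Because literal-valued triples collapse into properties under this convention, the resulting graph has fewer edges than the source has triples.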
    42. Graph OLTP
        gremlin> g = TitanFactory.open('bin/cassandra.local')
        ==>titangraph[cassandrathrift:10.176.213.110]
        gremlin> g.V('name','Random_walker_algorithm').both.name
        ==>Random_walk
        ==>Segmentation_(image_processing)
        ==>Graph_(mathematics)
        ==>Laplacian_matrix
        ==>Graph
        ==>Laplacian_matrix
        ==>Electrical_network
        ==>Resistor
        ==>Electrical_resistance_and_conductance
        ==>Ground_(electricity)
        ==>Direct_current
        ==>Voltage_source
        ==>Precomputation
        ==>Category:Computer_vision
        ==>Random_Walker_(Computer_Vision)
        ==>List_of_algorithms
        ==>Segmentation_(image_processing)
        ==>Watershed_(image_processing)
        ==>Random_walker_(computer_vision)
        ==>Random_Walker_(computer_vision)
    43. Graph OLAP
        gremlin> g.V('name','Learning').out.out.out.out[0..10].name
        ==>Latium
        ==>Roman_Kingdom
        ==>Roman_Republic
        ==>Roman_Empire
        ==>Middle_Ages
        ==>Early_modern_Europe
        ==>Armenian_Kingdom_of_Cilicia
        ==>Lingua_franca
        ==>Vatican_City
        ==>Vulgar_Latin
        ==>Romance_languages
    44. Complex Problem
          1. Identify Entities
          2. Identify Relationships
          3. Apply Graph Analysis
    45. Aurelius Graph Cluster (Apache 2), aureliusgraphs@googlegroups.com
          • TITAN: Stores a massive-scale property graph allowing real-time traversals and updates
          • FAUNUS: Batch processing of large graphs with Hadoop
          • FULGORA: Runs global graph algorithms on large, compressed, in-memory graphs
          • Arrow labels: Map/Reduce Load, Bulk Load, Analysis results back into Titan
    46. TINKERPOP.COM
    47. Thanks! Vadas Gintautas (@vadasg), Marko Rodriguez (@twarko), Stephen Mallette (@spmallette), Daniel LaRocque. AURELIUS, THINKAURELIUS.COM
    48. We are Hiring. AURELIUS, THINKAURELIUS.COM, @AURELIUSGRAPHS
