Adding Value through graph analysis using Titan and Faunus

In this presentation we discuss how graph analysis can add value to your data and how to use open source tools like Titan and Faunus to build scalable graph processing systems.
This presentation gives an update on the development status of Titan and Faunus with a preview of what is to come.

Presentation Transcript

  • Adding Value Through Graph Analysis (Knowledge, Information, Data). Matthias Broecheler, CTO, @mbroecheler. AURELIUS, THINKAURELIUS.COM. March V, MMXIII.
  • " " " " " " " " "Communities of Interest Finding Influencers "Understanding Behavior "
  • " " " " " " " " "Information Integration Recommendation "Question Answering "
  • " " " " " " " " "Fraud Detection Risk Analysis "Market Valuation "
  • Value: Knowledge, Information, Data
  • Knowledge: likes(Jane Joe, cute mammals): 0.8
    Information: userid:3552 clicked addid:9914, timestamp: 93932342
    Data: 2013-03-03 18:52:48:112;12.123.211.192; ACCESS/TRR;http://adserve.domain.com/render.cgi?uid=F32282DA39B&flagtru&xls=trending ; ACTION=CLICK|DELAY=250|x=450|y=632
  • Graph Databases & Graph Analysis (same Knowledge / Information / Data example as the previous slide)
  • I. Graph Foundation
  • Graph: vertices (Vertex) with properties (Property)
    name: Neptune, type: god
    name: Alcmene, type: god
    name: Saturn, type: titan
    name: Jupiter, type: god
    name: Hercules, type: demigod
    name: Pluto, type: god
    name: Cerberus, type: monster
  • Graph: edges (Edge) with an edge type (Edge Type) and edge properties (Edge Property), on the same vertices
    Edge types: brother, mother, father, father, battled, brother, pet
    Edge property: time:12 (on the battled edge)
  • Path (shown on the same example graph)
  • Degree (shown on the same example graph)
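The vertex, edge, property, path, and degree notions from the slides above can be sketched with a tiny in-memory property graph. This is plain Python rather than Titan's API. The edge endpoints are an assumption (the transcript lists only the edge labels), chosen to be consistent with the v.out('father').out('brother') traversal that appears later in the deck.

```python
from collections import deque

# Vertices with their properties, as listed on the slide.
vertices = {
    "Saturn": {"type": "titan"},
    "Jupiter": {"type": "god"},
    "Neptune": {"type": "god"},
    "Pluto": {"type": "god"},
    "Alcmene": {"type": "god"},
    "Hercules": {"type": "demigod"},
    "Cerberus": {"type": "monster"},
}

# Directed, labeled edges: (source, label, target, edge properties).
# ASSUMPTION: the endpoints are illustrative; the slide only shows the labels.
edges = [
    ("Jupiter", "father", "Saturn", {}),
    ("Jupiter", "brother", "Neptune", {}),
    ("Jupiter", "brother", "Pluto", {}),
    ("Hercules", "father", "Jupiter", {}),
    ("Hercules", "mother", "Alcmene", {}),
    ("Hercules", "battled", "Cerberus", {"time": 12}),
    ("Pluto", "pet", "Cerberus", {}),
]


def degree(name):
    """Number of edges touching a vertex (in-degree plus out-degree)."""
    return sum((src == name) + (dst == name) for src, _, dst, _ in edges)


def shortest_path(start, goal):
    """Breadth-first search along outgoing edges; returns vertex names or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for src, _, dst, _ in edges:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None


print(degree("Jupiter"))                    # edges touching Jupiter
print(shortest_path("Hercules", "Saturn"))  # a path is a sequence of connected vertices
```

Under the assumed endpoints, Jupiter has degree 4 and the path from Hercules to Saturn runs through Jupiter.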
  • Aurelius Graph Cluster (Apache 2)
    TITAN: Stores a massive-scale property graph allowing real-time traversals and updates
    FAUNUS: Batch processing of large graphs with Hadoop
    FULGORA: Runs global graph algorithms on large, compressed, in-memory graphs
    Data flow: Map/Reduce Load, Bulk Load, Analysis results back into Titan
  • II. Titan Graph Database
  • Titan Features
    Numerous Concurrent Users
    Many Short Transactions (read/write)
    Real-time Traversals (OLTP)
    High Availability
    Dynamic Scalability
    Variable Consistency Model (ACID or eventual consistency)
    Real-time Big Graph Data
  • Storage Backends: Partitionability, Consistency, Availability
  • $ ./titan-0.2.0/bin/gremlin.sh
             \,,,/
             (o o)
    -----oOOo-(_)-oOOo-----
    gremlin> g = TitanFactory.open('/tmp/titan')
    ==>titangraph[local:/tmp/titan]
    gremlin> v = g.V('name','Hercules')
    ==>v[4]
    gremlin> v.out('father').out('brother').name
  • The same traversal, shown on the example graph:
    gremlin> v.out('father').out('brother').name
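What the chained out() steps do can be sketched in a few lines: each step takes the current set of vertices and follows outgoing edges with one label. This is a plain-Python emulation, not Gremlin or Titan code, and the adjacency below is an assumption, since the transcript does not record which vertices each edge connects.

```python
# Outgoing adjacency: vertex -> list of (edge label, target vertex).
# ASSUMPTION: endpoints are illustrative; the slides only list the labels.
out_edges = {
    "Hercules": [("father", "Jupiter"), ("mother", "Alcmene"), ("battled", "Cerberus")],
    "Jupiter": [("father", "Saturn"), ("brother", "Neptune"), ("brother", "Pluto")],
    "Pluto": [("pet", "Cerberus")],
}


def out(frontier, label):
    """One traversal step: follow outgoing edges that carry the given label."""
    return [dst for v in frontier for (lbl, dst) in out_edges.get(v, []) if lbl == label]


# Emulates v.out('father').out('brother').name starting from Hercules.
result = out(out(["Hercules"], "father"), "brother")
print(result)
```

With these assumed endpoints the traversal walks from Hercules to his father and on to the father's brothers, giving Neptune and Pluto.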
  • Vertex-Centric Indices
    Sort and index edges per vertex by primary key (the primary key can be composite)
    Enables efficient focused traversals (only retrieve edges that matter)
    Uses push-down predicates for quick, index-driven retrieval
  • v.query()
    Matches every edge of v: battled (time: 1), battled (time: 3), battled (time: 5), battled (time: 9), mother, father, fought, fought
  • v.query().direction(OUT)
    Matches: battled (time: 1), battled (time: 3), battled (time: 5), battled (time: 9), mother, father
  • v.query().direction(OUT).labels('battled')
    Matches: battled (time: 1), battled (time: 3), battled (time: 5), battled (time: 9)
  • v.query().direction(OUT).labels('battled').has('time',T.lt,5)
    Matches: battled (time: 1), battled (time: 3)
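The narrowing query above is cheap when the edges of a vertex are kept sorted by a composite key, because the matching edges then form one contiguous slice. The sketch below illustrates that idea with Python's bisect module. It is not Titan's storage layout; the class name, the key order (direction, label, time), the default time of 0 for edges without one, and the target names are all assumptions made for the example.

```python
import bisect


class VertexEdges:
    """Edges of a single vertex, kept sorted by (direction, label, time)."""

    def __init__(self):
        self._keys = []     # sorted composite keys
        self._targets = []  # neighbor for each key, same order

    def add(self, direction, label, time, target):
        key = (direction, label, time)
        i = bisect.bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._targets.insert(i, target)

    def query(self, direction, label, time_lt):
        """Edges with the given direction and label whose time is below time_lt."""
        lo = bisect.bisect_left(self._keys, (direction, label, float("-inf")))
        hi = bisect.bisect_left(self._keys, (direction, label, time_lt))
        return list(zip(self._keys[lo:hi], self._targets[lo:hi]))


v = VertexEdges()
for t in (9, 1, 5, 3):                      # insertion order does not matter
    v.add("OUT", "battled", t, f"opponent-{t}")  # hypothetical neighbor names
v.add("OUT", "mother", 0, "mother-vertex")
v.add("OUT", "father", 0, "father-vertex")
v.add("IN", "fought", 0, "rival-a")
v.add("IN", "fought", 0, "rival-b")

# Emulates v.query().direction(OUT).labels('battled').has('time',T.lt,5)
hits = v.query("OUT", "battled", 5)
print([key[2] for key, _ in hits])
```

Two binary searches locate the slice, so the cost depends on the number of matching edges rather than on the total number of edges attached to the vertex.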
  • Titan Features: I. Data Management, II. Vertex-Centric Indices
  • Titan Features: III. Graph Partitioning, IV. Edge Compression
  • III. TITAN 0.3.0 [-SNAPSHOT]
  • Titan Embedding
    Rexster RexPro (lightweight Gremlin Server, binary protocol)
    Titan Gremlin Engine
    Embedded Storage Backend (in-JVM method calls)
    Native clients (Java, Python, Clojure)
  • Graph Indexing
    Vertex and Edge indexing
    Pluggable index provider (ElasticSearch, Lucene)
    Full-text search
    Numeric range search
    Geographic search
  • Example graph with indexed properties
    name: Neptune, age: 5200, title: God of the earth and ocean
    name: Alcmene, age: 3300
    name: Saturn, age: 5900
    name: Jupiter, age: 4800, title: God of the heaven and skies
    name: Hercules, title: Divine hero
    name: Pluto, age: 4900, title: God of the underworld
    name: Cerberus, title: Ugly beast of the underworld
    Edges: brother, mother, father, father, brother, pet, battled (time:12, location: (38.071,23.745))
  • Numeric range search on the example graph:
    g.query().has('age',Cmp.GREATER_THAN,5000).vertices()
  • Full-text search on the example graph:
    g.query().has('title',Txt.CONTAINS,'god').vertices()
  • Combined numeric and full-text search:
    g.query().has('age',Cmp.GREATER_THAN,5000).has('title',Txt.CONTAINS,'god').vertices()
  • Geographic search on edges:
    g.query().has('location',Geo.WITHIN,Geoshape.circle(38,23,100)).edges()
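The meaning of the three predicate kinds (numeric range, text containment, geographic containment) can be sketched with plain Python filters. This is a linear scan for illustration only, whereas an index provider such as ElasticSearch or Lucene would answer these from an index. The assignment of ages and titles to vertices is read off the garbled slide layout and is an assumption, as is treating the circle as (latitude, longitude, radius in km).

```python
import math

# ASSUMPTION: which age/title belongs to which vertex is inferred from the slide layout.
vertices = [
    {"name": "Saturn", "age": 5900},
    {"name": "Neptune", "age": 5200, "title": "God of the earth and ocean"},
    {"name": "Jupiter", "age": 4800, "title": "God of the heaven and skies"},
    {"name": "Pluto", "age": 4900, "title": "God of the underworld"},
    {"name": "Alcmene", "age": 3300},
    {"name": "Hercules", "title": "Divine hero"},
    {"name": "Cerberus", "title": "Ugly beast of the underworld"},
]
edges = [{"label": "battled", "time": 12, "location": (38.071, 23.745)}]


def greater_than(key, bound):
    """Numeric range predicate: property exists and exceeds the bound."""
    return lambda e: key in e and e[key] > bound


def text_contains(key, word):
    """Full-text predicate: case-insensitive match on whitespace-separated tokens."""
    return lambda e: word.lower() in str(e.get(key, "")).lower().split()


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geo_within(key, center, radius_km):
    """Geographic predicate: point property lies inside the circle."""
    return lambda e: key in e and haversine_km(e[key], center) <= radius_km


def query(elements, *predicates):
    """Return the elements that satisfy every predicate."""
    return [e for e in elements if all(p(e) for p in predicates)]


old = query(vertices, greater_than("age", 5000))
old_gods = query(vertices, greater_than("age", 5000), text_contains("title", "god"))
nearby = query(edges, geo_within("location", (38, 23), 100))
print([v["name"] for v in old], [v["name"] for v in old_gods], len(nearby))
```

Under these assumptions the age filter keeps Saturn and Neptune, adding the title filter leaves Neptune, and the battled edge falls inside the 100 km circle.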
  • IV. Faunus Graph Analytics
  • Faunus Features
    Hadoop-based Graph Computing Framework
    Graph Analytics
    Breadth-first Traversals
    Global Graph Computations
    Batch Big Graph Data
  • Faunus Architecture: g._()
  • Faunus Work Flow: g.V.out.out.count()
    Output under hdfs://user/ubuntu/ in output/job-0/, output/job-1/, output/job-2/ { graph*, sideeffect* }
    Compressed HDFS Graphs: stored in sequence files, variable length encoding, prefix compression
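A traversal such as g.V.out.out.count() can be evaluated breadth-first in batch by passing path counts along edges, one map/reduce round per out step. The sketch below shows that general pattern in plain Python. How Faunus actually compiles the steps is not stated in the transcript, so the function, the toy adjacency, and the vertex names are assumptions.

```python
from collections import defaultdict


def out_step(counts, adjacency):
    """One 'out' step over the whole graph.

    Map phase: every vertex emits its path count to each out-neighbor.
    Reduce phase: counts arriving at the same vertex are summed.
    """
    emitted = [(dst, c) for v, c in counts.items() for dst in adjacency.get(v, [])]
    reduced = defaultdict(int)
    for dst, c in emitted:
        reduced[dst] += c
    return dict(reduced)


# Hypothetical toy graph: vertex -> out-neighbors.
adjacency = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}

counts = {v: 1 for v in adjacency}       # g.V: one traverser per vertex
counts = out_step(counts, adjacency)     # .out
counts = out_step(counts, adjacency)     # .out
total = sum(counts.values())             # .count()
print(total)
```

The total equals the number of two-step paths in the graph, computed without ever materializing the individual paths.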
  • Aurelius Graph Cluster (Apache 2): TITAN, FAUNUS, FULGORA (overview slide repeated)
  • What’s New
    Faunus 0.1 released
    Bulk Import / Export for Titan (loaded graph into Titan, loading derivations into Titan, RDF support)
    Many optimizations (vertex compression)
  • Faunus Setup
    $ bin/gremlin.sh
             \,,,/
             (o o)
    -----oOOo-(_)-oOOo-----
    gremlin> g = FaunusFactory.open('bin/titan-hbase.properties')
    ==>faunusgraph[titanhbaseinputformat]
    gremlin> g.getProperties()
    ==>faunus.graph.input.format=com.thinkaurelius.faunus.formats.titan.hbase.TitanHBaseInputFormat
    ==>faunus.graph.output.format=org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
    ==>faunus.sideeffect.output.format=org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
    ==>faunus.output.location=dbpedia
    ==>faunus.output.location.overwrite=true
    gremlin> g._()
    12/11/09 15:17:45 INFO mapreduce.FaunusCompiler: Compiled to 1 MapReduce job(s)
    12/11/09 15:17:45 INFO mapreduce.FaunusCompiler: Executing job 1 out of 1: MapSequence[com.thinkaurelius.faunus.mapreduce.transform.IdentityMap.Map]
    12/11/09 15:17:50 INFO mapred.JobClient: Running job: job_201211081058_0003
  • Build a Knowledge Graph
    Based on DBPedia: graph version of Wikipedia, ~290 million edges (~1B triples)
    1. Bulk load RDF into Faunus (6 m1.xlarge)
    2. Convert to property graph
    3. Bulk load into Titan (3 m1.xlarge with Cassandra)
    4. OLTP+OLAP
    Total Time: ~2 hours
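Step 2 above, converting RDF to a property graph, can be sketched as follows. The deck does not say which conversion rule was used, so this assumes one common convention: triples whose object is a literal become vertex properties, and triples whose object is another resource become labeled edges. The function name, the predicate names, and the sample triples are hypothetical.

```python
def triples_to_property_graph(triples):
    """Convert (subject, predicate, object, is_literal) triples to a property graph.

    ASSUMPTION: literal objects become properties of the subject vertex;
    resource objects become directed edges labeled with the predicate.
    """
    vertices, edges = {}, []
    for subject, predicate, obj, is_literal in triples:
        props = vertices.setdefault(subject, {})
        if is_literal:
            props[predicate] = obj
        else:
            vertices.setdefault(obj, {})
            edges.append((subject, predicate, obj))
    return vertices, edges


# Hypothetical sample triples in the spirit of DBPedia page links.
triples = [
    ("Random_walker_algorithm", "label", "Random walker algorithm", True),
    ("Random_walker_algorithm", "linksTo", "Random_walk", False),
    ("Random_walker_algorithm", "linksTo", "Laplacian_matrix", False),
    ("Random_walk", "label", "Random walk", True),
]

vertices, edges = triples_to_property_graph(triples)
print(len(vertices), len(edges))
```

Under this convention several triples can collapse into one vertex with properties, so the number of edges in the property graph is smaller than the number of input triples.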
  • Graph OLTP
    gremlin> g = TitanFactory.open('bin/cassandra.local')
    ==>titangraph[cassandrathrift:10.176.213.110]
    gremlin> g.V('name','Random_walker_algorithm').both.name
    ==>Random_walk
    ==>Segmentation_(image_processing)
    ==>Graph_(mathematics)
    ==>Laplacian_matrix
    ==>Graph
    ==>Laplacian_matrix
    ==>Electrical_network
    ==>Resistor
    ==>Electrical_resistance_and_conductance
    ==>Ground_(electricity)
    ==>Direct_current
    ==>Voltage_source
    ==>Precomputation
    ==>Category:Computer_vision
    ==>Random_Walker_(Computer_Vision)
    ==>List_of_algorithms
    ==>Segmentation_(image_processing)
    ==>Watershed_(image_processing)
    ==>Random_walker_(computer_vision)
    ==>Random_Walker_(computer_vision)
  • gremlin> g.V('name','Learning').out.out.out.out[0..10].name
    ==>Latium
    ==>Roman_Kingdom
    ==>Roman_Republic
    ==>Roman_Empire
    ==>Middle_Ages
    ==>Early_modern_Europe
    ==>Armenian_Kingdom_of_Cilicia
    ==>Lingua_franca
    ==>Vatican_City
    ==>Vulgar_Latin
    ==>Romance_languages
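The [0..10] range in the traversal above returns eleven results, so the range is inclusive at both ends. A rough Python analogue uses lazy generators for the out steps and itertools.islice for the range, which means only as much of the four-hop expansion is computed as the first eleven results require. The adjacency below is a hypothetical ring graph, not DBPedia data, and the code is an emulation rather than Gremlin.

```python
from itertools import islice


def out(frontier, adjacency):
    """Lazily yield every out-neighbor of every vertex in the frontier."""
    for v in frontier:
        yield from adjacency.get(v, [])


# Hypothetical graph: each vertex i points to i+1 and i+2 (mod 100).
adjacency = {i: [(i + 1) % 100, (i + 2) % 100] for i in range(100)}

frontier = iter([0])                 # start vertex
for _ in range(4):                   # .out.out.out.out
    frontier = out(frontier, adjacency)

# [0..10] keeps positions 0 through 10, i.e. eleven results.
first = list(islice(frontier, 0, 11))
print(len(first), first)
```

Because the generators are chained, the remaining five of the sixteen possible four-hop results are never produced.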
  • Aurelius Graph Cluster (Apache 2): TITAN, FAUNUS, FULGORA (overview slide repeated). aureliusgraphs@googlegroups.com
  • The Graph Landscape (illustration only, not to scale). Axes: Speed of Traversal/Process and Size of Graph.
  • TINKERPOP.COM
  • Thanks! Vadas Gintautas (@vadasg), Marko Rodriguez (@twarko), Stephen Mallette (@spmallette), Daniel LaRocque
  • We are Hiring