Titan: The Rise of Big Graph Data
 


A graph is a data structure composed of vertices/dots and edges/lines. A graph database is a software system used to persist and process graphs. The common conception in today's database community is that there is a tradeoff between the scale of data and the complexity/interlinking of data. To challenge this understanding, Aurelius has developed Titan under the liberal Apache 2 license. Titan supports both the size of modern data and the modeling power of graphs to usher in the era of Big Graph Data. Novel techniques in edge compression, data layout, and vertex-centric indices that exploit significant orders are used to facilitate the representation and processing of a single atomic graph structure across a multi-machine cluster. To ensure ease of adoption by the graph community, Titan natively implements the TinkerPop 2 Blueprints API. This presentation will review the graph landscape, Titan's techniques for scale by distribution, and a collection of satellite graph technologies to be released by Aurelius in the coming summer months of 2012.

    Presentation Transcript

    • TITAN: THE RISE OF BIG GRAPH DATA. MARKO A. RODRIGUEZ, MATTHIAS BROECHELER. http://THINKAURELIUS.COM
    • ABSTRACT: A graph is a data structure composed of vertices/dots and edges/lines. A graph database is a software system used to persist and process graphs. The common conception in today's database community is that there is a tradeoff between the scale of data and the complexity/interlinking of data. To challenge this understanding, Aurelius has developed Titan under the liberal Apache 2 license. Titan supports both the size of modern data and the modeling power of graphs to usher in the era of Big Graph Data. Novel techniques in edge compression, data layout, and vertex-centric indices that exploit significant orders are used to facilitate the representation and processing of a single atomic graph structure across a multi-machine cluster. To ensure ease of adoption by the graph community, Titan natively implements the TinkerPop 2 Blueprints API. This presentation will review the graph landscape, Titan's techniques for scale by distribution, and a collection of satellite graph technologies to be released by Aurelius in the coming summer months of 2012.
    • SPEAKER BIOGRAPHIES Dr. Marko A. Rodriguez is the founder of the graph consulting firm Aurelius. He has focused his academic and commercial career on the theoretical and applied aspects of graphs. Marko is a cofounder of TinkerPop and the primary developer of the Gremlin graph traversal language. Dr. Matthias Broecheler has been researching and developing large-scale graph database systems for many years in both academia and in his role as a cofounder of the Aurelius graph consulting firm. He is the primary developer of the distributed graph database Titan. Matthias focuses most of his time and effort on novel OLTP and OLAP graph processing solutions.
    • SPONSORS
      Pearson: As the leading education services company, Pearson is serious about evolving how the world learns. We apply our deep education experience and research, invest in innovative technologies, and promote collaboration throughout the education ecosystem. Real change is our commitment and its results are delivered through connecting capabilities to create actionable, scalable solutions that improve access, affordability, and achievement.
      Aurelius: Aurelius is a team of software engineers and scientists committed to applying graph theory and network science to problems in numerous domains. Aurelius develops the theory and technology whereby graphs can be used to model, understand, predict, and influence the behavior of complex, interrelated social, economic, and physical networks.
      Jive: Jive is the pioneer and world's leading provider of social business solutions. Our products apply powerful technology that helps people connect, communicate and collaborate to get more work done and solve their biggest business challenges. Millions of users and many of the world's most successful companies rely on Jive day in and day out to get work done, serve their customers and stay ahead of their competitors.
    • OUTLINE
      1. THE GRAPH LANDSCAPE: An introduction to graph computing. Graph technologies on the market today.
      2. INTRODUCTION TO TITAN: Getting up and running with Titan. Titan's techniques for scalability.
      3. THE FUTURE OF AURELIUS: Satellite technologies and the OLAP story. The graph landscape reprise.
    • PART 1: THE GRAPH LANDSCAPE (MARKO A. RODRIGUEZ)
    • GRAPH: G = (V, E). A graph is composed of vertices (V) and edges (E). This is the classic textbook graph structure.
    • A homogeneous set of vertices (V)...
    • ...connected by a homogeneous set of edges (E).
    • RESTRICTED MODELING: People and follows relationships... ...xor webpages and citations.
    • AN INTEGRATED MODEL IS TYPICALLY DESIRED [Figure: a single graph whose edges carry the labels references, createdBy, follows, and mentions]
    • AN INTEGRATED MODEL IS USEFUL
      Allows for more interesting/novel algorithms (beyond "textbook" graph algorithms).
      Allows for a universal model of things and their relationships (a single, unified model of a domain of interest).
    • THE PROPERTY GRAPH: G = (V, E, λ). Current popular graph structure.
      * Directed, attributed, edge-labeled graph
      * Multi-relational graph with key/value pairs on the elements
    • VERTEX with PROPERTIES (KEY:VALUE pairs), e.g. name:hercules
    • EDGE with a LABEL, e.g. hercules --mother--> alcmene
    • [Figure: hercules (name:hercules) --mother--> alcmene (name:alcmene, type:human); hercules --father--> jupiter (name:jupiter, type:god)]
    • IS HERCULES A DEMIGOD? DEMIGOD = HALF HUMAN + HALF GOD
      gremlin> hercules
      ==>v[0]
      gremlin> hercules.out('mother','father')
      ==>v[1]
      ==>v[2]
      gremlin> hercules.out('mother','father').type
      ==>human
      ==>god
      gremlin> hercules.type = 'demigod'
      ==>demigod
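The demigod check above can be mimicked outside of Gremlin. The following is a minimal sketch in plain Python, not Titan or Blueprints code: the PropertyGraph class and its method names are hypothetical, and only the vertex ids, property keys, and edge labels are taken from the slides.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny in-memory property graph (hypothetical helper, not a Titan API)."""

    def __init__(self):
        self.vertices = {}                   # vertex id -> property dict
        self.out_edges = defaultdict(list)   # vertex id -> [(label, target id)]

    def add_vertex(self, vid, **props):
        self.vertices[vid] = dict(props)

    def add_edge(self, source, label, target):
        self.out_edges[source].append((label, target))

    def out(self, vid, *labels):
        """Follow outgoing edges, optionally restricted to the given labels."""
        return [t for (l, t) in self.out_edges[vid] if not labels or l in labels]


g = PropertyGraph()
g.add_vertex(0, name="hercules")
g.add_vertex(1, name="alcmene", type="human")
g.add_vertex(2, name="jupiter", type="god")
g.add_edge(0, "mother", 1)
g.add_edge(0, "father", 2)

# Equivalent in spirit to hercules.out('mother','father').type
parent_types = [g.vertices[v]["type"] for v in g.out(0, "mother", "father")]

# DEMIGOD = HALF HUMAN + HALF GOD
if sorted(parent_types) == ["god", "human"]:
    g.vertices[0]["type"] = "demigod"
```

The key/value pairs live on the vertices and the label lives on the edge, which is what distinguishes this structure from the unlabeled G = (V, E) model.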
    • COMPUTING: PROCESS and STRUCTURE. GRAPH-BASED COMPUTING: TRAVERSAL (process) and GRAPH (structure).
    • WHY GRAPH-BASED COMPUTING?
      INTUITIVE MODELING
      EXPRESSIVE QUERYING
      NUMEROUS ANALYSES: Mixing Patterns, Ranking, Inference, Motifs, Path Expressions, Centrality, Scoring, Geodesics
    • ANALYSES ARE THE EPIPHENOMENA OF TRAVERSAL f( ) →
    • WHAT IS THE SIGNIFICANCE OF GRAPH ANALYSIS?
    • ANALYSES YIELD INSIGHTS ABOUT THE MODEL. DATA PRODUCTS = DATA-DRIVEN DECISION SUPPORT
    • RECOMMENDATION
      People you may know. (SOCIAL GRAPH)
      Products you might like. (RATINGS GRAPH)
      Movies you should watch and the friends you should watch them with. (SOCIAL+RATINGS GRAPH)
    • WHO ELSE MIGHT HERCULES KNOW? [Figure: a "knows" graph over hercules (0), cerberus (1), nemean (2), hydra (3), pluto (4), neptune (5), and jupiter (6)]
      gremlin> hercules
      ==>v[0]
      gremlin> hercules.out('knows')
      ==>v[1]
      ==>v[2]
      ==>v[3]
      gremlin> hercules.out('knows').out('knows')
      ==>v[4]
      ==>v[5]
      ==>v[5]
      ==>v[6]
      ==>v[5]
      gremlin> hercules.out('knows').out('knows').groupCount.cap
      ==>v[4]=1
      ==>v[5]=3
      ==>v[6]=1
    • HERCULES PROBABLY KNOWS NEPTUNE [Figure: the same graph with a proposed "knows" edge from hercules (0) to neptune (5)]
    • HERCULES PROBABLY KNOWS NEPTUNE. THIS IS A "TEXTBOOK STYLE" GRAPH.
    • HERCULES PROBABLY KNOWS NEPTUNE ...PROBABLY MORE SO WHEN OTHER TYPES OF EDGES ARE ANALYZED [Figure: the same graph with additional brother and father edges]
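The "people you may know" traversal can be sketched in plain Python as a two-hop walk followed by a count. The adjacency list below is an assumption: it is one edge set that reproduces the traversal output shown on the slides, since the transcript does not preserve the figure's edges. The helper name out is hypothetical.

```python
from collections import Counter

names = {0: "hercules", 1: "cerberus", 2: "nemean", 3: "hydra",
         4: "pluto", 5: "neptune", 6: "jupiter"}

# Assumed 'knows' edges, chosen to match the Gremlin output on the slides.
knows = {
    0: [1, 2, 3],
    1: [4, 5],
    2: [5],
    3: [6, 5],
}


def out(vertices, adjacency):
    """One traversal step: follow every outgoing edge of every input vertex."""
    return [target for v in vertices for target in adjacency.get(v, [])]


# hercules.out('knows').out('knows')
two_hops = out(out([0], knows), knows)

# .groupCount: how many distinct two-step paths end at each vertex
counts = Counter(two_hops)

# The vertex reached by the most paths is the strongest recommendation.
best, paths = counts.most_common(1)[0]
```

Duplicates in two_hops are deliberate: each one is a separate path, and the path count is the ranking signal.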
    • NEMEAN MIGHT LIKE TARTARUS [Figure: the SOCIAL GRAPH (knows, brother, and father edges) is extended with a RATINGS GRAPH (likes and dislikes edges pointing to human flesh (7) and tartarus (8)) and a PRODUCT GRAPH (smellsOf and composedOf edges)]
      * Collaborative Filtering + Content-Based Recommendation
    • PATH FINDING
      How is this person related to this film? (MOVIE GRAPH)
      Which authors of this book also wrote a New York Times bestseller? (BOOK GRAPH)
      Which movies are based on a book by a New York Times bestseller? (MOVIE+BOOK GRAPH)
    • WHO PLAYED HERCULES IN WHAT MOVIE? [Figure: hercules (0) and jupiter (6) are connected by depictedIn edges to the movie hercules in new york (7); the movie has hasActor edges to vertices 8 and 10, which carry role edges back to the characters and actor edges to arnold schwarzenegger (9) and ernest graves (11)]
      gremlin> hercules
      ==>v[0]
      gremlin> hercules.out('depictedIn')
      ==>v[7]
      gremlin> hercules.out('depictedIn').as('movie')
      ==>v[7]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
      ==>v[8]
      ==>v[10]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role')
      ==>v[0]
      ==>v[6]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules)
      ==>v[0]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2)
      ==>v[8]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor')
      ==>v[9]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor').as('star')
      ==>v[9]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor').as('star').select
      ==>[movie:v[7], star:v[9]]
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor').as('star').select{it.name}
      ==>[movie:hercules in new york, star:arnold schwarzenegger]
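The same path pattern can be written as nested loops in plain Python. This is only a sketch: the edge list is reconstructed from the vertex ids in the Gremlin output above (an assumption, as the figure itself is lost), and the function names out and who_played are hypothetical.

```python
# (source, label, target) triples reconstructed from the slide output.
edges = [
    (0, "depictedIn", 7), (6, "depictedIn", 7),
    (7, "hasActor", 8), (7, "hasActor", 10),
    (8, "role", 0), (10, "role", 6),
    (8, "actor", 9), (10, "actor", 11),
]

names = {0: "hercules", 6: "jupiter", 7: "hercules in new york",
         9: "arnold schwarzenegger", 11: "ernest graves"}


def out(vertex, label):
    """Targets of the outgoing edges of `vertex` that carry `label`."""
    return [t for (s, l, t) in edges if s == vertex and l == label]


def who_played(character):
    """Find (movie, star) pairs for a character vertex."""
    results = []
    for movie in out(character, "depictedIn"):       # .as('movie')
        for casting in out(movie, "hasActor"):
            # .out('role').retain(character).back(2): keep only the casting
            # vertex whose role is the character we started from.
            if character in out(casting, "role"):
                for star in out(casting, "actor"):   # .as('star')
                    results.append({"movie": names[movie], "star": names[star]})
    return results


hercules_result = who_played(0)
jupiter_result = who_played(6)
```

The intermediate casting vertex is what lets one movie attach different actors to different roles; without it, the graph could not say which actor played which character.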
    • [Figure: the movie graph is extended step by step with the arms of hercules (12), fred saberhagen (13), albuquerque (14), santa fe (15), and marko rodriguez (16), using the edge labels depictedIn, writtenBy, livesIn, 25-North, and thinksHeIs. The resulting regions are labeled TRANSPORTATION GRAPH, BOOK GRAPH, PROFILE GRAPH, and MOVIE GRAPH.]
    • SOCIAL INFLUENCE (SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH)
      Who are the most influential people in java, mathematics, art, surreal art, politics, ...?
      Which region of the social graph will propagate this advertisement the furthest?
      Which 3 experts should review this submitted article?
      Which people should I talk to at the upcoming conference and what topics should I talk to them about?
    • PATTERN IDENTIFICATION
      This connectivity pattern is a sign of financial fraud. When this motif is found, a red flag will be raised. (TRANSACTION GRAPH)
      Healthy discourse is typified by a discussion board with a branch factor in this range and a concept clique score in this range. (DISCUSSION GRAPH)
    • KNOWLEDGE DISCOVERY
      The terms "ice", "fans", "stanley cup" are classified as "sports". (WIKIPEDIA GRAPH)
      Given that all identified birds fly, it can be deduced that all birds fly. If contrary evidence is provided, then this "fact" can be retracted. (EVIDENTIAL LOGIC GRAPH)
    • WORLD PROCESSES and WORLD MODEL: A single world model and various types of traversers moving through that model to solve problems.
    • COMPUTING: PROCESS and STRUCTURE. GRAPH-BASED COMPUTING: TRAVERSAL (process) and GRAPH (structure).
    • GRAPH COMPUTING ENGINES
    • MEMORY-BASED GRAPHS (Graph Framework inside the Application)
      NetworkX http://networkx.lanl.gov/
      iGraph http://igraph.sourceforge.net/
      JUNG http://jung.sourceforge.net/
    • DISK-BASED GRAPHS (Graph Database serving multiple Applications)
      Neo4j http://neo4j.org/
      OrientDB http://orientdb.org
      InfiniteGraph http://objectivity.com
      DEX http://www.sparsity-technologies.com/dex
    • CLUSTER-BASED GRAPHS (Bulk Synchronous Parallel Processing)
      Hama http://incubator.apache.org/hama/
      Giraph http://incubator.apache.org/giraph/
      GoldenOrb http://goldenorbos.org/
      * In the same spirit as Google's Pregel
    • MEMORY-BASED GRAPHS
      Graph size is constrained by the local machine's RAM.
      Rich graph algorithm and visualization packages.
      Oriented towards "textbook-style" graphs.
    • DISK-BASED GRAPHS
      Graph size is constrained by local disk.
      Optimized for local graph algorithms.
      Oriented towards property graphs.
    • CLUSTER-BASED GRAPHS
      Graph size is constrained by the cluster's total RAM.
      Optimized for global graph algorithms.
      Oriented towards "textbook-style" graphs.
      * Based on typical behavior
    • TINKERPOP: Open source graph product group. http://tinkerpop.com
      Support for various graph vendors. *
      Simple, well-defined products.
      Provides a vendor-agnostic graph framework.
      * Encompassing the various graph computing styles
      * Based on future directions
    • TINKERPOP products: Generic Graph API, Dataflow Processing, Traversal Language, Object-Graph Mapper, Graph Algorithms, Graph Server. http://tinkerpop.com http://${project.name}.tinkerpop.com
    • TINKERPOP INTEGRATION http://tinkerpop.com
    • AND NOW THERE IS ANOTHER...
    • TITAN
    • PART 2: INTRODUCTION TO TITAN (MATTHIAS BROECHELER)
    • WHY CREATE TITAN? A number of Aurelius' clients...
      ...need to represent and process graphs at the 100+ billion edge scale w/ thousands of concurrent transactions.
      ...need both local graph traversals (OLTP) and batch graph processing (OLAP).
      ...desire a free, open source distributed graph database.
    • TITAN'S KEY FEATURES. Titan provides...
      ..."infinite size" graphs and "unlimited" users by means of a distributed storage engine.
      ...real-time local traversals (OLTP) and support for global batch processing via Hadoop (OLAP).
      ...distribution via the liberal, free, open source Apache2 license.
    • matthias$ wget http://thinkaurelius/titan.zip
        % Total % Received % Xferd Average Speed Time Time
        100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01
      matthias$ unzip titan.zip
      Archive: titan.zip
        creating: titan/
        ...
      matthias$ cd titan
      titan$ bin/gremlin.sh
               \,,,/
               (o o)
      -----oOOo-(_)-oOOo-----
      gremlin>
    • LOCAL MACHINE DEMO
      gremlin> g = TitanFactory.open('/tmp/local-titan')
      ==>titangraph[local:/tmp/local-titan]
    • gremlin> g.createKeyIndex('name',Vertex.class)
      ==>null
      gremlin> g.stopTransaction(SUCCESS)
      ==>null
    • THE GRAPH OF THE GODS [Figure: vertices saturn (type:titan), sky (type:location), sea (type:location), jupiter (type:god), neptune (type:god), pluto (type:god), hercules (type:demigod), alcmene (type:human), tartarus (type:location), nemean (type:monster), hydra (type:monster), and cerberus (type:monster), connected by edges labeled father, mother, brother, lives, pet, and battled (time:1, time:2, time:12)]
      gremlin> g.loadGraphML('data/graph-of-the-gods.xml')
      ==>null
      gremlin> hercules = g.V('name','hercules').next()
      ==>v[24]
      gremlin> hercules.out('mother','father')
      ==>v[44]
      ==>v[16]
      gremlin> hercules.out('mother','father').name
      ==>alcmene
      ==>jupiter
      * The Graph of the Gods is a toy dataset distributed with Titan
    • THAT WAS TITAN LOCAL. NEXT IS TITAN DISTRIBUTED.
      Broecheler, M., Pugliese, A., Subrahmanian, V.S., "COSI: Cloud Oriented Subgraph Identification in Massive Social Networks," Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, pp. 248-255, 2010. http://www.knowledgefrominformation.com/2010/08/01/cosi-cloud-oriented-subgraph-identification-in-massive-social-networks/
    • BACKEND AGNOSTIC -OR-
    • TITAN DISTRIBUTED VIA CASSANDRA
      titan$ bin/gremlin.sh
               \,,,/
               (o o)
      -----oOOo-(_)-oOOo-----
      gremlin> conf = new BaseConfiguration();
      ==>org.apache.commons.configuration.BaseConfiguration@763861e6
      gremlin> conf.setProperty("storage.backend","cassandra");
      gremlin> conf.setProperty("storage.hostname","77.77.77.77");
      gremlin> g = TitanFactory.open(conf);
      ==>titangraph[cassandra:77.77.77.77]
      gremlin>
      * There are numerous graph configurations: https://github.com/thinkaurelius/titan/wiki/Graph-Configuration
    • INHERITED FEATURES (Cassandra)
      Continuously available with no single point of failure.
      No write bottlenecks to the graph as there is no master/slave architecture.
      Built-in replication ensures data is available during machine failure.
      Caching layer ensures that continuously accessed data is available in memory.
      Elastic scalability allows for the introduction and removal of machines.
      Cassandra available at http://cassandra.apache.org/
    • TITAN DISTRIBUTED VIA HBASE
      titan$ bin/gremlin.sh
               \,,,/
               (o o)
      -----oOOo-(_)-oOOo-----
      gremlin> conf = new BaseConfiguration();
      ==>org.apache.commons.configuration.BaseConfiguration@763861e6
      gremlin> conf.setProperty("storage.backend","hbase");
      gremlin> conf.setProperty("storage.hostname","77.77.77.77");
      gremlin> g = TitanFactory.open(conf);
      ==>titangraph[hbase:77.77.77.77]
      gremlin>
      * There are numerous graph configurations: https://github.com/thinkaurelius/titan/wiki/Graph-Configuration
    • INHERITED FEATURES (HBase)
      Strictly consistent reads and writes.
      Linear scalability with the addition of machines.
      Base classes for backing Hadoop MapReduce jobs with HBase tables.
      HDFS-based data replication.
      Generally good integration with the tools in the Hadoop ecosystem.
      HBase available at http://hbase.apache.org/
    • TITAN AND THE CAP THEOREM: Consistency, Availability, Partitionability
    • Titan is all about numerous concurrent users... high availability... dynamic scalability...
    • THE HOW OF TITAN: DATA MANAGEMENT, EDGE COMPRESSION, VERTEX-CENTRIC INDICES
    • THE HOW OF TITAN: DATA MANAGEMENT
    • DATA MANAGEMENT: MAIN DESIGN PRINCIPLES
      Immutable, Atomic Edges
      Optimistic Concurrency Control
      Fine-Grained Locking Control
      [Figure: the edge hercules --battled--> cerberus in three successive versions: (1) without properties, (2) with time:12, (3) with time:12 and successful:true]
    • DATA MANAGEMENT: TYPE DEFINITION
      Datatype Constraints:
        TitanKey timeKey = g.makeType().name("time").dataType(Integer.class)
        Example values: time:12, time:"twelve"
      Edge Label Signatures:
        TitanLabel battled = g.makeType().name("battled").signature(timeKey)
        Example: hercules --battled (time:12)--> cerberus
      Functional Declarations:
        TitanLabel father = g.makeType().name("father").functional()
        Example: hercules --father--> jupiter, hercules --father--> mars
      Data management configurations allow Titan to optimize how information is stored/retrieved from disk.
    • DATA MANAGEMENT: TYPE DEFINITION
      Endogenous Indices:
        g.createKeyIndex("name",Vertex.class)
        Example: name:hercules, name:hermes, name:jupiter, name:neptune
      Unique Property Key/Value Pairs:
        TitanKey status = g.makeType().name("status").unique()
        Example: name:jupiter with status:king of the gods, and a second vertex with status:king of the gods
      Data management configurations allow Titan to optimize how information is stored/retrieved from disk.
    • DATA MANAGEMENT: LOCKING SYSTEM
      Ensures consistency over non-consistent storage backends.
      [Figure: two concurrent writes, one adding hercules --father--> jupiter and one adding hercules --father--> neptune]
      1. Acquire lock at the end of the transaction (the locking mechanism depends on the storage layer's consistency guarantees).
      2. Verify original read.
      3. Fail transaction if any precondition is violated.
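The three locking steps can be illustrated with a small optimistic-concurrency sketch in plain Python. It is not Titan's implementation: the Store, Transaction, and ConflictError names are hypothetical, the store is a single in-process dictionary, and a threading.Lock stands in for whatever lock the storage layer would provide.

```python
import threading


class ConflictError(Exception):
    """Raised when a value read by the transaction changed before commit."""


class Store:
    def __init__(self):
        self.data = {}
        self.lock = threading.Lock()


class Transaction:
    def __init__(self, store):
        self.store = store
        self.reads = {}    # key -> value originally seen (the preconditions)
        self.writes = {}   # key -> value to be written at commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        value = self.store.data.get(key)
        self.reads.setdefault(key, value)   # remember only the first read
        return value

    def write(self, key, value):
        self.read(key)                      # record what we are overwriting
        self.writes[key] = value

    def commit(self):
        with self.store.lock:                          # 1. lock at the end
            for key, seen in self.reads.items():       # 2. verify original read
                if self.store.data.get(key) != seen:
                    raise ConflictError(key)           # 3. fail on violation
            self.store.data.update(self.writes)


store = Store()
t1, t2 = Transaction(store), Transaction(store)
t1.write("hercules.father", "jupiter")
t2.write("hercules.father", "neptune")
t1.commit()
try:
    t2.commit()
    conflict = False
except ConflictError:
    conflict = True
```

No lock is held while the transactions do their work; contention is only detected at commit time, which is what makes the scheme optimistic.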
    • DATA MANAGEMENT: ID MANAGEMENT
      Global ID Pool Maintained by Storage Engine: [0,1,2,3,4,5,6,7,8,9,10,11]
      Pool Subsets Assigned to Individual Instances: [0,1,2] [3,4,5] [6,7,8] [9,10,11]
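Block-wise id assignment can be sketched as follows in plain Python. The GlobalIdPool and Instance classes are hypothetical, and the block size of 3 simply mirrors the slide's example; the point is that an instance contacts the shared pool only once per block rather than once per id.

```python
import threading


class GlobalIdPool:
    """Stand-in for the id pool maintained by the storage engine."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.next_start = 0
        self.lock = threading.Lock()

    def claim_block(self):
        """Atomically hand out the next unused block of ids."""
        with self.lock:
            start = self.next_start
            self.next_start += self.block_size
        return range(start, start + self.block_size)


class Instance:
    """A graph instance that assigns ids from its locally held block."""

    def __init__(self, pool):
        self.pool = pool
        self.block = iter(())   # empty until the first id is requested

    def next_id(self):
        try:
            return next(self.block)
        except StopIteration:
            # Local block exhausted: claim a fresh one from the global pool.
            self.block = iter(self.pool.claim_block())
            return next(self.block)


pool = GlobalIdPool(block_size=3)
a, b = Instance(pool), Instance(pool)
ids_a = [a.next_id() for _ in range(3)]
ids_b = [b.next_id() for _ in range(3)]
next_a = a.next_id()
```

Because blocks never overlap, two instances can create vertices concurrently without coordinating on each individual id.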
    • THE HOW OF TITAN: EDGE COMPRESSION
    • EDGE COMPRESSION: Natural graphs have a small world, community/cluster property. High intra-connectivity within a community and low inter-connectivity between communities. [Figure: Community 1 and Community 2]
      Watts, D. J., Strogatz, S. H., "Collective Dynamics of Small-World Networks," Nature 393 (6684), pp. 440–442, 1998.
    • EDGE COMPRESSION [Figure: the edge 12345678 --knows--> 12345683]
      Stored in full: 12345678 | 9 | 12345683 (24 bytes)
      Target stored as an offset from the source: 12345678 | 9 | +5
      Stored compactly: 12345678 | 9 | 5 (7 bytes)
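A rough sketch of the idea in plain Python: store the target id as a signed offset from the source id, then write every number with a variable-length encoding. The slides do not specify Titan's byte layout, so the LEB128-style varint and zigzag mapping used here are assumptions, and the resulting size (6 bytes for this edge) differs from the 7 bytes on the slide. All function names are hypothetical.

```python
def encode_varint(n):
    """Encode a non-negative integer in 7-bit groups, low group first."""
    if n < 0:
        raise ValueError("varint requires a non-negative integer")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos=0):
    """Return (value, next position) for the varint starting at `pos`."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def zigzag(n):
    """Map signed to unsigned so that small magnitudes stay small."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(z):
    return z // 2 if z % 2 == 0 else -(z + 1) // 2


def encode_edge(source, label_id, target):
    delta = target - source           # small when both ids are close together
    return encode_varint(source) + encode_varint(label_id) + encode_varint(zigzag(delta))


def decode_edge(data):
    source, pos = decode_varint(data)
    label_id, pos = decode_varint(data, pos)
    z, pos = decode_varint(data, pos)
    return source, label_id, source + unzigzag(z)


encoded = encode_edge(12345678, 9, 12345683)
fixed_width_size = 3 * 8              # three 8-byte values, as on the slide
```

The scheme pays off when adjacent vertices receive nearby ids, which is where the community structure of natural graphs becomes useful.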
    • THE HOW OF TITAN: VERTEX-CENTRIC INDICES
    • VERTEX-CENTRIC INDICES: THE SUPER NODE PROBLEM
      Natural, real-world graphs contain vertices of high degree.
      Even if rare, their degree ensures that they exist on many paths.
      Traversing a high degree vertex means touching numerous incident edges and potentially touching most of the graph in only a few steps.
    • VERTEX-CENTRIC INDICES: A SUPER NODE SOLUTION
      A "super node" only exists from the vantage point of classic "textbook style" graphs.
      In the world of property graphs, intelligent disk-level filtering can interpret a "super node" as a more manageable low-degree vertex.
      Vertex-centric querying utilizes B-Trees and sort orders for speedy lookup of incident edges with particular qualities.
    • VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES. vertex.query() returns 8 edges. [Figure: a vertex with five outgoing likes edges (stars:5, stars:2, stars:2, stars:3, stars:3), two outgoing knows edges, and one incoming knows edge.]
    • VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES. vertex.query().direction(OUT) returns 7 edges: the five likes edges and the two outgoing knows edges.
    • VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES. vertex.query().direction(OUT).labels("likes") returns 5 edges: the likes edges.
    • VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES. vertex.query().direction(OUT).labels("likes").has("stars",5) returns 1 edge: the likes edge with stars:5.
    • VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES. PREDICATES: Query Query.direction(Direction); Query Query.labels(String... labels); Query Query.has(String, Object, Compare); Query Query.has(String, Object); Query Query.range(String, Object, Object). GETTERS: Iterable<Vertex> Query.vertices(); Iterable<Edge> Query.edges().
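The narrowing from 8 edges to 1 can be reproduced with a small fluent query over an in-memory edge list. `SketchEdge` and `SketchQuery` are hypothetical stand-ins, not the Blueprints `Query` interface, and the filtering here happens in memory. The point of a pushdown predicate is that the storage layer would do this filtering before edges are loaded; the sketch only shows how the predicates compose before `edges()` evaluates them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical edge record: a label, a direction flag, and properties.
final class SketchEdge {
    final String label;
    final boolean outgoing;
    final Map<String, Object> properties;

    SketchEdge(String label, boolean outgoing, Map<String, Object> properties) {
        this.label = label;
        this.outgoing = outgoing;
        this.properties = properties;
    }
}

// Hypothetical fluent query: predicates accumulate and run only when edges() is called.
final class SketchQuery {
    private final List<SketchEdge> incident;
    private Predicate<SketchEdge> filter = e -> true;

    SketchQuery(List<SketchEdge> incident) {
        this.incident = incident;
    }

    SketchQuery outgoing() {
        filter = filter.and(e -> e.outgoing);
        return this;
    }

    SketchQuery labels(String... labels) {
        List<String> allowed = Arrays.asList(labels);
        filter = filter.and(e -> allowed.contains(e.label));
        return this;
    }

    SketchQuery has(String key, Object value) {
        filter = filter.and(e -> value.equals(e.properties.get(key)));
        return this;
    }

    List<SketchEdge> edges() {
        List<SketchEdge> result = new ArrayList<>();
        for (SketchEdge e : incident) {
            if (filter.test(e)) {
                result.add(e);
            }
        }
        return result;
    }

    // The eight incident edges drawn on the slides.
    static List<SketchEdge> slideEdges() {
        List<SketchEdge> edges = new ArrayList<>();
        for (int stars : new int[] {5, 2, 2, 3, 3}) {
            edges.add(new SketchEdge("likes", true,
                    Collections.<String, Object>singletonMap("stars", stars)));
        }
        edges.add(new SketchEdge("knows", true, Collections.<String, Object>emptyMap()));
        edges.add(new SketchEdge("knows", true, Collections.<String, Object>emptyMap()));
        edges.add(new SketchEdge("knows", false, Collections.<String, Object>emptyMap()));
        return edges;
    }
}
```

`new SketchQuery(SketchQuery.slideEdges()).outgoing().labels("likes").has("stars", 5).edges()` yields a single edge, mirroring the last step of the slide sequence.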
    • VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING. [Figure: a vertex with battled edges (time:1, time:2, time:12) and knows edges; the battled edges are sorted by time and split into "battled w/ time 1-5" and "battled w/ time 5-10".] TitanLabel battled = g.makeType().name("battled").primaryKey(time).makeEdgeLabel();
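The effect of sorting a label's edges by a key such as time can be sketched with a sorted map: a time-range query becomes a range lookup instead of a scan over every incident edge. `SortedIncidence` is a hypothetical class, a `TreeMap` stands in for the on-disk sort order, and the neighbor names used below are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: the incident edges of one label, kept sorted by a "time" key.
final class SortedIncidence {
    private final TreeMap<Integer, List<String>> byTime = new TreeMap<>();

    void add(int time, String otherVertex) {
        byTime.computeIfAbsent(time, t -> new ArrayList<>()).add(otherVertex);
    }

    // Neighbors reached by edges with fromTime <= time < toTime.
    // The sorted map locates the range directly; edges outside it are never visited.
    List<String> range(int fromTime, int toTime) {
        List<String> result = new ArrayList<>();
        for (List<String> bucket : byTime.subMap(fromTime, true, toTime, false).values()) {
            result.addAll(bucket);
        }
        return result;
    }
}
```

After `add(1, "nemean")`, `add(2, "hydra")`, and `add(12, "cerberus")`, the call `range(1, 5)` returns the first two neighbors and `range(5, 10)` returns an empty list. The half-open interval is a choice made for this sketch; the slide does not say whether its ranges are inclusive.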
    • VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING. [Figure: a vertex with brother, father, mother, knows, and battled edges; brother, father, and mother are grouped as "family".] TypeGroup family = TypeGroup.of(2,"family"); TitanLabel father = g.makeType().name("father").group(family).makeEdgeLabel(); TitanLabel mother = g.makeType().name("mother").group(family).makeEdgeLabel(); TitanLabel brother = g.makeType().name("brother").group(family).makeEdgeLabel();
    • VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING. vertex.query().group("family")...
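A plausible reading of type groups is that the group ID leads the sort key, so all labels in a group sit next to each other and one range scan retrieves them. The sketch below illustrates that idea only: `GroupedIncidence`, the key layout, and the label IDs are assumptions, not Titan's storage format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: edges sorted by a composite (groupId, labelId) key.
final class GroupedIncidence {
    private final TreeMap<Long, List<String>> edges = new TreeMap<>();

    // Group ID in the high 32 bits, label ID in the low 32 bits (non-negative IDs assumed).
    private static long key(int groupId, int labelId) {
        return ((long) groupId << 32) | (labelId & 0xFFFFFFFFL);
    }

    void add(int groupId, int labelId, String otherVertex) {
        edges.computeIfAbsent(key(groupId, labelId), k -> new ArrayList<>()).add(otherVertex);
    }

    // One contiguous range covers every label of the group.
    List<String> group(int groupId) {
        List<String> result = new ArrayList<>();
        long from = key(groupId, 0);
        long to = key(groupId + 1, 0);
        for (List<String> bucket : edges.subMap(from, true, to, false).values()) {
            result.addAll(bucket);
        }
        return result;
    }
}
```

With father, mother, and brother edges stored under group 2 and knows and battled edges under another group, `group(2)` returns only the three family edges without inspecting the rest.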
    • THAT IS HOW TITAN WORKS: DATA MANAGEMENT, EDGE COMPRESSION, VERTEX-CENTRIC INDICES
    • WHAT IF YOU WANTED TO CREATE TWITTER FROM SCRATCH? SIMULATING TWITTER
    • 3 BILLION EDGES, 100 MILLION VERTICES, 10000 CONCURRENT USERS, 50 MACHINES, 1 GRAPH DATABASE. COMING JULY 2012
    • PART 3: THE FUTURE OF AURELIUS. MARKO A. RODRIGUEZ, MATTHIAS BROECHELER
    • AURELIUS GRAPH COMPUTING STORY. Titan as the highly scalable, distributed graph database solution (OLTP). Titan as the source (and potential sink) for other graph processing solutions (OLAP).
    • FAUNUS: GOD OF HERDS
    • FAUNUS: PATH ALGEBRA FOR HADOOP. [Figure: hercules battled cretan bull and theseus battled cretan bull; A · Aᵀ ∘ n(I) derives an ally edge between hercules and theseus.] Derived graphs are single-relational and are typically much smaller than their multi-relational source. Therefore, derived graphs can be subjected to "textbook-style" graph algorithms in both a meaningful and efficient manner. WHO IS THE MOST CENTRAL ALLY?
    • FAUNUS: PATH ALGEBRA FOR HADOOP. B = A · Aᵀ ∘ n(I); B · B ∘ n(I). [Figure: derived ally edges.] "My allies' allies are my allies." (A · Aᵀ)² ∘ n(I)
    • FAUNUS: PATH ALGEBRA FOR HADOOP. Used for global graph operations. Implements the multi-relational path algebra as a collection of Map/Reduce operations. Reduces a massive property graph into a smaller, semantically rich single-relational graph. Project codename: TinkerPoop. Support for "HadoopGraph" and HDFS file formats. Rodriguez M.A., Shinavier, J., "Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms," Journal of Informetrics, 4(1), pp. 29-41, 2009. http://arxiv.org/abs/0806.2274
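The ally derivation can be worked through on a tiny example. The sketch below assumes A is the adjacency matrix of the battled relation, the second factor is its transpose, ∘ is the element-wise (Hadamard) product, and n(I) is the complement of the identity matrix, which zeroes the diagonal so that nobody is their own ally. It runs on plain in-memory arrays; Faunus is described as doing this with Map/Reduce, which is not shown. `PathAlgebra` is a hypothetical name.

```java
// Hypothetical sketch of A · Aᵀ ∘ n(I) on small dense matrices.
final class PathAlgebra {
    static int[][] multiply(int[][] x, int[][] y) {
        int rows = x.length, inner = y.length, cols = y[0].length;
        int[][] out = new int[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                for (int k = 0; k < inner; k++) {
                    out[i][j] += x[i][k] * y[k][j];
                }
            }
        }
        return out;
    }

    static int[][] transpose(int[][] x) {
        int[][] out = new int[x[0].length][x.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[0].length; j++) {
                out[j][i] = x[i][j];
            }
        }
        return out;
    }

    // Element-wise product with n(I): keeps off-diagonal entries, zeroes the diagonal.
    static int[][] removeDiagonal(int[][] x) {
        int[][] out = new int[x.length][x.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x.length; j++) {
                out[i][j] = (i == j) ? 0 : x[i][j];
            }
        }
        return out;
    }

    // Entry (i, j) counts the opponents that i and j both battled.
    static int[][] deriveAllies(int[][] battled) {
        return removeDiagonal(multiply(battled, transpose(battled)));
    }
}
```

With vertices ordered hercules, theseus, cretan bull and battled edges from hercules and from theseus to the cretan bull, `deriveAllies` produces a matrix whose only nonzero entries are (hercules, theseus) and (theseus, hercules), each with value 1: the derived ally edge from the slide.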
    • FULGORA: GODDESS OF LIGHTNING
    • FULGORA: AN EFFICIENT IN-MEMORY GRAPH ENGINE. Non-transactional, in-memory graph engine. It is not a "database." Process ~90 billion edges in 68 GB of RAM, assuming a small-world topology. Perform complex graph algorithms in memory: global graph analysis, multi-relational graph analysis. Similar in spirit to Twitter's Cassovary: https://github.com/twitter/cassovary
    • THE AURELIUS OLAP FLOW. Stores a massive-scale property graph. Generates a large-scale single-relational graph (Map/Reduce). Analyzes compressed, large-scale single or multi-relational graphs in memory (load into RAM on a single machine). Update graph with derived edges (example: an ally edge between hercules and theseus). Update element properties with algorithm results (example: ally_centrality:0.0123 on hercules). Output to a stats package.
    • AURELIUS USE OF BLUEPRINTS. Aurelius products use the Blueprints API so any graph product can communicate with any other graph product. The code for graph databases, frameworks, algorithms, and batch processing is written in terms of the Blueprints API. Aurelius encourages developers to use Blueprints/TinkerPop in order to grow a rich ecosystem of interoperable graph technologies.
    • THE GRAPH LANDSCAPE REPRISE. [Figure: axes are Speed of Traversal/Process and Size of Graph/Structure.] * Not to scale. Did not want to overlap logos.
    • NEXT STEPS. Make use of and/or contribute to the free, open source Titan product. Learn about applying graph theory and network science. http://thinkaurelius.com http://thinkaurelius.github.com/titan/
    • THANK YOU
    • CREDITS. PRESENTERS: MARKO A. RODRIGUEZ, MATTHIAS BROECHELER. FINANCIAL SUPPORT: PEARSON EDUCATION, AURELIUS. LOCATION PROVISIONS: JIVE SOFTWARE. MANY THANKS TO: DAN LAROCQUE, TINKERPOP COMMUNITY, STEPHEN MALLETTE, BOBBY NORTON, KETRINA YIM