fred                 santa fe              albuquerque                          saberhagen                               2...
marko                                                                             fredrodriguez             santa fe      ...
marko                                                                               fredrodriguez             santa fe    ...
TRANSPORTATION GRAPH  marko                                                                               fredrodriguez   ...
SOCIAL INFLUENCE   Who are the most influential people in    java, mathematics, art, surreal art, politics, ...?   Which re...
PATTERN IDENTIFICATION This connectivity pattern is a sign of financial fraud.  When this motif is found, a red flag will be...
KNOWLEDGE DISCOVERYThe terms "ice", "fans", "stanley cup,"                                             WIKIPEDIA GRAPH are...
WORLD MODEL
WORLD PROCESSES  WORLD MODEL
WORLD PROCESSES  WORLD MODEL  A single world model and various types of traversers moving through that mode...
COMPUTING: PROCESS, STRUCTURE  GRAPH-BASED COMPUTING: TRAVERSAL, GRAPH
GRAPH COMPUTING    ENGINES
MEMORY-BASED GRAPHSGraph FrameworkApplication                                         NetworkX                            ...
DISK-BASED GRAPHSGraph Database                                                              Neo4j  Application    Applica...
CLUSTER-BASED GRAPHS   Bulk Synchronous Parallel Processing         Application              Application                  ...
MEMORY-BASED GRAPHS  Graph size is constrained by the local machine's RAM. Rich graph algorithm and visualization packages. Oriente...
MEMORY-BASED GRAPHS  Graph size is constrained by the local machine's RAM. Rich graph algorithm and visualization packages. Oriente...
MEMORY-BASED GRAPHS  Graph size is constrained by the local machine's RAM. Rich graph algorithm and visualization packages. Oriente...
TINKERPOP                                                   Support for various graph vendorsOpen source graph product gro...
TINKERPOP                                             Graph                                             Server            ...
TINKERPOP INTEGRATION  http://tinkerpop.com
AND NOW THERE IS ANOTHER...
TITAN
PART 2: INTRODUCTION TO TITAN  MATTHIAS BROECHELER
WHY CREATE TITAN?  A number of Aurelius clients... ...need to represent and process graphs at the 100+ b...
TITAN'S KEY FEATURES  Titan provides... ..."infinite size" graphs and "unlimited" users by means of a...
matthias$
matthias$ wget http://thinkaurelius/titan.zip  % Total    % Received % Xferd Average Speed    Time     Time100 99999    0 ...
matthias$ wget http://thinkaurelius/titan.zip  % Total    % Received % Xferd Average Speed    Time     Time100 99999    0 ...
matthias$ wget http://thinkaurelius/titan.zip  % Total    % Received % Xferd Average Speed    Time     Time100 99999    0 ...
matthias$ wget http://thinkaurelius/titan.zip  % Total    % Received % Xferd Average Speed    Time     Time100 99999    0 ...
gremlin> g = TitanFactory.open('/tmp/local-titan')
==>titangraph[local:/tmp/local-titan]
DE                                                           MO                                                       INE ...
gremlin> g.createKeyIndex('name',Vertex.class)
==>null
gremlin> g.stopTransaction(SUCCESS)
==>null
name:saturn        name:sky                       name:sea                                                          type:t...
name:saturn        name:sky                       name:sea                                                     type:titan ...
name:saturn        name:sky                       name:sea                                                     type:titan ...
name:saturn        name:sky                       name:sea                                                     type:titan ...
THAT WAS TITAN LOCAL.  NEXT IS TITAN DISTRIBUTED.  Broecheler, M., Pugliese, A., Subrahmanian, V.S., "COSI: Cloud Orie...
BACKEND AGNOSTIC     -OR-
TITAN DISTRIBUTED                       VIA CASSANDRAtitan$ bin/gremlin.sh         ,,,/         (o o)-----oOOo-(_)-oOOo---...
INHERITED FEATURES      Continuously available with no single point of failure.      No write bottlenecks to the graph as ...
TITAN DISTRIBUTED                         VIA HBASEtitan$ bin/gremlin.sh         ,,,/         (o o)-----oOOo-(_)-oOOo-----...
INHERITED FEATURES      Strictly consistent reads and writes.      Linear scalability with the addition of machines.      ...
TITAN AND THE CAP THEOREM                 Partitionability    y                                         Ava        c    te...
Titan is all about ...
Titan is all about numerous concurrent users...
Titan is all about numerous concurrent users...                                      high availability....
Titan is all about numerous concurrent users...                                      high availability....                ...
THE HOW OF TITAN  DATA MANAGEMENT  EDGE COMPRESSION  VERTEX-CENTRIC INDICES
THE HOW OF TITAN  DATA MANAGEMENT
DATA MANAGEMENT                MAIN DESIGN PRINCIPLESImmutable, Atomic Edges                           Optimistic Concurre...
DATA MANAGEMENT                              TYPE DEFINITION      Datatype Constraints                                  Ed...
DATA MANAGEMENT                                  TYPE DEFINITION           Endogenous Indices  g.createKeyIndex("name",Ver...
DATA MANAGEMENT  LOCKING SYSTEM  Ensures consistency over non-consistent storage backends.  hercul...
DATA MANAGEMENT     ID MANAGEMENT           [0,1,2,3,4,5,6,7,8,9,10,11] Global ID Pool Maintained by Storage Engine
DATA MANAGEMENT           ID MANAGEMENT[0,1,2]                                     [3,4,5]             [0,1,2,3,4,5,6,7,8,...
THE HOW OF TITAN  EDGE COMPRESSION
EDGE COMPRESSION              Natural graphs have a small world, community/cluster property.                     Community...
EDGE COMPRESSION
EDGE COMPRESSION  knows  12345678  12345683
EDGE COMPRESSION             knows  12345678           12345683
EDGE COMPRESSION             knows  12345678           12345683  12345678     9     12345683   24 bytes
EDGE COMPRESSION             knows  12345678           12345683  12345678     9     12345683   24 bytes  12345678     9   ...
EDGE COMPRESSION                   knows   12345678                12345683   12345678          9     12345683   24 bytes ...
THE HOW OF TITAN  VERTEX-CENTRIC INDICES
VERTEX-CENTRIC INDICES  THE SUPER NODE PROBLEM  Natural, real-world graphs contain vertices of high degree. Even if rare...
VERTEX-CENTRIC INDICES  A SUPER NODE SOLUTION  A "super node" only exists from the vantage point of classic "textboo...
VERTEX-CENTRIC INDICES  PUSHDOWN PREDICATES  vertex.query()                               stars:5                      lik...
VERTEX-CENTRIC INDICES  PUSHDOWN PREDICATES  vertex.query().direction(OUT)                              stars:5           ...
VERTEX-CENTRIC INDICES  PUSHDOWN PREDICATES  vertex.query().direction(OUT)    .labels("likes")                            ...
VERTEX-CENTRIC INDICES  PUSHDOWN PREDICATES  vertex.query().direction(OUT)    .labels("likes").has("stars",5)             ...
VERTEX-CENTRIC INDICES             PUSHDOWN PREDICATES             Query   Query.direction(Direction)PREDICATES           ...
VERTEX-CENTRIC INDICES  DISK-LEVEL SORTING/INDEXING  battled time:1  time:2  battled  time:12 ...
VERTEX-CENTRIC INDICES  DISK-LEVEL SORTING/INDEXING  battled time:1  time:2  battled  b...
VERTEX-CENTRIC INDICES  DISK-LEVEL SORTING/INDEXING  battled time:1  battled w/ time 1-5 ...
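The idea of sorting a vertex's edges so that "battled w/ time 1-5" becomes a contiguous slice can be sketched in plain Python. This is only an illustration of the principle, not Titan's storage layout: the tuple encoding, the neighbor names, the `time` values on the non-battled edges, and the helper `edges_in_range` are all assumptions made for the example.

```python
import bisect

# Hypothetical incident edges of one vertex, encoded as (label, time, neighbor).
# Keeping the list sorted by (label, time) mimics a per-vertex sort order.
edges = sorted([
    ("battled", 1, "nemean"),
    ("battled", 2, "hydra"),
    ("battled", 12, "cerberus"),
    ("father", 0, "jupiter"),
    ("mother", 0, "alcmene"),
])

def edges_in_range(label, lo, hi):
    """Return edges with the given label and lo <= time <= hi (integer times)."""
    # Two binary searches locate the slice; no other edge is inspected.
    start = bisect.bisect_left(edges, (label, lo))
    end = bisect.bisect_left(edges, (label, hi + 1))
    return edges[start:end]

early_battles = edges_in_range("battled", 1, 5)
```

Because the slice is found by binary search, the cost depends on the number of matching edges rather than on the total degree of the vertex, which is the property that matters for high-degree vertices.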
VERTEX-CENTRIC INDICES  DISK-LEVEL SORTING/INDEXING  brother  father  mother  knows  battled
VERTEX-CENTRIC INDICES  DISK-LEVEL SORTING/INDEXING  brother  father  mother  knows  battled
VERTEX-CENTRIC INDICES  DISK-LEVEL SORTING/INDEXING  brother  father  family  TypeGroup family = ...
VERTEX-CENTRIC INDICES  DISK-LEVEL SORTING/INDEXING  brother  father  family  mother  knows  battled ...
THAT IS HOW TITAN WORKS  DATA MANAGEMENT  EDGE COMPRESSION  VERTEX-CENTRIC INDICES
WHAT IF YOU WANTED TO CREATE  TWITTER FROM SCRATCH?     SIMULATING TWITTER
3 BILLION EDGES   100 MILLION VERTICES10000 CONCURRENT USERS         50 MACHINES     1 GRAPH DATABASE     COMING JULY 2012
PART 3: THE FUTURE OF AURELIUS  MARKO A. RODRIGUEZ  MATTHIAS BROECHELER
AURELIUS GRAPH COMPUTING STORY  Titan as the highly scalable, distributed graph database solution.  OLTP
AURELIUS GRAPH COMPUTING STORY  Titan as the highly scalable, distributed graph database solution. Titan as the source (an...
FAUNUS  GOD OF HERDS
FAUNUS  PATH ALGEBRA FOR HADOOP  battled  battled ...
FAUNUS  PATH ALGEBRA FOR HADOOP  B = A · A ◦ n(I)  B · B ◦ n(I) ...
FAUNUS    PATH ALGEBRA FOR HADOOP                       Used for global graph operations.                              Imp...
FULGORA  GODDESS OF LIGHTNING
FULGORA  AN EFFICIENT IN-MEMORY GRAPH ENGINE  Non-transactional, in-memor...
THE AURELIUS OLAP FLOW  Stores a massive-scale property graph  An...
THE AURELIUS OLAP FLOW  Stores a massive-scale property graph ...
THE AURELIUS OLAP FLOW  Stores a massive-scale property graph  Analyzes ...
AURELIUS USE OF BLUEPRINTS    Aurelius products use the Blueprints API so any    graph product can communicate with any ot...
THE GRAPH LANDSCAPE                                      REPRISE   Speed of Traversal/Process                             ...
NEXT STEPS                              Make use of and/or contribute to the                               free, open sour...
THANK YOU
CREDITS  PRESENTERS  MARKO A. RODRIGUEZ  MATTHIAS BROECHELER  FINANCIAL SUPPORT  PEARSON EDUCATION  AURELIUS  LOCATION PROVIS...
Titan: The Rise of Big Graph Data
A graph is a data structure composed of vertices/dots and edges/lines. A graph database is a software system used to persist and process graphs. The common conception in today's database community is that there is a tradeoff between the scale of data and the complexity/interlinking of data. To challenge this understanding, Aurelius has developed Titan under the liberal Apache 2 license. Titan supports both the size of modern data and the modeling power of graphs to usher in the era of Big Graph Data. Novel techniques in edge compression, data layout, and vertex-centric indices that exploit significant orders are used to facilitate the representation and processing of a single atomic graph structure across a multi-machine cluster. To ensure ease of adoption by the graph community, Titan natively implements the TinkerPop 2 Blueprints API. This presentation will review the graph landscape, Titan's techniques for scale by distribution, and a collection of satellite graph technologies to be released by Aurelius in the coming summer months of 2012.


Transcript of "Titan: The Rise of Big Graph Data"

  1. 1. TITAN: THE RISE OF BIG GRAPH DATA  MARKO A. RODRIGUEZ  MATTHIAS BROECHELER  http://THINKAURELIUS.COM
  2. 2. ABSTRACT  A graph is a data structure composed of vertices/dots and edges/lines. A graph database is a software system used to persist and process graphs. The common conception in today's database community is that there is a tradeoff between the scale of data and the complexity/interlinking of data. To challenge this understanding, Aurelius has developed Titan under the liberal Apache 2 license. Titan supports both the size of modern data and the modeling power of graphs to usher in the era of Big Graph Data. Novel techniques in edge compression, data layout, and vertex-centric indices that exploit significant orders are used to facilitate the representation and processing of a single atomic graph structure across a multi-machine cluster. To ensure ease of adoption by the graph community, Titan natively implements the TinkerPop 2 Blueprints API. This presentation will review the graph landscape, Titan's techniques for scale by distribution, and a collection of satellite graph technologies to be released by Aurelius in the coming summer months of 2012.
  3. 3. SPEAKER BIOGRAPHIES Dr. Marko A. Rodriguez is the founder of the graph consulting firm Aurelius. He has focused his academic and commercial career on the theoretical and applied aspects of graphs. Marko is a cofounder of TinkerPop and the primary developer of the Gremlin graph traversal language. Dr. Matthias Broecheler has been researching and developing large-scale graph database systems for many years in both academia and in his role as a cofounder of the Aurelius graph consulting firm. He is the primary developer of the distributed graph database Titan. Matthias focuses most of his time and effort on novel OLTP and OLAP graph processing solutions.
  4. 4. SPONSORS  As the leading education services company, Pearson is serious about evolving how the world learns. We apply our deep education experience and research, invest in innovative technologies, and promote collaboration throughout the education ecosystem. Real change is our commitment and its results are delivered through connecting capabilities to create actionable, scalable solutions that improve access, affordability, and achievement.  Aurelius is a team of software engineers and scientists committed to applying graph theory and network science to problems in numerous domains. Aurelius develops the theory and technology whereby graphs can be used to model, understand, predict, and influence the behavior of complex, interrelated social, economic, and physical networks.  Jive is the pioneer and world's leading provider of social business solutions. Our products apply powerful technology that helps people connect, communicate and collaborate to get more work done and solve their biggest business challenges. Millions of users and many of the world's most successful companies rely on Jive day in and day out to get work done, serve their customers and stay ahead of their competitors.
  5. 5. OUTLINE  1. THE GRAPH LANDSCAPE: An introduction to graph computing. Graph technologies on the market today.  2. INTRODUCTION TO TITAN: Getting up and running with Titan. Titan's techniques for scalability.  3. THE FUTURE OF AURELIUS: Satellite technologies and the OLAP story. The graph landscape reprise.
  6. 6. PART 1: THE GRAPH LANDSCAPE  MARKO A. RODRIGUEZ
  7. 7. GRAPH
  8. 8. EDGE  VERTEX  GRAPH
  9. 9. EDGE  VERTEX  GRAPH  G = (V, E): Graph, Vertices, Edges
  10. 10. G = (V, E)  Classic Textbook Graph Structure
  11. 11. A homogeneous set of vertices... V
  12. 12. ...connected by a homogeneous set of edges. E
  13. 13. RESTRICTED MODELING  People and follows relationships...
  14. 14. RESTRICTED MODELING  People and follows relationships... ...xor webpages and citations.
  15. 15. AN INTEGRATED MODEL IS TYPICALLY DESIRED  [Figure: a single graph mixing the edge labels references, createdBy, follows, and mentions]
  16. 16. AN INTEGRATED MODEL IS USEFUL  [Figure: a single graph mixing the edge labels references, createdBy, follows, and mentions]  Allows for more interesting/novel algorithms (beyond "textbook" graph algorithms). Allows for a universal model of things and their relationships (a single, unified model of a domain of interest).
  17. 17. THE PROPERTY GRAPH  G = (V, E, λ)  Current Popular Graph Structure  * Directed, attributed, edge-labeled graph  * Multi-relational graph with key/value pairs on the elements
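A property graph can be sketched as a small in-memory data structure. The following is a minimal plain-Python illustration, not the Blueprints API: the class and method names (`PropertyGraph`, `add_vertex`, `add_edge`, `out`) are invented for this example, and λ is read loosely here as whatever attaches a label to each edge and key/value pairs to each element.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)  # key/value pairs

@dataclass
class Edge:
    out_v: int   # id of the tail vertex (edge leaves this vertex)
    label: str   # edge label, e.g. "mother"
    in_v: int    # id of the head vertex (edge points to this vertex)
    properties: dict = field(default_factory=dict)

class PropertyGraph:
    """Directed, edge-labeled multigraph with properties on elements."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, vid, **props):
        self.vertices[vid] = Vertex(vid, dict(props))
        return self.vertices[vid]

    def add_edge(self, out_v, label, in_v, **props):
        edge = Edge(out_v, label, in_v, dict(props))
        self.edges.append(edge)
        return edge

    def out(self, vid, *labels):
        """Vertices reached over outgoing edges; all labels if none given."""
        return [self.vertices[e.in_v] for e in self.edges
                if e.out_v == vid and (not labels or e.label in labels)]

g = PropertyGraph()
g.add_vertex(0, name="hercules")
g.add_vertex(1, name="alcmene", type="human")
g.add_edge(0, "mother", 1)
mothers = [v.properties["name"] for v in g.out(0, "mother")]
```

Scanning the whole edge list in `out` keeps the sketch short; a real graph database indexes edges per vertex instead.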
  18. 18. VERTEX
  19. 19. PROPERTIES name:hercules VERTEX
  20. 20. PROPERTIES  KEY VALUE  name:hercules  VERTEX
  21. 21. name:hercules
  22. 22. name:hercules mother name:alcmene type:human
  23. 23. name:hercules LABEL mother EDGE name:alcmene type:human
  24. 24. name:hercules mother name:alcmene type:human
  25. 25. [Figure: hercules -mother-> alcmene (type:human); hercules -father-> jupiter (type:god)]
  26. 26. IS HERCULES A DEMIGOD?  DEMIGOD = HALF HUMAN + HALF GOD  [Figure: hercules -mother-> alcmene (type:human); hercules -father-> jupiter (type:god)]
  27. 27. [Figure: the same graph]
      gremlin> hercules
      ==>v[0]
  28. 28. [Figure: the same graph]
      gremlin> hercules.out('mother','father')
      ==>v[1]
      ==>v[2]
  29. 29. DEMIGOD = HALF HUMAN + HALF GOD  [Figure: the same graph]
      gremlin> hercules.out('mother','father').type
      ==>human
      ==>god
  30. 30. DEMIGOD = HALF HUMAN + HALF GOD  [Figure: the same graph, with type:demigod added to hercules]
      gremlin> hercules.type = 'demigod'
      ==>demigod
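The demigod check on the slides above can be mirrored in plain Python. This is a sketch with dictionaries standing in for the graph, not Gremlin; the vertex ids follow the console output on the slides, and the helper `out` is a name invented for this example.

```python
# Vertices keyed by id, each with its key/value properties.
vertices = {
    0: {"name": "hercules"},
    1: {"name": "alcmene", "type": "human"},
    2: {"name": "jupiter", "type": "god"},
}

# Directed, labeled edges as (tail, label, head).
edges = [(0, "mother", 1), (0, "father", 2)]

def out(vid, *labels):
    """Ids of vertices reached over outgoing edges with one of the labels."""
    return [head for tail, label, head in edges
            if tail == vid and label in labels]

# Step 1: walk to both parents. Step 2: read their types.
parent_types = {vertices[p]["type"] for p in out(0, "mother", "father")}

# Step 3: half human + half god means demigod; write the result back.
if parent_types == {"human", "god"}:
    vertices[0]["type"] = "demigod"
```

The inference is two traversal steps followed by a property write, which is the same shape as the three console commands on the slides.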
  31. 31. COMPUTING: PROCESS, STRUCTURE
  32. 32. COMPUTING: PROCESS, STRUCTURE  TRAVERSAL, GRAPH
  33. 33. COMPUTING: PROCESS, STRUCTURE  GRAPH-BASED COMPUTING: TRAVERSAL, GRAPH
  34. 34. WHY GRAPH-BASED COMPUTING?
  35. 35. WHY GRAPH-BASED COMPUTING?  INTUITIVE MODELING
  36. 36. WHY GRAPH-BASED COMPUTING?  INTUITIVE MODELING  EXPRESSIVE QUERYING
  37. 37. WHY GRAPH-BASED COMPUTING?  INTUITIVE MODELING  EXPRESSIVE QUERYING  NUMEROUS ANALYSES: Mixing Patterns, Ranking, Inference, Motifs, Path Expressions, Centrality, Scoring, Geodesics
  38. 38. ANALYSES ARE THE EPIPHENOMENA OF TRAVERSAL  f( ) →
  39. 39. WHAT IS THE SIGNIFICANCE OF GRAPH ANALYSIS?
  40. 40. ANALYSES YIELD INSIGHTS ABOUT THE MODEL  DATA PRODUCTS = DATA-DRIVEN DECISION SUPPORT
  41. 41. RECOMMENDATION  People you may know. (SOCIAL GRAPH)  Products you might like. (RATINGS GRAPH)  Movies you should watch and the friends you should watch them with. (SOCIAL+RATINGS GRAPH)
  42. 42. WHO ELSE MIGHT HERCULES KNOW?  [Figure: "knows" graph over hercules (0), cerberus (1), nemean (2), hydra (3), pluto (4), neptune (5), jupiter (6)]
  43. 43. [Figure: the same "knows" graph]
      gremlin> hercules
      ==>v[0]
  44. 44. [Figure: the same "knows" graph]
      gremlin> hercules.out('knows')
      ==>v[1]
      ==>v[2]
      ==>v[3]
  45. 45. [Figure: the same "knows" graph]
      gremlin> hercules.out('knows').out('knows')
      ==>v[4]
      ==>v[5]
      ==>v[5]
      ==>v[6]
      ==>v[5]
  46. 46. [Figure: the same "knows" graph]
      gremlin> hercules.out('knows').out('knows').groupCount.cap
      ==>v[4]=1
      ==>v[5]=3
      ==>v[6]=1
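The two-hop walk followed by `groupCount` can be reproduced in plain Python with a `Counter`. This is a sketch, not Gremlin: the first-hop edges and the counts come from the console output above, while the exact second-hop wiring (which of cerberus, nemean, and hydra points to which vertex) is an assumption inferred from the order of that output.

```python
from collections import Counter

names = {0: "hercules", 1: "cerberus", 2: "nemean", 3: "hydra",
         4: "pluto", 5: "neptune", 6: "jupiter"}

# Outgoing "knows" edges per vertex id (second-hop wiring is assumed).
knows = {
    0: [1, 2, 3],
    1: [4, 5],
    2: [5],
    3: [6, 5],
}

def friends_of_friends(start):
    """Every vertex two 'knows' hops away, once per distinct path."""
    return [fof
            for friend in knows.get(start, [])
            for fof in knows.get(friend, [])]

two_hops = friends_of_friends(0)          # one entry per path
counts = Counter(two_hops)                # paths grouped by end vertex
best, paths = counts.most_common(1)[0]    # most frequently reached vertex
suggestion = names[best]
```

Neptune is reached over three distinct paths, so it ranks first. A production recommender would usually also drop the start vertex and anyone it already knows before ranking.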
  47. 47. HERCULES PROBABLY KNOWS NEPTUNE  [Figure: the "knows" graph with one additional knows edge]
  48. 48. HERCULES PROBABLY KNOWS NEPTUNE  [Figure: the "knows" graph with one additional knows edge]  THIS IS A "TEXTBOOK STYLE" GRAPH
  49. 49. HERCULES PROBABLY KNOWS NEPTUNE  [Figure: the "knows" graph with brother and father edges added]  ...PROBABLY MORE SO WHEN OTHER TYPES OF EDGES ARE ANALYZED
  50. 50. [Figure: the "knows" graph with brother and father edges]
  51. 51. [Figure: the same graph with a likes edge added]
  52. 52. [Figure: the same graph, labeled SOCIAL GRAPH]
  53. 53. [Figure: SOCIAL GRAPH with a new vertex, human flesh (7)]
  54. 54. [Figure: SOCIAL GRAPH with likes edges to human flesh (7)]
  55. 55. [Figure: SOCIAL GRAPH with likes edges to human flesh (7) and a new vertex, tartarus (8)]
  56. 56. [Figure: SOCIAL GRAPH with likes and dislikes edges to human flesh (7) and tartarus (8)]
  57. 57. [Figure: the same graph, with the likes/dislikes layer labeled RATINGS GRAPH and the knows/brother/father layer labeled SOCIAL GRAPH]
  58. 58. NEMEAN MIGHT LIKE TARTARUS  [Figure: PRODUCT GRAPH (smellsOf, composedOf edges between tartarus (8) and human flesh (7)) on top of the RATINGS GRAPH and SOCIAL GRAPH]  * Collaborative Filtering + Content-Based Recommendation
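The collaborative-filtering half of that recommendation can be sketched in plain Python. The `likes` sets below are hypothetical: the exact wiring of the likes edges is not recoverable from the slide text, so they were chosen only to reproduce the slide's conclusion, and `recommend` is a name invented for this example.

```python
from collections import Counter

# Hypothetical ratings graph: who likes what (assumed, not from the slide).
likes = {
    "nemean": {"human flesh"},
    "cerberus": {"human flesh", "tartarus"},
    "pluto": {"tartarus"},
}

def recommend(user):
    """Score items liked by users who share at least one like with `user`."""
    own = likes[user]
    scores = Counter()
    for other, items in likes.items():
        if other == user:
            continue
        overlap = len(own & items)      # shared tastes: likes -> likedBy
        if overlap == 0:
            continue                    # no common ground, ignore this user
        for item in items - own:        # third hop: what else they like
            scores[item] += overlap
    return scores.most_common()

suggestions = recommend("nemean")
```

The content-based half would add a second source of evidence by following product edges such as smellsOf or composedOf from items the user already likes; it is omitted here to keep the sketch short.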
  59. 59. PATH FINDING  How is this person related to this film? (MOVIE GRAPH)  Which authors of this book also wrote a New York Times bestseller? (BOOK GRAPH)  Which movies are based on a book by a New York Times bestseller? (MOVIE+BOOK GRAPH)
  60. 60. WHO PLAYED HERCULES IN WHAT MOVIE? jupiter hercules 6 0 depictedIn role role depictedInernest arnoldgraves schwarzenegger actor hasActor hasActor actor 11 10 7 8 9 hercules in new york
  61. 61. WHO PLAYED HERCULES IN WHAT MOVIE? [Figure: jupiter (v[6]) and hercules (v[0]) are depictedIn "hercules in new york" (v[7]); the movie hasActor v[8] and v[10]; v[8] has role hercules and actor arnold schwarzenegger (v[9]); v[10] has role jupiter and actor ernest graves (v[11]).]
      gremlin> hercules
      ==>v[0]
  62. 62. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn')
      ==>v[7]
  63. 63. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie')
      ==>v[7]
  64. 64. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor')
      ==>v[8]
      ==>v[10]
  65. 65. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role')
      ==>v[0]
      ==>v[6]
  66. 66. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules)
      ==>v[0]
  67. 67. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2)
      ==>v[8]
  68. 68. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor')
      ==>v[9]
  69. 69. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor').as('star')
      ==>v[9]
  70. 70. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor').as('star').select
      ==>[movie:v[7], star:v[9]]
  71. 71. WHO PLAYED HERCULES IN WHAT MOVIE?
      gremlin> hercules.out('depictedIn').as('movie').out('hasActor').out('role').retain(hercules).back(2).out('actor').as('star').select{it.name}
      ==>[movie:hercules in new york, star:arnold schwarzenegger]
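The traversal built up across the preceding slides can be mimicked in plain Python to make each step's effect concrete. This is only an illustrative sketch, not Gremlin or Titan code: the edge list is reconstructed from the slide figure and the console results, and the helper names `out` and `who_played` are hypothetical.

```python
# Toy movie graph reconstructed from the slide figure (ids and names as shown there).
names = {0: "hercules", 6: "jupiter", 7: "hercules in new york",
         9: "arnold schwarzenegger", 11: "ernest graves"}

# (source vertex, label, target vertex); v[8] and v[10] are the unnamed casting vertices.
edges = [(0, "depictedIn", 7), (6, "depictedIn", 7),
         (7, "hasActor", 8), (7, "hasActor", 10),
         (8, "role", 0), (10, "role", 6),
         (8, "actor", 9), (10, "actor", 11)]


def out(vertex, label):
    """Vertices reachable from `vertex` over outgoing edges carrying `label`."""
    return [t for (s, l, t) in edges if s == vertex and l == label]


def who_played(character):
    """Return (movie name, star name) pairs for a character vertex."""
    results = []
    for movie in out(character, "depictedIn"):        # .out('depictedIn').as('movie')
        for casting in out(movie, "hasActor"):        # .out('hasActor')
            if character in out(casting, "role"):     # .out('role').retain(hercules).back(2)
                for star in out(casting, "actor"):    # .out('actor').as('star')
                    results.append((names[movie], names[star]))
    return results


print(who_played(0))
```

The nested loops make the role of `retain` and `back(2)` visible: the role check filters casting vertices, and the traversal then continues from the surviving casting vertex rather than from the character.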
  72. 72. [Figure: the movie graph from the previous slides: jupiter (v[6]) and hercules (v[0]) depictedIn "hercules in new york" (v[7]), with ernest graves (v[11]) and arnold schwarzenegger (v[9]) attached through hasActor, role, and actor edges.]
  73. 73. [Figure: the same movie graph.]
  74. 74. [Figure: adds "the arms of hercules" (v[12]), connected to hercules by a depictedIn edge.]
  75. 75. [Figure: adds fred saberhagen (v[13]), connected to "the arms of hercules" by a writtenBy edge.]
  76. 76. [Figure: adds albuquerque (v[14]), connected to fred saberhagen by a livesIn edge.]
  77. 77. [Figure: adds santa fe (v[15]), connected to albuquerque by a 25-North edge.]
  78. 78. [Figure: adds marko rodriguez (v[16]), connected to santa fe by a livesIn edge.]
  79. 79. [Figure: adds a thinksHeIs edge between marko rodriguez and hercules.]
  80. 80. [Figure: the full graph with its regions labeled TRANSPORTATION GRAPH, BOOK GRAPH, PROFILE GRAPH, and MOVIE GRAPH.]
  81. 81. SOCIAL INFLUENCE (SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH): Who are the most influential people in java, mathematics, art, surreal art, politics, ...? Which region of the social graph will propagate this advertisement the furthest? Which 3 experts should review this submitted article? Which people should I talk to at the upcoming conference, and what topics should I talk to them about?
  82. 82. PATTERN IDENTIFICATION. TRANSACTION GRAPH: This connectivity pattern is a sign of financial fraud. When this motif is found, a red flag will be raised. DISCUSSION GRAPH: Healthy discourse is typified by a discussion board with a branch factor in this range and a concept clique score in this range.
  83. 83. KNOWLEDGE DISCOVERY. WIKIPEDIA GRAPH: The terms "ice", "fans", and "stanley cup" are classified as "sports". EVIDENTIAL LOGIC GRAPH: Given that all identified birds fly, it can be deduced that all birds fly. If contrary evidence is provided, then this "fact" can be retracted.
  84. 84. WORLD MODEL
  85. 85. WORLD PROCESSES WORLD MODEL
  86. 86. WORLD PROCESSES / WORLD MODEL: A single world model and various types of traversers moving through that model to solve problems.
  87. 87. GRAPH-BASED COMPUTING: STRUCTURE (GRAPH) and PROCESS (TRAVERSAL)
  88. 88. GRAPH COMPUTING ENGINES
  89. 89. MEMORY-BASED GRAPHS (graph frameworks): NetworkX http://networkx.lanl.gov/, iGraph http://igraph.sourceforge.net/, JUNG http://jung.sourceforge.net/
  90. 90. DISK-BASED GRAPHS (graph databases): Neo4j http://neo4j.org/, OrientDB http://orientdb.org, InfiniteGraph http://objectivity.com, DEX http://www.sparsity-technologies.com/dex
  91. 91. CLUSTER-BASED GRAPHS (bulk synchronous parallel processing): Hama http://incubator.apache.org/hama/, Giraph http://incubator.apache.org/giraph/, GoldenOrb http://goldenorbos.org/ * In the same spirit as Google's Pregel
  92. 92. MEMORY-BASED GRAPHS: Graph size is constrained by the local machine's RAM. Rich graph algorithm and visualization packages. Oriented towards "textbook-style" graphs. * Based on typical behavior
  93. 93. MEMORY-BASED GRAPHS: Graph size is constrained by the local machine's RAM. Rich graph algorithm and visualization packages. Oriented towards "textbook-style" graphs. DISK-BASED GRAPHS: Graph size is constrained by local disk. Optimized for local graph algorithms. Oriented towards property graphs. * Based on typical behavior
  94. 94. MEMORY-BASED GRAPHS: Graph size is constrained by the local machine's RAM. Rich graph algorithm and visualization packages. Oriented towards "textbook-style" graphs. DISK-BASED GRAPHS: Graph size is constrained by local disk. Optimized for local graph algorithms. Oriented towards property graphs. CLUSTER-BASED GRAPHS: Graph size is constrained by the cluster's total RAM. Optimized for global graph algorithms. Oriented towards "textbook-style" graphs. * Based on typical behavior
  95. 95. TINKERPOP: Open source graph product group. Support for various graph vendors (* encompassing the various graph computing styles). Simple, well-defined products. Provides a vendor-agnostic graph framework. http://tinkerpop.com * Based on future directions
  96. 96. TINKERPOP: Graph Server, Graph Algorithms, Object-Graph Mapper, Traversal Language, Dataflow Processing, Generic Graph API. http://tinkerpop.com http://${project.name}.tinkerpop.com
  97. 97. TINKERPOP INTEGRATION http://tinkerpop.com
  98. 98. AND NOW THERE IS ANOTHER...
  99. 99. TITAN
  100. 100. PART 2: INTRODUCTION TO TITAN (MATTHIAS BROECHELER)
  101. 101. WHY CREATE TITAN? A number of Aurelius clients... ...need to represent and process graphs at the 100+ billion edge scale with thousands of concurrent transactions. ...need both local graph traversals (OLTP) and batch graph processing (OLAP). ...desire a free, open source distributed graph database.
  102. 102. TITAN'S KEY FEATURES. Titan provides... ..."infinite size" graphs and "unlimited" users by means of a distributed storage engine. ...real-time local traversals (OLTP) and support for global batch processing via Hadoop (OLAP). ...distribution via the liberal, free, open source Apache2 license.
  103. 103. matthias$
  104. 104. matthias$ wget http://thinkaurelius/titan.zip
      % Total % Received % Xferd Average Speed Time Time
      100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01
      matthias$
  105. 105. matthias$ wget http://thinkaurelius/titan.zip
      % Total % Received % Xferd Average Speed Time Time
      100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01
      matthias$ unzip titan.zip
      Archive: titan.zip
        creating: titan/
        ...
      matthias$
  106. 106. matthias$ wget http://thinkaurelius/titan.zip
      % Total % Received % Xferd Average Speed Time Time
      100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01
      matthias$ unzip titan.zip
      Archive: titan.zip
        creating: titan/
        ...
      matthias$ cd titan
      titan$
  107. 107. matthias$ wget http://thinkaurelius/titan.zip
      % Total % Received % Xferd Average Speed Time Time
      100 99999 0 99999 0 0 11078 0 --:--:-- 0:01:01
      matthias$ unzip titan.zip
      Archive: titan.zip
        creating: titan/
        ...
      matthias$ cd titan
      titan$ bin/gremlin.sh
               \,,,/
               (o o)
      -----oOOo-(_)-oOOo-----
      gremlin>
  108. 108. gremlin> g = TitanFactory.open('/tmp/local-titan')
      ==>titangraph[local:/tmp/local-titan]
  109. 109. [LOCAL MACHINE DEMO]
      gremlin> g = TitanFactory.open('/tmp/local-titan')
      ==>titangraph[local:/tmp/local-titan]
  110. 110. gremlin> g.createKeyIndex('name',Vertex.class)
      ==>null
      gremlin> g.stopTransaction(SUCCESS)
      ==>null
  111. 111. [Figure: the Graph of the Gods. Vertices: saturn (titan); sky, sea, tartarus (location); jupiter, neptune, pluto (god); hercules (demigod); alcmene (human); nemean, hydra, cerberus (monster). Edge labels: father, mother, brother, lives, pet, and battled (time:1, time:2, time:12).]
      gremlin> g.loadGraphML('data/graph-of-the-gods.xml')
      ==>null
      * The Graph of the Gods is a toy dataset distributed with Titan
  112. 112. [Figure: the Graph of the Gods.]
      gremlin> hercules = g.V('name','hercules').next()
      ==>v[24]
  113. 113. [Figure: the Graph of the Gods.]
      gremlin> hercules.out('mother','father')
      ==>v[44]
      ==>v[16]
  114. 114. [Figure: the Graph of the Gods.]
      gremlin> hercules.out('mother','father').name
      ==>alcmene
      ==>jupiter
  115. 115. THAT WAS TITAN LOCAL. NEXT IS TITAN DISTRIBUTED. Broecheler, M., Pugliese, A., Subrahmanian, V.S., "COSI: Cloud Oriented Subgraph Identification in Massive Social Networks," Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, pp. 248-255, 2010. http://www.knowledgefrominformation.com/2010/08/01/cosi-cloud-oriented-subgraph-identification-in-massive-social-networks/
  116. 116. BACKEND AGNOSTIC -OR-
  117. 117. TITAN DISTRIBUTED VIA CASSANDRA
      titan$ bin/gremlin.sh
               \,,,/
               (o o)
      -----oOOo-(_)-oOOo-----
      gremlin> conf = new BaseConfiguration();
      ==>org.apache.commons.configuration.BaseConfiguration@763861e6
      gremlin> conf.setProperty("storage.backend","cassandra");
      gremlin> conf.setProperty("storage.hostname","77.77.77.77");
      gremlin> g = TitanFactory.open(conf);
      ==>titangraph[cassandra:77.77.77.77]
      gremlin>
      * There are numerous graph configurations: https://github.com/thinkaurelius/titan/wiki/Graph-Configuration
  118. 118. INHERITED FEATURES: Continuously available with no single point of failure. No write bottlenecks to the graph as there is no master/slave architecture. Built-in replication ensures data is available during machine failure. Caching layer ensures that continuously accessed data is available in memory. Elastic scalability allows for the introduction and removal of machines. Cassandra available at http://cassandra.apache.org/
  119. 119. TITAN DISTRIBUTED VIA HBASE
      titan$ bin/gremlin.sh
               \,,,/
               (o o)
      -----oOOo-(_)-oOOo-----
      gremlin> conf = new BaseConfiguration();
      ==>org.apache.commons.configuration.BaseConfiguration@763861e6
      gremlin> conf.setProperty("storage.backend","hbase");
      gremlin> conf.setProperty("storage.hostname","77.77.77.77");
      gremlin> g = TitanFactory.open(conf);
      ==>titangraph[hbase:77.77.77.77]
      gremlin>
      * There are numerous graph configurations: https://github.com/thinkaurelius/titan/wiki/Graph-Configuration
  120. 120. INHERITED FEATURES: Strictly consistent reads and writes. Linear scalability with the addition of machines. Base classes for backing Hadoop MapReduce jobs with HBase tables. HDFS-based data replication. Generally good integration with the tools in the Hadoop ecosystem. HBase available at http://hbase.apache.org/
  121. 121. TITAN AND THE CAP THEOREM [Figure: a triangle whose sides are labeled Consistency, Availability, and Partitionability.]
  122. 122. Titan is all about ...
  123. 123. Titan is all about numerous concurrent users...
  124. 124. Titan is all about numerous concurrent users... high availability...
  125. 125. Titan is all about numerous concurrent users... high availability... dynamic scalability...
  126. 126. THE HOW OF TITAN: DATA MANAGEMENT, EDGE COMPRESSION, VERTEX-CENTRIC INDICES
  127. 127. THE HOW OF TITAN: DATA MANAGEMENT
  128. 128. DATA MANAGEMENT: MAIN DESIGN PRINCIPLES. Immutable, Atomic Edges. Optimistic Concurrency Control. Fine-Grained Locking Control. [Figure: the edge hercules -battled-> cerberus in three versions: (1) no properties, (2) time:12, (3) time:12 and successful:true; + and - signs mark edge versions.]
  129. 129. DATA MANAGEMENT: TYPE DEFINITION
      Datatype Constraints:
        TitanKey timeKey = g.makeType().name("time").dataType(Integer.class)
        [Figure: time:12 versus time:"twelve".]
      Edge Label Signatures:
        TitanLabel battled = g.makeType().name("battled").signature(timeKey)
        [Figure: hercules -battled (time:12)-> cerberus.]
      Functional Declarations:
        TitanLabel father = g.makeType().name("father").functional()
        [Figure: hercules -father-> jupiter versus a second father edge to mars.]
      Data management configurations allow Titan to optimize how information is stored/retrieved from disk.
  130. 130. DATA MANAGEMENT: TYPE DEFINITION
      Endogenous Indices:
        g.createKeyIndex("name",Vertex.class)
        [Figure: name:jupiter, name:hercules, name:hermes, name:neptune.]
      Unique Property Key/Value Pairs:
        TitanKey status = g.makeType().name("status").unique()
        [Figure: status:king of the gods on name:jupiter versus the same status value on a second vertex.]
      Data management configurations allow Titan to optimize how information is stored/retrieved from disk.
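The three kinds of type declaration on these two slides (datatype constraints, functional edge labels, unique property values) can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Titan's implementation: `ToyGraph`, `SchemaError`, and `rejected` are hypothetical names, and the checks run in plain Python rather than in a storage layer.

```python
class SchemaError(Exception):
    """Raised when a write violates a declared type constraint."""


class ToyGraph:
    def __init__(self):
        self.key_types = {}       # property key -> required Python type
        self.functional = set()   # edge labels allowing one outgoing edge per vertex
        self.unique_keys = set()  # property keys whose values must be unique
        self.edges = []           # (source, label, target, properties)
        self.props = {}           # vertex -> {key: value}

    def add_edge(self, src, label, dst, **properties):
        # Datatype constraint: every declared key must carry the declared type.
        for key, value in properties.items():
            expected = self.key_types.get(key)
            if expected is not None and not isinstance(value, expected):
                raise SchemaError(f"{key} must be {expected.__name__}")
        # Functional declaration: at most one outgoing edge with this label.
        if label in self.functional and any(
                s == src and l == label for (s, l, _, _) in self.edges):
            raise SchemaError(f"{src} already has a '{label}' edge")
        self.edges.append((src, label, dst, properties))

    def set_property(self, vertex, key, value):
        # Uniqueness: no other vertex may hold the same key/value pair.
        if key in self.unique_keys and any(
                v != vertex and p.get(key) == value for v, p in self.props.items()):
            raise SchemaError(f"{key}={value!r} is already taken")
        self.props.setdefault(vertex, {})[key] = value


def rejected(action):
    """True if running `action` raises SchemaError."""
    try:
        action()
    except SchemaError:
        return True
    return False


g = ToyGraph()
g.key_types["time"] = int
g.functional.add("father")
g.unique_keys.add("status")

g.add_edge("hercules", "battled", "cerberus", time=12)
g.add_edge("hercules", "father", "jupiter")
g.set_property("jupiter", "status", "king of the gods")

print(rejected(lambda: g.add_edge("hercules", "battled", "cerberus", time="twelve")))
print(rejected(lambda: g.add_edge("hercules", "father", "mars")))
print(rejected(lambda: g.set_property("neptune", "status", "king of the gods")))
```

Each of the three final calls mirrors one of the conflicting examples in the slide figures and is refused by the corresponding check.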
  131. 131. DATA MANAGEMENT: LOCKING SYSTEM. Ensures consistency over non-consistent storage backends. [Figure: two concurrent writes, hercules -father-> jupiter and hercules -father-> neptune.] 1. Acquire lock at the end of the transaction (the locking mechanism depends on storage layer consistency guarantees). 2. Verify original read. 3. Fail transaction if any precondition is violated.
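The three locking steps can be sketched as a generic optimistic-concurrency scheme. The code below is an assumption-laden illustration, not Titan's locking code: it uses a single in-process `threading.Lock` as a stand-in for whatever lock the storage backend provides, and `Store`, `Transaction`, and `ConflictError` are hypothetical names.

```python
import threading


class ConflictError(Exception):
    """Raised at commit time when a value read earlier has since changed."""


class Store:
    def __init__(self):
        self.data = {}
        self.lock = threading.Lock()


class Transaction:
    def __init__(self, store):
        self.store = store
        self.reads = {}    # key -> value observed at first read
        self.writes = {}   # key -> value to apply on commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        if key not in self.reads:
            self.reads[key] = self.store.data.get(key)
        return self.reads[key]

    def write(self, key, value):
        self.read(key)               # remember what was there before the write
        self.writes[key] = value

    def commit(self):
        with self.store.lock:                             # 1. lock at the end of the transaction
            for key, seen in self.reads.items():          # 2. verify the original reads
                if self.store.data.get(key) != seen:
                    raise ConflictError(key)              # 3. fail on a violated precondition
            self.store.data.update(self.writes)


store = Store()
first, second = Transaction(store), Transaction(store)
first.write(("hercules", "father"), "jupiter")
second.write(("hercules", "father"), "neptune")
first.commit()
try:
    second.commit()
except ConflictError:
    print("second transaction failed; father is", store.data[("hercules", "father")])
```

Both transactions work without locks until commit, which is what keeps the common, conflict-free case cheap; only the loser of a genuine conflict pays by failing.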
  132. 132. DATA MANAGEMENT: ID MANAGEMENT. Global ID Pool Maintained by Storage Engine: [0,1,2,3,4,5,6,7,8,9,10,11]
  133. 133. DATA MANAGEMENT: ID MANAGEMENT. Global ID Pool Maintained by Storage Engine: [0,1,2,3,4,5,6,7,8,9,10,11]. Pool Subsets Assigned to Individual Instances: [0,1,2] [3,4,5] [6,7,8] [9,10,11]
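Block-wise ID assignment, as pictured on these slides, can be sketched as follows. The class names `GlobalIdPool` and `Instance` are hypothetical, and a local lock stands in for the coordination that the storage engine would provide in a real deployment.

```python
import threading


class GlobalIdPool:
    """Hands out disjoint, contiguous blocks of ids; stands in for the storage engine."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.next_start = 0
        self.lock = threading.Lock()

    def claim_block(self):
        # Only this short step needs coordination between instances.
        with self.lock:
            start = self.next_start
            self.next_start += self.block_size
        return range(start, start + self.block_size)


class Instance:
    """A database instance that assigns ids locally from its current block."""

    def __init__(self, pool):
        self.pool = pool
        self.block = iter(())

    def next_id(self):
        for candidate in self.block:     # take the next id of the current block, if any
            return candidate
        self.block = iter(self.pool.claim_block())   # block exhausted: claim a new one
        return next(self.block)


pool = GlobalIdPool(block_size=3)
a, b = Instance(pool), Instance(pool)
print([a.next_id(), b.next_id(), a.next_id(), b.next_id()])  # [0, 3, 1, 4]
```

Because each instance touches the shared pool only once per block, id assignment stays local for most writes while ids remain globally unique.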
  134. 134. THE HOW OF TITAN: EDGE COMPRESSION
  135. 135. EDGE COMPRESSION. Natural graphs have a small world, community/cluster property: high intra-connectivity within a community and low inter-connectivity between communities. [Figure: Community 1 and Community 2.] Watts, D. J., Strogatz, S. H., "Collective Dynamics of Small-World Networks," Nature 393 (6684), pp. 440–442, 1998.
  136. 136. EDGE COMPRESSION
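  137. 137. EDGE COMPRESSION [Figure: the edge 12345678 -knows-> 12345683.]
  138. 138. EDGE COMPRESSION [Figure: the edge 12345678 -knows-> 12345683.]
  139. 139. EDGE COMPRESSION [Figure: the edge 12345678 -knows-> 12345683, stored as the triple (12345678, 9, 12345683): 24 bytes.]
  140. 140. EDGE COMPRESSION [Figure: the edge 12345678 -knows-> 12345683, stored as (12345678, 9, 12345683): 24 bytes; with the second id written as a difference: (12345678, 9, +5).]
  141. 141. EDGE COMPRESSION [Figure: the edge 12345678 -knows-> 12345683, stored as (12345678, 9, 12345683): 24 bytes; with the second id written as a difference: (12345678, 9, +5); in compact form (12345678, 9, 5): 7 bytes.]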
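The idea on these slides, storing the neighbor id as a small difference and then writing small numbers in few bytes, can be sketched with delta encoding plus a variable-length integer format. The slides do not say which byte format Titan uses, so the LEB128-style varint and zigzag mapping below are assumptions; with this particular format the example edge takes 6 bytes rather than the 7 shown on the slide.

```python
def varint(n):
    """Encode a non-negative integer using 7 payload bits per byte (LEB128 style)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(data, pos):
    """Decode one varint starting at `pos`; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def zigzag(n):
    """Map signed to unsigned so that small negative deltas also stay small."""
    return 2 * n if n >= 0 else -2 * n - 1


def encode_edge(source_id, label_id, target_id):
    delta = target_id - source_id             # small when neighbors have nearby ids
    return varint(source_id) + varint(label_id) + varint(zigzag(delta))


def decode_edge(data):
    source_id, pos = read_varint(data, 0)
    label_id, pos = read_varint(data, pos)
    z, pos = read_varint(data, pos)
    delta = z // 2 if z % 2 == 0 else -(z + 1) // 2
    return source_id, label_id, source_id + delta


fixed_width = 3 * 8                            # three 8-byte values, as on the slide
compact = encode_edge(12345678, 9, 12345683)
print(fixed_width, len(compact))               # 24 6
print(decode_edge(compact))
```

The saving depends on the community structure described earlier: if vertices in the same community receive nearby ids, most deltas are small and fit in one or two bytes.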
  142. 142. THE HOW OF TITAN: VERTEX-CENTRIC INDICES
  143. 143. VERTEX-CENTRIC INDICES: THE SUPER NODE PROBLEM. Natural, real-world graphs contain vertices of high degree. Even if rare, their degree ensures that they exist on many paths. Traversing a high degree vertex means touching numerous incident edges and potentially touching most of the graph in only a few steps.
  144. 144. VERTEX-CENTRIC INDICES: A SUPER NODE SOLUTION. A "super node" only exists from the vantage point of classic "textbook style" graphs. In the world of property graphs, intelligent disk-level filtering can interpret a "super node" as a more manageable low-degree vertex. Vertex-centric querying utilizes B-Trees and sort orders for speedy lookup of incident edges with particular qualities.
  145. 145. VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES
      vertex.query()
      [Figure: 8 edges: five likes edges (stars:5, stars:2, stars:2, stars:3, stars:3) and three knows edges.]
  146. 146. VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES
      vertex.query().direction(OUT)
      [Figure: 7 edges: the five likes edges and two knows edges.]
  147. 147. VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES
      vertex.query().direction(OUT).labels("likes")
      [Figure: 5 edges: the five likes edges.]
  148. 148. VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES
      vertex.query().direction(OUT).labels("likes").has("stars",5)
      [Figure: 1 edge: the likes edge with stars:5.]
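The narrowing from 8 edges to 1 can be reproduced with a small query builder. This is a hypothetical Python model, not the Blueprints `Query` interface: in particular, the slide only shows the counts, so which `knows` edge is incoming is an assumption, and a real store would evaluate the predicates in the storage layer instead of filtering a list in memory.

```python
OUT, IN = "out", "in"

# The eight incident edges from the slide figure: (direction, label, properties).
incident = [
    (OUT, "likes", {"stars": 5}),
    (OUT, "likes", {"stars": 2}),
    (OUT, "likes", {"stars": 2}),
    (OUT, "likes", {"stars": 3}),
    (OUT, "likes", {"stars": 3}),
    (OUT, "knows", {}),
    (OUT, "knows", {}),
    (IN, "knows", {}),       # assumed: the one edge dropped by direction(OUT)
]


class Query:
    """Collects predicates first, then applies them all in one pass over the edges."""

    def __init__(self, edges):
        self._edges = edges
        self._filters = []

    def direction(self, wanted):
        self._filters.append(lambda e: e[0] == wanted)
        return self

    def labels(self, *wanted):
        self._filters.append(lambda e: e[1] in wanted)
        return self

    def has(self, key, value):
        self._filters.append(lambda e: e[2].get(key) == value)
        return self

    def edges(self):
        return [e for e in self._edges if all(f(e) for f in self._filters)]


print(len(Query(incident).edges()))
print(len(Query(incident).direction(OUT).edges()))
print(len(Query(incident).direction(OUT).labels("likes").edges()))
print(len(Query(incident).direction(OUT).labels("likes").has("stars", 5).edges()))
```

Collecting the predicates before touching any edge is the point of the pattern: the full description of what is wanted is available to the component that reads the data, so irrelevant edges never need to be materialized.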
  149. 149. VERTEX-CENTRIC INDICES: PUSHDOWN PREDICATES
      PREDICATES
        Query Query.direction(Direction)
        Query Query.labels(String... labels)
        Query Query.has(String, Object, Compare)
        Query Query.has(String, Object)
        Query Query.range(String, Object, Object)
      GETTERS
        Iterable<Vertex> Query.vertices()
        Iterable<Edge> Query.edges()
  150. 150. VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING [Figure: a vertex's incident edges in storage order: battled (time:1), battled (time:2), battled (time:12), knows, knows.]
  151. 151. VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING [Figure: the same layout with further battled and knows edges inserted.]
  152. 152. VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING
      TitanLabel battled = g.makeType().name("battled").primaryKey(time)
      [Figure: battled edges sorted by time (time:1, time:2, time:12), with the ranges "battled w/ time 1-5" and "battled w/ time 5-10" marked, followed by the knows edges.]
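Keeping a vertex's edges sorted by a key such as `time` turns a range lookup into a binary search over a contiguous slice. The sketch below uses Python's `bisect` on an in-memory list; `SortedEdgeIndex` is a hypothetical name, the half-open range convention is an assumption (the slide does not specify boundaries), and the pairing of monsters with times 1 and 2 is inferred from the order in the Graph of the Gods figure.

```python
import bisect


class SortedEdgeIndex:
    """Incident edges of one vertex and one label, kept sorted by a numeric key."""

    def __init__(self):
        self._keys = []      # sort-key values in ascending order
        self._targets = []   # target vertex of the edge at the same position

    def add(self, key, target):
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._targets.insert(pos, target)

    def range(self, low, high):
        """Targets of edges with low <= key < high, located by binary search."""
        start = bisect.bisect_left(self._keys, low)
        stop = bisect.bisect_left(self._keys, high)
        return self._targets[start:stop]


battled = SortedEdgeIndex()
for time, monster in [(12, "cerberus"), (1, "nemean"), (2, "hydra")]:
    battled.add(time, monster)

print(battled.range(1, 5))     # ['nemean', 'hydra']
print(battled.range(5, 10))    # []
```

The cost of a range query depends on the logarithm of the vertex's degree plus the number of matches, which is why a high-degree vertex no longer forces a scan of all its incident edges.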
  153. 153. VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING [Figure: a vertex's edges by label: brother, father, mother, knows, battled.]
  154. 154. VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING [Figure: a vertex's edges by label: brother, father, mother, knows, battled.]
  155. 155. VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING
      TypeGroup family = TypeGroup.of(2,"family");
      TitanLabel father = g.makeType().name("father").group(family).makeEdgeLabel();
      TitanLabel mother = g.makeType().name("mother").group(family).makeEdgeLabel();
      TitanLabel brother = g.makeType().name("brother").group(family).makeEdgeLabel();
      [Figure: brother, father, and mother edges grouped as "family", followed by knows and battled.]
  156. 156. VERTEX-CENTRIC INDICES: DISK-LEVEL SORTING/INDEXING
      vertex.query().group("family")...
      [Figure: brother, father, and mother edges grouped as "family", followed by knows and battled.]
  157. 157. THAT IS HOW TITAN WORKS: DATA MANAGEMENT, EDGE COMPRESSION, VERTEX-CENTRIC INDICES
  158. 158. SIMULATING TWITTER: WHAT IF YOU WANTED TO CREATE TWITTER FROM SCRATCH?
  159. 159. 3 BILLION EDGES, 100 MILLION VERTICES, 10000 CONCURRENT USERS, 50 MACHINES, 1 GRAPH DATABASE. COMING JULY 2012
  160. 160. PART 3: THE FUTURE OF AURELIUS (MARKO A. RODRIGUEZ, MATTHIAS BROECHELER)
  161. 161. AURELIUS GRAPH COMPUTING STORY. Titan as the highly scalable, distributed graph database solution. [Diagram label: OLTP]
  162. 162. AURELIUS GRAPH COMPUTING STORY. Titan as the highly scalable, distributed graph database solution. Titan as the source (and potential sink) for other graph processing solutions. [Diagram labels: OLTP, OLAP]
  163. 163. FAUNUS: GOD OF HERDS
  164. 164. FAUNUS: PATH ALGEBRA FOR HADOOP. [Figure: hercules -battled-> cretan bull <-battled- theseus yields the derived edge hercules -ally- theseus via A · Aᵀ ∘ n(I).] Derived graphs are single-relational and are typically much smaller than their multi-relational source. Therefore, derived graphs can be subjected to textbook-style graph algorithms in both a meaningful and efficient manner. WHO IS THE MOST CENTRAL ALLY?
  165. 165. FAUNUS: PATH ALGEBRA FOR HADOOP. B = A · Aᵀ ∘ n(I); B · B ∘ n(I); (A · Aᵀ)² ∘ n(I). [Figure: a network of ally edges.] My allies' allies are my allies.
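The derivation of `ally` edges from `battled` edges can be worked through on the three-vertex example from the figure. The sketch below is plain Python with nested lists, not Faunus or Map/Reduce code; it assumes the formula reads A · Aᵀ ∘ n(I), where A is the `battled` adjacency matrix, ∘ is the element-wise product, and n(I) is the identity matrix with zeros and ones swapped (which removes self-loops).

```python
vertices = ["hercules", "theseus", "cretan bull"]
index = {name: i for i, name in enumerate(vertices)}
n = len(vertices)

# A[i][j] = 1 when vertex i battled vertex j.
A = [[0] * n for _ in range(n)]
for source, target in [("hercules", "cretan bull"), ("theseus", "cretan bull")]:
    A[index[source]][index[target]] = 1


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def hadamard(x, y):
    """Element-wise product of two equally sized matrices."""
    return [[a * b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


# n(I): ones everywhere except the diagonal, so nobody is their own ally.
n_I = [[0 if i == j else 1 for j in range(n)] for i in range(n)]

# (A . A^T)[i][j] counts the opponents that i and j have both battled.
ally = hadamard(matmul(A, transpose(A)), n_I)

pairs = [(vertices[i], vertices[j])
         for i in range(n) for j in range(n) if ally[i][j]]
print(pairs)   # [('hercules', 'theseus'), ('theseus', 'hercules')]
```

The resulting matrix is a single-relational graph over the same vertices, which is what makes it suitable input for textbook algorithms such as centrality measures.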
  166. 166. FAUNUS: PATH ALGEBRA FOR HADOOP. Used for global graph operations. Implements the multi-relational path algebra as a collection of Map/Reduce operations. Reduces a massive property graph into a smaller, semantically rich single-relational graph. Project codename: TinkerPoop. Support for HadoopGraph and HDFS file formats. Rodriguez, M.A., Shinavier, J., "Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms," Journal of Informetrics, 4(1), pp. 29-41, 2009. http://arxiv.org/abs/0806.2274
  167. 167. FULGORA: GODDESS OF LIGHTNING
  168. 168. FULGORA: AN EFFICIENT IN-MEMORY GRAPH ENGINE. Non-transactional, in-memory graph engine. It is not a database. Process ~90 billion edges in 68 GB of RAM, assuming a small world topology. Perform complex graph algorithms in-memory: global graph analysis, multi-relational graph analysis. Similar in spirit to Twitter's Cassovary: https://github.com/twitter/cassovary
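A quick back-of-the-envelope calculation shows what the "~90 billion edges in 68 GB" figure implies per edge. The only inputs are the two numbers from the slide; whether "68 Gigs" means decimal or binary gigabytes is not stated, so both readings are computed, and the function name is hypothetical.

```python
EDGES = 90e9   # "~90 billion edges" from the slide


def bits_per_edge(ram_bytes, edges=EDGES):
    """Average number of bits available per edge for a given RAM budget."""
    return ram_bytes * 8 / edges


decimal_gb = bits_per_edge(68 * 10**9)   # reading "68 Gigs" as 68 * 10^9 bytes
binary_gib = bits_per_edge(68 * 2**30)   # reading "68 Gigs" as 68 * 2^30 bytes
print(f"{decimal_gb:.2f} to {binary_gib:.2f} bits per edge")
```

Either way the budget is under one byte per edge, far below a fixed-width pair of vertex ids, which is why the figure is stated under the assumption of a small world topology that compresses well.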
  169. 169. THE AURELIUS OLAP FLOW. [Figure: three stages. One stores a massive-scale property graph. One generates a large-scale single-relational graph. One analyzes compressed, large-scale single or multi-relational graphs in memory. Arrow labels: "Map/Reduce"; "Load into RAM on a single-machine"; "Update graph with derived edges"; "Update element properties with algorithm results"; "to a stats package".]
  170. 170. THE AURELIUS OLAP FLOW. [Figure: the same three stages, with the update arrows replaced by examples: a derived ally edge between hercules and theseus, and the property ally_centrality:0.0123 on hercules.]
  171. 171. THE AURELIUS OLAP FLOW. [Figure: the same three stages, showing only the "to a stats package" arrow.]
  172. 172. AURELIUS USE OF BLUEPRINTS. Aurelius products use the Blueprints API so any graph product can communicate with any other graph product. The code for graph databases, frameworks, algorithms, and batch-processing is written in terms of the Blueprints API. Aurelius encourages developers to use Blueprints/TinkerPop in order to grow a rich ecosystem of interoperable graph technologies.
  173. 173. THE GRAPH LANDSCAPE REPRISE. [Figure: graph technologies plotted by Speed of Traversal/Process against Size of Graph/Structure.] * Not to scale. Did not want to overlap logos.
  174. 174. NEXT STEPS. Make use of and/or contribute to the free, open source Titan product. Learn about applying graph theory and network science. http://thinkaurelius.com http://thinkaurelius.github.com/titan/
  175. 175. THANK YOU
  176. 176. CREDITS. PRESENTERS: MARKO A. RODRIGUEZ, MATTHIAS BROECHELER. FINANCIAL SUPPORT: PEARSON EDUCATION, AURELIUS. LOCATION PROVISIONS: JIVE SOFTWARE. MANY THANKS TO: DAN LAROCQUE, TINKERPOP COMMUNITY, STEPHEN MALLETTE, BOBBY NORTON, KETRINA YIM