Titan NYC Meetup March 2014

Slides from the meetup presentation in NYC (March 2014). Covers the current version of Titan and Faunus.

Published in: Technology

Titan NYC Meetup March 2014

  1. 1. AURELIUS THINKAURELIUS.COM TITAN Scalable Graph Database Matthias Broecheler @mbroecheler March 6th, MMXIII
  2. 2. Graph Database distributed real time open source
  3. 3. name: Hercules type: demigod name: Cerberus type: monster battled time:12 Vertex Edge Label Edge Property = key + value
  4. 4. name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2
  5. 5. name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 g.V g.E
  6. 6. v name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 v = g.V.has('name','Hercules')
  7. 7. v name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 v.out('father','mother')
  8. 8. v name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 v.out('father').out('brother').name
  9. 9. v name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 v.outE('battled').has('time',T.gt,5).inV.name
  10. 10. name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 g.V.has('age',T.gt,4200)
  11. 11. name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 g.E.has('time',T.lt,5)
  12. 12. name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 saturn.as('x').in('father').loop('x'){it.loops < 3}.next()
  13. 13. name: Jupiter type: god name: Hercules type: demigod name: Cerberus type: monster father father mother brother brother battled pet time:12 name: Pluto type: god age: 4000 name: Neptune type: god age: 4500 name: Alcmene type: human age: 45 name: Saturn type: titan age: 10000 name: Hydra type: monster battled time: 2 g.V.sideEffect{ it.rank = it.both.both.both.count() }
  14. 14. AURELIUS THINKAURELIUS.COM Titan Database Architecture Overview
  15. 15. Titan Features I. Data Management II. Vertex-Centric Indices
  16. 16. Titan Features III. Graph Partitioning IV. Edge Compression
  17. 17. Architecture Analogy MyISAM
  18. 18. Flexible Persistence Partitionability Availability Consistency
  19. 19. g.E.has('location',WITHIN, Geoshape.circle(38,24,50)) Full text & Geo Search
  20. 20. I. Navigate Memory
  21. 21. Sequential Data Access
  22. 22. II. Manage Concurrency • Multiple users • Units of work • Atomicity • Isolation • Consistency • Distribution Transactions
  23. 23. Vertex Representation 5 Property Property Out-Edge In-Edge Out-Edge In-Edge In-Edge row indices for fast vertex centric queries byte order sorting cell = column + value row key
  24. 24. Titan Storage Model • Adjacency list in one column family • Row key = vertex id • Each property and edge in one column • Denormalized, i.e. stored twice • Direction and label/key as column prefix • Use slice predicate for quick retrieval 5 5
  25. 25. label id + direction sort key Δ vertex id Δ edge id signature properties other properties Edge Representation Column Value compressed serialized objects variable long encoding Properties & Edges are atomic
  26. 26. Vertex-Centric Indices • Sort and index edges per vertex by sort key • Sort key can be composite • Enables efficient focused traversals • Only retrieve edges that matter • Uses push down predicates for quick, index-driven retrieval
  27. 27. v time: 1 fought fought father mother battled battled battled battled time: 3 time: 5 time: 9 v.query()
  28. 28. v time: 1 father mother battled battled battled battled time: 3 time: 5 time: 9 v.query().direction(OUT)
  29. 29. v time: 1 battled battled battled battled time: 3 time: 5 time: 9 v.query().direction(OUT).labels('battled')
  30. 30. v time: 1 battled battled time: 3 v.query().direction(OUT).labels('battled').has('time',T.lt,5)
  31. 31. v time: 1 battled battled time: 3 v.query().direction(OUT).labels('battled').has('time',T.lt,5) = v.outE('battled').has('time',T.lt,5).inV Query Optimization
  32. 32. Consistency • On eventually consistent storage backends, Titan can enforce consistency constraints by configuring types with UniquenessConsistency.LOCK • Titan acquires locks to avoid conflicting changes • Acquiring locks is expensive → use with care • Locking protocol used is configurable • reasonably safe implementation, not completely fail-safe
  33. 33. Token Ring Graph Partitioning assigns ids to map vertices into “optimal” token range Lots of interesting questions for future work uses BOP
  34. 34. Educating the Planet
  35. 35. Person Person Student Teacher Course Institution Concept Discussion Comment Share enrolledIn teaches relatesTo hasCourse belongsTo follows author references hasComment relatesTo author partOf relatesTo
  36. 36. 121 Billion Edges 6.2 Billion Vertices 1 Million Universities 3.5 Billion Students
  37. 37. Placement Group hi1.4xl Setup
  38. 38. 1.1 million edges / sec using batch mode Data Ingestion
  39. 39. 80 m1.medium
  40. 40. 10,200 transactions / sec 16 randomly chosen complex traversal templates Throughput
  41. 41. Titan Local Caching
  42. 42. Flexible Persistence Partitionability Availability Consistency
  43. 43. Local Deployment Application + Titan Storage Backend Application + Titan + Storage Backend (embedded)
  44. 44. Remote Deployment Application + Titan Storage Backend Cluster
  45. 45. Server Deployment II Application Cluster of: (2 JVM) - Titan + Rexster - Storage Backend (via localhost)
  46. 46. • Native Blueprints Implementation • Gremlin Query Language • Rexster Server: any Titan graph can be exposed as a REST endpoint Generic Graph API Dataflow Processing Traversal Language Object-Graph Mapper Graph Algorithms Graph Server Titan Ecosystem
  47. 47. AURELIUS THINKAURELIUS.COM Faunus Batch Graph Analytics
  48. 48. • Hadoop-based Graph Computing Framework • Graph Analytics • Breadth-first Traversals • Global Graph Computations • Batch Big Graph Data Faunus Features
  49. 49. Faunus Architecture g._()
  50. 50. Faunus Work Flow hdfs://user/ubuntu/ output/job-0/ output/job-1/ output/job-2/ { graph* sideeffect* g.V.out.out.count() Compressed HDFS Graphs • stored in sequence files • variable length encoding • prefix compression
  51. 51. Degree Distribution GitHub Network g.V.sideEffect{ it.degree = it.out('follows').count() }.degree.groupCount
  52. 52. Degree Distribution P(k) ~ k^(-γ), γ = 2.2
  53. 53. Global Recommendations gremlin> g.E.has('label','pushed','to').keep.V.out('pushed').out('to').in('to').in('pushed').sideEffect('{it.score = it.pathCounter}').score.order(F.decr,'name') # Top 5: Jippi 60892182927, garbear 30095282886, FakeHeal 30038040349, brianchandotcom 24684133382, nyarla 15230275746
  54. 54. AURELIUS THINKAURELIUS.COM Big Picture Closing Thoughts
  55. 55. Value in Relationships low high Key-Value Why Graph Databases? K V BigTable K V V V V Document Relational Graph ✓
  56. 56. The value of data is proportional to the number of meaningful relationships
  57. 57. Social Networks
  58. 58. Recommendations Path Finding
  59. 59. Graph Search
  60. 60. Knowledge Graph
  61. 61. Markets & Risks
  62. 62. ECONOMY
  63. 63. Health & Medicine
  64. 64. HEALTH
  65. 65. June 14th 2012 September 2012 December 2012 March 2013 November 2013 Alpha Release Titan 0.1.0 Titan 0.2.0 Titan 0.3.0 Titan 0.4.0 Experimental release of a distributed, open-source graph database First stable release Rewrite of core Indexing & ElasticSearch Performance Feature Extension Fulgora Faunus Release
  66. 66. What’s Coming • Creating and updating indexes • Vertex-centric indexes • Graph indexes • Log integration • Tighter Titan-Faunus Integration • Graph Partitioning • Declarative Query Answering • Usability Improvements
  67. 67. Aurelius Graph Cluster OLTP OLAP Hadoop MapReduce Analysis results back into Titan Apache 2 g.V.label.groupCount g.v(101).out titan.thinkaurelius.com faunus.thinkaurelius.com aureliusgraphs@googlegroups.com
  68. 68. AURELIUS THINKAURELIUS.COM @AURELIUSGRAPHS
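
Slides 26 to 31 describe vertex-centric indices: edges are sorted per vertex by a sort key, so a query such as v.query().direction(OUT).labels('battled').has('time',T.lt,5) only reads the edges that match. The following is a minimal Python sketch of that idea, not Titan's implementation or API. The Vertex class, its add_edge and query methods, and the monster_a to monster_d names are hypothetical, and an in-memory sorted list stands in for the sorted columns of the storage backend.

```python
import bisect
from collections import defaultdict

OUT, IN = 0, 1  # hypothetical direction constants for this sketch


class Vertex:
    """Toy vertex that keeps its incident edges sorted by a sort key."""

    def __init__(self, name):
        self.name = name
        # (direction, label) -> list of (sort_key, other_vertex), kept sorted
        self.edges = defaultdict(list)

    def add_edge(self, direction, label, sort_key, other):
        # insort keeps each bucket ordered by sort_key on every insert
        bisect.insort(self.edges[(direction, label)], (sort_key, other))

    def query(self, direction, label, lt):
        """Neighbours reached by edges of this direction and label with sort_key < lt."""
        bucket = self.edges[(direction, label)]
        # Binary search finds the cut-off, so only the matching prefix is read
        end = bisect.bisect_left(bucket, (lt,))
        return [other for _, other in bucket[:end]]


hercules = Vertex("Hercules")
for time, monster in [(9, "monster_d"), (1, "monster_a"), (5, "monster_c"), (3, "monster_b")]:
    hercules.add_edge(OUT, "battled", time, monster)
hercules.add_edge(OUT, "father", 0, "Jupiter")

# Same shape as the slide 30 query: OUT, label 'battled', time < 5
recent = hercules.query(OUT, "battled", lt=5)
print(recent)
```

Grouping by direction and label first, then sorting by time, mirrors the column prefix plus sort key layout from slides 24 and 25: the father edge and the battles at time 5 and 9 are never touched by the query.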

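
Slides 25 and 50 mention delta-encoded ids and variable-length encoding as the basis of edge compression. The slides do not give the byte layout, so the sketch below only illustrates the general technique in Python under two assumptions: a 7-bit varint in which the high bit marks a continuation byte, and gaps taken between consecutive ids in sorted order. All function names are hypothetical and Titan's actual format may differ.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0):
    """Decode one varint starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def encode_ids(sorted_ids):
    """Store each id as the gap to the previous one, so nearby ids stay small."""
    out = bytearray()
    prev = 0
    for current in sorted_ids:
        out += encode_varint(current - prev)
        prev = current
    return bytes(out)


def decode_ids(data: bytes):
    """Rebuild the id list by summing the stored gaps."""
    ids, pos, prev = [], 0, 0
    while pos < len(data):
        gap, pos = decode_varint(data, pos)
        prev += gap
        ids.append(prev)
    return ids


ids = [1_000_000, 1_000_003, 1_000_010]
packed = encode_ids(ids)
print(len(packed), decode_ids(packed))
```

For these three ids the packed form takes 5 bytes, compared with 24 bytes for three fixed 8-byte longs, which is why ids that are close together compress well.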