Graph Databases: Connecting the Dots in Big Data


Big Data problems are quickly presenting themselves in almost every area of computing, from social network analysis to file processing. Many technologies, such as those in the NoSQL space, were developed in response to the limitations of existing storage systems when dealing with these mountains of data. Much of that data is interconnected in ways that, when organized properly, yield interesting and often valuable information. InfiniteGraph, a distributed and scalable graph database, was designed specifically to traverse connections and to provide the framework for a new set of products offering real-time business decision support and relationship analytics. This presentation examines the technology behind InfiniteGraph and explores a couple of common use cases involving very large-scale graph processing.

Published in: Technology


  1. Graph Databases: Connecting the Dots in Big Data
     Darren Wood, Chief Architect, InfiniteGraph
  2. Relationships are everywhere
     • CRM, Sales & Marketing
     • Network Mgmt, Telecom
     • Intelligence (Government & Business)
     • PLM (Product Lifecycle Mgmt)
     • Finance
     • Social Networks
     • Healthcare
     • Research: Genomics
  3. Graph Databases
     • Not Really Graph Problems
       – Average age of my customers that purchased X
       – Which zip code buys the most of Y
     • Graph Problems
       – How is person A connected to person B
       – Can suspect Y be associated with location Z
       – Who are influencers within a social network?
     Copyright © InfiniteGraph
  4. Graph Databases
     • Optimized around data relationships
       – Relationships as first class citizens
       – Super fast traversal between entities
       – Rich/flexible annotation of connections
     • Small focused API (typically not SQL)
       – Natively work with concepts of Vertex/Edge
       – SQL has no concept of "navigation"
       – Most attempts based in SQL are convoluted
  5. Physical Storage Comparison
     Rows/Columns/Tables:
       Meetings (P1, P2, Place, Time): Alice, Bob, Denver, 5-27-10
       Calls (From, To, Time, Duration): Bob, Carlos, 13:20, 25
                                         Bob, Charlie, 17:10, 15
       Payments (From, To, Date, Amount): Carlos, Charlie, 5-12-10, 100000
     Relationship/Graph Optimized:
       Alice -[Met, 5-27-10]-> Bob
       Bob -[Called, 13:20]-> Carlos
       Bob -[Called, 17:10]-> Charlie
       Carlos -[Paid, 100000]-> Charlie
  6. Simple API
       Vertex alice = myGraph.addVertex(new Person("Alice"));
       Vertex bob = myGraph.addVertex(new Person("Bob"));
       Vertex carlos = myGraph.addVertex(new Person("Carlos"));
       Vertex charlie = myGraph.addVertex(new Person("Charlie"));
       alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
       bob.addEdge(new Call(timestamp), carlos);
       carlos.addEdge(new Payment(100000.00), charlie);
       bob.addEdge(new Call(timestamp), charlie);
     Diagram: Alice -[Meets]-> Bob; Bob -[Calls]-> Carlos; Carlos -[Pays]-> Charlie; Bob -[Calls]-> Charlie
  7. Query and Navigation
     • Queries – but not as you know them
     • More like a rules based search and discovery
     • Asynchronous Results
     Example queries:
       "Find all paths between Alice and Charlie"
       "Find all paths between Alice and Charlie – events in May 2010"
       "Find all paths between Alice and Charlie – within 2 degrees"
     Diagram: Alice -[Meets]-> Bob; Bob -[Calls]-> Carlos; Carlos -[Pays]-> Charlie; Bob -[Calls]-> Charlie
  8. Navigation Example
       // Create a qualifier that describes the target vertex
       Qualifier findCharliePredicate = new VertexPredicate(personType, "name == 'Charlie'");
       // Construct a navigator which starts with Alice and uses a result qualifier
       // to find all paths in the graph to Charlie
       Navigator charlieFinder = alice.navigate(
           Guide.SIMPLE_BREADTH_FIRST,  // default guide
           Qualifier.ANY,               // no path constraints
           findCharliePredicate,        // find paths ending with Charlie
           myResultHandler);            // fire results to supplied handler
       // Start the navigator
       charlieFinder.start();
  9. Navigational Query Performance
  10. Scaling Graphs – Getting Data In
      Diagram labels: App-1, App-2, App-3, App-3 ingesting vertices (Ingest V1, Ingest V2, Ingest V3) and edges (E12 {V1, V2}; E23 {V2, V3}); IG Core/API (Management, Navigation, Session/TX, Placement, Configuration, Extensions, Execution); Standard Blocking Ingest/Placement (MDP Plugin); Objectivity/DB storing V1, E12, V2, E23, V3
  11. Accelerated Ingest
      Diagram labels: IG Core/API (Management, Navigation, Session/TX, Configuration, Extensions, Execution); Placement (Accelerated) and (Standard); Distributed Staging Containers holding edges such as E(1->2), E(2->3), E(3->1), E(2->1), E(3->2); Pipelines; Pipeline Containers; stored vertices and edges V1, E12, V2, E23, V3
  12. Choose Your Own Consistency…
        // Describe your requested model using policies
        PolicyChain myPolicies = new PolicyChain(new EdgePipeliningPolicy(true));
        // Start a transaction with the policies you want
        Transaction tx = myGraph.beginTransaction(AccessMode.READ_WRITE, myPolicies);
        // This code doesn't change, can be used with any policies
        alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
        bob.addEdge(new Call(timestamp), carlos);
        tx.commit();
  13. Indexing Framework
      • Focused on providing choice!
      • Manual Indexes for grouping data
      • Automatic Indexes for cross population
      • Query interface with qualification language
      • Pluggable query operators
      • External index support (Lucene)
  14. InfiniteGraph Visualizer
  15. Scaling Graphs – Distributed Navigation
      • Graph algorithms naturally branch
      • Requires orchestration of threads/agents
      Diagram labels: vertices Alice, Bob, Carlos, Charlie, Chuck, Dave, Eve; edges Meets, Calls, Pays, Calls, Lives With, Meets
  16. Big Distributed Data (Traditional - Huge Generalization)
      Diagram labels: Application(s); Distributed API; four Processors, one each over Partition 1, Partition 2, Partition 3, Partition ...n
  17. Big Distributed Data (Graph)
      Diagram labels: Application(s); Distributed API; four Processors, one each over Partition 1, Partition 2, Partition 3, Partition ...n
  18. Some customers and partners
  19. Thank you!
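
The "within 2 degrees" query from slide 7 can be illustrated without the InfiniteGraph API. What follows is a minimal, self-contained Java sketch, not InfiniteGraph code: the class PathSketch and its methods addEdge, findPaths, and sampleGraph are hypothetical names introduced here, and edges are assumed to be followed only in the direction drawn on slide 6. It enumerates simple paths breadth-first and stops extending a path once the hop limit is reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a graph store; not the InfiniteGraph API.
public class PathSketch {
    // Adjacency list: vertex name -> vertices reachable over one outgoing edge.
    private final Map<String, List<String>> out = new LinkedHashMap<>();

    public void addEdge(String from, String to) {
        out.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        out.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Breadth-first enumeration of simple paths (no repeated vertex)
    // from start to target that use at most maxHops edges.
    public List<List<String>> findPaths(String start, String target, int maxHops) {
        List<List<String>> results = new ArrayList<>();
        Deque<List<String>> queue = new ArrayDeque<>();
        queue.add(List.of(start));
        while (!queue.isEmpty()) {
            List<String> path = queue.poll();
            String last = path.get(path.size() - 1);
            if (last.equals(target)) {
                results.add(path);      // a complete path; do not extend it further
                continue;
            }
            if (path.size() - 1 >= maxHops) {
                continue;               // hop limit reached without finding the target
            }
            for (String next : out.getOrDefault(last, List.of())) {
                if (path.contains(next)) {
                    continue;           // skip cycles
                }
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                queue.add(extended);
            }
        }
        return results;
    }

    // The four-person graph from slide 6 (edge types and properties omitted).
    public static PathSketch sampleGraph() {
        PathSketch g = new PathSketch();
        g.addEdge("Alice", "Bob");       // Meets
        g.addEdge("Bob", "Carlos");      // Calls
        g.addEdge("Carlos", "Charlie");  // Pays
        g.addEdge("Bob", "Charlie");     // Calls
        return g;
    }

    public static void main(String[] args) {
        PathSketch g = sampleGraph();
        System.out.println(g.findPaths("Alice", "Charlie", 2)); // [[Alice, Bob, Charlie]]
        System.out.println(g.findPaths("Alice", "Charlie", 3)); // adds [Alice, Bob, Carlos, Charlie]
    }
}
```

Unlike the navigator on slide 8, which fires results asynchronously to a handler, this sketch collects all paths in memory and returns them at once, so it only suits small graphs.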