
  1. 1. AllegroGraph as a Graph Database<br />Jans Aasman, Ph.D.<br />CEO - Franz Inc<br /><br />
  2. 2.
  3. 3. Contents<br />AllegroGraph as a <br />QuintupleStore (well, OctupleStore in 2011)<br />RDF store<br />Graph Database<br />Agraph architecture<br />Extreme use cases<br />Amdocs … CRM on top of a trillion triples<br />Pharmaceutical … explore connections in graph space<br />Demo<br />
  4. 4. Agraph as a quintuple store<br />S, P, O, G + unique ID + transaction #<br />SPOG can be any data type<br />1 2.0 3 4<br />2001-12-12 after 010-12-12 +19258781444<br />Jans loves pizza file1 12<br />NoOne believes 12<br />Includes very efficient geospatial and temporal representations and indices<br />6 default indices, 24 user-controlled indices<br />Range indexing, free-text indexing<br />Neighborhood matrices & UPI maps (for 1 ms access)<br />2011: time, security<br />
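A minimal sketch of the record layout described on this slide, written in plain Python rather than against any AllegroGraph API. The `Quint` type, the second row's graph name and ID, and the reading of "24 user-controlled indices" as the 4! sort orders of the S, P, O and G fields are all assumptions made for illustration; the transaction number is left out.

```python
from itertools import permutations
from typing import NamedTuple


class Quint(NamedTuple):
    """Hypothetical record: subject, predicate, object, graph, unique ID."""
    s: object
    p: object
    o: object
    g: object
    id: int


rows = [
    Quint("Jans", "loves", "pizza", "file1", 12),
    # The object of this statement is the ID of the statement above.
    # Graph name and ID 13 are invented for the example.
    Quint("NoOne", "believes", 12, "default-graph", 13),
]

# Assumption: one candidate index per ordering of the four fields (4! = 24).
orderings = ["".join(fields) for fields in permutations("spog")]


def index_key(order):
    """Build a sort key that orders records by the given field sequence."""
    return lambda row: tuple(str(getattr(row, field)) for field in order)


# A "posg" index is the same data sorted predicate-first.
by_posg = sorted(rows, key=index_key("posg"))

# Statements made about statement 12, found through its unique ID.
about_12 = [row for row in rows if row.o == 12]
```

Sorting the same records under different field orders is what lets a store answer a pattern such as "all statements with predicate P" with a range scan instead of a full scan.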
  5. 5. Agraph as an RDF store<br />RDF store when you adhere to the RDF conventions.<br />Full SPARQL 1.0, most of SPARQL 1.1<br />RDFS++ reasoner<br />Geospatial and temporal representations.<br />Prolog for rules<br />Soon Common Logic (CLIF+)<br />As a usability layer on top of Prolog<br />Easier to combine rules and queries<br />
  6. 6. Agraph as a Graph Database<br />If you want a Property Graph: <br />use the graph argument<br />Jans loves pizza gr1<br />gr1 weight 90<br />gr1 author Sophia<br />
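The graph-argument trick can be sketched in a few lines of plain Python (not an AllegroGraph client call). The statement `Jans loves pizza` carries `gr1` in its graph slot, and `gr1` then appears as the subject of further statements that act as edge properties. The function names and the use of `None` for "no graph given" are assumptions made for this sketch.

```python
# (subject, predicate, object, graph) tuples taken from the slide.
quads = [
    ("Jans", "loves", "pizza", "gr1"),
    ("gr1", "weight", "90", None),
    ("gr1", "author", "Sophia", None),
]


def edge_properties(quads, edge_id):
    """Collect the statements whose subject is the edge identifier."""
    return {p: o for s, p, o, g in quads if s == edge_id}


def property_edges(quads):
    """Turn every quad that carries a graph slot into a property-graph edge."""
    edges = []
    for s, p, o, g in quads:
        if g is not None:
            edges.append({
                "from": s,
                "label": p,
                "to": o,
                "id": g,
                "properties": edge_properties(quads, g),
            })
    return edges


edges = property_edges(quads)
```

With this reading, each edge needs its own graph value, so the graph slot works as an edge identifier and not as a named container of many statements.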
  7. 7.
  8. 8. Schema<br />
     Node typing: Yes<br />
     Edge typing: Yes<br />
     Attributes (nodes): Yes<br />
     Attributes (edges): Yes: A trusts B gr1, gr1 certainty 80.<br />
     Directed edges: Yes: A trusts B<br />
     Undirected edges: Yes, if using an RDFS symmetric property or generators<br />
     Restricted edges: Yes, if it means there can be islands.<br />
     Loop edges: Yes, A loves A<br />
     Attribute indexing: Yes<br />
     Starting node: No, although is that a DB property?<br />
     Schema: Yes and no: you can use an ontology on demand, and validation is straightforward<br />
     <br />
     Querying<br />
     Language: Lisp, Prolog, JavaScript and a toy version of Gremlin<br />
     Traversals: Yes, through adjacency lists and special indices. This seems to be an implementation point and not a fundamental property<br />
     <br />
     Database<br />
     Transactional: Yes<br />
     ACID: Yes<br />
     Fully indexed: Yes<br />
     Distributed: Federation (in-machine, between machines), AG5<br />
     Cache: Yes, adjacency vectors (neighbourhood matrices)<br />
     Embeddable: Yes: 3.3, No: 4.2.x<br />
     Store-engine: Custom<br />
     Migration framework: From RDB to Graph DB? Various<br />
     Object mapping: Only in Lisp, not in clients.<br />
     <br />
     Utilities<br />
     Shell: All from the Lisp shell, some from cshell, wget/curl<br />
     Algorithms: Yes, JavaScript, Prolog and Lisp<br />
     Benchmark: Yes, but only for RDF stores and reasoning<br />
     Protocols: REST/JSON<br />
     RDF Store: Yes<br />
     OWL Store: Yes<br />
     IDE Integration: Yes<br />
     Admin tool: Yes, AGWebview<br />
     Importer: Yes, from various input formats<br />
     Exporter: Yes, clients let you dump triples<br />
     Loader: AGLoad, Gruff, AGWebview<br />
     Scripting Language: Lisp and JavaScript.<br />
     <br />
     Languages<br />
     Java<br />Python<br />Ruby<br />C#<br />Scala<br />Clojure<br />Perl<br />PHP<br />
  39. 39. Many graph algorithms using generator model<br />Because of Social Network Analysis requirements we implement many graph algorithms.<br />Using generators<br />A first-class function that takes <br />One node as input<br />Returns all children<br />And neighbourhood matrices (or adjacency hash-tables) for speed.<br />
  40. 40. How far is Actor1 from Actor2?<br />Degrees of separation<br />How far is P1 from P2<br />Connection strength<br />How many shortest paths from P1 to P2 through a series of predicates and rules<br />
  41. 41. In what groups is this actor?<br />Find the ego-network around a person or thing<br />Friends, friends of friends, etc.<br />Find all the fully connected subgraphs (cliques) around a person or thing<br />
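An ego network of a given depth can be sketched as a depth-limited breadth-first search driven by a generator, meaning a function from one node to its neighbours. The code below is plain Python, not the AllegroGraph implementation; the `FRIENDS` data and the names `knows` and `ego_group` are invented for illustration, and this version counts the actor as a member of its own group.

```python
from collections import deque

# Hypothetical friendship data, stored as adjacency lists.
FRIENDS = {
    "jans": ["anna", "bob"],
    "anna": ["jans", "carl"],
    "bob": ["jans"],
    "carl": ["anna", "dana"],
    "dana": ["carl"],
}


def knows(node):
    """Generator: one node in, list of neighbouring nodes out."""
    return FRIENDS.get(node, [])


def ego_group(actor, generator, depth):
    """Return every node reachable from actor in at most depth steps."""
    seen = {actor}
    frontier = deque([(actor, 0)])
    while frontier:
        node, distance = frontier.popleft()
        if distance == depth:
            continue  # do not expand past the requested depth
        for neighbour in generator(node):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, distance + 1))
    return seen


friends = ego_group("jans", knows, 1)
friends_of_friends = ego_group("jans", knows, 2)
```

Depth 1 gives the direct friends and depth 2 adds friends of friends; swapping in a different generator changes what counts as a connection without touching the search.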
  42. 42. Questions in SNA: How important is an actor?<br />In-degree, out-degree<br />Actor degree centrality<br />I have the most connections in a group so I am more important<br />Actor closeness centrality<br />I have shorter paths to everyone else in the group so I am more important<br />Actor betweenness centrality<br />I am more often on the shortest path between other people in the group so I am more important. I can control the flow of information better than other people<br />
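The three actor measures can be sketched with the standard textbook definitions on a small undirected, connected graph. This is plain Python and says nothing about how AllegroGraph computes or normalizes its own values; the `GROUP` data and function names are invented for the example, and betweenness is left unnormalized.

```python
from collections import deque
from itertools import combinations

# Hypothetical connected group: a-b, b-c, b-d, d-e.
GROUP = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b", "e"},
    "e": {"d"},
}


def bfs(graph, source):
    """Return shortest distances and shortest-path counts from source."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                paths[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                paths[neighbour] += paths[node]
    return dist, paths


def degree_centrality(graph, actor):
    """Fraction of the other actors this actor is directly connected to."""
    return len(graph[actor]) / (len(graph) - 1)


def closeness_centrality(graph, actor):
    """Inverse of the average distance to every other actor."""
    dist, _ = bfs(graph, actor)
    return (len(graph) - 1) / sum(dist.values())


def betweenness_centrality(graph, actor):
    """Share of shortest paths between other pairs that pass through actor."""
    dist_actor, paths_actor = bfs(graph, actor)
    total = 0.0
    others = [node for node in graph if node != actor]
    for s, t in combinations(others, 2):
        dist_s, paths_s = bfs(graph, s)
        if dist_s[actor] + dist_actor[t] == dist_s[t]:
            total += paths_s[actor] * paths_actor[t] / paths_s[t]
    return total
```

In this example `b` scores highest on all three measures: it has the most connections, the shortest average distance to everyone else, and sits on five of the six shortest paths between other pairs.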
  43. 43. Does the group have a leader, is the group cohesive?<br />Group centralization<br />How centralized is this group?<br />Does this group have a leader?<br />Is there someone controlling the information flow? <br />Group cohesiveness<br />How strong and well connected is this group?<br />Are most people connected?<br />What is the density?<br />
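As a rough illustration of the group-level questions, the sketch below computes density and Freeman's degree centralization for an undirected group. It is plain Python using the common textbook formulas, which may differ from what AllegroGraph's own group functions return; the two example groups are invented.

```python
# Hypothetical groups as undirected adjacency sets.
STAR = {
    "hub": {"w", "x", "y", "z"},
    "w": {"hub"},
    "x": {"hub"},
    "y": {"hub"},
    "z": {"hub"},
}
CHAIN_WITH_BRANCH = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b", "e"},
    "e": {"d"},
}


def density(graph):
    """Existing edges divided by the number of possible edges."""
    n = len(graph)
    degree_sum = sum(len(neighbours) for neighbours in graph.values())
    # Each undirected edge is counted twice in degree_sum.
    return degree_sum / (n * (n - 1))


def degree_centralization(graph):
    """Freeman degree centralization: 1.0 for a star, 0.0 when all degrees match."""
    n = len(graph)
    degrees = [len(neighbours) for neighbours in graph.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


star_score = degree_centralization(STAR)
branch_score = degree_centralization(CHAIN_WITH_BRANCH)
```

A score near 1.0 suggests a single leader through whom connections run, while density answers the separate question of how many of the possible ties actually exist.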
  44. 44. All search and SNA functions use generators<br />Generator<br />Input: one node<br />Output: list of nodes<br />Fully functional; can be complex SPARQL or Prolog queries <br />Or just predicates and an indication of direction<br />
  45. 45. How to get from A to E?<br />subj pred obj<br />a dinner-with b<br /> a kissed-with c<br /> c movie-with e<br /> b kissed-with d<br /> d movie-with e<br /> e dinner-with a<br />(defgenerator knows (node)<br /> (objects-of :p dinner-with))<br />(defgenerator knows (node)<br /> (objects-of :p dinner-with)<br /> (subjects-of :p dinner-with))<br />
  46. 46. How to get from A to E?<br />(defgenerator knows (node)<br /> (objects-of :p dinner-with)<br /> (subjects-of :p dinner-with)<br /> (objects-of :p movie-with)<br /> (subjects-of :p movie-with)<br /> (objects-of :p kissed-with)<br /> (subjects-of :p kissed-with))<br />(defgenerator knows (node)<br /> (undirected :p (dinner-with movie-with kissed-with)))<br />
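To make the effect of the different generator definitions concrete, here is a plain-Python sketch over the six example triples. The helpers `objects_of`, `subjects_of`, `combine`, `undirected` and `shortest_path` are hypothetical stand-ins named after the slide's clauses and are not AllegroGraph functions.

```python
from collections import deque

TRIPLES = [
    ("a", "dinner-with", "b"),
    ("a", "kissed-with", "c"),
    ("c", "movie-with", "e"),
    ("b", "kissed-with", "d"),
    ("d", "movie-with", "e"),
    ("e", "dinner-with", "a"),
]


def objects_of(pred):
    """Follow pred from subject to object."""
    return lambda node: [o for s, p, o in TRIPLES if s == node and p == pred]


def subjects_of(pred):
    """Follow pred from object back to subject."""
    return lambda node: [s for s, p, o in TRIPLES if o == node and p == pred]


def combine(*generators):
    """Union of several generators, keeping first-seen order."""
    def generator(node):
        found = []
        for gen in generators:
            for neighbour in gen(node):
                if neighbour not in found:
                    found.append(neighbour)
        return found
    return generator


def undirected(*preds):
    """Follow each predicate in both directions."""
    return combine(*[g for p in preds for g in (objects_of(p), subjects_of(p))])


def shortest_path(start, goal, generator):
    """Breadth-first search that expands nodes through the generator."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in generator(node):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


dinner_forward = shortest_path("a", "e", objects_of("dinner-with"))
dinner_both_ways = shortest_path("a", "e", undirected("dinner-with"))
all_forward = shortest_path("a", "e", combine(
    objects_of("dinner-with"), objects_of("movie-with"), objects_of("kissed-with")))
```

Following `dinner-with` forward only never reaches `e`; following it in both directions finds the direct link from `e dinner-with a`; following all three predicates forward goes through `c`. The search code stays the same in each case and only the generator changes.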
  47. 47. Declaratively specify <br />(defgenerator knows (node)<br /> (select (?x)<br /> (q ??node movie-with ?x)<br /> (q ??node dinner-with ?x)<br /> (not (q ??node kissed-with ?x)))<br /> (select (?x)<br /> (q ?x movie-with ??node)<br /> (q- ?x dinner-with ??node)<br /> (not (q- ?x kissed-with ??node))))<br />
  48. 48. Sample SNA functions<br />(Ego-group actor generator depth ?group)<br />- binds ?group to a group of nodes<br />(Ego-group-members actor generator depth ?a) <br />- binds ?a to every member in the group<br />(Cliques actor generator min-depth ?cl)<br />- binds ?cl to all cliques <br />(Clique-members actor generator min-depth ?cl ?a)<br />- binds ?cl to cliques and then iterates over every member ?a in ?cl<br />(Actor-centrality actor group generator ?num) <br />- binds ?num to the actor centrality<br />(Actor-centrality-members group ?actor ?num)<br />- binds ?actor to every actor in group; ?num is the centrality of<br /> that actor. We start with the actor with the highest centrality.<br />(Group-centrality group generator ?num)<br />Actor = single node<br />Group = list of nodes<br />Depth = number<br />Generator = generator<br />
  49. 49. Integrated in Prolog and Common Logic (CLIF)<br />(defgenerator knows (node)<br /> (undirected :p (!fr:dinner-with !fr:kissed-with)))<br />(select (?x)<br /> (ego-group-members !person:jans knows ?x 2)<br /> (q ?x !geo:place ?y)<br /> (geo-box-around !geoname:Berkeley ?y 5 miles))<br />(select (?x)<br /> (ego-group !person:jans knows ?group 2)<br /> (actor-centrality-members ?group knows ?x ?num)<br /> (q ?x !geo:place ?y)<br /> (geo-box-around !geoname:Berkeley ?y 5 miles))<br />
  50. 50. Where do we use this?<br />Amdocs: Know everything about every customer<br />Partitioned on customer<br />Most graph search centered in client<br />Pfizer: help me find connections between drugs, diseases, genes, side effects in a sea of clinical trials<br />Just a mess of data<br />All graph search in server<br />
  51. 51. Traditional Business Intelligence<br />Can tell you ALL about <br />the average customer<br /> but NOTHING about <br /> the individual. <br />
  52. 52. Can you in < 1 second with one push of a button<br />Predict the three most likely reasons why Joe Smith from Kansas is calling the call center? Bill unexpectedly high, losing connection too often, doesn’t know how to use new subscription service?<br />The ten last events that happened for JS? Phone calls, SMS, downloads of movies, device stopped working, payment of bill, looking at map, search for local store.<br />What is the likelihood that he will change from T-Mobile to Sprint or AT&T?<br />What are his ten most important friends and what devices do they have? And who is the first to change and who follows?<br />
  53. 53. Can you in < 1 second with one push of a button<br />What are the usual daily locations for this person? What kind of shops?<br />What kind of services does he download, what kind of movies/music/games does he like, what products does he buy?<br />Is his plan the right plan for him?<br />Is he in a good mood?<br />Is he a valuable customer, is he a good payer, what is your margin on him, how many times per month does he call a call center, does he look up help for mail on the internet? Can you predict if he is going to pay the bill?<br />
  54. 54.
  55. 55. Architecture<br />[Architecture diagram. Labels: Decision Engine; Actions; Events; SBA Application Server; Container; Amdocs Event Collector; Inference Engine (Business Rules); Amdocs Integration Framework; Event Ingestion; Scheduled Events; Bayesian Belief Network; CRM; RM; OMS; “Sesame”; Operational Systems; NW; Web 2.0; AllegroGraph Triple Store DB; Event Data Sources]<br />
  56. 56.
  57. 57. Work for Pharma<br />
  58. 58. SIDER<br />
  59. 59.
  60. 60.
  61. 61. Gruff Demo<br />
  62. 62. What about scalability?<br />
  63. 63.
  64. 64.
  65. 65.
  66. 66. Architecture overview<br />Java:<br />Sesame Jena<br />Python<br />Ruby<br />C#<br />Clojure<br />Scala<br />Perl<br />REST<br />Backup/Restore<br />Replication<br />Warm Failover<br />Security<br /> Management<br />SPARQL<br />Prolog<br />Rules CLIF++<br />Geo<br />SNA<br />Time<br />RDFS+<br />JavaScript<br />Session Management, Query Engine, Federation<br />Storage layer (compression, indexing, freetext, transactions)<br />
  67. 67. Thanks…<br />