AllegroGraph as a Graph Database
Jans Aasman, Ph.D.
CEO - Franz Inc
Ja@Franz.com
Contents
  AllegroGraph as a
    Quintuple Store (well, Octuple Store in 2011)
    RDF store
    Graph Database
  Agraph architecture
  Extreme use cases
    AMDOCS … CRM on top of a trillion triples
    Pharmaceutical … explore connections in graph space
  Demo
Agraph as a quintuple store
  S, P, O, G + unique ID + transaction #
  SPOG can be any data type:
    S           P         O          G             ID
    1           2.0       3          4
    2001-12-12  after     010-12-12  +19258781444
    Jans        loves     pizza      file1         12
    NoOne       believes  12
  Includes very efficient geospatial and temporal representations and indices
  6 default indices, 24 user-controlled indices
  Range indexing, free-text indexing
  Neighborhood matrices & UPI maps (for 1 ms access)
  2011: time, security
Agraph as an RDF store
  An RDF store when you adhere to the RDF conventions
  Full SPARQL 1.0, most of SPARQL 1.1
  RDFS++ reasoner
  Geospatial and temporal representations
  Prolog for rules
  Soon: Common Logic (CLIF+)
    As a usability layer on top of Prolog
    Easier to combine rules and queries
Agraph as a Graph Database
  If you want a property graph, use the graph argument:
    Jans  loves   pizza   gr1
    gr1   weight  90
    gr1   author  Sophia
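The property-graph trick on this slide can be sketched in plain Python. This is only an illustration of the idea, not AllegroGraph's API: the tuple layout, the use of `None` for "no named graph", and the helper names `edge_properties` and `property_edges` are all assumptions made for the example.

```python
# Quads are (subject, predicate, object, graph).
# Assumption: None stands for "no named graph" on the property statements.
quads = [
    ("Jans", "loves", "pizza", "gr1"),   # the edge, named gr1 via the graph slot
    ("gr1", "weight", 90, None),         # a property of that edge
    ("gr1", "author", "Sophia", None),   # another property of that edge
]

def edge_properties(quads, edge_id):
    """Collect the statements whose subject is the edge id itself."""
    return {p: o for s, p, o, g in quads if s == edge_id}

def property_edges(quads):
    """Return every named edge together with its property dictionary."""
    edges = []
    for s, p, o, g in quads:
        if g is not None:
            edges.append({
                "from": s,
                "label": p,
                "to": o,
                "id": g,
                "properties": edge_properties(quads, g),
            })
    return edges
```

Calling `property_edges(quads)` yields one edge from Jans to pizza labelled `loves`, carrying `weight` and `author` as edge properties.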
Utilities Shell Algorithms Benchmark Protocols RDF Store OWL Store IDE Integration Admin tool Importer Exporter Loader Scripting Language
All from Lisp shell, some from cshell, wget/curl
Yes, but only for RDF stores and reasoning
Yes, from various input formats
Yes, clients let you dump triples
AGLoad, Gruff, AGWebview
Languages Java Python Ruby C# Scala Clojure Perl PHP
Many graph algorithms using the generator model
  Because of Social Network Analysis requirements we implement many graph algorithms
  Using generators
    A first-class function that takes one node as input and returns all children
  And neighbourhood matrices (or adjacency hash tables) for speed
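A minimal sketch of the generator model in plain Python, assuming triples are held in an in-memory list. The names `objects_of` and `adjacency_table` and the sample data are made up for the example and are not AllegroGraph functions.

```python
# Hypothetical sample data: (subject, predicate, object)
TRIPLES = [
    ("a", "knows", "b"),
    ("a", "knows", "c"),
    ("b", "knows", "d"),
]

def objects_of(triples, predicate):
    """Build a generator: a function from one node to the list of its children."""
    def generate(node):
        return [o for s, p, o in triples if s == node and p == predicate]
    return generate

def adjacency_table(nodes, generator):
    """Precompute the generator for a set of nodes into a hash table,
    trading memory for lookup speed during graph search."""
    return {node: generator(node) for node in nodes}

knows = objects_of(TRIPLES, "knows")
table = adjacency_table(["a", "b", "c", "d"], knows)
```

Because a generator is an ordinary function, a search algorithm only needs to call it; it does not have to know which predicates or directions are being followed.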
How far is Actor1 from Actor2?
  Degrees of separation
    How far is P1 from P2?
  Connection strength
    How many shortest paths run from P1 to P2 through a series of predicates and rules?
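Both questions can be answered with one breadth-first search. The sketch below is plain Python over a generator function, under the assumption that "connection strength" means the number of distinct shortest paths; the function name `shortest_path_stats` and the sample graph are hypothetical.

```python
from collections import deque

def shortest_path_stats(generator, start, goal):
    """Return (degrees_of_separation, number_of_shortest_paths).
    Returns (None, 0) when goal cannot be reached from start."""
    dist = {start: 0}    # shortest distance found for each node
    count = {start: 1}   # number of shortest paths reaching each node
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in generator(node):
            if nxt not in dist:
                # First time we see nxt: this is its shortest distance.
                dist[nxt] = dist[node] + 1
                count[nxt] = count[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short route into nxt.
                count[nxt] += count[node]
    if goal not in dist:
        return None, 0
    return dist[goal], count[goal]

# Hypothetical graph: two equally short routes from P1 to P2.
EDGES = {"P1": ["x", "y"], "x": ["P2"], "y": ["P2"], "P2": []}

def neighbours(node):
    return EDGES.get(node, [])
```

Here `shortest_path_stats(neighbours, "P1", "P2")` reports two degrees of separation and two shortest paths.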
In what groups is this actor?
  Find the ego-network around a person or thing
    Friends, friends of friends, etc.
  Find all the fully connected graphs around a person or thing
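An ego-network can be sketched as a depth-limited expansion from one actor. This is a plain-Python illustration, not the AllegroGraph implementation; `ego_group`, `friends_of` and the sample data are assumptions for the example.

```python
def ego_group(generator, actor, depth):
    """Return the set of nodes reachable from actor in at most `depth` steps."""
    group = {actor}
    frontier = {actor}
    for _ in range(depth):
        # Expand one level and keep only nodes not seen before.
        frontier = {n for node in frontier for n in generator(node)} - group
        group |= frontier
    return group

# Hypothetical friendship data.
FRIENDS = {
    "joe": ["ann", "bob"],
    "ann": ["cat"],
    "bob": [],
    "cat": ["dan"],
    "dan": [],
}

def friends_of(node):
    return FRIENDS.get(node, [])
```

With depth 1 the group holds Joe and his direct friends; depth 2 adds the friends of friends.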
Questions in SNA: How important is an actor?
  In-degree, out-degree
  Actor degree centrality
    I have the most connections in a group, so I am more important
  Actor closeness centrality
    I have shorter paths to everyone else in the group, so I am more important
  Actor betweenness centrality
    I am more often on the shortest path between other people in the group, so I am more important
    I can control the flow of information better than other people
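Two of these measures are short enough to sketch in plain Python. The code assumes an undirected group stored as an adjacency dictionary and uses the common textbook normalizations, which may differ from what AllegroGraph computes; betweenness is left out for brevity. All names and data are hypothetical.

```python
from collections import deque

# Hypothetical undirected group: each actor maps to the set of direct ties.
GROUP = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann"},
    "cat": {"ann", "dan"},
    "dan": {"ann", "cat", "eve"},
    "eve": {"dan"},
}

def degree_centrality(group, actor):
    """Fraction of the other group members the actor is directly tied to."""
    return len(group[actor]) / (len(group) - 1)

def distances_from(group, actor):
    """Breadth-first distances from actor to every reachable member."""
    dist = {actor: 0}
    queue = deque([actor])
    while queue:
        node = queue.popleft()
        for nxt in group[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def closeness_centrality(group, actor):
    """Number of reachable others divided by the summed distance to them."""
    dist = distances_from(group, actor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0
```

In this sample group, ann and dan tie on degree centrality, and both are closer to everyone else than bob or eve are.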
Does the group have a leader? Is the group cohesive?
  Group centralization
    How centralized is this group?
    Does this group have a leader?
    Is there someone controlling the information flow?
  Group cohesiveness
    How strong and well connected is this group?
    Are most people connected?
    What is the density?
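Density and a degree-based group centralization can be sketched as follows. This uses the standard Freeman degree centralization for undirected graphs with at least three members, which is an assumption about what "group centralization" means here and may not match AllegroGraph's definition. The sample groups are hypothetical.

```python
def density(group):
    """Existing ties divided by the number of possible ties (undirected)."""
    n = len(group)
    ties = sum(len(neigh) for neigh in group.values()) / 2
    return ties / (n * (n - 1) / 2)

def degree_centralization(group):
    """Freeman degree centralization: 1.0 for a star, 0.0 when all degrees
    are equal. Assumes an undirected group with at least three members."""
    n = len(group)
    degrees = [len(neigh) for neigh in group.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

# A star has an obvious leader; a ring has none.
STAR = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
RING = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
```

The star scores the maximum centralization because every tie runs through the hub, while the ring scores zero because no member stands out.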
All search and SNA functions use generators
  Generator
    Input: one node
    Output: list of nodes
  Fully functional; can be complex SPARQL or Prolog queries
  Or just predicates and an indication of direction
How to get from A to E?
  subj  pred         obj
  a     dinner-with  b
  a     kissed-with  c
  c     movie-with   e
  b     kissed-with  d
  d     movie-with   e
  e     dinner-with  a

  (defgenerator knows (node)
    (objects-of :p dinner-with))

  (defgenerator knows (node)
    (objects-of :p dinner-with)
    (subjects-of :p dinner-with))
How to get from A to E?
  (defgenerator knows ()
    (object-of :p dinner-with)
    (subject-of :p dinner-with)
    (object-of :p movie-with)
    (subject-of :p movie-with)
    (object-of :p kissed-with)
    (subject-of :p kissed-with))

  (defgenerator knows ()
    (undirected (dinner-with movie-with kissed-with)))
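The effect of the different generator definitions on the A-to-E question can be tried out in plain Python. This is a sketch that mimics the idea of the generators on the slides, using the six triples from the example; the helpers `objects_of`, `subjects_of`, `combine`, `undirected` and `find_path` are written for this illustration and are not the AllegroGraph functions.

```python
from collections import deque

TRIPLES = [
    ("a", "dinner-with", "b"),
    ("a", "kissed-with", "c"),
    ("c", "movie-with", "e"),
    ("b", "kissed-with", "d"),
    ("d", "movie-with", "e"),
    ("e", "dinner-with", "a"),
]

def objects_of(predicate):
    """Follow predicate from subject to object."""
    return lambda node: [o for s, p, o in TRIPLES if s == node and p == predicate]

def subjects_of(predicate):
    """Follow predicate backwards, from object to subject."""
    return lambda node: [s for s, p, o in TRIPLES if o == node and p == predicate]

def combine(*generators):
    """Merge several generators into one, dropping duplicate neighbours."""
    def generate(node):
        seen = []
        for gen in generators:
            for n in gen(node):
                if n not in seen:
                    seen.append(n)
        return seen
    return generate

def undirected(predicates):
    """Follow every listed predicate in both directions."""
    return combine(*[make(p) for p in predicates for make in (objects_of, subjects_of)])

def find_path(generator, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in generator(node):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None
```

Following only the objects of `dinner-with` never reaches e from a. Adding the subjects of `dinner-with` finds the direct link, because e had dinner with a.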
Sample SNA functions
  (Ego-group actor generator depth ?group) - binds ?group to the group of nodes
  (Ego-group-members actor generator depth ?a) - binds ?a to every member in the group
  (Cliques actor generator min-depth ?cl) - binds ?cl to all cliques
  (Clique-members actor generator min-depth ?cl ?a) - binds ?cl to cliques and then iterates over every member ?a in ?cl
  (Actor-centrality actor group generator ?num) - binds ?num to the actor centrality
  (Actor-centrality-members group ?actor ?num) - binds ?actor to every actor in the group and ?num to the centrality of that actor, starting with the actor with the highest centrality
  (Group-centrality group generator ?num)

  Actor = single node
  Group = list of nodes
  Depth = number
  Generator = generator
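To make the clique functions concrete, here is a plain-Python sketch that finds the maximal cliques containing one actor, using the Bron-Kerbosch algorithm without pivoting. It is not the AllegroGraph implementation; the `min_size` parameter is a hypothetical stand-in and is not claimed to match the `min-depth` argument on the slide.

```python
def cliques_around(group, actor, min_size=3):
    """Maximal cliques that contain `actor`, each as a sorted list.
    `group` maps every node to the set of its undirected neighbours.
    `min_size` is a hypothetical filter on clique size."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can extend this clique, so it is maximal.
            if len(clique) >= min_size:
                found.append(sorted(clique))
            return
        for node in list(candidates):
            expand(clique | {node},
                   candidates & group[node],
                   excluded & group[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    # Seeding with the actor restricts the search to cliques around it.
    expand({actor}, set(group[actor]), set())
    return sorted(found)

# Hypothetical undirected group.
GROUP = {
    "ann": {"bob", "cat", "dan", "eve"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"ann", "cat"},
    "eve": {"ann"},
}
```

For ann this finds the triangles ann-bob-cat and ann-cat-dan; lowering `min_size` to 2 also reports the pair ann-eve.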
Where do we use this?
  Amdocs: know everything about every customer
    Partitioned on customer
    Most graph search centered in the client
  Pfizer: help me find connections between drugs, diseases, genes, and side effects in a sea of clinical trials
    Just a mess of data
    All graph search in the server
Traditional Business Intelligence Can tell you ALL about the average customer but NOTHING about the individual.
Can you, in < 1 second with one push of a button,
  Predict the three most likely reasons why Joe Smith from Kansas is calling the call center?
    Bill unexpectedly high, losing connection too often, doesn’t know how to use new subscription service?
  List the ten last events that happened for JS?
    Phone calls, SMS, downloads of movies, device stopped working, payment of bill, looking at a map, search for a local store.
  Say what the likelihood is that he will change from T-Mobile to Sprint or AT&T?
  Say who his ten most important friends are and what devices they have? And who is the first to change and who follows?
Can you, in < 1 second with one push of a button, answer:
  What are the usual daily locations for this person? What kind of shops?
  What kind of services does he download, what kind of movies/music/games does he like, what products does he buy?
  Is his plan the right plan for him?
  Is he in a good mood?
  Is he a valuable customer, is he a good payer, what is your margin on him, how many times per month does he call a call center, does he look up help for mail on the internet?
  Can you predict if he is going to pay the bill?
Architecture (diagram; labels only): Decision Engine; Actions; Events; SBA Application Server Container; Amdocs Event Collector Container; Inference Engine (Business Rules); Amdocs Integration Framework; Event Ingestion; Events; Scheduled Events; Bayesian Belief Network; CRM; RM; OMS; CRM; “Sesame”; Operational Systems; NW; Web 2.0; AllegroGraph Triple Store DB; Event Data Sources