AllegroGraph
 

    AllegroGraph Presentation Transcript

    • AllegroGraph as a Graph Database
      Jans Aasman, Ph.D.
      CEO - Franz Inc
      Ja@Franz.com
    • Contents
      AllegroGraph as a
      Quintuple store (well, octuple store in 2011)
      RDF store
      Graph Database
      Agraph architecture
      Extreme use cases
      AMDOCS … CRM on top of a trillion triples
      Pharmaceutical … explore connections in graph space
      Demo
    • Agraph as a quintuple store
      S, P, O, G + unique ID + transaction #
      SPOG can be any data type
      S           P         O          G             ID
      1           2.0       3          4
      2001-12-12  after     010-12-12  +19258781444
      Jans        loves     pizza      file1         12
      NoOne       believes  12
      Includes very efficient geospatial and temporal representations and indices
      6 default indices, 24 user controlled indices
      Range indexing, Freetext Indexing
      Neighborhood matrices & UPI maps (for 1 ms access)
      2011: time, security
    • Agraph as an RDF store
      RDF store when you adhere to the RDF conventions.
      Full SPARQL 1.0, most of SPARQL 1.1
      RDFS++ reasoner
      GeoSpatial and Temporal representations.
      Prolog for Rules
      Soon Common Logic (CLIF+)
      As a usability layer on top of Prolog
      Easier to combine Rules and Queries
    • Agraph as a Graph Database
      If you want a Property Graph:
      use the graph argument
      Jans loves pizza gr1
      gr1 weight 90
      gr1 author Sophia
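The property-graph encoding on this slide can be illustrated with a minimal plain-Python sketch. It does not use the AllegroGraph client API; the `Quad` type, the `store` list, and the `edge_attributes` helper are hypothetical names introduced only for illustration. The fourth element of a statement names the edge, and further statements about that name carry the edge's attributes.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One statement: subject, predicate, object, optional graph (edge id)."""
    s: str
    p: str
    o: object
    g: Optional[str] = None


# The three statements from the slide.
store = [
    Quad("Jans", "loves", "pizza", "gr1"),
    Quad("gr1", "weight", 90),
    Quad("gr1", "author", "Sophia"),
]


def edge_attributes(store, s, p, o):
    """Collect the attributes of edge s -p-> o by following its graph id."""
    attrs = {}
    for edge in store:
        if (edge.s, edge.p, edge.o) == (s, p, o) and edge.g is not None:
            # Every statement whose subject is the edge id describes the edge.
            for q in store:
                if q.s == edge.g:
                    attrs[q.p] = q.o
    return attrs


print(edge_attributes(store, "Jans", "loves", "pizza"))
# {'weight': 90, 'author': 'Sophia'}
```

The linear scans keep the sketch short; a real store would answer both lookups from its indices.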
    • Schema
      Node typing          Yes
      Edge typing          Yes
      Attributes (nodes)   Yes
      Attributes (edges)   Yes: A trusts B gr1, gr1 certainty 80.
      Directed edges       Yes: A trusts B
      Undirected edges     Yes, if using RDFS symmetric property or generators
      Restricted edges     Yes, if it means there can be islands.
      Loop edges           Yes, A loves A
      Attribute indexing   Yes
      Starting node        No, although, is that a DB property?
      Schema               Yes and No: on demand you can use an ontology, and validation is straightforward
    • Querying
      Language     Lisp, Prolog, JavaScript and toy version of Gremlin
      Traversals   Yes, through adjacency lists and special indices. This seems to be an implementation point and not a fundamental property
    • Database
      Transactional         Yes
      ACID                  Yes
      Fully Indexed         Yes
      Distributed           Federation (in-machine, between machines), AG5
      Cache                 Yes, adjacency vectors (neighbourhood matrices)
      Embeddable            Yes: 3.3, No: 4.2.x
      Store-engine          Custom
      Migration framework   From RDB to Graph DB? Various
      Object mapping        Only in Lisp, not in clients.
    • Utilities
      Shell                All from Lisp shell, some from cshell, wget/curl
      Algorithms           Yes, JavaScript, Prolog and Lisp
      Benchmark            Yes, but only for RDF stores and reasoning
      Protocols            REST/JSON
      RDF Store            Yes
      OWL Store            Yes
      IDE Integration      Yes
      Admin tool           Yes, AGWebview
      Importer             Yes, from various input formats
      Exporter             Yes, clients let you dump triples
      Loader               AGLoad, Gruff, AGWebview
      Scripting Language   Lisp and JavaScript.
    • Languages
      Java
      Python
      Ruby
      C#
      Scala
      Clojure
      Perl
      PHP
    • Many graph algorithms using generator model
      Because of Social Network Analysis requirements we implement many graph algorithms.
      Using generators
      A first class function that takes
      One node as input
      Returns all children
      And neighbourhood matrices (or adjacency hash-tables) for speed.
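The generator idea can be sketched in Python as a first-class function that maps one node to its children. This is an illustration only, not AllegroGraph's implementation; `TRIPLES`, `make_generator`, and the `knows` predicate are assumed names, and the prebuilt dictionary merely stands in for the neighbourhood matrix or adjacency hash-table mentioned above.

```python
from collections import defaultdict

# Hypothetical sample data: (subject, predicate, object).
TRIPLES = [
    ("a", "knows", "b"),
    ("a", "knows", "c"),
    ("b", "knows", "d"),
]


def make_generator(triples, predicate):
    """Return a function node -> list of children reached via `predicate`."""
    # Build the adjacency table once so each later call is a dictionary lookup.
    adjacency = defaultdict(list)
    for s, p, o in triples:
        if p == predicate:
            adjacency[s].append(o)

    def generator(node):
        return list(adjacency.get(node, []))

    return generator


knows = make_generator(TRIPLES, "knows")
print(knows("a"))  # ['b', 'c']
print(knows("d"))  # []
```

Because the generator is an ordinary function, any search routine can accept it as an argument without knowing how the neighbours are computed.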
    • How far is Actor1 from Actor2?
      Degrees of separation
      How far is P1 from P2
      Connection strength
      How many shortest paths from P1 to P2 through a series of predicates and rules
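Both questions on this slide can be answered with one breadth-first search. The sketch below is a generic illustration under assumptions: `GRAPH`, `neighbours`, and `shortest_path_stats` are hypothetical names, and using the number of shortest paths as a proxy for connection strength is one reading of the slide, not a confirmed definition.

```python
from collections import deque

# Hypothetical undirected sample graph as adjacency lists.
GRAPH = {
    "P1": ["A", "B"],
    "A": ["P1", "P2"],
    "B": ["P1", "P2"],
    "P2": ["A", "B"],
}


def neighbours(node):
    """Generator in the slide's sense: one node in, list of nodes out."""
    return GRAPH.get(node, [])


def shortest_path_stats(generator, start, goal):
    """Return (degrees of separation, number of shortest paths)."""
    dist = {start: 0}
    count = {start: 1}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break  # every predecessor of goal has already been expanded
        for nxt in generator(node):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = count[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another shortest route into an already discovered node.
                count[nxt] += count[node]
    if goal not in dist:
        return None, 0
    return dist[goal], count[goal]


print(shortest_path_stats(neighbours, "P1", "P2"))  # (2, 2)
```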
    • In what groups is this actor?
      Find the ego-network around a person or thing
      Friend, friends of friends, etc.
      Find all the fully connected subgraphs (cliques) around a person or thing
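A small Python sketch of the two group questions follows. It is not the AllegroGraph implementation: `GRAPH`, `ego_network`, and `cliques_around` are hypothetical names, the graph is assumed to be undirected with no self-loops, and the clique search uses the textbook Bron-Kerbosch recursion (without pivoting) as a stand-in technique.

```python
# Hypothetical undirected friendship graph (symmetric adjacency sets).
GRAPH = {
    "jans": {"ann", "bob"},
    "ann": {"jans", "bob"},
    "bob": {"jans", "ann", "cat"},
    "cat": {"bob", "dan"},
    "dan": {"cat"},
}


def neighbours(node):
    return GRAPH.get(node, set())


def ego_network(generator, actor, depth):
    """All nodes within `depth` steps of actor: friends, friends of friends, ..."""
    seen = {actor}
    frontier = {actor}
    for _ in range(depth):
        frontier = {n for node in frontier for n in generator(node)} - seen
        seen |= frontier
    return seen


def cliques_around(generator, actor):
    """Maximal fully connected groups that contain actor."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(clique)  # nothing left that could extend the clique
            return
        for v in list(candidates):
            nv = set(generator(v))
            expand(clique | {v}, candidates & nv, excluded & nv)
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand({actor}, set(generator(actor)), set())
    return found


print(sorted(ego_network(neighbours, "jans", 2)))  # ['ann', 'bob', 'cat', 'jans']
print(cliques_around(neighbours, "jans") == [{"jans", "ann", "bob"}])  # True
```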
    • Questions in SNA: How Important is an actor?
      In-degree, out-degree
      Actor degree centrality
      I have the most connections in a group so I am more important
      Actor closeness centrality
      I have shorter paths to everyone else in the group so I am more important
      Actor betweenness centrality
      I am more often on the shortest path between other people in the group so I am more important. I can control flow of information better than other people
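The three actor measures can be sketched in Python for an unweighted, undirected graph. The definitions used here are the standard textbook ones (degree divided by n-1, inverse average distance, and Brandes' algorithm for betweenness); the slides do not confirm that AllegroGraph normalises them the same way, and `GRAPH` and all function names are hypothetical.

```python
from collections import deque

# Hypothetical star-shaped group: "c" is connected to everyone else.
GRAPH = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}


def bfs_distances(graph, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def degree_centrality(graph, actor):
    """Fraction of the other group members the actor is connected to."""
    return len(graph[actor]) / (len(graph) - 1)


def closeness_centrality(graph, actor):
    """Reachable actors divided by the sum of shortest distances to them."""
    dist = bfs_distances(graph, actor)
    total = sum(d for node, d in dist.items() if node != actor)
    return (len(dist) - 1) / total if total else 0.0


def betweenness_centrality(graph):
    """Unnormalised count of shortest paths passing through each actor."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order, preds = [], {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):  # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


print(degree_centrality(GRAPH, "c"))       # 1.0
print(closeness_centrality(GRAPH, "x"))    # 0.6
print(betweenness_centrality(GRAPH)["c"])  # 3.0
```

In the star, the centre scores highest on all three measures, which matches the intuition on the slide: it has the most connections, the shortest paths to everyone, and sits on every shortest path between the others.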
    • Does the group have a leader? Is the group cohesive?
      Group centralization
      How centralized is this group?
      Does this group have a leader
      Is there someone controlling the information flow
      Group cohesiveness
      How strong and well connected is this group
      Are most people connected
      What is the density
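As a rough illustration of the group-level questions, the Python sketch below computes density and a degree-based centralization score for an undirected group. It assumes Freeman's degree centralization formula, which the slides do not name, and `STAR`, `RING`, `density`, and `degree_centralization` are hypothetical names.

```python
# Two hypothetical groups of four actors (undirected adjacency lists).
STAR = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
RING = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}


def density(graph):
    """Existing edges divided by the number of possible edges."""
    n = len(graph)
    edges = sum(len(nbrs) for nbrs in graph.values()) / 2
    return 2 * edges / (n * (n - 1))


def degree_centralization(graph):
    """1.0 for a perfect star (one clear leader), 0.0 when all degrees are equal.

    Requires at least three actors.
    """
    n = len(graph)
    degrees = [len(nbrs) for nbrs in graph.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


print(density(STAR), degree_centralization(STAR))  # 0.5 1.0
print(degree_centralization(RING))                 # 0.0
```

A high centralization score suggests a single actor who controls the information flow; a high density suggests a cohesive group in which most people are connected.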
    • All search and SNA functions use Generators
      Generator
      Input: one node
      Output: list of nodes
      Fully functional, can be complex SPARQL or Prolog queries
      Or just predicates and indication of direction
    • How to get from A to E??
      subj  pred         obj
      a     dinner-with  b
      a     kissed-with  c
      c     movie-with   e
      b     kissed-with  d
      d     movie-with   e
      e     dinner-with  a
      (defgenerator knows (node)
        (objects-of :p dinner-with))
      (defgenerator knows (node)
        (objects-of :p dinner-with)
        (subjects-of :p dinner-with))
    • How to get from A to E??
      (defgenerator knows (node)
        (objects-of :p dinner-with)
        (subjects-of :p dinner-with)
        (objects-of :p movie-with)
        (subjects-of :p movie-with)
        (objects-of :p kissed-with)
        (subjects-of :p kissed-with))
      (defgenerator knows (node)
        (undirected :p (dinner-with movie-with kissed-with)))
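The effect of the different generator definitions on the A-to-E question can be tried out with a Python sketch over the six sample triples. This mirrors the slide's idea only; `objects_of`, `undirected`, and `find_path` are hypothetical Python helpers, not AllegroGraph functions.

```python
from collections import deque

# The sample triples from the slide.
TRIPLES = [
    ("a", "dinner-with", "b"),
    ("a", "kissed-with", "c"),
    ("c", "movie-with", "e"),
    ("b", "kissed-with", "d"),
    ("d", "movie-with", "e"),
    ("e", "dinner-with", "a"),
]
ALL = {"dinner-with", "movie-with", "kissed-with"}


def objects_of(predicates):
    """Generator that follows edges from subject to object only."""
    def gen(node):
        return [o for s, p, o in TRIPLES if s == node and p in predicates]
    return gen


def undirected(predicates):
    """Generator that follows edges in both directions."""
    def gen(node):
        out = [o for s, p, o in TRIPLES if s == node and p in predicates]
        inc = [s for s, p, o in TRIPLES if o == node and p in predicates]
        return out + inc
    return gen


def find_path(generator, start, goal):
    """Breadth-first search; returns one shortest path or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in generator(node):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


print(find_path(objects_of({"dinner-with"}), "a", "e"))  # None
print(find_path(objects_of(ALL), "a", "e"))              # ['a', 'c', 'e']
print(find_path(undirected(ALL), "a", "e"))              # ['a', 'e']
```

The search code stays the same in all three calls; only the generator changes, which is the point of the generator model.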
    • Declaratively specify
      (defgenerator knows (node)
        (select (?x)
          (q ??node movie-with ?x)
          (q ??node dinner-with ?x)
          (not (q ??node kissed-with ?x)))
        (select (?x)
          (q ?x movie-with ??node)
          (q- ?x dinner-with ??node)
          (not (q- ?x kissed-with ??node))))
    • Sample SNA functions
      (Ego-group actor generator depth ?group)
      - binds ?group to the group of nodes
      (Ego-group-members actor generator depth ?a)
      - binds ?a to every member in the group
      (Cliques actor generator min-depth ?cl)
      - binds ?cl to all cliques
      (Clique-members actor generator min-depth ?cl ?a)
      - binds ?cl to cliques and then iterates over every member ?a in ?cl
      (Actor-centrality actor group generator ?num)
      - binds ?num to the actor centrality
      (Actor-centrality-members group ?actor ?num)
      - binds ?actor to every actor in group and ?num to the centrality of
      that actor, starting with the actor with the highest centrality.
      (Group-centrality group generator ?num)
      Actor = single node
      Group = list of nodes
      Depth = number
      Generator = generator
    • Integrated in Prolog and Common Logic (CLIF)
      (defgenerator knows (node)
        (undirected :p (!fr:dinner-with !fr:kissed-with)))
      (select (?x)
        (ego-group-members !person:jans knows ?x 2)
        (q ?x !geo:place ?y)
        (geo-box-around !geoname:Berkeley ?y 5 miles))
      (select (?x)
        (ego-group !person:jans knows ?group 2)
        (actor-centrality-members ?group knows ?x ?num)
        (q ?x !geo:place ?y)
        (geo-box-around !geoname:Berkeley ?y 5 miles))
    • Where do we use this?
      Amdocs: Know everything about every customer
      Partitioned on customer
      Most graph search centered in client
      Pfizer: help me find connections between drugs, diseases, genes, side effects in a sea of clinical trials
      Just a mess of data
      All graph search in server
    • Traditional Business Intelligence
      Can tell you ALL about
      the average customer
      but NOTHING about
      the individual.
    • Can you in < 1 second with one push of a button
      Predict the three most likely reasons why Joe Smith from Kansas is calling the call center? Bill unexpectedly high, losing connection too often, doesn’t know how to use new subscription service?
      The last ten events that happened for JS? Phone calls, SMS, downloads of movie, device stopped working, payment of bill, looking at map, search for local store.
      What is the likelihood that he will change from T-Mobile to Sprint or AT&T?
      Who are his ten most important friends and what devices do they have? And who is the first to change and who follows?
    • Can you in < 1 second with one push of a button
      What are the usual daily locations for this person? What kind of shops?
      What kind of services does he download, what kind of movies/music/games does he like, what products does he buy?
      Is his plan the right plan for him?
      Is he in a good mood?
      Is he a valuable customer, is he a good payer, what is your margin on him, how many times per month does he call a call center, does he look up help for mail on the internet? Can you predict if he is going to pay the bill?
    • Architecture
      [Architecture diagram; only the box labels survive]
      Decision Engine; Actions; Events; SBA Application Server; Container
      Amdocs Event Collector; Inference Engine (Business Rules); Amdocs Integration Framework
      Event Ingestion; Scheduled Events; Bayesian Belief Network
      CRM; RM; OMS; “Sesame”; Operational Systems; NW; Web 2.0
      AllegroGraph Triple Store DB; Event Data Sources
    • Work for Pharma
    • sider
    • Gruff Demo
    • What about Scalability
    • Architecture overview
      Client languages: Java (Sesame, Jena), Python, Ruby, C#, Clojure, Scala, Perl
      REST
      Backup/Restore, Replication, Warm Failover, Security, Management
      SPARQL, Prolog, Rules, CLIF++, Geo, SNA, Time, RDFS+, JavaScript
      Session Management, Query Engine, Federation
      Storage layer (compression, indexing, freetext, transactions)
    • Thanks…