Successfully reported this slideshow.
Upcoming SlideShare
×

# PSL Overview

28,028 views

Published on

An overview of probabilistic soft logic, a language for templating hinge-loss Markov random fields.

Published in: Technology
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

### PSL Overview

1. 1. Probabilistic Soft Logic: An Overview Lise Getoor, Stephen Bach, Bert Huang University of Maryland, College Park http://psl.umiacs.umd.edu Additional Contributors: Matthias Broecheler, Shobeir Fakhraei, Angelika Kimmig, London, Alex Memory, Hui Miao, Lily Mihalkova, Eric Norris, Jay Pujara, Theo Rekatsinas, Arti Rames
2. 2. Probabilistic Soft Logic (PSL) Declarative language based on logics to express collective probabilistic inference problems  Predicate = relationship or property  Atom = (continuous) random variable  Rule = capture dependency or constraint  Set = define aggregates PSL Program = Rules + Input DB
3. 3. Probabilistic Soft Logic (PSL) Declarative language based on logics to express collective probabilistic inference problems  Predicate = relationship or property  Atom = (continuous) random variable  Rule = capture dependency or constraint  Set = define aggregates PSL Program = Rules + Input DB
4. 4. Probabilistic Soft Logic Entity Resolution  Entities - People References  Attributes - Name  Relationships - Friendship  Goal: Identify references that denote the same person A B John Smith J. Smith name name C E D F G H friend friend = =
5. 5. Probabilistic Soft Logic Entity Resolution  References, names, friendships  Use rules to express evidence - ‘’If two people have similar names, they are probably the same’’ - ‘’If two people have similar friends, they are probably the same’’ - ‘’If A=B and B=C, then A and C must also denote the same person’’ A B John Smith J. Smith name name C E D F G H friend friend = =
6. 6. Probabilistic Soft Logic Entity Resolution  References, names, friendships  Use rules to express evidence - ‘’If two people have similar names, they are probably the same’’ - ‘’If two people have similar friends, they are probably the same’’ - ‘’If A=B and B=C, then A and C must also denote the same person’’ A B John Smith J. Smith name name C E D F G H friend friend = = A.name ≈{str_sim} B.name => A≈B : 0.8
7. 7. Probabilistic Soft Logic Entity Resolution  References, names, friendships  Use rules to express evidence - ‘’If two people have similar names, they are probably the same’’ - ‘’If two people have similar friends, they are probably the same’’ - ‘’If A=B and B=C, then A and C must also denote the same person’’ A B John Smith J. Smith name name C E D F G H friend friend = = {A.friends} ≈{} {B.friends} => A≈B : 0.6
8. 8. Probabilistic Soft Logic Entity Resolution  References, names, friendships  Use rules to express evidence - ‘’If two people have similar names, they are probably the same’’ - ‘’If two people have similar friends, they are probably the same’’ - ‘’If A=B and B=C, then A and C must also denote the same person’’ A B John Smith J. Smith name name C E D F G H friend friend = = A≈B ^ B≈C => A≈C : ∞
9. 9. Probabilistic Soft Logic Link Prediction  Entities - People, Emails  Attributes - Words in emails  Relationships - communication, work relationship  Goal: Identify work relationships - Supervisor, subordinate, colleague        
10. 10. Probabilistic Soft Logic Link Prediction  People, emails, words, communication, relations  Use rules to express evidence - “If email content suggests type X, it is of type X” - “If A sends deadline emails to B, then A is the supervisor of B” - “If A is the supervisor of B, and A is the supervisor of C, then B and C are colleagues”        
11. 11. Probabilistic Soft Logic Link Prediction  People, emails, words, communication, relations  Use rules to express evidence - “If email content suggests type X, it is of type X” - “If A sends deadline emails to B, then A is the supervisor of B” - “If A is the supervisor of B, and A is the supervisor of C, then B and C are colleagues”         complete by due
12. 12. Probabilistic Soft Logic Link Prediction  People, emails, words, communication, relations  Use rules to express evidence - “If email content suggests type X, it is of type X” - “If A sends deadline emails to B, then A is the supervisor of B” - “If A is the supervisor of B, and A is the supervisor of C, then B and C are colleagues”        
13. 13. Probabilistic Soft Logic Link Prediction  People, emails, words, communication, relations  Use rules to express evidence - “If email content suggests type X, it is of type X” - “If A sends deadline emails to B, then A is the supervisor of B” - “If A is the supervisor of B, and A is the supervisor of C, then B and C are colleagues”        
14. 14. Probabilistic Soft Logic  Node Labeling ?
15. 15. Probabilistic Soft Logic  Voter Opinion Modeling ? \$\$ Tweet Status update
16. 16. Probabilistic Soft Logic  Voter Opinion Modeling    spouse spouse colleague colleague spouse friend friend friend friend
17. 17. Probabilistic Soft Logic  Voter Opinion Modeling    vote(A,P) ∧ spouse(B,A)  vote(B,P) : 0.8 vote(A,P) ∧ friend(B,A)  vote(B,P) : 0.3 spouse spouse colleague colleague spouse friend friend friend friend
18. 18. Probabilistic Soft Logic Multiple Ontologies 18 Organization Employees Developer Staff CustomersService & Products Hardware IT Services Sales PersonSoftware work forprovides buys helps interacts sells todevelops Company Employee Technician Accountant CustomerProducts & Services Hardware Consulting SalesSoftware Dev works fordevelop buys helps interacts with sells
19. 19. Probabilistic Soft Logic Ontology Alignment 19 Organization Employees Developer Staff CustomersService & Products Hardware IT Services Sales PersonSoftware work forprovides buys helps interacts sells todevelops Company Employee Technician Accountant CustomerProducts & Services Hardware Consulting SalesSoftware Dev works fordevelop buys helps interacts with sells
20. 20. Probabilistic Soft Logic Ontology Alignment 20 Organization Employees Developer Staff CustomersService & Products Hardware IT Services Sales PersonSoftware work forprovides buys helps interacts sells todevelops Company Employee Technician Accountant CustomerProducts & Services Hardware Consulting SalesSoftware Dev works fordevelop buys helps interacts with sells Match, Don’t Match?
21. 21. Probabilistic Soft Logic Ontology Alignment 21 Organization Employees Developer Staff CustomersService & Products Hardware IT Services Sales PersonSoftware work forprovides buys helps interacts sells todevelops Company Employee Technician Accountant CustomerProducts & Services Hardware Consulting SalesSoftware Dev works fordevelop buys helps interacts with sells Similar to what extent?
22. 22. Probabilistic Soft Logic Logic Foundation
23. 23. Probabilistic Soft Logic Rules  Atoms are real valued - Interpretation I, atom A: I(A) [0,1] - We will omit the interpretation and write A [0,1]  ∨, ∧ are combination functions - T-norms: [0,1]n →[0,1] H1 ∨... Hm ← B1 ∧ B2 ∧ ... Bn Ground Atoms [Broecheler, et al., UAI ‘1
24. 24. Probabilistic Soft Logic Rules  Combination functions (Lukasiewicz T-norm)  A ∨ B = min(1, A + B)  A ∧ B = max(0, A + B – 1) H1 ∨... Hm ← B1 ∧ B2 ∧ ... Bn [Broecheler, et al., UAI ‘1
25. 25. Probabilistic Soft Logic Satisfaction  Establish Satisfaction - ∨ (H1,..,Hm) ← ∧ (B1,..,Bn) H1 ∨... Hm ← B1 ∧ B2 ∧ ... Bn H1 ← B1:0.7 ∧ B2:0.8 [Broecheler, et al., UAI ‘1 ≥0.5
26. 26. Probabilistic Soft Logic Distance to Satisfaction  Distance to Satisfaction - max( ∧ (B1,..,Bn) - ∨ (H1,..,Hm) , 0) H1 ∨... Hm ← B1 ∧ B2 ∧ ... Bn H1:0.7 ← B1:0.7 ∧ B2:0.8 H1:0.2 ← B1:0.7 ∧ B2:0.8 0.0 0.3 [Broecheler, et al., UAI ‘1
27. 27. Probabilistic Soft Logic W: H1 ∨... Hm ← B1 ∧ B2 ∧ ... Bn Rule Weights  Weighted Distance to Satisfaction - d(R,I) = W  max( ∧ (B1,..,Bn) - ∨ (H1,..,Hm) , 0) [Broecheler, et al., UAI ‘1
28. 28. Probabilistic Soft Logic Let’s Review  Given a data set and a PSL program, we can construct a set of ground rules.  Some of the atoms have fixed truth values and some have unknown truth values.  For every assignment of truth values to the unknown atoms, we get a set of weighted distances from satisfaction.  How to decide which is best?
29. 29. Probabilistic Soft Logic Probabilistic Foundation
30. 30. Probabilistic Soft Logic Probabilistic Model Probability density over interpretation I Normalization constant Set of ground rules Distance exponent in {1, 2} Rule’s weight Rule’s distance to satisfaction
31. 31. Probabilistic Soft Logic Hinge-loss Markov Random Fields Subject to arbitrary linear constraints PSL models ground out to HL-MRFs Log-concave!
32. 32. Probabilistic Soft Logic MPE Inference
33. 33. Probabilistic Soft Logic Inferring Most Probable Explanations Objective: Convex optimization Decomposition-based inference algorithm using the ADMM framework
34. 34. Probabilistic Soft Logic Alternating Direction Method of Multipliers • We perform inference using the alternating direction method of multipliers (ADMM) • Inference with ADMM is fast, scalable, and straightforward • Optimize subproblems (ground rules) independently, in parallel • Auxiliary variables enforce consensus across subproblems [Bach et al., NIPS ’12]
35. 35. Probabilistic Soft Logic Inference Algorithm Initialize local copies of variables and Lagrange multipliers Begin inference iterations Simple updates solve subproblems for each potential... ...and each constraint Average to update global variables and clip to [0,1] [Bach, et al., Under revie
36. 36. Probabilistic Soft Logic ADMM Scalability 0 100 200 300 400 500 600 125000 225000 325000 Timeinseconds Number of potential functions and constraints ADMM Interior-point method [Bach, et al., NIPS ‘12]
37. 37. Probabilistic Soft Logic ADMM Scalability [Bach, et al., Under revie
38. 38. Probabilistic Soft Logic Distributed MPE Inference: MapReduce Implementation z1z1 zq zqz1 z2z2 zp sub problem local variable copy Mapper z1 z2z1 z1 z2 zq zq zp Reducer update global component load global variable X as side data Job Bootstrap HDFS or HBase read/write subproblem write new global variable read global variable X Hui Miao, Xiangyang Liu
39. 39. Probabilistic Soft Logic Distributed MPE Inference: GraphLab Implementation . . . . . . . . . . . . sub problem node global variable component gather get z apply update y update x scatter notify z gather get local z,y apply update z scatter unless converge notify X update i update i+1 Hui Miao, Xiangyang Liu
40. 40. Probabilistic Soft Logic Weight Learning
41. 41. Probabilistic Soft Logic Weight Learning Learn from training data No need to hand-code rule-weights Various methods: - approximate maximum likelihood [Broecheler et al., UAI ’10] - maximum pseudolikelihood - large-margin estimation [Bach, et al., UAI ’13]
42. 42. Probabilistic Soft Logic Weight Learning State-of-the-art supervised-learning performance on - Collective classification - Social-trust prediction - Preference prediction - Image reconstruction [Bach, et al., UAI ’13]
43. 43. Probabilistic Soft Logic Foundations Summary
44. 44. Probabilistic Soft Logic Foundations Summary Design probabilistic models using declarative language - Syntax based on first-order logic Inference of most-probable explanation is fast convex optimization (ADMM) Learning algorithms for training rule weights from labeled data
45. 45. Probabilistic Soft Logic PSL Applications
46. 46. Probabilistic Soft Logic Document Classification     2 2 2 2 2A B A or B? A or B? A or B?  Given a networked collection of documents  Observe some labels  Predict remaining labels using  link direction  inferred class label [Bach, et al., UAI ’13]
47. 47. Probabilistic Soft Logic Computer Vision Applications Low-level vision: - image reconstruction High-level vision: - activity recognition in videos
48. 48. Probabilistic Soft Logic Image Reconstruction RMSE reconstruction error [Bach, et al., UAI ’13]
49. 49. Probabilistic Soft Logic Activity Recognition in Videos crossing waiting queueing walking talking dancing jogging [London, et al., Under revi
50. 50. Probabilistic Soft Logic Results on Activity Recognition Recall matrix between different activity types Accuracy metrics compared against baseline features [London, et al., Under revi
51. 51. Probabilistic Soft Logic Social Trust Prediction Competing models from social psychology of strong ties - Structural balance [Granovetter ’73] - Social status [Cosmides et al., ’92] Leskovec, Huttenlocher, & Kleinberg, [2010]: effects of both models present in online social networks
52. 52. Probabilistic Soft Logic Structural Balance vs. Social Status  Structural balance: strong ties are governed by tendency toward balanced triads - e.g., the enemy of my enemy...  Social status: strong ties indicate unidirectional respect, “looking up to”, expertise status - e.g., patient-nurse-doctor, advisor-advisee
53. 53. Probabilistic Soft Logic Structural Balance in PSL [Huang, et al., SBP ‘1
54. 54. Probabilistic Soft Logic Structural Balance in PSL [Huang, et al., SBP ‘1
55. 55. Probabilistic Soft Logic Social Status in PSL [Huang, et al., SBP ‘1
56. 56. Probabilistic Soft Logic Social Status in PSL [Huang, et al., SBP ‘1
57. 57. Probabilistic Soft Logic Experiments  User-user trust ratings from two different online social networks  Observe some ratings, predict held-out  Eight-fold cross validation on two data sets: - FilmTrust - movie review network, trust ratings from 1-10 - Epinions - product review network, trust / distrust ratings {-1, 1} [Huang, et al., SBP ‘1
58. 58. Probabilistic Soft Logic Compared Methods  TidalTrust: graph-based propagation of trust - Predict trust via breadth-first search to combine closest known relationships  EigenTrust: spectral method for trust - Predict trustworthiness of nodes based on eigenvalue centrality of weighted trust network  Average baseline: predict average trust score for all relationships [Huang, et al., SBP ‘1
59. 59. Probabilistic Soft Logic FilmTrust Experiment  Normalize [1,10] rating to [0,1]  Prune network to largest connected-component  1,754 users, 2,055 relationships  Compare mean average error, Spearman’s rank coefficient, and Kendall-tau distance * measured on only non-default predictions • PSL-Status and PSL-Balance disagree on 514 relationships [Huang, et al., SBP ‘1
60. 60. Probabilistic Soft Logic Epinions Experiment  Snowball sample of 2,000 users from Epinions data set  8,675 trust scores normalized to {0,1}  Measure area under precision-recall curve for distrust edges (rarer class) [Huang, et al., SBP ‘1
61. 61. Probabilistic Soft Logic Knowledge Graph Identification  Problem: Collectively reason about noisy, inter-related fact extractions  Task: NELL fact-promotion (web-scale IE) - Millions of extractions, with entity ambiguity and confidence scores - Rich ontology: Domain, Range, Inverse, Mutex, Subsumption  Goal: Determine which facts to include in NELL’s knowledge base Jay Pujara, Hui Miao, William Cohen
62. 62. Probabilistic Soft Logic PSL Rules Candidate Extractions: - CandRel_CMC(E1,E2,R) => Rel(E1,E2,R) - CandLbl_Morph(E1,L) => Lbl(E1,L) Entity Resolution: - SameEntity(E1,E2) & Lbl(E1,L) => Lbl(E2,L) - SameEntity(E1,E2) & Rel(E1,X,R) => Rel(E2,X,R) Ontological Rules: - Domain(R,L) & Rel(E1,E2,R) => Lbl(E1,L) - Mut(L1,L2) & Lbl(E,L1) => ~Lbl(E,L2)
63. 63. Probabilistic Soft Logic Results Data: Iteration 165 of NELL - 1.2M extractions - 68K ontological constraints - 14K labeled instances (Jiang, 2012) - NELL & MLN baselines (Jiang, 2012) F1 Precision Recall AUC-PR NELL .673 .801 .580 .765 MLN .836 .837 .836 .899 PSL .841 .773 .922 .882
64. 64. Probabilistic Soft Logic Schema Matching  Correspondences between source and target schemas  Matching rules - ‘’If two concepts are the same, they should have similar subconcepts’’ - ‘’If the domains of two attributes are similar, they may be the same’’ Organization Customers Service & Products provides buys Company Customer Products & Services develop buys Portfolios includes develop(A, B) <= provides(A, B) Company(A) <= Organization(A) Products&Services(B) <= Service&Products(B)
65. 65. Probabilistic Soft Logic Schema Mapping  Input: Schema matches  Output: S-T query pairs (TGD) for exchange or mediation  Mapping rules - “Every matched attribute should participate in some TGD.” - “The solutions to the queries in TGDs should be similar.” Organization Customers Service & Products provides buys Company Customer Products & Services develop buys Portfolios includes ∃Portfolio P, develop(A, P) ∧ includes(P, B) <= provides(A, B) . . . Alex Memory, Angelika Kimmig