# Rough Set Semantics for Identity Management on the Web

Presented at the AAAI Fall Symposium for Big Data on 2013-11-15.

Published in: Education, Technology

### Transcript

• 1. Rough Set Semantics for Identity Management on the Web
  Wouter Beek (wouterbeek.com), Stefan Schlobach, Frank van Harmelen
• 2. Problems of identity
  • Statements only hold in certain contexts (no substitution salva veritate).
  • Identity is mistaken for representation.
  • Identity is mistaken for (close) relatedness.
  But more importantly:
  • Semantics: identity assertion (claim about meaning)
  • Pragmatics: data linking (import additional properties)
  • Due to: Open World Assumption
• 3. owl:differentFrom(Semantics, Pragmatics)
  • Semantics: $\langle a_1, a_2 \rangle \in Ext(I(\text{owl:sameAs}))$ iff $a_1 = a_2$
  • Practice: “Link your data to other people’s data to provide context.” [5-star LOD] “RDF links often have the owl:sameAs predicate.” [VoID]
• 4. Can Leibniz help?
  • Indiscernibility of identicals (Leibniz’ principle): $a = b \rightarrow \forall\phi\,(\phi(a) = \phi(b))$
  • Identity of indiscernibles: $\forall\phi\,(\phi(a) = \phi(b)) \rightarrow a = b$
  • Trivially true, since $\lambda x.(x = b)$ is one of the $\phi$’s.
• 5. Solutions (as identified in the literature) [1/2]
  1) Weaken owl:sameAs. E.g. skos:closeMatch.
  2) Extend owl:sameAs. Annotate with fuzziness or uncertainty.
  3) Make contexts explicit. E.g. use named graphs. E.g. use namespaces. “That is the star that can be seen in the morning, but not in the evening”@geolocation
• 6. Solutions (as identified in the literature) [2/2]
  4) Use domain-specific identity relations. “x and y have the same medical use”@medicine. “x and y are the same molecule”@chemistry.
  5) Change modeling practice. Notification upon read. Require reciprocal confirmation upon change.
  “On the Web of Data, anybody can say anything about anything.” [Van Harmelen]
• 7. Indiscernibility
  • Identity is the smallest equivalence relation.
  • Indiscernibility: resources are the same w.r.t. a limited set of predicates.
  • Indiscernibility is an equivalence relation (reasoning!), although not necessarily the smallest one.
  • Every indiscernibility relation is also an identity relation, but over a different domain.
  • Example: Take the set of people and the property $P_i \subseteq People \times Income$. Context $\{P_i\}$ induces the identity relation between income groups.
• 8. Indiscernibility 1
  Two resources are indiscernible w.r.t. a set of predicates $P \subseteq P_G$ (predicate terms in $G$), if they share the predicate-object pairs for $P$.
  $IND(P) = \{ \langle x, y \rangle \in S_G^2 \mid \forall p \in P \, (f_p(x) = f_p(y)) \}$
  where $f_p(x) = \{ y \mid \langle I(x), y \rangle \in Ext(I(p)) \}$
  Example: “Wouter and Stefan have the same employer, so they are indiscernible w.r.t. predicate hasEmployer.”
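The definition of IND(P) on slide 8 can be illustrated with a minimal Python sketch. It assumes the graph is a plain set of (subject, predicate, object) triples and ignores the interpretation function I; the toy triples and the names `TRIPLES`, `f`, `ind`, `employerA`, and `employerB` are hypothetical and do not come from the slides.

```python
from itertools import product

# Hypothetical toy graph: (subject, predicate, object) triples.
TRIPLES = {
    ("wouter", "hasEmployer", "employerA"),
    ("stefan", "hasEmployer", "employerA"),
    ("albert", "hasEmployer", "employerB"),
    ("wouter", "employedAs", "PhD"),
    ("stefan", "employedAs", "AssistantProfessor"),
}


def f(triples, p, x):
    """f_p(x): the set of objects that subject x has for predicate p."""
    return frozenset(o for s, q, o in triples if s == x and q == p)


def ind(triples, predicates):
    """IND(P): all subject pairs that agree on f_p for every p in P."""
    subjects = {s for s, _, _ in triples}
    return {
        (x, y)
        for x, y in product(subjects, repeat=2)
        if all(f(triples, p, x) == f(triples, p, y) for p in predicates)
    }


SAME_EMPLOYER = ind(TRIPLES, {"hasEmployer"})
SAME_EMPLOYER_AND_ROLE = ind(TRIPLES, {"hasEmployer", "employedAs"})
```

In this toy graph, Wouter and Stefan are indiscernible w.r.t. `hasEmployer` alone, but adding `employedAs` to the predicate set separates them, which shows how a larger P yields a finer relation.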
• 9. Indiscernibility 2
  • We take a given identity relation and partition it into subsets (i.e. identity sub-relations) which are described in terms of the vocabulary.
  • Subsets of the given identity relation are $P^*$-indiscernible, for sets of predicates $P^* \subseteq \wp(P_G)$.
  Example:
  • “(Wouter and Albert) and (Stefan and Paul) belong to the same identity sub-relation, since they are indiscernible w.r.t. the same collections of properties.”
  • Wouter and Albert are “employedAs PhD”; Stefan and Paul are “employedAs Assistant Professor”.
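• 10. Indiscernibility 2
  $P^* \subseteq \wp(P_G)$
  $IND(P^*) = \{ \langle \langle x_1, y_1 \rangle, \langle x_2, y_2 \rangle \rangle \in (S_G^2)^2 \mid \forall P \in P^* \, (\langle x_1, y_1 \rangle \in IND(P) \leftrightarrow \langle x_2, y_2 \rangle \in IND(P)) \}$
  For comparison: $P \subseteq P_G$
  $IND(P) = \{ \langle x, y \rangle \in S_G^2 \mid \forall p \in P \, (f_p(x) = f_p(y)) \}$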
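A rough sketch of how IND(P*) groups pairs: two pairs land in the same block when they are indiscernible under exactly the same members of P*. The sketch assumes triples as plain tuples; the data and the names `signature`, `partition`, `P_STAR`, `PAIRS`, and `employerA` are hypothetical illustrations, loosely following the employedAs example from slide 9.

```python
# Hypothetical toy graph, following the employedAs example.
TRIPLES = {
    ("wouter", "employedAs", "PhD"),
    ("albert", "employedAs", "PhD"),
    ("stefan", "employedAs", "AssistantProfessor"),
    ("paul", "employedAs", "AssistantProfessor"),
    ("wouter", "hasEmployer", "employerA"),
    ("stefan", "hasEmployer", "employerA"),
}


def f(triples, p, x):
    """f_p(x): the set of objects that subject x has for predicate p."""
    return frozenset(o for s, q, o in triples if s == x and q == p)


def indiscernible(triples, predicates, x, y):
    """True if <x, y> is in IND(P) for the given predicate set P."""
    return all(f(triples, p, x) == f(triples, p, y) for p in predicates)


def signature(triples, p_star, pair):
    """The members P of P* for which the pair is in IND(P)."""
    x, y = pair
    return frozenset(P for P in p_star if indiscernible(triples, P, x, y))


def partition(triples, p_star, pairs):
    """Group pairs by signature; each group is one IND(P*) block."""
    blocks = {}
    for pair in pairs:
        blocks.setdefault(signature(triples, p_star, pair), set()).add(pair)
    return blocks


# P* is a subset of the powerset of predicates; here two singleton sets.
P_STAR = {frozenset({"employedAs"}), frozenset({"hasEmployer"})}
PAIRS = [("wouter", "albert"), ("stefan", "paul"), ("wouter", "stefan")]
BLOCKS = partition(TRIPLES, P_STAR, PAIRS)
```

Here (Wouter, Albert) and (Stefan, Paul) share a block because both pairs agree on `employedAs` and on nothing else in P*, while (Wouter, Stefan) forms its own block.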
• 11. Example of an indiscernibility partition
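• 12. Rough set approximation
  • Higher (upper) approximation: $x \approx_H y \Leftrightarrow \exists u, v \, (\langle u, v \rangle \, \mathcal{R} \, \langle x, y \rangle \wedge u \approx v)$
  • Lower approximation: $x \approx_L y \Leftrightarrow \forall u, v \, (\langle u, v \rangle \, \mathcal{R} \, \langle x, y \rangle \rightarrow u \approx v)$
  • But what is $\mathcal{R}$ (‘resemblance’)? $\mathcal{R} = IND(\wp(P_G))$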
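The two approximations on slide 12 can be sketched over an already computed partition. This is a minimal illustration, assuming the equivalence classes of ℛ are given as disjoint, non-empty sets of pairs; the function name `approximations` and the example pairs are hypothetical.

```python
def approximations(blocks, identity):
    """Return (lower, higher) approximations of an identity relation.

    blocks:   disjoint, non-empty sets of pairs (equivalence classes of R).
    identity: the set of pairs asserted to be identical.
    """
    lower, higher = set(), set()
    for block in blocks:
        if block & identity:
            # Some pair in the block is an identity pair: the whole
            # block belongs to the higher approximation.
            higher |= block
            if block <= identity:
                # Every pair in the block is an identity pair: the
                # whole block also belongs to the lower approximation.
                lower |= block
    return lower, higher


# Hypothetical partition of five pairs into three blocks.
BLOCKS = [
    {("a", "b"), ("c", "d")},
    {("e", "f"), ("g", "h")},
    {("i", "j")},
]
IDENTITY = {("a", "b"), ("c", "d"), ("e", "f")}
LOWER, HIGHER = approximations(BLOCKS, IDENTITY)
```

The first block contains only identity pairs, so it is in both approximations; the second block is mixed and only enters the higher approximation; the third block contains no identity pairs and is in neither.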
• 13. Example of indiscernibility approximations
• 14. Quality
  $\alpha(\approx) = \frac{|\approx_L|}{|\approx_H|}$
  • Based on the rough set approximation $\approx_L$, $\approx_H$.
  • Since a consistently applied identity relation has relatively many partition sets that contain either no identity pairs (small value for $|\approx_H|$) or only identity pairs (large value for $|\approx_L|$), a more consistent identity relation has a higher quality metric.
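As a worked illustration of the quality measure, the sketch below computes the ratio of the size of the lower approximation to the size of the higher approximation. It assumes the partition blocks are disjoint sets of pairs; the function name `quality`, the choice to return `None` for an empty higher approximation, and the example data are assumptions made here, not part of the slides.

```python
from fractions import Fraction


def quality(blocks, identity):
    """Ratio |lower| / |higher| for an identity relation over a partition.

    blocks:   disjoint sets of pairs (the indiscernibility partition).
    identity: the set of pairs asserted to be identical.
    Returns None when no block contains an identity pair.
    """
    # Blocks consisting only of identity pairs count towards the lower size.
    lower = sum(len(b) for b in blocks if b and b <= identity)
    # Blocks containing at least one identity pair count towards the higher size.
    higher = sum(len(b) for b in blocks if b & identity)
    if higher == 0:
        return None
    return Fraction(lower, higher)


# Hypothetical partition with two blocks of two pairs each.
BLOCKS = [{("a", "b"), ("c", "d")}, {("e", "f"), ("g", "h")}]

CONSISTENT = quality(BLOCKS, {("a", "b"), ("c", "d")})
MIXED = quality(BLOCKS, {("a", "b"), ("c", "d"), ("e", "f")})
SCATTERED = quality(BLOCKS, {("a", "b"), ("e", "f")})
```

An identity relation that fills whole blocks scores 1, one that fills one block and touches another scores 1/2, and one that only partially touches blocks scores 0.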
• 15. Generalizations
  • This works for any binary relation (not only owl:sameAs).
  • We only discussed the identity of non-property resources, but properties can also be identical.
  • We skipped the treatment of blank nodes and typed literals (which have special identity criteria).
  • The indiscernibility ‘language’ can be made much stronger, allowing more fine-grained identity sub-relations:
    • Length-1 paths, e.g. “Wouter lives in the Netherlands.”
    • Length-2 paths, e.g. “Wouter lives in a country which borders Germany.”
    • Length-$n$ paths.
    • Intervals in the value space of typed literals, e.g. “was published between 1901 and 1905”
    • Natural language translation, e.g. “lives in Germany” and “lives in Deutschland”
• 16. Depth-$n$ Predicate Path Map (PPM)
  A sequence of $n$ predicates denoting a (functional) mapping from subject terms into sets of object terms:
  $f_{\langle p_1, \ldots, p_n \rangle}(s) = \{ o \in O_G \mid \exists x_1, \ldots, x_{n-1} \, (x_n = o \wedge \bigwedge_{i=1}^{n-1} \langle I(x_i), I(x_{i+1}) \rangle \in \bigcup_{p \in p_{i+1}} Ext(I(p))) \}$
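One way to read a predicate path map is as a walk through the graph, one predicate step at a time. The sketch below is an illustration under stated assumptions: triples are plain tuples, each step of the path is a set of admissible predicates, and the walk starts at the subject `s`. The name `path_map` and the predicates `livesIn` and `borders` are hypothetical.

```python
# Hypothetical toy graph for the path examples.
TRIPLES = {
    ("wouter", "livesIn", "Netherlands"),
    ("Netherlands", "borders", "Germany"),
    ("Netherlands", "borders", "Belgium"),
}


def path_map(triples, path, s):
    """Objects reachable from s by following the path step by step.

    path: a sequence of predicate sets; step i may use any predicate
          in path[i] (an assumption of this sketch).
    """
    frontier = {s}
    for step in path:
        # Advance one hop along any predicate allowed at this step.
        frontier = {o for x, p, o in triples if x in frontier and p in step}
    return frontier


LENGTH_1 = path_map(TRIPLES, [{"livesIn"}], "wouter")
LENGTH_2 = path_map(TRIPLES, [{"livesIn"}, {"borders"}], "wouter")
```

Two subjects would then be compared on the object sets that the same path yields for each of them, in the same way that single predicates are compared in the depth-1 case.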
• 17. Indiscernibility 1 (generalized)
  Two resources are indiscernible w.r.t. a set of PPMs $P \subset P_G^n$, if they share the properties denoted by $P$.
  $IND(P) = \{ \langle x, y \rangle \in S_G^2 \mid \forall p \in P \, (f_p(x) \asymp f_p(y)) \}$
  Example: “Wouter and Stefan have the same employer, so they are indiscernible w.r.t. has-employer.”
  Details:
  • $P = \bigcup_{\langle p_1, \ldots, p_n \rangle \in P} p_1 \times \cdots \times p_n$
• 18. Indiscernibility 2 (generalized)
  • We take a given set of pairs (e.g. an identity relation) and partition it into subsets which are described in terms of the schema.
  • Subsets of the given (identity) relation are $P^*$-indiscernible, for sets of PPMs $P^* \subseteq \wp(P_G^n)$.
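• 19. Indiscernibility 2 (generalized)
  $P^* \subseteq \wp(P_G^n)$
  $IND(P^*) = \{ \langle \langle x_1, y_1 \rangle, \langle x_2, y_2 \rangle \rangle \in (S_G^2)^2 \mid \forall P \in P^* \, (\langle x_1, y_1 \rangle \in IND(P) \leftrightarrow \langle x_2, y_2 \rangle \in IND(P)) \}$
  For comparison: $P \subset P_G^n$
  $IND(P) = \{ \langle x, y \rangle \in S_G^2 \mid \forall p \in P \, (f_p(x) \asymp f_p(y)) \}$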
• 20. Conclusion
  Problem:
  • There is a conflict between semantics and pragmatics of identity.
  • This will not be fixed in the short term by using extensions to existing logics (e.g. contexts, fuzziness, probability).
  Solution:
  • Identify different identity relations automatically, and in terms of the domain predicates (no extra constructs are needed!).
  • Define the meaning of a specific identity relation in terms of its indiscernibility criteria.