Topic Maps Exchange in the Absence of Shared Vocabularies

Means of exchanging Topic Maps are crucial for the further development of Topic Maps as an industry integration standard. All existing approaches rely on shared vocabularies. We first clarify the term "absence of shared vocabularies" in the context of Topic Maps exchange. We then review the existing approaches to Topic Maps exchange and emphasise their limitations in the absence of shared vocabularies. Finally, we introduce and discuss the SIM Approach, which allows the exchange of Topic Maps in the absence of shared vocabularies.

Published in: Education, Technology

  1. Topic Maps Exchange in the Absence of Shared Vocabularies
     • TMRA'05
     • International Workshop on Topic Maps Research and Applications, 06.10.2005
     • Lutz Maicher
     • University of Leipzig
     • [email_address]
  2. Topic Maps Exchange = Retrieval Task
     • A requesting peer requests further information about a Subject of interest.
     • The requested peers have to decide whether a Subject Proxy indicating an identical Subject is available.
     • Subject Proxies are created in a remote environment.
     • Requested peers send a fragment to the requesting peer.
     • The requesting peer has to merge in the requested fragments.
     [Diagram labels: requesting peer; requested peer (twice); "?" (twice); "none"]
  3. Enterprise Information Integration
     Source: Taylor, John: Thoughts from the Integration Consortium: Enterprise Information Integration: A New Definition, DM Review Online (9, 2004).
     Lutz Maicher (maicher@informatik.uni-leipzig.de)
  4. Existing Approaches to Topic Maps Exchange
     • TMRAP – Topic Maps Remote Access Protocol
     • TMIP – the RESTful Topic Maps Interaction Protocol (formerly: Federated Topic Maps)
     • SHARK (alternatively: Knowledge Port Approach)
     • TMShare
     • All of them are based on the TMDM.
         - If distributed peers do not use a common vocabulary (PSIs), the exchange fails completely.
  5. Semantics in Topic Maps
     • Topic Maps are a semantic technology ...
         - ... only from the perspective of information integration
         - "Subject Proxies indicating identical Subjects have to be viewed as merged ones"
     • A Subject Map Disclosure (SMD) discloses:
         - SMD ontology
             - implies the Subject Indication Approach
         - Subject Equality Decision Approach
             - defines the semantics of the given Subject Proxies with respect to the functionality of holding the Co-Location objective true
         - Subject Viewing Approach
  6. How is Subject Equality detected?
     $$\mathrm{SubjectEquality}_{\mathrm{SMD}_i}\big(\mathrm{SubjectIndication}_{\mathrm{SMD}_1}(\mathrm{SubjectIdentity}_{\mathrm{SubjectStage}_1}),\ \mathrm{SubjectIndication}_{\mathrm{SMD}_2}(\mathrm{SubjectIdentity}_{\mathrm{SubjectStage}_2})\big) \rightarrow \mathrm{SubjectIdentity}_{\text{integration perspective}}(\mathrm{SubjectStage}_1, \mathrm{SubjectStage}_2)$$
     • Subject Equality = both Subject Proxies indicate identical Subjects; governed by the Subject Equality Decision Approach of SMD i.
     • Subject Identity is indicated; governed by the Subject Indication Approach of SMD 1.
     • Subject Identity under the integration perspective?
  7. How is Subject Equality really detected?
     $$\mathrm{SubjectEquality}_{\mathrm{SMD}_i}\big(\mathrm{SubjectIndication}_{\mathrm{SMD}_1},\ \mathrm{SubjectIndication}_{\mathrm{SMD}_2},\ \mathrm{SubjectMap}_{\mathrm{SubjectProxy}_1},\ \mathrm{SubjectMap}_{\mathrm{SubjectProxy}_2}\big) \rightarrow \mathrm{true} \mid \mathrm{false}$$
     • Subject Equality = both Subject Proxies indicate identical Subjects; governed by the Subject Equality Decision Approach of SMD i.
     [Diagram labels: Subject Indication SMD1; Subject Indication SMD2; "?" (twice)]
  8. Possible Subject Equality Approaches of an SMD
     • Referential Subject Equality Approach [A reference to a discrete ‘object’ indicates the intended Subject.]
         - Subject Proxy 1 indicates its Subject by pointing to it with S1
         - Subject Proxy 2 indicates its Subject by pointing to it with S2
         - Subject Equality holds if S1 = S2
     • Structuralist Subject Equality Approach [The Subject depends on other Subject Proxies of the Subject Map.]
         - Subject Proxy 1 indicates its Subject through a set of Subject Proxies s1
         - Subject Proxy 2 indicates its Subject through a set of Subject Proxies s2
         - Subject Equality holds if s1 = s2 (or s1 is similar to s2)
     • Meaning (semantics) in linguistics
         - referential semantics: The meaning of a word is defined by the object it refers to.
         - structuralist semantics: The meaning of a word is defined by its usage in the language.
     The different Approaches to Subject Equality define the semantics of the used vocabulary at the time of the Subject Equality Decision.
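The two equality approaches on slide 8 can be contrasted in a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, a reference is modelled as a plain string, a set of Subject Proxies as a set of strings, and Jaccard overlap stands in for "similar", which the slide does not define.

```python
from typing import Set


def referential_equal(s1: str, s2: str) -> bool:
    """Referential approach: equality holds if both references are identical (S1 = S2)."""
    return s1 == s2


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets; an assumed stand-in for the undefined notion of 'similar'."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def structuralist_equal(s1: Set[str], s2: Set[str], threshold: float = 1.0) -> bool:
    """Structuralist approach: equality holds if the proxy sets are equal
    (threshold 1.0) or similar enough (threshold below 1.0, hypothetical)."""
    return jaccard(s1, s2) >= threshold
```

With the default threshold of 1.0 the structuralist check reduces to strict set equality; lowering it gives the relaxed "similar" variant mentioned on the slide.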
  9. Absence of Shared Vocabularies
     [Diagram labels: Topic Map Processing Application; Subject Map Disclosure ontology; Subject Map ontology; Subject Map vocabulary; Subject Map Disclosure (SMD); Structuralist Subject Equality Decision; Referential Subject Equality Decision (twice)]
  10. Towards an SMD SIM
      [Diagram labels: Topic Map Processing Application; Subject Map Disclosure ontology; Subject Map ontology; Subject Map vocabulary; Subject Map Disclosure (SMD); Structuralist Subject Equality Decision (twice); Referential Subject Equality Decision]
  11. Subject Similarity Measure (SIM)
      • SIM – similarity of the Subjects of two different Topics
      • Procedure: a Subject available in Topic Map TM2 is requested from Topic Map TM1
          - Extract a Topic Map Fragment (F) from TM2 around the Topic representing the Subject
          - For each pair (T1, T2) from TM1 and F:
              - determine the simDNAtype for the pair
              - calculate the simDNA for the pair
              - calculate the simDNA a second time, using the similarity detected in the first step
              - simDNA’(T1,T2) = sum of digits (simDNA(T1,T2))
          - Subject Equality (T1,T2) -> (max simDNA’(T1,T2)) and (simDNA(T1,T2)) > threshold
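The last two steps of the procedure (digit sum and final decision) can be sketched as follows. Several points are assumptions, not taken from the slides: the function names are hypothetical, an `X` position contributes 0 to the digit sum, the threshold is compared against the digit sum simDNA’, and the threshold value 5 is arbitrary. The simDNA strings are a subset of those listed on the SIM example slide.

```python
from typing import Dict, Optional


def sim_dna_score(sim_dna: str) -> int:
    """simDNA': sum of the digits of a simDNA string.
    Non-digit positions such as 'X' are counted as 0 (assumption)."""
    return sum(int(ch) for ch in sim_dna if ch in "0123456789")


def best_match(sim_dnas: Dict[str, str], threshold: int) -> Optional[str]:
    """Return the candidate with the maximal simDNA' if it exceeds the
    threshold, otherwise None (comparison against simDNA' is an assumption)."""
    if not sim_dnas:
        return None
    best = max(sim_dnas, key=lambda topic: sim_dna_score(sim_dnas[topic]))
    return best if sim_dna_score(sim_dnas[best]) > threshold else None


# simDNA values against t_source.xtm#t_source, from the SIM example slide.
candidates = {
    "Beispiel.xtm#t_person": "01X1",
    "Beispiel.xtm#t_introduction": "21X1",
    "Beispiel.xtm#t_source": "21X3",
    "Beispiel.xtm#M1": "X111",
}
print(best_match(candidates, threshold=5))
```

Under these assumptions the candidate with simDNA `21X3` has the largest digit sum, which agrees with the only row marked "Similar: true" on the example slide.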
  12. simDNAtype
      TMDM (Topic Item):
      • (0..*) Source Locator [Locator Item]
      • (0..1) Subject Locator [Locator Item]
      • (0..1) Subject Identifier [Locator Item]
      • (0..*) Topic Names [Topic Name Item]
          - (0..*) Source Locator [Locator Item]
          - (0..1) Type [Topic Item]
          - (0..*) Scope [Topic Item]
          - (1) Value [String]
          - (0..*) Variants [Variant Items]
              - (0..*) Source Locators [Locator Item]
              - (0..*) Scope [Topic Item]
              - (0..1) Value [String]
              - (0..1) Resource [Locator Item]
      • (0..*) Occurrences [Occurrence Item]
          - (0..*) Source Locators [Locator Item]
          - (0..1) Type [Topic Item]
          - (0..*) Scope [Topic Item]
          - (0..1) Value [String]
          - (0..1) Resource [Locator Item]
      • (0..*) rolesPlayed [Association Role Item]
          - (0..1) Type [Topic Item]
          - (1) Parent [Association Item]
      simDNAtype: /x*y*z*w*s*1*2*3*t*n*(o)*[a]*/
      • x – the current Topic is typing a Topic
      • y – the current Topic is typing an Association
      • z – the current Topic is typing a Topic Characteristic
      • w – the current Topic is typing an Association Role
      • s – the current Topic is scoping a Topic Characteristic
      • 1 – the current Topic has a Source Locator
      • 2 – the current Topic has a Subject Locator
      • 3 – the current Topic has a Subject Identifier
      • t – the current Topic is typed
      • n – the current Topic has a Topic Name
      • o – the current Topic has an Occurrence; o => /(v|l)t?s*/ (OccDNAtype)
      • a – the current Topic takes part in an Association; a => /a(tp)*/ (AssDNAtype)
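The simDNAtype pattern can be written as an ordinary regular expression. The sketch below assumes, based on strings such as `13tnn(vs)(lt)(vts)[atptp]` on the SIM example slide, that each occurrence is wrapped in literal parentheses and each association in literal square brackets; the constant and function names are hypothetical.

```python
import re

# OccDNAtype /(v|l)t?s*/, wrapped in literal parentheses (assumed from the examples).
OCC = r"\((?:v|l)t?s*\)"
# AssDNAtype /a(tp)*/, wrapped in literal square brackets (assumed from the examples).
ASS = r"\[a(?:tp)*\]"

SIM_DNA_TYPE = re.compile(rf"x*y*z*w*s*1*2*3*t*n*(?:{OCC})*(?:{ASS})*")


def is_valid_sim_dna_type(value: str) -> bool:
    """True if the whole string follows the simDNAtype pattern."""
    return SIM_DNA_TYPE.fullmatch(value) is not None
```

Because every element of the pattern is optional, the empty string is also accepted; a caller that needs at least one property would have to check for that separately.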
  13. simDNA – 1st Iteration
      simDNAtype: /x*y*z*w*s*1*2*3*t*n*(o)*[a]*/ (legend as on slide 12)
      Example: simDNAtype(T1) = x13tn
      • x – the current Topic is typing a Topic
      • 1 – the current Topic has a Source Locator
      • 3 – the current Topic has a Subject Identifier
      • t – the current Topic is typed
      • n – the current Topic has a Topic Name
      simDNA(T1,T2) = 01XX1
      • T2 types an Association
      • T2 has a Source Locator
      • T2 has no Subject Identifier
      • T2 is not typed
      • T2 has a Topic Name, which is not similar
      simDNA(T1,T3) = 21113
      • T3 types a Topic
      • T3 has a Source Locator
      • T3 has a Subject Identifier
      • T3 is typed
      • T3 has a Topic Name, which is a “bit” similar
  14. simDNA – 2nd Iteration
      simDNAtype: /x*y*z*w*s*1*2*3*t*n*(o)*[a]*/ (legend as on slide 12)
      Example: simDNAtype(T1) = x13tn
      • x – the current Topic is typing a Topic
      • 1 – the current Topic has a Source Locator
      • 3 – the current Topic has a Subject Identifier
      • t – the current Topic is typed
      • n – the current Topic has a Topic Name
      simDNA(T1,T2) = 01XX1
      • T2 types an Association
      • T2 has a Source Locator
      • T2 has no Subject Identifier
      • T2 is not typed
      • T2 has a Topic Name, which is not similar
      simDNA(T1,T3) = 21133
      • T3 types a Topic
      • T3 has a Source Locator
      • T3 has a Subject Identifier
      • T3 is typed, and the typing Topic is similar
      • T3 has a Topic Name, which is a “bit” similar
  15. SIM - Example
      All rows are compared against t_source.xtm#t_source (simDNAtype z13n).

      simDNAtype                   Topic                           simDNA   Similar
      13n                          Beispiel.xtm#TMStandards        X111     false
      xx1n                         Beispiel.xtm#t_person           01X1     false
      z1n                          Beispiel.xtm#t_introduction     21X1     false
      zz1n                         Beispiel.xtm#t_homepage         21X1     false
      s1n                          Beispiel.xtm#t_en               X1X1     false
      s1n                          Beispiel.xtm#t_de               X1X1     false
      x1n                          Beispiel.xtm#t_requirements     01X1     false
      ss1n                         Beispiel.xtm#t_nickname         X1X1     false
      13n                          Beispiel.xtm#t_sort             X111     false
      z1n                          Beispiel.xtm#t_source           21X3     true
      y1nnn                        Beispiel.xtm#at_authorship      01X1     false
      ws1n                         Beispiel.xtm#art_author         01X1     false
      ws1n                         Beispiel.xtm#art_document       01X1     false
      13tnn(vs)(lt)(vts)[atptp]    Beispiel.xtm#M1                 X111     false
      13tnn(lt)                    Beispiel.xtm#M2                 X111     false
      12tn(lt)[atptp]              Beispiel.xtm#RA1                X1X1     false
  16. SIM - Assessment
      • Self-Assessment
          - take each Topic from the Topic Map,
          - create a (randomly pruned) fragment around the Topic, and
          - request it from the Topic Map.
      • pruning probabilities
          - probType - of the Type of the Topic
          - probTopNam - of the whole Topic Name
          - probAss - of an Association in which the Topic plays a role
          - probOcc - of an Occurrence (and all of its properties)
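The random pruning step of the self-assessment could look roughly like the sketch below. The dictionary layout of a Topic and the function name `prune_fragment` are hypothetical; only the four probability parameters (probType, probTopNam, probAss, probOcc) come from the slide, and each property is assumed to be dropped independently with its probability.

```python
import random
from typing import Any, Dict, List, Optional


def prune_fragment(
    topic: Dict[str, Any],
    prob_type: float,
    prob_top_nam: float,
    prob_ass: float,
    prob_occ: float,
    rng: Optional[random.Random] = None,
) -> Dict[str, Any]:
    """Build a fragment around a Topic, dropping each property independently
    with the given pruning probability. The input Topic is not modified."""
    rng = rng or random.Random()

    def survivors(items: List[Any], prob: float) -> List[Any]:
        # rng.random() lies in [0, 1): prob 0.0 keeps everything, 1.0 drops everything.
        return [item for item in items if rng.random() >= prob]

    return {
        "id": topic["id"],
        "types": survivors(topic.get("types", []), prob_type),
        "names": survivors(topic.get("names", []), prob_top_nam),
        "associations": survivors(topic.get("associations", []), prob_ass),
        "occurrences": survivors(topic.get("occurrences", []), prob_occ),
    }
```

The pruned fragment would then be sent as a request against the original Topic Map, and the assessment checks whether the Topic it was built around is found again. Passing a seeded `random.Random` makes a run repeatable.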
  17. SIM - (Self-)Assessment
  18. Besides the TMDM Subject Equality Approach
      $$\mathrm{SubjectEquality}_{\mathrm{SMD}_i}\big(\mathrm{SubjectIndication}_{\mathrm{SMD}_1},\ \mathrm{SubjectIndication}_{\mathrm{SMD}_2},\ \mathrm{SubjectMap}_{\mathrm{SubjectProxy}_1},\ \mathrm{SubjectMap}_{\mathrm{SubjectProxy}_2}\big) \rightarrow \mathrm{true} \mid \mathrm{false}$$
      (On the slide, an X appears to be placed over each of the two Subject Indication arguments.)
      How can an SMD SIM be defined? How can a deterministic Subject Indication Approach be defined?
      [Diagram labels: Syntax; Data Model (Graph); Referential Subject Equality; Structuralist Subject Equality; semantics as relative value; semantics as absolute value; O(n*n); O(n*log(n)); Sowa’s Knowledge Signature]
      Entries, in the order they appear on the slide:
      • bound to SM ontology
          - simpleSIM
          - yields very good results in restricted domains
          - usage of Topic is ignored
      • bound to TMV vocabulary
      • bound to SMD ontology
          - SIM (bound to TMDM)
          - more generic, yields good results
          - usage of Topic is exploited
      • bound to TMRM
          - adoption of Melnik's Similarity Flooding Approach
          - not suitable for the usage scenario, but for SM ontology matching
      • bound to TMA ontology
          - work to do
  19. Discussion
