The Impact Of Semantic Handshakes - Presentation Transcript
The Impact of Semantic Handshakes TMRA 2006, Leipzig, 12.10.2006 Lutz Maicher, University of Leipzig [email_address]
Agenda
The Integration Model of the TMDM
Semantic Handshakes and Interaction Protocols
Simulations
Result and Discussion
Preliminary Remark
This presentation only describes the impact of a phenomenon that arises from the existence of
the integration model of the TMDM (Topic Maps Data Model)
Topic Maps communication protocols such as TMRAP, TMIP, etc.
This presentation does not propose any new issues,
methodologies, technologies, paradigms, or anything else
The Integration Model of the TMDM
Two Topic Items are equal, i.e. they represent the same Subject, if they have (TMDM 5.3.5):
at least one equal string in their [subject identifiers] properties,
at least one equal string in their [item identifiers] properties,
at least one equal string in their [subject locators] properties,
an equal string in the [subject identifiers] property of the one topic item and the [item identifiers] property of the other, or
the same information item in their [reified] properties.
Equal Topic Items A and B have to be merged into C (TMDM 6.2)
… .
Set C's [subject identifiers] property to the union of the values of A and B's [subject identifiers] properties.
… .
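The equality and merging rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a TMDM implementation: the `TopicItem` class and the function names are hypothetical, only the three identifier properties are modelled, and the [reified] rule as well as the merge steps elided on the slide are left out.

```python
from dataclasses import dataclass, field


@dataclass
class TopicItem:
    # Hypothetical, simplified topic item: only the identifier properties.
    subject_identifiers: set = field(default_factory=set)
    item_identifiers: set = field(default_factory=set)
    subject_locators: set = field(default_factory=set)


def topics_equal(a: TopicItem, b: TopicItem) -> bool:
    """Equality test restricted to the identifier-based rules listed above."""
    return bool(
        a.subject_identifiers & b.subject_identifiers
        or a.item_identifiers & b.item_identifiers
        or a.subject_locators & b.subject_locators
        or a.subject_identifiers & b.item_identifiers
        or a.item_identifiers & b.subject_identifiers
    )


def merge(a: TopicItem, b: TopicItem) -> TopicItem:
    """Merge two equal topic items: each identifier property becomes the union."""
    return TopicItem(
        a.subject_identifiers | b.subject_identifiers,
        a.item_identifiers | b.item_identifiers,
        a.subject_locators | b.subject_locators,
    )


a = TopicItem(subject_identifiers={"ns1:LutzMaicher", "ns2:MaicherLutz"})
b = TopicItem(subject_identifiers={"ns2:MaicherLutz"})
c = merge(a, b) if topics_equal(a, b) else None
```

Here `a` and `b` share one subject identifier, so they count as equal and `c` carries the union of both identifier sets.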
The Integration Model of the TMDM in practice
In the case of terminological diversity, equality does not hold (according to the TMDM).
Topic A: [subject identifier] {ns1:LutzMaicher}
Topic B: [subject identifier] {ns2:MaicherLutz}
The Integration Model of the TMDM in practice
In the case of terminological alignment (the PSI case), equality holds (according to the TMDM).
Topic A: [subject identifier] {ns1:LutzMaicher}
Topic B: [subject identifier] {ns1:LutzMaicher}
Merging (according to the TMDM) yields Topic C: [subject identifier] {ns1:LutzMaicher}
But who can enforce universal vocabularies?
Semantic Handshakes and Interaction Protocols
Semantic Handshake
The author of A has decided that both terms can be used to indicate Lutz Maicher, so equality holds (according to the TMDM).
Topic A: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz}
Topic B: [subject identifier] {ns2:MaicherLutz}
Merging (according to the TMDM) yields Topic C: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz}
Local Semantic Handshakes and Interaction Protocols
All Topic Maps (TM1, TM2, TM3, TM4) interact using existing protocols such as TMRAP, TMIP, …
Topic A: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz} (Local Semantic Handshake)
Topic B: [subject identifier] {ns2:MaicherLutz, ns3:ML} (Local Semantic Handshake)
Topic C: [subject identifier] {ns3:ML}
Topic D: [subject identifier] {ns4:Lutz, ns3:ML} (Local Semantic Handshake)
Local Semantic Handshakes and Interaction Protocols: Step 1
Request: Do you have a Topic Item with "ns1:LutzMaicher" or "ns2:MaicherLutz" in the property [subject identifier]? (Do you have information about the Subject Lutz Maicher?)
Topic A: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz}
Topic B: [subject identifier] {ns2:MaicherLutz, ns3:ML}
Topic C: [subject identifier] {ns3:ML}
Topic D: [subject identifier] {ns4:Lutz, ns3:ML}
Identifiers sent with the request: ns2:MaicherLutz, ns1:LutzMaicher
Responses: B answers with ns2:MaicherLutz, ns3:ML; C answers NO; D answers NO
Local Semantic Handshakes and Interaction Protocols: Step 2
Request: Do you have a Topic Item with "ns1:LutzMaicher", "ns2:MaicherLutz" or "ns3:ML" in the property [subject identifier]?
Topic A: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz, ns3:ML}
Topic B: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz, ns3:ML}
Topic C: [subject identifier] {ns3:ML}
Topic D: [subject identifier] {ns4:Lutz, ns3:ML}
Identifiers sent with the request: ns1:LutzMaicher, ns3:ML, ns2:MaicherLutz
Responses: B answers with ns1:LutzMaicher, ns3:ML, ns2:MaicherLutz; C answers with ns3:ML; D answers with ns4:Lutz, ns3:ML
Local Semantic Handshakes Lead to Global Integration
Global Integration of TM1, TM2, TM3 and TM4 through Local Semantic Handshakes.
Topic A: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz, ns3:ML, ns4:Lutz}
Topic B: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz, ns3:ML}
Topic C: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz, ns3:ML, ns4:Lutz}
Topic D: [subject identifier] {ns1:LutzMaicher, ns2:MaicherLutz, ns3:ML, ns4:Lutz}
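The exchange walked through on these slides can be replayed in a short Python sketch. It assumes that each Topic Map holds one Topic for the Subject, that every Topic can query every other, and that a matching peer answers with its full set of subject identifiers. The names `exchange_round` and `integrate` are hypothetical and are not part of TMRAP or TMIP. Unlike the slides, where one Topic sends the requests, every Topic queries in every round here, so all four Topics (including B) end with the full set.

```python
def exchange_round(topics):
    """One round: every topic asks all others for topics sharing an identifier."""
    updated = {}
    for name, ids in topics.items():
        merged = set(ids)
        for other, other_ids in topics.items():
            # A peer answers only if it knows at least one requested identifier.
            if other != name and ids & other_ids:
                merged |= other_ids
        updated[name] = merged
    return updated


def integrate(topics):
    """Repeat rounds until no identifier set changes; return state and round count."""
    rounds = 0
    while True:
        updated = exchange_round(topics)
        if updated == topics:
            return topics, rounds
        topics, rounds = updated, rounds + 1


start = {
    "A": {"ns1:LutzMaicher", "ns2:MaicherLutz"},
    "B": {"ns2:MaicherLutz", "ns3:ML"},
    "C": {"ns3:ML"},
    "D": {"ns4:Lutz", "ns3:ML"},
}
final, rounds = integrate(start)
```

No Topic ever learns the complete vocabulary up front; the identifiers spread only through the overlapping pairs disclosed locally.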
Hypothesis and Simulation Design
Hypothesis
Due to the existence of the TMDM and interaction protocols, terminological diversity will be resolved to global integration if the majority of Topics disclose one local Semantic Handshake.
Simulations for testing the Hypothesis …
Simulation Design
Create Topics
Create a number ( cardE ) of Topics that are assumed to exist in the world and that represent the same Subject by definition
All Topics can always interact with each other
Add Subject Identifiers randomly
Draw the number of Subject Identifiers ( nbrOfDifferentII ) to be assigned to the Topic according to a given distribution ( distributionNbrOfII )
If the number is 1, there is no semantic handshake
If the number is greater than 1, a semantic handshake is made
For each Subject Identifier of a Topic, draw an integer in the range [1..nbrOfII] according to a given distribution ( distributionII )
Start Interaction between Topics
If two Topics have an identical number in their sets of Subject Identifiers, they are merged (the sets of Subject Identifiers of both Topics become the union of the original sets)
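The three steps above can be sketched as a small Python simulation. This is an illustration under simplifying assumptions, not the code used for the experiments: identifiers are drawn uniformly, a Topic receives two identifiers (a semantic handshake) with probability `p_handshake` and one otherwise, and merging is computed with a union-find structure instead of repeated pairwise interaction. The function name and parameter names are hypothetical.

```python
import random


def simulate(card_e, nbr_of_different_ii, p_handshake, seed=0):
    """Return the number of integration clouds after all topics have merged.

    Assumptions of this sketch: identifiers are drawn uniformly from
    1..nbr_of_different_ii, and a topic gets two identifiers (a semantic
    handshake) with probability p_handshake, otherwise one.
    """
    rng = random.Random(seed)
    topics = []
    for _ in range(card_e):
        count = 2 if rng.random() < p_handshake else 1
        topics.append({rng.randint(1, nbr_of_different_ii) for _ in range(count)})

    # Union-find over identifiers: identifiers on the same topic are linked.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for ids in topics:
        first, *rest = ids
        find(first)
        for other in rest:
            parent[find(other)] = find(first)

    # Topics are in the same cloud when their identifiers share a root.
    return len({find(next(iter(ids))) for ids in topics})
```

A call such as `simulate(100, 100, 0.55)` returns the cloud count for one random world; the value depends on the seed, so several runs would need to be averaged before comparing settings.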
Definition of a Distribution
Distributions are defined as follows:
<{0.8,1.0},6> corresponds to a lottery in which
1, 2 or 3 is drawn with a probability of 80%
4, 5 or 6 is drawn with a probability of 20%
<{0.8,0.9,0.97,1.0}, 100> corresponds to a lottery in which
a number in [1,25] is drawn with a probability of 80%
a number in [26,50] is drawn with a probability of 10%
a number in [51,75] is drawn with a probability of 7%
a number in [76,100] is drawn with a probability of 3%
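One way to read this notation is as a list of cumulative probabilities over equal-sized ranges of 1..n. The Python sketch below follows that reading. It additionally assumes that numbers within a range are equally likely and that n is divisible by the number of ranges, neither of which the slides state. The names `bucket_bounds` and `draw` are hypothetical.

```python
import random
from bisect import bisect_right


def bucket_bounds(cumulative, n):
    """Split 1..n into len(cumulative) equal ranges; return (low, high) pairs."""
    k = len(cumulative)
    size = n // k  # assumes n is divisible by the number of ranges
    return [(i * size + 1, (i + 1) * size) for i in range(k)]


def draw(cumulative, n, rng=random):
    """Draw one integer from the distribution <cumulative, n>."""
    u = rng.random()  # uniform in [0, 1)
    index = bisect_right(cumulative, u)  # first range whose threshold exceeds u
    low, high = bucket_bounds(cumulative, n)[index]
    # Assumption: numbers within a range are equally likely.
    return rng.randint(low, high)
```

For example, `draw([0.8, 0.9, 0.97, 1.0], 100)` picks one of the four ranges according to the cumulative thresholds and then a number inside that range.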
Analysis - Measures
Measures of Interest (after some iterations)
Number of independent clusters (integration clouds)
an integration cloud is a set of Topics which are equal
Average size of the integration clouds
clouds(E): the lower the better; clouds(E) = 1 means global integration
card(T): the higher the better; card(T) = card(E) means global integration
Example 1: clouds(E) = 3, card(T) = 33/9 = 3.7
Example 2: clouds(E) = 2, card(T) = 41/9 = 4.6
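The two example values are consistent with reading card(T) as the average, over all Topics, of the size of the cloud each Topic belongs to. That reading is an interpretation, not something the slides state. Under it, two clouds of 5 and 4 Topics give (25 + 16)/9 = 41/9, and three clouds of 4, 4 and 1 Topics are one split that gives 33/9. A minimal Python sketch, with the hypothetical function name `measures`:

```python
def measures(cloud_sizes):
    """Return (clouds(E), card(T)) for a list of integration-cloud sizes."""
    card_e = sum(cloud_sizes)  # total number of topics
    clouds = len(cloud_sizes)
    # Average, over all topics, of the size of the cloud the topic belongs to.
    card_t = sum(size * size for size in cloud_sizes) / card_e
    return clouds, card_t


two_clouds = measures([5, 4])       # clouds(E) = 2, card(T) = 41/9
three_clouds = measures([4, 4, 1])  # clouds(E) = 3, card(T) = 33/9
global_integration = measures([9])  # clouds(E) = 1, card(T) = card(E) = 9
```

With a single cloud the second measure equals the number of Topics, which matches the condition card(T) = card(E) for global integration.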
Experiment Series
Simulation: Global Ontology (the PSI Case)
No Simulation is necessary
each Topic has the same, globally unique Subject Identifier
clouds(E)=1 (Global Integration)
card(T) = card(E)
… but the enforcement of global ontologies is an overly optimistic premise!
Simulation: Heterogeneous World without Semantic Handshakes
Iteration of nbrOfDifferentII in [5,100]
general parameters: card(E) = 100, distributionNbrOfII = <{1.0},1> (no Semantic Handshakes)
specific parameter exp01: distributionII = <{1.0},100>
specific parameter exp02: distributionII = <{0.8,0.9,0.95,1.0},100> (some terms are more prominent)
100 different terms will be resolved to fewer than 40 integration clouds because some authors use the same term by chance (especially the most prominent terms)
Simulation: The Impact of Semantic Handshakes
Iteration of a in [0.0,1.0] for distributionNbrOfII = <{a,1.0},2>, ranging between "no semantic handshakes" and "always a semantic handshake"
general parameters: card = 100, nbrOfDifferentII = 100
specific parameters exp03: distributionII = <{1.0}, 100> (high terminological diversity)
specific parameters exp04: distributionII = <{0.8,0.9,0.97,1.0}, 100> (some terms are more prominent)
100 different terms will be resolved to ten integration clouds if only 55% of all Topics disclose a Semantic Handshake!
Simulation: The Impact of Terminological Diversity
Iteration of nbrOfDifferentII in [2,100], ranging between low and high terminological diversity
general parameters: cardE = 100, distributionII = <{1.0},100>
specific parameter exp05: distributionNbrOfII = <{0.2,1.0},2> (semantic handshake by the majority)
specific parameter exp06: distributionNbrOfII = <{0.8,1.0},2> (semantic handshake by the minority)
50 different terms will be resolved to global integration if 80% of all Topics disclose a Semantic Handshake!
Result and Discussion
Result
The hypothesis is proven: Global Integration will be reached if a significant number (a majority) of Topics disclose one semantic handshake.
Remark
the effect only appears if interaction links exist between all topic maps
the point in time at which the effect appears depends on the interaction frequency
The more prominent the used terms are, the lower the global number of semantic handshakes necessary for global integration.
Design Recommendation:
Assign two (prominent) Subject Identifiers to each Topic you create. (You don't have to be aware of all existing terms for your concept.)
Discussion
These findings raise problems concerning
Wrong Semantic Handshakes (by mistake or on purpose)
Homonymy (= the same term for different concepts)
Trust (Can I trust the local Semantic Handshakes?)
This presentation is about how global terminology can evolve without a centralized organisation. The simple idea is that everybody has to disclose the identity of at least two identifiers for the same thing. These local semantic handshakes will have the effect of global terminological alignment.