Using our multi-user model, a community of users provides feedback in a pay-as-you-go fashion to the ontology matching process by validating the mappings found by automatic methods, with the following advantages over having a single user: the effort required from each user is reduced, user errors are corrected, and consensus is reached. We propose strategies that dynamically determine the order in which the candidate mappings are presented to the users for validation. These strategies are based on mapping quality measures that we define. Further, we use a propagation method to leverage the validation of one mapping to other mappings. We use an extension of the AgreementMaker ontology matching system and the Ontology Alignment Evaluation Initiative (OAEI) Benchmarks track to evaluate our approach. Our results show how F-measure and robustness vary as a function of the number of user validations. We consider different user error and revalidation rates (the latter measures the number of times that the same mapping is validated). Our results highlight complex trade-offs and point to the benefits of dynamically adjusting the revalidation rate.
Pay-as-you-go Multi-User Feedback Model for Ontology Matching - EKAW2014
1. PAY-AS-YOU-GO MULTI-USER FEEDBACK MODEL FOR ONTOLOGY MATCHING
Isabel F. Cruz¹, Francesco Loprete², Matteo Palmonari², Cosmin Stroe¹, and Aynaz Taheri¹
EKAW 2014
Linköping, Sweden
¹ ADVIS Lab, University of Illinois at Chicago
² ITIS Lab, University of Milan-Bicocca
2. Motivation and Background
o User involvement is one of the most promising challenges in Ontology Matching [Shvaiko et al. 2013]
  • Involve users to improve the matching process
  • Design interaction schemes that place a minimal burden on users
o A community of users
  • Reduces the effort required from each user
  • Corrects user errors
  • Obtains consensus
o Main challenge: saving users' effort
  1. While ensuring the quality of the alignment
  2. While allowing for user errors
Shvaiko, P., Euzenat, J.: Ontology Matching: State of the Art and Future Challenges. IEEE Transactions on Knowledge and Data Engineering 25(1) (2013) 158-176
3. Assumptions and Principles
o Assumptions
  • Consensus is obtained by majority vote
  • Users are domain experts (overall reliable)
  • A constant error rate is associated with a sequence of validated mappings
  • Focus on equivalence mappings
o Principles
  • Our pay-as-you-go approach
    • Each user provides one validation at a time
    • User feedback is propagated without waiting for the majority vote
  • Opposed to our pay-as-you-go approach: the Optimally Robust Feedback Loop (ORFL)
    • User feedback is propagated only when consensus is reached
4. Approach Overview
[Figure: the feedback loop. Matchers 1..k match the Source and Target Ontologies (Initial Matching). A user issues a Validation Request; Candidate Selection picks a mapping from the Validated or Non-Validated Mappings; the user validates it (T or F); Feedback Aggregation updates the counters T(m_i) and F(m_i) for each mapping m_i (e.g., m1: 1/1, m2: 0/0, m3: 2/1); Feedback Propagation updates the similarity scores; Alignment Selection produces the Alignment.]
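As a concrete illustration of the Feedback Aggregation step, here is a minimal sketch; the counter structure and names are illustrative assumptions, not taken from the AgreementMaker codebase:

```python
# Hypothetical sketch of Feedback Aggregation: each user validation
# increments the mapping's true (T) or false (F) counter; these counters
# later drive the consensus measure and candidate selection.
from collections import defaultdict

counts = defaultdict(lambda: {"T": 0, "F": 0})

def aggregate(mapping_id, label):
    """Record one user validation (True/False) for a candidate mapping."""
    counts[mapping_id]["T" if label else "F"] += 1

aggregate("m3", True); aggregate("m3", True); aggregate("m3", False)
print(counts["m3"])  # {'T': 2, 'F': 1}, matching m3 in the table above
```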
5. Mapping Quality Model
[Same feedback-loop figure as above: both Candidate Selection and Feedback Propagation rely on the model that estimates mapping quality.]
6. Mapping Quality Measures
1. Automatic Matcher Agreement (AMA)
   Agreement of the similarity scores assigned to a mapping by different matchers
   m1 = ⟨1,1,0,0⟩ ⇒ AMA(m1) = 0; m2 = ⟨1,1,1,1⟩ ⇒ AMA(m2) = 1, hence AMA(m2) ≥ AMA(m1)
2. Cross Sum Quality (CSQ)
   How safe a mapping is from potential conflicts with other mappings
   CSQ(m22) = 0.76 ≥ CSQ(m34) = 0.13
3. Similarity Score Definiteness (SSD)
   How close the similarity score associated with a mapping is to the similarity scores' upper and lower bounds
   SSD(m34) = 0.8 ≥ SSD(m31) = 0.0
[Figure: an example of a similarity matrix over indices 0-5.]
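A minimal Python sketch of the three measures. The exact forms of AMA and SSD are assumptions chosen to reproduce the examples above (AMA as one minus the range of the matcher scores; SSD as twice the distance from 0.5); CSQ follows the description in the editor's notes at the end of this transcript. AgreementMaker's actual implementation may differ in detail.

```python
import numpy as np

def ama(scores):
    """Automatic Matcher Agreement. Assumed form 1 - (max - min):
    <1,1,0,0> -> 0.0 and <1,1,1,1> -> 1.0, as in the slide's examples."""
    return 1.0 - (max(scores) - min(scores))

def csq(sim, i, j):
    """Cross Sum Quality: 1 minus the summed similarity of the other
    mappings sharing a concept with m_ij (row i and column j of the
    similarity matrix), normalized by the maximum cross sum."""
    def cross(r, c):
        return sim[r, :].sum() + sim[:, c].sum() - 2 * sim[r, c]
    max_cross = max(cross(r, c)
                    for r in range(sim.shape[0])
                    for c in range(sim.shape[1]))
    return 1.0 - cross(i, j) / max_cross if max_cross > 0 else 1.0

def ssd(score):
    """Similarity Score Definiteness. Assumed form 2*|s - 0.5|:
    0.5 -> 0.0 (indefinite) and 0.9 -> 0.8 (definite)."""
    return 2.0 * abs(score - 0.5)

sim = np.array([[0.9, 0.1], [0.2, 0.8]])
print(ama([1, 1, 0, 0]), csq(sim, 0, 0), ssd(0.9))
```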
9. Mapping Quality Measures
4. Consensus (CON)
   Captures the user consensus gathered on a mapping, where MinCon is the minimum number of identical labels needed to make a correct decision on a mapping:
   CON(m) = 1, if T(m) ≥ MinCon or F(m) ≥ MinCon
   CON(m) = |T(m) - F(m)| / MinCon, otherwise
5. Propagation Impact (PI)
   Estimates the instability of the user feedback collected on a mapping:
   PI(m) = 0, if T(m) = MinCon or F(m) = MinCon
   PI(m) = min(ΔT(m), ΔF(m)) / max(ΔT(m), ΔF(m)), otherwise
   where ΔT(m) = MinCon - T(m) and ΔF(m) = MinCon - F(m)

Examples for CON and PI (with MinCon = 3, the value implied by the figures below):

Mapping | T(m_i) | F(m_i) | CON(m_i) | PI(m_i)
m1      | 1      | 1      | 0.00     | 1.00
m2      | 1      | 0      | 0.33     | 0.66
m3      | 2      | 1      | 0.33     | 0.50
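The two feedback-based measures translate directly into code; a sketch with MinCon = 3:

```python
def con(t, f, min_con=3):
    """Consensus: 1 once either label count reaches MinCon, otherwise
    the margin between the two counts normalized by MinCon."""
    if t >= min_con or f >= min_con:
        return 1.0
    return abs(t - f) / min_con

def pi(t, f, min_con=3):
    """Propagation Impact: 0 when a label count equals MinCon (stable),
    otherwise the smaller over the larger distance from MinCon."""
    if t == min_con or f == min_con:
        return 0.0
    dt, df = min_con - t, min_con - f
    return min(dt, df) / max(dt, df)

# Reproduces the example table: (1,1) -> 0.00/1.00, (1,0) -> 0.33/0.66,
# (2,1) -> 0.33/0.50
for t, f in [(1, 1), (1, 0), (2, 1)]:
    print(t, f, round(con(t, f), 2), round(pi(t, f), 2))
```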
10. Quality-Based Candidate Selection
o Candidate selection strategies
  • Combine different mapping quality measures
  • Rank mappings in decreasing order of quality
1. Disagreement and Indefiniteness Average (DIA)
   • Selects the mappings with the most disagreement among the automatic matchers and the most indefinite similarity values
   • DIA(m) = AVG(AMA⁻(m), SSD⁻(m)), where X⁻ denotes the inverted measure
2. Revalidation (REV)
   • Selects the mappings with the lowest consensus, highest feedback instability, and highest conflict with other mappings
   • REV(m) = AVG(CON⁻(m), PI(m), CSQ⁻(m))
[Figure: DIA ranks the Non-Validated Mappings; REV ranks the Validated Mappings. Both rankings are sketched below.]
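A sketch of the two ranking functions, reusing the hypothetical helpers sketched earlier; reading X⁻ as 1 - X is our interpretation of the minus superscript in the formulas above:

```python
# Sketch of the two ranking functions. The inversion X- = 1 - X is an
# assumption suggested by the minus superscript in the slide's formulas;
# ama, ssd, con, pi are the hypothetical helpers from the earlier sketches.
def dia(m):
    """DIA: high for non-validated mappings whose matchers disagree and
    whose similarity score is indefinite (close to 0.5)."""
    return ((1 - ama(m["scores"])) + (1 - ssd(m["sim"]))) / 2

def rev(m):
    """REV: high for validated mappings with low consensus, unstable
    feedback, and strong conflict with other mappings (low CSQ)."""
    return ((1 - con(m["T"], m["F"])) + pi(m["T"], m["F"]) + (1 - m["csq"])) / 3

non_validated = [{"scores": [1, 1, 0, 0], "sim": 0.55},
                 {"scores": [1, 1, 1, 1], "sim": 0.95}]
ranked = sorted(non_validated, key=dia, reverse=True)  # most uncertain first
```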
11. Quality-Based Candidate Selection
[Figure: two ranked lists of mappings, one produced by DIA (N1, N2, N3, ...) and one by REV (M1, M2, M3, ...).]
o Meta-strategy
  • Combines the two strategies, DIA and REV
  • The Revalidation Rate (RR), P_REV ∈ [0,1], determines the proportion of mappings selected from the two ranked lists over a sequence of validations
  • E.g., for RR = 0.3, three mappings out of every ten are picked from the REV ranked list (see the sketch below)
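A minimal sketch of the meta-strategy. The round-robin schedule is an assumption; the slides only fix the overall proportion RR:

```python
# Interleave two ranked lists so that a fraction rr of the served
# candidates comes from the REV list and the rest from the DIA list.
def meta_strategy(dia_ranked, rev_ranked, rr, n):
    """Yield n candidate mappings, a proportion rr of them from rev_ranked."""
    dia_iter, rev_iter = iter(dia_ranked), iter(rev_ranked)
    served_rev = 0
    for i in range(1, n + 1):
        # Serve from REV whenever we are behind the target proportion rr.
        if served_rev < rr * i:
            served_rev += 1
            yield next(rev_iter)
        else:
            yield next(dia_iter)

picks = list(meta_strategy([f"N{i}" for i in range(1, 11)],
                           [f"M{i}" for i in range(1, 11)], rr=0.3, n=10))
print(picks)  # exactly 3 of the 10 picks come from the REV list
```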
12. Quality Agreement Propagation
o Feedback is propagated by updating the similarity of
  • The mapping labeled by the user
  • A class of similar mappings
o The propagation is conservative, to make the system more robust to erroneous feedback
o Propagation is proportional to
  • The quality of the labeled mapping (consensus): Q(m_v) = CON(m_v)
  • The quality of the mappings in the similarity class (matcher agreement, definiteness): Q'(m_c) = AVG(AMA(m_c), SSD(m_c))
  • A propagation gain defined by a constant g, 0 ≤ g ≤ 1

s_t(m_c) = s_{t-1}(m_c) + min(Q(m_v) · Q'(m_c) · g, 1 - s_{t-1}(m_c))   if label(m_v) = 1
s_t(m_c) = s_{t-1}(m_c) - min(Q(m_v) · Q'(m_c) · g, s_{t-1}(m_c))       if label(m_v) = 0

where s_t(m_c) is the similarity of mapping m_c at iteration t and m_v is the mapping validated by the user.
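The update rule is a clipped additive step; a direct transcription in Python, with q_v and q_c assumed to be computed as CON(m_v) and AVG(AMA(m_c), SSD(m_c)) from the earlier sketches:

```python
# Sketch of the quality agreement propagation rule above. The min()
# clipping keeps the updated similarity inside [0, 1].
def propagate(s_prev, q_v, q_c, gain, label):
    """Return the updated similarity of a mapping m_c in the similarity
    class of the validated mapping m_v, after one validation."""
    delta = q_v * q_c * gain
    if label == 1:   # positive feedback pushes the score toward 1
        return s_prev + min(delta, 1.0 - s_prev)
    else:            # negative feedback pushes the score toward 0
        return s_prev - min(delta, s_prev)

print(propagate(0.6, q_v=0.33, q_c=0.8, gain=0.5, label=1))  # 0.732
```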
16. Experiments
o Evaluation
  • Benchmarks track of OAEI 2010 (101-301, 101-302, 101-303, 101-304)
  • Comparison with the baseline (ORFL), where user feedback is propagated only when consensus is reached
  • Comparison of our candidate selection strategy with a strategy proposed in an active learning approach [Shi et al. 2009]
o We use two measures based on F-Measure (see the sketch below):
  • Gain at iteration t: ΔF-Measure(t) = F-Measure(t) - F-Measure(0)
  • Robustness at iteration t: Robustness(t) = F-Measure_{ER=er}(t) / F-Measure_{ER=0}(t)
Shi, F., Li, J., Tang, J., Xie, G., Li, H.: Actively Learning Ontology Matching via User Interaction. In: International Semantic Web Conference (ISWC). Volume 5823, Springer (2009) 585-600
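A small sketch of the two measures over per-iteration F-measure curves; the curve values here are illustrative, not results from the paper:

```python
# Evaluation measures: gain over the initial alignment and robustness
# to user errors, given one F-measure curve per simulated error rate.
def delta_f_measure(curve, t):
    """Gain in F-measure at iteration t over the initial alignment."""
    return curve[t] - curve[0]

def robustness(curve_er, curve_noerr, t):
    """Ratio of the F-measure with error rate er to the error-free one."""
    return curve_er[t] / curve_noerr[t]

f_er0  = [0.727, 0.80, 0.90, 0.99]   # hypothetical ER = 0 curve
f_er02 = [0.727, 0.75, 0.82, 0.90]   # hypothetical ER = 0.2 curve
print(delta_f_measure(f_er02, 3), robustness(f_er02, f_er0, 3))
```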
17. Experimental Setup
o Simulation of users
  • Error rate (ER): 0.0, 0.05, 0.1, 0.15, 0.2
  • Number of users: 10
o AgreementMaker
  • Matchers, alignment selection
o Propagation gain (g)
  • 0.0 (no gain), 0.5
o Revalidation rate (RR)
  • 0.0, 0.1, 0.2, 0.3, 0.4, 0.5
18. Benchmark Track 101-303
[Plots: ΔF-Measure vs. iterations for RR = 0, 0.1, 0.2, 0.3, 0.4, and 0.5, one curve per error rate. The dashed lines represent a propagation gain equal to zero; the dotted pink line represents ORFL. Initial F-Measure = 72.73.]
19. Benchmark Track 101-303
[Plots: Robustness vs. iterations for RR = 0, 0.1, 0.2, 0.3, 0.4, and 0.5, one curve per error rate. The dashed lines represent a propagation gain equal to zero.]
21. Comparison of Ranking Functions for Non-Validated Mappings: our DIA vs. [Shi et al. 2009]
• An error-free setting
• No propagation

Quality Measures             | F-Measure(0) | @10  | @20  | @30  | @40  | @50  | @100 | F-Measure(100)
Active Learning [Shi et al.] | 0.73         | 0.01 | 0.02 | 0.05 | 0.08 | 0.12 | 0.15 | 0.88
AVG(DIS, SSD) (our DIA)      | 0.73         | 0.05 | 0.12 | 0.14 | 0.16 | 0.19 | 0.26 | 0.99
(@N columns report the ΔF-Measure gain after N validations.)

Shi, F., Li, J., Tang, J., Xie, G., Li, H.: Actively Learning Ontology Matching via User Interaction. In: International Semantic Web Conference (ISWC). Volume 5823, Springer (2009) 585-600
22. Conclusion
o Two main steps
  • Candidate mapping selection: dynamic ranking of the candidate mappings
  • Feedback propagation: similarity propagation from the validated mappings
o Error and revalidation rates
  • An increasing error rate can be counteracted by an increasing revalidation rate
o A revalidation rate equal to 0.3 achieves a good trade-off between F-measure and robustness
o Propagation leads to better results than no propagation
23. Future Work
• User profiling and weighting of user validations
• Propagation depending on the feedback quality
• Using different probability distributions to model a variety of user behaviors
• Determining the impact of user behavior over time on the error distribution
24. QUESTIONS?
We sincerely appreciate your feedback
EKAW 2014
Linköping, Sweden
mailto: palmonari@disco.unimib.it
Editor's Notes
User involvement is … Involving users is needed to
We want to involve a community of users in the process, with the advantages of reducing the effort required from each individual user // correcting user errors // obtaining consensus
The main challenge that we have to address in this scenario is to reduce the users' effort
While ensuring the quality of the final alignment
And while considering the possibility that users commit errors when they provide their feedback,
which is a problem that was not addressed in previous work on ontology matching with user feedback
In this paper we present a multi-user feedback model that addresses the above-mentioned challenge
We make a few assumptions in our work
- Consensus … with a number of validations considered sufficient to decide by majority
…
One distinguishing feature of our approach is that we want to take advantage of the users' feedback in a pay-as-you-go fashion
- At each iteration a mapping is validated by one user and we propagate their feedback without considering the majority vote, even though the user input can be incorrect
This approach is opposed to an Optimally Robust Feedback Loop, where the users' feedback is propagated only when consensus is reached,
which is similar to approaches proposed for crowdsourcing ontology matching in previous work.
Let me show an overview of our pay-as-you-go multi-user feedback model
…
The feedback loop is started by a user who asks for a mapping to validate
….
The key steps are …
… both of them are based on a model to estimate the mapping quality
With Automatic Matcher Agreement we assign a higher quality to a mapping when the similarity scores assigned to it by different matchers agree
With Cross Sum Quality we assign a higher quality to mappings that have little conflict with other mappings.
With Similarity Score Definiteness we assign a higher quality to mappings whose similarity scores are closer to the upper and lower bounds, that is, 0 and 1.
To compute Cross Sum Quality for one mapping, we take 1 minus the sum of the similarity scores assigned to other mappings that have at least one concept in common with this mapping, that is, the mappings in the cross in the similarity matrix that has the considered mapping as its center; we normalize this score by the maximum cross similarity sum in the matrix.
Consensus assigns a higher quality to mappings that have been given a higher number of identical labels. The highest quality is given to mappings that have reached the minimum consensus needed to make a correct decision.
Propagation Impact estimates the instability of mappings in relation to the labels assigned by users in previous iterations, and ranks mappings in decreasing order of quality.
Intuitively, mappings are more stable when minimum consensus has been reached, or when they have been assigned one label (correct or incorrect) a higher number of times than the other, and when the distance from minimum consensus for one label is shorter.
// PI is defined as the ratio between the minimum and the maximum distances from minimum consensus of the number of identical labels assigned to a mapping
Each plot shows ΔF-Measure at different iterations for a given revalidation rate. The pink line represents the baseline. Different colors represent different error rates. Dashed lines represent configurations with zero gain.
We can see that our pay-as-you-go approach achieves a higher gain in F-measure compared to the baseline.
Propagation gain is also effective overall.
An increasing error rate can be counteracted by an increasing revalidation rate, still obtaining very good results for an error rate as high as 0.2 and a revalidation rate of 0.5.
In these plots we show robustness. We can see that:
Higher revalidation rates significantly improve the robustness of the system.
With a high revalidation rate the system is robust even for high error rates.