Evaluating the Stability and Credibility of Ontology Matching Methods

Ontology Track @ ESWC2011

  1. Evaluating the Stability and Credibility of Ontology Matching Methods
     Xing Niu, Haofen Wang, Gang Wu, Guilin Qi, and Yong Yu
     2011.5.31
  2. Agenda
     • Introduction
     • Basic Concepts
       - Confidence Threshold, relaxedCT
       - Test Unit
     • Evaluation Measures and Their Usages
       - Comprehensive F-measure
       - STD score
       - ROC-AUC score
     • Conclusions and Future Work
  3. Introduction
     • Stability
       - Reference matches are scarce
       - Training proper parameters is difficult
       - High stability: a matching method performs consistently on data from different domains or of different scales
     • Credibility
       - Candidate matches are sorted by their matching confidence values
       - High credibility: a matching method generates true positive matches with high confidence values while returning false positive ones with low values
  4. Introduction (cont'd)
     • Judging basic compliance alone is not sufficient
       - Precision
       - Recall
       - F-measure
     • Proposed measurements
       - Stability: Comprehensive F-measure and STD (STandard Deviation) score
       - Credibility: ROC-AUC (Area Under Curve) score
  5. Confidence Threshold
     • A match can be represented as a 5-tuple (the slide gives the formal definition, with the last element being the matching confidence value)
     • Confidence Threshold (CT): only matches whose confidence reaches CT are returned
     (Figure: candidate matches on the confidence scale with the CT cut-off marked)
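
Below is a minimal Python sketch of CT filtering, assuming the usual 5-tuple form of a match, (id, e1, e2, relation, confidence); the field names and sample data are illustrative, not taken from the slides.

```python
from typing import NamedTuple

class Match(NamedTuple):
    """A candidate match as a 5-tuple: id, source entity, target entity,
    relation (e.g. equivalence), and a confidence value in [0, 1]."""
    id: str
    e1: str
    e2: str
    relation: str
    confidence: float

def apply_ct(matches, ct):
    """Keep only the matches whose confidence reaches the threshold CT."""
    return [m for m in matches if m.confidence >= ct]

candidates = [
    Match("m1", "Paper", "Article", "=", 0.95),
    Match("m2", "Author", "Writer", "=", 0.62),
    Match("m3", "Venue", "Place", "=", 0.31),
]
print(apply_ct(candidates, ct=0.5))  # m1 and m2 survive the cut-off
```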
  6. Relaxed CT
     (Figure: the strict CT cut-off alongside the relaxed cut-off, relaxedCT, on the confidence scale)
  7. Test Unit
     • Test Unit: a set of similar datasets describing the same domain and sharing many resemblances, but differing in details
     • Examples
       - Benchmark 20X: structural information remains; another language, synonyms, naming conventions
       - Conference Track: conference organization domain; built by different groups
       - Cyclopedia: same data source; different categories
       - …
       - Others: same data source; random N-folds
  8. Comprehensive F-measure
     • Maximum F-measure
       - maxF-measure reflects the theoretical optimal matching quality of a matching method
     • Uniform F-measure
       - uniF-measure simulates practical application and evaluates the corresponding stability of matching methods
     • Comprehensive F-measure
       - comF-measure combines the two (the slide gives its formula)
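
A hedged Python sketch of how these scores could be computed over a test unit: maxF tunes a threshold per dataset, uniF forces one shared threshold across all datasets, and combining the two via a harmonic mean is an assumption here, not necessarily the paper's exact comF formula.

```python
import numpy as np

def f_measure(precision, recall):
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def pr_at_threshold(matches, reference, ct):
    """Precision/recall of the matches kept at confidence threshold ct.
    `matches` maps a match key to its confidence; `reference` is the
    (non-empty) set of correct match keys."""
    kept = {m for m, conf in matches.items() if conf >= ct}
    if not kept:
        return 0.0, 0.0
    tp = len(kept & reference)
    return tp / len(kept), tp / len(reference)

def max_f(matches, reference, thresholds):
    """maxF-measure for ONE dataset: the best F over all thresholds,
    i.e. the theoretical optimum with a per-dataset tuned CT."""
    return max(f_measure(*pr_at_threshold(matches, reference, t))
               for t in thresholds)

def uni_f(test_unit, thresholds):
    """uniF-measure: one shared CT for the whole test unit, as in
    practice where per-dataset tuning is impossible. `test_unit` is a
    list of (matches, reference) pairs."""
    def avg_f(t):
        return np.mean([f_measure(*pr_at_threshold(m, r, t))
                        for m, r in test_unit])
    return max(avg_f(t) for t in thresholds)

def com_f(maxf, unif):
    # Assumption: harmonic mean, so a method must do well on BOTH
    # the theoretical and the practical score to earn a high comF.
    return f_measure(maxf, unif)
```

Note the design point: uniF takes the maximum of the *average* F across datasets, not the average of per-dataset maxima, which is exactly what penalizes methods whose optimal thresholds drift from dataset to dataset.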
  9. Usage of comF-measure
     • Draw histograms to find out 'who' holds the comF-measure back
     (Figure: Falcon-AO and RiMOM in the Benchmark-20X test; *another language, synonyms)
  10. Usage of comF-measure (cont'd)
      • Use the comF-measure value
        - as an indicator of matching quality
        - since it reflects both theoretical and practical results
      • Use the comF-measure function
        - as the objective function of an optimization problem
        - since it conceals a multi-objective optimization problem
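
As a usage illustration, a sketch of treating comF as the objective of a parameter search; it reuses the helpers from the previous sketch, and `run_matcher` with its `weights` grid is a hypothetical matcher interface, not part of the paper.

```python
def tune(run_matcher, test_unit, thresholds, weights):
    """Grid search for the matcher parameter that maximises comF.
    Assumes run_matcher(weight, dataset) -> {match: confidence} and
    test_unit as a list of (dataset, reference_set) pairs."""
    best_w, best_score = None, -1.0
    for w in weights:
        unit = [(run_matcher(w, ds), ref) for ds, ref in test_unit]
        maxf = np.mean([max_f(m, r, thresholds) for m, r in unit])
        unif = uni_f(unit, thresholds)
        score = com_f(maxf, unif)  # optimise both objectives at once
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score
```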
  11. STD Score
      (The slide defines the STD score; as its name suggests, it is based on the standard deviation of a method's results across the datasets of a test unit.)
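
A sketch of one plausible reading of the STD score, assuming it is the standard deviation of a method's per-dataset F-measures within a test unit; whether the slide applies a further normalization is not recoverable here.

```python
import numpy as np

def std_score(f_measures):
    """Population standard deviation of the per-dataset F-measures in a
    test unit. A low value means the method behaves consistently across
    similar datasets, i.e. high stability. (Any normalization used on
    the original slide is an open assumption.)"""
    f = np.asarray(f_measures, dtype=float)
    return float(np.sqrt(np.mean((f - f.mean()) ** 2)))

print(std_score([0.81, 0.79, 0.80]))  # stable method: small deviation
print(std_score([0.95, 0.40, 0.70]))  # unstable method: large deviation
```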
  12. Usage of STD Score
      • Reflects the stability of a matching method
      (Figure: Falcon-AO and Lily in the Benchmark test; *lexical information, structural information)
  13. ROC-AUC Score
      (The slide defines the ROC-AUC score: candidate matches are ranked by confidence, and the area under the resulting ROC curve measures how well true positives are separated from false positives.)
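
A self-contained sketch of this credibility score, assuming the standard Mann-Whitney formulation of ROC-AUC over confidence-ranked matches; ties in confidence are ignored for brevity.

```python
import numpy as np

def roc_auc(matches, reference):
    """Rank candidate matches by confidence and measure how well that
    ranking separates true from false positives. AUC = 1 means every
    correct match outranks every incorrect one; 0.5 is a random ranking."""
    # Label each candidate: 1 if it appears in the reference alignment.
    scored = sorted(matches.items(), key=lambda kv: kv[1], reverse=True)
    labels = np.array([1 if m in reference else 0 for m, _ in scored])
    pos = int(labels.sum())
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        return float("nan")  # undefined without both classes
    # For every (true, false) pair, count how often the true match is
    # ranked higher: the Mann-Whitney formulation of AUC.
    false_seen, pairs_won = 0, 0
    for lab in labels:  # iterate from highest confidence down
        if lab == 0:
            false_seen += 1
        else:
            pairs_won += neg - false_seen
    return pairs_won / (pos * neg)

reference = {"m1", "m2"}
candidates = {"m1": 0.9, "m3": 0.7, "m2": 0.6, "m4": 0.2}
print(roc_auc(candidates, reference))  # 0.75: the false positive m3
                                       # outranks the true positive m2
```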
  14. Usage of ROC-AUC Score
      (Figure: spider chart of ROC-AUC scores for the Conference test unit)
  15. Conclusions and Future Work
      • Conclusions
        - New evaluation measures
          · Comprehensive F-measure
          · STD score
          · ROC-AUC score
        - Deep analysis
      • Future Work
        - Extend our evaluation measures into a comprehensive strategy
        - Test more matching methods on other datasets
        - Make both stability and credibility standard evaluation measures for ontology matching
  16. APEX Data & Knowledge Management Lab
