Interlinking: Performance Assessment of User Evaluation vs. Supervised Learning Approaches
Mofeed Hassan, Jens Lehmann and Axel-Cyrille Ngonga Ngomo
AKSW
Department of Computer Science
University of Leipzig
Augustusplatz 10, 04109 Leipzig
{mounir,lehmann,ngonga}@informatik.uni-leipzig.de
WWW home page: http://limes.sf.net
June 14, 2015
LDOW 2015
Why Link Discovery?
1 Fourth Linked Data principle
2 Links are central for
Cross-ontology question answering (QA)
Data Integration
Reasoning
Federated Queries
...
3 Valuable asset for enterprises
4 Linked Data on the Web:
10+ thousand datasets
89+ billion triples
≈ 500+ million links?
Why is it difficult?
Definition (Link Discovery)
Given sets S and T of resources and relation R
Task: Find M = {(s, t) ∈ S × T : R(s, t)}
Common approaches:
Find M = {(s, t) ∈ S × T : σ(s, t) ≥ θ} for a similarity function σ and threshold θ
Find M = {(s, t) ∈ S × T : δ(s, t) ≤ θ} for a distance function δ and threshold θ
1 Time complexity
Large number of triples
Quadratic a-priori runtime
69 days for mapping cities from DBpedia to Geonames (1 ms per comparison)
Decades for linking DBpedia and LGD (LinkedGeoData) . . .
A naive sketch of this quadratic matching follows below.
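The quadratic runtime follows directly from the brute-force formulation of the common approaches above. Below is a minimal, hypothetical Python sketch (not part of the original slides; the similarity function and example labels are illustrative) that makes the |S| × |T| comparison loop explicit:

```python
from difflib import SequenceMatcher


def sigma(s: str, t: str) -> float:
    """Toy string similarity in [0, 1]; stands in for any atomic measure."""
    return SequenceMatcher(None, s.lower(), t.lower()).ratio()


def naive_link_discovery(source_labels, target_labels, theta=0.9):
    """Brute-force M = {(s, t) : sigma(s, t) >= theta} over S x T."""
    mapping = []
    for s in source_labels:        # |S| iterations
        for t in target_labels:    # |T| iterations, i.e. |S| * |T| comparisons
            if sigma(s, t) >= theta:
                mapping.append((s, t))
    return mapping


# At 1 ms per comparison, the 18,114 x 979 instance pairs of the
# DBpedia-LinkedGeoData task already amount to roughly 5 hours;
# the full DBpedia-to-Geonames city mapping takes about 69 days.
print(naive_link_discovery(["Leipzig", "Berlin"], ["Leipzig", "Bern"]))
# -> [('Leipzig', 'Leipzig')]
```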
Why is it difficult?
2 Complexity of specifications
Combination of several attributes required for high precision
Adequate atomic similarity functions difficult to detect
Tedious discovery of most adequate mapping
Motivation
Several frameworks, e.g., LIMES, SILK, SLINT+
Differences
1 Domain dependency
2 Automation & user involvement
3 Matching and learning techniques (unsupervised, active, and batch learning)
Questions
Q1 How much does a link cost?
Q2 How much does it cost to train a framework?
Q3 Where is the break-even point?
Empirical Study: Methodology
Define m interlinking tasks
Request links from n annotators
Measure the cost per link (Q1)
Measure tool performance for increasing amounts of training data (Q2)
Find intersection of both lines (Q3)
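As an illustration of Q3, here is a small hypothetical sketch (the helper name, the human F-measure, and the learning curve below are invented for illustration, not taken from the study) of how the break-even point can be read off: take the smallest annotation budget at which the machine's F-measure reaches the human F-measure.

```python
def break_even(human_f, machine_curve):
    """machine_curve: (cost_in_minutes, f_measure) pairs sorted by cost.

    Returns the first cost at which the learned link specification matches
    or exceeds the human F-measure, or None if it never does.
    """
    for cost, f_measure in machine_curve:
        if f_measure >= human_f:
            return cost
    return None


human_f = 0.90                                            # illustrative value
curve = [(5, 0.55), (10, 0.78), (20, 0.92), (40, 0.95)]   # made-up learning curve
print(break_even(human_f, curve))                         # -> 20
```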
Empirical Study
Designed a simple interface to visualize resources
Users are given links and can choose between correct, incorrect, and unsure
https://github.com/AKSW/Evalink
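A hypothetical sketch of the data such an interface needs to record (this is an assumed layout, not the actual EvalLink schema): each judgement stores the link, the verdict, and the time spent, from which the cost per link asked for in Q1 can be aggregated.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CORRECT = "correct"
    INCORRECT = "incorrect"
    UNSURE = "unsure"


@dataclass
class Judgement:
    source_uri: str
    target_uri: str
    verdict: Verdict
    seconds: float            # time the annotator spent on this link


def cost_per_link(judgements):
    """Average annotation time in seconds per judged link."""
    return sum(j.seconds for j in judgements) / len(judgements)


sample = [
    Judgement("dbpedia:Leipzig", "lgd:Leipzig", Verdict.CORRECT, 12.0),
    Judgement("dbpedia:Berlin", "lgd:Bern", Verdict.INCORRECT, 31.5),
]
print(cost_per_link(sample))  # -> 21.75
```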
Empirical Study
Source                      Target                          Restrictions   Properties
DBpedia (18114 instances)   LinkedGeoData (979 instances)   Cities         Label, Latitude, Longitude
DBpedia (13429 instances)   LinkedMDB (628 instances)       -              Label, Film Director, ReleaseDate
DBpedia (5352 instances)    Drugbank (1652 instances)       Drug           Label, GenericName
Defined m = 3 tasks
Tasks submitted to n = 5 human annotators
Specifications for the sets of links to annotate were created manually
Evaluation: Costs/link
Cost per link in seconds:

          Task 1   Task 2   Task 3
User 1     36.8     23.0     10.2
User 2     21.5     18.8     20.4
User 3     12.3     39.4      9.8
User 4     10.9     11.3     34.6
User 5     38.9     43.8     44.7

High variance across users
Cost per link varies between roughly 10 and 40 seconds
Highly dependent on familiarity with the domain and the complexity of the data
Evaluation: Human Accuracy
Task 1
          Precision   Recall   F-Measure
User 1      0.81       0.98      0.89
User 2      0.83       1.00      0.91
User 3      0.74       0.90      0.81
User 4      0.81       0.98      0.88
User 5      0.82       0.99      0.90
Bias towards recall
F-measure varies between roughly 0.8 and 1
Variance due to familiarity with domain
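A minimal sketch of how the precision, recall, and F-measure figures can be computed from the links an annotator accepted versus a gold standard (the link sets below are illustrative, not the study's reference data):

```python
def precision_recall_f1(accepted_links, gold_links):
    """Evaluate accepted links against a gold standard of reference links."""
    accepted, gold = set(accepted_links), set(gold_links)
    true_positives = len(accepted & gold)
    precision = true_positives / len(accepted) if accepted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1


gold = {("dbpedia:Leipzig", "lgd:Leipzig"), ("dbpedia:Berlin", "lgd:Berlin")}
accepted = {("dbpedia:Leipzig", "lgd:Leipzig"), ("dbpedia:Bern", "lgd:Berlin")}
print(precision_recall_f1(accepted, gold))  # -> (0.5, 0.5, 0.5)
```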
Evaluation: Human Accuracy
Task 3
          Precision   Recall   F-Measure
User 1      0.97       0.99      0.98
User 2      0.96       0.98      0.97
User 3      0.94       0.96      0.95
User 4      0.93       0.95      0.94
User 5      0.91       0.93      0.92
Bias towards recall
F-measure varies between roughly 0.8 and 1
Variance due to familiarity with domain
Evaluation: Machine Accuracy
[Figure: F-measure (0 to 1) plotted against annotation cost in minutes (0 to 50) for GAL, GCAL, GBL, and the human baseline]
Three learning algorithms (batch learning, active learning, clustering-based active learning)
Evaluation: Machine Accuracy
[Figure: F-measure (0 to 1) plotted against annotation cost in minutes (0 to 60) for GAL, GCAL, GBL, and the human baseline]
Machines reach above-human performance
Conclusions
Preliminary results
High variance of annotation costs
Costs depend mostly on the evaluator and their familiarity with the domain (roughly 10 to 40 s per link)
Machines outperform humans even on small tasks
Future work
Extend experiments with larger crowd
Try other machine-learning approaches
Evaluate different interfaces
That’s all Folks!
Thank you!
Questions?
Axel Ngonga
University of Leipzig
AKSW Research Group
ngonga@informatik.uni-leipzig.de
http://limes.sf.net