Link Discovery Tutorial
Benchmarking for Instance Matching Systems
Axel-Cyrille Ngonga Ngomo (1), Irini Fundulaki (2), Mohamed Ahmed Sherif (1)
(1) Institute for Applied Informatics, Germany
(2) FORTH, Greece
October 18th, 2016
Kobe, Japan
Ngonga Ngomo et al. (AKSW & FORTH) LD Tutorial:Benchmarking October 17, 2016 1 / 36
The Question(s)
Instance matching research has led to the development of
various systems.
What are the problems that I wish
to solve?
What are the relevant key
performance indicators?
What is the behavior of the existing
engines w.r.t. the key performance
indicators?
Which are the tool(s) that I should use for my data
and for my use case?
Importance of Benchmarking
Benchmarks exist
To allow adequate measurements of systems
To provide evaluation of engines for real (or close to real) use cases
Provide help
Designers and Developers to assess the performance of their tools
Users to compare the different available tools and evaluate suitability for their
needs
Researchers to compare their work to others
Leads to improvements:
Vendors can improve their technology
Researchers can address new challenges
Current benchmark design can be improved to cover new necessities and
application domains
The Answer: Benchmark your engines!
An Instance Matching/Linking Benchmark comprises
Datasets: The raw material of the benchmarks. These are the source and
the target dataset that will be matched together to find the links between
resources
Test Cases: Address heterogeneities (structural, value, semantic) of the
datasets to be matched
Gold Standard (Ground Truth / Reference Alignment): The "correct
answer sheet" used to judge the completeness and soundness of the instance
matching algorithms
Metrics: The performance metric(s) that determine the system's behaviour
and performance
Benchmark Datasets: Characteristics
Nature
Real Datasets: Widely used datasets from a domain of interest
+ Realistic conditions for heterogeneity problems
+ Realistic distributions
- Error prone, hard to create Reference Alignment
Synthetic Datasets: Produced with a data generator (that hopefully produces
data with interesting characteristics)
+ Fully controlled test conditions
+ Accurate, Easy to create Reference Alignments
- Unrealistic distributions
- Systematic heterogeneity problems
Schema
Datasets to be matched have the same or different schemas
Domain
Datasets come from the same or different domains
Benchmark Test Cases: Variations
Value
Name style abbreviations, Typographical errors, change format
(date/gender/number), synonym change, language change (multilinguality)
Structural
Change property depth, Delete/add property, split property values,
transformation of object/data to data/object type property
Semantics
class deletion/modification, invert property assertions, change class/property
hierarchy, assert class disjointness
Combinations of Variations
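The variations listed above can be illustrated with a minimal sketch. The helper functions, the record layout, and the identifiers (src:1, tgt:1) are hypothetical and are not taken from any benchmark generator; the sketch only shows how a synthetic target instance and its gold standard entry could be derived from a source instance by combining value and structural variations.

```python
from datetime import datetime

def abbreviate_name(full_name):
    """Value variation (name style abbreviation): 'Jane Doe' -> 'J. Doe'."""
    first, *rest = full_name.split()
    return " ".join([first[0] + "."] + rest)

def swap_chars(text, i):
    """Value variation (typographical error): swap characters i and i+1."""
    if i < 0 or i + 1 >= len(text):
        return text
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

def change_date_format(iso_date):
    """Value variation (format change): '1984-03-07' -> '07/03/1984'."""
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime("%d/%m/%Y")

def split_property(record, prop, parts):
    """Structural variation: split one property value into several properties."""
    out = {k: v for k, v in record.items() if k != prop}
    for name, value in zip(parts, record[prop].split()):
        out[name] = value
    return out

# Hypothetical source instance.
source = {"id": "src:1", "name": "Jane Doe", "born": "1984-03-07", "city": "London"}

# A single value variation applied on its own.
abbreviated = abbreviate_name(source["name"])

# Combination of variations: two value variations plus one structural variation.
target = dict(source, id="tgt:1")
target["born"] = change_date_format(target["born"])
target["city"] = swap_chars(target["city"], 1)
target = split_property(target, "name", ["givenName", "familyName"])

# Because the target was generated from the source, the gold standard is known.
gold_standard = [(source["id"], target["id"])]
```

Since every target instance is derived from a known source instance, the reference alignment comes for free, which is the main advantage of synthetic datasets noted earlier.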
Benchmark: Gold Standard
The "correct answer sheet" used to judge the completeness and soundness of
the instance matching algorithms
Characteristics
Existence of errors / missing alignments
Representation: owl:sameAs and skos:exactMatch
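As an illustration of the representation, a gold standard can be serialized as one owl:sameAs (or skos:exactMatch) triple per link. The sketch below is a simplified, assumed reader and writer for such N-Triples lines; it handles only triples whose subject and object are IRIs, and the example.org IRIs are placeholders.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

def to_ntriples(links, predicate=OWL_SAME_AS):
    """Serialize (source, target) IRI pairs as N-Triples lines."""
    return "\n".join(f"<{s}> <{predicate}> <{t}> ." for s, t in links)

def from_ntriples(text, predicates=(OWL_SAME_AS, SKOS_EXACT_MATCH)):
    """Read back the links; lines with other predicates are ignored.

    Simplified on purpose: assumes IRI subjects and objects without spaces.
    """
    links = set()
    for line in text.splitlines():
        parts = line.strip().rstrip(".").split()
        if len(parts) == 3 and parts[1].strip("<>") in predicates:
            links.add((parts[0].strip("<>"), parts[2].strip("<>")))
    return links

# Placeholder IRIs for one reference link.
links = [("http://example.org/source/1", "http://example.org/target/1")]
serialized = to_ntriples(links)
```

For real data, a proper RDF parser is the safer choice; the point here is only the shape of the reference alignment.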
Benchmark: Metrics
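Precision: $P = \frac{tp}{tp + fp}$
Recall: $R = \frac{tp}{tp + fn}$
F-measure: $F = \frac{2 \times P \times R}{P + R}$
where tp, fp and fn are the numbers of true positive, false positive and false negative links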
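A minimal sketch of how these metrics can be computed from a gold standard, assuming the usual definitions: precision is the share of returned links that are correct, recall is the share of gold standard links that are returned, and the F-measure is their harmonic mean. The function name and the example link identifiers are illustrative only.

```python
def evaluate(system_links, gold_links):
    """Return (precision, recall, f_measure) for a set of predicted links."""
    system_links, gold_links = set(system_links), set(gold_links)
    tp = len(system_links & gold_links)  # links returned and in the gold standard
    fp = len(system_links - gold_links)  # links returned but not in the gold standard
    fn = len(gold_links - system_links)  # gold standard links that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

gold = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
found = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}
p, r, f = evaluate(found, gold)  # tp = 2, fp = 1, fn = 2
```

In this example two of the three returned links are correct (P = 2/3) and two of the four gold standard links are found (R = 1/2), which gives F = 4/7.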
Instance Matching Benchmarks:
Desirable Attributes
Systematic Procedure: matching tasks should be reproducible and the execution must be comparable
Availability: benchmark should be available
Quality: precise evaluation rules and high quality ontologies must be provided
Equity: evaluation process should not privilege any system
Dissemination: benchmark should be used to evaluate instance matching systems
Volume: dataset size
Gold Standard: gold standard should exist and be as accurate as possible
What about Benchmarks?
Instance matching techniques have, until recently, been
benchmarked in an ad-hoc way.
There is no standard way of benchmarking the performance
of the systems, when it comes to Linked Data.
Ontology Alignment Evaluation Initiative
IM benchmarks have been mainly driven forward by the Ontology
Alignment Evaluation Initiative (OAEI)
Has organized an annual campaign for ontology matching since 2005
Hosts independent benchmarks
In 2009, OAEI introduced the Instance Matching (IM) Track
The track focuses on the evaluation of different instance matching techniques and tools
for Linked Data
Instance Matching Benchmarks
Benchmark Generators
Synthetic Benchmarks
Real Benchmarks
Semantic Web Instance Generation
(SWING) [FMN+11]
Semi-automatic generator of Instance Matching Benchmarks
Contributed to the generation of the IIMB Benchmarks of the OAEI 2010, 2011
and 2012 Instance Matching Tracks
Freely available at (https://code.google.com/p/swing-generator/)
All kinds of variations are supported in the benchmarks, except multilinguality
Automatically produced gold standard
Lance [SDF+15b]
Flexible, generic and domain-independent benchmark generator which takes
into consideration RDFS and OWL constructs in order to evaluate instance
matching systems.
Lance [SDF+15b]
Lance provides support for:
Semantics-aware transformations
Complex class definitions (union, intersection)
Complex property definitions (functional properties, inverse functional
properties)
Disjointness (properties)
Standard value and structure based transformations
Weighted gold standard based on tensor factorization
Varying degrees of difficulty and fine-grained evaluation metrics
Available at http://github.com/jsaveta/Lance
Lance Architecture
Synthetic Benchmarks
Ontology Alignment Evaluation Benchmarks
Synthetic Instance Matching Benchmarks:
Overview (1)
                      IIMB     IIMB     PR       IIMB     Sandbox  IIMB     RDFT     ID-REC   Author Task
                      2009     2010     2010     2011     2012     2012     2013     2014     2015
Systematic Procedure  √        √        √        √        √        √        √        √        √
Availability          √        √        √        √        √        -        -        √        √
Quality               √        √        √        √        √        √        √        √        √
Equity                √        √        √        √        √        √        √        √        √
Dissemination         6        3        6        1        3        4        4        5        5
Volume                0.2K     1.4K     0.86K    4K       0.375K   1.5K     0.43K    2.650K   10K
Gold Standard         √        √        √        √        √        √        √        √        √
Synthetic Instance Matching Benchmarks:
Overview (2)
                       IIMB     IIMB     PR       IIMB     Sandbox  IIMB     RDFT     ID-REC   Author Task
                       2009     2010     2010     2011     2012     2012     2013     2014     2015
Value Variations       √        √        √        √        √        √        √        √        √
Structural Variations  √        √        √        √        -        -        -        +        +
Logical Variations     √        √        -        √        -        √        -        -        -
Multilinguality        -        -        -        -        -        -        √        √        √
Blind Evaluations      -        -        -        -        -        -        √        √        √
1-n Mappings           -        -        √        -        -        -        √        √        -
Synthetic Instance Matching Benchmarks:
Overview (3)
                      IIMB     IIMB     PR       IIMB     Sandbox  IIMB     RDFT     ID-REC   Author Task  Lance
                      2009     2010     2010     2011     2012     2012     2013     2014     2015         2015
Systematic Procedure  √        √        √        √        √        √        √        √        √            √
Availability          √        √        √        √        √        -        -        √        √            √
Quality               √        √        √        √        √        √        √        √        √            √
Equity                √        √        √        √        √        √        √        √        √            √
Dissemination         6        3        6        1        3        4        4        5        5            2
Volume                0.2K     1.4K     0.86K    4K       0.375K   1.5K     0.43K    2.650K   10K          > 1M
Gold Standard         √        √        √        √        √        √        √        √        √            √
Synthetic Instance Matching Benchmarks:
Overview (4)
                       IIMB     IIMB     PR       IIMB     Sandbox  IIMB     RDFT     ID-REC   Author Task  Lance
                       2009     2010     2010     2011     2012     2012     2013     2014     2015         2015
Value Variations       √        √        √        √        √        √        √        √        √            √
Structural Variations  √        √        √        √        -        -        -        +        +            +
Logical Variations     √        √        -        √        -        √        -        -        -            +
Multilinguality        -        -        -        -        -        -        √        √        √            √
Synthetic Instance Matching Benchmarks:
Overview (5)
                       IIMB     IIMB     PR       IIMB     Sandbox  IIMB     RDFT     ID-REC   Author Task  Lance
                       2009     2010     2010     2011     2012     2012     2013     2014     2015         2015
Blind Evaluations      -        -        -        -        -        -        √        √        √            √
1-n Mappings           -        -        √        -        -        -        √        √        -            -
Real Benchmarks
Real Instance Matching Benchmarks:
Overview (1)
                      ARS      DI 2010  DI 2011
Systematic Procedure  √        √        √
Availability          √        √        -
Quality               √        √        √
Equity                √        √        √
Dissemination         5        2        3
Volume                100K     6K       NA
Gold Standard         √        √        +
Real Instance Matching Benchmarks:
Overview (2)
                       ARS      DI 2010  DI 2011
Value Variations       √        √        √
Structural Variations  √        √        -
Logical Variations     -        -        -
Multilinguality        -        -        -
Blind Evaluations      -        -        -
Wrapping Up
Multilinguality
Wrapping Up
Value Variations
Wrapping Up
Structural Variations
Wrapping Up
Logical Variations
Wrapping Up
Combinations of Variations
Wrapping Up
Scalability
Open Issues
Only one benchmark tackles both combinations of variations and
scalability issues
Not enough IM benchmarks use the full expressiveness of the RDF/OWL
languages
Systems
Systems can handle value variations, structural variations, and
simple logical variations separately.
More work needed for complex variations (combination of value, structural,
and logical)
More work needed for structural variations
Enhancement of systems to cope with the clustering of the mappings (1-n
mappings)
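One possible way to cluster mappings, sketched under the assumption that links are treated as an equivalence relation (as owl:sameAs semantics suggest): a union-find structure groups all resources that are connected by links, so a 1-n mapping ends up as a single cluster. This is an illustrative technique, not the method of any particular system, and the resource names are made up.

```python
from collections import defaultdict

def cluster_links(links):
    """Group resources connected by links into clusters (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for s, t in links:
        root_s, root_t = find(s), find(t)
        if root_s != root_t:
            parent[root_t] = root_s  # merge the two clusters

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    # Sorted output so that results are deterministic.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])

# a1 is mapped to two target resources (a 1-n mapping); a3 to one.
links = [("a1", "b1"), ("a1", "b2"), ("a3", "b3")]
clusters = cluster_links(links)
```

Here a1, b1 and b2 fall into one cluster and a3, b3 into another. Whether transitive closure is appropriate depends on the link type; for weaker relations than owl:sameAs it may merge too much.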
Conclusions
Many instance matching benchmarks have been proposed
Each of them answers some of the needs of instance matching systems.
It is essential to start creating benchmarks that will “show the way to the
future”
Extend the limits of existing systems.
Acknowledgment
This work was supported by grants from the EU H2020 Framework Programme
provided for the project HOBBIT (GA no. 688227).
References I
 
The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017
 
4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...
4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...
4th Natural Language Interface over the Web of Data (NLIWoD) workshop and QAL...
 
Scalable Link Discovery for Modern Data-Driven Applications (poster)
Scalable Link Discovery for Modern Data-Driven Applications (poster)Scalable Link Discovery for Modern Data-Driven Applications (poster)
Scalable Link Discovery for Modern Data-Driven Applications (poster)
 
Scalable Link Discovery for Modern Data-Driven Applications
Scalable Link Discovery for Modern Data-Driven ApplicationsScalable Link Discovery for Modern Data-Driven Applications
Scalable Link Discovery for Modern Data-Driven Applications
 
Extending LargeRDFBench for Multi-Source Data at Scale for SPARQL Endpoint F...
 Extending LargeRDFBench for Multi-Source Data at Scale for SPARQL Endpoint F... Extending LargeRDFBench for Multi-Source Data at Scale for SPARQL Endpoint F...
Extending LargeRDFBench for Multi-Source Data at Scale for SPARQL Endpoint F...
 
SPgen: A Benchmark Generator for Spatial Link Discovery Tools
SPgen: A Benchmark Generator for Spatial Link Discovery ToolsSPgen: A Benchmark Generator for Spatial Link Discovery Tools
SPgen: A Benchmark Generator for Spatial Link Discovery Tools
 
Introducing the HOBBIT platform into the Ontology Alignment Evaluation Campaign
Introducing the HOBBIT platform into the Ontology Alignment Evaluation CampaignIntroducing the HOBBIT platform into the Ontology Alignment Evaluation Campaign
Introducing the HOBBIT platform into the Ontology Alignment Evaluation Campaign
 
OKE2018 Challenge @ ESWC2018
OKE2018 Challenge @ ESWC2018OKE2018 Challenge @ ESWC2018
OKE2018 Challenge @ ESWC2018
 
MOCHA 2018 Challenge @ ESWC2018
MOCHA 2018 Challenge @ ESWC2018MOCHA 2018 Challenge @ ESWC2018
MOCHA 2018 Challenge @ ESWC2018
 
Dynamic planning for link discovery - ESWC 2018
Dynamic planning for link discovery - ESWC 2018Dynamic planning for link discovery - ESWC 2018
Dynamic planning for link discovery - ESWC 2018
 
Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017
 
Leopard ISWC Semantic Web Challenge 2017 (poster)
Leopard ISWC Semantic Web Challenge 2017 (poster)Leopard ISWC Semantic Web Challenge 2017 (poster)
Leopard ISWC Semantic Web Challenge 2017 (poster)
 
Leopard ISWC Semantic Web Challenge 2017
Leopard ISWC Semantic Web Challenge 2017Leopard ISWC Semantic Web Challenge 2017
Leopard ISWC Semantic Web Challenge 2017
 

Recently uploaded

Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxAleenaTreesaSaji
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 

Recently uploaded (20)

Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 

Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems

  • 1. Link Discovery Tutorial: Benchmarking for Instance Matching Systems. Axel-Cyrille Ngonga Ngomo(1), Irini Fundulaki(2), Mohamed Ahmed Sherif(1). (1) Institute for Applied Informatics, Germany; (2) FORTH, Greece. October 18th, 2016, Kobe, Japan. Ngonga Ngomo et al. (AKSW & FORTH) LD Tutorial: Benchmarking October 17, 2016 1 / 36
  • 2. The Question(s): Instance matching research has led to the development of various systems. What are the problems that I wish to solve? What are the relevant key performance indicators? What is the behavior of the existing engines w.r.t. the key performance indicators? Which tool(s) should I use for my data and for my use case?
  • 3. Importance of Benchmarking: Benchmarks exist to allow adequate measurement of systems and to provide evaluation of engines for real (or close to real) use cases. They help designers and developers to assess the performance of their tools, users to compare the different available tools and evaluate their suitability for their needs, and researchers to compare their work to others. This leads to improvements: vendors can improve their technology, researchers can address new challenges, and current benchmark design can be improved to cover new necessities and application domains.
  • 4. The Answer: Benchmark your engines! An instance matching/linking benchmark comprises: Datasets: the raw material of the benchmark; these are the source and target datasets that will be matched together to find the links between resources. Test Cases: address heterogeneities (structural, value, semantic) of the datasets to be matched. Gold Standard (Ground Truth / Reference Alignment): the "correct answer sheet" used to judge the completeness and soundness of the instance matching algorithms. Metrics: the performance metric(s) that determine the systems' behaviour and performance.
  • 5. Benchmark Datasets: Characteristics. Nature: Real datasets are widely used datasets from a domain of interest (+ realistic conditions for heterogeneity problems; + realistic distributions; - reference alignment is error-prone and hard to create). Synthetic datasets are produced with a data generator that hopefully produces data with interesting characteristics (+ fully controlled test conditions; + accurate, easy-to-create reference alignments; - unrealistic distributions; - systematic heterogeneity problems). Schema: datasets to be matched have the same or different schemas. Domain: datasets come from the same or different domains.
  • 6. Benchmark Test Cases: Variations. Value: name style abbreviations, typographical errors, format changes (date/gender/number), synonym changes, language changes (multilinguality). Structural: change of property depth, deletion/addition of properties, splitting of property values, transformation of object/data type properties to data/object type properties. Semantics: class deletion/modification, inversion of property assertions, changes to the class/property hierarchy, assertion of class disjointness. Combinations of variations.
  • 7. Benchmark: Gold Standard. The "correct answer sheet" used to judge the completeness and soundness of the instance matching algorithms. Characteristics: existence of errors / missing alignments. Representation: owl:sameAs and skos:exactMatch.
  • 8. Benchmark: Metrics. Precision P = tp / (tp + fp); Recall R = tp / (tp + fn); F-measure F = 2 × P × R / (P + R), where tp, fp and fn are the numbers of true positive, false positive and false negative links.
  • 9. Instance Matching Benchmarks: Desirable Attributes. Systematic Procedure: matching tasks should be reproducible and the execution must be comparable. Availability: the benchmark should be available. Quality: precise evaluation rules and high quality ontologies must be provided. Equity: the evaluation process should not privilege any system. Dissemination: the benchmark should be used to evaluate instance matching systems. Volume: dataset size. Gold Standard: a gold standard should exist and be as accurate as possible.
  • 10. What about Benchmarks? Instance matching techniques have, until recently, been benchmarked in an ad-hoc way. There is no standard way of benchmarking the performance of the systems when it comes to Linked Data.
  • 11. Ontology Alignment Evaluation Initiative: IM benchmarks have been mainly driven forward by the Ontology Alignment Evaluation Initiative (OAEI), which has organized an annual campaign for ontology matching since 2005 and hosts independent benchmarks. In 2009, OAEI introduced the Instance Matching (IM) Track, which focuses on the evaluation of different instance matching techniques and tools for Linked Data.
  • 12. Instance Matching Benchmarks: Benchmark Generators, Synthetic Benchmarks, Real Benchmarks.
  • 13. Semantic Web Instance Generation (SWING) [FMN+11]: Semi-automatic generator of instance matching benchmarks. Contributed to the generation of the IIMB benchmarks of the OAEI Instance Matching Tracks in 2010, 2011 and 2012. Freely available (https://code.google.com/p/swing-generator/). Supports all kinds of variations in the benchmarks except multilinguality. Automatically produced gold standard.
  • 14. Lance [SDF+15b]: Flexible, generic and domain-independent benchmark generator which takes into consideration RDFS and OWL constructs in order to evaluate instance matching systems.
  • 15. Lance [SDF+15b]: Lance provides support for: semantics-aware transformations; complex class definitions (union, intersection); complex property definitions (functional properties, inverse functional properties); disjointness (properties); standard value and structure based transformations; a weighted gold standard based on tensor factorization; varying degrees of difficulty and fine-grained evaluation metrics. Available at http://github.com/jsaveta/Lance
  • 16. Lance Architecture
  • 17. Synthetic Benchmarks: Ontology Alignment Evaluation Benchmarks
  • 18. Synthetic Instance Matching Benchmarks: Overview (1)

    | Criterion | IIMB 2009 | IIMB 2010 | PR 2010 | IIMB 2011 | Sandbox 2012 | IIMB 2012 | RDFT 2013 | ID-REC 2014 | Author Task 2015 |
    |---|---|---|---|---|---|---|---|---|---|
    | Systematic Procedure | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Availability | √ | √ | √ | √ | √ | - | - | √ | √ |
    | Quality | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Equity | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Dissemination | 6 | 3 | 6 | 1 | 3 | 4 | 4 | 5 | 5 |
    | Volume | 0.2K | 1.4K | 0.86K | 4K | 0.375K | 1.5K | 0.43K | 2.650K | 10K |
    | Gold Standard | √ | √ | √ | √ | √ | √ | √ | √ | √ |

  • 19. Synthetic Instance Matching Benchmarks: Overview (2)

    | Criterion | IIMB 2009 | IIMB 2010 | PR 2010 | IIMB 2011 | Sandbox 2012 | IIMB 2012 | RDFT 2013 | ID-REC 2014 | Author Task 2015 |
    |---|---|---|---|---|---|---|---|---|---|
    | Value Variations | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Structural Variations | √ | √ | √ | √ | - | - | - | + | + |
    | Logical Variations | √ | √ | - | √ | - | √ | - | - | - |
    | Multilinguality | - | - | - | - | - | - | √ | √ | √ |
    | Blind Evaluations | - | - | - | - | - | - | √ | √ | √ |
    | 1-n Mappings | - | - | √ | - | - | - | √ | √ | - |

  • 20. Synthetic Instance Matching Benchmarks: Overview (3)

    | Criterion | IIMB 2009 | IIMB 2010 | PR 2010 | IIMB 2011 | Sandbox 2012 | IIMB 2012 | RDFT 2013 | ID-REC 2014 | Author Task 2015 | Lance 2015 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Systematic Procedure | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Availability | √ | √ | √ | √ | √ | - | - | √ | √ | √ |
    | Quality | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Equity | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Dissemination | 6 | 3 | 6 | 1 | 3 | 4 | 4 | 5 | 5 | 2 |
    | Volume | 0.2K | 1.4K | 0.86K | 4K | 0.375K | 1.5K | 0.43K | 2.650K | 10K | > 1M |
    | Gold Standard | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |

  • 21. Synthetic Instance Matching Benchmarks: Overview (4)

    | Criterion | IIMB 2009 | IIMB 2010 | PR 2010 | IIMB 2011 | Sandbox 2012 | IIMB 2012 | RDFT 2013 | ID-REC 2014 | Author Task 2015 | Lance 2015 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Value Variations | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
    | Structural Variations | √ | √ | √ | √ | - | - | - | + | + | + |
    | Logical Variations | √ | √ | - | √ | - | √ | - | - | - | + |
    | Multilinguality | - | - | - | - | - | - | √ | √ | √ | √ |

  • 22. Synthetic Instance Matching Benchmarks: Overview (5)

    | Criterion | IIMB 2009 | IIMB 2010 | PR 2010 | IIMB 2011 | Sandbox 2012 | IIMB 2012 | RDFT 2013 | ID-REC 2014 | Author Task 2015 | Lance 2015 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Blind Evaluations | - | - | - | - | - | - | √ | √ | √ | √ |
    | 1-n Mappings | - | - | √ | - | - | - | √ | √ | - | - |

  • 23. Real Benchmarks
  • 24. Real Instance Matching Benchmarks: Overview (1)

    | Criterion | ARS | DI 2010 | DI 2011 |
    |---|---|---|---|
    | Systematic Procedure | √ | √ | √ |
    | Availability | √ | √ | - |
    | Quality | √ | √ | √ |
    | Equity | √ | √ | √ |
    | Dissemination | 5 | 2 | 3 |
    | Volume | 100K | 6K | NA |
    | Gold Standard | √ | √ | + |

  • 25. Real Instance Matching Benchmarks: Overview (2)

    | Criterion | ARS | DI 2010 | DI 2011 |
    |---|---|---|---|
    | Value Variations | √ | √ | √ |
    | Structural Variations | √ | √ | - |
    | Logical Variations | - | - | - |
    | Multilinguality | - | - | - |
    | Blind Evaluations | - | - | - |

  • 26. Wrapping Up: Multilinguality
  • 27. Wrapping Up: Value Variations
  • 28. Wrapping Up: Structural Variations
  • 29. Wrapping Up: Logical Variations
  • 30. Wrapping Up: Combinations of Variations
  • 31. Wrapping Up: Scalability
  • 32. Open Issues: Only one benchmark tackles both combinations of variations and scalability issues. Not enough IM benchmarks use the full expressiveness of the RDF/OWL languages.
  • 33. Systems: Systems can handle value variations, structural variations, and simple logical variations separately. More work is needed for complex variations (combinations of value, structural, and logical variations). More work is needed for structural variations. Systems should be enhanced to cope with the clustering of mappings (1-n mappings).
  • 34. Conclusions: Many instance matching benchmarks have been proposed, each of them answering some of the needs of instance matching systems. It is essential to start creating benchmarks that will “show the way to the future” and extend the limits of existing systems.
  • 35. Acknowledgment: This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).
  • 36. References I
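
As an illustration of the metrics on slide 8, the following is a minimal Python sketch of how precision, recall and F-measure might be computed for a set of links returned by an instance matching system against a gold standard. It uses the standard definitions (precision is measured over the returned links, recall over the gold-standard links). The function name `evaluate_links`, the representation of a link as a (source, target) tuple, and the example URIs are assumptions made for this sketch; they are not taken from the tutorial or from any particular benchmark.

```python
def evaluate_links(found, gold):
    """Return (precision, recall, f_measure) for found links vs. a gold standard.

    Both arguments are iterables of (source_uri, target_uri) pairs.
    This representation is an assumption of the sketch.
    """
    found, gold = set(found), set(gold)
    tp = len(found & gold)   # links returned and present in the gold standard
    fp = len(found - gold)   # links returned but not in the gold standard
    fn = len(gold - found)   # gold-standard links that were missed

    # Guard against empty sets to avoid division by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return precision, recall, f_measure


# Hypothetical example data: four gold-standard links, three returned links.
gold = {("s:a", "t:a"), ("s:b", "t:b"), ("s:c", "t:c"), ("s:d", "t:d")}
found = {("s:a", "t:a"), ("s:b", "t:b"), ("s:c", "t:x")}

p, r, f = evaluate_links(found, gold)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```

In this example two of the three returned links are correct and two of the four gold-standard links are missed, so precision is 2/3, recall is 1/2 and the F-measure is 4/7. Treating links as an unordered set of pairs is a simplification; it ignores the weighted gold standard mentioned for Lance.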