Link Reuse and Evolution for Data Integration
Anika Groß
It's all about the data
Link Reuse and Evolution
for Data Integration
Anika Groß
8. Leipziger Semantic Web Tag, 17.06.2020
Data Science Workflow
Data acquisition → Data extraction & cleaning → Data integration & enrichment → Analytics (descriptive, predictive) → Visualization → Interpretation
Data integration (variety): combine data from various sources
• Exploit potential of data
• Added value, new insights
• Improved interoperability
HOW? Based on links between objects
Steps of data integration & enrichment: Pre-processing → Matching / Linking → Merge / Fusion → Post-processing
Logos/pictures: pixabay.com, © Can Stock Photo / memoangeles
Matching / Linking
Aim: (Semi-)automatically interconnect different data sources via explicit links
• Schema level
  • Schema and ontology matching
  • Schema merging
• Semantic annotation
  • Linking instances with ontology concepts
  • Entity linking
• Instance level
  • Entity resolution, link discovery
  • Object fusion
Example (schema level), two disease taxonomies to be matched:
  • Krankheiten (diseases) > … > Hämatologische Krankheit (hematological disease) > Blutarmut (anemia), Leukopenie (leukopenia), …
  • Disease > … > Hematological Disease > Cytopenia > Anemia, Leukopenia, Thrombocytopenia, …
Example (instance level), two text fragments to be linked and annotated:
  • "Severe anemia (hemoglobin < 8 g/dL), leukopenia (white blood cell count [WBC] < 2500 mm³), thrombocytopenia (platelet count < 80,000 mm³)"
  • "Patients with significantly impaired bone marrow function or significant anemia, leukopenia, or thrombocytopenia"
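As a rough illustration of semantic annotation (linking instances with ontology concepts), the sketch below annotates one of the example text fragments by plain dictionary lookup. The concept identifiers, the synonym table and the function name `annotate` are assumptions made up for this example; real annotation tools use far richer matching than exact token lookup.

```python
import re

# Hypothetical mini-ontology: concept id -> labels/synonyms (illustrative only).
CONCEPT_LABELS = {
    "C:Anemia": ["anemia", "blutarmut"],
    "C:Leukopenia": ["leukopenia", "leukopenie"],
    "C:Thrombocytopenia": ["thrombocytopenia"],
}

def annotate(text):
    """Return sorted (concept, matched label) links for labels occurring as tokens."""
    tokens = set(re.findall(r"\w+", text.lower()))
    links = set()
    for concept, labels in CONCEPT_LABELS.items():
        for label in labels:
            if label in tokens:
                links.add((concept, label))
    return sorted(links)

criterion = ("Patients with significantly impaired bone marrow function "
             "or significant anemia, leukopenia, or thrombocytopenia")
annotations = annotate(criterion)
```

Each resulting pair is an explicit link between a document fragment and an ontology concept, which is the kind of link that can later be verified and reused.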
Data is not static: Evolution, Dynamics
≥ 2 input sources → Integration & enrichment (linking, fusion, …) → Analysis (e.g. graph-based) → Result interpretation
• Intra-source links
• Inter-source links
• Links between different versions, temporal links
Agenda
✓ Introduction
  ✓ Data Science Workflow
  ✓ Matching / Linking
  ✓ Evolution
• Link Reuse
• Link Evolution and Temporal Linking
• Future Research Directions
Again and again …
• Implementation of matching tools/algorithms
• Configuration of matching workflows
• Verification of links
Pain points: many, many test runs; real, tiny, or no improvement; cooperativeness of domain experts
Link Reuse
Again and again …
• Implementation of matching tools/algorithms
• Configuration of matching workflows
• Verification of links
Existing links between (meta)data sources
• Linked Open Data Cloud
• Repositories/platforms: BioPortal, local / own project, sameas.org
• …
✗ No solution: manual or (semi-)automatic matching
⊂ Partial solution, ✓ Complete solution: link reuse instead of full (manual or automatic) re-determination
Aims
• Improved match result quality
• Less effort
• Link update (evolution)
Link Reuse - Methods
• Composition: combine mappings via intermediate sources (S1 and S2 are linked directly, or indirectly via intermediate sources I1, I2)
• Clustering: create groups of (connected) entities
• Supervised Learning: train ML model
• Evolution: connect and update over time
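Composition can be sketched as a join of two existing mappings on their shared intermediate source. The following is a minimal sketch, assuming mappings are plain sets of (from, to) pairs; the identifiers and the function name `compose` are hypothetical, and similarity scores and conflict handling are left out.

```python
from collections import defaultdict

def compose(map_source_to_mid, map_mid_to_target):
    """Join two mappings (sets of (from, to) links) on the shared intermediate ids."""
    targets_by_mid = defaultdict(set)
    for mid, target in map_mid_to_target:
        targets_by_mid[mid].add(target)
    return {
        (source, target)
        for source, mid in map_source_to_mid
        for target in targets_by_mid.get(mid, ())
    }

# Hypothetical links via two intermediate sources I1 and I2.
s1_to_i1 = {("s1:Anemia", "i1:0001"), ("s1:Leukopenia", "i1:0002")}
i1_to_s2 = {("i1:0001", "s2:Blutarmut"), ("i1:0002", "s2:Leukopenie")}
s1_to_i2 = {("s1:Anemia", "i2:A")}
i2_to_s2 = {("i2:A", "s2:Blutarmut")}

# Indirect S1-S2 links: union of the compositions over both intermediates.
indirect = compose(s1_to_i1, i1_to_s2) | compose(s1_to_i2, i2_to_s2)
```

A link that is found via several intermediate sources (here the anemia link) could be treated as better supported than one found via a single path; that weighting is not part of this sketch.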
Link Reuse – in my research
Composition
• Indirect ontology matching (schema level)
Clustering
• Holistic entity clustering for linked data (instance level)
• Semantic annotation of medical documents
Supervised Learning
• Combination of results from different semantic annotation tools
Evolution / Temporal Linking
• Ontology mapping evolution and update (schema level)
• Temporal group linkage for census data (instance level)
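The clustering idea, creating groups of connected entities from existing links, can be illustrated with connected components over a link set. This is a minimal sketch using union-find, not the holistic clustering approach from the cited work; the entity identifiers and the function name `cluster_entities` are assumptions for illustration.

```python
def cluster_entities(links):
    """Group entities into clusters: the connected components of the link graph."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for entity in list(parent):
        clusters.setdefault(find(entity), set()).add(entity)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical links between entities of three sources A, B and C.
existing_links = [
    ("srcA:leipzig", "srcB:leipzig"),
    ("srcB:leipzig", "srcC:leipzig"),
    ("srcA:dresden", "srcB:dresden"),
]
clusters = cluster_entities(existing_links)
```

Within a cluster, entity pairs that were never linked directly (here `srcA:leipzig` and `srcC:leipzig`) become candidates for new links, which is one way existing links can be reused.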
Link Evolution and Temporal Linking
Reuse existing intra- or inter-source links
• Find links between different source versions or temporal datasets
  S1 → S1' → S1'' → …, with version mapping M(S1, S1') and diff(S1, S1')
• Update set of outdated links between older versions
  M(S1, S2) → M(S1', S2'), using diff(S1, S1'), diff(S2, S2') and the version mappings M(S1, S1'), M(S2, S2')
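The update of an outdated mapping between S1 and S2 to the new versions S1' and S2' can be sketched as follows. This is a deliberately simplified sketch that assumes each version is a set of concept ids and that the diff consists only of additions and deletions; complex changes such as merges or splits are not covered, and all names (`diff`, `adapt_mapping`, the ids) are hypothetical.

```python
def diff(old_version, new_version):
    """Return (added, deleted) concept ids between two versions of a source."""
    return new_version - old_version, old_version - new_version

def adapt_mapping(mapping, s1_old, s1_new, s2_old, s2_new):
    """Split an old mapping into reusable links, outdated links and concepts to match."""
    added1, deleted1 = diff(s1_old, s1_new)
    added2, deleted2 = diff(s2_old, s2_new)
    # Links whose endpoints both survive can be carried over to the new versions.
    kept = {(a, b) for a, b in mapping if a not in deleted1 and b not in deleted2}
    outdated = mapping - kept
    # Only newly added concepts need a fresh (manual or automatic) match.
    to_match = added1 | added2
    return kept, outdated, to_match

s1_old, s1_new = {"a1", "a2", "a3"}, {"a1", "a2", "a4"}
s2_old, s2_new = {"b1", "b2", "b3"}, {"b1", "b2", "b3", "b5"}
old_mapping = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}

kept, outdated, to_match = adapt_mapping(old_mapping, s1_old, s1_new, s2_old, s2_new)
```

The point of the design is that only `to_match` has to go through matching again, while `kept` is reused unchanged.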
Temporal Group Linkage for Census Data
Temporal Linking · Instance level · Reuse
• 6 censuses (1851-1901) in Rawtenstall, Lancashire, U.K.
• Household graphs (known family connections) but unknown temporal links
[Figure: household graphs from the 1871 and 1881 censuses for the Ashworth and Smith families; persons (e.g. John Ashworth, Elizabeth Ashworth, Steve Smith, Alice Smith) are connected by roles such as head, wife, son, daughter and father in law.]
Problems
• Attribute values change over time (surname, occupation)
• Difficult disambiguation (same first name and surname)
• Poor data quality (misspelling etc.)
• …
Temporal Entity and Group Linkage
• Method → paper
• ≈ 96% F-Measure for record and group mapping (2-9% improvement over compared approaches)
Christen, Groß, Fisher et al.: Temporal group linkage and evolution analysis for census data. Intl. Conf. on Extending Database Technology (EDBT), 2017.
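To make the listed problems tangible, here is a minimal sketch of record-level temporal linking between two censuses taken ten years apart. It is not the method of the cited paper: the weights, the threshold, the ages and the record ids are all assumptions, and household context is ignored. Surname similarity gets a low weight because surnames can change over time, and the age is expected to grow by the census interval.

```python
from difflib import SequenceMatcher

def name_sim(a, b):
    """String similarity in [0, 1] that tolerates small misspellings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def record_sim(old, new, years_between=10):
    """Weighted similarity; the weights are illustrative assumptions."""
    age_gap = abs((new["age"] - old["age"]) - years_between)
    age_score = max(0.0, 1.0 - age_gap / 5.0)
    return (0.5 * name_sim(old["first"], new["first"])
            + 0.2 * name_sim(old["surname"], new["surname"])
            + 0.3 * age_score)

def link_records(old_records, new_records, threshold=0.75):
    """Link each old record to its most similar new record above the threshold."""
    links = {}
    for old in old_records:
        best = max(new_records, key=lambda new: record_sim(old, new))
        if record_sim(old, best) >= threshold:
            links[old["id"]] = best["id"]
    return links

# Hypothetical records; names follow the slide example, ages are invented.
census_1871 = [
    {"id": "1871-1", "first": "John", "surname": "Ashworth", "age": 40},
    {"id": "1871-2", "first": "Alice", "surname": "Ashworth", "age": 15},
    {"id": "1871-3", "first": "Steve", "surname": "Smith", "age": 17},
]
census_1881 = [
    {"id": "1881-1", "first": "John", "surname": "Ashworth", "age": 50},
    {"id": "1881-2", "first": "Alice", "surname": "Smith", "age": 25},
    {"id": "1881-3", "first": "Steve", "surname": "Smith", "age": 27},
    {"id": "1881-4", "first": "Mary", "surname": "Smith", "age": 3},
]
temporal_links = link_records(census_1871, census_1881)
```

In this toy data Alice is still linked despite the changed surname, because first name and age progression dominate the score. With frequent names this record-only comparison becomes ambiguous, which is where known household relationships can help.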
Evolution Patterns and Evolution Graph
• Evolution patterns on individual level (preserve, add, remove) and group level (split, merge, move, …)
• Evolution graph over longer time periods
Christen, Groß, Fisher et al.: Temporal group linkage and evolution analysis for census data. Intl. Conf. on Extending Database Technology (EDBT), 2017.
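Given temporal record links, evolution patterns can be derived by simple bookkeeping. The sketch below uses simplified working definitions that are assumptions, not the paper's: a person is preserved if linked, removed if only in the old census, added if only in the new one; a household splits if its linked members end up in more than one new household, and a new household is a merge if it receives linked members from more than one old household. All ids are hypothetical.

```python
from collections import defaultdict

def individual_patterns(old_ids, new_ids, links):
    """Classify persons as preserve / remove / add based on temporal links."""
    linked_new = set(links.values())
    return {
        "preserve": sorted(o for o in old_ids if o in links),
        "remove": sorted(o for o in old_ids if o not in links),
        "add": sorted(n for n in new_ids if n not in linked_new),
    }

def group_patterns(old_household, new_household, links):
    """Return (split old households, merged new households)."""
    targets = defaultdict(set)  # old household -> new households of its members
    sources = defaultdict(set)  # new household -> old households of its members
    for old_id, new_id in links.items():
        targets[old_household[old_id]].add(new_household[new_id])
        sources[new_household[new_id]].add(old_household[old_id])
    splits = sorted(h for h, t in targets.items() if len(t) > 1)
    merges = sorted(h for h, s in sources.items() if len(s) > 1)
    return splits, merges

# Hypothetical person -> household assignments and temporal links.
old_household = {"john_a": "H1", "alice_a": "H1", "riley": "H1", "steve_s": "H2"}
new_household = {"john_a2": "G1", "alice_s": "G3", "steve_s2": "G3", "mary_s": "G3"}
links = {"john_a": "john_a2", "alice_a": "alice_s", "steve_s": "steve_s2"}

persons = individual_patterns(old_household, new_household, links)
splits, merges = group_patterns(old_household, new_household, links)
```

Chaining such pattern sets over consecutive census pairs would give the edges of an evolution graph over longer time periods.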
“The Reuse Application”: Knowledge Graphs
• Continuous reuse and integration
  • of instances, ontology concepts and links from various sources
  • methods: matching/link discovery, NLP, entity linking, clustering, fusion/merging, … + expert knowledge / verification
• Evolution and update
  • Can be a highly dynamic graph
  • Direct change in the knowledge graph
  • Extension and update based on usage (user queries)
  • Update when source versions evolve
    • Integrate additions, deletions, structural changes, …
    • Complex: keep changes that have been verified in the meantime
Conclusion
• Data sources evolve over time … and so do the links between them
• Reuse existing verified links to create new links for new versions
• Create new temporal links between objects and object groups
• Problems that need to be overcome: poor trust, missing context, no knowledge of existing links, …
  • Lineage, provenance, data profiling, accessibility …
✓ Improved link quality
✓ Less effort / more efficient
✓ Up-to-date links
✓ New temporal links
Future Research Directions
Evolution of Knowledge Graphs
• Evolution of integrated sources
• Evolution-aware ontology merge, knowledge graph update
• Scalable iterative integration
• Temporal patterns on graph data
• …
Semantic Interoperability
• Semantic annotation of heterogeneous, un-/semi-structured data
• Multilingual matching
• Semantic mappings (“beyond sameAs”)
• …
End-to-End Analytics Workflows
• Close to seamless data integration for complex analytics workflows
• Management and reproducibility of scientific workflows
• …
References
Reuse Annotation
• Christen, Lin, Groß, Domingos Cardoso, Pruski, Da Silveira, Rahm: A Learning-Based Approach to Combine Medical Annotation Results (Short Paper). 13th Intl. Conference on Data Integration in the Life Sciences (DILS), 2018.
• Christen, Groß, Rahm: A Reuse-based Annotation Approach for Medical Documents. 15th Intl. Semantic Web Conference (ISWC), 2016.
Reuse Entity Links
• Nentwig, Groß, Möller, Rahm: Distributed Holistic Clustering on Linked Data. Proc. OTM 2017 Conferences - Confederated International Conferences: CoopIS, C&TC, and ODBASE, 2017.
• Nentwig, Groß, Rahm: Holistic Entity Clustering for Linked Data. IEEE 16th International Conference on Data Mining Workshops (ICDMW), 2016.
Temporal Linking / Entity Evolution
• Christen, Groß, Fisher, Wang, Christen, Rahm: Temporal group linkage and evolution analysis for census data. 20th Intl. Conference on Extending Database Technology (EDBT), 2017.
Mapping / Link Composition
• Groß, Hartung, Kirsten, Rahm: Mapping Composition for Matching Large Life Science Ontologies. 2nd Intl. Conference on Biomedical Ontology (ICBO), 2011.
• Hartung, Groß, Rahm: Composition Methods for Link Discovery. Proc. of 15. GI-Fachtagung für Datenbanksysteme in Business, Technologie und Web (BTW), 2013.
Mapping / Link Evolution
• Groß, Pruski, Rahm: Evolution of Biomedical Ontologies and Mappings: Overview of Recent Approaches. Computational and Structural Biotechnology Journal 14, 2016.
• Groß, dos Reis, Hartung, Pruski, Rahm: Semi-automatic Adaptation of Mappings between Life Science Ontologies. 9th Intl. Conference on Data Integration in the Life Sciences (DILS), 2013.
• Groß, Hartung, Prüfer, Kelso, Rahm: Impact of ontology evolution on functional analyses. Bioinformatics 28(20), 2012.
• Groß, Hartung, Kirsten, Rahm: Estimating the Quality of Ontology-Based Annotations by Considering Evolutionary Changes. 6th Intl. Workshop on Data Integration in the Life Sciences (DILS), 2009.
Ontology Evolution
• Christen, Groß, Hartung: REX - A Tool for Discovering Evolution Trends in Ontology Regions. 10th Intl. Conference on Data Integration in the Life Sciences (DILS), 2014.
• Hartung, Groß, Rahm: COnto-Diff: Generation of Complex Evolution Mappings for Life Science Ontologies. Journal of Biomedical Informatics 46(1), 2013.
• Hartung, Groß, Rahm: CODEX: exploration of semantic changes between ontology versions. Bioinformatics 28(6), 2012.

More Related Content

What's hot

Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
semanticsconference
 
Let your data shine... with OpenRefine
Let your data shine... with OpenRefineLet your data shine... with OpenRefine
Let your data shine... with OpenRefine
Open Knowledge Belgium
 
SEMLIB Final Conference | DERI presentation
SEMLIB Final Conference | DERI presentationSEMLIB Final Conference | DERI presentation
SEMLIB Final Conference | DERI presentation
SemLib Project
 
Migration to Drupal
Migration to DrupalMigration to Drupal
Migration to Drupal
Will Hall
 
Vital AI: Big Data Modeling
Vital AI: Big Data ModelingVital AI: Big Data Modeling
Vital AI: Big Data Modeling
Vital.AI
 
Data Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data LakesData Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data Lakes
Pradeeban Kathiravelu, Ph.D.
 
Top 5 Considerations When Evaluating NoSQL
Top 5 Considerations When Evaluating NoSQLTop 5 Considerations When Evaluating NoSQL
Top 5 Considerations When Evaluating NoSQLMongoDB
 
Modeling employees relationships with Apache Spark
Modeling employees relationships with Apache SparkModeling employees relationships with Apache Spark
Modeling employees relationships with Apache Spark
Wassim TRIFI
 
Iterative data discovery and transformation with open refine
Iterative data discovery and transformation with open refineIterative data discovery and transformation with open refine
Iterative data discovery and transformation with open refineMartin Magdinier
 
Big Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and OpportunitiesBig Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and OpportunitiesSrinath Srinivasa
 
JPJ1423 Keyword Query Routing
JPJ1423   Keyword Query RoutingJPJ1423   Keyword Query Routing
JPJ1423 Keyword Query Routing
chennaijp
 
Democratizing Data within your organization - Data Discovery
Democratizing Data within your organization - Data DiscoveryDemocratizing Data within your organization - Data Discovery
Democratizing Data within your organization - Data Discovery
Mark Grover
 
Evolution of big data
Evolution of big dataEvolution of big data
Evolution of big data
ShilpaKrishna6
 
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
Piet J.H. Daas
 
The Big Metadata
The Big MetadataThe Big Metadata
The Big Metadata
Daniela Tomova
 
Towards Versioning of Arbitrary RDF Data
Towards Versioning of Arbitrary RDF DataTowards Versioning of Arbitrary RDF Data
Towards Versioning of Arbitrary RDF Data
Linked Enterprise Date Services
 
Documenting Data Transformations
Documenting Data TransformationsDocumenting Data Transformations
Documenting Data Transformations
ARDC
 
keyword query routing
keyword query routingkeyword query routing
keyword query routing
swathi78
 
Online retail a look at data consulting approach
Online retail   a look at data consulting approachOnline retail   a look at data consulting approach
Online retail a look at data consulting approach
Shesha R
 

What's hot (20)

Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
 
Let your data shine... with OpenRefine
Let your data shine... with OpenRefineLet your data shine... with OpenRefine
Let your data shine... with OpenRefine
 
SEMLIB Final Conference | DERI presentation
SEMLIB Final Conference | DERI presentationSEMLIB Final Conference | DERI presentation
SEMLIB Final Conference | DERI presentation
 
Migration to Drupal
Migration to DrupalMigration to Drupal
Migration to Drupal
 
Vital AI: Big Data Modeling
Vital AI: Big Data ModelingVital AI: Big Data Modeling
Vital AI: Big Data Modeling
 
Data Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data LakesData Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data Lakes
 
Top 5 Considerations When Evaluating NoSQL
Top 5 Considerations When Evaluating NoSQLTop 5 Considerations When Evaluating NoSQL
Top 5 Considerations When Evaluating NoSQL
 
Modeling employees relationships with Apache Spark
Modeling employees relationships with Apache SparkModeling employees relationships with Apache Spark
Modeling employees relationships with Apache Spark
 
Iterative data discovery and transformation with open refine
Iterative data discovery and transformation with open refineIterative data discovery and transformation with open refine
Iterative data discovery and transformation with open refine
 
Graph db
Graph dbGraph db
Graph db
 
Big Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and OpportunitiesBig Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and Opportunities
 
JPJ1423 Keyword Query Routing
JPJ1423   Keyword Query RoutingJPJ1423   Keyword Query Routing
JPJ1423 Keyword Query Routing
 
Democratizing Data within your organization - Data Discovery
Democratizing Data within your organization - Data DiscoveryDemocratizing Data within your organization - Data Discovery
Democratizing Data within your organization - Data Discovery
 
Evolution of big data
Evolution of big dataEvolution of big data
Evolution of big data
 
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
 
The Big Metadata
The Big MetadataThe Big Metadata
The Big Metadata
 
Towards Versioning of Arbitrary RDF Data
Towards Versioning of Arbitrary RDF DataTowards Versioning of Arbitrary RDF Data
Towards Versioning of Arbitrary RDF Data
 
Documenting Data Transformations
Documenting Data TransformationsDocumenting Data Transformations
Documenting Data Transformations
 
keyword query routing
keyword query routingkeyword query routing
keyword query routing
 
Online retail a look at data consulting approach
Online retail   a look at data consulting approachOnline retail   a look at data consulting approach
Online retail a look at data consulting approach
 

Similar to Link Reuse and Evolution for Data Integration (LSWT 2020)

Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
Ken Karapetyan
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
DIACHRON Project Overview
DIACHRON Project OverviewDIACHRON Project Overview
DIACHRON Project Overview
PRELIDA Project
 
What is a distributed data science pipeline. how with apache spark and friends.
What is a distributed data science pipeline. how with apache spark and friends.What is a distributed data science pipeline. how with apache spark and friends.
What is a distributed data science pipeline. how with apache spark and friends.
Andy Petrella
 
7th Content Providers Community Call
7th Content Providers Community Call7th Content Providers Community Call
7th Content Providers Community Call
OpenAIRE
 
Democratizing Data at Airbnb
Democratizing Data at AirbnbDemocratizing Data at Airbnb
Democratizing Data at Airbnb
Neo4j
 
The Genopolis Microarray database
The Genopolis Microarray databaseThe Genopolis Microarray database
The Genopolis Microarray database
Novartis Institutes for BioMedical Research
 
OpenML Tutorial ECMLPKDD 2015
OpenML Tutorial ECMLPKDD 2015OpenML Tutorial ECMLPKDD 2015
OpenML Tutorial ECMLPKDD 2015
Joaquin Vanschoren
 
2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge
Christopher Williams
 
Towards a rebirth of data science (by Data Fellas)
Towards a rebirth of data science (by Data Fellas)Towards a rebirth of data science (by Data Fellas)
Towards a rebirth of data science (by Data Fellas)
Andy Petrella
 
Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...
Lucy McKenna
 
DataGraft: Data-as-a-Service for Open Data
DataGraft: Data-as-a-Service for Open DataDataGraft: Data-as-a-Service for Open Data
DataGraft: Data-as-a-Service for Open Data
dapaasproject
 
Antelope: A Web service for publishing Life Cycle Assessment models and resul...
Antelope: A Web service for publishing Life Cycle Assessment models and resul...Antelope: A Web service for publishing Life Cycle Assessment models and resul...
Antelope: A Web service for publishing Life Cycle Assessment models and resul...
Brandon Kuczenski
 
Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13
DataDryad
 
How Graph Technology is Changing AI
How Graph Technology is Changing AIHow Graph Technology is Changing AI
How Graph Technology is Changing AI
Databricks
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
 
ER 2016 Tutorial
ER 2016 TutorialER 2016 Tutorial
ER 2016 Tutorial
Rim Moussa
 
DataHub
DataHubDataHub
iMicrobe_ASLO_2015
iMicrobe_ASLO_2015iMicrobe_ASLO_2015
iMicrobe_ASLO_2015
Bonnie Hurwitz
 
Linked Data Experiences at Springer Nature
Linked Data Experiences at Springer NatureLinked Data Experiences at Springer Nature
Linked Data Experiences at Springer Nature
Michele Pasin
 


Link Reuse and Evolution for Data Integration (LSWT 2020)

  • 1. It's all about the data: Link Reuse and Evolution for Data Integration. Anika Groß, 8. Leipziger Semantic Web Tag, 17.06.2020
  • 5. Data Science Workflow. Steps: data acquisition; data extraction & cleaning; data integration & enrichment; analytics (descriptive, predictive); visualization; interpretation. Data integration (variety): combine data from various sources to exploit the potential of data, gain added value and new insights, and improve interoperability. How? Based on links between objects, via pre-processing, matching / linking, merge / fusion, and post-processing. (Logos/pictures: pixabay.com, © Can Stock Photo / memoangeles)
  • 8. Matching / Linking. Aim: (semi-)automatically interconnect different data sources via explicit links. Schema level: schema and ontology matching; schema merging. Semantic annotation: linking instances with ontology concepts; entity linking. Instance level: entity resolution, link discovery; object fusion. Ontology example: German concepts (Krankheiten, Hämatologische Krankheit, Blutarmut, Leukopenie, …) linked to English concepts (Disease, Hematological Disease, Cytopenia, Anemia, Leukopenia, Thrombocytopenia, …). Text examples for annotation: "Severe anemia (hemoglobin < 8 g/dL), leukopenia (white blood cell count [WBC] < 2500 mm³), thrombocytopenia (platelet count < 80,000 mm³)" and "Patients with significantly impaired bone marrow function or significant anemia, leukopenia, or thrombocytopenia".
  • 10. Data is not static (evolution, dynamics). Pipeline: ≥ 2 input sources; integration & enrichment (linking, fusion, …); analysis (e.g. graph-based); result interpretation. Link types: intra-source links, inter-source links, and links between different versions (temporal links).
  • 11. Agenda: ✓ Introduction (✓ Data Science Workflow, ✓ Matching / Linking, ✓ Evolution) • Link Reuse • Link Evolution and Temporal Linking • Future Research Directions
  • 12. Again and again …: implementation of matching tools/algorithms; configuration of matching workflows; verification of links. Side notes: can be real/tiny/no improvement; many, many test runs; cooperativeness of domain experts.
  • 16. Link Reuse. Existing links between (meta)data sources: Linked Open Data Cloud; repositories/platforms such as BioPortal, local / own project, sameas.org, … Cases: ✓ complete solution or ⊂ partial solution: link reuse instead of full (manual or automatic) re-determination; ✗ no solution: manual or (semi-)automatic matching. Aims: improved match result quality; less effort; link update (evolution).
  • 20. Link Reuse - Methods. Composition: combine mappings via intermediate sources (diagram: sources S1 and S2, intermediate sources I1 and I2, direct vs. indirect links). Clustering: create groups of (connected) entities. Supervised learning: train an ML model. Evolution: connect and update over time.
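The composition and clustering ideas from the methods slide can be illustrated with a small Python sketch. It is only a minimal reading of the slide, not the approach from the cited papers: mappings are assumed to be plain sets of (source, target) pairs, clustering is reduced to connected components over links, and all identifiers (`s1:anemia`, `i1:D001`, …) are hypothetical.

```python
from collections import defaultdict


def compose(m_s1_i, m_i_s2):
    """Derive indirect S1-S2 links from S1-I and I-S2 links.

    Both mappings are sets of (source, target) pairs (an assumption
    made for this sketch).
    """
    targets_by_intermediate = defaultdict(set)
    for i, t in m_i_s2:
        targets_by_intermediate[i].add(t)
    return {
        (s, t)
        for s, i in m_s1_i
        for t in targets_by_intermediate.get(i, ())
    }


def cluster(links):
    """Group entities that are connected by links (connected components)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical links via an intermediate source I1.
m_s1_i1 = {("s1:anemia", "i1:D001"), ("s1:leukopenia", "i1:D002")}
m_i1_s2 = {
    ("i1:D001", "s2:Blutarmut"),
    ("i1:D002", "s2:Leukopenie"),
    ("i1:D003", "s2:Zytopenie"),
}
indirect = compose(m_s1_i1, m_i1_s2)
clusters = cluster(m_s1_i1 | m_i1_s2)
```

Here `indirect` contains only the two S1-S2 pairs that share an intermediate concept, and `clusters` groups each concept with its counterparts across all three sources. A real workflow would typically still verify or score such derived links.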
  • 21. Link Reuse – in my research. Composition: indirect ontology matching (schema level). Clustering: holistic entity clustering for linked data (instance level); semantic annotation of medical documents. Supervised learning: combination of results from different semantic annotation tools. Evolution / temporal linking: ontology mapping evolution and update (schema level); temporal group linkage for census data (instance level).
  • 28. Link Evolution and Temporal Linking. Find links between different source versions or temporal datasets: diff(S1, S1') and mapping M(S1, S1') between versions S1, S1', S1'', … Update the set of outdated links between older versions: from M(S1, S2), together with diff(S1, S1'), diff(S2, S2'), M(S1, S1') and M(S2, S2'), obtain M(S1', S2'). Reuse existing intra- or inter-source links.
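One way to read the update diagram is as a composition of the old mapping with the two version mappings. The Python sketch below follows that reading under simplifying assumptions that are not from the slides: each version mapping is a dict from an old concept to its single successor, a missing key means the concept was deleted, and the function name `adapt_mapping` and all identifiers are hypothetical.

```python
def adapt_mapping(m_s1_s2, m_s1_v, m_s2_v):
    """Carry links of M(S1, S2) over to the new versions S1' and S2'.

    m_s1_s2: set of (concept in S1, concept in S2) pairs.
    m_s1_v, m_s2_v: dicts old concept -> new concept; a missing key is
    treated as a deleted concept (assumption of this sketch).
    Returns the adapted links and the old links that need review.
    """
    adapted, to_review = set(), set()
    for a, b in m_s1_s2:
        a_new, b_new = m_s1_v.get(a), m_s2_v.get(b)
        if a_new is not None and b_new is not None:
            adapted.add((a_new, b_new))
        else:
            # At least one endpoint vanished: do not guess, flag it.
            to_review.add((a, b))
    return adapted, to_review


# Hypothetical example: one concept is renamed, one is deleted.
m_old = {("s1:A", "s2:X"), ("s1:B", "s2:Y")}
m_s1_versions = {"s1:A": "s1:A", "s1:B": "s1:B2"}
m_s2_versions = {"s2:X": "s2:X"}  # s2:Y was deleted in S2'
adapted, to_review = adapt_mapping(m_old, m_s1_versions, m_s2_versions)
```

Only the flagged links and concepts newly added in S1' or S2' would then need fresh matching, which is where the reduced effort of link reuse comes from.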
  • 34. Temporal Group Linkage for Census Data (temporal linking, instance level, reuse). Data: 6 censuses (1851-1901) in Rawtenstall, Lancashire, U.K.; household graphs (known family connections) but unknown temporal links. [Figure: example household graphs for 1871 and 1881 with members of the Ashworth and Smith families and John Riley; relationship labels: head, wife, son, daughter, father in law.] Problems: attribute values change over time (surname, occupation); difficult disambiguation (same first name and surname); poor data quality (misspellings etc.); … Temporal entity and group linkage: method → paper; ≈ 96% F-measure for record and group mapping (2-9% improvement over compared approaches). Christen, Groß, Fisher et al.: Temporal group linkage and evolution analysis for census data. Intl. Conf. on Extending Database Technology (EDBT), 2017.
  • 36. Evolution Patterns and Evolution Graph. Evolution patterns on individual level (preserve, add, remove) and group level (split, merge, move, …); evolution graph over longer time periods. Christen, Groß, Fisher et al.: Temporal group linkage and evolution analysis for census data. Intl. Conf. on Extending Database Technology (EDBT), 2017.
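Once temporal record links between two censuses exist, the individual-level patterns can be read off directly, and simple group-level patterns can be derived from household membership. The Python sketch below uses simplified definitions assumed here, not the ones from the paper: a household "splits" if its linked members end up in more than one successor household, and a household is a "merge" if its linked members come from more than one predecessor. All record and household ids are hypothetical.

```python
from collections import defaultdict


def individual_patterns(old_ids, new_ids, links):
    """Classify records as preserved, removed or added between two snapshots."""
    linked_old = {o for o, _ in links}
    linked_new = {n for _, n in links}
    return {
        "preserve": set(links),
        "remove": set(old_ids) - linked_old,
        "add": set(new_ids) - linked_new,
    }


def group_patterns(old_household, new_household, links):
    """Detect split and merged households (simplified, assumed definitions)."""
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for o, n in links:
        successors[old_household[o]].add(new_household[n])
        predecessors[new_household[n]].add(old_household[o])
    splits = {h for h, succ in successors.items() if len(succ) > 1}
    merges = {h for h, pred in predecessors.items() if len(pred) > 1}
    return splits, merges


# Hypothetical snapshots: record id -> household id.
old_household = {"p1": "h1", "p2": "h1", "p3": "h1", "p4": "h2"}
new_household = {"q1": "g1", "q2": "g2", "q3": "g2", "q5": "g2"}
links = {("p1", "q1"), ("p2", "q2"), ("p4", "q3")}

patterns = individual_patterns(old_household, new_household, links)
splits, merges = group_patterns(old_household, new_household, links)
```

Chaining such pairwise results over consecutive snapshots gives one possible basis for an evolution graph over longer time periods.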
  • 37. "The Reuse Application": Knowledge Graphs (several shown as examples, & many more). Continuous reuse and integration of instances, ontology concepts and links from various sources; methods: matching/link discovery, NLP, entity linking, clustering, fusion/merging, … plus expert knowledge / verification. Evolution and update: can be a highly dynamic graph; direct change in the knowledge graph; extension and update based on usage (user queries); update when source versions evolve (integrate additions, deletions, structural changes, …); complex: keep changes that were verified in the meantime.
  • 38. Conclusion. Data sources evolve over time … and so do the links between them. Reuse existing verified links to create new links for new versions. Create new temporal links between objects and object groups. Benefits: ✓ improved link quality; ✓ less effort / more efficient; ✓ up-to-date links; ✓ new temporal links. Problems that need to be overcome: poor trust, missing context, no knowledge of existing links, … Related aspects: lineage, provenance, data profiling, accessibility, …
  • 40. Future Research Directions. Evolution of knowledge graphs: evolution of integrated sources; evolution-aware ontology merge, knowledge graph update; scalable iterative integration; temporal patterns on graph data; … Semantic interoperability: semantic annotation of heterogeneous, un-/semi-structured data; multilingual matching; semantic mappings ("beyond sameAs"); … End-to-end analytics workflows: close-to-seamless data integration for complex analytics workflows; management and reproducibility of scientific workflows; …
  • 41. References
      Reuse Annotation
      • Christen, Lin, Groß, Domingos Cardoso, Pruski, Da Silveira, Rahm: A Learning-Based Approach to Combine Medical Annotation Results (Short Paper). 13th Intl. Conference on Data Integration in the Life Sciences (DILS), 2018.
      • Christen, Groß, Rahm: A Reuse-based Annotation Approach for Medical Documents. The Semantic Web - ISWC 2016: 15th Intl. Semantic Web Conference, 2016.
      Reuse Entity Links
      • Nentwig, Groß, Möller, Rahm: Distributed Holistic Clustering on Linked Data. Proc. OTM 2017 Conferences - Confederated International Conferences: CoopIS, C&TC, and ODBASE, 2017.
      • Nentwig, Groß, Rahm: Holistic Entity Clustering for Linked Data. IEEE 16th International Conference on Data Mining Workshops (ICDMW), 2016.
      Temporal Linking / Entity Evolution
      • Christen, Groß, Fisher, Wang, Christen, Rahm: Temporal group linkage and evolution analysis for census data. 19th Intl. Conference on Extending Database Technology (EDBT), 2017.
      Mapping / Link Composition
      • Groß, Hartung, Kirsten, Rahm: Mapping Composition for Matching Large Life Science Ontologies. 2nd Intl. Conference on Biomedical Ontology (ICBO), 2011.
      • Hartung, Groß, Rahm: Composition Methods for Link Discovery. Proc. of 15. GI-Fachtagung für Datenbanksysteme in Business, Technologie und Web (BTW), 2013.
      Mapping / Link Evolution
      • Groß, Pruski, Rahm: Evolution of Biomedical Ontologies and Mappings: Overview of Recent Approaches. Computational and Structural Biotechnology Journal 14, 2016.
      • Groß, dos Reis, Hartung, Pruski, Rahm: Semi-automatic Adaptation of Mappings between Life Science Ontologies. 9th Intl. Conference on Data Integration in the Life Sciences (DILS), 2013.
      • Groß, Hartung, Prüfer, Kelso, Rahm: Impact of ontology evolution on functional analyses. Bioinformatics 28(20), 2012.
      • Groß, Hartung, Kirsten, Rahm: Estimating the Quality of Ontology-Based Annotations by Considering Evolutionary Changes. 6th Intl. Workshop on Data Integration in the Life Sciences, 2009.
      Ontology Evolution
      • Christen, Groß, Hartung: REX - A Tool for Discovering Evolution Trends in Ontology Regions. 10th Intl. Conference on Data Integration in the Life Sciences (DILS), 2014.
      • Hartung, Groß, Rahm: COnto-Diff: Generation of Complex Evolution Mappings for Life Science Ontologies. Journal of Biomedical Informatics 46(1), 2013.
      • Hartung, Groß, Rahm: CODEX: exploration of semantic changes between ontology versions. Bioinformatics 28(6), 2012.