Enriching Linked Open Data
with distributional semantics
to study concept drift
Astrid van Aggelen, Laura Hollink, Jacco van Ossenbruggen

Information Access Group
What is concept drift?
The phenomenon where the characteristics of a concept
change over time, signifying a shift in meaning
• Intension: definitions, properties, necessary and sufficient conditions
• e.g. science, gender nonconformity
• Extension: the instances of a class
• e.g. new Nobel prize winners, EU member states
• Labels: words used to refer to a concept
• e.g. “migrant”, “refugee”
Betti, A, van den Berg, H. Modelling the history of ideas. British Journal for the History of Philosophy, 22(4):812-835, 2014.
Wang, S, Schlobach, S, Klein, M. Concept drift and how to identify it. Journal of Web Semantics 9.3:247-265, 2011.
Kenter, T, Wevers, M, Huijnen, P, de Rijke, M. Ad Hoc Monitoring of Vocabulary Shifts over Time. In Proceedings of CIKM, October 2015.
Linked Open Data
Classes, instances, their properties and labels are
explicitly encoded in formal languages.
[Figure: classes, their instances (i) and their labels]
Concept drift problems in LOD applications
Semantic annotation under concept drift
Ontology matching under concept drift
Interpreting user input under concept drift
Semantic annotation under concept drift
Example adapted from:
Cédric Pruski, keynote presentation at Drift-a-LOD’17, First workshop
on Detection, Representation and Management of Concept Drift in
Linked Open Data, at EKAW, Bologna, Italy, 20 November 2016.
ICD9 2008: Tension syndromes > Premenstrual tension syndromes (synonyms: "menstrual migraine")
Annotated article: De Lignieres, B., et al. "Prevention of menstrual migraine by percutaneous
oestradiol." British medical journal (Clinical research ed.) 293.6561 (1986): 1540.
ICD9 2009: Tension syndromes > Premenstrual tension syndromes; Migraine > Menstrual migraine (the earlier annotation is marked with an x)
Interpreting user input under concept drift
http://www.delpher.nl provides access to the digitised collections from
the National Library of the Netherlands.
S: (n) Holocaust, final solution (the mass
murder of Jews under the German Nazi
regime from 1941 until 1945)
Semantic annotation / named entity detection
x
Ontology matching under concept drift
Example adapted from:
Julio Cesar dos Reis, Cédric Pruski, Marcos Da Silveira, Chantal
Reynaud-Delaître, Understanding semantic mapping evolution by
observing changes in biomedical ontologies, Journal of
Biomedical Informatics, Volume 47, February 2014, Pages 71-82
[Figure: Ontology A is matched to Ontology B. A new version A' raises the question (?) whether the match with B still holds; new versions of both, A' and B', raise the same question again (??).]
Studying concept drift in Linked Open Data
• Prediction: Which concept will be deleted / merged / split / edited?
• Versioning: Keeping links & annotations up to date when entities change
• “RDF diff”: Which syntactic change is also a semantic change?
Recent work: tracking changes on LOD scale
Table from: Käfer, Tobias, et al. "Observing linked data dynamics."
Extended Semantic Web Conference. Springer Berlin Heidelberg, 2013.
Apart from these practical issues, it is also just interesting to see how knowledge evolves!
Changes in explicit knowledge are
explicit too.
We can now measure where and when
intensional, extensional and label
changes took place.
But only to the extent that the facts are
explicitly modelled.
• The association between science and
religion is not explicit.
• The prevalent meaning of polysemous
words is not explicit.
Distributional semantics works well for detecting
changes in word meaning
Evaluated e.g. in Frermann &
Lapata. A Bayesian Model of
Diachronic Meaning Change.
examples by Aurelie Herbelot,
http://aurelieherbelot.net/research/distributional-semantics-intro/
matrices from https://cs224d.stanford.edu/lecture_notes/notes1.pdf
Image from: Lea Frermann. “Modelling fine-grained Change in Word Meaning over centuries from Large Collections
of Unstructured Text." Keynote presentation at Drift-a-LOD’17, First workshop on Detection, Representation and
Management of Concept Drift in Linked Open Data, at EKAW, Bologna, Italy, 20 November 2016.
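The idea behind distributional change detection can be illustrated with a minimal sketch. It assumes raw co-occurrence counts and cosine distance as the change measure; the slides do not specify the exact measure, and the embeddings used in this work are derived from Google Books rather than from toy sentences like the ones below. All function names and example sentences are hypothetical.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vector(sentences, target, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # no context at all: treat as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Made-up toy corpora standing in for two time slices of a large corpus.
corpus_old = ["the broadcast of seed over the field",
              "farmers broadcast seed by hand"]
corpus_new = ["the radio broadcast reached millions",
              "a live television broadcast of the match"]

v_old = cooccurrence_vector(corpus_old, "broadcast")
v_new = cooccurrence_vector(corpus_new, "broadcast")
change = cosine_distance(v_old, v_new)   # larger value = more change in context
no_change = cosine_distance(v_old, v_old)
```

A word whose contexts differ between two periods gets a distance well above zero, while comparing a period with itself gives (numerically) zero.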
Information on the level of individual words
Open questions:
Have synonyms changed too? And hyponyms?
Have all the words for political systems changed?
Which group of words has changed most?
Enriching Linked Open Data with distributional
semantics
[Figure: two data sources joined by “+”; one logo reads GTAA]
* A method to link the two data sources
* A data model to represent the combination
* An RDF dataset that can be queried:
https://github.com/aan680/SemanticChange_data
✤ Code
✤ Embeddings derived from Google Books
✤ Change scores for the top 10,000 words between each decade over 200 years
WordNet Data Model
example of data from WordNet RDF
[Figure: RDF graph around the synset “democracy”]
• Synset (democracy): gloss "a political system in which the supreme power lies in a body of citizens who can elect people to represent them"; domain noun.group; part of speech noun
• LexicalEntry with Form "democracy"@en
• Synset (mobocracy): gloss "a political system in which a mob is the source of control; government by the masses"
• Other synsets shown: political system, parliamentary democracy, political party
• Relations shown between synsets: hypernym (three times), meronym
Data model for change scores
{lexical entry, decade 1, decade 2,
change score}
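As a rough illustration, the four-part record above could be held in memory as follows. This is only a sketch: the published dataset is RDF, and the class name, field names and identifier format (such as "democracy-n") are hypothetical, as are the scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeScore:
    """One record: {lexical entry, decade 1, decade 2, change score}."""
    lexical_entry: str  # hypothetical identifier, e.g. "democracy-n"
    decade_1: int       # first year of the earlier decade, e.g. 1900
    decade_2: int       # first year of the later decade, e.g. 1910
    score: float        # change score between the two decades

    def __post_init__(self):
        if self.decade_2 <= self.decade_1:
            raise ValueError("decade_2 must be later than decade_1")


def scores_for(entry, rows):
    """Return all records for one lexical entry, ordered by time."""
    return sorted((r for r in rows if r.lexical_entry == entry),
                  key=lambda r: r.decade_1)


# Made-up example records.
rows = [
    ChangeScore("democracy-n", 1910, 1920, 0.25),
    ChangeScore("democracy-n", 1900, 1910, 0.5),
    ChangeScore("call-v", 1900, 1910, 0.125),
]
timeline = scores_for("democracy-n", rows)
```

Keeping both decades in the record means a score is always tied to an explicit pair of periods rather than to a single point in time.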
8,878 matches (out of 10,000) mapped on 12,469 lexical entries
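One plausible reading of these numbers is that a word is linked to every lexical entry sharing its written form, so a single word (for example one that is both a noun and a verb) can map to several entries, and some words find no entry at all. The sketch below assumes exactly that matching rule; the slides do not spell out the linking method, and the identifiers and data are hypothetical.

```python
from collections import defaultdict

# Hypothetical (entry id, written form) pairs standing in for WordNet lexical entries.
lexical_entries = [
    ("call-n", "call"),
    ("call-v", "call"),
    ("democracy-n", "democracy"),
]


def build_index(entries):
    """Index lexical entry ids by their lower-cased written form."""
    index = defaultdict(list)
    for entry_id, form in entries:
        index[form.lower()].append(entry_id)
    return index


def link(words, index):
    """Map each word that has at least one matching entry to all its entries."""
    return {w: index[w.lower()] for w in words if w.lower() in index}


index = build_index(lexical_entries)
linked = link(["call", "democracy", "zzyzx"], index)

matched_words = len(linked)                                # words with a match
matched_entries = sum(len(ids) for ids in linked.values())  # entries reached
```

In this toy run two of three words match, and they reach three lexical entries, which mirrors how the number of entries can exceed the number of matched words.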
Example query
WordNet synsets are classified into 46 ‘domains’.
Which domain has changed most in the past two centuries?
Follow-up query
Top 10 changing words within the “process” domain
Follow-up query
Which subconcept of “Psychological state” has changed most?
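The first two kinds of query above (change per domain, and the top changing words within one domain) amount to a group-by and a sort. The sketch below shows that logic in plain Python rather than as a query against the RDF dataset; it assumes the mean as the aggregate, which the slides do not state, and all rows and scores are made up. The subconcept query would additionally need the hyponym hierarchy and is not covered here.

```python
from collections import defaultdict

# Made-up (word, WordNet domain, change score) rows.
rows = [
    ("growth", "noun.process", 0.5),
    ("decay", "noun.process", 0.25),
    ("party", "noun.group", 0.125),
    ("anxiety", "noun.state", 0.75),
]


def mean_change_by_domain(rows):
    """Average change score per domain, highest first."""
    by_domain = defaultdict(list)
    for _word, domain, score in rows:
        by_domain[domain].append(score)
    means = {d: sum(s) / len(s) for d, s in by_domain.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


def top_changing(rows, domain, n=10):
    """The n words with the highest change score within one domain."""
    in_domain = [(w, s) for w, d, s in rows if d == domain]
    return sorted(in_domain, key=lambda item: item[1], reverse=True)[:n]


ranking = mean_change_by_domain(rows)
top_process = top_changing(rows, "noun.process", n=10)
```

Because the scores are attached to lexical entries that already carry a domain, no extra annotation is needed to aggregate at this level.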
Example query
Relation between polysemy (nr. of senses of a word in
WordNet) and change score?
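A question like this can be approached with a correlation coefficient over (number of senses, change score) pairs. The sketch below uses Pearson's r as an assumption; the slides do not say which statistic was used, and the data points are invented purely to exercise the function.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented data: number of WordNet senses per word and its change score.
senses = [1, 2, 3, 4]
change_scores = [0.1, 0.2, 0.3, 0.4]
r = pearson(senses, change_scores)
```

A rank-based measure such as Spearman's rho would be a reasonable alternative if the relation is monotonic but not linear.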
Example query
• Which linguistic
category has
changed most?
Late breaking results
• Can we use relations in LOD to study how a concept has changed, instead of only how much?
[Figures: examples for the words “gay” and “call”]
Conclusion
A first step to enrich LOD with information about lexical
change, obtained from large volumes of unstructured text.
Next steps: enrich LOD with info about how concepts are used:
• popularity?
• importance?
• controversy?
Published as:
A. van Aggelen, L. Hollink and J. van Ossenbruggen. Combining distributional semantics and structured data to study lexical change. In Proceedings of the first Drift-a-LOD workshop, co-located with EKAW, Bologna, Italy, 20 Nov. 2016.

Session 1 - Intro to Robotic Process Automation.pdfSession 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdf
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
Principle of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptxPrinciple of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptx
 
AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)AWS Certified Solutions Architect Associate (SAA-C03)
AWS Certified Solutions Architect Associate (SAA-C03)
 

Enriching Linked Open Data with distributional semantics to study concept drift

  • 1. Enriching Linked Open Data with distributional semantics to study concept drift Astrid van Aggelen, Laura Hollink, Jacco van Ossenbruggen Information Access Group
  • 2. What is concept drift? The phenomenon where the characteristics of a concept change over time, signifying a shift in meaning. Betti, A, van den Berg, H. Modelling the history of ideas. British Journal for the History of Philosophy, 22(4):812-835, 2014. Wang, S, Schlobach, S, Klein, M. Concept drift and how to identify it. Journal of Web Semantics 9.3:247-265, 2011. Kenter, T, Wevers, M, Huijnen, P, de Rijke, M. Ad Hoc Monitoring of Vocabulary Shifts over Time. In Proceedings of CIKM, October 2015.
  • 3. What is concept drift? The phenomenon where the characteristics of a concept change over time, signifying a shift in meaning. • Intension: definitions, properties, necessary and sufficient conditions • e.g. science, gender nonconformity
  • 4. What is concept drift? The phenomenon where the characteristics of a concept change over time, signifying a shift in meaning. • Intension: definitions, properties, necessary and sufficient conditions • e.g. science, gender nonconformity • Extension: the instances of a class • e.g. new Nobel prize winners, EU member states
  • 5. What is concept drift? The phenomenon where the characteristics of a concept change over time, signifying a shift in meaning. • Intension: definitions, properties, necessary and sufficient conditions • e.g. science, gender nonconformity • Extension: the instances of a class • e.g. new Nobel prize winners, EU member states • Labels: words used to refer to a concept • e.g. “migrant”, “refugee”
  • 6. Linked Open Data: classes, instances, their properties and labels are explicitly encoded in formal languages. [Diagram: classes, their instances (i) and labels]
  • 7. Concept drift problems in LOD applications Semantic annotation under concept drift Ontology matching under concept drift Interpreting user input under concept drift Premenstrual tension syndromes Tension syndromes Menstrual migraine Migraine x ICD9 2009 Premenstrual tension syndromes Tension syndromes synonyms "menstrual migraine" De Lignieres, B., et al. "Prevention of menstrual migraine by percutaneous oestradiol." British medical journal (Clinical research ed.) 293.6561 (1986): 1540. ICD9 2008 Ontology A Ontology A' Ontology B Ontology B' matched ? ?? new version new version
  • 8. Semantic annotation under concept drift Premenstrual tension syndromes Tension syndromes synonyms "menstrual migraine" De Lignieres, B., et al. "Prevention of menstrual migraine by percutaneous oestradiol." British medical journal (Clinical research ed.) 293.6561 (1986): 1540. ICD9 2008
  • 9. Semantic annotation under concept drift Example adapted from: Cédric Pruski, keynote presentation at Drift-a-LOD’17, First workshop on Detection, Representation and Management of Concept Drift in Linked Open Data, at EKAW, Bologna, Italy, 20 November 2016. Premenstrual tension syndromes Tension syndromes synonyms "menstrual migraine" De Lignieres, B., et al. "Prevention of menstrual migraine by percutaneous oestradiol." British medical journal (Clinical research ed.) 293.6561 (1986): 1540. ICD9 2008
  • 10. Semantic annotation under concept drift Example adapted from: Cédric Pruski, keynote presentation at Drift-a-LOD’17, First workshop on Detection, Representation and Management of Concept Drift in Linked Open Data, at EKAW, Bologna, Italy, 20 November 2016. Premenstrual tension syndromes Tension syndromes Menstrual migraine Migraine x ICD9 2009 Premenstrual tension syndromes Tension syndromes synonyms "menstrual migraine" De Lignieres, B., et al. "Prevention of menstrual migraine by percutaneous oestradiol." British medical journal (Clinical research ed.) 293.6561 (1986): 1540. ICD9 2008
  • 11. Interpreting user input under concept drift http://www.delpher.nl provides access to the digitised collections from the National Library of the Netherlands.
  • 12. Interpreting user input under concept drift http://www.delpher.nl provides access to the digitised collections from the National Library of the Netherlands. S: (n) Holocaust, final solution (the mass murder of Jews under the German Nazi regime from 1941 until 1945) Semantic annotation / named entity detection x
  • 13. Ontology matching under concept drift Example adapted from: Julio Cesar dos Reis, Cédric Pruski, Marcos Da Silveira, Chantal Reynaud-Delaître, Understanding semantic mapping evolution by observing changes in biomedical ontologies, Journal of Biomedical Informatics, Volume 47, February 2014, Pages 71-82 Ontology A Ontology B matched
  • 14. Ontology matching under concept drift Example adapted from: Julio Cesar dos Reis, Cédric Pruski, Marcos Da Silveira, Chantal Reynaud-Delaître, Understanding semantic mapping evolution by observing changes in biomedical ontologies, Journal of Biomedical Informatics, Volume 47, February 2014, Pages 71-82 Ontology A Ontology A' Ontology B matched ? new version Ontology A Ontology B matched
  • 15. Ontology matching under concept drift Example adapted from: Julio Cesar dos Reis, Cédric Pruski, Marcos Da Silveira, Chantal Reynaud-Delaître, Understanding semantic mapping evolution by observing changes in biomedical ontologies, Journal of Biomedical Informatics, Volume 47, February 2014, Pages 71-82 Ontology A Ontology A' Ontology B Ontology B' matched ? ?? new version new version Ontology A Ontology A' Ontology B matched ? new version Ontology A Ontology B matched
  • 16. Studying concept drift in Linked Open Data Which concept will be deleted / merged / split / edited? Prediction Versioning “RDF diff” Keeping links & annotations up to date when entities change Which syntactic change is also a semantic change?
  • 17. Studying concept drift in Linked Open Data Which concept will be deleted / merged / split / edited? Prediction Versioning “RDF diff” Keeping links & annotations up to date when entities change Which syntactic change is also a semantic change? Recent work: tracking changes on LOD scale
  • 18. Studying concept drift in Linked Open Data Which concept will be deleted / merged / split / edited? Prediction Versioning “RDF diff” Keeping links & annotations up to date when entities change Which syntactic change is also a semantic change? Recent work: tracking changes on LOD scale Table from: Käfer, Tobias, et al. "Observing linked data dynamics." Extended Semantic Web Conference. Springer Berlin Heidelberg, 2013.
  • 19. Studying concept drift in Linked Open Data Which concept will be deleted / merged / split / edited? Prediction Versioning “RDF diff” Keeping links & annotations up to date when entities change Which syntactic change is also a semantic change? Recent work: tracking changes on LOD scale Table from: Käfer, Tobias, et al. "Observing linked data dynamics." Extended Semantic Web Conference. Springer Berlin Heidelberg, 2013. Apart from these practical issues, it is also just interesting to see how knowledge evolves!
  • 20. Changes in explicit knowledge are explicit too. We can now measure where and when intensional, extensional and label changes took place.
  • 21. Changes in explicit knowledge are explicit too. We can now measure where and when intensional, extensional and label changes took place. But only to the extent that the facts are explicitly modelled. • The association between science and religion is not explicit. • The prevalent meaning of polysemous words is not explicit.
  • 24. Distributional semantics works well for detecting changes in word meaning. Evaluated e.g. in Frermann & Lapata, A Bayesian Model of Diachronic Meaning Change. Examples by Aurelie Herbelot, http://aurelieherbelot.net/research/distributional-semantics-intro/; matrices from https://cs224d.stanford.edu/lecture_notes/notes1.pdf
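
To make the idea on slide 24 concrete: in distributional semantics a word is represented by a vector derived from the contexts it occurs in, so a change in meaning shows up as a change in that vector. The sketch below builds toy co-occurrence counts from two tiny hand-written "corpora" and compares them with cosine distance. The example sentences, the window size and the use of cosine distance over raw counts are all assumptions made for illustration; this is not the Bayesian model of Frermann & Lapata.

```python
import math
from collections import Counter

def context_vector(sentences, target, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts

def cosine_distance(a, b):
    """1 - cosine similarity of two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm

# Hypothetical mini-corpora standing in for two time periods.
period_1 = ["the mouse ate the cheese", "a mouse ran from the cat"]
period_2 = ["click the mouse button", "the mouse and keyboard are wireless"]

v1 = context_vector(period_1, "mouse")
v2 = context_vector(period_2, "mouse")
print(cosine_distance(v1, v1))  # same contexts: distance is (numerically) zero
print(cosine_distance(v1, v2))  # different contexts: larger distance
```

Real systems would use dense embeddings trained on a large corpus rather than raw counts from a handful of sentences, but the comparison step has the same shape.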
  • 25. Image from: Lea Frermann. "Modelling fine-grained Change in Word Meaning over centuries from Large Collections of Unstructured Text." Keynote presentation at Drift-a-LOD’17, First workshop on Detection, Representation and Management of Concept Drift in Linked Open Data, at EKAW, Bologna, Italy, 20 November 2016.
  • 27. Information on the level of individual words Open questions: Have synonyms changed too? And hyponyms? Have all the words for political systems changed? Which group of words has changed most?
  • 28. Enriching Linked Open Data with distributional semantics +
  • 29. Enriching Linked Open Data with distributional semantics GTAA + * A method to link the two data sources * A data model to represent the combination * An RDF dataset that can be queried: https://github.com/aan680/SemanticChange_data
  • 30. Enriching Linked Open Data with distributional semantics GTAA + * A method to link the two data sources * A data model to represent the combination * An RDF dataset that can be queried: https://github.com/aan680/SemanticChange_data ✤ Code ✤ Embeddings derived from Google Books ✤ Change scores for the top 10,000 words, between each decade over 200 years.
  • 31. WordNet Data Model: example of data from WordNet RDF. [Diagram] Nodes: Synset (democracy), gloss "a political system in which the supreme power lies in a body of citizens who can elect people to represent them"; LexicalEntry; Form ("democracy"@en); Synset (political system); Synset (parliamentary democracy); Synset (mobocracy), gloss "a political system in which a mob is the source of control; government by the masses"; Synset (political party). Edge labels: gloss, domain (noun.group), part of speech (noun), meronym, hypernym.
  • 32. Data model for change scores {lexical entry, decade 1, decade 2, change score}
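
The four-field record on slide 32 can be pictured as a small typed structure. The sketch below is only an illustration: the class and function names, the identifier format, the restriction to consecutive decades and the use of cosine distance as the change score are assumptions, not the data model or scoring used in the published dataset.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class ChangeScore:
    lexical_entry: str  # identifier of a WordNet lexical entry (hypothetical format)
    decade_1: int       # start year of the earlier decade
    decade_2: int       # start year of the later decade
    score: float        # how far the word's vector moved between the two decades

def cosine_distance(u, v):
    """1 - cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm

def change_scores(lexical_entry, vectors_by_decade):
    """One record per pair of consecutive decades, in chronological order."""
    decades = sorted(vectors_by_decade)
    return [
        ChangeScore(lexical_entry, d1, d2,
                    cosine_distance(vectors_by_decade[d1], vectors_by_decade[d2]))
        for d1, d2 in zip(decades, decades[1:])
    ]

# Placeholder 2-d vectors; real embeddings have many more dimensions.
toy = {1900: [1.0, 0.0], 1910: [1.0, 0.0], 1920: [0.0, 1.0]}
for record in change_scores("example-n", toy):
    print(record)
```

Keeping both decades in the record, rather than a single timestamp, means the same structure would also cover scores between non-adjacent decades.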
  • 33. Data model for change scores: 8,878 matches (out of 10,000), mapped on 12,469 lexical entries
  • 34. Example query: WordNet synsets are classified into 46 ‘domains’. Which domain has changed most in the past two centuries?
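
The query on slide 34 is run against the RDF dataset and is not reproduced in this transcript. As a rough sketch of the aggregation it implies, the Python below groups change scores by domain and ranks the domains by their mean score. The rows are placeholder values chosen for illustration, not results from the dataset, and ranking by the mean is an assumption.

```python
from collections import defaultdict
from statistics import mean

# (lexical entry, domain, change score): placeholder values, NOT results from the dataset.
rows = [
    ("word_a", "noun.process", 0.40),
    ("word_b", "noun.process", 0.60),
    ("word_c", "noun.group", 0.20),
    ("word_d", "noun.group", 0.30),
]

def rank_domains(rows):
    """Return (domain, mean change score) pairs, highest mean first."""
    by_domain = defaultdict(list)
    for _entry, domain, score in rows:
        by_domain[domain].append(score)
    return sorted(((d, mean(s)) for d, s in by_domain.items()),
                  key=lambda pair: pair[1], reverse=True)

for domain, avg in rank_domains(rows):
    print(f"{domain}: {avg:.2f}")
```

The follow-up queries on the next slides (top changing words within one domain, most changed subconcept) would filter or regroup the same kind of rows.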
  • 35. Follow-up query Top 10 changing words within the “process” domain
  • 36. Follow-up query Which subconcept of “Psychological state” has changed most?
  • 37. Example query: Relation between polysemy (nr. of senses of a word in WordNet) and change score?
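
One way to quantify the relation asked about on slide 37 is a rank correlation between the number of senses and the change score. The transcript does not say which statistic was used, so Spearman's rho here is an assumption, and the input values are placeholders rather than results.

```python
import math

def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Placeholder values, NOT results: number of senses and change score per word.
senses = [1, 2, 2, 5, 8]
change = [0.10, 0.15, 0.30, 0.25, 0.50]
print(spearman(senses, change))
```

A rank-based measure is a natural choice when the two quantities are on very different scales, as sense counts and change scores are.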
  • 38. Example query • Which linguistic category has changed most?
  • 39. Late breaking results • Can we use relations in LOD to study how a concept has changed? Instead of only how much?
  • 40. Late breaking results • Can we use relations in LOD to study how a concept has changed? Instead of only how much? Gay
  • 42. Call
  • 43. Conclusion: A first step to enrich LOD with information about lexical change, obtained from large volumes of unstructured text. GTAA Next steps: enrich LOD with info about how concepts are used: • popularity? • importance? • controversy? Published as: A. van Aggelen, L. Hollink and J. van Ossenbruggen. Combining distributional semantics and structured data to study lexical change. In proceedings of the first Drift-a-LOD workshop, co-located with EKAW, Bologna, Italy, 20 Nov. 2016