Data and knowledge as commodities
Mathieu d’Aquin
@mdaquin
Nancy (2002-2006)
LORIA
UHP Nancy 1
Milton Keynes (2006-2017)
Knowledge Media Institute
The Open University
Galway (2017-2021)
Data Science Institute
NUI Galway
Analytical Statements
Synthetic Statements
All humans are mortals
Every day, Bob wakes up at 6am
Analytical Statements
Synthetic Statements
Intension
Extension
Analytical Statements
Synthetic Statements
Ontologies / Knowledge-based Systems
Machine Learning / Data Mining
Analytical Statements
Synthetic Statements
Knowledge-driven
Data-driven
Grounded by
Interpreted by
Analytical Statements
Synthetic Statements
Knowledge
Data
Grounded by
Interpreted by
Access
Access
Link / connect
Starting with access issues
The Semantic Web and Linked Data
d'Aquin, Mathieu, and Enrico Motta. "The epistemology of intelligent semantic web systems." Synthesis Lectures on the Semantic Web: Theory and Technology 6, no. 1 (2016): 1-88.
https://lod-cloud.net/
Gene Ontology
FMA Ontology
LODE
BIBO
Geo Ontology
DBPedia Ontology
Dublin Core
FOAF
DOAP
SIOC
Music Ontology
Media Ontology
rNews
Example: data.open.ac.uk
Daga, Enrico, Mathieu d’Aquin, Alessandro Adamou, and Stuart Brown. "The Open University Linked Data - data.open.ac.uk." Semantic Web 7, no. 2 (2016): 183-191.
owl:sameAs
mlo:offers
mlo:location
http://data.open.ac.uk/course/m366
http://sws.geonames.org/2963597/ (Ireland)
http://data.open.ac.uk/organization/the_open_university
http://education.data.gov.uk/id/school/133849
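The links in this example can be written down as plain subject-predicate-object triples. The sketch below is a minimal illustration in plain Python (no RDF library). Only the URIs and property names come from the slide: which resource each property connects is an assumed reading of the figure, and the `mlo:` namespace URI is an assumption as well.

```python
# The owl namespace is standard; the mlo namespace URI is an assumption.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "mlo": "http://purl.org/net/mlo/",  # assumed namespace for mlo:
}

OU = "http://data.open.ac.uk/organization/the_open_university"
COURSE = "http://data.open.ac.uk/course/m366"
IRELAND = "http://sws.geonames.org/2963597/"
SCHOOL = "http://education.data.gov.uk/id/school/133849"

# Assumed reading of the figure: which edge joins which pair of nodes is a guess.
TRIPLES = [
    (OU, "mlo:offers", COURSE),
    (COURSE, "mlo:location", IRELAND),
    (OU, "owl:sameAs", SCHOOL),
]


def expand(curie: str) -> str:
    """Expand a prefixed name such as owl:sameAs into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples(triples) -> str:
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{expand(p)}> <{o}> ." for s, p, o in triples)


def external_links(triples, local_host: str = "data.open.ac.uk"):
    """Return the triples whose object lives outside the local dataset."""
    return [t for t in triples if local_host not in t[2]]


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    print(len(external_links(TRIPLES)), "links point to other datasets")
```

The `external_links` helper is hypothetical; it only illustrates that the value of such a dataset comes partly from the statements that connect it to other datasets.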
Resource Discovery
Mobile and
Personal
Semantics
Research
Exploration
Social
d'Aquin et al. (2013) Assessing the Educational Linked Data Landscape, Web Science 2013
Learner
Platform
Analytics
VLE | Website | Library
Assessment | Enrollment
School/University
Prediction Drop out
BI
Planning
Recommendation
Sentiment Analysis
Collective Intelligence Behaviour Analysis
Collaboration
Community Support
d'Aquin, Mathieu, Dominik Kowald, Angela Fessl, Elisabeth Lex, and Stefan Thalmann. "AFEL - Analytics for everyday learning." In Companion Proceedings of the The Web Conference 2018, pp. 439-440. 2018.
A data infrastructure for the city of Milton Keynes, enabling the sharing and consumption of diverse city-scale data.
d'Aquin, Mathieu, John Davies, and Enrico Motta. "Smart cities' data: Challenges and opportunities for semantic technologies." IEEE Internet Computing 19, no. 6 (2015): 66-70.
Watson - A Semantic Web Search Engine
d'Aquin, Mathieu, and Enrico Motta. "Watson, more than a semantic web search engine." Semantic Web 2, no. 1 (2011): 55-63.
Analytical Statements
Synthetic Statements
Knowledge
Data
Grounded by
Interpreted by
Access
Access
Link / connect
What’s an explanation anyway?
Tiddi, Ilaria, Mathieu d'Aquin, and Enrico Motta. "An ontology design pattern to define explanations." In Proceedings of the 8th International Conference on Knowledge Capture, pp. 1-8. 2015.
Explaining patterns emerging from data
Tiddi, Ilaria, Mathieu d’Aquin, and Enrico Motta. "Data patterns explained with linked data." In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 271-275. Springer, Cham, 2015.
Explaining patterns emerging from data
[Figure: graph in which :India, :Ethiopia and :Somalia are linked to db:India, db:Ethiopia and db:Somalia; edge labels are sA, subj, related and gdp; category nodes are cat:LeastDevelopedCountries, cat:Africa and cat:SouthAsia; gdp values are 600/pp, 3,800/pp and 1,200/pp, each ≤ 4,000/pp]
Explaining patterns emerging from data
Countries where men are more educated than women
skos:exactMatch🡺dbp:hdiRank ≥ 126    87.8%    197"
skos:exactMatch🡺dc:subject db:Category:Least_Developed_Countries    74.7%    524"
skos:exactMatch🡺dbp:gdpPPPPerCapitaRank ≥ 89    68.3%    269"
Countries where women are more educated than men
skos:exactMatch🡺dbp:hdiRank ≤ 119    63.4%    198"
skos:exactMatch🡺dbp:gdpPPPPerCapitaRank ≤ 56    62.3%    236"
Countries where education is equal
skos:exactMatch🡺dbp:gdpPPPRank ≥ 64    62.0%    234"
skos:exactMatch🡺dbp:gdpPPPPerCapitaRank ≥ 29    61.0%    268"
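One way to read the percentages above is as a score of how well a candidate explanation separates the items in a cluster from everything else. The sketch below assumes an F-measure over set membership; the slide does not name the measure, and the item identifiers, candidate labels and sets are hypothetical toy data, not results from the paper.

```python
def f_measure(cluster: set, matching: set) -> float:
    """F1 between the items in a cluster and the items matching a candidate explanation."""
    true_positives = len(cluster & matching)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(matching)
    recall = true_positives / len(cluster)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: items in the cluster to be explained.
cluster = {"c1", "c2", "c3", "c4"}

# Hypothetical candidates: each maps a (path, value) label to the items it holds for.
candidates = {
    "dc:subject = cat:X": {"c1", "c2", "c3", "c9"},
    "rank >= 100": {"c1", "c5"},
}

# Rank candidate explanations from best to worst score.
ranked = sorted(
    candidates,
    key=lambda name: f_measure(cluster, candidates[name]),
    reverse=True,
)

if __name__ == "__main__":
    for name in ranked:
        print(f"{name}: {f_measure(cluster, candidates[name]):.1%}")
```

A candidate that covers most of the cluster and little outside it scores high, which is the intuition behind ranking explanations by a single percentage.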
Explaining patterns emerging from data
Tiddi, Ilaria, Mathieu d’Aquin, and Enrico Motta. "Learning to assess linked data relationships using genetic programming." In International Semantic Web Conference, pp. 581-597. Springer, Cham, 2016.
Towards model interpretability
Given his biography:
Tiziano Vecelli(+0.04)(1488/1490 – 27 August 1576), known in
English as Titian, was an Italian(-0.03) painter, the most
important member of the 16th-century Venetian school(+0.01).
He was born in Pieve di Cadore(-0.03), near Belluno(-0.01), in
Veneto(-0.01)(Republic of Venice(-0.01)). His painting methods
would profoundly influence future generations of Western
Art(+0.001).
Would Titian’s art be on display in some major
European art museum? The model says no.
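The annotated scores can be tallied to see which mentions push the prediction down. The sketch below is only an illustration: the entity spans are a reading of the annotations on the slide, and treating the scores as additive contributions is an assumption, not something the slide states.

```python
# (entity, score) pairs transcribed from the annotated biography.
# The exact entity spans are an assumption about what each score is attached to.
attributions = [
    ("Tiziano Vecelli", 0.04),
    ("Italian", -0.03),
    ("Venetian school", 0.01),
    ("Pieve di Cadore", -0.03),
    ("Belluno", -0.01),
    ("Veneto", -0.01),
    ("Republic of Venice", -0.01),
    ("Western Art", 0.001),
]

# Assumption: scores are additive, so their sum indicates the overall direction.
net = sum(score for _, score in attributions)

# Mentions with a negative score, most negative first (ties keep text order).
negative = sorted(
    (item for item in attributions if item[1] < 0),
    key=lambda item: item[1],
)

if __name__ == "__main__":
    print(f"net contribution: {net:+.3f}")
    for entity, score in negative:
        print(f"{score:+.3f}  {entity}")
```

Under that additive reading, the place-related mentions outweigh the positive ones, which is consistent with the negative answer reported on the slide.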
Nikolov, Andriy, and Mathieu d'Aquin. "Uncovering Semantic Bias in Neural Network Models Using a Knowledge Graph." In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 1175-1184. 2020.
Towards model interpretability
Examples of results
Painters from the Southern Netherlands who are members of the
Antwerp Guild are more likely to be classified positively by the model.
Also, painters whose biography mentions works that represent the
Virgin Mary are more likely to be classified positively by the model.
Painters from certain places in Italy are more likely to be classified negatively
by the model.
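Statements like these can be checked by comparing the model's positive rate among entities that satisfy a condition with the rate among those that do not. A minimal sketch, assuming each entity carries a set of knowledge-graph features and a binary model prediction; the feature names and records are hypothetical placeholders, not data or results from the paper.

```python
from typing import List, Set, Tuple

# One record per entity: (knowledge-graph features, model prediction).
Record = Tuple[Set[str], bool]


def positive_rate(records: List[Record]) -> float:
    """Share of records that the model classifies positively."""
    if not records:
        return 0.0
    return sum(1 for _, predicted in records if predicted) / len(records)


def rule_effect(records: List[Record], feature: str) -> Tuple[float, float]:
    """Positive rate with the feature and without it."""
    with_feature = [r for r in records if feature in r[0]]
    without_feature = [r for r in records if feature not in r[0]]
    return positive_rate(with_feature), positive_rate(without_feature)


# Hypothetical toy records.
records: List[Record] = [
    ({"memberOf:GuildA"}, True),
    ({"memberOf:GuildA"}, True),
    ({"memberOf:GuildA", "bornIn:PlaceB"}, False),
    ({"bornIn:PlaceB"}, False),
    ({"bornIn:PlaceB"}, True),
    (set(), False),
]

if __name__ == "__main__":
    for feature in ("memberOf:GuildA", "bornIn:PlaceB"):
        with_rate, without_rate = rule_effect(records, feature)
        print(f"{feature}: {with_rate:.2f} with, {without_rate:.2f} without")
```

A large gap between the two rates is what a statement such as "more likely to be classified positively" summarises; whether the gap reflects bias or a valid connection is for the domain expert to judge.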
What to do with it
As a way to debug the model: Rules that a domain expert would
consider invalid or counter-intuitive might reveal some bias in
the data or the use of irrelevant information coincidentally related to
the classification.
As a way to extract more from the model: Rules that appear
surprising or counter-intuitive to the domain expert might also
represent unknown valid connections in the data.
As a way to assess generalisability: Rules that refer to
characteristics that are unrelated to the domain could lead to the
misclassification of entities described similarly in another context.
For example, some footballers are classified by our model as being
likely to be exhibited in a European museum.
Where does that get us?
Analytical Statements
Synthetic Statements
Grounded by
Interpreted by
Intelligent Data Understanding
Using knowledge-driven approaches to understand
provenance and access to data.
Using hybrid (data- and knowledge-) driven approaches to
understand the content of data and what it means.
Using hybrid (data- and knowledge-) driven approaches to
understand the results of analysing data.
Thanks!
Enrico Motta, Stefan Dietze, Marta Sabou, Jean Lieber, Ilaria Tiddi, Alessandro Adamou, Amedeo Napoli,
Enrico Daga, Carlo Allocca, Eelco Herder, Andriy Nikolov, Fouad Zablith, Hendrik Drachsler, Fadi Badra,
Aldo Gangemi, Davide Taibi, Ning Li, Laurian Gridinoc, Shuangyan Liu, Keerthi Thomas, Silvio Peroni,
Sandrine Lafrogne, Dragan Gasevic, Laszlo Szathmary, Vanessa López, Emanuele Bastianelli, Ratnesh
Sahay, Sofia Angeletou, Sabrina Kirrane, Pinelopi Troullinou, Grigoris Antoniou, Marieke Guy, Jorge
Gracia, Thomas Meilender, Heiner Stuckenschmidt, Helen Barlow, Holger Lewen, Joachim Kimmerle,
Daniela Oliveira, Stefan Thalmann, Steffen Staab, Simon Brown, Peter Holtz, Nicolas Jay, Christopher
Brewster, Stefan Decker, Alokkumar Jha, Anne Schlicht, Dimitris Plexousakis, Jeff Heflin, Jeff Z. Pan,
Dominik Kowald, Óscar Corcho, Yasar Khan, Valentina Presutti, Wassim Derguech, José Manuel
Gómez-Pérez, Eduardo Mena, Stuart Brown, Syeda Sana e Zainab, Kavitha Srinivas, Sébastien Brachais,
Krishnaprasad Thirunarayan, Elisabeth Lex, Ran Yu, Paul Mulholland, Paul Groth, Christoph Lange, Michel
Dumontier, Angela Fessl, Giorgos Flouris, Markus Strohmaier, Martin Dzbor, Mari Carmen
Suárez-Figueroa