Measuring Metadata Quality
Péter Király
Comparative Studies (General and Comparative Literature and Cultural Studies)
Georg-August-Universität Göttingen
2019-06-24
slides: http://bit.ly/qa-defense
metadata
metadata → something else (here: cultural heritage objects)
★ describes
★ explains
★ locates
★ represents
the problem
https://twitter.com/fxru/status/1052838758066868224
dates (MoMA collection)
Harald Klinke (LMU München) https://twitter.com/HxxxKxxx/status/1066805548866289664
title – thumbnail
multilinguality
★ Mona Lisa → 456 results
★ La Gioconda → 365 results
★ La Joconde → 71 results
information
metadata data
consequence
no metadata
no access to data → no data usage
bad metadata
Anonyme: Le Liseur © 2008 Hulton-Deutsch Collection / Corbis
laws of library science
1. Books are for use.
2. Every person his or her book.
3. Every book its reader.
4. Save the time of the reader.
5. Library is a growing organism.
S. R. Ranganathan, 1931
objective
there are “good” and “bad” metadata records
functional requirements
(metrics)
good
acceptable
bad
metadata quality metrics in literature
Bruce and Hillmann (2004); Ochoa and Duval (2009); Palavitsinis (2014); Zaveri et al. (2015)
https://www.zotero.org/groups/488224/metadata_assessment
completeness
accuracy
consistency
...
correctness
objectiveness
appropriateness
complication
★ lack of details
★ no shared implementation
★ not flexible (collection specificity)
★ not scalable
questions
★ Q1: What are the relevant quality dimensions in two different cultural heritage data sources?
★ Q2: How could it be implemented in a flexible way?
★ Q3: How could it be implemented in a scalable way?
★ Q4: How could Big Data analysis be conducted with limited computational resources?
structure is algorithmically measurable, content is not
structure (algorithmically measurable):
★ number of fields
★ uniqueness of values
★ language annotation
content (not algorithmically measurable):
★ is it really the Mona Lisa?
★ is it about Lower Saxony?
★ is it in Polish?
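The three structural features named on this slide (number of fields, uniqueness of values, language annotation) can each be counted without understanding the content. The sketch below is only an illustration, not the framework's implementation: the record shape (a field name mapped to a list of text and language-tag pairs) and the function names `profile` and `uniqueness` are assumptions made for the example.

```python
def profile(record):
    """Structural counts for one record.

    `record` maps a field name to a list of (text, language_tag_or_None)
    pairs; this shape is an assumption made for the sketch.
    """
    literals = [pair for values in record.values() for pair in values]
    return {
        "fields": sum(1 for values in record.values() if values),   # non-empty fields
        "values": len(literals),                                    # all field values
        "language_tagged": sum(1 for _, lang in literals if lang),  # values with a tag
    }


def uniqueness(records, field):
    """Share of distinct texts among all values of `field` in the collection."""
    texts = [text for record in records for text, _ in record.get(field, [])]
    return len(set(texts)) / len(texts) if texts else 0.0


# Hypothetical sample records
records = [
    {"title": [("Mona Lisa", "en")],
     "creator": [("Leonardo da Vinci", None)],
     "description": []},
    {"title": [("Mona Lisa", "en"), ("La Joconde", "fr")]},
]
print(profile(records[0]))
print(uniqueness(records, "title"))
```

None of these counts can tell whether the title really belongs to the Mona Lisa, which is the point of the slide.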
hypothesis
by measuring structural elements we
can approximate metadata quality
≃ metadata smell
LAM institution
workflow
1. ingest
2. measure records
3. aggregate
4. report
5. evaluate with experts
improve records
quality assessment tool
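Steps 1 to 4 of the workflow can be sketched as a small pipeline with a pluggable measurement function. This is a minimal illustration under stated assumptions, not the tool's actual code: the input format (one JSON record per line), the function names, and the example metric `non_empty_share` are all hypothetical. Step 5, evaluation with experts, is a human activity and stays outside the code.

```python
import json
from statistics import mean


def ingest(lines):
    """Step 1: parse one JSON record per line (an assumed input format)."""
    return [json.loads(line) for line in lines if line.strip()]


def aggregate(scores):
    """Step 3: collapse record-level scores into collection-level statistics.

    Assumes at least one score; an empty collection is not handled here.
    """
    return {"records": len(scores), "mean": mean(scores),
            "min": min(scores), "max": max(scores)}


def report(stats):
    """Step 4: render the statistics in a form a data curator can read."""
    return ("{records} records; score mean {mean:.2f}, "
            "min {min:.2f}, max {max:.2f}").format(**stats)


def run(lines, measure):
    """Chain the steps; `measure` (step 2) returns one score per record."""
    records = ingest(lines)
    scores = [measure(record) for record in records]
    return report(aggregate(scores))


def non_empty_share(record):
    """Example metric: share of fields that hold a non-empty value."""
    return sum(1 for v in record.values() if v) / len(record) if record else 0.0


lines = ['{"title": "A", "creator": ""}', '{"title": "C", "creator": "D"}']
print(run(lines, non_empty_share))
```

Passing the metric in as a parameter keeps the pipeline independent of any one schema or measurement.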
Measuring Europeana
organisational approach
Europeana Data Quality Committee
★ analysing/revising metadata schema
★ functional requirement analysis
★ problem catalog
★ multilinguality
technical approach
“Metadata Quality Assessment Framework”
★ adaptable to different metadata schemas
★ scalable (to Big Data)
★ generates understandable reports for data curators
★ open source
what to measure?
★ structural and semantic features (generic metrics): completeness, cardinality, uniqueness, length, dictionary entry, data type conformance, multilinguality
★ functional requirement analysis
★ problem catalog
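Three of the generic metrics listed under "what to measure?" (completeness, cardinality, length) reduce to simple counts once a list of expected fields is fixed. The sketch below assumes a record is a mapping from field name to a list of string values; the function name `field_metrics` and the sample field list are hypothetical, not taken from the framework.

```python
def field_metrics(record, schema_fields):
    """Completeness, cardinality and length for one record.

    `schema_fields` is the list of fields the schema expects (an assumption
    of this sketch); `record` maps field names to lists of string values.
    """
    # cardinality: how many values each expected field carries
    cardinality = {f: len(record.get(f, [])) for f in schema_fields}
    # length: total number of characters per field
    length = {f: sum(len(v) for v in record.get(f, [])) for f in schema_fields}
    # completeness: share of expected fields that have at least one value
    present = sum(1 for n in cardinality.values() if n > 0)
    completeness = present / len(schema_fields) if schema_fields else 0.0
    return {"completeness": completeness,
            "cardinality": cardinality,
            "length": length}


schema = ["dc:title", "dc:subject", "dc:creator", "dc:description"]
record = {"dc:title": ["Mona Lisa"], "dc:subject": ["Ballet", "Opera"]}
metrics = field_metrics(record, schema)
print(metrics["completeness"], metrics["cardinality"]["dc:subject"])
```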
# dc:subject as plain literals, no language tags
<#record> a ore:Proxy ;
    dc:subject "Ballet", "Opera" .

# dc:subject as concept URIs (Europeana proxy)
<#record> a ore:Proxy ; edm:europeanaProxy true ;
    dc:subject <http://data.europeana.eu/concept/base/264>
             , <http://data.europeana.eu/concept/base/247> .

<http://data.europeana.eu/concept/base/264> a skos:Concept ;
    skos:prefLabel "Ballett"@no, "बैले"@hi, "Ballett"@de, "Балет"@be, "Балет"@ru
                 , "Balé"@pt, "Балет"@bg, "Baletas"@lt, "Balet"@hr, "Balets"@lv .

<http://data.europeana.eu/concept/base/247>
    skos:prefLabel "Opera"@no, "ओपेरा (गीतिनाटक)"@hi, "Oper"@de, "Ooppera"@fi
                 , "Опера"@be, "Опера"@ru, "Ópera"@pt, "Опера"@bg, "Opera"@lt .
multilinguality
                                                   Distinct languages   Tagged literals   Literals per language
dc:subject as plain literals                                        0                 0
dc:subject as concept URIs, after dereferencing                    11                19                     1.7
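The three multilinguality figures (distinct languages, tagged literals, literals per language) can be recomputed from the skos:prefLabel lists in the example. The sketch below copies the labels and language tags from the slide; the function name `multilinguality` and the pair-based data shape are assumptions for illustration, not the framework's API.

```python
from collections import Counter

# (text, language tag) pairs copied from the two skos:prefLabel lists
ballet = [("Ballett", "no"), ("बैले", "hi"), ("Ballett", "de"), ("Балет", "be"),
          ("Балет", "ru"), ("Balé", "pt"), ("Балет", "bg"), ("Baletas", "lt"),
          ("Balet", "hr"), ("Balets", "lv")]
opera = [("Opera", "no"), ("ओपेरा (गीतिनाटक)", "hi"), ("Oper", "de"),
         ("Ooppera", "fi"), ("Опера", "be"), ("Опера", "ru"), ("Ópera", "pt"),
         ("Опера", "bg"), ("Opera", "lt")]


def multilinguality(literals):
    """Count language-tagged literals and the languages they cover."""
    tags = [lang for _, lang in literals if lang]   # ignore untagged literals
    per_language = Counter(tags)
    distinct = len(per_language)
    return {
        "distinct_languages": distinct,
        "tagged_literals": len(tags),
        "literals_per_language": len(tags) / distinct if distinct else 0.0,
    }


plain = multilinguality([("Ballet", None), ("Opera", None)])
dereferenced = multilinguality(ballet + opera)
print(plain)
print(dereferenced)
```

With untagged literals all three values are zero; after following the concept URIs the same record reaches 11 languages and 19 tagged literals.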
(chart values: 19%, 58%, 63%, 13.3%, 23.7%)
Measuring library catalogues
Card catalogue at Gent University Library, photo: Pieter Morlion, 2010 CC-BY 4.0
https://commons.wikimedia.org/wiki/File:Boekentoren_2010PM_1179_21H9015.JPG
a (pretty printed) example
LDR 01136cnm a2200253ui 4500
001 002032820
005 20150224114135.0
008 031117s2003 gw 000 0 ger d
020 $a3805909810
100 1 $avon Staudinger, Julius,$d1836-1902$0(viaf)14846766
245 10$aJ. von Staudingers Kommentar zum ... /$cJ. von Staudinger.
250 $aNeubearb. 2003$bvon Jörn Eckert
260 $aBerlin :$bSellier-de Gruyter,$c2003.
300 $a534 p. ;.
500 $aCiteertitel: BGB.
500 $aBandtitel: Staudinger BGB.
700 1 $aEckert, Jörn
852 4 $xRE$bRE55$cRBIB$jRBIB.BUR 011 DE 021$p000000800147
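A line of this pretty-printed form can be split into tag, indicators and subfields with a few string operations. The sketch below is a simplified illustration and not the parser used by the tool: the function name `parse_line` is hypothetical, and it assumes that `$` never occurs inside a subfield value and that blank indicators may have been collapsed by the text layout.

```python
def parse_line(line):
    """Split one pretty-printed MARC line into its parts."""
    tag, rest = line[:3], line[3:]
    # The leader and control fields (001-009) carry a plain value, no subfields
    if tag == "LDR" or tag < "010":
        return {"tag": tag, "value": rest.strip()}
    head, *parts = rest.split("$")
    # Skip the separator space; pad indicators that were collapsed to blanks
    indicators = head[1:3].ljust(2)
    # Each part starts with a one-character subfield code
    subfields = [(part[0], part[1:]) for part in parts if part]
    return {"tag": tag, "ind1": indicators[0], "ind2": indicators[1],
            "subfields": subfields}


author = parse_line("100 1 $avon Staudinger, Julius,$d1836-1902$0(viaf)14846766")
control = parse_line("001 002032820")
print(author)
print(control)
```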
semantic elements
                     MARC 21   versions   total
control fields             7                  7
control subfields        211                211
data fields              215         68     283
indicators               175          8     183
subfields               2259        344    2603
total                                      3287
Java classes (qa-metadata-marc.jar): data model → export → Avram JSON: machine readable standard
proportion of records with issues
library   all (%)   core (%)
bay         100.0       18.8
bzb         100.0       76.1
cer           2.8        2.8
col          90.4       66.0
dnb          13.9        0.2
gen          40.8       27.3
har         100.0       97.3
loc          30.5       29.3
mic          80.8       67.5
nfi          62.1       58.1
ris          99.7       57.1
sfp          82.7       60.4
sta          92.7       92.5
szt          30.8       30.6
tib         100.0      100.0
tor         100.0       74.2
core = issues in the documented elements
slide callouts: strange; almost error-less; surprising
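The difference between the "all" and "core" columns can be illustrated with a small calculation. The sketch below rests on an assumption: each detected issue carries a flag saying whether it concerns a documented element, so "core" counts only records with at least one flagged issue. The function name and the sample data are hypothetical.

```python
def issue_proportions(records):
    """Percentage of records with any issue, and with a core issue.

    `records` is a list with one entry per record; each entry is a list of
    (issue_type, in_documented_element) pairs. This shape is assumed.
    """
    total = len(records)
    if total == 0:
        return 0.0, 0.0
    with_any = sum(1 for issues in records if issues)
    with_core = sum(1 for issues in records
                    if any(documented for _, documented in issues))
    return round(100 * with_any / total, 1), round(100 * with_core / total, 1)


# Hypothetical catalogue of four records
sample = [
    [],                                                      # clean record
    [("undefined field", False)],                            # locally defined field only
    [("invalid ISBN", True)],                                # issue in a documented element
    [("undefined subfield", False), ("invalid value", True)],
]
print(issue_proportions(sample))
```

A catalogue that uses many locally defined fields can therefore score high in "all" and much lower in "core".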
issue types
record level
★ ambiguous linkage
★ invalid linkage
★ type error
control field
★ invalid code
★ invalid value
data field
★ missing reference subfield (880$6)
★ non-repeatable field
★ undefined field
indicator
★ invalid value
★ non-empty value
★ obsolete value
subfield
★ classification
★ invalid ISBN
★ invalid ISSN
★ invalid length
★ invalid value
★ repetition
★ undefined subfield
★ non well-formatted value
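One of the subfield-level issue types, "invalid ISBN", can be detected with the standard ISBN-10 check-digit rule: the ten digits are weighted 10 down to 1 and the weighted sum must be divisible by 11, with `X` standing for 10 in the last position. The sketch below covers only ISBN-10, and the function name is hypothetical; it does not claim to reproduce the tool's own validator.

```python
DIGITS = "0123456789"


def is_valid_isbn10(text):
    """Check the ISBN-10 check digit (weights 10..1, sum divisible by 11)."""
    chars = text.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    if not all(c in DIGITS for c in chars[:9]):
        return False
    if chars[9] not in DIGITS + "X":
        return False
    # 'X' is only allowed as the check digit and stands for 10
    values = [int(c) for c in chars[:9]] + [10 if chars[9] == "X" else int(chars[9])]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


# 020 $a of the example record shown earlier in the slides
print(is_valid_isbn10("3805909810"))
```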
most frequent issues
completeness by field groups
conclusions
★ Q1 quality dimensions: completeness, multilinguality, issue detection
★ Q2 flexibility: schema abstraction, decoupling measurements
★ Q3 scalability: using frameworks such as Apache Spark
★ Q4 limited resources: measuring performance, optimizing parameters
publications
★ J. Stiller, P. Király (2017) Multilinguality of Metadata. Measuring the Multilingual Degree of Europeana’s Metadata. In Proc. of the 15th Intl. Symp. of Information Sci. 164–176.
★ P. Király (2017) Towards an extensible measurement of metadata quality. In Second International Conference on Digital Access to Textual Cultural Heritage. 111–115. 10.1145/3078081.3078109
★ P. Király (2017) Measuring completeness as metadata quality metric in Europeana. In Digital Humanities 2017 Conference Abstracts. 291–293.
★ V. Charles, J. Stiller, P. Király, W. Bailer, N. Freire (2017) Evaluating Data Quality in Europeana: Metrics for Multilinguality. In Joint Proceedings of TDDL 2017, MDQual 2017 and Futurity 2017.
★ P. Király (2018) Adat a könyvtárban. In Hagyomány és újítás a 21. századi könyvtárban. 49–74.
★ P. Király, M. Büchler (2018) Measuring completeness as metadata quality metric in Europeana. In 2018 IEEE International Conference on Big Data. 2711–2720. 10.1109/BigData.2018.8622487
★ P. Király, J. Stiller, V. Charles, W. Bailer, N. Freire (2019) Evaluating Data Quality in Europeana: Metrics for Multilinguality. In Metadata and Semantic Research 2018. 199–211. 10.1007/978-3-030-14401-2_19
conferences
#1 International Symposium on Information Science 2017, Berlin (with Juliane Stiller)
#2 #dariahTeach 2017, Lausanne (poster)
#3 SI & IT Workshop 2017, Göttingen (with Juliane Stiller)
#4 Linked Data Quality workshop (@ ESWC 2017), Portorož (invited keynote speech)
#5 DATeCH 2017, Göttingen
#6 ELAG 2017, Athens (with Valentine Charles)
#7 Digital Humanities 2017, Montréal
#8 Linked Data Quality Workshop (@ Semantics 2017), Amsterdam (organizer and presenter)
#9 (Meta)-Data Quality Workshop (@ TPDL 2017), Thessaloniki (presented by Juliane Stiller)
#10 ADOCHS meeting 2017, Brussels (invited speech)
#11 LDCX 2018, Stanford University
#12 ELAG 2018, Prague (workshop together with Anette Strauch, Patrick Hochstenbach, Mark Phillips)
#13 12th International Conference on Metadata and Semantics Research 2018, Limassol (with Juliane Stiller)
#14 Open Research Knowledge Graph workshop 2018, Hannover
#15 Research Infrastructure on Religious Studies Workshop on FAIR Research Data Management 2018, Mainz (invited speech)
#16 Computational Archival Science workshop (@ IEEE Big Data 2018), Seattle
#17 DATeCH 2019, Brussels

"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 

Measuring Metadata Quality (doctoral defense 2019)

  • 1. Measuring Metadata Quality Péter Király Comparative Studies (General and Comparative Literature and Cultural Studies) Georg-August-Universität Göttingen 2019-06-24 slides: http://bit.ly/qa-defense
  • 2. metadata: metadata; something else (here: cultural heritage objects) ★ describes ★ explains ★ locates ★ represents
  • 4. dates (MoMa collection) Harald Klinke (LMU München) https://twitter.com/HxxxKxxx/status/1066805548866289664
  • 6. multilinguality ★ Mona Lisa → 456 results ★ La Gioconda → 365 results ★ La Joconde → 71 results
  • 7. consequence: information; metadata, data; no metadata or bad metadata → no access to data → no data usage. Anonyme: Le Liseur © 2008 Hulton-Deutsch Collection / Corbis
  • 8. laws of library science: 1. Books are for use. 2. Every person his or her book. 3. Every book its reader. 4. Save the time of the reader. 5. Library is a growing organism. (S. R. Ranganathan, 1931) Anonyme: Le Liseur © 2008 Hulton-Deutsch Collection / Corbis
  • 9. objective: there are “good” and “bad” metadata records; functional requirements (metrics): good, acceptable, bad
  • 10. metadata quality metrics in literature: Bruce and Hillmann (2004); Ochoa and Duval (2009); Palavitsinis (2014); Zaveri et al. (2015) https://www.zotero.org/groups/488224/metadata_assessment completeness, accuracy, consistency, ..., correctness, objectiveness, appropriateness
  • 11. complication ★ lack of details ★ no shared implementation ★ not flexible (collection specificity) ★ not scalable
  • 12. questions ★ Q1: What are the relevant quality dimensions in two different cultural heritage data sources? ★ Q2: How could it be implemented in a flexible way? ★ Q3: How could it be implemented in a scalable way? ★ Q4: How could Big Data analysis be conducted with limited computational resources?
  • 13. structure is algorithmically measurable (★ number of fields ★ uniqueness of values ★ language annotation), content is not (★ is it really the Mona Lisa? ★ is it about Lower Saxony? ★ is it in Polish?)
  • 14. hypothesis: by measuring structural elements we can approximate metadata quality ≃ metadata smell
  • 15. LAM institution; quality assessment tool workflow: 1. ingest 2. measure records 3. aggregate 4. report 5. evaluate with experts; improve records
  • 17. organisational approach: Europeana Data Quality Committee ★ analysing/revising metadata schema ★ functional requirement analysis ★ problem catalog ★ multilinguality
  • 18. technical approach: “Metadata Quality Assessment Framework” ★ adaptable to different metadata schemas ★ scalable (to Big Data) ★ generates understandable reports for data curators ★ open source
  • 19. what to measure? ★ structural and semantic features: completeness, cardinality, uniqueness, length, dictionary entry, data type conformance, multilinguality (generic metrics) ★ functional requirement analysis ★ problem catalog
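Three of the generic metrics named on slide 19 (completeness, cardinality, uniqueness) can be illustrated with a minimal Python sketch. This is not the framework's own code: the record layout (a dict mapping field names to lists of values), the field names, and the simplified uniqueness ratio are all assumptions made for illustration.

```python
def completeness(record, schema_fields):
    """Share of schema fields that carry at least one non-empty value."""
    present = [
        field for field in schema_fields
        if any(value.strip() for value in record.get(field, []))
    ]
    return len(present) / len(schema_fields)


def cardinality(record):
    """Number of values recorded for each field of a record."""
    return {field: len(values) for field, values in record.items()}


def uniqueness(records, field):
    """Distinct values divided by all values of one field across records
    (a simplified stand-in, not the framework's uniqueness score)."""
    values = [value for record in records for value in record.get(field, [])]
    return len(set(values)) / len(values) if values else 0.0


# Hypothetical records and schema, for illustration only.
SCHEMA = ["dc:title", "dc:subject", "dc:description", "dc:creator"]
r1 = {"dc:title": ["Mona Lisa"], "dc:subject": ["Ballet", "Opera"], "dc:description": [""]}
r2 = {"dc:title": ["Mona Lisa"]}
r3 = {"dc:title": ["La Joconde"]}

print(completeness(r1, SCHEMA))
print(cardinality(r1))
print(uniqueness([r1, r2, r3], "dc:title"))
```

An empty string counts as missing here, so a field that exists but holds no text does not raise the completeness score.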
  • 21. multilinguality. Provider proxy: <#record> a ore:Proxy ; dc:subject "Ballet", "Opera" . Europeana proxy: <#record> a ore:Proxy ; edm:europeanaProxy true ; dc:subject <http://data.europeana.eu/concept/base/264> , <http://data.europeana.eu/concept/base/247> . After dereferencing: <http://data.europeana.eu/concept/base/264> a skos:Concept ; skos:prefLabel "Ballett"@no, "बैले"@hi, "Ballett"@de, "Балет"@be, "Балет"@ru , "Balé"@pt, "Балет"@bg, "Baletas"@lt, "Balet"@hr, "Balets"@lv . <http://data.europeana.eu/concept/base/247> skos:prefLabel "Opera"@no, "ओपेरा (गीतिनाटक)"@hi, "Oper"@de, "Ooppera"@fi , "Опера"@be, "Опера"@ru, "Ópera"@pt, "Опера"@bg, "Opera"@lt . Metrics (provider proxy / Europeana proxy): tagged literals 0 / 19; distinct languages 0 / 11; literals per language 1.7 (Europeana proxy)
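The three multilinguality counts on slide 21 can be reproduced with a short Python sketch. It assumes each literal is available as a (text, language tag) pair, with None for an untagged literal. The function name and the input layout are illustrative assumptions, not the framework's API, and the label texts are placeholders because only the language tags matter for the counts.

```python
from collections import Counter


def multilinguality(literals):
    """Count tagged literals, distinct languages and literals per language.

    `literals` is an iterable of (text, language_tag_or_None) pairs.
    """
    tags = [lang for _, lang in literals if lang]
    per_language = Counter(tags)
    distinct = len(per_language)
    return {
        "tagged_literals": len(tags),
        "distinct_languages": distinct,
        "literals_per_language": len(tags) / distinct if distinct else 0.0,
    }


# Language tags of the dereferenced skos:prefLabel values on the slide.
ballet_tags = ["no", "hi", "de", "be", "ru", "pt", "bg", "lt", "hr", "lv"]
opera_tags = ["no", "hi", "de", "fi", "be", "ru", "pt", "bg", "lt"]

provider_proxy = [("Ballet", None), ("Opera", None)]
europeana_proxy = (
    [("ballet label", tag) for tag in ballet_tags]
    + [("opera label", tag) for tag in opera_tags]
)

print(multilinguality(provider_proxy))
print(multilinguality(europeana_proxy))
```

For the Europeana proxy this gives 19 tagged literals in 11 distinct languages, about 1.7 literals per language. The provider proxy, whose literals carry no language tag, scores zero on all three.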
  • 23. Measuring library catalogues. Card catalogue at Gent University Library, photo: Pieter Morlion, 2010 CC-BY 4.0 https://commons.wikimedia.org/wiki/File:Boekentoren_2010PM_1179_21H9015.JPG
  • 24. a (pretty printed) example
      LDR 01136cnm a2200253ui 4500
      001 002032820
      005 20150224114135.0
      008 031117s2003 gw 000 0 ger d
      020 $a3805909810
      100 1 $avon Staudinger, Julius,$d1836-1902$0(viaf)14846766
      245 10$aJ. von Staudingers Kommentar zum ... /$cJ. von Staudinger.
      250 $aNeubearb. 2003$bvon Jörn Eckert
      260 $aBerlin :$bSellier-de Gruyter,$c2003.
      300 $a534 p. ;.
      500 $aCiteertitel: BGB.
      500 $aBandtitel: Staudinger BGB.
      700 1 $aEckert, Jörn
      852 4 $xRE$bRE55$cRBIB$jRBIB.BUR 011 DE 021$p000000800147
  • 25. semantic elements (MARC 21 / versions / total): control fields 7 / none / 7; control subfields 211 / none / 211; data fields 215 / 68 / 283; indicators 175 / 8 / 183; subfields 2259 / 344 / 2603; all elements 3287. Java classes (qa-metadata-marc.jar); Avram JSON data model export; machine readable standard
  • 26. proportion of records with issues (% of records, library: all / core): bay 100.0 / 18.8; bzb 100.0 / 76.1; cer 2.8 / 2.8; col 90.4 / 66.0; dnb 13.9 / 0.2; gen 40.8 / 27.3; har 100.0 / 97.3; loc 30.5 / 29.3; mic 80.8 / 67.5; nfi 62.1 / 58.1; ris 99.7 / 57.1; sfp 82.7 / 60.4; sta 92.7 / 92.5; szt 30.8 / 30.6; tib 100.0 / 100.0; tor 100.0 / 74.2. core = issues in the documented elements. Slide annotations: strange; almost error-less; surprising
  • 27. issue types. record level: ★ ambiguous linkage ★ invalid linkage ★ type error. control field: ★ invalid code ★ invalid value. data field: ★ missing reference subfield (880$6) ★ non-repeatable field ★ undefined field. indicator: ★ invalid value ★ non-empty value ★ obsolete value. subfield: ★ classification ★ invalid ISBN ★ invalid ISSN ★ invalid length ★ invalid value ★ repetition ★ undefined subfield ★ non well-formatted value. Legend: most frequent issues
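One of the subfield-level issue types on slide 27, "invalid ISBN", can be shown with a minimal Python sketch of the standard ISBN-10 check-digit rule: the weighted sum of the ten characters, with weights 10 down to 1, must be divisible by 11. The function name is an assumption, and the sketch only shows the kind of check involved; it is not the validation code of the Java tool mentioned on slide 25.

```python
def is_valid_isbn10(isbn):
    """Return True if `isbn` is a 10-character ISBN with a valid check digit."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10          # 'X' stands for 10, only as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value   # weights 10, 9, ..., 1
    return total % 11 == 0


# 020 $a value from the MARC example on slide 24.
print(is_valid_isbn10("3805909810"))
# Same number with the last digit altered.
print(is_valid_isbn10("3805909811"))
```

The ISBN in field 020 of the example record on slide 24 passes this check, since its weighted sum is 253 = 11 × 23.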
  • 28. completeness by field groups
  • 29. conclusions ★ Q1 quality dimensions: completeness, multilinguality, issue detection ★ Q2 flexibility: schema abstraction, decoupling measurements ★ Q3 scalability: using frameworks such as Apache Spark ★ Q4 limited resource: measuring performance, optimizing parameters
  • 30. publications ★ J. Stiller, P. Király (2017) Multilinguality of Metadata: Measuring the Multilingual Degree of Europeana’s Metadata. In Proc. of the 15th Intl. Symp. of Information Sci. 164–176. ★ P. Király (2017) Towards an extensible measurement of metadata quality. In Second International Conference on Digital Access to Textual Cultural Heritage. 111–115. 10.1145/3078081.3078109 ★ P. Király (2017) Measuring completeness as metadata quality metric in Europeana. In Digital Humanities 2017 Conference Abstracts. 291–293. ★ V. Charles, J. Stiller, P. Király, W. Bailer, N. Freire (2017) Evaluating Data Quality in Europeana: Metrics for Multilinguality. In Joint Proceedings of TDDL 2017, MDQual 2017 and Futurity 2017. ★ P. Király (2018) Adat a könyvtárban. In Hagyomány és újítás a 21. századi könyvtárban. 49–74. ★ P. Király, M. Büchler (2018) Measuring completeness as metadata quality metric in Europeana. In 2018 IEEE International Conference on Big Data. 2711–2720. 10.1109/BigData.2018.8622487 ★ P. Király, J. Stiller, V. Charles, W. Bailer, N. Freire (2019) Evaluating Data Quality in Europeana: Metrics for Multilinguality. In Metadata and Semantic Research 2018. 199–211. 10.1007/978-3-030-14401-2_19
  • 31. conferences #1 International Symposium on Information Science 2017, Berlin (with Juliane Stiller) #2 #dariahTeach 2017, Lausanne (poster) #3 SI & IT Workshop 2017, Göttingen (with Juliane Stiller) #4 Linked Data Quality workshop (@ ESWC 2017), Portorož (invited keynote speech) #5 DATeCH 2017, Göttingen #6 ELAG 2017, Athens (with Valentine Charles) #7 Digital Humanities 2017, Montréal #8 Linked Data Quality Workshop (@ Semantics 2017), Amsterdam (organizer and presenter) #9 (Meta)-Data Quality Workshop (@ TPDL 2017), Thessaloniki (presented by Juliane Stiller) #10 ADOCHS meeting 2017, Brussels (invited speech) #11 LDCX 2018, Stanford University #12 ELAG 2018, Prague (workshop together with Anette Strauch, Patrick Hochstenbach, Mark Phillips) #13 12th International Conference on Metadata and Semantics Research 2018, Limassol (with Juliane Stiller) #14 Open Research Knowledge Graph workshop 2018, Hannover #15 Research Infrastructure on Religious Studies Workshop on FAIR Research Data Management 2018, Mainz (invited speech) #16 Computational Archival Science workshop (@ IEEE Big Data 2018), Seattle #17 DATeCH 2019, Brussels