The Europeana meeting under the Romanian Presidency, Exposing Online the European Cultural Heritage: The impact of Cultural Heritage on the Digital Transformation of The Society, Iasi, Romania - 17 & 18 April 2019
This document discusses several topics related to AI and digital culture including metadata enrichment, machine learning, deep neural networks, supervised learning, datasets, crowd and machine intelligence, and semantic enrichment. Metadata can be enriched through manual and automatic processes including machine learning. Machine learning algorithms use sample training data to make predictions while deep neural networks and supervised learning use labeled input-output datasets. Large annotated datasets are needed to train machine learning models and crowdsourcing can be used to obtain this data. Crowd and machine intelligence can cooperate by using crowdsourced labels to train models and models to validate labels. Semantic enrichment involves mapping metadata to controlled vocabularies using tools like those developed by EKT to normalize values.
1. AI & Digital Culture
Vassilis Tzouvaras, NTUA | Iasi, Romania 2019
2. Metadata Enrichment
• Quality of Metadata
• Data models, EDM, XML, RDF, LOD, URIs, SKOS, Wikidata, GeoNames
• Manual process
• Automatic enrichment, machine learning
• Crowdsourcing, human in the loop
Why is it needed?
CC BY-SA
Letter carrier from "Letters from the Land of the Rising Sun", 1886 - 1892, British Library, United Kingdom, Public Domain
3. Machine Learning
Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions.
Technologies: NLP, object detection, machine translation, historical event detection, visual similarity.
5. Supervised Learning
Supervised learning algorithms build a mathematical model of a set of data that contains both the inputs and the desired outputs. The data is known as training data, and consists of a set of training examples.
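To make the input/output pairing concrete, here is a minimal sketch of a supervised learner: a 1-nearest-neighbour classifier, chosen only because it is one of the simplest supervised methods. The features (width and height in centimetres), the labels and the function name `predict` are illustrative assumptions, not taken from the talk.

```python
import math

# Hypothetical training examples: each pairs an input (width_cm, height_cm)
# with the desired output (an item type label).
TRAINING_EXAMPLES = [
    ((9.0, 14.0), "postcard"),
    ((10.5, 15.0), "postcard"),
    ((60.0, 90.0), "poster"),
    ((70.0, 100.0), "poster"),
]


def predict(features, examples=TRAINING_EXAMPLES):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        examples, key=lambda example: math.dist(example[0], features)
    )
    return nearest_label


if __name__ == "__main__":
    print(predict((10.0, 14.5)))
    print(predict((65.0, 95.0)))
```

The "model" here is simply the stored training set plus a distance function; real systems would typically learn parameters instead, but the contract is the same: labeled examples in, predictions for unseen inputs out.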
6. Datasets???
Machine learning requires high volumes of data for training, validation, and testing.
It's important to have the right data, structured in the right format, covering all the variation your solution will encounter.
So how do you get the large volume of structured data you need? Human-annotated data is the key to successful machine learning.
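The three roles mentioned above (training, validation, testing) are usually served by disjoint subsets of the same annotated dataset. The following is a minimal sketch of such a split; the function name `split_dataset`, the 80/10/10 proportions and the fixed seed are assumptions made for illustration.

```python
import random


def split_dataset(records, train=0.8, validation=0.1, seed=42):
    """Shuffle `records` and split them into train/validation/test lists.

    Whatever is left after the train and validation shares goes to test.
    """
    if not 0 < train + validation < 1:
        raise ValueError("train + validation must be between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_validation = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_validation],
        shuffled[n_train + n_validation:],
    )


if __name__ == "__main__":
    train_set, validation_set, test_set = split_dataset(range(100))
    print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before splitting helps each subset cover the variation in the data, provided the full dataset covers it in the first place.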
7. Crowd & Machine Intelligence
Machine intelligence and human intelligence can cooperate and improve each other in a mutually rewarding way.
● Use annotations obtained from users to train and improve machine learning algorithms
● Use machine learning methods to validate labels acquired from users
● Run crowdsourcing campaigns on specifically selected content that will improve the performance of the automated machine learning system
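The first two bullets can be sketched as a small loop: crowd labels are aggregated into training labels, and model predictions are used to flag crowd labels that deserve a second look. This is only an illustration under stated assumptions: majority voting, the `min_agreement` threshold and both function names are hypothetical and are not described in the talk.

```python
from collections import Counter


def aggregate_crowd_labels(annotations, min_agreement=0.6):
    """Majority-vote the labels each item received from the crowd.

    Returns {item_id: label} for items whose most common label reaches
    `min_agreement`; the other items are left out as too uncertain.
    """
    accepted = {}
    for item_id, labels in annotations.items():
        if not labels:
            continue
        label, votes = Counter(labels).most_common(1)[0]
        if votes / len(labels) >= min_agreement:
            accepted[item_id] = label
    return accepted


def flag_for_review(crowd_labels, model_predictions):
    """Return the ids of items where the model disagrees with the crowd."""
    return sorted(
        item_id
        for item_id, label in crowd_labels.items()
        if item_id in model_predictions and model_predictions[item_id] != label
    )


if __name__ == "__main__":
    annotations = {
        "a": ["map", "map", "photo"],
        "b": ["map", "photo"],
        "c": ["photo", "photo"],
    }
    labels = aggregate_crowd_labels(annotations)
    print(labels)
    print(flag_for_review(labels, {"a": "map", "c": "map"}))
```

Items that are dropped for low agreement or flagged for disagreement are natural candidates for the targeted crowdsourcing campaign in the third bullet.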
9. Geo tagging
Drop pins on the countries or locations on the map that the picture represents.
10. Greek Aggregator
EKT has developed an aggregation infrastructure that consists of various platforms and systems that cover the lifecycle of digital content aggregation, from harvesting and validation, to cleansing, transformation, semantic enrichment and secured preservation.
11. Semantic enrichment
● Semantics.gr: a platform developed by EKT where institutions can create, establish and publish LOD vocabularies, taxonomies, thesauri and authority files (SKOS as well as other schemas)
○ Vocabulary of 139 item types
○ Vocabulary of 94 Greek historical periods
● Enrichment tool of Semantics.gr: a tool for setting enrichment mapping rules from distinct metadata values to vocabulary terms
○ Mappings per collection
○ Semi-automatic (automatic suggestions and curation)
○ Mappings can be based on one or more metadata fields
● Time normalization tool of the aggregator platform: a tool for setting parametric normalization patterns for time values
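As a rough illustration of the two ideas in this list, the sketch below maps distinct metadata values to vocabulary term URIs per collection, and normalizes free-text time values with a small set of patterns. It does not reproduce the Semantics.gr or aggregator tools: the rule table, the `example.org` URIs, the patterns and the function names are all hypothetical.

```python
import re

# Hypothetical per-collection mapping rules: (field, raw value) -> term URI.
# The URIs are placeholders, not real Semantics.gr identifiers.
MAPPING_RULES = {
    "collection-a": {
        ("type", "photo"): "http://example.org/vocab/item-types/photograph",
        ("type", "b/w photograph"): "http://example.org/vocab/item-types/photograph",
        ("type", "letter"): "http://example.org/vocab/item-types/letter",
    },
}

# Hypothetical normalization patterns: each pairs a regex with a function
# that turns the match into a (start_year, end_year) tuple.
TIME_PATTERNS = [
    (re.compile(r"^\s*(\d{4})\s*[-–]\s*(\d{4})\s*$"),
     lambda m: (int(m[1]), int(m[2]))),
    (re.compile(r"^\s*(?:c\.|ca\.|circa)\s*(\d{4})\s*$", re.IGNORECASE),
     lambda m: (int(m[1]), int(m[1]))),
    (re.compile(r"^\s*(\d{4})\s*$"),
     lambda m: (int(m[1]), int(m[1]))),
]


def enrich(collection, record):
    """Return vocabulary URIs for record values that match a mapping rule."""
    rules = MAPPING_RULES.get(collection, {})
    uris = []
    for field, value in record.items():
        uri = rules.get((field, value.strip().lower()))
        if uri and uri not in uris:
            uris.append(uri)
    return uris


def normalize_time(value):
    """Return (start_year, end_year) for a recognized value, else None."""
    for pattern, build in TIME_PATTERNS:
        match = pattern.match(value)
        if match:
            return build(match)
    return None


if __name__ == "__main__":
    print(enrich("collection-a", {"type": " Photo ", "title": "Letter carrier"}))
    print(normalize_time("1886 - 1892"))
```

In a semi-automatic workflow, values that match no rule or pattern (here, an empty list or `None`) would be the ones surfaced to a curator, who can accept a suggestion or add a new rule.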