CARARE: Data adventures in heritage science

  1. Data adventures in heritage science
     Dimitris Gavrilis, Digital Curation Unit – IMIS, Athena Research Centre
  2. Archaeology and Architecture in Europeana, Leiden, 13-14th June 2017
     Introduction
     • A large amount of content (data) exists and is being produced in heritage science
       • Ease through technology
       • Organizations such as Europeana
       • …
  3. Technology brings changes
     • Richness of both data and metadata (resolution, expressiveness) has become a problem
       • Smartphones can now make use of 4K
       • APIs enable easy re-use of content
     • A search bar/form is not what people want to use anymore
     • Stories / collections / curated content are the new trend
       • Navigation using tap/gestures … not the keyboard
       • Tap
  4. Resolution and Quality
     • All of these “new” trends require better quality content:
       • Higher resolution for data
       • Richer, more expressive metadata
  5. Data models, metadata schemas
     • One to rule them all?
     • We’re not there yet…
  6. Diversity & aggregation
     • Specialized, highly expressive data models → CARARE schema
     • Aggregators enable transformation of metadata among different data models (or metadata schemas)
  7. Let’s see an example: MORe, http://more.dcu.gr/
  8. Quality metrics
     • Completeness
     • Accuracy
     • Consistency
     • Appropriateness
     • Auditability
     • …
  9. Quality-driven enrichment?
  10. Quality-driven enrichment?
      Lots of services out there, but… how do I detect which parts of the record need to be enriched?
  11. Identifying quality issues
      • Presence of an element
      • Rules
        • e.g. Schematron
      • Statistical metrics
        • e.g. number of distinct values
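
The first and last checks on this slide (presence of an element, number of distinct values) can be sketched in a few lines of Python. This is only an illustration: the namespace URI, the `car:record` wrapper and the three sample records are placeholders made up for the example, not taken from the CARARE schema or from MORe.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Placeholder namespace and records, invented for illustration only.
NS = {"car": "http://example.org/car"}
WRAP = '<car:record xmlns:car="http://example.org/car">{}</car:record>'

RECORDS = [
    WRAP.format("<car:title>Cairn</car:title><car:subject>burial cairn</car:subject>"),
    WRAP.format("<car:title>Stone</car:title>"),
    WRAP.format("<car:title>Mound</car:title><car:subject>burial cairn</car:subject>"),
]


def has_element(record_xml, tag):
    """Presence check: the element exists as a child and has non-empty text."""
    node = ET.fromstring(record_xml).find(tag, NS)
    return node is not None and bool((node.text or "").strip())


def distinct_values(records, tag):
    """Statistical check: count how often each distinct value of `tag` occurs."""
    values = Counter()
    for record_xml in records:
        for node in ET.fromstring(record_xml).findall(tag, NS):
            text = (node.text or "").strip()
            if text:
                values[text] += 1
    return values


# Indexes of records that lack a usable car:subject.
missing = [i for i, r in enumerate(RECORDS) if not has_element(r, "car:subject")]
# Distinct subject values across the whole sample.
subject_counts = distinct_values(RECORDS, "car:subject")
```

A very low number of distinct values across many records can hint that a field was filled with a default, which is the kind of signal a statistical metric is meant to surface. Rule-based checks such as Schematron would sit alongside this and are not shown here.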
  12. But what about this subject?
  13. Content-based quality metrics
      <car:subject> Art on a portable stone: burial cairn </car:subject>
      Extract some features (length of text, number of words, number of common words): 34, 7, 2
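
A minimal sketch of this feature extraction, under stated assumptions: the `COMMON_WORDS` list is a small hypothetical stop-word set chosen for the example, and the length is counted in characters after trimming surrounding whitespace. The slides do not say how the length was counted, so the first value produced here may differ from the figure on the slide; the word and common-word counts do not depend on that choice.

```python
# Hypothetical stop-word list; the slides do not say which list was used.
COMMON_WORDS = {"a", "an", "the", "on", "of", "in", "and", "or", "to", "at"}


def extract_features(text):
    """Return (length of text, number of words, number of common words)."""
    text = text.strip()
    words = text.split()
    # Strip trailing punctuation so "stone:" is compared as "stone".
    common = sum(1 for w in words if w.strip(".,:;!?").lower() in COMMON_WORDS)
    return (len(text), len(words), common)


features = extract_features(" Art on a portable stone: burial cairn ")
```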
  14. Let’s put this on a graph
      • Extract features from all the content available
      • Cluster them into 2 classes
        • Good
        • Bad
      • For every new record, identify the subject and extract the features
      • Measure its distance to each of the two cluster centers (Good, Bad).
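
The steps on this slide can be sketched with a tiny two-cluster k-means in plain Python. The slides do not name a clustering algorithm, so k-means is an assumption here, as are the made-up sample vectors and the decision to label the cluster of longer, wordier subjects as "good".

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def two_means(points, iterations=20):
    """Cluster points into 2 groups and return the two centres."""
    # Deterministic start: smallest and largest vectors seed the centres.
    ordered = sorted(points)
    centres = [ordered[0], ordered[-1]]
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            nearest = 0 if distance(p, centres[0]) <= distance(p, centres[1]) else 1
            groups[nearest].append(p)
        # Keep the old centre if a group ends up empty.
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return centres


# Made-up (length, words, common words) vectors, for illustration only.
SAMPLE = [(4, 1, 0), (6, 1, 0), (5, 1, 0), (37, 7, 2), (42, 8, 3), (35, 6, 2)]

# Assumption: the centre seeded from the shortest subjects is the "bad" one.
bad_centre, good_centre = two_means(SAMPLE)


def score(features):
    """Label a new record's subject by its nearer cluster centre."""
    d_bad = distance(features, bad_centre)
    d_good = distance(features, good_centre)
    return "good" if d_good < d_bad else "bad"
```

The features are left unscaled, so text length dominates the distance. A real setup would probably normalise each feature first, and could report the two distances themselves rather than a hard label.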
  15. What then?
      • Automatic enrichment
      • Crowdsourcing
      • Gamification
      • …
  16. Thanks for your attention
      www.carare.eu