Europeana Newspapers in a Nutshell

Europeana Newspapers @DASI Forum Berlin, 2018

Published in: Technology
  1. Europeana Newspapers in a Nutshell
     Clemens Neudecker (@cneudecker), Staatsbibliothek zu Berlin – Preußischer Kulturbesitz
  2. Introduction/Background
     • Europeana Newspapers EU Project (2012 – 2015)
     • http://www.europeana-newspapers.eu/
     • Main objectives:
       – Collect metadata for digitised newspapers in the EU
       – Perform OCR (text recognition) and OLR (article separation) on the digitised newspapers
       – Develop a common portal for search & discovery
       – Establish standards and best practices for (historical) newspaper digitisation
  3. Collection Stats
     • Covers newspapers from 1618 – 2016
     • 12 EU national and/or research libraries
     • More than 1,000 newspaper titles, ca. 3.3 million issues
     • 40 languages, 4 alphabets
     • Metadata for approx. 20 million pages
     • 12 million pages fully searchable by keyword (results are affected by OCR errors, so your mileage may vary)
     • Data (scans & OCR) in the public domain, metadata CC0 licensed
  4. TEL Historic Newspaper Browser: keyword search
  5. TEL Historic Newspaper Browser: various facets
  6. TEL Historic Newspaper Browser: browse by newspaper title list
  7. TEL Historic Newspaper Browser: browse by date of issue
  8. Collaboration with Researchers
     • Oceanic Exchanges (Digging Into Data Transatlantic Platform, 2017 – 2019)
     • impresso (Swiss National Science Foundation, 2017 – 2020)
     • NewsEye (EU H2020, 2018 – 2020)
     • CLARIN (EU DSI, ongoing)
     • Interviews with researchers
     • Numerous research groups throughout the EU (though mainly in the DACH region)
  9. Outlook
     • Relaunch of Europeana Newspapers with a redeveloped search and browse interface, integrated directly with the Europeana Portal as a thematic collection (July/August 2018)
     • Support for the IIIF API for the online presentation and aggregation of newspapers in Europeana. In the longer term, this will open up the possibility to bookmark, annotate, transcribe or correct, and connect disparate newspaper sources directly in your web browser
     • Named Entity Recognition for newspapers
  10. Thank you for your attention!
      Clemens Neudecker (@cneudecker), Staatsbibliothek zu Berlin – Preußischer Kulturbesitz
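
The Outlook slide mentions IIIF support for presenting newspaper pages. As a rough illustration of what that enables, the sketch below builds request URLs following the general IIIF Image API pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The server address and page identifier are hypothetical placeholders, not actual Europeana endpoints, and the `max` size keyword assumes a server implementing Image API 2.1 or later.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # Percent-encode the identifier so that slashes inside it are not
    # mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical server and page identifier, for illustration only.
BASE = "https://iiif.example.org/image"
PAGE = "newspaper/issue-0001/page-1"

# The whole page at the largest size the server offers.
full_page = iiif_image_url(BASE, PAGE)

# A 1000x800 pixel region from the top-left corner, scaled to 500 pixels wide.
headline = iiif_image_url(BASE, PAGE, region="0,0,1000,800", size="500,")

print(full_page)
print(headline)
```

Because regions and sizes are expressed in the URL itself, a viewer or annotation tool can address a single article or headline on a page without downloading the full scan.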
