Europeana Newspapers Polish Information Day

Published in: Technology
Transcript

  • 1. Europeana Newspapers Project: "Distant Reading: Historic Newspapers in the Digital Age". National Library, Warsaw, Poland, January 16, 2014. Ulrike Kölsch, Project Coordinator, Berlin State Library
  • 2. Europeana Newspapers, 16 January 2014 – Warsaw – Morning Edition
  • 3. Europeana Newspapers Project. On 15th April 1912, the passenger ship Titanic, carrying over 2000 passengers and crew, sank after striking an iceberg on its maiden voyage from Southampton to New York. This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp
  • 4-9. Responses to the Titanic Disaster
  • 10. News travels at different speeds, and its importance diminishes at different rates. This is as true now as it was in 1912 (though the web changes things …).
  • 11. The Europeana Newspapers Project is making this kind of investigation easier in several ways:
  • 12. (1) By creating full text for 8 million pages; (2) by undertaking article segmentation for 2 million pages; (3) by undertaking named entity extraction for 2 million pages; (4) by developing a cross-searchable newspapers browser at The European Library (with metadata forwarded to Europeana).
  • 13. A Best Practice Network that aims to aggregate 18 million digitised historic newspaper pages from 12 European libraries, drastically improving search and retrieval.
      - Volume
      - Cross European cultures
      - Sharing best practices
      - Improving accessibility
      - Improving availability
  • 14. The challenges: newspapers were not meant to be preserved …
      - frail and crumbly paper
      - missing editions
      - incomplete supplements
      - poorly bound
      - fading ink
      - different fonts
      - legal uncertainties with contemporary material
  • 15. Who: 12 content providers, 2 networking partners. Legend: Blue – providing content; Yellow – providing technical services; Green – associate partners.
  • 16. Who: 4 technology providers, 12 content providers, 1 aggregator, 2 networking partners.
  • 17. Challenges and Solutions in Creating a European Historic Newspapers Browser I. Creating a newspapers interface that ...
      - provides unique value to users
      - reflects relationship to original physical newspaper collections
      - is sustainable
      - offers contributors added value
      - defines relationship to Europeana
      - respects library wishes
  • 18. Challenges and Solutions in Creating a European Historic Newspapers Browser II. What content will be included?
      - Full images, full text, metadata: Latvia, Belgrade, Germany (Hamburg, Berlin), Estonia, Finland, Netherlands, Austria
      - Snippets of images, full text, metadata: Italy, France, Poland
  • 19. Challenges and Solutions in Creating a European Historic Newspapers Browser III. First Iteration
      - Basic text search
      - Filtering of results by date, country, newspaper, language, library
      - OCR shown
      - Zoomable version of full image
      - Clickable links between full text and image (sometimes)
      - Link to newspaper source library
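A minimal Python sketch of how a basic text search might be combined with the filters listed for the first iteration (date, country, newspaper, language, library). It illustrates the idea only and does not describe the project's actual browser: the `Page` record, its field names, the `search` function and the sample pages are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Page:
    """One OCRed newspaper page (hypothetical record layout)."""
    newspaper: str
    library: str
    country: str
    language: str
    published: date
    text: str  # OCR full text


# Facets a user may filter on, mirroring the slide's list (date is handled separately).
FACETS = {"country", "newspaper", "language", "library"}


def search(pages: Iterable[Page], query: str, *,
           date_from: Optional[date] = None,
           date_to: Optional[date] = None,
           **facets: str) -> List[Page]:
    """Return pages whose text contains `query` and that match every given filter."""
    unknown = set(facets) - FACETS
    if unknown:
        raise ValueError(f"unknown facet(s): {sorted(unknown)}")

    needle = query.casefold()  # case-insensitive substring match
    hits = []
    for page in pages:
        if needle not in page.text.casefold():
            continue
        if date_from is not None and page.published < date_from:
            continue
        if date_to is not None and page.published > date_to:
            continue
        if any(getattr(page, name) != value for name, value in facets.items()):
            continue
        hits.append(page)
    return hits


# Invented sample data, for illustration only.
SAMPLE_PAGES = [
    Page("Example Daily", "Example Library A", "Poland", "pl",
         date(1912, 4, 17), "Titanic lost at sea"),
    Page("Example Weekly", "Example Library B", "Estonia", "et",
         date(1912, 4, 20), "News of the Titanic reaches the port"),
    Page("Example Daily", "Example Library A", "Poland", "pl",
         date(1912, 5, 2), "Harvest prices rise"),
]

if __name__ == "__main__":
    for hit in search(SAMPLE_PAGES, "titanic", country="Poland"):
        print(hit.published.isoformat(), hit.newspaper)
```

A real service would use a full-text index rather than a linear scan; the sketch only shows how a query and the filters narrow the same result set.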
  • 20. Challenges and Solutions in Creating a European Historic Newspapers Browser IV
  • 21. Challenges and Solutions in Creating a European Historic Newspapers Browser V. Complete newspaper image can be shown: Eesti Potimees ehk Naddaleleht, 2 November 1866 (National Library of Estonia)
  • 22. Challenges and Solutions in Creating a European Historic Newspapers Browser VI. Fragment of newspaper image can be shown: Dziennik Slaskui, 10 June 1915 (National Library of Poland)
  • 23. Challenges and Solutions in Creating a European Historic Newspapers Browser VII. Only title-level metadata can be shown: “Kleine Blatt, 15 November 1932” (National Library of Austria)
  • 24. Challenges and Solutions in Creating a European Historic Newspapers Browser VIII. Zooming in
  • 25. Challenges and Solutions in Creating a European Historic Newspapers Browser IX. Second Iteration
      - Fragments
      - See information on a particular title
      - See what was published on a particular day
      - Search over titles (not just text)
      - Other browsable visualisations of publication and library source
      - Search / browse via entities
  • 26. Challenges and Solutions in Creating a European Historic Newspapers Browser X. Who are the users?
      - Historians
      - Researchers
      - Students
      - Genealogists
      - Teachers and school pupils
      - Interested public: citizen researchers …
  • 27. Challenges for Users: “Texts are designed to ‘speak’ to us, and so, they always end up telling us something; but archives are not messages that were meant to address us, and so they say absolutely nothing until one asks the right question.” (Franco Moretti, "Distant Reading", 2013)
  • 28. Share best practices … via workshops and national information days. Image: Australian National Maritime Museum
  • 29. Network Partner Project: Europeana Collections 1914-1918 – Remembering the First World War. “Unlocking Sources – The First World War online & Europeana”, 30-31 January 2014. 2014 will mark the centenary of the outbreak of the First World War, which will be commemorated worldwide. In recent years a wide range of European cultural institutions, including the Staatsbibliothek zu Berlin, have digitized manuscript and print materials as well as film holdings. Books, photos, films, posters, manuscripts, and song lyrics have recently been made available online. On 30 and 31 January 2014 the Staatsbibliothek zu Berlin will host the event “Unlocking Sources – The First World War online & Europeana” to mark the commemoration. More information: www.unlocking-sources.eu
  • 30. Thank you for your interest! More information on our website: www.europeana-newspapers.eu
