Context: the Digitale Collectie Nederland (Digital Collection of the Netherlands)
The DEN ICT register
Research into the use of metadata by heritage institutions
Use of heritage thesauri
Minimum requirements for the national infrastructure
Enterprise Europe Network presentation for VIT at WTC Twente 2018, Olaf ter Haar
Presentation by the Enterprise Europe Network for members of the VIT / Industriële Kring Twente at the meeting at WTC Twente on 14 May 2018, on the theme of internationalisation.
Session 'Onderduikkaarten online: hoe zet je een collectie om in informatie?' ('Hiding cards online: how do you turn a collection into information?') by Otto Kuipers and Hans Laagland of Tresoar, during the Noordelijke Netwerkdag Oorlogsbronnen (Northern War Sources Network Day) 'Samen zien we meer' ('Together we see more') on 28 May 2018. A presentation on the current state of the thematic website Ondergedokeninfryslan.nl and the role of Redbot in this case.
This is the part that Hans Laagland presented during the session.
Presentation at the biennial conference for information professionals Informatie aan Zee 2019 on the outcomes of the Wiki Loves Heritage project, with a focus on the track for content donations to Wikimedia platforms by collection-holding institutions themselves.
Matthias Priem, digitisation manager at VIAA, outlines how VIAA has built up several access platforms and looks more closely at some key aspects of providing access:
1) Agreements and licences 2) Metadata 3) Content: uniform formats 4) Architecture 5) Reporting
Inctspiratie 2009 - KB - Op weg naar de digitale bibliotheek (Towards the digital library), Elco van Staveren
An account of the process by which the KB is developing into a fully fledged digital library, with attention to the question of how to position online services and which (new) agreements with publishers that requires.
Nathalie Monteyne and Eline Wellens of KMSKA explained how they tackled the ‘rights story’ pragmatically and drew up a set of rules within the organisation for documenting rights in their collection database (TMS) and in their image management system (Resource Space).
Olivier Van Dhuynslager came to talk about data cleaning at the Design Museum in Ghent. Once he had established that he would need to step up from Open Refine to Python to do what he wanted, a steep two-week learning curve enabled him to complete his project goals within a few weeks.
On 03/07/2019 PACKED, the centre of expertise for digital heritage, presented a series of cultural open datasets published by themselves and in cooperation with the Flemish Art Collection, along with a linked open dataset published by VIAA, the Institute for Digital Archiving, to be used by the participants at the annual Apps for Ghent hackathon.
Getting started with digital transformation.
The digital shift requires cultural organisations to reshape how they operate and to use digital technology to fulfil their mission. Need advice or support with that digital transformation? The merged organisation LUKAS / PACKED / VIAA is joining forces and may be able to help you too. This session introduces some recent projects that offer inspiration for approaching digital transformation in a well-considered way:
The semantic web requires cultural organisations to master new digital technologies. Rony Vissers and Bert Lemmens show how you can get started with a digital strategy for your organisation.
How far along is your organisation in terms of digital development? Assess it yourself with the Zelfevaluatietool Digitale Maturiteit (Digital Maturity Self-Assessment Tool). Learn from your score and compare with other cultural organisations. Bart Magnus presents the new tool and gives a first look at how it is used.
An important point of attention when drawing up a digital strategy is the user. Karen Vander Plaetse explains how to take the user into account in access projects and gives some concrete examples from VIAA's own platforms.
This presentation showcases PACKED's ongoing commitment to finding novel ways to open up digital heritage collections on third-party platforms such as the Wikimedia platforms. It was aimed specifically at the Flemish heritage sector and given during the partner day of the aggregators Erfgoedplus and Erfgoedinzicht.
Presentation of the project "Annotatie webarchief Groninger Archieven mbv GTAA-Onderwerpsas" (Annotating the Groninger Archieven web archive using the GTAA subject axis), from the Netwerk Digitaal Erfgoed. Given at the AVA_Net symposium at Beeld en Geluid in Hilversum on 1 July 2016.
Office and project management for architects and engineers. Also works on Windows and Mac.
ArchX is available in various modules: ArchX.Time, ArchX.Doc and ArchX.Full.
For more information, see www.archx.eu
The Europeana Newspapers Workshop presentation discusses a project that aims to make 18 million digitized historical European newspaper pages available through one search portal. The project involves 12 content providers, 2 networking partners, 4 technology providers, and 1 aggregator working to improve the accessibility of these newspapers by fully digitizing and making searchable 10 million pages. The presentation outlines the challenges of preserving fragile newspaper content and the project's efforts to apply optical character recognition to pages, align metadata, and share best practices through its website and workshops.
The document summarizes the Europeana Newspapers Project, which digitized 18 million newspaper pages from across Europe between the 17th-20th centuries. The project aims to improve search capabilities and access to these historical newspapers by applying optical character recognition (OCR) and extracting metadata on people, places and organizations mentioned in articles. A network of 12 content providers, technical partners and others collaborated on enrichment, aggregation and dissemination of the newspaper content so it can be explored through Europeana and other online interfaces.
Toine Pieters of Utrecht University will lead a team of researchers to map trends and changes related to the economic, cultural, and scientific influence of the US as a reference culture in Europe from 1815-1992 using digital humanities tools. The team will analyze a 10% sample of over 9 million digitized newspaper pages from the National Library of the Netherlands stored in an ElasticSearch database. They will use text mining tools integrated in the xTAS interface to conduct searches, named entity recognition, sentiment analysis and visualization of search results over time to allow both close and distant reading of the newspaper texts.
This document summarizes a presentation about using digital technologies and "big data" to study the emergence of the United States as a "reference culture" in public discourse in the Netherlands between 1890-1990. It discusses both the promises and limitations of digital approaches, including the ability to analyze large amounts of newspaper text but also the need to move from just finding information to exploring meaningful patterns and relationships in the data.
This document summarizes Europeana's strategy to improve access to digital cultural heritage in the period up to 2020. It discusses Europeana's achievements between 2008-2014 including the number of metadata records and openly licensed objects. It outlines Europeana's plans to transform from a portal to a platform, with priorities of improving data quality, access conditions, and creating value for partners. Europeana's goals include becoming self-sustainable, innovating business models, and operating as a multi-sided platform to improve collaboration and access to shareable content.
Kramerius 3: The Digital Library from the National Library of the Czech Republic, Europeana Newspapers
The Europeana Newspapers Project held a workshop in Amsterdam in September 2013. At the workshop, Thomas Foltyn of the National Library of the Czech Republic presented his library's digital newspaper browser, Kramerius 3.
Aly Conteh of the British Library presents the library's work to make digitised historic newspapers accessible online. This presentation was delivered at the Europeana Newspapers Project workshop in Amsterdam.
This document outlines an aggregation and indexing plan for digitized newspaper content from several European national libraries. The plan involves harvesting metadata and full text from partner libraries over multiple quarters in 2013-2014. Content will be indexed in a newspaper content browser and delivered to Europeana and other databases. Metadata and images will be ingested from libraries and made available with different viewing options. Quality control and customer relationship management systems will track the process.
The document discusses a workshop on refining digitized newspaper collections. It describes objectives like analyzing available newspaper collections, defining quality standards, and processing 10 million pages using refinement technologies. Challenges include balancing processing speed and quality given the large volume of diverse content. The refinement workflow involves binarization, file renaming, analysis, optical character recognition to extract full text, optical layout recognition to separate articles, and named entity recognition to tag people, places and organizations. The goal is to enhance access to digitized newspapers through Europeana.
Presentation on various reference frameworks for digital competences in education. The focus is on DigComp and derived instruments such as DigCompOrg and DigCompEdu, and on their localisation in Flanders.
This presentation outlines the starting point and elaboration of the tender for the digital infrastructure of the Stadsarchief. Presentation by Marc Holtman and Jeanine de Gier.
Onze ICT boeit studenten niet, I-strategie in de 21e eeuw (Our ICT does not engage students: I-strategy in the 21st century) - Jacco Jasperse - ... SURF Events
Tuesday 11 November 2014
Session round 1
Title: Onze ICT boeit studenten niet, I-strategie in de 21e eeuw (Our ICT does not engage students: I-strategy in the 21st century)
Speaker: Jacco Jasperse (Hogeschool Zeeland)
Room: Rotterdam Hall
Watch full webinar here: https://bit.ly/3pCYPJQ
The data minimization principle says that organizations must only process personal information that they actually need to achieve the objective of processing the data. Join this session to learn about data minimization and its principles and understand how it interacts with your way of working and design decisions in a modern data architecture.
Many heritage associations and heritage volunteers are interested in digitising their documentary collection and publishing it online. But how do you start? A brief introduction is given on how best to organise and digitise your documents. We then look at how to archive the digital files and make them accessible. The technical side is also covered. Basic computer skills are required.
The presentation by Hans-Jörg Lieder, Staatsbibliothek zu Berlin – Preußischer Kulturbesitz, at the BnF Information Day for Europeana Newspapers (November 2014).
Optical Character Recognition (OCR) technology can help users in their research by digitizing printed texts and enabling full-text search. However, OCR quality varies and error rates can be as high as 10-40% depending on factors like language and publication date. This can negatively impact researchers seeking all occurrences of search terms. Crowd-sourcing corrections for searched words and utilizing external knowledge sources like Wikipedia could help improve search results and researchers' experiences. Machine learning applied to large digitized collections also has potential to extract additional useful information and insights not readily apparent from the text alone.
The document discusses Optical Layout Recognition (OLR) to convert scanned newspaper pages into structured digital files. It describes CCS's role in providing OLR technology and services to structure over 2 million newspaper pages from 5 European library partners. The general OLR workflow involves scanning, layout analysis to identify text blocks and zones, OCR, and quality assurance. CCS will analyze page layouts to recognize elements like articles, headlines, images and classify page types. Libraries can perform final quality assurance checking on the structured output, which is packaged in METS and ALTO formats for preservation and improved search and access capabilities.
The Europeana Newspapers project is digitizing newspapers from the 17th-20th centuries across 22 European languages. It has provided full text for over 2 million newspaper pages and metadata for over 18 million additional pages. Usability testing was conducted with researchers and improvements were made to search, browsing, and display functionality based on feedback. Researchers value the project for enabling new large-scale, interdisciplinary, and computational analyses of digitized newspaper archives.
The document discusses the Europeana Newspapers project, which aims to digitize over 18 million newspaper pages from various European newspapers ranging from the 17th to 20th centuries. The project involves 12 content providers, 2 networking partners, 4 technology providers and 1 aggregator working together to improve access to historical newspapers. Key aspects of the project include cultural cooperation, skills sharing, improved search capabilities through technologies like optical character recognition. The project highlights how digitization has improved access to historical newspapers and their coverage of events like the Titanic disaster across different European countries.
This document discusses optical character recognition (OCR) of historical newspapers. It describes the digitization process, which includes image capturing, text and structure recognition, natural language processing, and content representation. OCR accuracy can be improved through layout analysis, structural metadata extraction, and identifying different content units like articles, advertisements, and entertainment sections. The goal is to make the content and knowledge within digitized newspapers accessible beyond the scanned text.
The document describes a project called OPATCH that aims to create an advanced online search infrastructure for a historical newspaper archive. OPATCH will use computational linguistic methods like parsing, tagging, and named entity recognition to correct errors from optical character recognition (OCR) processing on the newspapers, which are from 1910-1920 and in difficult-to-read Fraktur font. The project will start with error-prone OCR text that cannot be manually corrected at scale. It will develop and test a method to generate and select candidates for correcting OCR errors using edit distances and ngram frequencies.
1. Digitisation at the Wellcome Library: Lessons learned & shared.
Historical Newspapers in the Digital Age, Bolzano
October 2014. Dave Thompson, Digital Curator, Wellcome Library
2. The Wellcome Library
• Part of Wellcome Collection, an astonishing public venue in London developed by the Wellcome Trust, where people can learn more about medicine through the ages & across cultures.
• More than 10,000 readers visit us each year, including historians, academics, students, health professionals & consumers, journalists, artists & members of the general public.
3. Digitisation in the Wellcome Library
• Strategic approach, conscious planned decisions.
• Library transformation strategy, physical to digital.
• From ‘project’ to ‘production’.
• Digitisation as a sustainable end-to-end process.
4. Overview – four IT systems…
1. Workflow management system – ‘Goobi’ = PRODUCTION.
2. Digital object repository – ‘Preservica’ = STORAGE.
3. Front end – ‘the player’ = ACCESS.
4. Temporary & permanent storage for content = 70 TB.
6. Digitisation: Image upload
Digitised images (internally or externally digitised) are imported into Goobi & normalised to JPEG2000.
7. Digitisation: Upload, ftp, harvesting
Content delivered by ftp can be automatically imported into Goobi & processed, or Internet Archive (IA) content can be automatically harvested.
8. Digitisation: METS/ALTO for access
Content is OCR’d & METS/ALTO files are created in Goobi, either manually or automatically.
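To make the role of ALTO for access more concrete, here is a minimal sketch of reading OCR text back out of an ALTO file. It is not the Wellcome Library's or Goobi's code. It assumes only the general ALTO layout, in which words are stored in the CONTENT attribute of String elements inside TextLine elements. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(alto_xml):
    """Return the text of each TextLine by joining its String CONTENT values."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [
                child.attrib.get("CONTENT", "")
                for child in element
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return lines


# Hypothetical, heavily shortened ALTO document used only for illustration.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Historical"/><SP/><String CONTENT="newspapers"/>
    </TextLine>
    <TextLine>
      <String CONTENT="online"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

print(alto_lines(SAMPLE_ALTO))
```

A real ALTO file also carries coordinates for every word, which is what allows a viewer to highlight search hits on the page image; the sketch ignores them.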
11. Or from a different perspective…
[Workflow diagram. Recoverable labels: content sources (In-house, Institutions, Contractors, Harvesting, Grey literature); delivery as TIFF or JP2, by HD & ftp, or as PDF; Goobi (METS/OCR) normalises TIFF to JP2, manually or automatically; Jpylyzer validates JP2; auto harvesting of JP2 & DMD; Preservica; roles: Project Managers, Ingest Officer, Digital Curator; Snagging.]
12. Lesson 1 - Digitisation as a social activity
1. Digitisation is not a technical problem; it’s a social activity between creator & user.
2. Internally: Digitisation engages with all parts of the organisation, & draws on many different skills.
3. Externally: Engaging with (Between…?) creators & users, moving data into public realms, providing access.
http://www.emmanueladegbola.com/networking-leads/
13. Projects & workflows
1. Standardised processes to deal with differences in content & themes.
2. Use ‘projects’ & workflows to define activities & automated steps to handle material from transfer/acquisition to dissemination.
3. Projects & workflows allow us to manage our processes & to report activity.
http://www.amross.sd/
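The idea of a workflow as a fixed sequence of named steps that also records activity for reporting can be sketched as follows. This is an illustration only, not how Goobi is implemented; the step names, the Item class and the placeholder step bodies are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Item:
    """One unit of material moving from acquisition to dissemination."""
    identifier: str
    data: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


# Hypothetical steps; real steps would call out to imaging, OCR and storage.
def normalise(item: Item) -> None:
    item.data["format"] = "JP2"


def ocr(item: Item) -> None:
    item.data["text"] = "placeholder OCR text"


def publish(item: Item) -> None:
    item.data["status"] = "published"


Workflow = List[Tuple[str, Callable[[Item], None]]]

# A workflow is an ordered list of pre-defined, named steps.
WORKFLOW: Workflow = [("normalise", normalise), ("ocr", ocr), ("publish", publish)]


def run_workflow(item: Item, workflow: Workflow) -> Item:
    """Run every step in order and log its name so activity can be reported."""
    for name, step in workflow:
        step(item)
        item.history.append(name)
    return item


done = run_workflow(Item("example-0001"), WORKFLOW)
print(done.history)
```

Because every item passes through the same named steps, reporting reduces to counting entries in the history of each item.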
14. Standardised formats
1. Digitisation process built around a small number of formats.
2. Only accept (or create) TIFF or JPEG2000 image formats for digitisation; MPEG2 for video.
3. Share our JPEG2000 profile with creators & validate images at point of processing.
4. Standardised metadata format(s) for discovery (MARC) & retrieval (ALTO/JSON).
http://blog.absolutvision.com/en/jpeg2000-format/
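As an illustration of accepting only a small set of formats, the sketch below checks the leading bytes of a file against the published signatures for classic TIFF and for the JP2 container. It is a first-line filter under stated assumptions, not the library's validation: it does not check a JPEG2000 profile (that is the job of a validator such as Jpylyzer), it does not cover BigTIFF, and the function names are hypothetical.

```python
# Signatures at the start of the file.
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")      # little-endian, big-endian TIFF
JP2_SIGNATURE = b"\x00\x00\x00\x0cjP  \r\n\x87\n"  # 12-byte JP2 signature box


def sniff_image_format(header: bytes):
    """Return 'JP2', 'TIFF' or None based on the first bytes of a file."""
    if header.startswith(JP2_SIGNATURE):
        return "JP2"
    if header.startswith(TIFF_SIGNATURES):
        return "TIFF"
    return None


def accept_for_digitisation(header: bytes) -> bool:
    """Accept only the two image formats named on the slide."""
    return sniff_image_format(header) is not None


def sniff_file(path: str):
    """Read just enough of a file on disk to identify its format."""
    with open(path, "rb") as handle:
        return sniff_image_format(handle.read(12))


print(sniff_image_format(b"II*\x00\x08\x00\x00\x00"))
print(accept_for_digitisation(b"\x89PNG\r\n\x1a\n"))
```

Rejecting unexpected formats at the point of transfer keeps every later workflow step simple, because each step only ever sees the formats it was built for.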
15. Lesson 2 – It’s a strategic issue
1. Given the scale & complexity, clear strategic direction is essential.
2. Digitisation has to support an institution’s users & their information needs.
3. Digitisation has to be a strategic decision supporting an institution’s purpose.
4. Digitisation doesn’t change the mission of an organisation.
16. Industrialisation of processes
1. Digitisation built around a small number of formats. Workflows built around a small number of pre-defined steps.
2. Common workflow activities mean less system development; we can build our own processes.
3. Easier for humans to learn, less training, more certainty/reliability.
4. Industrialisation supports processes that are sustainable.
http://www.howtobeadad.com/2013/14723/unicorn-poop-how-i-fell-in-love-with-the-daughter-i-never-had
17. Lesson 3 – sustainability or bust
1. Digitisation has to be a sustainable process.
2. Processes have to be scalable to ambition.
3. Design, re-design & review processes constantly & integrate with existing services.
4. Digitisation as evolution: learn from what has been done, apply & move forward.
http://planetivy.com/gaming/25273/natural-selection-2-gaming-evolution-in-action/
18. Automation is key
1. Automation is essential to scalability & efficiency.
2. Within digitisation some activities are very susceptible to automation. Automate them.
3. Automation standardises processes. Good for life cycle management of data.
4. Automated processes maximise investment in digitisation & support scalability.
http://www.technibble.com/automating-computer-business-for-profit/
19. Automated harvesting of IA content
Content processed automatically, including creation of METS & ALTO.
Goobi has a ‘repository’ of IA identifiers for searching/harvesting.
Goobi harvests data from Internet Archive website.
Content available in the player.
Content stored in Preservica.
DDS creates JSON for the player & pre-caches some content.
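The harvesting step can be sketched as below. This is not Goobi's harvester. It assumes the Internet Archive metadata endpoint at https://archive.org/metadata/<identifier> and assumes that a metadata record lists files under a "files" key with a "name" field. The sample record is invented for illustration and the code makes no network request.

```python
from urllib.parse import quote

METADATA_ENDPOINT = "https://archive.org/metadata/"


def metadata_url(identifier: str) -> str:
    """Build the metadata URL for one Internet Archive identifier."""
    return METADATA_ENDPOINT + quote(identifier)


def jp2_files(record: dict) -> list:
    """Pick the JP2 deliverables out of an already-fetched metadata record."""
    names = [entry.get("name", "") for entry in record.get("files", [])]
    return [name for name in names if name.lower().endswith((".jp2", "_jp2.zip"))]


# Hypothetical queue of identifiers, standing in for the 'repository' of
# IA identifiers mentioned on the slide.
QUEUE = ["example-item"]

# Invented record, shaped like the assumed metadata response.
SAMPLE_RECORD = {
    "metadata": {"identifier": "example-item", "title": "Example title"},
    "files": [
        {"name": "example-item.pdf"},
        {"name": "example-item_jp2.zip"},
        {"name": "example-item_meta.xml"},
    ],
}

for identifier in QUEUE:
    print(metadata_url(identifier), jp2_files(SAMPLE_RECORD))
```

In a real harvester the record would be fetched from the URL, and the selected files would then be handed to the same workflow steps as any other incoming material.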
21. Lesson 4: Nothing without imagination
1. The power of digitisation can only be revealed if we can imagine the uses the data can be put to.
2. Digitisation is not an exercise in technology for its own sake.
3. There is nothing that cannot be achieved, but it takes more than kit, tools, computers, software.
4. Digitisation is about engaging with creators & consumers, with the data & with the future.
22. Digitisation is not a separate activity
• Starts with alignment with the institutional mission.
• Builds on strategic vision.
• Digitisation as a strategic activity, planned & supported.
• Integrate all institutional systems, bibliographic, IT & human.
http://ocdindia.com/
23. Lesson 5 – The complete package
1. Digitisation is much more than sticking stuff under a camera or on a scanner.
2. Digitisation has to be developed as a whole & complete end-to-end process.
http://veritusgroup.com/how-to-create-a-dynamic-strategy-for-every-single-donor-a-step-by-step-process/
24. So, lessons learned
• Digitisation is a social activity.
• Digitisation as a planned strategic activity.
• Digitisation has to be a sustainable & scalable activity.
• Automation is key.
• Nothing without imagination.
• Digitisation has to be a complete package.