LIBER is a network of more than 425 European research libraries from over 40 countries. Its activities include scholarly communication, digitisation, and participation in EU projects. The Europeana Newspapers Project aims to aggregate some 18 million digitised newspaper pages from European libraries for the Europeana and The European Library websites. It will refine the texts using OCR, article segmentation (OLR) and named entity recognition, and it will survey existing digitised newspaper collections. The goal is to improve access to, and the searchability of, these historical newspaper pages.
GI2012 pekarek-liber
1. LIBER, Europeana and the
Europeana Newspapers Project
GI2012, Dresden: 18.05.2012
Aleš Pekárek, Association of European Research Libraries, Den Haag, NL
2. What is LIBER?
• the largest network of European research/academic libraries:
more than 425 institutions, from over 40 countries
• LIBER's network is not restricted to the European
Union; it covers the whole of Europe
• LIBER was founded in 1971. Since 2009, LIBER has had its
seat in The Hague (Netherlands)
15/05/2012 / GI2012 2
3. LIBER’s main activities (1)
Strategic Plan 2009 – 2012 focuses on the following areas:
• Scholarly Communication
• Digitisation and Resource Discovery
• Heritage Collections and Preservation
• Organisation and Human Resources
• LIBER Services
Carried out through Steering Committees and Working Groups
4. LIBER’s main activities (2)
Other main activities:
• Participation in EU projects
• Lobbying
• Publications
• Annual Conference
• Networking
5. Participation in EU projects
Example: EUROPEANA LIBRARIES
http://www.europeana-libraries.eu/
5 million digital objects from 19 leading European research
libraries freely accessible on The European Library and
Europeana websites.
LIBER Members are welcome to participate in EU projects!
6. Lobbying - examples
• Copyright
LIBER signs MoU on Out of Commerce Works on behalf of
European Research Libraries
• Digitisation
LIBER’s expression of interest for involvement in the EU Digital
Agenda
• European Research Area
LIBER response to the EC survey on Scientific Information in the
Digital Age
7. LIBER Annual Conference
The largest European conference of research libraries
2012 – Tartu, Estonia
2013 – Munich, Germany
2014 – Varna, Bulgaria
8. LIBER’s main communication channels
Website: www.libereurope.eu
LinkedIn: LIBEReurope
Twitter: @LIBEReurope
FOLLOW LIBER
9. JOIN LIBER!
Become a member
See the following links for the membership benefits and
application form
Any questions or ideas? Contact us at liber@kb.nl
11. EUROPEANA means...
For users:
Europeana is a single access point to millions of books, paintings, films,
museum objects and archival records that have been digitised throughout
Europe. It is an authoritative source of information coming from European
cultural and scientific institutions.
For heritage institutions:
Europeana is an opportunity to reach out to more users, increase their web
traffic, enhance their users' experience and build new partnerships.
For professionals in the heritage sector:
Europeana is a platform for knowledge exchange between librarians,
curators, archivists and the creative industries.
For policy-makers and funders:
Europeana is a prestigious initiative endorsed by the European
Commission, and is a means to stimulate creative economy and promote
cultural tourism.
12. Europeana Newspapers: Aims and Objectives
Europeana Newspapers
• aims at the aggregation and refinement of newspapers for The
European Library and Europeana.
• will use refinement methods for OCR, OLR (article segmentation), and
named entity (NER) and class recognition
• the libraries participating in the project will provide around 18 million
digitised newspaper pages to Europeana
• The project will encourage further libraries to contribute newspapers to
Europeana and The European Library (TEL)
13. Project Profile: Consortium & stakeholders
• 17 partners from 12 countries within the consortium
• National libraries
• University libraries
• SME
• External partners and stakeholders:
• Involvement of libraries outside the project consortium
• Framework:
• Funded as a Best Practice Network in the ICT PSP programme of
the European Commission
• Project Duration: February 2012 – January 2015
15. Consortium Partners
1. Staatsbibliothek zu Berlin (project co-ordinator)
2. National Library of the Netherlands
3. National Library of Estonia
4. Österreichische Nationalbibliothek
5. National Library of Finland
6. Staats- und Universitätsbibliothek Hamburg
7. Bibliothèque nationale de France
8. National Library of Poland
9. University of Salford
10. CCS Content Conversion Specialists GmbH
11. Stichting LIBER
12. National Library of Latvia
13. National Library of Turkey
14. University Library of Belgrade
15. University of Innsbruck
16. Landesbibliothek Dr. Friedrich Tessmann
17. The British Library
16. Project Profile: Objectives
1) Selection, Refinement & Aggregation of content
• Make Europeana the largest provider of pan-European newspaper collections
• Provision of more than 18 million newspaper pages to Europeana, many of
those with full-texts
• Support move from images to texts in Europeana
2) Analysis of existing newspaper collections
• Survey of newspaper holdings in Europe
3) Quality Assurance & Best practice recommendations
• Contribute to optimised workflows and data aggregation infrastructures
• Provide best practice recommendations for digitization, refinement, workflows,
metadata etc. and evaluation tools
4) Presentation and full-text search
• Improve access to newspaper collections within Europeana
17. 1) Selection, Refinement & Aggregation of content
• Aggregation of 18 million pages of digitised newspapers to Europeana and to The European Library
• 8 million pages “as is” (content providers)
• 10 million refined pages: OCR (UIBK, Austria)
• 2 million refined pages: OCR/OLR (article segmentation) (CCS, Germany)
• Analysis of available digital newspaper collections and selection of subsets suitable for refinement
www.europeana.eu/
www.theeuropeanlibrary.org/
18. 1) Refinement – OCR and OLR
• 10 million refined pages: OCR (UIBK, Austria)
• 2 million refined pages: OCR/OLR (article segmentation) (CCS, Germany)
• UIBK enriches the OCR with structural information from their Document Understanding Platform
• CCS produces OCR and verification of column recognition, zoning, article segmentation, and page class recognition
• CCS provides libraries with a client technology for manual correction of recognition and segmentation results
Figure captions:
CCS: Column recognition, article segmentation
UIBK: Detection of headings, footnotes, etc.; table of contents extraction
19. 1) Refinement - Named Entity Recognition
• KB provides named entity recognition (NER) for material in up to
three languages (Dutch, English, and German)
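To illustrate what NER adds to an OCR text, here is a minimal sketch based on dictionary lookup. It is not the method used by KB or the project, which the slides do not describe; a production system would normally use a trained statistical model. The gazetteer entries, the labels and the function name `tag_entities` are all assumptions made for this example.

```python
import re

# Hypothetical gazetteer for illustration only; a real NER system would use
# a trained model rather than a fixed list of names.
GAZETTEER = {
    "Den Haag": "LOCATION",
    "Berlin": "LOCATION",
    "Staatsbibliothek zu Berlin": "ORGANISATION",
}


def tag_entities(text, gazetteer=GAZETTEER):
    """Return (surface form, label, start offset) for every gazetteer hit."""
    hits = []
    # Try longer names first so "Staatsbibliothek zu Berlin" wins over "Berlin".
    for name in sorted(gazetteer, key=len, reverse=True):
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            start, end = match.span()
            # Skip matches that overlap an already accepted, longer entity.
            if any(s < end and start < e for _, _, s, e in hits):
                continue
            hits.append((name, gazetteer[name], start, end))
    # Report entities in reading order.
    return sorted(((n, l, s) for n, l, s, _ in hits), key=lambda h: h[2])


sentence = "The Staatsbibliothek zu Berlin and the library in Den Haag"
for surface, label, offset in tag_entities(sentence):
    print(offset, label, surface)
```

The recognised entities can then be stored alongside each page so that a search interface can filter by person, place or organisation.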
20. 2) Analysis of existing digitised newspaper collections
• Project partners and others will be contacted until summer 2012 to
analyse the extent of digitised newspapers collections at their institutions
• Results will be embedded in “Zeitschriftendatenbank” of
Staatsbibliothek zu Berlin (Union Catalogue of Serials)
• Potential new partners for the extension of the network will be
identified through the survey
• May also be useful to judge technical status of digitised data and as part
of gathering descriptive metadata
• If you hold a digital newspaper collection and would like to participate in the
survey, please contact: survey@europeana-newspapers.eu
21. 3) Analysis of work & Best Practice Recommendations
• Analysis of metadata formats in use by libraries in digitisation projects
• Align metadata models with the METS/ALTO standard and release best
practice recommendations on how to apply these formats in newspaper
digitisation and refinement
• Usability of the recommendation will be tested through an evaluation
cycle
• Provide recommendations on best practices for refinement of digitized
newspaper collections for Europeana
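As a rough illustration of what ALTO-encoded OCR looks like to a consumer, the sketch below reads the text lines from an ALTO document using only the Python standard library. The sample XML is invented for this example and is not project data; the helper names `local_name` and `alto_lines` are assumptions. It relies only on the ALTO convention that recognised words are `String` elements carrying a `CONTENT` attribute inside `TextLine` elements, and it ignores the namespace so that it is not tied to one ALTO version.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(alto_xml):
    """Return the text of each TextLine, joining its String CONTENT values."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [child.get("CONTENT", "") for child in element
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return lines


# Invented sample for illustration; real ALTO files also carry coordinates,
# confidence values and layout information.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="Europeana"/><SP/><String CONTENT="Newspapers"/></TextLine>
    <TextLine><String CONTENT="Den"/><SP/><String CONTENT="Haag"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

print(alto_lines(SAMPLE))
```

Keeping the OCR text in a shared format of this kind is what lets an aggregator index pages from many libraries with one piece of code.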
22. 4) Presentation & Access to full-texts
• Within the lifetime of the project, a content browser
will be built within the TEL portal so that users can …
• Search full text, e.g.
• by search term,
• by named entities
• by collections of newspapers
• by date ….
• See newspaper images
• Be linked to relevant library sources
• This browser will be built in TEL during the project
and exported to Europeana after the project
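The search options listed on this slide (search term, named entity, collection, date) can be sketched with a small in-memory inverted index. This is only an illustration of the idea, not the design of the TEL content browser, which the slides do not specify; the `Page` and `PageIndex` classes, their fields and the sample pages are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Page:
    page_id: str
    collection: str                    # newspaper title or collection name
    published: date
    text: str                          # OCR full text
    entities: frozenset = frozenset()  # named entities found on the page


class PageIndex:
    """Tiny in-memory inverted index: token -> ids of pages containing it."""

    def __init__(self):
        self._pages = {}
        self._postings = defaultdict(set)

    def add(self, page):
        self._pages[page.page_id] = page
        for token in page.text.lower().split():
            self._postings[token.strip(".,;:!?")].add(page.page_id)

    def search(self, term, entity=None, collection=None, start=None, end=None):
        """Return sorted ids of pages containing `term` that pass every filter."""
        hits = []
        for page_id in sorted(self._postings.get(term.lower(), ())):
            page = self._pages[page_id]
            if entity is not None and entity not in page.entities:
                continue
            if collection is not None and page.collection != collection:
                continue
            if start is not None and page.published < start:
                continue
            if end is not None and page.published > end:
                continue
            hits.append(page_id)
        return hits


# Invented sample pages for illustration.
index = PageIndex()
index.add(Page("p1", "Gazette A", date(1914, 8, 1),
               "War declared in Europe.", frozenset({"Europe"})))
index.add(Page("p2", "Gazette B", date(1920, 5, 3),
               "Peace and trade in Europe", frozenset({"Europe"})))

print(index.search("europe"))
print(index.search("europe", end=date(1918, 12, 31)))
```

A real portal would use a dedicated search engine and language-aware tokenisation, but the filters map onto the same page-level metadata: collection, date and the entities produced by NER.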
23. 5) Dissemination
• Objectives:
• Establishing publicity for the project
• Increasing usage of Europeana
• Awareness raising among target groups
• Tasks:
1. Media Communication
2. Workshops and conferences
• Three main dissemination workshops
• National information days
• Network extension
3. Exploitation
24. Check it out!
• http://www.europeana-newspapers.eu/ WEBSITE
• http://www.facebook.com/EuropeanaNewspapers FB SITE
• http://www.linkedin.com/groups?gid=4425919 LinkedIn
25. Thank you for your attention!
Aleš Pekárek, LIBER
ales.pekarek@kb.nl
www.europeana-newspapers.eu/