This document summarizes Max Kaiser's presentation on the Austrian National Library's digitization partnership with Google. Some key points:
1. The partnership involves digitizing 600,000 volumes from the library's historical book holdings from the 16th to late 19th century.
2. It is the largest public-private partnership in the Austrian cultural sector. Google handles the scanning and OCR while the library provides metadata and ensures quality control.
3. Kaiser discusses some of the challenges involved, such as the extensive logistics and preparation needed to process the large number of books while ensuring their preservation.
Max Kaiser (Austrian National Library, Vienna) on "Austrian Books Online. A Public Private Partnership between the Austrian National Library and Google", a talk held on 28 April 2015 at the international conference "Archival Cooperation and Community Building in the Digital Age", within the panel "Public Private Partnerships: beneficial all around?", at Břevnov Archabbey in Prague (CZ).
Austrian Books Online. The Austrian National Library's Large-Scale Digitisation Public-Private Partnership with Google
1. @maxkaiser
Austrian Books Online
The Austrian National Library’s
large-scale digitisation public-private partnership
with Google
Max Kaiser
Head R&D, Austrian National Library
Library Science Talk
Geneva, 15 October 2012
Bern, 16 October 2012
13. @maxkaiser
→ Picture Archives and Graphics Department
→ Map Department
→ Music Department
→ Literary Archives
→ Papyri Department
→ Department of Planned Languages
→ Department of Rare Books and Manuscripts
43. @maxkaiser
Vision 2025: Knowledge for the world of tomorrow
Our holdings are digitized
We collect and sustain knowledge
Access to our knowledge is simple
With us, research is more faceted and effective
We enrich cultural and social life
44. @maxkaiser
→ substantial parts of holdings digitized
→ cooperation with private partners
→ full text search
→ added-value services like semantic search
→ unified access system
Our holdings are digitized
45. @maxkaiser
→ focal point of collection policy is digital
→ preference for digital versions of publications
→ user generated content and social networks
→ digital photography
→ preservation of analogue and digital collections
→ scalable digital archive
We collect and sustain knowledge
46. @maxkaiser
→ unified access system for all collections
→ focus of cataloguing: metadata enrichment
→ linking of metadata with external resources
→ open data
→ APIs and support for third party apps
Access to our knowledge is simple
47. @maxkaiser
→ integration of digital content in virtual research environments
→ support for digital humanities
→ strong research collections and libraries
→ cooperation with universities and research centres
With us, research is more faceted and effective
48. @maxkaiser
→ digital services, reading rooms and museums
→ innovative interfaces
→ mobile services
→ cooperation with private partners: reuse of data for innovative services
→ reinforce library as social space
We enrich cultural and social life
60. @maxkaiser
≠ service contract or service outsourcing
→ long duration of the relationship
→ substantial investment by private partner
→ distribution of risks
61. @maxkaiser
rationales for PPPs
→ private funding for public sector
→ benefit from know-how and working methods of the private sector
→ but not a „miracle solution“ for the public sector
(EC Green Paper on Public Private Partnerships, 2004)
63. @maxkaiser
objectives for public partners
→funding for digitisation
→enhanced access
→engaging new audiences
→access to technology
→access to private sector competencies
→commercial income through user fees,
royalties or revenue share
→lobbying effort to increase public funding
64. @maxkaiser
objectives for private partners
→commercial objectives
→access to new markets or customer groups
→association with strong public brands
→access to (rare, unique) content
→corporate social responsibility
65. @maxkaiser
benefits for citizens
→increased online access
→democratisation of access to knowledge
→added-value services
→benefit for learning and tourism
→new creative endeavours
67. @maxkaiser
„Stimulating the flow of private funds
for the digitisation of cultural assets through
equitable public private partnerships
appears as a viable and sustainable way
of tackling the pressing question
of making Europe’s cultural wealth
accessible online and preserving it
for future generations.“
68. @maxkaiser
„The key question is not
whether public-private
partnerships for digitisation
should be encouraged, but
how, and under which
conditions.“
70. @maxkaiser
„(...) recommends that Member States (...)
encourage partnerships between cultural
institutions and the private sector in
order to create new ways of funding
digitisation of cultural material and to
stimulate innovative uses of the material,
while ensuring that public private
partnerships for digitisation are fair and
balanced (…).“
71. @maxkaiser
key principles:
1. respect for intellectual property rights
→ ONB-Google: only public-domain works
digitised
2. non-exclusivity
→ ONB-Google: ONB free to digitise material
with other partners
3. transparency of the process
→ ONB-Google: public tender
72. @maxkaiser
key principles:
4. transparency of agreements
→ ONB-Google: Very detailed FAQs online
5. accessibility through Europeana
→ ONB-Google:
→ all files available for non-commercial use
→ access via platforms like Europeana
→ provision to research partners
6. key criteria
→ [Next slide]
73. @maxkaiser
key criteria for assessing PPPs
→ total investment by private partner / effort of
public partner
→ (free) access to material for general public,
including through Europeana
→ cross-border access
→ length of any period of preferential commercial
use by private partner
→ quality of digital copies for public partner
→ usage conditions for public partner in non-
commercial context
→ time-scale of project
74. @maxkaiser
additional key elements in
ONB-Google cooperation:
→selection of books by library
→Institute for Conservation involved
→termination
76. @maxkaiser
„Genuine PPPs currently not a widespread
method for financing digitisation by cultural
institutions in Europe.“
Commission Staff Working Paper Accompanying the document Commission Recommendation
on the digitisation and online accessibility of cultural material and digital preservation, p18
http://ec.europa.eu/information_society/activities/digital_libraries/doc/recommendation/recom28nov_all_versions/staff_working_paper.pdf
77. @maxkaiser
aim to maximize access
and re-use via digitisation
access restrictions /
re-use limitations in PPPs
79. @maxkaiser
Cultural Commons
→Body of work freely available to the public for
legal use, sharing, repurposing, and remixing
→Source for cultural creativity
→http://creativecommons.org/culture
83. @maxkaiser
Public Domain Mark
„This work has been identified
as being free of known
restrictions under copyright
law, including all related and
neighbouring rights.
You can copy, modify,
distribute and perform the
work, even for commercial
purposes, all without asking
permission.“
http://creativecommons.org/publicdomain/mark/1.0/
84. @maxkaiser
Public Domain Charter
„Public-Private Partnerships have become one
option for funding large scale digitisation efforts.
Commercial content aggregators pay for the
digitisation in exchange for privileged access to the
digitised collections. These activities are seen as a
reason for attempting to exercise as much control as
possible over digital reproductions of Public Domain
works. Organisations are claiming exclusive rights in
digitised versions of Public Domain works and are
entering into exclusive relationships with commercial
partners that hinder free access.”
87. @maxkaiser
PSI Directive
→EC “Directive on the Re-Use of Public Sector
Information” (31 Dec. 2003)
→aim: Foster re-use of PSI
→legally binding document
→implemented by all Member States in 2008
→currently: Cultural & research institutions
excluded from directive
88. @maxkaiser
key provisions of PSI Directive
→clear procedures for re-use requests
→upper limit for charging
→transparency of conditions and standard
charges for re-use
→avoid discrimination between players
→prohibition of exclusive agreements
90. @maxkaiser
proposed changes
→withdraw current exemption for cultural
institutions
→restrict public sector bodies to only apply
charges for re-use based on marginal
costs
→exemption for libraries, archives, museums
→prohibit agreement of terms for re-use
which grant exclusive rights to any one
party
99. @maxkaiser
Austrian National Library:
→ provision of metadata
→ selection
→ internal logistics
→ conservational assessment
→ barcoding
→ metadata adjustments
→ data download and control
→ data storage & digital preservation
→ Digital Library
156. @maxkaiser
digitisation
→ scanning center in Germany
→ procedures agreed
→ Austrian Federal Office for Monuments involved
→ each volume checked after return
→ books unavailable to users for ~ 3 months
168. @maxkaiser
quality control
→goal: Automated jobs
→representative samples
→IT assisted discovery of error clusters
→error candidates checked manually
→detect systematic
and critical errors
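A minimal sketch of how sampling and error-cluster discovery could fit together. It assumes each automatically checked page carries a batch label and an error flag; the function names, the batch labels and the 20% threshold are illustrative assumptions, not part of the ONB workflow described on the slide.

```python
import random
from collections import defaultdict

def draw_sample(volume_ids, sample_size, seed=0):
    """Draw a simple random sample of volumes for checking."""
    rng = random.Random(seed)
    ids = list(volume_ids)
    return rng.sample(ids, min(sample_size, len(ids)))

def error_clusters(checks, threshold=0.2):
    """Return batches whose share of flagged pages exceeds the threshold.

    checks: iterable of (batch_id, has_error) pairs from automated jobs.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for batch_id, has_error in checks:
        totals[batch_id] += 1
        if has_error:
            errors[batch_id] += 1
    return sorted(b for b in totals if errors[b] / totals[b] > threshold)

# Hypothetical check results: batch-A has 1 flagged page in 10, batch-B 3 in 5.
checks = [("batch-A", False)] * 9 + [("batch-A", True)] \
       + [("batch-B", True)] * 3 + [("batch-B", False)] * 2
sample = draw_sample(range(100), 10)
candidates = error_clusters(checks)
print(candidates)
```

Batches returned by error_clusters would be the error candidates handed over for manual checking.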
169. @maxkaiser
error model
→ level 1: data / information
→ image (thick, broken)
→ illustration (scanner effects, tone, color etc)
→ full-text (OCR errors per page-image)
→ level 2: entire page
→ blur / warp / skew
→ cropping
→ obscure / cleaned
→ colorization
→ full-text (OCR error patterns at page level)
Informed by „Validating Quality in
Large-Scale Digitization“ project
of Univ. of Michigan & Univ. of Minnesota,
http://hathitrust-quality.projects.si.umich.edu/
170. @maxkaiser
error model
→ level 3: whole volume
→ order of pages
→ missing pages
→ duplicate pages
→ false pages
→ full text (OCR error patterns at volume level)
Informed by „Validating Quality in
Large-Scale Digitization“ project
of Univ. of Michigan & Univ. of Minnesota,
http://hathitrust-quality.projects.si.umich.edu/
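Some of the level 3 checks can be sketched as follows, assuming the printed page number of every scanned page is already known (for example from OCR), which the slides do not state. The function name and the report layout are hypothetical, and "false pages" are not covered.

```python
from collections import Counter

def check_volume(page_numbers):
    """Report volume-level problems in a sequence of printed page numbers.

    page_numbers: printed page numbers in scan order (assumed to be known).
    """
    if not page_numbers:
        return {"duplicate": [], "missing": [], "out_of_order": []}
    counts = Counter(page_numbers)
    # Numbers that occur more than once point to duplicate pages.
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    # Numbers absent between the lowest and highest page point to gaps.
    expected = range(min(page_numbers), max(page_numbers) + 1)
    missing = [n for n in expected if n not in counts]
    # A page is out of order if it is numbered lower than its predecessor.
    out_of_order = [
        cur for prev, cur in zip(page_numbers, page_numbers[1:]) if cur < prev
    ]
    return {"duplicate": duplicates, "missing": missing, "out_of_order": out_of_order}

report = check_volume([1, 2, 3, 3, 5, 4, 7])
print(report)
```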
171. @maxkaiser
use cases
→reading online images
→printing on demand
→processing full text data
→managing collections
Informed by „Validating Quality in
Large-Scale Digitization“ project
of Univ. of Michigan & Univ. of Minnesota,
http://hathitrust-quality.projects.si.umich.edu/
186. hadoop / map reduce
→ MASTER: Job Tracker, Name Node
→ SLAVE 1, SLAVE 2, … SLAVE n: Task Tracker and Data Node on each
→ Hadoop Distributed File System (HDFS)
→ experimental 5 server cluster at ONB:
→ 40 cores in total
→ 30 cores assigned to task trackers
187. @maxkaiser
use case 1: duplicate pages
in one book
→books with duplicated pages
→due to scanning process & post processing
→use key points of images to determine
structural image similarity
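A minimal sketch of the matching step behind key-point-based duplicate detection. It assumes every page has already been reduced to a list of binary key point descriptors (represented here as integers) by some feature detector; the slide names neither the detector nor the thresholds, so max_distance, threshold and all function names are illustrative.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def match_ratio(desc_a, desc_b, max_distance=2):
    """Share of descriptors in desc_a with a close match in desc_b."""
    if not desc_a or not desc_b:
        return 0.0
    matched = sum(
        1 for d in desc_a if min(hamming(d, e) for e in desc_b) <= max_distance
    )
    return matched / len(desc_a)

def find_duplicate_pages(pages, threshold=0.8):
    """Return index pairs (i, j), i < j, of pages that look alike."""
    pairs = []
    for i in range(len(pages)):
        for j in range(i + 1, len(pages)):
            # Require a high match ratio in both directions.
            score = min(match_ratio(pages[i], pages[j]),
                        match_ratio(pages[j], pages[i]))
            if score >= threshold:
                pairs.append((i, j))
    return pairs

# Hypothetical descriptors for three pages of one book.
pages = [
    [0b10110010, 0b01101100, 0b11110000],
    [0b00001111, 0b01010101, 0b10011001],
    [0b10110011, 0b01101100, 0b11110000],  # page 0 again, one bit flipped
]
duplicates = find_duplicate_pages(pages)
print(duplicates)
```

The pairwise loop grows quadratically with the number of pages, which is one reason to run such a job per book on the cluster.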
190. @maxkaiser
use case 2: book comparison
based on image similarity
→different instances of one book, coming
e.g. from different downloads of one book
at different points in time
→book similarity measure
→based on comparison of book page images
from two different book instances
191. use case 2: book comparison
based on image similarity
measure for book similarity
based on book page image
similarity
helps find prominent
changes in book re-downloads
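One way such a book similarity measure could be built from page similarities is sketched here. The slides do not define the measure, so this is an assumption: each page is represented by a hypothetical signature (a set of feature ids), pages are compared position by position with Jaccard similarity, and the book score is the mean over the longer instance.

```python
def page_similarity(sig_a, sig_b):
    """Jaccard similarity of two page signatures (sets of feature ids)."""
    if not sig_a and not sig_b:
        return 1.0
    return len(sig_a & sig_b) / len(sig_a | sig_b)

def book_similarity(book_a, book_b):
    """Mean similarity of pages at the same position in both instances.

    Pages present in only one instance count as similarity 0.
    """
    longest = max(len(book_a), len(book_b))
    if longest == 0:
        return 1.0
    total = sum(page_similarity(a, b) for a, b in zip(book_a, book_b))
    return total / longest

# Two hypothetical downloads of the same book; page 2 changed slightly.
first_download = [{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10}]
second_download = [{1, 2, 3, 4}, {5, 6, 7}, {9, 10}]
score = book_similarity(first_download, second_download)
print(round(score, 3))
```

A score well below 1.0 between two downloads of the same volume would mark it for a closer look.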
192. @maxkaiser
large scale document processing
→extract image metadata using Exiftool
→large scale batch processing using Apache
Hadoop Streaming API
→bash script using Exiftool is executed on the
cluster
→book page image data is accessible from
each node of the cluster
→parallelisation of batch processing
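The slide describes a bash script; the same idea is sketched here in Python as a Hadoop Streaming mapper, which is a substitution and not the ONB script. It assumes each input line is the path of one page image reachable from every node and that the exiftool command is installed there. The function names and the sample path are hypothetical.

```python
import json
import subprocess
import sys

def run_exiftool(path):
    """Call the exiftool CLI and return its JSON output as text."""
    result = subprocess.run(
        ["exiftool", "-json", path], capture_output=True, text=True, check=True
    )
    return result.stdout

def map_line(line, runner=run_exiftool):
    """Turn one input line (an image path) into a key<TAB>value record."""
    path = line.strip()
    if not path:
        return None
    metadata = json.loads(runner(path))[0]  # exiftool -json returns a list
    return f"{path}\t{json.dumps(metadata, sort_keys=True)}"

def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hadoop Streaming feeds input records on stdin, one per line.
    for line in stdin:
        record = map_line(line)
        if record is not None:
            stdout.write(record + "\n")

# Local demonstration with a stand-in for exiftool (no cluster needed).
def fake_exiftool(path):
    return json.dumps([{"SourceFile": path, "ImageWidth": 2480}])

record = map_line("/data/book1/0001.jp2\n", runner=fake_exiftool)
print(record)
```

On a cluster the script would end with a call to main() and be passed to Hadoop Streaming through its -mapper option.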
194. @maxkaiser
large scale document processing
→ small files (TXT, HTML) stored in HDFS
→ files of each file type stored as one big file
(SequenceFile)
→ principle: store once in HDFS and read many times
→ example:
→ storing OCR results of 24 million pages (ca. 60,000
books): reading data from file server and storing on
cluster takes more than 1 day
→ subsequent processing of a Map/Reduce job (calculate
average block width) takes 6 hours
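The shape of an "average block width" job can be sketched as a map step and a reduce step, run locally here without Hadoop. It assumes the OCR HTML carries hOCR-style bounding boxes (title="bbox x0 y0 x1 y1") and that the average is taken per book; neither is stated on the slide, and the function names and sample pages are hypothetical.

```python
import re
from collections import defaultdict

# hOCR-style bounding box: "bbox x0 y0 x1 y1" (assumed input format).
BBOX = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

def map_page(book_id, page_html):
    """Emit (book_id, block_width) for every bounding box on a page."""
    for x0, _y0, x1, _y1 in BBOX.findall(page_html):
        yield book_id, int(x1) - int(x0)

def reduce_widths(pairs):
    """Average the emitted widths per book id."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for book_id, width in pairs:
        sums[book_id] += width
        counts[book_id] += 1
    return {book_id: sums[book_id] / counts[book_id] for book_id in sums}

# Hypothetical OCR pages keyed by book id.
pages = [
    ("book-1", '<div class="ocr_carea" title="bbox 100 50 900 400"></div>'),
    ("book-1", '<div class="ocr_carea" title="bbox 120 60 720 300"></div>'),
    ("book-2", '<div class="ocr_carea" title="bbox 0 0 500 100"></div>'),
]
pairs = [pair for book_id, html in pages for pair in map_page(book_id, html)]
averages = reduce_widths(pairs)
print(averages)
```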
203. @maxkaiser
outlook
→ full-text: new possibilities for research
→ data enrichment
→ named entity recognition
→ linked data
→ new data centric research in the Humanities
& Social Sciences
→ http://www.diggingintodata.org/
205. @maxkaiser
DM2E
→http://dm2e.eu/
→European Commission co-funded project
→stimulate creation of new tools and
services for re-use of Europeana data in
the Digital Humanities
→implementation of semantic annotation
tool
→Austrian Books Online data part of the
project
206. @maxkaiser
next steps
→80,000 books already accessible via
Google Books
→Spring 2013: launch of Austrian Books
Online Viewer
→full text search