Central Registry for Digitized Objects:
Linking Production and Bibliographic Control

Ralf Stockmann
Göttinger Digitization Center
As things are now
• Huge ventures in
  – Digitization
     •   Google
     •   Microsoft
     •   National programs
     •   Local centers
  – Accessibility
     •   World Digital Library
     •   European Digital Library
     •   National portals
     •   Google Book Search
As things are now
• We are only at the dawn of mass digitization
  – Leaving behind the stage of manual craft production
  – Entering industrialization
  – Scanning Robots
  – Accessible Full Text (OCR)
Lack of …
• Coordination in digitization activities
   – Who scans what, where, when, in which quality, and how will it be accessible?
      • How is “quality” defined?
      • Do we agree on “what”?
Facing the Consequences
[Figure: Costs / Value (vertical axis) against Number of digitized items per volume (horizontal axis). Labels in the chart: Technical Improvements, Costs, Waste of Resources, Additional Benefit.]
The Solution
• Central registry for digitized objects
• Focused on the production context (no user
  frontend)
• API driven
  – Application Programming Interface
  – Query / Ingest
  – Simple integration into existing workflow tools
• Batch mode (lists)
• Open Source / free service
• Matching on volume level
  – Score / probability
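The query / ingest idea with a score per match could look roughly like the sketch below. It is only an illustration under stated assumptions: the class `Registry`, the record `Volume`, the field weights, and the threshold are all hypothetical and not taken from the slides, and a real service would sit behind a network API rather than in memory.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Volume:
    """One physical volume; field names are illustrative assumptions."""
    title: str
    author: str
    year: int
    institution: str
    status: str  # e.g. "digitized", "in progress", "intended"


class Registry:
    """Toy in-memory stand-in for the central registry (hypothetical)."""

    def __init__(self) -> None:
        self._volumes: list[Volume] = []

    def ingest(self, volume: Volume) -> None:
        """Register a volume that is digitized, in progress, or intended."""
        self._volumes.append(volume)

    def query(self, title: str, author: str, year: int,
              threshold: float = 0.8) -> list[tuple[float, Volume]]:
        """Return (score, volume) pairs at or above the threshold, best first."""
        hits = []
        for volume in self._volumes:
            score = self._score(title, author, year, volume)
            if score >= threshold:
                hits.append((score, volume))
        return sorted(hits, key=lambda hit: hit[0], reverse=True)

    def query_batch(self, items):
        """Batch mode: one result list per (title, author, year) tuple."""
        return [self.query(*item) for item in items]

    @staticmethod
    def _score(title: str, author: str, year: int, volume: Volume) -> float:
        # Assumed weights: title 0.5, author 0.3, exact year 0.2.
        t = SequenceMatcher(None, title.lower(), volume.title.lower()).ratio()
        a = SequenceMatcher(None, author.lower(), volume.author.lower()).ratio()
        y = 1.0 if year == volume.year else 0.0
        return 0.5 * t + 0.3 * a + 0.2 * y


registry = Registry()
registry.ingest(Volume("Faust. Eine Tragödie", "Goethe, Johann Wolfgang von",
                       1808, "Example Library", "digitized"))
# A workflow tool would ask before scanning its own copy:
matches = registry.query("Faust. Eine Tragödie",
                         "Goethe, Johann Wolfgang von", 1808)
```

A workflow tool would call `query` (or `query_batch` for a whole list) before scanning and `ingest` once a volume is planned or finished; the score leaves the final decision to the project.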
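Implementation
[Diagram, layers from top to bottom]
• Backend Services: EROMM / EDL / OCLC / …
• Registry / Metadata Store
• Aggregator / Normalizer / Mapping
• API
  – Query
  – Ingest
• Sources connected through the API (marked “? ? ?” and “! ! !” in the diagram)
  – Present Collections
  – Running Project
  – Notice of Intent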
Metadata Store
• Bibliographic
  – Title
  – Author
  – Date
  – Place of publication
  – Number of Pages (?)
  – Language
  – Print / Format
  – Edition
• Technical
  – Resolution
  – Color depth
  – File type / compression
• Accessibility
  – Institution
  – Persistent identifier
  – Rights
  – URL
• Status
  – Digitized
  – In Progress
  – Intended (Timeline?)
  – Requested?
• Role of each group
  – Bibliographic: Matching / Score (“what”)
  – Technical and Accessibility: Additional Judging (“who, where, which quality, how accessible”)
  – Status: Decisive Factor (“when”)
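One possible way to express this metadata store as a record is sketched below. The class names, attribute names, units, and the JSON shape are assumptions made for illustration; the slides only list the fields, not a schema.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    """The four states listed on the slide."""
    DIGITIZED = "digitized"
    IN_PROGRESS = "in progress"
    INTENDED = "intended"
    REQUESTED = "requested"


@dataclass
class Bibliographic:
    # Used for matching / scoring: "what"
    title: str
    author: str
    date: str
    place_of_publication: str
    language: str
    print_format: str
    edition: str
    pages: Optional[int] = None  # marked "(?)" on the slide


@dataclass
class Technical:
    # Part of the additional judging: "which quality"
    resolution_dpi: int          # unit is an assumption
    color_depth_bits: int        # unit is an assumption
    file_type: str


@dataclass
class Accessibility:
    # Part of the additional judging: "who, where, how accessible"
    institution: str
    persistent_identifier: str
    rights: str
    url: str


@dataclass
class RegistryRecord:
    bibliographic: Bibliographic
    technical: Technical
    accessibility: Accessibility
    status: Status               # the decisive factor: "when"

    def to_json(self) -> str:
        # Enum members are not JSON serializable, so emit their value.
        return json.dumps(asdict(self), default=lambda o: o.value,
                          ensure_ascii=False)


record = RegistryRecord(
    Bibliographic("Example Title", "Example Author", "1850", "Göttingen",
                  "de", "print", "1st"),
    Technical(400, 24, "TIFF (uncompressed)"),
    Accessibility("Example Library", "example-id-0001", "public domain",
                  "https://example.org/objects/0001"),
    Status.IN_PROGRESS,
)
payload = record.to_json()
```

Keeping the four groups as separate types mirrors their different roles: only the bibliographic part feeds the match score, while the other groups inform the decision afterwards.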
Obstacles
• (open source) Tools for automated matching /
  scoring?
• Interface for manual comparison / decision making
• Multivolume works: low rate of uniformity (nearly
  50% of the physical SUB stock from before 1900)
• Unicode
• Transliteration tables
• Randomly bound books
• Reliable identifier
   – ISBN for old books?

• Anticipated rate of accuracy: 50 – 70 %
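The Unicode and transliteration obstacles can be illustrated with a small normalization step applied before any matching. This is a minimal sketch using only the Python standard library; the function name `normalize` and the tiny `TRANSLIT` table are hypothetical, and a real registry would need proper transliteration tables per script and cataloguing rule.

```python
import re
import unicodedata

# Toy transliteration table for letters that have no Unicode decomposition.
# The entries are illustrative assumptions, not a cataloguing standard.
TRANSLIT = str.maketrans({"æ": "ae", "œ": "oe", "ø": "o", "ł": "l"})


def normalize(text: str) -> str:
    """Reduce a title or name to a comparable ASCII-like key."""
    # casefold() lowercases and also maps "ß" to "ss".
    text = text.casefold().translate(TRANSLIT)
    # NFKD splits accented letters into base letter plus combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    # Keep word characters only and collapse whitespace and punctuation.
    return " ".join(re.findall(r"\w+", stripped))


key_a = normalize("Faust: Eine Tragödie.")
key_b = normalize("FAUST   eine Tragodie")
same = key_a == key_b
```

Stripping diacritics trades precision for recall: "Göttingen" and "Gottingen" collapse to the same key, which suits a registry that only needs a probability, not a library-grade match.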
Appreciation of Values
• The goal is NOT to build a reliable database in terms of
  library standards
• But to prevent further waste of resources.
• If we manage to achieve just 50% precision,
  we will have saved at least 50% of the funding!
Work Packages
• Define metadata model
• Set up database
• Implement mapping tools
• Define API calls
• Implement API
• Build some connectors to popular mass digitization workflow
  tools (e.g. “Goobi”)
• Establish ISBN workflow
• Harvest existing sources
• Start with a community of current projects

• Get some (!) funding
• Estimated schedule: 6 months
Thank You
(stockmann@uni-goettingen.de)

More Related Content

Viewers also liked

Das materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen WeltDas materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen WeltRalf Stockmann
 
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008Ralf Stockmann
 
DFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCRDFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCRRalf Stockmann
 
eAqua und europeana4D - 2009
eAqua und europeana4D - 2009eAqua und europeana4D - 2009
eAqua und europeana4D - 2009Ralf Stockmann
 
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008Ralf Stockmann
 
Visualisierung bibliographischer Daten
Visualisierung bibliographischer DatenVisualisierung bibliographischer Daten
Visualisierung bibliographischer DatenRalf Stockmann
 
GUI-Mockups in der Softwareentwicklung
GUI-Mockups in der SoftwareentwicklungGUI-Mockups in der Softwareentwicklung
GUI-Mockups in der SoftwareentwicklungRalf Stockmann
 
maple , part2
maple , part2maple , part2
maple , part2ahamidp
 
Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12canolfanbedwyr
 
Fireside Chats
Fireside ChatsFireside Chats
Fireside Chatscrissy3258
 
Lecture04- Use Case Diagrams
Lecture04- Use Case DiagramsLecture04- Use Case Diagrams
Lecture04- Use Case Diagramsartgreen
 
Out of comfort zone, into the adventure
Out of comfort zone, into the adventure Out of comfort zone, into the adventure
Out of comfort zone, into the adventure Adalberto Geradini
 
Il processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda SanitariaIl processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda SanitariaAdalberto Geradini
 
C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!Adalberto Geradini
 
The Genocide In Rwanda
The Genocide In RwandaThe Genocide In Rwanda
The Genocide In Rwandabpersett
 
Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?Adalberto Geradini
 
Gli s-vantaggi della relazione
Gli s-vantaggi della relazioneGli s-vantaggi della relazione
Gli s-vantaggi della relazioneAdalberto Geradini
 

Viewers also liked (20)

Das materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen WeltDas materielle Objekt in der digitalen Welt
Das materielle Objekt in der digitalen Welt
 
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
Deutsche Digitale Bibliothek - Vorstellung CeBit 2008
 
DFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCRDFG Expertenworkshop - Workflow Volltextgenerierung über OCR
DFG Expertenworkshop - Workflow Volltextgenerierung über OCR
 
eAqua und europeana4D - 2009
eAqua und europeana4D - 2009eAqua und europeana4D - 2009
eAqua und europeana4D - 2009
 
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
Ist Langzeitarchivierung finanzierbar? Präsentation Akademie Sankelmark 2008
 
Visualisierung bibliographischer Daten
Visualisierung bibliographischer DatenVisualisierung bibliographischer Daten
Visualisierung bibliographischer Daten
 
GUI-Mockups in der Softwareentwicklung
GUI-Mockups in der SoftwareentwicklungGUI-Mockups in der Softwareentwicklung
GUI-Mockups in der Softwareentwicklung
 
maple , part2
maple , part2maple , part2
maple , part2
 
Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12Lansio Cysill Ar Lein12
Lansio Cysill Ar Lein12
 
Fireside Chats
Fireside ChatsFireside Chats
Fireside Chats
 
Cyflwyniad Bloc
Cyflwyniad BlocCyflwyniad Bloc
Cyflwyniad Bloc
 
Lecture04- Use Case Diagrams
Lecture04- Use Case DiagramsLecture04- Use Case Diagrams
Lecture04- Use Case Diagrams
 
Out of comfort zone, into the adventure
Out of comfort zone, into the adventure Out of comfort zone, into the adventure
Out of comfort zone, into the adventure
 
Il processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda SanitariaIl processo di cambiamento in un'Azienda Sanitaria
Il processo di cambiamento in un'Azienda Sanitaria
 
Visioning the vision
Visioning the visionVisioning the vision
Visioning the vision
 
C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!C'è un nuovo mondo del lavoro ?!
C'è un nuovo mondo del lavoro ?!
 
The Genocide In Rwanda
The Genocide In RwandaThe Genocide In Rwanda
The Genocide In Rwanda
 
Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?Perchè qualcuno dovrebbe darti un lavoro ?
Perchè qualcuno dovrebbe darti un lavoro ?
 
Gli s-vantaggi della relazione
Gli s-vantaggi della relazioneGli s-vantaggi della relazione
Gli s-vantaggi della relazione
 
Be unique
Be unique Be unique
Be unique
 

Similar to Central Registry for Digitized Objects Links Production and Bibliographic Control

Workflows in the Virtual Observatory
Workflows in the Virtual ObservatoryWorkflows in the Virtual Observatory
Workflows in the Virtual ObservatoryJose Enrique Ruiz
 
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Open Analytics
 
Open Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenOpen Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenChristopher Whitaker
 
Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014Miguel Pastor
 
Wordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static AnalysisWordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static AnalysisLingoport (www.lingoport.com)
 
Designing and Implementing Search Solutions
Designing and Implementing Search SolutionsDesigning and Implementing Search Solutions
Designing and Implementing Search SolutionsFindwise
 
Caliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial MgsreeCaliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial Mgsreemgsree
 
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...HPCC Systems
 
Open Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for LibrariesOpen Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for LibrariesAnil Mishra
 
BI on Cloud Computing
BI on Cloud ComputingBI on Cloud Computing
BI on Cloud Computingtdwiindia
 
Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Den Delimarsky
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsHisham Arafat
 
Crossmedia Workflows
Crossmedia WorkflowsCrossmedia Workflows
Crossmedia WorkflowsDwight Kelly
 
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnKuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnRobert H. McDonald
 
Digitization in theory and practice
Digitization in theory and practiceDigitization in theory and practice
Digitization in theory and practiceHelen Nneka Okpala
 
Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011DLFCLIR
 
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...Robert H. McDonald
 

Similar to Central Registry for Digitized Objects Links Production and Bibliographic Control (20)

Workflows in the Virtual Observatory
Workflows in the Virtual ObservatoryWorkflows in the Virtual Observatory
Workflows in the Virtual Observatory
 
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
 
Open Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenOpen Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe Olsen
 
Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014Liferay & Big Data Dev Con 2014
Liferay & Big Data Dev Con 2014
 
Wordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static AnalysisWordware 2011: Lingoport i18n Planning & Static Analysis
Wordware 2011: Lingoport i18n Planning & Static Analysis
 
Designing and Implementing Search Solutions
Designing and Implementing Search SolutionsDesigning and Implementing Search Solutions
Designing and Implementing Search Solutions
 
Caliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial MgsreeCaliber 2009 Tutorial Mgsree
Caliber 2009 Tutorial Mgsree
 
Cassandra eu
Cassandra euCassandra eu
Cassandra eu
 
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
HPCC Systems Engineering Summit: Community Use Case: Because Who Has Time for...
 
Open Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for LibrariesOpen Source Web Content Management Technologies for Libraries
Open Source Web Content Management Technologies for Libraries
 
BI on Cloud Computing
BI on Cloud ComputingBI on Cloud Computing
BI on Cloud Computing
 
32 cc 3_a_l-drumheller
32 cc 3_a_l-drumheller32 cc 3_a_l-drumheller
32 cc 3_a_l-drumheller
 
Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platforms
 
Crossmedia Workflows
Crossmedia WorkflowsCrossmedia Workflows
Crossmedia Workflows
 
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnKuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
 
E meyer lamp2012
E meyer lamp2012E meyer lamp2012
E meyer lamp2012
 
Digitization in theory and practice
Digitization in theory and practiceDigitization in theory and practice
Digitization in theory and practice
 
Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011
 
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
What Your Library Needs to Know About Kuali Open Library Environment (OLE) an...
 

More from Ralf Stockmann

Freiräume schaffen - im Social Intranet
Freiräume schaffen - im Social IntranetFreiräume schaffen - im Social Intranet
Freiräume schaffen - im Social IntranetRalf Stockmann
 
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...Ralf Stockmann
 
Wie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kannWie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kannRalf Stockmann
 
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...Ralf Stockmann
 
Der Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeintDer Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeintRalf Stockmann
 
BibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale WissensräumeBibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale WissensräumeRalf Stockmann
 
Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)Ralf Stockmann
 
Grundlagen Digitaler Mediengestaltung
Grundlagen Digitaler MediengestaltungGrundlagen Digitaler Mediengestaltung
Grundlagen Digitaler MediengestaltungRalf Stockmann
 
Was Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich WollenWas Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich WollenRalf Stockmann
 
Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?Ralf Stockmann
 
Keynote Studip Zukunftsworkshop
Keynote Studip ZukunftsworkshopKeynote Studip Zukunftsworkshop
Keynote Studip ZukunftsworkshopRalf Stockmann
 
Visually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an BibliothekenVisually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an BibliothekenRalf Stockmann
 
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...Ralf Stockmann
 
Goobi Rollen Und Rechte
Goobi Rollen Und RechteGoobi Rollen Und Rechte
Goobi Rollen Und RechteRalf Stockmann
 
Persitent Identifier in Goobi
Persitent Identifier in GoobiPersitent Identifier in Goobi
Persitent Identifier in GoobiRalf Stockmann
 
Kooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich DigitalisierungKooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich DigitalisierungRalf Stockmann
 

More from Ralf Stockmann (17)

Freiräume schaffen - im Social Intranet
Freiräume schaffen - im Social IntranetFreiräume schaffen - im Social Intranet
Freiräume schaffen - im Social Intranet
 
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
Die Bibliothek als Wolkenfabrik - Cloud-Dienste als Plattformen für digitale ...
 
Wie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kannWie man vom Intranet aus die Welt verbessern kann
Wie man vom Intranet aus die Welt verbessern kann
 
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
Die Revolution vergisst ihre Kinder - Drei Szenarien, wie Bibliotheken in 15 ...
 
Der Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeintDer Zauberlehrling 
war nicht als
 Anleitung gemeint
Der Zauberlehrling 
war nicht als
 Anleitung gemeint
 
BibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale WissensräumeBibliothekarInnen gestalten digitale Wissensräume
BibliothekarInnen gestalten digitale Wissensräume
 
Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)Fit für die digitale Bibliothek? (2007)
Fit für die digitale Bibliothek? (2007)
 
Grundlagen Digitaler Mediengestaltung
Grundlagen Digitaler MediengestaltungGrundlagen Digitaler Mediengestaltung
Grundlagen Digitaler Mediengestaltung
 
Was Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich WollenWas Wissenschaftler wirklich Wollen
Was Wissenschaftler wirklich Wollen
 
Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?Was tun mit den Ergebnissen der OCR?
Was tun mit den Ergebnissen der OCR?
 
Keynote Studip Zukunftsworkshop
Keynote Studip ZukunftsworkshopKeynote Studip Zukunftsworkshop
Keynote Studip Zukunftsworkshop
 
Zukunft der E Books
Zukunft der E BooksZukunft der E Books
Zukunft der E Books
 
Visually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an BibliothekenVisually Lossless Kompression für die Digitalisierung an Bibliotheken
Visually Lossless Kompression für die Digitalisierung an Bibliotheken
 
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
Was wir HEUTE beachten müssen um die Wissenschaftler in 10 Jahren nicht zu en...
 
Goobi Rollen Und Rechte
Goobi Rollen Und RechteGoobi Rollen Und Rechte
Goobi Rollen Und Rechte
 
Persitent Identifier in Goobi
Persitent Identifier in GoobiPersitent Identifier in Goobi
Persitent Identifier in Goobi
 
Kooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich DigitalisierungKooperative Angebote von GBV und GDZ im Bereich Digitalisierung
Kooperative Angebote von GBV und GDZ im Bereich Digitalisierung
 

Recently uploaded

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Recently uploaded (20)

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Breaking the Kubernetes Kill Chain: Host Path Mount

Central Registry for Digitized Objects Links Production and Bibliographic Control

  • 1. Central Registry for Digitized Objects: Linking Production and Bibliographic Control
    Ralf Stockmann, Göttinger Digitization Center
  • 2. As things are now
    • Huge ventures in
      – Digitization
        • Google
        • Microsoft
        • National programs
        • Local centers
      – Accessibility
        • World Digital Library
        • European Digital Library
        • National portals
        • Google Book Search
  • 3. As things are now
    • We are only at the dawn of mass digitization
      – Leaving behind the stage of manual production
      – Entering industrialization
      – Scanning robots
      – Accessible full text (OCR)
  • 4. Lack of …
    • Coordination in digitization activities
      – Who scans what, where, when, in which quality, and how will it be accessible?
        • How is “quality” defined?
        • Do we agree on “what”?
  • 5. Facing the Consequences
    [Chart: y-axis "Costs / Value", x-axis "Number of digitized items per volume"; labels: Technical Improvements, Costs, Waste of Resources, Additional Benefit]
  • 6. The Solution
    • Central registry for digitized objects
    • Focused on the production context (no user frontend)
    • API driven
      – Application Programming Interface
      – Query / Ingest
      – Simple integration into existing workflow tools
    • Batch mode (lists)
    • Open source / free service
    • Matching on volume level
      – Score / probability
  • 7. Implementation
    [Diagram, layers from top to bottom: Backend Services (EROMM / EDL / OCLC / …); Registry / Meta Data Store; Aggregator / Normalizer / Mapping; API. API operations: Query ("?") and Ingest ("!"). Clients below the API: Present Collections, Running Project, Notice of Intent]
  • 8. Metadata Store
    • Bibliographic (Matching / Score: „what“)
      – Title
      – Author
      – Date
      – Place of publication
      – Number of pages (?)
      – Language
      – Print / Format
      – Edition
    • Technical
      – Resolution
      – Color depth
      – File type / compression
    • Accessibility (Additional Judging: „who, where, which quality, how accessible“)
      – Institution
      – Persistent identifier
      – Rights
      – URL
    • Status (Decisive Factor: „when“)
      – Digitized
      – In Progress
      – Intended (Timeline?)
      – Requested?
  • 9. Obstacles
    • (Open source) tools for automated matching / scoring?
    • Interface for manual comparison / decision making
    • Multivolume works: low rate of uniformity (near 50% of physical SUB stock before 1900)
    • Unicode
    • Transliteration tables
    • Randomly bound books
    • Reliable identifier
      – ISBN for old books?
    • Anticipated rate of accuracy: 50–70%
  • 10. Appreciation of Values
    • The goal is NOT to build a reliable database in terms of library standards
    • But to prevent further waste of resources.
    • If we manage to achieve just 50% precision,
    • we save at least 50% of the funding!
  • 11. Work Packages
    • Define metadata model
    • Set up database
    • Implement mapping tools
    • Define API calls
    • Implement API
    • Build some connectors to popular mass digitization workflow tools (e.g. “Goobi”)
    • Establish ISBN workflow
    • Harvest existing sources
    • Start with a community of actual projects
    • Get some (!) funding
    • Estimated schedule: 6 months
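
The "Matching on volume level – Score / probability" idea from slide 6, applied to a few of the bibliographic fields from slide 8 and run in batch mode, might be sketched as follows. This is only an illustration, not the registry's design: the class and function names, the field weights, and the 0.8 threshold are all assumptions, and plain string similarity from Python's standard library stands in for whatever matching tool a real registry would use.

```python
import unicodedata
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    """Bibliographic core of a registry entry (field names are illustrative)."""
    title: str
    author: str
    year: int
    place: str
    status: str = "digitized"  # or "in progress", "intended", "requested"


def normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation, so 'Göttingen' equals 'Gottingen'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c for c in stripped.lower() if c.isalnum() or c.isspace())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] on the normalized forms."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Hypothetical weights (they sum to 1); a real registry would have to tune them.
WEIGHTS = {"title": 0.5, "author": 0.3, "place": 0.1, "year": 0.1}


def match_score(query: Record, candidate: Record) -> float:
    """Weighted score in [0, 1] for 'is this the same volume?'."""
    year_score = 1.0 if query.year == candidate.year else 0.0
    return (
        WEIGHTS["title"] * similarity(query.title, candidate.title)
        + WEIGHTS["author"] * similarity(query.author, candidate.author)
        + WEIGHTS["place"] * similarity(query.place, candidate.place)
        + WEIGHTS["year"] * year_score
    )


def query_batch(queries, registry, threshold=0.8):
    """Batch mode: for each query return (query, best match or None, score)."""
    results = []
    for query in queries:
        scored = [(match_score(query, cand), cand) for cand in registry]
        best_score, best = max(scored, key=lambda pair: pair[0], default=(0.0, None))
        if best_score >= threshold:
            results.append((query, best, best_score))
        else:
            results.append((query, None, best_score))
    return results
```

A digitization workflow tool could send its work list through something like `query_batch` before scanning and skip, or flag for manual comparison, every volume that comes back with a match. Normalizing diacritics only touches the Unicode and transliteration obstacles from slide 9; multivolume works and bound-together volumes would need more than string similarity.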