Muehlberger - PrestoPrime case study 2 @EUscreen Mykonos



  1. PrestoPRIME (FP7-ICT-2007-3, 231161)
     Higher Education Institutions and AV digitisation
     Günter Mühlberger & Andy Stauder, University of Innsbruck Library
     24 June 2010
  2. Department for Digitisation & Digital Preservation (DEA)
     • Founded in 2002
     • Currently 3 permanent staff and 11 FTEs from third-party projects (9 from R&D, 2 from commercial projects)
     • Specialised in book and paper digitisation, digital library technology, Optical Character Recognition, software development, etc.
     • Coordinator of the eBooks-on-Demand Network (27 member libraries delivering every public domain book in digital format)
     • Involved in several EU projects, e.g. IMPACT (mass digitisation and OCR processing), PrestoPrime (preservation), ...
     • Currently finishing one of the largest non-Google projects: digitisation of 216,000 theses with more than 24 million pages (1,800 m of shelving)
     • AV digitisation & preservation: new terrain for us!
     • Our ambition:
         • Set up a university-wide Digitisation and Preservation Strategy (currently under negotiation with the University Management Board)
         • Apply our experience from other mass-digitisation projects to the AV domain
         • Within 5 to 10 years, all analogue AV material of the university should be available in digital format
  3. Higher Education Institution Scenario
     • Defined a scenario for Higher Education Institutions (HEI) in PrestoPrime
     • University of Innsbruck as example
         • 25,000 students, 3,000+ researchers
     • AV material
         • More than 25 research collections, 90,000+ hours of AV material
         • 90% comes from broadcasters, but includes many rare programmes
         • Also unique material (of research and cultural value)
         • 95% still not available in digital format
         • Switch to digital workflow (semi-professional production of AV content)
     • Usage
         • Research, teaching, cultural activities, ...
         • Copyright privileges
     • Goal of PrestoPrime
         • To find a practical solution for preservation and access
         • To provide guidance to other HEI
  4. Pilot project
     • Collection of the Slavonic Studies Department
         • Multimedia collection of research papers, newspaper clippings, photos and audio & video material (built up over more than 25 years)
         • Medium sized: around 2,000 VHS cassettes with about 3,000 hours of video material
         • Rare programmes from Russia and former Soviet countries from the early 1980s until today
         • Important material for research and teaching
         • Heavily used by students and researchers
     • Technical situation
         • In-house Oracle 10g database for metadata (mainly descriptive)
         • No specific preservation strategy
         • Currently a highly inefficient “on-demand digitisation” of VHS cassettes
  5. Our approach
     • Run a mass-digitisation project in which all steps are carried out as a batch process, including metadata extraction, quality control, storage, etc.
         • Afterwards it should not be necessary to touch the analogue material again
     • Digitise at a reasonable quality that is adequate to the original material (VHS) and reflects the fact that we are not in a broadcaster environment
         • Where unique material exists in high quality (e.g. DigiBeta), higher quality would certainly be necessary
     • Use a subset of the descriptive metadata for the digital repository, but do not touch the metadata management system currently in use
         • Systems have evolved, researchers and users are familiar with “their” database, etc.
         • Highly political aspect
     • Adapt the digital repository so that it is able to handle AV material
         • Storage strategies, etc.
  6. Implementation: Mass-digitisation
     • ‘DEA-VHSS-1’ VHS digitisation machine
         • One server-class computer
         • 8 external USB 2.0 analogue-to-digital converters
         • 8 VHS video recorders (S-VHS recorders or audio cassette recorders could be used as well)
         • Standard computer peripherals (human input devices, monitor, etc.)
     • Output
         • 4:2:0 sub-sampled Intel video
             • Capture rate PAL: up to 720x576 pixels at 25 fps
             • Capture rate NTSC: up to 720x480 pixels at 29.97 fps
         • PCM raw audio
             • 16 bit depth
             • 44.1 kHz sampling rate
     • Encoding
         • Currently H.264 video and MPEG-1 Audio Layer 3 (MP3) audio in an AVI container
     • Productivity
         • 4 runs per working day (= 32 cassettes, or 40-50 hours of video, per day with one machine) with a minimum of human effort
         • Several machines in several university departments for parallel processing
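The productivity figures on this slide can be checked with a little arithmetic. The sketch below uses only numbers from the slides (8 recorders, 4 runs per day, 40-50 hours per day, about 3,000 hours in the pilot collection); the function name `working_days` and the three-machine scenario are illustrative assumptions, not part of the presentation.

```python
import math

# Figures taken from the slides.
RECORDERS_PER_MACHINE = 8   # VHS recorders attached to one machine
RUNS_PER_DAY = 4            # batch runs per working day
PILOT_HOURS = 3000          # approx. size of the Slavonic Studies collection


def working_days(total_hours, hours_per_day, machines=1):
    """Working days needed to capture total_hours, rounded up.

    hours_per_day is the daily throughput of ONE machine; `machines`
    (an assumption for illustration) scales it for parallel processing.
    """
    return math.ceil(total_hours / (hours_per_day * machines))


cassettes_per_day = RECORDERS_PER_MACHINE * RUNS_PER_DAY
print(cassettes_per_day)               # 32
print(working_days(PILOT_HOURS, 40))   # 75 (lower throughput bound)
print(working_days(PILOT_HOURS, 50))   # 60 (upper throughput bound)
```

On these assumptions a single machine would need roughly 60 to 75 working days for the pilot collection, which is why running several machines in parallel matters for the 90,000+ hours held across the university.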
  7. Implementation: Metadata, quality control
     • Descriptive metadata
         • XML export from the Oracle database, mapped to simple Dublin Core within the repository
         • Linked via a barcode (ID of the record), which is scanned with a barcode scanner during the digitisation process
     • Technical metadata
         • Joanneum Research is developing a content-based quality control tool within PrestoPrime
         • It still needs to be specified what is done during ingest and what is done as a routine process in the preservation life cycle
         • Output is again an XML file with technical information (to be defined in detail, first version available end of 2010)
     • Structural metadata
         • Annotation and tagging tool from B&G
         • Course participants, students and researchers have a clear interest in the material (e.g. for writing a thesis or diploma) and are therefore very likely to be willing to annotate the video
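The descriptive-metadata step (XML export, barcode link, mapping to simple Dublin Core) could look roughly like the sketch below. The slides do not show the export format, so the `RECORD` element, its `id` attribute and the field names `TITLE`, `BROADCAST_DATE`, `LANGUAGE` and `SUMMARY` are all hypothetical, as is the sample record; only the Dublin Core namespace and element names are real.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical export field names mapped to simple Dublin Core elements.
FIELD_TO_DC = {
    "TITLE": "title",
    "BROADCAST_DATE": "date",
    "LANGUAGE": "language",
    "SUMMARY": "description",
}


def record_to_dc(record_xml, barcode):
    """Map one exported record to a simple Dublin Core element tree.

    The scanned barcode must match the record ID, mirroring the link
    between cassette and database record described on the slide.
    """
    record = ET.fromstring(record_xml)
    if record.get("id") != barcode:
        raise ValueError("scanned barcode does not match record ID")
    dc = ET.Element("metadata")
    ET.SubElement(dc, f"{{{DC_NS}}}identifier").text = barcode
    for field, dc_name in FIELD_TO_DC.items():
        value = record.findtext(field)
        if value:  # skip fields missing from the export
            ET.SubElement(dc, f"{{{DC_NS}}}{dc_name}").text = value.strip()
    return dc


# Invented sample record, for illustration only.
SAMPLE = (
    '<RECORD id="VHS-000123">'
    "<TITLE>Evening news</TITLE>"
    "<BROADCAST_DATE>1987-03-14</BROADCAST_DATE>"
    "<LANGUAGE>ru</LANGUAGE>"
    "</RECORD>"
)
dc_record = record_to_dc(SAMPLE, "VHS-000123")
print(ET.tostring(dc_record, encoding="unicode"))
```

Keeping the mapping in one small table reflects the approach stated earlier: only a subset of the descriptive metadata is copied, and the department's own database stays untouched.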
  8. Implementation: Storage
  9. Implementation: Storage
     • NetApp storage
         • Relatively expensive
         • Not easy to extend
         • Currently 25 TB available for our unit
     • IBM tape storage
         • Very cheap (once it is available)
         • Currently 10 TB used by our unit, but a lot more would be available
         • Disadvantage: slow! It takes some minutes to retrieve a file
     • As a rule of thumb we expect 600 MB per hour of video, so 100,000 hours would add up to 60 TB
         • This cannot be managed with the current infrastructure, but the infrastructure can be upgraded relatively easily
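The rule of thumb on this slide can be written out as a short calculation. The sketch assumes decimal units (1 TB = 1,000,000 MB); the function names are illustrative, and the average bitrate is derived from the 600 MB figure rather than stated on the slide.

```python
MB_PER_HOUR = 600  # rule of thumb from the slide


def storage_tb(hours, mb_per_hour=MB_PER_HOUR):
    """Storage needed for `hours` of video, in TB (decimal units assumed)."""
    return hours * mb_per_hour / 10**6


def average_bitrate_mbit_s(mb_per_hour=MB_PER_HOUR):
    """Average combined audio/video bitrate implied by the rule of thumb."""
    return mb_per_hour * 8 / 3600


print(storage_tb(100_000))                  # 60.0 TB, as on the slide
print(storage_tb(3_000))                    # 1.8 TB for the pilot collection
print(round(average_bitrate_mbit_s(), 2))   # 1.33 Mbit/s
```

The pilot collection alone would fit comfortably within the 25 TB of disk currently available, while the full 100,000 hours would exceed the disk and tape capacity now in use combined (35 TB).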
  10. Implementation: Preservation
     • PrestoPrime: Ex Libris
         • Rosetta: Digital Preservation System
         • Integration and validation
         • Library already runs Aleph and Primo
         • Test installation
     • Institutional repository
         • In-house solution (beta version)
         • Oracle 10g XDB
         • METS objects
         • Descriptive, technical and structural metadata will be transformed into a METS file, which is then ingested into the database
         • Features: user management, searching, browsing, OAI-PMH interface, etc.
     • Our task
         • To compare both solutions
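To make the METS step more concrete, here is a minimal sketch of how descriptive metadata and a file reference might be wrapped into one METS object before ingest. It is not the repository's actual code: the function `build_mets`, the IDs, the file path and the sample values are assumptions, and the technical metadata section (`amdSec`) is omitted because its content is still to be defined. The element and attribute names follow the METS schema, but the output is not validated here.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"
for prefix, uri in (("mets", METS_NS), ("xlink", XLINK_NS), ("dc", DC_NS)):
    ET.register_namespace(prefix, uri)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(object_id, dc_fields, video_href, mimetype="video/x-msvideo"):
    """Wrap Dublin Core fields and one video file into a METS element tree."""
    mets = ET.Element(m("mets"), {"OBJID": object_id})

    # Descriptive metadata: simple Dublin Core inside dmdSec/mdWrap/xmlData.
    dmd = ET.SubElement(mets, m("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, m("mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, m("xmlData"))
    for name, value in dc_fields.items():
        ET.SubElement(xml_data, f"{{{DC_NS}}}{name}").text = value

    # File section: a single encoded video file.
    file_sec = ET.SubElement(mets, m("fileSec"))
    group = ET.SubElement(file_sec, m("fileGrp"), {"USE": "access"})
    file_el = ET.SubElement(group, m("file"), {"ID": "FILE1", "MIMETYPE": mimetype})
    ET.SubElement(file_el, m("FLocat"),
                  {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": video_href})

    # Structural map: one division pointing at the file and its metadata.
    struct = ET.SubElement(mets, m("structMap"), {"TYPE": "physical"})
    div = ET.SubElement(struct, m("div"), {"TYPE": "video", "DMDID": "DMD1"})
    ET.SubElement(div, m("fptr"), {"FILEID": "FILE1"})
    return mets


# Invented values, for illustration only.
mets_doc = build_mets("VHS-000123", {"title": "Evening news"},
                      "file:///archive/VHS-000123.avi")
print(ET.tostring(mets_doc, encoding="unicode"))
```

Annotations from the tagging tool would extend the structural map with further `div` elements (for example per programme segment), which is one reason to keep the structure in METS rather than in the descriptive record.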
  11. What can you expect by the end of the project?
     • A paper with a more detailed description of the Higher Education Institution Scenario
         • We will contact several universities
         • Carry out a survey on their AV collections, usage, preservation strategies, experiences, etc.
         • Structured interviews for a closer look
     • A paper describing in detail our approach for pilot projects such as the one described above
         • Considerations, approaches, workflows, tools used, etc.
         • Real-world data
     • A technical description of the VHS digitisation machine
         • We believe that a number of institutions are in the same situation, i.e. they hold S-VHS or VHS collections which need to be digitised
         • We are also willing to assemble such machines on a contractual basis
  12. Thank you for your attention!