Digitisation Revolutionising Library Management  Day 2 Sydney, April 2007 Mal Booth  – Head, Research Centre
Where am I from? The Memorial’s Research Centre functions as a library and an archive. We develop, manage and provide public access to Australia’s official, personal, & published records of war.
Global trends in digitisation Faster, better, cheaper equipment & storage Better DAMS & CMS software Institutional repositories More audio & film Collaboration Shared collections (eg. Picture Australia) Mass digitisation programs:  Google, Microsoft, Yahoo, Open Content Alliance (OCA),  Internet Archive
I’m not sure what these are, but they are important! Dynamism Preservation  (as a benefit & obligation) Playing Management & planning Compromise Access
Recent Digitisation Examples WW1, WW2, Korea & Vietnam  unit war diaries 260k+  images  of our collections Official histories  (published works) Digitisation on demand
Digitisation on demand Currently running at 90,000 pp p.a.
About one fifth of these images
What we will cover today 1. GETTING STARTED a. Why and what to digitise? b. How (preservation/access) & Principles c. Copyright and IP considerations (briefly) d. Resources needed; in-house or outsource? e. Process outline: from planning to long term maintenance (life-cycle) 2. METHODS, CONTENT & STORAGE a. Production: file formats & standards, scanners & cameras, software b. Output: indexing, access, search optimisation, delivery options c. Storage, ongoing maintenance & management requirements d. Just doing it, lessons learned & key issues
Why and what to digitise? WHY   Increase & broaden access (remote & 24/7) Fragile, valuable &/or unique materials (loss or damage would be catastrophic) Support research & education Anticipating future use or re-use Improved search & retrieval  Promoting knowledge, understanding & recognition of collections Relationships to other collections Preservation of at-risk collections by risk reduction & conservation WHAT : popular collections; fragile/unique; at-risk; significant priorities; relationships (corporate or collaborative); & what you have the right to digitise!
How: some Principles* -  Collections   ( organised groups of objects ) Agreed collection development policy Sound description Lifecycle curation Broad access to all Respect for IP Evaluation for use & usefulness Interoperability Integration of staff & user workflows Sustainability & continued usability *  NISO Framework of Guidance for the Building of Good Digital Collections
How: some Principles -  Objects   ( digital assets ) Production ensures collection priorities & maintains interoperability and re-use Preservability: persistence & accessibility over time; across evolving media, software & formats Meaningful outside its context: portable, reusable, interoperable Persistent identifiers: URLs or URIs Authentication: veracity, accuracy & authenticity Inclusion of associated metadata: descriptive, administrative & structural
How: some Principles -  Metadata   ( selection and implementation of information about objects: descriptive; administrative; technical; structural; & preservation ) Appropriate to materials, users and use Support for interoperability: mappings & crosswalks between schemes Use of authority control and content standards Includes a clear statement on conditions of use for the objects (eg. fair use) Support for long term management, eg. PREMIS Metadata records are treated as digital objects
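A crosswalk between metadata schemes can be sketched in a few lines. The mapping below is a simplified illustration (an assumption for this example, not an official or complete MARC to Dublin Core crosswalk), and the record layout (a dict of MARC tag to list of values) and the names `CROSSWALK` and `crosswalk` are hypothetical.

```python
# Illustrative, simplified mapping from MARC tags to Dublin Core elements.
# This table is an assumption for the sketch; a real crosswalk is far larger
# and also deals with subfields and indicators.
CROSSWALK = {
    "245": "title",
    "100": "creator",
    "260": "publisher",
    "650": "subject",
    "520": "description",
}


def crosswalk(marc_record):
    """Map {marc_tag: [values]} to {dc_element: [values]}.

    Tags with no entry in CROSSWALK are skipped rather than guessed at.
    """
    dc = {}
    for tag, values in marc_record.items():
        element = CROSSWALK.get(tag)
        if element is None:
            continue
        dc.setdefault(element, []).extend(values)
    return dc


if __name__ == "__main__":
    record = {"245": ["Unit war diary"], "650": ["World War, 1914-1918"]}
    print(crosswalk(record))
```

Keeping the mapping in a plain table, separate from the code that applies it, means cataloguers can review and extend it without touching the logic.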
How: some Principles -  Initiatives   ( the creation & management of collections ) A substantial design and planning component Appropriate staffing and expertise Best practice project management An evaluation plan A project report that documents the process & outcomes Consideration of the entire lifecycle (ongoing management)
Copyright & Intellectual Property (1) Concerns: What sort of items are protected by copyright?  What is the duration of copyright protection?  What sorts of activities infringe copyright?  When is a copyright licence required? Understanding the “exceptions” to copyright infringement See:  Copyright and Cultural Institutions: Short Guidelines for Digitisation  by Emily Hudson and Andrew Kenyon & ACC’s  Special case exception: education, libraries, collections  (deals with the new section 200AB)
IFLA/IPA Statement on Orphaned Works
 
 
 
Resources required (1) Hardware  – scanners, cameras, computers, monitors, digital storage, memory & processing power Software  – scanning, OCR, office apps, image editing & management, DAM?, video/audio capture, metadata capture?, file conversion, calibration Furnishings  – for staff, computers, scanners, storage Facility space  – scanning, preparation & storage, QA Specialist staff  – curatorial, cataloguers, IT/DBA, web, scanning, project management, conservators Training needs Conservation needs – archival supplies & consultancies Budget funds  – salaries, hardware/software purchases & lease, licenses, running/ongoing costs, contingency Corporate support  – context within corporate or other priorities and strategies
WW1 Diaries scanning facilities Approximately 200,000 high res. images per year
Outsource or In-house? Outsource: Contractor responsible for capital equipment, training and technology obsolescence costs; No need to find scanning space; Less need for digitisation knowledge; Economies of scale (& capability for large volumes & throughput); The bureau may be able to achieve a better quality result & have a broader range of services; A better fix on costs and timescales (but these can vary widely). In-house: Better institutional knowledge, understanding & capacity; Less risk than working with external parties; Better ability to meet specific needs and deadlines?; Cheaper costs for oversized or non-standard materials?; QA may be more efficient; Saving on transport and insurance and less risk with onsite scanning; Assured staff and expertise
Dealing with an external bureau Clear contracts are important Choosing a bureau  – check with reference sites Range and scope of material  -  non-standard materials Collaboration with others to achieve further economies of scale  may be possible QA  can be a project killer Metadata  – what will the bureau record? Consider partial outsourcing or bringing a specialist partner onsite
Some funding options Program funding  – dependent on corporate priorities User pays  – but will they? Grants  -  eg.  http://www.nla.gov.au/chg/  Donors or sponsors  -  from or associated with a web presence Collection Depreciation  – depends on valuation and an accounting standard As a training activity  – can be a viable learning experience for a small team & project New policy proposals
“ Investing in an Intangible Asset” The benefits of long term preservation of digital assets are difficult to value (reliably and objectively), but the costs of not doing so are high if action isn’t taken.  More information on costs and benefits is needed . Digital preservation is still new, so there is  scope for market creation & development, research and experimentation . Information managers know why such programs are important, but find it hard to communicate this to those who control our finances.  Business cases  based on empirical evidence need something like the balanced scorecard approach to  bridge the gap between us and decision makers . Digital preservation is still an  organisational innovation  and must be  managed effectively  as it is dependent on independently driven technological developments. From DCC’s  Investment in an Intangible Asset
The AWM Document Digitisation Process
Cornell’s digital imaging process map Radiating out from the goals and deliverables of the project are the institutional resources The outer wheel represents the processes or stages of digital  imaging initiatives – clockwise from Selection
Draft DCC Curation Lifecycle
PRODUCTION: file formats and standards Commonly used formats: TIFF JPEG GIF PDF Future formats: JPEG  2000 PNG
PRODUCTION: file formats –  how and where they are used
PRODUCTION: scanners & cameras Flatbed scanners Map/plan scanners Overhead scanners Digital cameras Book scanners Book-edge scanners Microfilm and slide scanners
PRODUCTION: software Image editing software Consider: cost; hardware requirements; usability; functionality Options :  Adobe Photoshop CS3  (expensive/best) &  Photoshop Elements  (cheap);  Gimp  (free); + prop. software for RAW files Derivative and pdf production:  Acrobat Writer  (expensive);  ImageMagick  (conversion software);  Ghostscript  (pdf interpreter); &  pdftk  (pdf toolkit) Other useful open source software: JHOVE  object validation FedoraCommons  object repository management system ebXML  e-business suite Xena  digital document preservation software (from NAA) DSpace  institutional repository system DROID  automated batch identification of file formats (from TNA UK)
OUTPUT Indexing Most descriptive metadata will come from your MARC records If a separate database is needed: Access, SQL & Oracle Access options  (also part of  just doing it ) Collection OPACs ,  databases ,  Zoomify ,  EAD , DVDs, CDs Other:  Blogs ,  Facebook  ArtShare ,  Flickr ,  Facebook page Search engine optimisation How can I create a Google-friendly site?
STORAGE & MAINTENANCE Storage Consider : Speed (read/write, data transfer); Capacity;  Reliability (stability, redundancy); Standardization; Cost; & Fitness to task Management, maintenance & preservation Digital preservation practices Preservation metadata Trusted digital repositories?
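One common digital preservation practice is fixity checking: record a checksum for each stored file, then recompute it later to detect silent corruption or change. The sketch below is a minimal illustration using Python's standard `hashlib`. The function names and the manifest layout (a dict of file path to recorded SHA-256 hex digest) are assumptions for this example, not the Memorial's actual process.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large master images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_changed_files(manifest):
    """Given {path: recorded_digest}, return the paths whose current
    checksum no longer matches the recorded one."""
    return [
        path
        for path, recorded in manifest.items()
        if sha256_of(path) != recorded
    ]
```

In practice the manifest would be stored with the preservation metadata and the check run on a schedule and after every migration between storage media.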
What we want: Accuracy / authenticity; Searchability; Easy navigation & download; Cost effectiveness; Good quality product; Text capture and search (OCR) where poss.; Integration; Scalability; Web interactivity; Simple solutions. What we are finding / Lessons: Cost estimates escalate; Technology has limits, but is improving; You learn with new technology by doing; There is more to copyright than owning it; Anticipate needs & increasing expectations; $ hard to find for access (sponsorship?); Better management & storage of assets; A need to educate managers & suppliers!; Keeping trained staff is a challenge; Costs/benefits of new technologies (risk?); Importance of QA in projects!; Need for a strategic plan(s); Be prepared to compromise
Enterprise Content Management: management, search & web facilities for digital assets and services Extensive  digital asset management  features Excellent  electronic document & record management Intuitive  web content management  features Facilitate simple and complex  workflow  processes Extensive and  unified searching  constructs Scaleable  Compliant  with all government recordkeeping requirements & emerging  digital preservation standards Integrate  easily with existing Memorial systems Simple to administer  in terms of security, auditing & storage management
ECM - Conceptual Overview (diagram labels): Other Corporate Systems; Digital Asset Management; Electronic Document & Records Management; Record Management; E-mail; Memorial Intranet; Web Content Management; AJRP Website; Lotus Notes; OAI Interface; FIRST OPAC; MICA OPAC (CAS); CMS; Digital Object Mgmt System DOMS; Biographical Databases & War Diaries; RecordSearch NAA; Collection Mgmt MICA; Library System FIRST; Fund Raising System Raisers Edge; Financial & HR System SAP; POS System, Advance Retail; CAS Internal Orders; OnLine Shop; Search; Photocopy Quotes; ReQuest; eSales; PICTION
implementing user-friendly technologies: make sure they are findable and useable; pick a few “winners” & lead by example; collaborate & network; get involved in your core business; don't leave it to IT staff; learn to compromise (the 80:20 rule); experiment; start now! it is sometimes easier to seek forgiveness than gain permission
JISC 2007  – five key issues for digitisation Re-focus on the user (simple, easily found & used output)  Aggregate and present content that can resonate with multiple communities  Learn from Google & YouTube but keep our values New business models are needed, collaborating with and without the private sector More collaboration between publishers, curators, funders, users, vendors and standards bodies

Digitisation Workshop Pres 2008(V1)


Editor's Notes

  • #2 Links used in this presentation: Most can be found via http://del.icio.us/malbooth AWM home page http://www.awm.gov.au/ Unit war diaries online http://www.awm.gov.au/diaries/ Private records currently online (eg) http://www.awm.gov.au/findingaids/process.asp?collection=private&item=100days Official Histories online http://www.awm.gov.au/histories/index.asp Blogs: http://blog.awm.gov.au/ http://blog.awm.gov.au/1917/ http://blog.awm.gov.au/lambert/ http://blog.awm.gov.au/focus/ http://blog.awm.gov.au/lawrence/ Encyclopedia http://www.awm.gov.au/encyclopedia/index.htm Podcasts feed (RSS) http://www.awm.gov.au/podcast/index.asp