Digitisation workshop pres 2009(v1)


Slides from a half-day workshop that I gave a couple of times in 2009. Better late than never, I suppose. You need to read my blog post here: http://frommelbin.blogspot.com/2010/09/some-old-news-about-digitisation.html for an explanation of some slides and for references.




Usage Rights

CC Attribution-NonCommercial-ShareAlike License


    Digitisation workshop pres 2009(v1) Presentation Transcript

    • Managing Digitisation Programs
      16 July 2009
      Mal Booth – DERSU
    • My background? The Australian War Memorial’s Research Centre functions as a library and an archive. It develops, manages and provides public access to Australia’s official, personal, & published records of war.
    • Global trends in digitisation
      • Faster, better, cheaper equipment & storage
      • Better DAMS & CMS software
      • Institutional & shared repositories
      • More audio & film
      • Collaboration
      • Shared collections (eg. Picture Australia)
      • Mass digitisation programs: Google, Microsoft, Yahoo, Open Content Alliance (OCA), Internet Archive
      • Pressure for online access & pressures on real storage space
    • I’m not sure what these are, but they are important!
      • Dynamism
      • Preservation (as a benefit & obligation where necessary)
      • Playing
      • Management & planning
      • Compromise
      • Access
    • Recent Examples - AWM
      WW1, WW2, Korea & Vietnam unit war diaries
      260k+ images of our collections
      Official histories (published works)
      Digitisation on demand
    • Digitisation for Access: c. 90,000 pp per year
    • Recent Examples – UTS Library
      Supporting Teaching & Learning
      • Digital Resource Register
      • Alternative Format Service
      • Exam Papers
      Access only
      Supporting Research
      • eScholarship (UTS ePress, iResearch, eData)
      • Australian Digital Theses Collection
      Access & Preservation (data curation)
    • About one fifth of these images
    • What we will cover today
      a. Why and what to digitise?
      b. How (preservation/access) & Principles
      c. Copyright and IP considerations (briefly)
      d. Resources needed; in-house or outsource?
      e. Process outline: from planning to long term maintenance (life-cycle)
        a. Production: file formats & standards, scanners & cameras, software
        b. Output: indexing, access, search optimisation, delivery options
        c. Storage, ongoing maintenance & management requirements
        d. Just doing it, lessons learned & key issues
    • Why and what to digitise?
      Increase & broaden access (remote & 24/7)
      Fragile, valuable &/or unique materials (loss or damage would be catastrophic)
      Support research & education
      Anticipating future use or re-use
      Improved search, retrieval & storage
      Promoting knowledge, understanding & recognition of collections
      Relationships to other collections
      Preservation of at-risk collections by risk reduction & conservation
      WHAT: popular collections; fragile/unique; at-risk; significant priorities; relationships (corporate or collaborative); & what you have the right to digitise!
    • How: some Principles* - Collections
      (organised groups of objects)
      Agreed collection development policy
      Sound description
      Lifecycle curation
      Broad access to all
      Respect for IP
      Evaluation for use & usefulness
      Integration of staff & user workflows
      Sustainability & continued usability
      * NISO Framework of Guidance for the Building of Good Digital Collections
    • How: some Principles - Objects
      (digital assets)
      Production ensures collection priorities & maintains interoperability and re-use
      Preservability: persistence & accessibility over time; across evolving media, software & formats
      Meaningful outside its context: portable, reusable, interoperable
      Persistent identifiers: URLs or URIs
      Authentication: veracity, accuracy & authenticity
      Inclusion of associated metadata: descriptive, administrative & structural
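One common way to support the persistence and authentication principles above is to record a checksum for each digital object and re-check it over time. The sketch below is a minimal illustration and is not taken from the slides: the function names and the choice of SHA-256 are assumptions.

```python
import hashlib


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large master files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded earlier."""
    return file_checksum(path) == recorded_checksum
```

The checksum would normally be stored with the object's administrative or preservation metadata at the time of capture, then compared against a fresh digest during routine audits.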
    • How: some Principles - Metadata
      (selection and implementation of information about objects: descriptive; administrative; technical; structural; & preservation)
      Appropriate to materials, users and use
      Support for interoperability: mappings & crosswalks between schemes
      Use of authority control and content standards
      Includes a clear statement on conditions of use for the objects (eg. fair use)
      Support for long term management, eg. PREMIS
      Metadata records are treated as digital objects
      RUBRIC overview:
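A crosswalk between metadata schemes can be thought of as a lookup table from one scheme's fields to another's. The sketch below maps a handful of MARC tags to Dublin Core elements. It is an illustrative subset only: the record structure (a dictionary of tag to list of values) and the function name are assumptions made for the example, and a real crosswalk would also handle subfields and many more tags.

```python
# Illustrative subset of a MARC-to-Dublin-Core crosswalk (not a complete mapping).
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "260": "publisher",    # publication information
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def crosswalk(marc_record):
    """Map a simplified MARC record ({tag: [values]}) to Dublin Core elements."""
    dc = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # tags without a mapping are skipped in this sketch
        dc.setdefault(element, []).extend(values)
    return dc
```

Keeping the mapping in a plain table, separate from the code that applies it, makes it easier to review with cataloguers and to extend later.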
    • How: some Principles - Initiatives
      (the creation & management of collections)
      A substantial design and planning component
      Appropriate staffing and expertise
      Best practice project management
      An evaluation plan
      A project report that documents the process & outcomes
      Consideration of the entire lifecycle (ongoing management)
    • Copyright & Intellectual Property (1)
      What sort of items are protected by copyright?
      What is the duration of copyright protection?
      What sorts of activities infringe copyright?
      When is a copyright licence required?
      Understanding the “exceptions” to copyright infringement
      See: Copyright and Cultural Institutions: Short Guidelines for Digitisation by Emily Hudson and Andrew Kenyon
      & ACC’s Special case exception: education, libraries, collections (deals with the new section 200AB)
    • IFLA/IPA Statement on Orphaned Works
    • Resources required (1)
      Hardware – scanners, cameras, computers, monitors, digital storage, memory & processing power
      Software – scanning, OCR, office apps, image editing & management, DAM?, video/audio capture, metadata capture?, file conversion, calibration
      Furnishings – for staff, computers, scanners, storage
      Facility space – scanning, preparation & storage, QA
      Specialist staff – curatorial, cataloguers, IT/DBA, web, scanning, project management, conservators
      Training needs
      Conservation needs – archival supplies & consultancies
      Budget funds – salaries, hardware/software purchases & lease, licenses, running/ongoing costs, contingency
      Corporate support – context within corporate or other priorities and strategies
    • WW1 Diaries scanning facilities
      Approximately 200,000 high res. images per year
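A throughput figure like this translates directly into a storage requirement. The sketch below is a rough estimate only: the 200,000 images per year comes from the slide, while the average file size of 50 MB and the two stored copies are assumptions chosen purely for illustration.

```python
def annual_storage_tb(images_per_year, avg_file_mb, copies=2):
    """Estimate storage added per year, in terabytes (1 TB = 1,000,000 MB)."""
    total_mb = images_per_year * avg_file_mb * copies
    return total_mb / 1_000_000


# 200,000 images per year is from the slide; 50 MB per image and
# 2 copies (master plus backup) are illustrative assumptions.
estimate = annual_storage_tb(200_000, avg_file_mb=50, copies=2)
print(f"{estimate:.1f} TB per year")
```

Substituting measured file sizes from a pilot batch gives a far more reliable figure than any assumed average.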
    • Outsource or Inhouse?
      Outsource:
      Contractor responsible for capital equipment, training and technology obsolescence costs
      No need to find scanning space
      Less need for digitisation knowledge
      Economies of scale (& capability for large volumes & throughput)
      The bureau may be able to achieve a better quality result & have a broader range of services
      A better fix on costs and timescales (but these can vary widely)
      In-house:
      • Better institutional knowledge, understanding & capacity
      • Less risk than working with external parties
      • Better ability to meet specific needs and deadlines?
      • Cheaper costs for oversized or non-standard materials?
      • QA may be more efficient
      • Saving on transport and insurance and less risk with onsite scanning
      • Assured staff and expertise
    • Dealing with an external bureau
      • Clear contracts are important
      • Choosing a bureau – check with reference sites
      • Range and scope of material - non-standard materials
      • Collaboration with others to achieve further economies of scale may be possible
      • QA can be a project killer
      • Metadata – what will the bureau record?
      • Consider partial outsourcing or bringing a specialist partner onsite
    • Some funding options
      • Program funding – dependent on corporate priorities
      • User pays – but will they?
      • Grants - eg. http://www.nla.gov.au/chg/
      • Donors or sponsors - from or associated with a web presence
      • Collection Depreciation – depends on valuation and an accounting standard
      • As a training activity – can be a viable learning experience for a small team & project
      • New policy proposals
    • “Investing in an Intangible Asset”
      The benefits of long term preservation of digital assets are difficult to value (reliably and objectively), but the costs are high if action isn’t taken. More information on costs and benefits is needed.
      Digital preservation is still new, so there is scope for market creation & development, research and experimentation.
      Information managers know why such programs are important, but find it hard to communicate this to those who control our finances. Business cases based on empirical evidence need something like the balanced scorecard approach to bridge the gap between us and decision makers.
      Digital preservation is still an organisational innovation and must be managed effectively as it is dependent on independently driven technological developments.
      From DCC’s Investment in an Intangible Asset
    • The AWM Document Digitisation Process
    • Cornell’s digital imaging process map
      • Radiating out from the goals and deliverables of the project are the institutional resources
      • The outer wheel represents the processes or stages of digital imaging initiatives – clockwise from Selection
    • PRODUCTION: file formats and standards
      Commonly used formats:
      • TIFF
      • JPEG
      • GIF
      • PDF (accessible text!)
      Contemporary & future formats:
      • JPEG 2000
      • PNG
      • DNG
    • PRODUCTION: file formats – how and where they are used
    • PRODUCTION: scanners & cameras
      • Flatbed scanners
      • Map/plan scanners
      • Overhead scanners
      • Digital cameras
      • Book scanners
      • Book-edge scanners
      • Microfilm and slide scanners
    • PRODUCTION: software
      Image editing software
      • Consider: cost; hardware requirements; usability; functionality
      • Options: Adobe Photoshop CS3 (expensive/best) & Photoshop Elements (cheap); Gimp (free); + proprietary software for RAW files
      • Derivative, OCR and PDF production: Adobe Acrobat 9 Pro; OmniPage; ImageMagick (conversion software); Ghostscript (PDF interpreter); & pdftk (PDF toolkit)
      Other useful open source software:
      • JHOVE object validation
      • FedoraCommons object repository management system
      • ebXML e-business suite
      • Xena digital document preservation software (from NAA)
      • DSpace institutional repository system
      • DROID automated batch identification of file formats (from TNA UK)
      • OpenEdit; Razuna; ResourceSpace - open source & free DAM software
    • OUTPUT
      • Most descriptive metadata will come from your MARC records
      • If a separate database is needed: Access, SQL & Oracle
      Access options (also part of just doing it)
      • Collection OPACs, databases, Zoomify, EAD, DVDs, CDs
      • Other: Blogs, Facebook ArtShare, Flickr, Flickr Commons, Facebook page
      Search engine optimisation
      • How can I create a Google-friendly site?
      Consider: Speed (read/write, data transfer); Capacity; Reliability (stability, redundancy); Standardization; Cost; & Fitness to task
      Management, maintenance & preservation
      • Digital preservation practices
      • Preservation metadata
      • Trusted digital repositories?
    • Lessons
      What we want
      • Accuracy / authenticity
      • Accessibility
      • Searchability
      • Easy navigation & download
      • Cost effectiveness
      • Good quality product
      • Text capture and search (OCR) where poss.
      • Integration
      • Scalability
      • Web interactivity
      • Simple solutions
      What we are finding
      • Cost estimates escalate
      • Technology has limits, but is improving
      • You learn with new technology by doing
      • There is more to copyright than owning it
      • Anticipate needs & increasing expectations
      • $ hard to find for access (sponsorship?)
      • Better management & storage of assets
      • A need to educate managers & suppliers!
      • Keeping trained staff is a challenge
      • Costs/benefits of new technologies (risk?)
      • Importance of QA in projects!
      • Need for a strategic plan(s)
      • Be prepared to compromise
    • Enterprise Content Management: management, search & web facilities for digital assets and services
      Extensive digital asset management features
      Excellent electronic document & record management
      Intuitive web content management features
      Facilitate simple and complex workflow processes
      Extensive and unified searching constructs
      Compliant with all government recordkeeping requirements & emerging digital preservation standards
      Integrate easily with existing systems
      Simple to administer in terms of security, auditing & storage management
    • implementing user-friendly technologies
      • make sure they are findable and useable
      • pick a few “winners” & lead by example
      • collaborate & network
      • get involved in your core business
      • don't leave it just to IT-staff (get involved)
      • learn to compromise (the 80:20 rule)
      • experiment
      • start now! it is sometimes easier to seek forgiveness than gain permission
    • JISC 2007 – five key issues for digitisation
      Re-focus on the user (simple, easily found & used output)
      Aggregate and present content that can resonate with multiple communities
      Learn from Google & YouTube but keep your values
      New business models are needed, collaborating with and without the private sector
      More collaboration between publishers, curators, funders, users, vendors and standards bodies