Digital Libraries: Successfully designing, developing and implementing your digitization strategy


Rewritten paper - based on previous presentation


Published in: Education, Technology
Transcript

  • 1. Digital Libraries: Successfully designing, developing and implementing your digitization strategy. Presented by Beatrice Adera Amollo, Head of Library & Resources, Australian Studies Institute, Nairobi, Kenya. December 2012.
  • 2. Table of Contents
      INTRODUCTION
      KEY TERMS, PRINCIPLES AND LEGAL ISSUES
          Digitization
          Library portal
          Digital repository vs. Institutional repository
          Digital Library Management System
          Digital preservation
          Access rights
          Metadata
      THE DIGITIZATION STRATEGY
      DESIGNING & DEVELOPING THE DIGITIZATION STRATEGY
      PROJECT PLANNING
          Objective of digital collection or digitization
          Type of project
          The target audience
          Preliminary assessment
          Fund raising
          Legislative framework
      PRE-DIGITIZATION ACTIVITIES
          Digitization tools and resources
          Training Requirements
          Metadata for user access
              Who is responsible for metadata creation?
      IMPLEMENTING YOUR DIGITIZATION STRATEGY
      DIGITAL CONVERSION
          Document preparation
          Scanning pages
          Formatting and editing
  • 3. Table of Contents (continued)
          Creating a searchable database
          Making digital library accessible
      POST DIGITIZATION
          Quality Control
              Review and Evaluation of the Digitization Process
              Testing Access
          Continuity of digital collection
      REFERENCES
  • 4. INTRODUCTION

    Digitization is the process of taking traditional library materials in the form of books and papers and converting them to electronic form, where they can be stored and manipulated by a computer (Witten & Bainbridge, 2003, cited in Kamusiime & Mukasa, 2012). The United Nations (2011) describes it as the process of converting analogue items, such as paper records, photographs or graphic items, into an electronic representation or image that can be accessed and stored electronically. It involves translating data into digital form (binary coded files for use in computers). Scanning images, sampling sound and converting text on paper into text in computer files are all examples of digitization (Lopatin, 2006).

    Borgman (1999) summarises the definition of a digital library well. She defines digital libraries as ‘a set of electronic resources and associated technical capabilities for creating, searching and using information’. The common components of this and other existing definitions of the digital library are:
      • Organized information collection
      • Electronic or digital form
      • Preserved information
      • Online access

    A digital collection consists of digital objects that are selected and organized for ease of discovery, access and use. Objects, metadata and the user interface together create the user experience of a collection. According to the National Information Standards Organization (NISO), a good digital collection:
      • is created according to an explicit collection development policy;
      • is curated, which is to say, its resources are actively managed during their entire lifecycle;
      • is broadly available and avoids unnecessary impediments to use (collections should be accessible to persons with disabilities, and usable effectively in conjunction with adaptive technologies);
      • is interoperable;
      • is sustainable over time;
      • respects intellectual property rights;
      • has mechanisms to supply usage data and other data that allow standardized measures of usefulness to be recorded;
      • integrates into the user’s own workflow.

  • 5. Collections should be described so that a user can discover characteristics of the collection, including scope, format, restrictions on access, ownership, and any information significant for determining the collection’s authenticity, integrity and interpretation.

    Most libraries today are hybrid libraries. They have both physical and digital collections, and provide services both digitally and in a physical library space.

    Digital libraries:
      • Improve and widen access to a library’s collection. Whereas one hard copy of a document can only be read by one person at any given time, the electronic version can be accessed by multiple users at the same time.
      • Increase the longevity of information material. Since the information is electronic, damage and loss are greatly reduced.
      • Encourage and facilitate resource sharing amongst libraries. Digital collections can be shared or transferred over the electronic network. This saves money and time: users do not have to move from one library to another to look for information, but can access the collections of different libraries from one point.
      • Ensure standardization and conformity amongst libraries.
      • Reduce duplication of work. With improved access to a wider library collection, researchers and scholars can effectively review work that has already been done by their peers.
  • 6. KEY TERMS, PRINCIPLES AND LEGAL ISSUES

    Digitization
    Some materials in a digital library are ‘born digital’; that is to say, they were created, and are always used, in digital form. But much digital library content has to be created by a process of digitization. Digitization is the conversion of print-on-paper resources to digital form, usually by scanning.

    Library portal
    This is sometimes confused with a digital library. A portal is an interface to all the digital information resources that the library makes available to its users. It is a ‘gateway’ that gives access to the library’s own digital collections and to material from other library collections or web search engines.

    Digital repository vs. Institutional repository
    A digital repository is simply an organised store of digital information items, while an institutional repository is more specific. It is a special kind of digital library, managed by an organisation to make all its own digital materials available. For example, a university library might run a repository for all the articles and reports written by the university’s professors. Because the materials in this kind of repository are freely available to anyone, such repositories are sometimes called ‘open access archives’.

    Digital Library Management System
    A digital library management system (DLMS) is a kind of software system that provides the functions for creating and managing a digital library collection and for providing services to its users. DLMSs usually allow specialized software to be added to meet particular needs. Such systems may be sold by commercial suppliers, or may be built around standard repository software such as Greenstone, DSpace, EPrints, Fedora or WebAGRIS. These five are free, open source systems for managing digital libraries and repositories.

  • 7. It takes some programming knowledge to set up and customize most of this open source software. The library can buy a commercial software package if funds are available and there is a reason to do so; for example, if other cooperating libraries use it, for the sake of consistency. CONTENTdm digital library software and ExLibris DigiTool commercial repository software are two examples. The other alternative is to develop one’s own software, tailor-made for the library. This is a major task that may not be worthwhile, since free and open source systems can be modified to suit the library’s specifications. Adobe Acrobat Reader, which is freely available, and the Windows-based MS Access and Excel have also been used in creating and managing digital collections.

    Digital preservation
    Digital preservation involves all the activities undertaken to ensure that digital information is maintained for as long as it is needed. The digital information should remain available in its original form for the stipulated time.

    Access rights
    In the digital library, contents are usually licensed, and different users may have access to the same materials. There is a need for controls to monitor usage and counter misuse so that there is fair use of the authors’ intellectual output. Librarians need to make sure that users have access to the information to which they are entitled, and no more.

    Metadata
    Metadata is data about data. It is the information about information resources. This includes:
      • description of the item, i.e. physical description (format, size) and subject or topic;
      • author, title, publisher, date of publication;
      • preservation or archiving information; and
      • access information and copyright.

  • 8. THE DIGITIZATION STRATEGY

    Before embarking on a journey, a traveller must be clear about the purpose of the trip, where to board the plane or train, how much luggage is needed or allowed, and the destination. In the same way, a library must be clear about the objectives and resources for an effective digital library. There must be a strategic plan and framework to help map the digital project activities to the overall and final target.

    Ayris (1998) asserts that the library must be clear about why it needs to digitize and must answer the following questions before embarking on the plan. If the answers are mostly negative, then the digitization project cannot take off. The questions are:
      • Is the material to be digitized substantial enough to warrant digitization, and does it possess sufficient intrinsic value to ensure interest in digitization?
      • Will access to the information be enhanced or usage levels increased as a result?
      • Are the goals to be met agreeable and in line with the overall objectives?
      • Are rights and permissions for electronic distribution securable?
      • Does current technology yield images of sufficient quality to meet stated goals?
      • Does technology allow digital capture from a photo intermediate?
      • Can the project get sufficient funding?
      • Are the costs reasonable and within the target financier’s capability?
      • Does the institution have sufficient expertise in project management?
      • Is the overall organisational and technical infrastructure adequate?

    If the project team is confident that it can answer these questions in the affirmative and address them, then the digitization strategy can be formulated. A digitization strategy is a plan, laid out both theoretically and practically by the organization or library, that uses the available and required resources to create an effective digital collection.
  • 9. Digitization is a strategy that should be undertaken as part of the parent organisation’s overall information management programme. It should not be a stand-alone library initiative. The digitization strategy must be aligned to the digital project objective, which is directly connected to the overall organizational mission and goals.

    A carefully planned strategy is necessary since digital projects are extremely complex. They can be expensive and, if not well managed, can lead to greater expenses than necessary. The strategy must factor in all the requirements for the project to succeed. Effective project management, which includes managing budgets, staffing, workflow, determining technical specifications and metadata creation, is necessary. Given the scale of the physical information material, the digitization exercise will need the most cost-effective and efficient combination of approaches and tools to achieve long-term preservation of, and access to, the information material.

    The digitization strategy is therefore based on the digitization process. The United Nations (2011) outlines four key phases, which we will map against the three elements of the digitization strategy: design, development and implementation. The phases are:
      1. Project planning
      2. Pre-digitization activities
      3. Digital conversion
      4. Post-digitization

    In formulating the digitization strategy, the design, development and implementation stages are factored into the four phases. Project planning and pre-digitization activities are outlined in the design and development section of the strategy. Digital conversion and post-digitization fall under the implementation stage.
  • 10. DESIGNING & DEVELOPING THE DIGITIZATION STRATEGY

    PROJECT PLANNING

    When determining which items should be digitized and how, there are a number of critical planning issues to consider before commencing the actual digitization exercise. According to Lopatin (2006), digital projects can be extremely complex, and effective project management (including planning, managing budgets, staffing, workflow, determining technical specifications and metadata creation) is vital for a successful digitization project.

    Objective of digital collection or digitization
    When contemplating the digitization project, the project leader and the organization must agree on the main goals or purpose of the digitization exercise. They must have answered the initial questions to justify digitization within the library and the organization as a whole. The key objectives will stand out as the main guiding principle upon which all the planned activities are carried out within the project timeframe. The library or project team must settle on any or all of the following as the objective or reason for the digitization project.

    (i) For effective use of limited physical space
    There might be a lot of information but very little storage space. Digital records can be stored on electronic storage devices, reducing the need for physical space while also ensuring that the records are secure and protected from intrusion or damage.

    (ii) For improved information access and sharing
    Digitization and increased use of digital information improve information sharing by allowing access to digital information on web sites and information management systems. The library may be driven by the need to provide better access to its users. Information can also be disseminated efficiently through email and other forms of online communication.

    (iii) For preservation
    Information material is preserved when digitized, as this reduces handling of the original hard copies. Access to digital information is guaranteed over a longer period of time, as opposed to the print edition, which is subject to poor handling and misuse that lead to wear, tear and loss. The organization or library may undertake digitization simply to preserve its fragile materials.

  • 11. Apart from the three common objectives above, others may include:
    (iv) To add value to the information collection or resources
    (v) To support educational and research activities, if it is an academic or research institution
    (vi) To fulfil the strategic mission and goals of the organisation

    Type of project
    The type of digital project must also be established, as this will influence the strategy that is adopted. There are different types of digital projects, namely:
      • Special and archival collections
      • Reformatting content from other non-print resources
      • Born digital projects

    The target audience
    Apart from the type and objective, the strategy must be clear about who the digital collection is targeting. The team must be clear about its primary audience, and probably its secondary audience, particularly if there will be no restrictions on public access. The primary audience for a university digital collection could be scholars, university students and lecturers. The secondary audience would be anyone with access to the web, including the general public. The project must take all this into consideration.

    Preliminary assessment of the bulk of the project, resource requirements and stakeholder studies
    A value assessment needs to be conducted in order to establish the justification for a digitization project. The purposes or expected benefits of the digitization exercise form the basis of this value assessment. It will help the project team determine whether or not to proceed with the digitization plan. A table like the one below, showing each expected impact or benefit and the prevailing conditions that would necessitate such an action, can be included in the plan.

  • 12. Expected benefit | Prevailing condition
    Improved information access and sharing | The information materials are accessed frequently and used constantly within the organization or library. The print copies can be accessed by only one person, or a limited number of persons, at any given time.
    Preservation and protection of information | The information in the materials has historical and legal value, or is still active or semi-active but in a fragile condition. The materials contain vital information that requires protection and preservation as stipulated by institutional or national law.
    Reduction of printed information material; less storage space requirement | Information material storage costs outweigh the benefit of keeping physical resources of temporary value. The storage space is minimal or fast running out. The organization is implementing an information system that will facilitate access to or retrieval of digitized information from the library.

    If the information materials are frequently accessed and used, digitization will ensure better information sharing and ease of dissemination. Materials that are in danger of damage due to aging or their fragile nature are the best candidates for digitization. A digital master copy can be created, and a copy of it made available for access, to ensure their preservation.

    The library must also estimate the bulk, type and source of the material to be digitized. The amount will help the project team estimate the time, staff and equipment required. The information collected may be tabulated as shown in the table below.
  • 13. Table 1: Example of a preliminary digitization materials assessment for an organisation

    Department | Types of information generated/stored | Size/Volume | Purpose & target users | Access/user controls
    Research & Development | Research papers, articles, reports, conference papers, etc. | X | X | Restricted - internal
    Documentation Unit | Subscription back files, researchers’ profiles, theses, dissertations | X | X | Public - approval required
    Library | X | X | X | Open

    The library might have very little control over how fast material is received from other organizational departments, as opposed to what is readily available within the library’s jurisdiction. Hence the need to sell the digitization strategy at all levels of the organization: top, middle and bottom.

    The digitization assessment will entail an assessment of the users and stakeholders. A stakeholder study that entails collecting data from the target users, creators and contributors of the digital collection should be conducted to establish what these groups expect and need from such a project. This will also be a platform for the library to educate and inform them about the purposes of digitization and the expected benefits for the individual, the library and the organization. Methods that can be used to gather the required information are listed below.

    Methods of assessment
      • Interviews
      • Administering of questions through questionnaires
      • Workshops for all stakeholders
      • Meetings with management staff
  • 14. During the assessment stage, the library should garner sufficient support from the parent organization and the contributors to ensure growth and sustainability of the digital collection.

    Fund raising
    Fund raising and mobilization of support for implementation and sustainability of digital projects

    Library budgets are perpetually stretched thin. Anything that requires a major share of library resources is therefore a challenge that calls for major preparation. Digitization is expensive, particularly when unique materials are involved. Although some people might think digitization is much like photocopying, that is not the case. Hard copy material sometimes needs to be prepared before it is digitized. The material may be fragile and need to be conserved first to minimize damage during scanning.

    Hughes (2004) warns that “there are no short-term cost savings to be realized by digitizing collections”. Digital initiatives require high start-up costs, and the digital environment is not static. Data migration is not a “once-in-a-lifetime” thing; it is ongoing.

    Digitization is time consuming and labour and resource intensive. This means that the library or project team might have to go out of its way to ensure that the project succeeds. It may require more than the usual number of library staff to finish the work in reasonable time. Additional temporary or casual staff may have to be recruited to assist in the exercise. These costs, combined with chronic budget constraints, force institutions to make choices about what to digitize first. The entire exercise is gradual and may have to stretch over a lengthy period of time.

    For example, for text that is keyed in or generated by OCR (Optical Character Recognition), character accuracy is the conventional measure of quality. Character accuracy is vital if the information is to be disseminated as it is, without distorting meanings and context. Although some OCR software generates confidence scores that predict the likelihood of errors being present on any page, human comparison of the digitized output with the source, or manual keying of the entire page a second time, may have to be done to ensure accuracy. Thus, the process of scanning, converting to editable text and proofreading requires a lot of time.
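    The paper does not prescribe a formula for character accuracy. As a minimal sketch, assume it is taken as one minus the edit distance between a proofread reference text and the OCR output, divided by the length of the reference. The function names and sample strings below are illustrative only and do not come from any particular OCR package.

```python
def edit_distance(reference: str, candidate: str) -> int:
    """Levenshtein distance: minimum single-character edits between two strings."""
    previous = list(range(len(candidate) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, cand_char in enumerate(candidate, start=1):
            substitution = previous[j - 1] + (ref_char != cand_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Assumed definition: 1 - (edit distance / reference length), floored at 0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical sample: the OCR engine misread one "i" as "l".
proofread = "digitization"
scanned = "digitizatlon"
print(f"{character_accuracy(proofread, scanned):.3f}")
```

    A score like this still needs a trusted reference text, which is why proofreading or double keying remains part of the cost.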
  • 15. The project team may have to get additional funding, outside the usual budgetary allocation, in order to achieve its goals. Ways of raising the capital for the necessary resources have to be outlined in the strategy or plan. Some of the ways of raising funds include:
      1. The institutional budget
         a. Present a proposal that will convince the management of the benefits of the digital collection.
      2. Private or public donor agency
         a. Here, the library will have to draft a proposal using the donor’s prescribed format.
      3. Library budget
         a. Use the allocated library budget to build the collection if there is sufficient staffing and basic equipment. This is rare.
      4. Foundations, public and private
      5. Stakeholders’ contributions
         a. If the stakeholders are convinced of the advantages of the digital collection, then they can all contribute towards its realization. This will also depend on the type of library and the guiding principles of the parent organization.

    Legislative framework
    One of the main issues in digitizing a library’s collections or items is copyright law. Before beginning the digitization process, librarians have to consider whether or not digitizing the material will violate copyright and intellectual property laws (Yan, 2004). They need to consider whether the material to be digitized is protected by copyright law or is in the public domain. Works in the public domain can be used freely without seeking permission or paying royalties or fees.

    Many institutions struggle with the question of whether it is permissible to digitize and index protected materials for preservation purposes only. Current laws can be unclear on whether it is allowable to digitize rights-protected materials for preservation purposes, even if there is no intention to publish the content.
  • 16. If the material to be digitized is not in the public domain, and if copyright exceptions are not applicable, the library will need to obtain permission from the copyright holder to digitize the material. Lopatin (2006) warns that tracking down the copyright holder(s) and coming to an agreement with them can be a very difficult and time-consuming process.

    One member of staff or of the project team can be assigned the responsibility of confirming the copyright status of material. Hughes (2004) says that this person will be responsible for clearing copyright, including identifying and contacting the copyright holder(s) and documenting all the steps taken.

    After an institution mounts a digital project on the web, it needs to manage the rights to the digital objects. Ways in which institutions protect their digital material include:
      • Inserting banners or captions into the image, usually along its border.
      • Using watermarks, that is, inserting marks or labels into the digital content in a subtle or transparent way.
      • Using low-resolution images, which lack the detail needed to make useful copies.
  • 17. 14 PRE-DIGITIZATION ACTIVITIES The project team should examine and address requirements for information material sorting, classification, metadata requirements, digitization, information security, and disposition that will be required to support digitization efforts. The following must be factored into the strategy and included in the plan for execution. Digitization tools and resources Digitization process as seen above requires additional and specific resources. Tools and resources such as scanners, imaging software, a storage solution and staff are mandatory. The team must consider these when budgeting and assessing the available resources. Tools required for digitization include: Digital scanner; Imaging and scanning software; Electronic record keeping system; Electronic records storage device, server; Records management programme with file plan, retention schedule and metadata schema. Before the equipment is acquired the team must decide how the information will be used. Will they be solely for access via the web, for hardcopy publication, or perhaps for detailed examination by scholars. Low-resolution images are sufficient for web use, but high-resolution images are required for publication or detailed study. (Eileen, 2004). The main equipment for converting print to electronic is a scanner and digital camera. The type of scanner chosen will depend on the document type and volume of records. Scanning and imaging software and file type (PDF, JPEG etc) will also depend on the system to be used and the user needs. Yan (2004) noted that there was a trend toward using mounted digital cameras rather than the famous flatbed scanners. However, over time different types of scanners have been produced for
  • 18. 15 the market. Libraries are, however, constrained by restrictive budgetary allocations that may not give librarians the luxury of purchasing high-end scanners. Since most libraries are cash-strapped, we will assume that you have a flatbed scanner (Figure 1) for the job. A flatbed scanner may be slow but, with proper planning, it will help the library convert a sizeable number of documents within a reasonable time. Figure 1: Flatbed scanner Figure 2: Overhead scanner If, on the other hand, the library has the opportunity to choose and acquire a new scanner, then the following specifications should be considered: 1. Pixel size of the frame, for example 7000 pixels x 7000 pixels. 2. Non-interpolated (optical) resolution. Interpolation refers to the process of "estimating" the value of a pixel from the color values of surrounding pixels, so it adds pixels without adding real detail. The higher the non-interpolated resolution, the better. 3. Dynamic range. This is the ability to capture detail in highlights and shadows.
  • 19. 16 4. Bit depth. Most software processes up to 24-bit depth, but many scanners can capture up to 48-bit depth. Training Requirements Staff are needed to perform such responsibilities as scanning, performing quality control, and creating the metadata. To ensure that all members of the project team are well versed in the digitization processes, a training programme should be developed. Training all staff will help the project proceed smoothly, and it will give the project leader a chance to answer any questions and address any fears raised by project members. The digitization training programme will include the following: an overview of the digitization initiative, its size, timeframe, purpose, desired outcomes and project manager; how to use the digitization hardware and software; digital image formats; proper handling techniques to avoid damage to the information material; the classification scheme for organizing images and other digital objects; maintaining the material in its original order; how to identify and process records containing sensitive information; digitization documentation requirements; how to identify and process records that require specialized digitization techniques, such as photographs or large-format items; and the standards and procedures to ensure quality control. Metadata for user access Yan (2004) adds that there is a need for widespread standardization to improve information access within and between libraries. A library that is preparing to digitize should develop rules and standards specifically for the processes it intends to use.
  • 20. 17 Standardization ensures interoperability between cooperating libraries. Metadata standards and guidelines are commonly sought when planning digitization projects. Metadata is structured information associated with an object for purposes of discovery, description, use, management, and preservation. Lopatin (2006) adds that good metadata creation is important for representing information about an object such as its structure, creators, format, and technical details. Thus, the creation of metadata is considered a major component of the digital project. Different types of metadata include: (i) descriptive metadata, which helps users find and obtain objects, distinguish one object or group of objects from another, and discover the subject or contents. Descriptive metadata commonly follows Dublin Core (DC), which is one of the most active and accepted metadata standards and is used in more than 20 countries; (ii) administrative metadata, which helps collection managers keep track of objects for such purposes as file management, rights management, and preservation; and (iii) structural metadata, which documents relationships within and among objects and enables users to navigate complex objects, such as the pages and chapters of a book. Apart from Dublin Core, other commonly used metadata standards and encoding formats include Resource Description Framework (RDF), Encoded Archival Description (EAD), Text Encoding Initiative (TEI), and Standard Generalized Markup Language (SGML) and its descendants Extensible Markup Language (XML) and HTML. The MARC standard has been used as the standard interchange format for representing catalog records electronically (Yan, 2004).
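To make the idea of a descriptive Dublin Core record concrete, the following is a minimal sketch in Python using only the standard library. The element names (title, creator, date, format, language) and the namespace URI belong to the Dublin Core element set; the wrapping `record` element, the function name `build_dc_record` and the sample values are assumptions made for illustration, not part of any particular library system.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return a simple Dublin Core record as an XML string.

    `fields` maps Dublin Core element names (e.g. "title") to text values.
    The enclosing <record> element is a hypothetical wrapper.
    """
    root = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values for one digitized item.
example = build_dc_record({
    "title": "Annual Report 2010",
    "creator": "University Library",
    "date": "2010",
    "format": "application/pdf",
    "language": "en",
})
print(example)
```

A real project would normally produce such records through its digital library system rather than by hand; the sketch only shows how little structure a basic Dublin Core description requires.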
  • 21. 18 Table 2: Example of some basic metadata elements for an information item
    METADATA ELEMENT      OBLIGATION
    Identifier            Mandatory
    Title                 Mandatory
    Subject               Optional
    Description           Optional
    Creator               Mandatory
    Date                  Mandatory
    Addressee             Mandatory for email, optional for other records
    Record type           Mandatory where applicable
    Relation              Mandatory where applicable
    Function              Optional but highly recommended
    Aggregation           Mandatory
    Language              Mandatory
    Location              Optional
    Security & Access     Mandatory
    Disposal              Mandatory
    Format                Mandatory
    Preservation          Optional
    Source: United Nations, 2011
    The two metadata schemes that have been developed primarily to accommodate digital resources are: i. the Metadata Object Description Schema (MODS); ii. the Metadata Encoding and Transmission Standard (METS). MODS was developed by the Library of Congress to provide an alternative between a simple metadata format with a minimum of fields and little or no substructure, such as Dublin Core, and a very detailed format with many data elements having various structural complexities, such as MARC 21 (Guenther and McCallum, 2003, p. 12). MODS is a descriptive metadata scheme.
  • 22. 19 METS is an XML document incorporating different types of metadata: descriptive, administrative, structural, rights and other data needed for retrieving, preserving and serving up digital resources (Guenther and McCallum, 2003, p. 14). An increasing number of libraries and archives are implementing METS for a wide variety of digital library projects. The digitization strategy must include details of the metadata and vocabulary sets to be used for the digital project. Decisions on which metadata standard(s) to adopt and what levels of description to apply should be made within the context of the main purpose for creating the digital collection, the available resources, the users and the intended usage. Examples of metadata-based access tools include library catalogs, archival finding aids, museum inventory control systems, and search utilities such as Google. Who is responsible for metadata creation? At the creation stage, metadata about an object's authors, contributors, source, and intended audience is usually provided by the original authors. At the organization stage, metadata about subjects, publishing history, and access rights is recorded by catalogers or indexers. At the access and usage stage, evaluative information such as reviews and annotations can be added by the user. The National Information Standards Organization (2007) states that the creators of digital objects should be encouraged to embed as much metadata as possible within the object before it is shared or distributed.
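Whoever creates the metadata, it helps to check each record against the obligation levels agreed in the strategy. The sketch below assumes a record is held as a Python dictionary and checks it against the elements marked unconditionally "Mandatory" in Table 2. The key spellings (for example `security_and_access`) and the function name are hypothetical, and the conditionally mandatory elements (Addressee, Record type, Relation) are deliberately left out.

```python
# Elements marked "Mandatory" without conditions in Table 2.
# The lower-case key spellings are an assumption made for this sketch.
MANDATORY = {
    "identifier", "title", "creator", "date", "aggregation",
    "language", "security_and_access", "disposal", "format",
}


def missing_mandatory(record):
    """Return the sorted names of mandatory elements that are absent or blank."""
    return sorted(
        name for name in MANDATORY
        if not str(record.get(name, "")).strip()
    )


# Hypothetical record: the title is blank and no disposal rule is given.
sample = {
    "identifier": "LIB-0001",
    "title": "  ",
    "creator": "University Library",
    "date": "2010-06-01",
    "aggregation": "Theses",
    "language": "en",
    "security_and_access": "Open",
    "format": "application/pdf",
}
problems = missing_mandatory(sample)
print(problems)
```

A check of this kind can be run before items are loaded into the digital library, so that incomplete records are returned to the cataloger rather than published.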
  • 23. 20 IMPLEMENTING YOUR DIGITIZATION STRATEGY DIGITAL CONVERSION Digitizing information starts with images of pages, progresses through raw character recognition to corrected text, and finishes with the assignment of descriptive data for easy retrieval. Each step provides more utility but at additional cost (Berger, 1999). Below are the steps to take when dealing with any information that is in hard copy. Born-digital material is handled differently, as it requires less work to add to the library's digital collection. Document preparation (i) Remove pages from their bindings; (ii) trim each page so that no rough edges are left; (iii) mend rips and tears with acid-free binding tape; (iv) register the documents to keep track of them. A record of the documents must be kept for reference whenever the need arises. The register helps to show who is handling what and how far the library has gone in the digitization project. Scanning pages To scan a document, place it face down on the scanner platen, or put the pages into the sheet feeder. If you have a sheet-fed scanner, cut the book open (easy and neat if you use a printer's cutting machine) to get individual sheets that you can feed through the scanner. If necessary, you can rebind the books later. Before scanning, the library has to set the parameters to be applied to all the documents. This will ensure uniformity. Figure 3: An example of defined parameters
  • 24. 21 The scanner usually comes with software that helps to convert the scanned image into editable text. Formatting and editing The scanned image can be saved as a PDF (Portable Document Format) file using Adobe Acrobat, or as a GIF (Graphics Interchange Format) or JPEG (Joint Photographic Experts Group) file. The format is determined at the planning stage. Most people prefer to save the images as JPEG, since JPEG images can be displayed directly in web (HTML) pages and JPEG compression produces small files. JPEG compression is lossy, however: it discards some image detail, so a high-quality master copy should be kept for archiving. JSTOR uses GIF files to display the textual images. Images should be scanned at the highest quality possible, not only to have a good-quality copy to archive, but also to ensure that the scanned copy does not become obsolete as technology improves. It is preferable to scan and save all documents first before embarking on the editing. A file structure should be created to save the images. Directories with their sub-directories should be set up for each series or volume of documents. For example, each scanned page can be named individually (e.g. Thes_12010-1p22), all pages in a document can be saved together in a folder (e.g. Th2010-1) and all issues can be saved together in a volume folder (e.g. Thes_1). This logical progression makes it possible to keep the order intact, even when an individual page is missing and has to be scanned at another time. The scanning should be done in chronological order. When editing images, blemishes must be removed from the pages, the height and width of the images have to be standardized, and the images must be blurred or sharpened as necessary. Photo imaging software applications, such as Paint Shop Pro and some versions of Adobe Photoshop, can be used to do this work. Optical Character Recognition (OCR) software is then used to produce text versions of the images. OCR is the process whereby a computer program "reads" the text from an image of a document and converts it into machine-readable text such as ASCII. The use of OCR is ideal because it converts scanned pages of text into electronic text, which enables full-text searching.
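The volume folder, document folder and page file progression described above can be generated programmatically so that every operator names files the same way. The following is a minimal sketch; the naming pattern it uses (series_year, series_year-issue, zero-padded page number) is a hypothetical scheme chosen for illustration and is not the exact pattern of the example names given in the text.

```python
from pathlib import PurePosixPath


def page_path(series, year, issue, page, ext="jpg"):
    """Build volume/issue/page path components for one scanned page.

    The naming pattern is an assumption made for this sketch:
      volume folder: <series>_<year>
      issue folder:  <series>_<year>-<issue>
      page file:     <series>_<year>-<issue>_p<page, zero-padded>.<ext>
    """
    volume_dir = f"{series}_{year}"
    issue_dir = f"{series}_{year}-{issue}"
    # Zero-padding keeps pages in the right order when sorted by name.
    file_name = f"{issue_dir}_p{page:03d}.{ext}"
    return PurePosixPath(volume_dir) / issue_dir / file_name


example_path = page_path("Thes", 2010, 1, 22)
print(example_path)  # Thes_2010/Thes_2010-1/Thes_2010-1_p022.jpg
```

Because the page number is part of the file name, a page that is missing and scanned later still lands in its correct position in the folder.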
  • 25. 22 PDF files are a favorable choice, as most academic users are accustomed to them and already have Acrobat Reader; if not, the Reader is free and easy to install. Acrobat uses an efficient compression scheme and provides a user-friendly means of navigating through documents. It also automatically sizes each page for viewing and printing (Anonymous, 2004). Figure 4: Diagrammatic representation of the process, from single hard copy to digital copy Creating a searchable database of the text, filing structure The work involved in creating the searchable database will be determined by the DLMS that the library decides to adopt for its collection. Regardless of the system adopted, certain actions must be taken in order to make the digital items searchable and accessible to the target audience. For the digital content to be searchable, accurate metadata must be assigned to each item. The library must agree on the standards to use in this exercise. The digital content and metadata
  • 26. 23 about physical content must be contained, controlled, and understood in order for the basic functions of content management and content preservation to occur. Metadata can be assigned at the point of editing and formatting, or collectively afterwards. The order is determined by the staff beforehand. Creating a searchable database of the digital content becomes possible once the metadata has been created. Many institutions, including large universities, have created their own databases using such tools as SQL database servers and MS Access. MySQL is an open source database that has become popular in nonprofit environments. Many libraries have also used FileMaker Pro, a cross-platform relational database application from FileMaker Inc. FileMaker Pro and MS Access require experience with relational databases. They are not as scalable as SQL database servers, i.e., they become sluggish and start to break down at about 70,000 records. FileMaker Pro is cross-platform, whereas MS Access works only in a PC environment. Making the digital library accessible The final stage is the migration of the digital collection or database to the web, to allow for remote or wider access. At this stage, the user-friendliness of the system is important. The online version must therefore be searchable, should require a minimum of side-to-side scrolling and should load fairly quickly, and the pages need to be clear and easy to read. Some of these features may be built into the system, so the library staff might have little control over the extent to which they can configure the user interface. For instance, the navigation mechanisms of Greenstone are good and easy to use. The user can move through a document page by page, or jump ahead or backwards by clicking on a page number. Variations of the "Next page" and "Previous page" buttons also exist to facilitate navigation.
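The idea of a searchable metadata database can be illustrated with a small sketch. It uses SQLite through Python's standard `sqlite3` module purely because it needs no server; this is a substitution for illustration and not one of the products named above. The table layout, column names, function names and sample rows are all hypothetical.

```python
import sqlite3


def build_catalog(items):
    """Create an in-memory catalog table and load (id, title, creator, path) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        "identifier TEXT PRIMARY KEY, title TEXT, creator TEXT, file_path TEXT)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", items)
    conn.commit()
    return conn


def search_titles(conn, term):
    """Return (identifier, title) pairs whose title contains `term`."""
    cur = conn.execute(
        "SELECT identifier, title FROM items "
        "WHERE title LIKE ? ORDER BY identifier",
        (f"%{term}%",),  # parameter binding avoids SQL injection
    )
    return cur.fetchall()


# Hypothetical sample rows.
catalog = build_catalog([
    ("T001", "Soil erosion in highland farms", "A. Author", "Thes_2010/T001.pdf"),
    ("T002", "Library automation in practice", "B. Author", "Thes_2010/T002.pdf"),
])
hits = search_titles(catalog, "library")
print(hits)
```

In SQLite the `LIKE` operator ignores case for ASCII letters by default, which is why the lower-case search term matches the capitalized title. A production system would more likely rely on the search facilities of the chosen digital library software.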
  • 27. 24 POST DIGITIZATION After all the material is digitized, the project team will not 'close shop'. As stated elsewhere, the digitization process is an ongoing activity, since new information is acquired or generated every day. The strategy must therefore include activities that have to be undertaken after the digital library is established. The post-digitization activities derived from the United Nations (2011) standards are discussed in this section. Quality Control Quality control measures include periodic testing and cleaning of scanning equipment, imaging calibration and verification of authenticity. The details of the quality control measures will depend on the digitization tools, such as the software, the scanners and the digital management system. Quality control measures for checking digital images should be applied throughout the process, preferably by an assigned team member who can chart any errors. The process includes checking individual files for any errors in arrangement, imaging or classification. For large volumes, random sampling (5% to 10%) can be applied. Review and Evaluation of the Digitization Process As with any other process, it is productive to review the digitization process and evaluate how effective it has been. This is an opportunity to identify any redundancies, inefficiencies or problems in the process and to revise and enhance the process accordingly. Any changes to the process must be reflected in an updated training programme. Testing Access It is important to test the retrievability of, and access to, digital images that have been captured into a database, server or some other electronic storage system. A retrieval exercise involving a sample of digitised records should suffice.
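The random sampling mentioned under Quality Control can be sketched as follows. The function name, the `percent` and `seed` parameters and the file names are assumptions made for this illustration; the 5% default follows the lower end of the range given above.

```python
import math
import random


def qc_sample(files, percent=5, seed=None):
    """Pick a random sample of files for manual quality checking.

    `percent` is the share of files to check (5 to 10 for large volumes).
    At least one file is returned for any non-empty batch. Passing a `seed`
    makes the selection repeatable, which helps when the check is audited.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    files = list(files)
    if not files:
        return []
    size = max(1, math.ceil(len(files) * percent / 100))
    rng = random.Random(seed)
    return sorted(rng.sample(files, size))


# Hypothetical batch of 200 scanned pages.
batch = [f"page_{n:03d}.jpg" for n in range(1, 201)]
to_check = qc_sample(batch, percent=5, seed=1)
print(len(to_check))  # 10
```

The assigned team member would then open each selected file, check it for errors in arrangement, imaging or classification, and chart the results.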
  • 28. 25 Continuity of the digital collection The digital library should not be static. The collection needs to be updated, upgraded and expanded with time. The initial phase is the hardest. The library should set regulations for the continuity of the digital collection. In the case of an academic library, staff, students and scholars have to be brought on board and made part of the digitization project. The importance, benefits and use of the digital library must be communicated to them through memos, newsletters, seminars and workshops to ensure that the collection continues to be current and relevant. Guidelines set by the parent institution to support the digital library will go a long way in ensuring that the collection continues to grow consistently. For example, the following policies can be set to ensure continuity: (1) Library users contributing their own literary works must submit them in electronic format. For example, in a university, theses or dissertations can be submitted in PDF format. (2) Documents must be stored electronically. (3) Documents must then be made accessible online.
  • 29. 26 REFERENCES
    Adam Northam. (2010). We started to a digital collection for next to nothing and you can too. Computers in Libraries, 30(5), 15.
    Anonymous. (2004). Levels of service for text digitization. Library Technology Reports, 40(5), 39.
    Ayris, P. (1998). Guidance for selecting materials for digitization. Joint RLG and NPO Preservation Conference: Guidelines for Digital Imaging, Warwick. Retrieved from http://eprints.ucl.ac.uk/492/1/paul_ayris3.pdf
    Berger, M. (1999). Digitization for preservation and access: A case study. Library Hi Tech, 17(2), 146.
    Borgman, Christine L. (1999). What are digital libraries? Competing visions. Information Processing and Management, 35, pp. 227-243.
    Eileen Mathias. (2004). Anatomy of a digitization project. Library Journal, 129(1), S2.
    FAO. (2010). Digital libraries, repositories and documents. Information Management Resource Kit (IMARK). Retrieved from http://www.imarkgroup.org
    Hopkinson, A. (2009). Library automation in developing countries: The last 25 years. Information Development, 25, p. 304. DOI: 10.1177/0266666909349678
    Hughes, L. M. (2004). Digitizing collections: Strategic issues for the information manager. London: Facet. p. 7.
    Kamusiime, N., & Mukasa, G. (2012). The value of managing library information resources in digital form in Uganda. Presented at SCECSAL XXth Conference, 4th-8th June 2012, Nairobi, Kenya.
    Keller, M. A. (2009). Establishing a digital library. [White Paper]. Sun Microsystems, Inc. Retrieved from https://www.sun.com/offers/details/digital_libraries.xml
    Lopatin, L. (2006). Library digitization projects, issues and guidelines: A survey of the literature. Library Hi Tech, 24(2), pp. 273-289. DOI: 10.1108/07378830610669637
    National Information Standards Organization (NISO). (2007). A framework of guidance for building good digital collections (3rd ed.). NISO Framework Working Group with support from the Institute of Museum and Library Services. Retrieved from http://www.niso.org/publications/rp/framework3.pdf
    Seadle, M., & Greifeneder, E. (2007). Defining a digital library. Library Hi Tech, 25(2), pp. 169-173.
  • 30. 27
    Singh, G., Mittal, R., & Ahmad, M. (2007). A bibliometric study of literature on digital libraries. The Electronic Library, 25(3), pp. 342-348.
    Sutherland, J. (2008). A mass digitization primer. Library Trends, 57(1), pp. 17-23.
    United Nations. (2011). Record-keeping requirements for digitization. Department of Management, Archives and Records Management Section. Retrieved from www.archives.un.org/ARMS/.../Standard_RKreqfor%20digitisation3.pdf
    United Nations Public Administration Network. (2010). Building digital libraries in Africa. World Summit on the Information Society, Geneva 2003 - Tunis 2005. Retrieved on June 20, 2010 from http://www.unpan.org/Library/SearchDocuments/tabid/70/ModuleID/985/mctl/DocumentDetails/did/22047/language/en-US/Default.aspx
    Witten, I. H., & Bainbridge, D. (2003). How to build a digital library. San Francisco, CA: Morgan Kaufmann Publishers.
    Yan Quan Liu. (2004). Best practices, standards and techniques for digitizing library materials: A snapshot of library digitization practices in the USA. Online Information Review, 28(5), 338.
