Information Audit Project

COLLEGE OF INFORMATION STUDIES, UNIVERSITY OF MARYLAND, COLLEGE PARK
INFM 736 – Information Management Team Experience
Organization: Niels Bohr Library & Archives
Akashdeep Ray, Jeroen de Lange, Nishita Thakker, Thet Oo, You Zheng
5/13/2010

Table of Contents
Executive Summary
1. Introduction
2. Background Information
3. Project Rationale
4. Information Audit
5. Methodology
   5.1 Interviews
   5.2 Literature Review
6. Business Process Maps
   6.1 Collections/Manuscripts
   6.2 Photo Collections
   6.3 Oral Histories
   6.4 Books
7. Existing IT Structure
8. Fit Gap Analysis
   8.1 Organization
   8.2 Usability
9. Literature Review
   9.1 Standards and Policies
   9.2 Adopting IT Systems
   9.3 Case Studies
10. Recommendations
11. Limitations
Appendix
References

List of Diagrams
Diagram 1 Business Process of Collections/Manuscripts
Diagram 2 Business Process of Photo Collections
Diagram 3 Business Process of Oral Histories
Diagram 4 Business Process of Books
Diagram 5 Existing IT Structure

List of Tables
Table 1 Introduction of IT systems
Table 2 Different file formats used by NBL&A
Table 3 Overview of Metadata collected by NBL&A in ICOS
Table 4 Overview of Metadata used by NBL&A outside ICOS

Executive Summary:
This project audits the digital assets of the Niels Bohr Library & Archives and maps the
business process flow for each of its information assets. The information audit was
conducted by a team of graduate students from the University of Maryland as part of a team
capstone project.
Information was gathered through interviews conducted at the organization with the
staff members responsible for cataloguing and archiving different types of assets. A comparative
literature review was also carried out to understand current industry trends in the information
technology and processes used by libraries and archives. Business process maps were created
from this information, and together they give an overall picture of the current situation at the
organization.
A fit-gap analysis was also performed to identify where the current systems fall short of the
organization's needs. Key findings were that multiple systems are used for the various digital
assets, work is organized in silos, collections are connected by cumbersome HTML links,
different types of collections (single/multi-item) are not stored and organized in an integrated
manner, data must be re-entered manually, customized access control is lacking, and there is no
unified search engine for the various digital assets.
The IT systems for digital asset management in the library and archives environment are
changing rapidly, which makes any purchase decision risky. Our tentative recommendations are to
adopt common, open industry standards for data and to adopt open source technology;
however, any selected system will require technical expertise to customize it to the
organization's needs.

1. Introduction:
The aim of the project was to perform an information audit for the Niels Bohr Library &
Archives (NBL&A): to map its business processes thoroughly, identify problems with the
existing information environment, perform an in-depth problem analysis and offer a range of
broad recommendations. The report first explains the project rationale, the problem
statement and scope, and the analysis approach and methodology. The business process
maps are then explained for each group of digital assets that staff members work with at the
NBL&A. The current IT platform for the digital assets is explained along with its existing
problems. The fit-gap analysis compares the current situation with the ideal system and shows
how such a system could resolve certain issues. Finally, a broad range of solutions is
offered to the NBL&A to improve its business processes.
2. Background Information:
The American Institute of Physics (AIP) is a non-profit organization, which “promotes the
advancement and diffusion of knowledge of physics and its application to human welfare”. In
order to accomplish its mission, AIP supports ten physics and astronomy societies (e.g. the
American Astronomical Society, American Physical Society and Society of Rheology; AIP, 2010b)
with publishing, membership administration, and organizing exhibits and conferences. Moreover,
AIP supports individual scientists, students and the general public by offering a career
network, preserving the history of physics, and educating and supporting teachers in making
the history of physics known. However, AIP's core business is publishing and selling
advertisements in its 50 journals, which earned it $77.2 million in 2009 (AIP, 2010c).
The Niels Bohr Library & Archives and the Center for History of Physics are divisions of
the American Institute of Physics that share a common mission: to help preserve and make
known the history of modern physics and allied sciences. The Library & Archives serves both as
a repository and a clearinghouse for information in the history of physics, astronomy, geophysics
and allied fields. In-house holdings include an outstanding collection of textbooks, monographs,
biographies, and related publications, dating mostly from ca. 1850–1950; over 30,000
photographs and other images; ca. 1,000 oral histories with many of the outstanding figures in
the fields that we cover; and archival records of AIP and its Member Societies along with other
archival records and personal papers of a select number of scientists. All of these materials are
indexed online.
As a clearinghouse, the NBL&A maintains and updates the International Catalog of Sources
for the History of Physics and Allied Sciences (ICOS for short), which contains descriptions of
over 9,000 archival and manuscript collections, oral history interviews, and other primary
sources in our fields at ca. 900 repositories worldwide. As part of its efforts, the NBL&A
actively encourages scientists' home institutions to support archival programs that preserve their
papers and the institution's history. The NBL&A also preserves the records of AIP and its Member
Societies and occasionally the papers of individuals, such as Goudsmit, whose papers do not have a
natural home elsewhere.
3. Project Rationale:
This information audit was started for several reasons. First, the status quo is
changing: AIP has adopted a publishing-based content management system called Polopoly.
Because AIP has a centralized IT strategy and limited resources (staff), it cannot provide
support for the NBL&A's unique IT needs. Second, the NBL&A's current IT systems are not adequate
for its future needs and goals. This is the result of various factors, such as an ad hoc IT strategy
over the years, expansion of the collection through grant-based projects such as the Oral History
Interviews and the Goudsmit Digitization Project, an increasingly diverse user base, and the rise of
born-digital files.
Over the last year the NBL&A has tried several systems that it felt
would fulfill the requirements, but these systems fell far short of expectations. In order to
determine what type of system would work for its purposes, the NBL&A needs to carefully assess its
needs, current situation and processes. Thus, our aim was to understand all the existing processes
and identify the NBL&A's information needs in order to help it make a better-informed decision.

4. Information Audit:
An information audit is an analysis technique used to assess information needs
and assets. The information audit for this project assessed the needs and correlated
them with the current information landscape at the NBL&A. Based on this audit, one can
determine whether the existing information environment is aligned with the goals and objectives
of the organization. The audit helped us understand possible solutions for improving the existing
conditions while keeping in mind the constraints that may exist for the organization.
Several tools were used while performing the information audit:
• Business process maps – visualize the inflow and outflow of information, the
  type/format of information and the physical/logistical location of information, and
  identify potential issues and inefficiencies.
• Use case and activity diagrams – depict key users and dependencies within the
  organization.
• Fit-gap analysis – check the actual performance of the NBL&A against the desired
  performance and describe the gap to identify needs, purposes and objectives.
Information maps and business process maps allowed us to gain in-depth knowledge of the
current assets and processes. While identifying the current system, we mapped out the information
environment as a whole, which gave unique insights into the existing IT structure at the NBL&A.
Activity diagrams targeted detailed workflows with the stepwise actions of various
organization members. These members' workflows are described in detail to understand the
behaviors involved with the various digital assets as categorized by the NBL&A. The diagrams
helped us better understand information flows within the NBL&A and assisted in identifying
the roadblocks in various processes. Moreover, once the problems are clearly defined,
potential solutions can be considered, which in turn identify future system requirements.
After mapping out the information environment, we performed a fit-gap analysis to check the
performance of the ongoing activities and procedures at the NBL&A against the desired
performance or “ideal scenario”. The first step of the analysis was to describe the desired
situation that would be ideal for the activities at the NBL&A. The second step was to compare the
ideal situation with the current situation, describing what “fits” and what poses a “gap”. Taken
together, this analysis can be used to set minimum system requirements and constraints for
potential solutions.
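The comparison step of a fit-gap analysis can be pictured as a set difference between desired and current capabilities. The sketch below is illustrative only: the capability labels are our own shorthand for requirements and findings discussed in section 8, and the function name `fit_gap` is an assumption made for this example.

```python
def fit_gap(desired, current):
    """Split the desired capabilities into those already met (fits) and those missing (gaps)."""
    fits = sorted(desired & current)
    gaps = sorted(desired - current)
    return fits, gaps


# Illustrative labels only; shorthand for requirements and findings in this report.
desired = {"online access", "unified search", "customized access control", "integrated storage"}
current = {"online access"}

fits, gaps = fit_gap(desired, current)
print("Fits:", fits)
print("Gaps:", gaps)
```

Each gap found this way becomes a candidate minimum requirement for a future system.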
5. Methodology:
To perform an in-depth problem analysis, we needed accurate information about the
current information needs and assets, and a thorough understanding of the various business
processes and ongoing activities at the NBL&A. The team started out with a few group meetings
and then conducted multiple rounds of personal interviews. During the course of the project,
relevant literature was reviewed to understand the conditions at other libraries and archives.
5.1. Interviews:
The project team conducted multiple rounds of structured interviews with 10 organization
members. The interviews focused on the day-to-day activities of the organization members and
how they interact with the various digital assets of the NBL&A. This gave us knowledge about the
inflows and outflows of different types of information. Moreover, we tried to identify existing
roadblocks in each member's processes and understand their expectations of the potential
information structure.
5.2. Literature Review:
The relevant literature on information management and information audits provided a
structure for the project and its tasks, helping us deal with the problem at hand in a systematic
manner. In the last decade libraries and archives around the world have undergone many changes
driven by the growing need for digitization. The literature review involved looking
at the NBL&A's peers to identify the various systems in use and their processes. This review
not only helped us but also gives the NBL&A a basis for comparison when relating its problems
to potential solutions.

6. Business Process Maps:
This section describes the various business processes at the NBL&A based on the
information gathered from interviewing organization members. The processes are grouped by
the digital asset categories already defined at the NBL&A.
6.1. Collections/Manuscripts:
Diagram 1. Business Process Map for Collections/Manuscripts

The NBL&A does not aim to archive collections at its own location, but seeks to preserve
them at the most appropriate repository. It does this by maintaining a database of contacts and
other information related to physicists, asking the institutions where the researchers
worked to take possession of and archive their research papers, and regularly tracking
newspaper obituaries of physicists. The researchers' names are checked against the Library of
Congress database and cases are labeled according to the established unified name. Multiple
cases can be open at the same time; priorities are set based on the
importance of the researcher (N – Nobel Prize winner, W – very important, Q – anything else).
How long a case remains open for investigation depends on this importance; for example, cases
labeled 'Q' remain open for six months. The universities or people contacted are also recorded in
the database. As a last resort, if a suitable home for a researcher's work and materials is not
found, they are taken in by the NBL&A.
Collections, which may include manuscripts, videotapes, prints, and films, are donated to
the NBL&A by personal delivery or mail, uploaded digitally through the File Transfer
Protocol (FTP) server, or sent on CDs/DVDs or by email. They are then sorted by level
of copyright restriction, giving priority to files that are important and have fewer copyright
restrictions. Digital files are checked for format, that is, whether they can be opened with the
existing software. Document files are usually in .rtf or .doc format. Audio files are
usually in .mp3 format. If any of these files cannot be opened with the existing software, help
is sought from the IT department. These digital files are stored on the preservation server.
Physical collections are stored in acid-free boxes. Video tapes are not given priority for
archiving since they are likely to deteriorate quicker. The next step is to catalogue the
materials in two databases: the NBL&A database, accessed through Microsoft Access (MS
Access) forms, and an integrated library cataloguing system known as Horizon.
Much of the information stored in the two databases is the same, with some extra fields in the
Horizon module. The information in the Horizon database follows the MARC format and
standards (the MARC formats are standards for the representation and communication of
bibliographic and related information in machine-readable form). An initial part of the
cataloguing process is to thank the donors and record the
details of the deed of gift letter received from them. This letter or document contains the
copyright permissions given to the NBL&A. This information is also entered into the databases
along with other data such as description, donor info, unique catalogue number, and name of the
collection (with naming conventions as described by the Library of Congress).
The Horizon application has the added advantage of making the records searchable online
through the ICOS search module on the NBL&A website. The records may contain links to a
finding aid HTML page. The process of creating a finding aid also starts from the MS Access
forms, but it depends on two conditions: first, whether the collection is important
enough to create a finding aid, and second, whether the collection is large enough to create a
finding aid during a project cycle. The creation of finding aids is a continuous process which
can take place at any time after the collection has been catalogued. As stated earlier, the
content for finding aids is first entered into Access forms (known as the EAD module). The data
entered is then converted into .xml and .html files through a series of steps that includes
exporting the data into Dreamweaver and finally converting it into web pages (in framed and
frameless formats) using scripts. The finding aids can then be published on the website. The
resulting web page link for the finding aid is entered into the Horizon record for the same
collection.
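To make the record-to-XML step more concrete, the following is a minimal sketch of how one collection record could be turned into an EAD-style XML fragment. It is not the NBL&A's actual export (which runs through Access, Dreamweaver and scripts); the record's field names and values are hypothetical, and the output is a simplified fragment that borrows EAD element names rather than a schema-valid EAD document.

```python
import xml.etree.ElementTree as ET


def build_finding_aid(record):
    """Build a simplified, EAD-style XML tree from one collection record."""
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = record["title"]
    ET.SubElement(did, "unitid").text = record["catalog_number"]
    ET.SubElement(did, "physdesc").text = record["size"]
    scope = ET.SubElement(archdesc, "scopecontent")
    ET.SubElement(scope, "p").text = record["scope"]
    return ead


# Hypothetical record; the field names are assumptions, not the actual Access schema.
record = {
    "title": "Example Physicist Papers",
    "catalog_number": "AR0000",
    "size": "2 linear feet",
    "scope": "Correspondence and notebooks.",
}

xml_text = ET.tostring(build_finding_aid(record), encoding="unicode")
print(xml_text)
```

Generating the XML directly from the catalogue record in this way would keep the finding aid and the record in step, since both come from the same data.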

6.2. Photo Collections:
Diagram 2. Business Process Map for Photo Collections
The NBL&A photo archive is known as the Emilio Segrè Visual Archives (ESVA),
named after Emilio Segrè, an eminent physicist. While the biggest donors to the photo archives
have been Physics Today (since 1940) and AIP, photos are donated from other sources on a
regular basis. Precedence is given to recently donated photographs, while other photographs from
the backlog are worked on throughout the year. Photos are sent by email, uploaded
to the FTP server or delivered physically by mail. A thank-you note is sent to the donor in
response to the deed of gift document. Based on the copyright permissions, photos are scanned
and stored in a temporary folder on the R drive (located on the Novell server) under the donor's
name. The photos are then catalogued through Access forms into an MS SQL Server database. All
copyright information is entered into the database. Any data entered into the database
contributes to the searchability of the digital photograph. Meanwhile, the photos are converted
into common file formats. The resolution is decreased using Photoshop and the files are copied
onto the NAS server at Melville. These files are cross-mounted onto the production server APP1,
through which the photos are displayed on the web pages. To link the metadata to the
picture files, the file name of each picture must match the file name entered in its metadata
record.
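Because the link between a photo and its metadata depends on matching file names, a simple consistency check can catch broken links before publication. The sketch below is an illustration under our own assumptions: the file names are invented, the function name `find_unlinked` is hypothetical, and the comparison is exact and case-sensitive, which may or may not reflect how the production server resolves names.

```python
def find_unlinked(metadata_file_names, image_file_names):
    """Compare file names in metadata records with the image files actually present.

    Returns (records_without_file, files_without_record), each sorted.
    """
    recorded = set(metadata_file_names)
    present = set(image_file_names)
    return sorted(recorded - present), sorted(present - recorded)


# Hypothetical file names for illustration only.
metadata_file_names = ["bohr_a1.jpg", "segre_b2.jpg", "curie_c3.jpg"]
image_file_names = ["bohr_a1.jpg", "segre_b2.JPG", "fermi_d4.jpg"]

missing_files, orphan_files = find_unlinked(metadata_file_names, image_file_names)
print("Records with no matching file:", missing_files)
print("Files with no matching record:", orphan_files)
```

In this example a difference in letter case alone (`.jpg` versus `.JPG`) is enough to leave both a record and a file unlinked.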

6.3. Oral Histories:
Diagram 3. Business Process Map for Oral Histories
The digitization of the oral history transcripts starts with checking the physical folders of
transcripts with old labels. The collection is checked for uniformity of cataloguing information
with the records in the NBL&A database and the ICOS records in the Horizon database. The
naming convention is also checked against the existing Library of Congress authority records. The
transcripts are scanned and converted into digital format using Microsoft OCR (Optical Character
Recognition) software and uploaded to an FTP server. These files are in Microsoft Word
format and are passed through Dreamweaver scripts to convert them into .xhtml files. These
are uploaded to the HostA server, where the files are linked into web pages so that they
can be accessed online. Oral history transcript collections whose cataloguing information does
not match the ICOS records are renamed with new labels and put back into the folders to go
through the overall process again.

6.4. Books:
Diagram 4. Business Process Map for Books
Because Horizon is an integrated library system, the acquisition of books is tracked via an
acquisition record made in Horizon. This record is printed out and sent to vendors for
purchasing. The annual budget for books is $3,000; books can also be received as gifts.
International standardized catalogue information for each book is downloaded from WorldCat and
OCLC records in the form of .dat files, which are then imported into Horizon. Since there is a
backlog of about 1,000 books, they are placed on the shelves in the archives and are identified
and recorded with their shelf number in Horizon. Books that are taken out to be placed in the
library are checked against the Horizon records for errors. A unique call record based on the
author name is added for each book, which is then placed on the rack in the library.

7. Existing IT Structure:
In serving the public and its member societies, the NBL&A has had a very ad hoc IT
strategy over the years. As events occurred and grants were received, the NBL&A chose
different types of IT systems based on short-term needs (Table 1).
Year       Event                                  Reason
1998       Visual Archives                        Commercial
1999       Online Public Access Catalog           Usability
1999       Integrated Library System Horizon      Year 2000 bugs
1999       Introduction of NBL Access database    Record more information on collections
2000       Introduction of Finding Aids           Consortium / Grant
2002       Update of Visual Archives (Oracle)     Unstable, server update
2004       Updated Horizon to 7.3.3               -
2005       Made ICOS Google searchable            Usability
2006       Update of Visual Archives (MYSQL)      Unstable
2007-2009  Oral History Interviews put online     Grant
2008-2009  Goudsmit project                       Grant
Table 1. Introduction of IT systems
(AIP, 1998, 1999a, 1999b, 2000, 2002, 2005, 2006, 2007, 2008)
The following paragraphs describe the IT systems the NBL&A uses daily. A complete
overview can be found in the following diagram.

Diagram 5. Existing IT Structure
The International Catalog of Sources is part of an integrated library system called
Horizon, designed to catalog books, serials, collections and manuscripts. The Horizon library
system runs on a dedicated server (LibServer) in Melville, NY and is searchable through a built-in
search engine called Dynex. Horizon is also used for the acquisition of books, but most of its
functions are unused.
The photographs are the only digital assets archived separately, in the Emilio Segrè Visual
Archives (ESVA), which is a separate website. They are sold commercially online via a software
layer that is connected to an MS SQL database. The ESVA runs on a NAS server in Melville,
NY. The ESVA can be searched via a quick query search and an advanced search (federated
search).
Oral history interviews, finding aids, newsletters, etc. are hosted on the HostA web server in
Melville, NY and are an active part of the website. The website and the photographs are
searchable via Google and a Google custom search. The NBL&A also employs a federated
search engine called Varity, which indexes the ICOS, the ESVA and the website. Google
no longer indexes the ICOS; it used to, but that stopped working.
The previously discussed systems are all online and accessible to the public. The
NBL&A also uses a Microsoft Access database internally to record information about its
collections. This MS Access database is located on a Novell server in College Park, MD. This
server, also known as the “preservation server”, permanently hosts all digital files (photos,
OHI, etc.) as well.
8. Fit Gap Analysis:
The fit-gap analysis compares the desired situation with the current situation
and describes the gap between the two. We first describe what the NBL&A wants
from a system and which aspects of it are already in place, and then identify the gaps between
the two. The following is a summary of information system requirements based on an NBL&A
in-house team's vision to integrate the diverse digital collections:
• Organization - the ability to organize the stored data while preserving the complex
  relationships among them.
• Usability - information should be accessible to the stakeholders and visible to
  internet search engines. This accessibility, however, is a controlled activity:
  the NBL&A should be able to control the flow of interaction.
• Portability - information assets need to be standardized and in non-proprietary
  formats, which makes it easier to migrate them when the system is changed or
  expanded to include newer features.
• Cost - the system should not be prohibitive in terms of economic and manpower costs.
Please see Appendix A for a full description of the functional requirements as
described by the NBL&A.
The NBL&A wants an integrated system to organize, store, and disseminate the
collection while preserving the complex relationships between the digital assets. The following
fit-gap analysis looks at the two broad requirement categories: organization and usability.
8.1. Organization:
When discussing the organization of collections and the preservation of the complex relations
among the files, we identified several implications. First, the new system should be able to
store and organize different types of collections, such as flat, single-item collections and
hierarchical multiple-item collections. Currently, different types of collections and file types
come with varying attributes or metadata. Larger multi-item collections are only described as a
whole in a catalogue record and potentially in a finding aid (a finding aid may or may not be
created for a collection). Such collections do not have item-level metadata (“meta-light”). On
the other hand, single-item collections (such as photo collections) are described at item level
in the file's own metadata.
Second, the new system should be able to handle varying file formats and sizes
depending on the type of digital asset. Different digital assets/items (such as books, transcripts,
manuscripts, photos, etc.) require different metadata and thus have different types of catalogue
records. As Table 2, Table 3 and Table 4 show, different collections require different types of
descriptions and follow different standards. The future system thus needs to be able to
distinguish between different types of collections and accommodate each collection's individual
needs; it must therefore have different templates/forms for entering data about different
collections. With the various digital collections, many procedures are followed before the
digital records are made accessible. The future system could simplify these pre-posting
procedures and thus avoid redundancy. Since the individual collections are stored, catalogued
and accessed in very different ways, it has been difficult to present an integrated search to
users. Currently, different systems are used for different digital assets, and these systems
do not communicate with each other because the assets are treated as individual entities. This is
reflected not only in the systems but also in the procedures, policies and redundant work
performed by organization members associated with different digital assets.
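The idea of a separate input template per collection type can be sketched as a simple lookup of expected fields. The field lists below are abbreviated from Table 3; which fields would actually be mandatory, and the names `TEMPLATES` and `missing_fields`, are assumptions made purely for illustration.

```python
# Field lists abbreviated from Table 3; which fields are mandatory is an assumption.
TEMPLATES = {
    "book": ["Title", "Author", "Publisher", "Call No", "ISBN"],
    "archive": ["Name", "Author", "Description", "Owner", "Scope of Material"],
    "oral_history": ["Name", "Interviewer", "Interviewee", "Date", "Use and reproduction"],
}


def missing_fields(asset_type, record):
    """Return the template fields that are absent or empty in a record."""
    return [f for f in TEMPLATES[asset_type] if not record.get(f)]


# Hypothetical record for illustration.
record = {"Name": "Interview of Example Physicist", "Interviewer": "A. Historian", "Date": "1975"}
print(missing_fields("oral_history", record))
```

A single system with one template per asset type would let staff use the same entry workflow for every collection while still respecting each collection's metadata needs.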
As seen in the diagram in section 7, the NBL&A operates a separate database to record
additional information regarding archival collections and to write finding aids. This
database is also used for administering the use of the collections. The current structure of the
database is unorganized, confusing and not scalable. Tables are not normalized in any way,
which makes the database slow and can cause insertion, update and deletion anomalies. This
database stands completely separate from the online systems and causes redundancies and
inefficiencies in the workflow: information has to be manually re-entered into the integrated
library system. The database and the online systems are linked only by the common catalog number,
which is used as a unique identification number.
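To illustrate what normalization would change, the sketch below stores donor details once and refers to them from each accession, so that a correction cannot leave conflicting copies behind. The table layout is hypothetical and is not the NBL&A's actual Access schema; the column names are loosely based on the Access DB fields listed in Table 4, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized layout; not the actual NBL&A Access schema.
    CREATE TABLE donor (
        donor_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    CREATE TABLE accession (
        accession_nr   TEXT PRIMARY KEY,
        catalog_number TEXT NOT NULL,      -- shared key with the online catalog
        accession_date TEXT,
        donor_id       INTEGER NOT NULL REFERENCES donor(donor_id)
    );
""")

conn.execute("INSERT INTO donor (donor_id, name) VALUES (1, 'Example Donor')")
conn.executemany(
    "INSERT INTO accession VALUES (?, ?, ?, ?)",
    [("2009-001", "AR001", "2009-03-01", 1),
     ("2010-004", "AR002", "2010-01-15", 1)],
)

# The donor's name is stored once, so a correction is a single-row update
# and cannot leave two accessions with conflicting spellings.
conn.execute("UPDATE donor SET name = 'Example Donor, Jr.' WHERE donor_id = 1")

rows = conn.execute("""
    SELECT a.accession_nr, d.name
    FROM accession AS a JOIN donor AS d ON d.donor_id = a.donor_id
    ORDER BY a.accession_nr
""").fetchall()
print(rows)
```

In an unnormalized table the donor name would be repeated on every accession row, which is exactly where update anomalies arise.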
The digital assets at the NBL&A are not well integrated. HTML links are used to
preserve relationships among different collections: Oral History Interviews are 'linked' to their
respective catalog records, and collection-specific finding aids to theirs. In the future,
the Goudsmit Digitization Project will also be linked to the existing finding aids and catalog
record via HTML. HTML links are very cumbersome: they have to be placed manually into
the catalog records, they are error-prone and they are not easily transferable to new systems.
Lastly, the NBL&A physically saves all of its born-digital files on a preservation server.
However, the metadata is stored in the catalog records of the integrated library system,
which is physically located on a separate server (LibServ) altogether. The NBL&A database also
stores metadata, but only for internal use. The photos in the ESVA have their
metadata stored in the SQL database on the MS SQL Server. This scattering of files and metadata
makes the current system hard to migrate to a new system.

File type   File extensions
Text        txt, pdf, doc, rtf, wpd, email, indd, access database logs
Photos      tiff, jpg, pdf
Audio       wav, mp3
Table 2. Different file formats used by NBL&A

Table 3. Overview of Metadata collected by NBL&A in ICOS
ICOS Books              ICOS Archive            ICOS OHI
Title                   Name                    Name
Author, date            Author, date            Interviewer, interviewee, date
Publisher               Description, size       Description, size
Call No                 Owner                   Use and reproduction
Description, size       Country                 Owner
ISBN                    Biography / History     Country
Added Author            Scope of Material       Notes
Location                Notes                   Added Author
Collection              Added Author            Genre Terms
Status                  Location                Location
Subjects                Collection              Collection
Edition                 Status                  Status
Source of Acquisition   Subjects                Subjects

22
Photos         NBL&A Access DB                OHI transcript         Finding Aids
Catalog nr     Accession Date                 Name                   Name
Description    Accession Type                 Copyright              Publisher, address
Date           Accession Nr                   Origin                 Date
Credit         Old accession Nr               Interviewee            Encoding information
Names          Items in Accession             Interviewer            Location
               Begin Location                 Location, interview    Title and dates of collection
               End Location                   Date                   Papers created by
               Other Location                 Abstract               Size
               Oversize Location              Sessions               Short description
               Main Entry                                            Language
               Member Society                                        Selected search terms / subjects
               Title                                                 Historical note
               Collection ID                                         Scope and content of collection
               Description                                           Organization and arrangement of collection
               Notes                                                 Access restrictions
               Begin Date                                            Restrictions of use
               End Date                                              Provenance and acquisition information
               Linear Feet                                           Processing information
               Proc Priority                                         Other related materials
               ICOS Nr                                               Container list
               Donor Name
               Restrictions
               Description of Restrictions
               Deed sent                                             Notes
               Date deed sent
               Date deed returned
               Thank you sent, date
* More information is collected in different tabs of the NBL&A Access DB.
Table 4. Overview of Metadata used by NBL&A outside ICOS
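The field lists in Tables 3 and 4 show the same kind of information stored under different
names in each system, which is one reason migration is difficult. A metadata crosswalk is a
common way to handle this, and a minimal sketch in Python follows. Only the source field names
are taken from the tables; the source keys, the common target names (identifier, title,
creator and so on) and the mapping itself are assumptions made for illustration, not a schema
used by NBL&A.

```python
# Hypothetical crosswalk: source field names come from Tables 3 and 4;
# the common target names are assumptions made for illustration only.
CROSSWALK = {
    "photos": {
        "Catalog nr": "identifier",
        "Description": "description",
        "Date": "date",
        "Names": "subject_names",
    },
    "ohi_transcript": {
        "Name": "title",
        "Interviewee": "creator",
        "Date": "date",
        "Abstract": "description",
    },
    "icos_books": {
        "Title": "title",
        "Publisher": "publisher",
        "Call No": "identifier",
        "Location": "location",
    },
}


def to_common_record(source, record):
    """Translate one source record into the common field names.

    Fields without a mapping are kept under 'unmapped' so that nothing
    is silently lost during a migration.
    """
    mapping = CROSSWALK[source]
    common = {"source": source, "unmapped": {}}
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            common["unmapped"][field] = value
        else:
            common[target] = value
    return common


# Made-up sample record, not real NBL&A data.
example = to_common_record(
    "photos",
    {"Catalog nr": "A1", "Description": "Portrait", "Credit": "Sample credit"},
)
print(example)
```

Keeping unmapped fields rather than dropping them makes it possible to review what a given
mapping would lose before any data is moved.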
8.2. Usability:
The second aspect of the fit-gap analysis is the usability of the system. The system's
usability has implications for all of its users. Users are broadly classified as researchers,
non-researchers and the staff at NBL&A. Thus, we have the general public (researchers and
non-researchers) who want to search the collection, and the NBL&A employees who manage
the collection. For the general public it is important that the various digital assets are
searchable, as one collection, from major search engines such as Google and Yahoo. The current
NBL&A website uses a total of four different search engines, none of which can present a
complete overview of the collection. Search should support both general keyword queries and
in-depth queries using various search categories, as the user prefers.
For NBL&A employees the system should be standardized, easy to use and independent of the
type of digital asset. The system should allow them to manipulate collections and disseminate
them in various ways, such as exhibitions, mobile devices, photo of the month, etc.
The system should have a user-friendly interface that can be used by organization members
with or without technical expertise. Moreover, the system should be able to control and record
which organization members get access to what types of files. This will help maintain uniformity
and help track changes that are made to various files.
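The requirement to control and record who gets access to which files can be sketched as a
permission check combined with an audit log. The Python example below is only an illustration:
the role names, permissions, user names and file names are hypothetical, since the report does
not define them.

```python
from datetime import datetime, timezone

# Hypothetical roles and permissions; the report does not define any.
PERMISSIONS = {
    "archivist": {"read", "edit"},
    "staff": {"read"},
}

# Every access attempt is appended here, whether it was allowed or not.
audit_log = []


def request_access(user, role, action, file_name):
    """Return True if the role may perform the action, and log the attempt."""
    allowed = action in PERMISSIONS.get(role, set())
    audit_log.append(
        {
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "action": action,
            "file": file_name,
            "allowed": allowed,
        }
    )
    return allowed


print(request_access("alice", "archivist", "edit", "interview_001.wav"))
print(request_access("bob", "staff", "edit", "interview_001.wav"))
print(len(audit_log))
```

Logging refused attempts as well as granted ones is a design choice made here so that the log
can be used to track changes and to review the permission rules themselves.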
9. Literature review
The goal of this literature review is to provide a baseline understanding of the current
state of research and practice in the management of digital assets in archives and libraries,
particularly regarding the integration of various digital assets.
9.1 Standards and Policies
In the archival environment, there is a lot of debate about standardization and policies.
Some institutions are setting up their own standards and policies. These range from very strict
and rigid Electronic Records Management Systems requirements that need formal certification to
simple policies and tips (Joerling, 2010). Here is an overview of some of the policies:
• Trustworthy Repositories Audit & Certification (TRAC)
• Trusted Digital Repository (TDR)
• Open Archival Information System Reference Model (OAIS), NASA
• Information Life Cycle approach (Hodge, 2000)
• Department of Defense 5015.2
• Model Requirements for the Management of Electronic Records (MoReq2), EU
• Victorian Electronic Records Strategy (VERS), Australia
• Document Management and Electronic Archiving (DOMEA), Germany
• Records, Document and Information Management System (RDIMS), Canada
• International Standard Archival Authority Record (ISAAR), International Council on Archives
The NBL&A does not need to comply with any governmental mandate and therefore does not
need to be certified. Most of these policies are very strict and would bring too much
bureaucracy and red tape. Moreover, the NBL&A already complies with important archival
standards such as MARC and EAD, and follows the name authority conventions of the Library of
Congress. These developments in the archival environment are therefore not of interest to
NBL&A.
9.2 Adopting IT Systems
The second step of the literature review was to research new IT systems that can be used
for managing digital archives. The following types of systems were encountered:
• Enterprise Content Management Systems (ECMS)
• Digital Asset Management Systems (DAMS)
• Electronic Record Management (ERM)
• Content Management Systems (CMS)
• Collection Management Systems (CMS)
• Document Management Systems (DMS)
• Integrated Library Systems (ILS)
There is considerable overlap between these systems, and not all of them match the
functional requirements set by the NBL&A or can deal with its constraints. The following
paragraphs give a quick evaluation of each type. They are followed by case studies that look
specifically at the ERM and DAMS used by other organizations, whose approaches range from
adapting existing library systems to developing a whole system from scratch.
First, an ECMS is defined as “the strategies, methods and tools used to capture, manage,
store, preserve, and deliver content and documents related to organizational processes. ECM
tools and strategies allow the management of an organization's unstructured information,
wherever that information exists” (AIIM, 2010). This encompasses document and record
management, groupware, web content, and business process management. An ECMS goes well
beyond the needs of the NBL&A, so this type of tool does not need to be evaluated further.
A DAMS “consists of managing tasks and decisions surrounding the ingestion,
annotation, cataloguing, storage, retrieval and distribution of digital assets” (Jacobsen,
Schlenker, & Edwards, 2005). A wide variety of systems can perform these tasks; however, it is
generally agreed that a database program lies at the center of this solution (Peterashbyhayter,
2010). Examples of DAMS are DSpace, Fedora, Greenstone, and EPrints. These systems match the
functional requirements given by the NBL&A and should therefore be explored further.
ERM is defined as “a computer-based facility for managing and controlling records throughout
the information life cycle” (Curaconsortium, 2010). An ERM system handles everything from
planning, classifying, storing and securing records to their destruction or preservation, and
coordinates access. However, the functionality of this type of system is limited to managing
a record as “evidence” of an event and does not really allow for dissemination and
manipulation. Many organizations use ERM as a tool to track documents and comply with
government regulation. This type of tool therefore does not fit the needs of NBL&A and does
not need to be evaluated further.
A Content Management System can be defined in multiple ways. On the one hand, it can
be a web Content Management System, which allows users without any technical knowledge
to manage the content of a website. On the other hand, it can be a system to manage
documents, for example for enterprises, media, learning, collections or mobile devices. A DMS,
which is a kind of Content Management System, “indexes and profiles documents based on
content; controls documents using such functions as check in/checkout, version control, audit
trails, and security of information; and facilitates searching by profile values or by some other
hierarchical structure such as folders and files” (ischool Texas, 2010). Lastly, a Content
Management System can also be a Collection Management System. A Collection Management
System is “a piece of software that allows collecting institutions to manage data about their
collections and items they hold and are an integral part of managing the documenting
collections” (Collections Australia, 2010). It thus describes and administers information
regarding collections, donors, locations, etc. These types of systems are mostly
used by museums and archives to manage their collections. In recent years, however,
due to the digitization of museums, collection management systems have begun moving
towards digital asset management and electronic record management. They keep records or
catalogues of a collection, describe it in detail, and relate items to each other and to
history. Some of these systems thus match the NBL&A system requirements well
and should be evaluated further. A few examples of collection management systems are
CollectiveAccess, Artlid and Gallery Systems. Based on our understanding, CollectiveAccess
seems the most appropriate of these for NBL&A.
Lastly, ILS are “enterprise resource planning systems for libraries, used to track items
owned, orders made, bills paid, and patrons who have borrowed” (Wikipedia, 2010). There are
many ILS, some proprietary, such as SirsiDynix, Newgen and Ex Libris, and some open source,
such as CDS Invenio.
9.3. Case Studies:
A couple of case studies were researched to learn how other organizations use some of
the systems and technologies mentioned above. They range from adapting existing library
systems to developing a whole system from scratch.
At Portland State University Library (PSU) (http://vikat.pdx.edu/), the capabilities of the
Integrated Library System were expanded to accommodate the hierarchical structure found in
traditional archival finding aids. PSU uses Electronic Resources Management (ERM) from
Innovative Interfaces Inc. Brenner, Larsen, & Weston exploited ERM's ability to “replicate the
two-level hierarchical relationships between aggregators or publishers and the electronic and
print resources”. The authors admitted, though, that the resource records created were not as
rich as those of traditional finding aids. In the same article, they also describe the
approaches used by the Library of Congress and the University of Washington. The Library of
Congress selectively adds records of individual items from its archival collections to the
OPAC (Online Public Access Catalog). That approach allows users to have complete access to the
items within the library's collections. However, it hides the hierarchical relationship
between items and the collections they belong to. The University of Washington takes a
different approach: it puts collection-level MARC records in its OPAC, where they are
searchable like other bibliographic records. These collection-level records are then linked
to finding aids with a more complete description (Brenner, Larsen, & Weston, 2006).
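The trade-off between these approaches can be illustrated with a small data model. The Python
sketch below keeps a collection-level record that links to a finding aid, in the spirit of the
University of Washington approach, while each item keeps a reference to its parent so that the
hierarchy is not hidden in search results. The class names, fields, sample titles and the
example URL are all hypothetical and do not describe any of the systems discussed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    title: str
    # Link back to the owning collection; left out of repr and comparison
    # to avoid walking the parent/child cycle.
    parent: Optional["Collection"] = field(default=None, repr=False, compare=False)


@dataclass
class Collection:
    title: str
    finding_aid_url: str  # hypothetical link to the fuller description
    items: List[Item] = field(default_factory=list)

    def add(self, item: Item) -> None:
        """Attach an item and record this collection as its parent."""
        item.parent = self
        self.items.append(item)


def search(collections, keyword):
    """Return (item title, collection title) pairs whose item title matches."""
    keyword = keyword.lower()
    return [
        (item.title, coll.title)
        for coll in collections
        for item in coll.items
        if keyword in item.title.lower()
    ]


# Made-up sample data for illustration.
papers = Collection("Example papers", "https://example.org/findingaid/1")
papers.add(Item("Letter to a colleague"))
papers.add(Item("Laboratory notebook"))
print(search([papers], "letter"))
```

Because every hit is returned together with its collection title, an item-level search does
not lose the context that a finding aid would provide.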
The Washington Research Library Consortium (WRLC) (http://www.wrlc.org/) is a
consortium of eight libraries in the Washington DC metropolitan area. The WRLC member
libraries host a wide variety of unique special collections: manuscripts, photographs, slides,
full-text documents, magazines, comic books, audio recordings, video clips, and so on. Each
special collection needs to be accessible from both WRLC's Digital and Special Collections Web
site and the corresponding member library's special collections Web site. The digital objects
also need to be accessible through EAD finding aids. The public Web interface needs to be
simple enough for new, inexperienced users yet powerful enough for experienced power
users. After evaluating and testing both commercial systems and open source software, Allison
B. Zhang and Don Gourley of WRLC could not find any system that met all of their
requirements. They finally decided to build a customized system by integrating the best tools
available and chose Greenstone Digital Library software as the tool for presenting the digital
collections. Their customization of Greenstone involved designing and creating metadata,
designing plug-ins to import metadata, working with configuration files, and defining numerous
macros (Zhang & Gourley, 2006).
10. Recommendations:
Based on the results of the fit-gap analysis and the literature review, tentative
recommendations can be made to address the current and future needs of NBL&A. According
to Tennant (2008), it is not the right time to choose a new library-related IT system since the
market is in a state of flux. It would be better to wait for a period of time before taking the leap.
However, any future system should be required to use standardized formats for data description,
metadata and storage. If cost is taken into consideration, open source products would
be the way forward. Possible options that can be implemented in the future are:
• Maintenance of the current system while modifying elements, for example by normalizing the
databases.
• Implementation of a digital asset management system.
• Implementation of a collection management system.
• Implementation of an integrated library system.
More information is provided in the presentation slides.

11. Limitations:
The initial functional requirements provided by NBL&A were incomplete and
ambiguous. During the course of the project, assumptions were made about the metadata
structure and its implications for the future system. Certain assumptions were also made about
the capabilities of the existing Horizon and Verity software applications. Quantifiable costs
and benefits still need to be calculated for the current system and for any future systems
considered for implementation. The literature on this subject is limited, since not every
organization documents its experiences. User experiences with the system need to be recorded
in the future to make better recommendations.

Appendix A:
Niels Bohr Library & Archives and Center for History of Physics
Digital Assessment Project
We are looking for:
1) A system that allows us to store materials in an organized manner, while retaining complex
relationships between some items;
2) A system that allows us to use and manipulate these digital items more creatively, with better
searching capabilities, and to help us present these items online.
Our system requirements are:
Complexity - can handle hierarchies and relationships between digital items; store flat, single-
item digital collections, in addition to collections of varying sizes that include multiple file
formats (i.e. email messages with attached documents, or a folder that contains a Word
document, a .PDF file and a video clip); see attached inventory for specific file types and storage
needs.
Searching - will allow all levels of searching, from simple keyword searches on file names to
full-text searching on OCRed documents; will allow access via Google or other search engine
results; will allow commenting or other user interaction.
Permanence - that digital items are stored in a non-proprietary format, in a system that will be
committed to long-term use at ACP or that can be easily migrated without losing formatting or
hierarchies.
Access - that staff members, as well as outside users and library patrons, can easily access digital
items without registration or log-ins; and that if needed, certain items could be restricted or
shielded from outside users.
Metadata - will be able to ingest existing metadata in batches, with little or no manual re-keying.
Expansion - will expand to fit future needs of storage space, additional projects, complexity of
hierarchies, and possible linking of projects.
Standards - conforms to accepted standards in the professional archival community, on all the
points listed above; standards include MARC, EAD, XML.
Support - will be installed, customized, maintained and otherwise supported by AIP Web
Development staff.
Cost - will not be prohibitive in costs, in either direct spending, necessary hardware to run the
program, or staff time and resources to use and support the program.

Current projects
Oral history interview project: scanned and OCRed interview transcripts; full-text searchability;
use of audio files and images; ability to continue to add more interviews indefinitely
Digitized archival materials: scanned manuscript collections consisting of TIFF images of text;
ability to link the digital images to an existing EAD/XML interface; ability to retain the
hierarchies of the items in the collection
Databases: the History Center is currently assembling a database of biographical information on
acclaimed physicists, then linking profiles together in multiple ways - according to professional
interests, research teams, educational background, etc.
Our users are:
Experienced users: Library/archives patrons, historians and other researchers who are already
familiar with our catalog and resources. The public interface must be familiar and consistent
with our other webpages.
New users: Items will be discovered through searches in our online catalog, through searches in
Google and other search engines, links in web exhibits, newsletters, press releases, Facebook
updates, etc. The interface must be intuitive, and easy to navigate back to the main catalog so
the user can start a new search or browse other relevant materials.
In-house use: content will be regularly accessed by library and archives staff.
Examples of online interfaces that inspire us:
The series description and box inventories of the Joseph Cornell papers at the Smithsonian
Archives of American Art. If you click on a digitized folder, this collection also has a nice
interface for clicking through the digital images -
http://www.aaa.si.edu/collectionsonline/cornjose/series1.htm
The interactive finding aid of the Aldo Leopold papers at the University of Wisconsin - Madison
http://digicoll.library.wisc.edu/cgi/f/findaid/findaid-idx?c=wiarchives;cc=wiarchives;view=text;rgn=main;didno=uw-lib-leopoldpapers

References:
AIIM. 2010, April 01. What is ECM. Retrieved from
http://www.aiim.org/What-is-ECM-Enterprise-Content-Management.aspx
AIP. 1998, May. Searchable database of photo's. Retrieved from
http://www.aip.org/history/newsletter/fall98/esvaweb.htm
AIP a. 1999, Spring. New integrated library system. Retrieved from
http://www.aip.org/history/newsletter/spr99/ils.htm
AIP b. 1999, December. New online catalog for Niels Bohr Library online. Retrieved from
http://www.aip.org/history/newsletter/spring2000/nbl.htm
AIP. 2000, Fall. Finding aids to major collections online. Retrieved from
http://www.aip.org/history/newsletter/fall2000/findaid.htm
AIP. 2002, Fall. New format: Emilio Segre Visual Archives. Retrieved from
http://www.aip.org/history/newsletter/fall2002/esva.htm
AIP. 2005, Spring. Enhancing web access to library catalog. Retrieved from
http://www.aip.org/history/newsletter/spring2005/catalog.htm
AIP. 2006, Spring. Improved online visual archives. Retrieved from
http://www.aip.org/history/newsletter/spring2006/visualarchives.htm
AIP. 2007, Fall. Major collection of oral history interviews mounted online. Retrieved from
http://www.aip.org/history/newsletter/fall2008/oral-history.html
AIP. 2008, Fall. Digitizing the Samuel Goudsmit papers. Retrieved from
http://www.aip.org/history/newsletter/current/digitizing_goudsmit.html
AIP a. 2010, April 01. About AIP. Retrieved from http://www.aip.org/aip
AIP b. 2010, April 01. AIP: A federation of physical sciences. Retrieved from
http://www.aip.org/aip/societies.html
AIP c. 2010, April 01. Annual report 2009. Retrieved from http://www.aip.org/aip/reports.html
Bak, G., Armstrong, P. 2009. Points of convergence: seamless long-term access to digital
publications and archival records at Library and Archives Canada. Archival Science, Vol 8:
p279-293.
Bearman, D. 1991. Hypermedia and interactivity in museums: Proceedings of an international
conference.
Bearman, D. 1992. Documenting documentation. Archivaria, 34, Summer.
Brandeis Institutional Repository Planning Documents, 2006 - 2007. n.d. Retrieved April 1,
2010, from Brandeis Institutional Repository: http://dcoll.brandeis.edu/handle/10192/21866
Brenner, M., Larsen, T., & Weston, C. 2006. Digital collection management through the
library catalog. Information Technology and Libraries, 65-77.
Cohen, P. 2010, March 15. Fending off digital decay, bit by bit. New York Times. Retrieved from
http://www.nytimes.com/2010/03/16/books/16archive.html?scp=1&sq=digital%20archives&st=cse
Collections Australia. 2010, April 01. Collection management system. Retrieved from
http://www.collectionsaustralia.net/sector_info_item/7
Curaconsortium. 2010, April 01. Glossary of information management terms. Retrieved from
http://www.curaconsortium.co.uk/glossary.html
Hodge, G. M. 2000, January. Best practices for digital archiving. Retrieved from
http://www.dlib.org/dlib/january00/01hodge.html
Ischool Texas. 2010, April 01. Glossary. Retrieved from
www.ischool.utexas.edu/~scisco/lis389c.5/email/gloss.html
Jacobsen, J., Schlenker, T., Edwards, L. 2005. Implementing a Digital Asset Management System:
For Animation, Computer Games, and Web Development. Focal Press, Burlington, MA.
Joerling, K. 2010, March 19. The truth and consequences of DoD certification. Retrieved from
http://www.incontextmag.com/article/The-truth-and-consequences-of-DoD-certification
Kaplan, D. 2009. Choosing a digital asset management system that's right for you. Journal
of Archival Organization, 33-40.
Kurtz, M. 2010. Dublin Core, DSpace, and a brief analysis of three university repositories.
Information Technology and Libraries, 40-46.
NBL a. 2010, April 01. About the Niels Bohr Library & Archives. Retrieved from
http://www.aip.org/history/nbl/about.html
NBL b. 2010, January 20. Niels Bohr Library & Archives and Center for History of Physics,
digital assessment project. Retrieved from internal meetings
Peterashbyhayter. 2010, April 01. Photographic glossary. Retrieved from
http://www.peterashbyhayter.co.uk/glossaryD-E.html
Schneider, K. 2007, January 19. IT and sympathy. Retrieved April 1, 2010, from ALA
TechSource: http://www.alatechsource.org/blog/2007/01/it-and-sympathy.html
Wikipedia. 2010, April 01. Integrated library system. Retrieved from
http://en.wikipedia.org/wiki/Integrated_library_system
Zhang, A. B., & Gourley, D. 2006. Building digital collections using Greenstone Digital
Library software. Internet Reference Services Quarterly, 11 (2), 71-89.
