Data Integration and its application on
Digital Library System
ABSTRACT
A library may procure content from various sources and in various forms to serve its
clients. In the earlier, predominantly paper-based environment, all of this content was
put to similar types of use, and copyright restrictions were imposed based on the
number of pages copied and the like. In the electronic and digital environment, owners
of information are resorting to punitive measures regarding the use of content in
digital form. The constraints our libraries face in engaging in serious digital
initiatives are threefold: money, manpower and content. Most of our libraries,
particularly in higher education and research institutes, depend solely on
information providers and publishers in the developed world for the vital content
that inspires indigenous research. Since content is a major ingredient in digital
library development, the pragmatic and viable way forward for libraries is to
judiciously evaluate what is available in electronic form, on optical media or on the
Web, and procure at least some of it for hosting locally. This paper presents some of
the major issues involved in this critical activity, with illustrative examples such as
the IEE/IEEE Electronic Library, Indian Standards on CD-ROM, Science Direct and Web
access to Indian Academy of Sciences journals. The justification for selecting
external content is also discussed. A detailed checklist for evaluating content is
presented from various angles, such as authenticity of content, user interface and
search.
Introduction
Data Integration:
Data integration is the process of retrieving heterogeneous data and combining it
into a unified form and structure. It allows different kinds of data (such as data
sets, documents and tables) to be merged by users, organizations and applications
for use in personal or business processes and functions.
It is generally implemented in data warehouses (DW) through specialized software
that hosts large data repositories drawn from internal and external sources. Data is
extracted, amalgamated and presented in a unified form. For example, a user’s
complete data set may combine data extracted from marketing, sales and operations
to form a single report.
A data integration project usually involves the following steps:
a. Accessing data from all its sources and locations, whether those are on
premises or in the cloud or some combination of both.
b. Integrating data, so that records from one data source map to records in
another (e.g., even if one dataset uses “last name, first name” and another uses
“fname,lname” the integrated set will make sure both end up in the right
place). This type of data preparation is essential for analytics or other
applications to be able to use the data with any success.
c. Delivering integrated data to the business exactly when the business needs it,
whether it is in batch, near real time, or real time.
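The mapping in step b can be sketched in a few lines of Python. This is only an
illustrative sketch: the function names, the layout tags "last_first" and
"first_last", and the sample records are assumptions made for the example, not
part of any particular integration tool.

```python
def normalize_name(raw: str, layout: str) -> dict:
    """Map a raw "x, y" name string onto shared first_name/last_name fields.

    layout is a hypothetical tag describing the field order of the source:
    "last_first" for "last name, first name", "first_last" for "fname,lname".
    """
    parts = [p.strip() for p in raw.split(",", 1)]
    if len(parts) != 2:
        raise ValueError(f"expected two comma-separated parts, got {raw!r}")
    if layout == "last_first":
        last, first = parts
    elif layout == "first_last":
        first, last = parts
    else:
        raise ValueError(f"unknown layout {layout!r}")
    return {"first_name": first, "last_name": last}


def integrate(sources: dict) -> list:
    """Combine records from several sources into one list with a common schema."""
    merged = []
    for source_name, (layout, rows) in sources.items():
        for raw in rows:
            record = normalize_name(raw, layout)
            record["source"] = source_name  # keep provenance for each record
            merged.append(record)
    return merged


if __name__ == "__main__":
    sample = {
        "circulation": ("last_first", ["Doe, Jane"]),
        "registration": ("first_last", ["Jane,Doe"]),
    }
    for row in integrate(sample):
        print(row)
```

Both sample records end up with the same first_name and last_name values, which
is what lets later analytics treat them as the same person.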
Application of data integration
Information and Communication Technology has revolutionized the concept
of libraries, and libraries everywhere are gradually being digitized. A 'digital
library' comprises digital collections, services and infrastructure to support
lifelong learning, research, scholarly communication as well as preservation
and conservation of our recorded knowledge. It is also a process of
democratization of information. This article discusses the factors that push
traditional libraries to digitize, as well as the definition, advantages and
disadvantages of digital libraries, the requirements for building a digital
library, and related issues.
We are in the age of a networked society in which information technology, in
addition to its use in all spheres of human activity, is used extensively to
record, store and disseminate information in digital form. Information
technology has almost converted the world into a global village. The
revolution in the information technology sector is influencing the information
industry as a whole, and libraries are changing to meet the demands put on
them. A new generation of users, whose demand for information is never fully
met, keeps asking that traditional libraries be developed into well-equipped
and interconnected digital libraries.
Digital Library:
A digital library system is an application of data integration. There are many
definitions of a "digital library." Terms such as "electronic library" and
"virtual library" are often used synonymously. The elements that have been
identified as common to these definitions are:
a. The digital library is not a single entity;
b. The digital library requires technology to link the resources of many;
c. The linkages between the many digital libraries and information services are
transparent to the end users;
d. Universal access to digital libraries and information services is a goal;
e. Digital library collections are not limited to document surrogates: they extend
to digital artifacts that cannot be represented or distributed in printed formats.
According to Arms, a digital library is a managed collection of information with
associated services where the information is stored in digital format and accessible
over a network.
Fig: Digital Library
A digital library is an organized collection of digitized material, or of holdings in
digital form, that can be accessed by a computer on a network using TCP/IP or
another protocol. The Digital Libraries Initiative began in 1994. In the Kahn/Wilensky
architecture, items in the digital library are called “digital objects”. They are stored
in “repositories” and identified by “handles”. Information stored in a digital object
is called “content”, which is divided into “data” and information about the data,
known as “properties” or “metadata”.
Software:
There are a number of software packages for use in general digital libraries.
Institutional repository software focuses primarily on the ingest, preservation and
access of locally produced documents, particularly locally produced academic
outputs. This software may be proprietary, as is the case with the Library of
Congress, which uses Digiboard and CTS to manage digital content.
Digital Library Services
The major digital library services include
a. OPAC to web PAC
b. Digital Reference Service
c. Library Chat Rooms
d. Electronic Delivery Services
e. Virtual Library Tours
f. Ask-A-Librarian
g. Real Time Services
h. Bulletin Boards
i. Web-based User Education Web Forms
j. Frequently Asked Questions (FAQ)
k. Selective Dissemination of Information in Digital Library: Delivering
Customized Contents
l. RSS Feeds
Requirement for digital libraries
The Internet and World Wide Web provide the impetus and technological
environment for the development and operation of a digital library. The Internet
provides TCP/IP and its associated protocols for accessing information, and the
Web provides the tools and techniques for publishing information over the Internet.
In the digital environment it is reasonable to say that a central backup or archive
should be created at the national level, which will store the information output of
the region as well as information from outside the country. Some of the requirements
for a digital library are:
1. Audio visual: Color T.V., V.C.R., D.V.D., Sound box, Telephone etc.
2. Computer: Server, P.C. with multimedia, U.P.S. etc.
3. Network: LAN, MAN, WAN, Internet etc.
4. Printer: Laser printer, Dot matrix, Barcode printer, Digital graphic printer etc.
5. Scanner: H.P. ScanJet, flatbed, Sheet feeder, Drum scanner, Slide scanner,
Microfilming scanner, Digital camera, Barcode scanner etc.
6. Storage devices: Optical storage device, CD-ROM, Jukebox etc.
7. Software: Any suitable software that is interconnected and suitable for LAN
and WAN connection.
Factors of change to digital libraries
The limited buying power of libraries, the complex nature of recent documents and
storage problems are some of the common factors driving the change to digital
mode. Some other factors are:
1. Information explosion
2. Searching problems in traditional libraries
3. Low cost of technology: When we consider the storage capacity of digital
documents and their maintenance, it is easy to see that the cost of the technology
is much lower than that of traditional libraries.
4. Environmental factor: Digital libraries are among the cleanest technologies,
fulfilling the slogan “Burn a CD-ROM, save a tree.”
Elements of the digital library
A fully developed digital library environment involves the elements listed below.
These components might not all be part of a discrete digital library system but
could be provided by other related or multipurpose systems or environments.
Accordingly, integration is a consistent issue cited by digital library developers.
a. A private or public network.
b. Client services for the browser, including repository querying and workflow.
c. Content delivery via file transfer or streaming media.
d. Initial conversion of content from physical to digital form.
e. Patron access through a browser or dedicated client.
f. Storage of digital content and metadata in an appropriate multimedia
repository, including rights management capabilities to enforce intellectual
property rights, if required. E-commerce functionality may also be present if
needed to handle accounting and billing.
g. The extraction or creation of metadata or indexing information describing the
content to facilitate searching and discovery, as well as administrative and
structural metadata to assist in object viewing, management and preservation.
Advantages of the Digital Library
A digital library is not confined to a particular location or building; it is
virtually distributed all over the world. Users can get the information they need on
their own computer screens over the Internet. In effect it is a network of multimedia
systems that provides fingertip access.
1. No physical boundary: The user of a digital library need not go to the library
physically; people from all over the world can gain access to the same information,
as long as an Internet connection is available.
2. Round the clock availability: Digital libraries can be accessed at any time, 24
hours a day and 365 days of the year.
3. Multiple accesses: The same resources can be used at the same time by a number
of users.
4. Structured approach: A digital library provides access to much richer content in a
more structured manner, i.e. we can easily move from the catalogue to a particular
book, then to a particular chapter and so on.
5. Information retrieval: The user is able to use any search term belonging to a
word or phrase anywhere in the collection. A digital library provides user-friendly
interfaces, giving clickable access to its resources.
6. Preservation and conservation: An exact copy of the original can be made any
number of times without any degradation in quality.
7. Space: Whereas traditional libraries are limited by storage space, digital libraries
have the potential to store much more information, simply because digital
information requires very little physical space. When a library has no space for
extension, digitization is the only solution.
8. Networking: A digital library can easily link to the resources of other digital
libraries, so seamlessly integrated resource sharing can be achieved.
9. Cost: The cost of maintaining a digital library is much lower than that of a
traditional library. A traditional library must spend large sums of money on
staff, book maintenance, rent and additional books. Digital libraries do away with
these costs.
Disadvantages of the Digital Library
Computer viruses, lack of standardization for digitized information, the quickly
degrading nature of digitized material, differing display standards for digital
products and their associated problems, and the health hazards of radiation from
monitors can at times handicap digital libraries.
1. Copyright: - Digitization can violate copyright law, as the thought content of
one author can be freely transferred by others without acknowledgement. One
difficulty for digital libraries to overcome is therefore the way they distribute
information. How does a digital library distribute information at will while
protecting the copyright of the author?
2. Speed of access: - As more and more computers are connected to the Internet,
its speed of access is decreasing. If new technology does not evolve to solve
the problem, then in the near future the Internet will be full of error messages.
3. Initial cost is high: - The infrastructure cost of a digital library, i.e. the cost of
hardware, software and leased communication circuits, is generally very high.
4. Bandwidth: - A digital library needs high bandwidth to transfer multimedia
resources, but available bandwidth is decreasing day by day due to
overutilization.
5. Efficiency: - With the much larger volume of digital information, finding the
right material for a specific task becomes increasingly difficult.
6. Environment: - Digital libraries cannot reproduce the environment of a
traditional library. Many people also find reading printed material to be easier
than reading material on a computer screen.
7. Preservation: - Due to technological developments, a digital library can
rapidly become out-of-date and its data may become inaccessible.
Architecture of Digital Library
Kahn and Wilensky describe an architecture for the digital library whose
characteristics apply to all types of material. A name or identifier is
essential for every object. For the digital library, names or identifiers are a
vital building block: they are needed to identify digital objects, to register
intellectual property in digital objects, to record changes of ownership, for
citation and information retrieval, and for links between objects. These
names or identifiers must be unique.
An administrative system is required to decide who can assign them and change
the objects that they identify. They must last for very long time periods, which
excludes the use of an identifier tied to a specific location, such as the name of a
computer, and the names must persist even if the organization that named an
object no longer exists when the object is used. Computer systems are
required to resolve a name rapidly by providing the location where an object
with a given name is stored.
To satisfy these requirements, a handle system is implemented. A “handle” is a
unique string used to identify a digital object; it is independent of the location
where the digital object is stored and can remain valid over very long periods of
time. A global server provides a definitive resource for legal and archival
purposes, with a caching server for fast resolution. The computer system checks
that new names are indeed unique, and supports standard user interfaces, such as
Mosaic. A local handle server is being added for increased local control.
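The division of labour between the global server and a caching server can be
illustrated with a small Python sketch. It is not the actual handle system
software; the class names, the handle strings and the storage locations are
hypothetical and serve only to show uniqueness checking, location-independent
names and cached resolution.

```python
class GlobalHandleServer:
    """Definitive registry mapping each handle to its current storage location."""

    def __init__(self):
        self._locations = {}

    def register(self, handle: str, location: str) -> None:
        # New names must be unique, so assigning an existing handle is rejected.
        if handle in self._locations:
            raise ValueError(f"handle already assigned: {handle}")
        self._locations[handle] = location

    def move(self, handle: str, new_location: str) -> None:
        # The object moved; the handle itself does not change.
        if handle not in self._locations:
            raise KeyError(handle)
        self._locations[handle] = new_location

    def resolve(self, handle: str) -> str:
        return self._locations[handle]


class CachingHandleServer:
    """Answers repeated lookups locally and falls back to the global server."""

    def __init__(self, upstream: GlobalHandleServer):
        self._upstream = upstream
        self._cache = {}
        self.upstream_lookups = 0  # counts how often the global server was asked

    def resolve(self, handle: str) -> str:
        if handle not in self._cache:
            self.upstream_lookups += 1
            self._cache[handle] = self._upstream.resolve(handle)
        return self._cache[handle]

    def invalidate(self, handle: str) -> None:
        # Drop a cached entry, for example after the object has been moved.
        self._cache.pop(handle, None)


if __name__ == "__main__":
    registry = GlobalHandleServer()
    registry.register("lib.example/report-42", "repository-a/objects/0001")
    cache = CachingHandleServer(registry)
    print(cache.resolve("lib.example/report-42"))
```

A real deployment would also need a policy for expiring cached entries; the
explicit invalidate call here is a simplification.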
Parts of Digital Library Objects
Information is stored as “digital objects” in the digital library. A primitive idea of a
digital object is that it is just a set of bits, but this idea is too simple. The content of
even the most basic digital object has some structure, and information, such as
intellectual property rights, must be associated with the digital object. Figure 2
shows that an object in a repository has two parts, content and associated data,
sometimes called “metadata”.
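As a rough illustration of this two-part structure, a digital object could be
modelled as follows. The class and field names are assumptions chosen for the
sketch; the Kahn/Wilensky architecture does not prescribe this representation.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    handle: str   # unique, location-independent identifier
    data: bytes   # the content itself, "just a set of bits"
    metadata: dict = field(default_factory=dict)  # associated data, e.g. rights

    def describe(self) -> str:
        """Summarize the object without exposing its content."""
        rights = self.metadata.get("rights", "unspecified")
        return f"{self.handle}: {len(self.data)} bytes, rights={rights}"


if __name__ == "__main__":
    scanned_map = DigitalObject(
        handle="lib.example/map-7",          # hypothetical handle
        data=b"\x89PNG...",                  # placeholder bytes, not a real image
        metadata={"rights": "public domain", "format": "image/png"},
    )
    print(scanned_map.describe())
```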
Metadata
The term refers to any data used to aid the identification, description and location of
networked electronic resources. Many different metadata formats exist, some quite
simple in their description, others quite complex and rich. Metadata is defined as
data providing information about one or more aspects of the data, such as: means of
creation of the data; purpose of the data; time and date of creation; creator or author
of the data; placement on a computer network where the data was created; and the
standards used. The metadata of a text document contains information about its
length, author, time of writing and a summary of the document. In the case of a
digital image, metadata describes how large the picture is, the color depth, the
resolution, when it was created, and other data.
Metadata is data. As such, metadata can be stored and managed in a database, often
called a registry or repository. However, it is impossible to identify metadata just by
looking at it, because a user would not know when data is metadata or just data.
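A minimal Python sketch of the text-document case is given below. The field
names, the crude "first few words" summary and the in-memory registry are all
assumptions made for illustration; real systems follow a metadata standard and
use a proper database.

```python
from datetime import datetime


def text_metadata(text: str, author: str, created: datetime,
                  summary_words: int = 12) -> dict:
    """Build a metadata record for a text document.

    The summary is simply the first few words, a stand-in for a real abstract.
    """
    words = text.split()
    return {
        "length_chars": len(text),
        "author": author,
        "created": created.isoformat(),
        "summary": " ".join(words[:summary_words]),
    }


# A registry is just data about data, stored separately and keyed by resource.
registry = {}


def register(resource_id: str, metadata: dict) -> None:
    registry[resource_id] = metadata


if __name__ == "__main__":
    body = "Digital libraries store digital objects in repositories."
    register("doc-1", text_metadata(body, "A. Author", datetime(2020, 1, 2, 3, 4, 5)))
    print(registry["doc-1"])
```

Because the registry holds ordinary dictionaries, nothing in the values marks
them as metadata; only their role in the registry does.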
Metadata in libraries
Metadata has been used in various forms as a means of cataloging archived
information. The Dewey Decimal Classification (DDC) system employed by libraries
for the classification of library materials is an early example of metadata usage.
Library catalogues used 3x5 inch cards to display a book's title, author, subject
matter, and a brief plot synopsis, along with an abbreviated alphanumeric
identification system which indicated the physical location of the book within the
library's shelves. Such data helps classify, aggregate, identify, and locate a particular
book. Another form of older metadata collection is the use by the US Census Bureau of
what is known as the "Long Form." The Long Form asks questions that are used to
create demographic data and to find patterns of distribution. The term "metadata"
was coined in 1968 by Philip Bagley, one of the pioneers of computerized
document retrieval.
Since then the fields of information management, information science, information
technology, librarianship and GIS have widely adopted the term. In these fields the
word metadata is defined as "data about data". While this is the generally accepted
definition, various disciplines have adopted their own more specific explanation and
uses of the term.
Metadata describes the attributes and contents of an original document or work; that
is, it describes a resource. Metadata may be defined as higher-level
information that describes the content, context, quality, structure and accessibility of
a specific data set, such as digital data images, databases and printed materials. As
large scientific databases were developed, it became evident that surrogates were
required to provide more information about data sets.
Metadata include two types of information
1. Basic details about the institutions that hold relevant information: who are they,
where are they and what is their function? What are their:
a. Available resources?
b. Key linkages (who is currently working with whom and how)?
2. Details about relevant data sets:
a. Description of data sets (what, purpose, format and how managed).
b. Coverage (geographic, thematic, time scale, completeness, limitations and
gaps) and access (availability, cost, formats available and documentation).
Metadata not only provides pointers to the original data sets but also helps
in sharing data among database producers. It is a tool to integrate data
that are in heterogeneous formats and scattered geographically. Several
agencies are taking initiatives in creating metadata and metadata bases using
various metadata standards.
The structure of information in a digital library
Interactions, such as the query described above, require that information in a digital
library be organized effectively. Within the library, information is stored as basic
units of digital information, e.g., a digitized map, a section of text, a Web page, a
scanned photograph, etc. In digital form, each basic unit is a sequence of bits, but
users often want to refer to material at a higher level of abstraction than the
individual item. Common English terms, such as a "report", a "computer program",
or an "opera" can refer to many items that are variants of each other. They may have
different formats, minor differences of content, different usage restrictions, and so
on, but for some purposes users are willing to consider them as equivalent.
The issues to be addressed in structuring information include the
following.
Digital materials are frequently related to other materials by relationships such as
part/whole, sequence, etc. For example, a digitized text may consist of pages,
chapters, front matter, an index, illustrations, and so on. In the World Wide Web, a
typical item may include several pages of text, with embedded images, and links to
other information. A single computer program is assembled from many files, both
source and binary, with complex rules of inclusion. Materials belong to collections.
These may be collections in the traditional, custodial sense; they may be the on-line
groupings provided by a publisher; or they may be the pages maintained by a
Webmaster.
The same item may be stored in several digital formats. Sometimes, these formats
are exactly equivalent and it is possible to convert from one to the other (e.g., an
uncompressed image and the same image stored with a loss-less compression). At
other times, the different formats contain different information (e.g., differing
representations of a page of text in SGML and PostScript formats).
a. Because digital objects are easy to change, different versions are created
continually. (Some organizations change their Web home page several times
per month.) Versions may differ by a single bit or may be very different. When
existing material is converted to digital form, the same physical item may be
converted several times. For example, a scanned photograph may have a high
resolution archival version, a medium quality version, and a thumbnail.
b. Each element of digital information may have different rights and
permissions associated with it.
c. The manner in which the user wishes to access material may depend upon the
characteristics of computer systems and networks, and the size of the material.
For example, a user connected to the digital library over a high speed network
may have a different pattern of work from the same user when using a dial-up
line.
The information architecture described here provides a general approach
to organizing the material within the digital library in such a manner that
computer programs can understand the structure of the material and carry out
the interactions that the user wishes.
Repository
A repository stores digital objects, both the content and the metadata. A digital
object as stored in a repository may be very different from the digital object that
is made available to users’ computers. Different repositories will have very
different internal organizations, but for each digital object every repository will
have a properties record, which holds attributes of the object, and a transaction
log. Since digital objects contain valuable intellectual property, the stored form
of a digital object within the repository includes information that allows it to
be managed within economic and social frameworks. The repository maintains
this information, provides basic reference information, and provides security to
ensure that only valid operations are carried out on the digital objects. The
internal organization of a repository and the way that digital objects are stored
are hidden from the user. Repositories are accessed through a simple protocol
called the “repository access protocol.” The basic commands in this protocol are
those to access a digital object and its metadata, and the service request to
disseminate a digital object. In addition there are commands to add and delete
digital objects.
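The commands named above can be sketched as methods on a small in-memory
repository. This is a toy model written under stated assumptions, not the
repository access protocol itself: the method names, the "allowed_users"
property used for the security check, and the log format are all hypothetical.

```python
from datetime import datetime, timezone


class Repository:
    """Toy repository with a properties record and transaction log per store."""

    def __init__(self):
        self._objects = {}  # handle -> (content, properties); hidden from users
        self._log = []      # transaction log of (timestamp, command, handle)

    def _record(self, command: str, handle: str) -> None:
        self._log.append((datetime.now(timezone.utc), command, handle))

    def add(self, handle: str, content: bytes, properties: dict) -> None:
        if handle in self._objects:
            raise ValueError(f"handle already stored: {handle}")
        self._objects[handle] = (content, dict(properties))
        self._record("add", handle)

    def delete(self, handle: str) -> None:
        del self._objects[handle]
        self._record("delete", handle)

    def access_metadata(self, handle: str) -> dict:
        properties = dict(self._objects[handle][1])  # return a copy
        self._record("access_metadata", handle)
        return properties

    def disseminate(self, handle: str, user: str) -> bytes:
        content, properties = self._objects[handle]
        allowed = properties.get("allowed_users")  # None means unrestricted
        if allowed is not None and user not in allowed:
            self._record("denied", handle)
            raise PermissionError(f"{user} may not receive {handle}")
        self._record("disseminate", handle)
        return content

    def transaction_log(self) -> list:
        # Expose commands and handles only; timestamps stay internal here.
        return [(command, handle) for _, command, handle in self._log]


if __name__ == "__main__":
    repo = Repository()
    repo.add("lib.example/thesis-1", b"%PDF...", {"title": "A Thesis"})
    print(repo.access_metadata("lib.example/thesis-1"))
    print(repo.transaction_log())
```

Callers never see how objects are stored; they only issue commands, which is
the point of hiding the internal organization behind a protocol.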
Electronic theses and dissertations
Electronic theses and dissertations (ETDs) are defined as those theses and
dissertations submitted, archived, or accessed primarily in electronic formats.
That includes traditional word-processed documents made available in PDF, as
well as less traditional hypertext and multimedia formats published
electronically on CD-ROM or the World Wide Web.
Needs of ETD
a. Almost all theses and dissertations are already produced as electronic
documents, and if researchers know in advance how to prepare an ETD, then
creating their own ETD is usually a very simple process.
b. Minimize duplication of effort.
c. Improve visibility.
d. Make ETDs available faster to outside audiences.
e. Cost and benefits.
f. Enhance access to university research.
g. Help universities develop digital library services and infrastructure.
h. Increase sharing and collaboration among universities and students.
Objectives for ETD
The traditional methods of archiving and storing theses and dissertations are
inefficient and unwieldy. Many theses and dissertations lie mouldering in library
stacks, with no efficient way for researchers to locate the information that may
be contained in them. Further, the time and cost involved in procuring copies of
those works may often be prohibitive. The main objectives are as follows:
a) To advance digital library technology.
b) To empower students to convey a richer message through the use of
multimedia and hypermedia technologies.
c) To empower universities to unlock their information resources.
d) To improve education and research by allowing students to produce electronic
documents.
e) To lower the cost of submitting and handling theses and dissertations.
Technical issues involved
a. Tools for creation
b. Management
c. Access
d. Archiving and storage
Metadata:
a) Capability for complete full-text retrieval.
b) Copyright and publication; multilingual system.
c) Document format of ETD (PDF or XML).
d) Dublin Core and Resource Description Framework (RDF).
e) The information retrieval engine.
f) VTLS union metadata service for NDLTD format for ETD.
g) What information regarding ETDs can be collected and shared.
h) XML and ETD metadata (ETD-MS: an interoperability metadata
standard for electronic theses and dissertations).
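To make the XML and Dublin Core items concrete, the following Python sketch
serializes a thesis description using elements from the Dublin Core element
set. It is an illustration only: the wrapper element name "record" and the
sample values are assumptions, and the output is plain Dublin Core rather than
a complete ETD-MS record.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialize simple name/value pairs as Dublin Core elements in XML."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(dublin_core_record({
        "title": "A Sample Thesis",   # placeholder values, not a real thesis
        "creator": "A. Student",
        "type": "Electronic Thesis or Dissertation",
        "format": "application/pdf",
    }))
```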
ETD in India:
By conducting research and producing PhD theses as a unique
source of information, Indian universities play a major role in the generation and
dissemination of knowledge. UGC INFONET, an ambitious programme of the
UGC, is in place, and university libraries can best utilize it for content
creation and management. As part of ongoing international efforts to build a
networked digital library of theses and dissertations, Indian university libraries
can also develop digital collections of electronic theses and dissertations (ETDs).
Fifteen universities have registered and started contributing ETDs to UGC INFONET
Shodhganga. While ETDs are owned and maintained by the institutions at which
they were produced or archived, it is possible to give searchers the appearance
of a single collection by gathering all the metadata (title, author etc.) into a
central search engine. Then, when a potentially relevant document is found,
the user is redirected to the institution that holds the actual document.
Otherwise theses in e-form can be sent to INFLIBNET, where we can host
them and allow users to browse through and download them. INFLIBNET has
already hosted an online database of PhD theses submitted to Indian
universities. The full text of existing theses collections can also be made available
by converting them into digital form.
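The "single collection" approach described above, in which metadata is gathered
centrally while documents stay with their institutions, can be sketched as
follows. The class name, record fields and URLs are hypothetical and are not
taken from Shodhganga or any other real service.

```python
class UnionCatalogue:
    """Central index of metadata gathered from many institutions."""

    def __init__(self):
        self._records = []

    def harvest(self, institution: str, records: list) -> None:
        # Only metadata is copied; the documents stay with the institution.
        for record in records:
            self._records.append({**record, "institution": institution})

    def search(self, term: str) -> list:
        term = term.lower()
        return [
            r for r in self._records
            if term in r["title"].lower() or term in r["author"].lower()
        ]

    @staticmethod
    def redirect_target(record: dict) -> str:
        # The user is sent to the holding institution for the full text.
        return record["url"]


if __name__ == "__main__":
    catalogue = UnionCatalogue()
    catalogue.harvest("University A", [
        {"title": "Metadata for Digital Libraries", "author": "R. Scholar",
         "url": "https://etd.university-a.example/101"},
    ])
    for hit in catalogue.search("metadata"):
        print(hit["institution"], "->", UnionCatalogue.redirect_target(hit))
```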
ISSUES IN DIGITAL LIBRARY DEVELOPMENT
Digital library development teams in India face numerous problems, both when
they embark on digital development and during its progress. Some of the
most prominent among them include the following:
1. Lack of Proper ICT Infrastructure
Digital libraries demand cutting-edge IT and communication infrastructure
such as:
a. High-end, powerful servers; structured LAN with broadband intranet
facilities, ideally optical-fiber-based Gigabit networks; and the required number of
workstations capable of providing online information services, computing and
multimedia applications.
b. Internet connectivity with sufficient bandwidth, capable of meeting the
informational and computational requirements of the user community.
c. Lack of proper planning and integration of information resources: presently
library acquisitions in India are either paper based or electronic. Some
libraries need retro-conversion and digitization of library holdings too.
The literature on related studies shows that there is a severe lapse on the part of
libraries with regard to proper planning of the information resources which are
conducive to developing digital libraries. There is a dire need for proper
planning and a meticulously framed content integration model which is
achieved and implemented through world standard digital library
technologies.
2. Rigidity in the Publisher’s Policies and Data Formats
Having successfully installed and configured a digital library does not qualify
a library to automatically load its whole digital collection into the digital
library. One has to obtain the publisher’s consent and copyright permissions
for the same. Digital library software usually accepts and processes all popular
and standard digital formats such as HTML, Word, RTF, PPT, or PDF.
Most publishers put their materials in their own proprietary e-book
reader formats, from which text extraction becomes almost impossible.
3. Lack of ICT Strategies and Policies
A vast majority of the libraries in India do not have laid-down policies on
ICT planning and strategies to meet the challenges posed by the technology
push, the information overload, and the demand pull from users.
4. Lack of Technical Skill
The human resources available in libraries need periodic professional
enrichment and rigorous training on the latest technologies
in use in the new information environment. The training
programmes being offered in India at the moment are not able to meet the
demand in terms of quantity as well as quality.
5. Management Support
For the provision of world-class information systems, resources and
services, libraries need wholehearted support from their respective
management. Institutional support in terms of proper funding, human
resources and IT skill enrichment are prerequisites for the development and
maintenance of state-of-the-art digital library systems and services.
6. Copyright Issues
Issues of copyright, intellectual property and fair use are posing an
unprecedented array of problems to libraries, and librarians are struggling
to cope with all these related issues in the new digital environment.
Demonstration:
To demonstrate the viability of integrating file system and digital library
technology, we implemented an object-oriented extension to the file system called
Synopsis. The traditional file system interface is augmented with a uniform, logical
interface for secure, scalable, distributed information sharing. In addition to
traditional untyped files, Synopsis defines an interface to a typed file object called
a synopsis. The file system uses static directories to group similar files. Synopsis
defines a meta-object, called a digest, to classify synopses dynamically. A digest is
very similar to a database view. Path names serve to identify files. To discover files,
Synopsis adds content-based addressing through queries on synopsis properties. For
operational encapsulation, Synopsis adds method invocation on a synopsis as a way
of accessing a file.
In Synopsis, the metadata associated with the file object is represented by a
collection of attributes. The attributes associated with a synopsis are partitioned into
two sets, search and state, according to purpose. The purpose of search attributes is
to store metadata useful in finding and classifying the file. Typically, search
attributes are derived from properties of the file; hence the name "synopsis", which
implies that the file object is intended as a summary of the file. The purpose of state attributes
is to store information that is necessary for method implementation. Minimally, the
state contains information about the type of the synopsis, its identifier, access
controls, and the associated file.
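The split between search and state attributes, and method invocation on a synopsis, might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the Synopsis class, the METHODS table, the invoke function and the attribute keys are hypothetical and do not reproduce the actual Synopsis interface.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class Synopsis:
    # Search attributes: metadata used to find and classify the file.
    search: Dict[str, Any] = field(default_factory=dict)
    # State attributes: what method implementations need
    # (type, identifier, access controls, associated file).
    state: Dict[str, Any] = field(default_factory=dict)


# Hypothetical method table keyed by (synopsis type, method name).
METHODS: Dict[Tuple[str, str], Callable[[Synopsis], Any]] = {}


def method(type_name: str, method_name: str):
    """Register a function as a method for one synopsis type."""
    def register(fn):
        METHODS[(type_name, method_name)] = fn
        return fn
    return register


@method("document", "locate")
def locate(s: Synopsis) -> str:
    return s.state["file"]


def invoke(s: Synopsis, user: str, method_name: str) -> Any:
    """Check access controls, then dispatch on the synopsis type."""
    if user not in s.state["access"]:
        raise PermissionError(f"{user} may not access {s.state['id']}")
    return METHODS[(s.state["type"], method_name)](s)


report = Synopsis(
    search={"title": "Defect report", "keywords": ["crash", "login"]},
    state={"type": "document", "id": "syn-001",
           "access": {"alice"}, "file": "/reports/defect.txt"},
)
print(invoke(report, "alice", "locate"))
```

Keeping the two attribute sets apart means queries only ever touch the search attributes, while the state attributes stay private to the method implementations.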
Synopsis is currently deployed within Transarc. It manages approximately 200,000
files which include program source, product documentation, Web pages, news
postings, customer information files, and defect reports. Through a secure HTTP
gateway, customers, consultants and product support specialists invoke methods on
synopses approximately 10,000 times each week to solve critical (and not-so-
critical) customer issues.
In addition to the file location services provided by the HTTP gateway, we
implemented several applications that simplify the interface to legacy source code
control software, improve processing of electronic messages, and integrate diverse
scientific data files. We implemented task and annotation services that simplify
information sharing across all applications. Clients of the annotation service use it
to leave hints about the relationships that exist between documents.
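As a rough illustration of how clients could leave hints about relationships between documents, the sketch below models an annotation as a (source, relation, target) triple. The source does not describe the annotation service's interface, so the AnnotationService class, its method names, the relation labels and the file names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Annotation(NamedTuple):
    source: str    # document the hint is attached to
    relation: str  # free-form label chosen by the client
    target: str    # related document


class AnnotationService:
    """Hypothetical in-memory store of relationship hints between documents."""

    def __init__(self) -> None:
        self._by_source: Dict[str, List[Annotation]] = defaultdict(list)

    def annotate(self, source: str, relation: str, target: str) -> Annotation:
        note = Annotation(source, relation, target)
        self._by_source[source].append(note)
        return note

    def related(self, source: str, relation: Optional[str] = None) -> List[str]:
        # With no relation given, return every document linked from the source.
        return [n.target for n in self._by_source.get(source, [])
                if relation is None or n.relation == relation]


svc = AnnotationService()
svc.annotate("defect-142.txt", "fixed-by", "patch-17.diff")
svc.annotate("defect-142.txt", "discussed-in", "news-post-88.txt")
print(svc.related("defect-142.txt"))
print(svc.related("defect-142.txt", "fixed-by"))
```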
Other systems attempt to enhance the file system interface with content-management
services. The Semantic File System (SFS) uses types to identify transducers that
extract and index document summaries to improve file location. Garlic uses typed
object wrappers that are similar to a synopsis to hide heterogeneity among a
collection of data sources (including files). Shore is an object database that provides a
shell extension for legacy access through a file system interface.
Future Enhancement
Large-scale digitization projects are underway at Google, the Million Book Project,
and the Internet Archive. With continued improvements in book handling and
presentation technologies such as optical character recognition, and the development of
alternative depositories and business models, digital libraries are rapidly growing
in popularity. Just as libraries have ventured into audio and video collections, so
have digital libraries such as the Internet Archive. The Google Books project recently
won a court victory allowing it to proceed with its book-scanning project, which had
been contested by the Authors Guild. This helped open the road for libraries to work with
Google to better reach patrons who are accustomed to computerized information.
According to Larry Lannom, Director of Information Management Technology at
the nonprofit Corporation for National Research Initiatives (CNRI), "all the
problems associated with digital libraries are wrapped up in archiving." He goes on
to state, "If in 100 years people can still read your article, we'll have solved the
problem." Daniel Akst, author of The Webster Chronicle, proposes that "the future
of libraries, and of information, is digital." Peter Lyman and Hal Varian,
information scientists at the University of California, Berkeley, estimate that "the
world's total yearly production of print, film, optical, and magnetic content would
require roughly 1.5 billion gigabytes of storage." Therefore, they believe that "soon
it will be technologically possible for an average person to access virtually all
recorded information."
Conclusion:
Appropriate infrastructure, tools, techniques and manpower are the basic needs
for the development of a digital library. The concept of a digital library is a new
phenomenon in developing countries, and there is a lack of library
experts who are also well trained in the digitizing process. Converting a
traditional library into a digital library is a very complex task, and it requires
an adequate number of highly trained staff. In India, training
for library professionals in the use of digital resources and the development of
digital libraries in a networked environment is given by institutes such as INSDOC,
NISCAIR, INFLIBNET and DELNET. In universities all over the country,
departments of Library and Information Science have been providing some basic
training in library automation, which has not been sufficient to
equip library professionals for library automation work. Greenstone,
DSpace and EPrints installations are picking up quite fast in India, and institutions like
DRTC, INFLIBNET, NCSI, the IITs, IIMK and many others are popularizing
and providing training on this software. India has recognized the power of digital libraries,
and many initiatives are under way to develop them.
References
1. Arms, William Y. Key Concepts in the Architecture of the Digital Library. D-Lib Magazine,
July 1995.
2. Arms, William Y.; Dushay, Naomi; Fulker, Dave; and Lagoze, Carl. A Case Study in Metadata
Harvesting: The NSDL
3. Digital Library.
4. http://www.dlib.org/metrics/public/papers/dig-lib-scope.html
5. http://www.cnri.reston.va.us/kahn-cerf-88.pdf

Digital library

  • 1. 1 Data Integration and its application on Digital Library System ABSTRACT A library may procure contents in various sources and forms to service their clients. In the predominantly paper based erstwhile environment all these contents were put to similar types of use, and copyright restrictions were imposed based on the quantum of pages copied etc. In the electronic and digital perspective, owners of information are resorting to punitive measures regarding the use and contents in digital form. Some of the constraints faced by our libraries to engage in serious digital initiatives are three fold - that of money, manpower and contents. Most of our libraries, particularly in the higher education and research institutes solely depend on the information providers and publishers in the developed world to satisfy their urge for vital contents that inspire indigenous research. Since contents are a major ingredient in digital library development, the pragmatic and viable way out for libraries is to judiciously judge them as available in electronic forms in optical media or on Web and procure at least some of them for hosting locally. This paper presents some of the major issues involved in such a critical activity with some illustrative examples available like IEE/IEEE Electronic Library, Indian Standards on CD- ROM, Science Direct and Web access of Indian Academy of Sciences journals. The justification for selecting external contents has also been mentioned. A detailed checklist for evaluating contents is presented from various angles, like authenticity of content, user interface and search.
  • 2. 2 Introduction Data Integration: Data integration is a process in which heterogeneous data is retrieved and combined as an incorporated form and structure. Data integration allows different data types (such as data sets, documents and tables) to be merged by users, organizations and applications, for use as personal or business processes and/or functions. It is generally implemented in data warehouses (DW) through specialized software that hosts large data repositories from internal and external resources. Data is extracted, amalgamated and presented as a unified form. For example, a user’s complete data set may include extracted and combined data from marketing, sales and operations, which is combined to form a complete report. A data integration project usually involves the following steps: a. Accessing data from all its sources and locations, whether those are on premises or in the cloud or some combination of both. b. Integrating data, so that records from one data source map to records in another (e.g., even if one dataset uses “last name, first name” and another uses “fname,lname” the integrated set will make sure both end up in the right
  • 3. 3 place). This type of data preparation is essential for analytics or other applications to be able to use the data with any success. c. Delivering integrated data to the business exactly when the business needs it, whether it is in batch, near real time, or real time. Application of data integration Information and Communication Technology has revolutionized the concept of libraries. Each and every library is slowly getting digitized. A 'digital library' comprises digital collections, services and infrastructure to support lifelong learning, research, scholarly communication as well as preservation and conservation of our recorded knowledge. It is also a process of democratization of information. This article will discuss the factors that will necessitate the traditional libraries to get digitized, as well as the definition, advantages and disadvantages of digital libraries, the requirement for building a digital library etc. We are in the age of a networked society where information technology in addition to its use in all spheres of human activity has been used extensively to record, store, and disseminate the information in the digital form. Information technology has almost converted the world into a global village. The revolution in the information technology sector is influencing medical information and information. Libraries are also changing to meet the demand put on them. The new generation whose demand for information is never met is always demanding that traditional libraries should be developed as a well- equipped and interconnected as digital libraries. Digital Library: Digital library System is a application of data integration. There are many definitions of a "digital library." Terms such as "electronic library" and "virtual library" are often used synonymously. The elements that have been identified as common to these definitions are: a. The digital library is not a single entity; b. 
The digital library requires technology to link the resources of many; c. The linkages between the many digital libraries and information services are transparent to the end users; d. Universal access to digital libraries and information services is a goal;
  • 4. 4 e. Digital library collections are not limited to document surrogates: they extend to digital artifacts that cannot be represented or distributed in printed formats According to Arms a digital library is a managed collection of information with associated services where the information is stored in digital format and accessible over a network. Fig: Digital Library A digital library is an organized collection of digitized material or it’s holding in the digital form, which can be accessible by a computer on the network by using TCP/IP or other protocol.Digital libraries initiative in 1994. In the Kahn/ Wile sky architecture, items in the digital library are called “digital objects”. They are stored in “repositories” and identified by “handles”. Information stored in a digital object is called “content” which is divided into “data” and information about the data, known as “properties” or “Metadata”. Software: Here are a number of software packages for use in general digital libraries, for notable ones see Digital library software. Institutional repository software, which focuses primarily on ingest, preservation and access of locally produced documents, particularly locally produced academic outputs, can be found in Institutional repository software. This software may be proprietary, as is the case with the Library of Congress which uses Dig board and CTS to manage digital content
  • 5. 5 Digital Library Services The major digital library services include a. OPAC to web PAC b. Digital Reference Service c. Library Chat Rooms d. Electronic Delivery Services e. Virtual Library Tours f. Ask-A-Librarian g. Real Time Services h. Bulletin Boards i. Web-based User Education Web Forms j. Frequently Asked Questions (FAQ) k. Selective Dissemination of Information in Digital Library: Delivering Customized Contents l. RSS Feeds Requirement for digital libraries The Internet and World Wide Web provide the impetus and technological environment for the development and operation of a digital library. The Internet provides the TCP/IP and or its associated protocol for accessing the information and web provide tools and technique for publishing the information over Internet. In the digital environment it is reasonable to say that a central back up or archive should be created at the national level, which will store information out put of the region as well as information from out side the country. Some of the requirement for a digital libraries are: 1 .Audio visual: Color T.V., V.C.R., D.V.D., Sound box, Telephone etc. 2. Computer: Server, P.C. with multimedia, U.PS. etc
  • 6. 6 3. Network: LAN, MAN, WAN, Internet etc.
4. Printer: Laser printer, dot matrix, barcode printer, digital graphic printer etc.
5. Scanner: H.P. ScanJet, flatbed, sheet feeder, drum scanner, slide scanner, microfilming scanner, digital camera, barcode scanner etc.
6. Storage devices: Optical storage device, CD-ROM, jukebox etc.
7. Software: Any suitable software that is interconnected and suitable for LAN and WAN connection.
Factors of change to digital libraries
The limited buying power of libraries, the complex nature of recent documents and storage problems are some of the common factors driving the change to digital mode. Other factors are:
1. Information explosion
2. Searching problems in traditional libraries
3. Low cost of technology: When we consider the storage capacity of digital documents and their maintenance, it is easy to see that the cost of the technology is much lower than that of traditional libraries.
  • 7. 7 4. Environmental factor: The use of digital libraries is the cleanest technology, fulfilling the slogan “Burn a CD-ROM, save a tree.”
Elements of the digital library
A fully developed digital library environment involves the elements listed below. These components might not all be part of a discrete digital library system; they could be provided by other related or multipurpose systems or environments. Accordingly, integration is an issue consistently cited by digital library developers.
a. A private or public network.
b. Client services for the browser, including repository querying and workflow.
c. Content delivery via file transfer or streaming media.
d. Initial conversion of content from physical to digital form.
e. Patron access through a browser or dedicated client.
f. Storage of digital content and metadata in an appropriate multimedia repository, including rights management capabilities to enforce intellectual property rights, if required. E-commerce functionality may also be present if needed to handle accounting and billing.
g. The extraction or creation of metadata or indexing information describing the content to facilitate searching and discovery, as well as administrative and structural metadata to assist in object viewing, management and preservation.
Advantages of the Digital Library
A digital library is not confined to a particular location or building; it is virtually distributed all over the world. Users can get their information on their own computer screens by using the Internet. In effect it is a network of multimedia systems that provides fingertip access.
1. No physical boundary: The user of a digital library need not go to the library physically; people from all over the world can gain access to the same information, as long as an Internet connection is available.
2. Round the clock availability: Digital libraries can be accessed at any time, 24 hours a day and 365 days of the year.
3. Multiple accesses: The same resources can be used at the same time by a number of users.
  • 8. 8 4. Structured approach: A digital library provides access to much richer content in a more structured manner, i.e. we can easily move from the catalogue to a particular book, then to a particular chapter, and so on.
5. Information retrieval: The user is able to search the entire collection using any word or phrase that it contains. A digital library provides very user-friendly interfaces, giving clickable access to its resources.
6. Preservation and conservation: An exact copy of the original can be made any number of times without any degradation in quality.
7. Space: Whereas traditional libraries are limited by storage space, digital libraries have the potential to store much more information, simply because digital information requires very little physical space. When a library has no space for extension, digitization is the only solution.
8. Networking: A digital library can very easily link to the resources of other digital libraries, so seamlessly integrated resource sharing can be achieved.
9. Cost: The cost of maintaining a digital library is much lower than that of a traditional library. A traditional library must spend large sums of money on staff, book maintenance, rent, and additional books. Digital libraries do away with these fees.
Disadvantages of the Digital Library
Computer viruses, the lack of standardization for digitized information, the quick degradation of digitized material, differing display standards for digital products and the problems associated with them, and the health hazards of radiation from monitors can at times handicap digital libraries.
1. Copyright: Digitization can violate copyright law, because the thought content of one author can be freely transferred by others without acknowledgement. One difficulty for digital libraries to overcome is therefore the way information is distributed. How does a digital library distribute information at will while protecting the copyright of the author?
2. Speed of access: As more and more computers are connected to the Internet, its speed of access is noticeably decreasing. If new technology does not evolve to solve the problem, then in the near future the Internet will be full of error messages.
3. Initial cost is high: The infrastructure cost of a digital library, i.e. the cost of hardware, software and leased communication circuits, is generally very high.
  • 9. 9 4. Bandwidth: A digital library needs high bandwidth for the transfer of multimedia resources, but the available bandwidth is decreasing day by day due to over-utilization.
5. Efficiency: With the much larger volume of digital information, finding the right material for a specific task becomes increasingly difficult.
6. Environment: Digital libraries cannot reproduce the environment of a traditional library. Many people also find reading printed material to be easier than reading material on a computer screen.
7. Preservation: Due to technological developments, a digital library can rapidly become out-of-date and its data may become inaccessible.
Architecture of Digital Library
Kahn and Wilensky describe an architecture for the digital library whose characteristics can apply to all types of material. A name or identifier is essential for storing an object. For the digital library, names or identifiers are a vital building block: they are needed to identify digital objects, to register intellectual property in digital objects and to record changes of ownership; they are required for citation and for information retrieval; and they are used for links between objects. These names or identifiers must be unique. An administrative system is required to decide who can assign them and change the objects that they identify. They must last for very long time periods, which excludes the use of an identifier tied to a specific location, such as the name of a computer, and the names must persist even if the organization that named an object no longer exists when the object is used. Computer systems are required to resolve a name rapidly, by providing the location where an object with a given name is stored. To satisfy these requirements a handle system is implemented. A “handle” is a unique string used to identify a digital object; it is independent of the location where the digital object is stored and can remain valid over very long periods of time. 
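The naming requirements above (uniqueness, location independence, fast resolution) can be illustrated with a small sketch. It is not the actual handle system: the class `HandleRegistry`, its method names and the example handle strings are all hypothetical, and an in-memory dictionary stands in for a real handle server.

```python
from typing import Dict


class HandleRegistry:
    """Hypothetical in-memory stand-in for a handle server (illustration only)."""

    def __init__(self) -> None:
        # Maps a handle to the current storage location of its object.
        self._locations: Dict[str, str] = {}

    def register(self, handle: str, location: str) -> None:
        # Handles must be unique, so an existing name is never reassigned.
        if handle in self._locations:
            raise ValueError(f"handle already assigned: {handle}")
        self._locations[handle] = location

    def move(self, handle: str, new_location: str) -> None:
        # The object changes location; the handle itself stays the same.
        if handle not in self._locations:
            raise KeyError(handle)
        self._locations[handle] = new_location

    def resolve(self, handle: str) -> str:
        # Resolution returns the location where the named object is stored.
        return self._locations[handle]
```

Because citations and links record only the handle, moving an object with `move` does not invalidate them; a later `resolve` simply returns the new location.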
A global server provides a definitive resource for legal and archival purposes, with a caching server for fast resolution. The computer system checks that new names are indeed unique, and supports standard user interfaces, such as Mosaic. A local handle server is being added for increased local control.
Parts of Digital Library Objects
Information is stored as “digital objects” in the digital library. A primitive idea of a digital object is that it is just a set of bits, but this idea is too simple. The content of
  • 10. 10 even the most basic digital object has some structure, and information, such as intellectual property rights, must be associated with the digital object. Figure 2 shows that an object in a repository has two parts: content and associated data, sometimes called “metadata”.
Metadata
The term refers to any data used to aid the identification, description and location of networked electronic resources. Many different metadata formats exist, some quite simple in their description, others quite complex and rich. Metadata is defined as data providing information about one or more aspects of the data, such as:
Means of creation of the data;
Purpose of the data;
Time and date of creation;
Creator or author of the data;
Placement on a computer network where the data was created, and the standards used.
The metadata of a text document contains information about its length, its author, when it was written, and a summary of the document. In the case of a digital image, metadata describes how large the picture is, the color depth, the image resolution, when it was created, and other data. Metadata is data. As such, metadata can be stored and managed in a database, often called a registry or repository. However, it is impossible to identify metadata just by looking at it, because a user would not know when data is metadata or just data.
Metadata in libraries
Metadata has been used in various forms as a means of cataloging archived information. The DDC system employed by libraries for the classification of library materials is an early example of metadata usage. Library catalogues used 3x5 inch cards to display a book's title, author, subject matter, and a brief plot synopsis, along with an abbreviated alphanumeric identification system which indicated the physical location of the book within the library's shelves. Such data helps classify, aggregate, identify, and locate a particular book. Another form of older metadata collection is the use by the US Census Bureau of what is known as the "Long Form." The Long Form asks questions that are used to create demographic data and to find patterns of distribution. The term "metadata" was coined in 1968 by Philip Bagley, one of the pioneers of computerized document retrieval.
  • 11. 11 Since then the fields of information management, information science, information technology, librarianship and GIS have widely adopted the term. In these fields the word metadata is defined as "data about data". While this is the generally accepted definition, various disciplines have adopted their own more specific explanations and uses of the term. Metadata describes the attributes and contents of an original document or work; it describes a resource. Metadata may be defined as higher-level information that describes the content, context, quality, structure and accessibility of a specific data set, such as digital data, images, databases and printed materials. As large scientific databases were developed, it became evident that surrogates were required to provide more information about data sets.
Metadata includes two types of information:
1. Basic details about the institutions that hold relevant information: who are they, where are they and what is their function? What are their:
a. Available resources?
b. Key linkages (who is currently working with whom and how)?
2. Details about relevant data sets:
a. Description of data sets (what, purpose, format and how managed).
b. Coverage (geographic, thematic, time scale, completeness, limitations and gaps) and access (availability, cost, formats available and documentation).
Metadata not only provides pointers to the original data sets but also helps in sharing data among database producers. It is a tool to integrate data that are in heterogeneous formats and scattered geographically. Several agencies are taking initiatives in creating metadata and metadata bases by using various metadata standards.
The structure of information in a digital library
Interactions, such as the query described above, require that information in a digital library be organized effectively. 
Within the library, information is stored as basic units of digital information, e.g., a digitized map, a section of text, a Web page, a scanned photograph, etc. In digital form, each basic unit is a sequence of bits, but users often want to refer to material at a higher level of abstraction than the individual item. Common English terms, such as a "report", a "computer program", or an "opera" can refer to many items that are variants of each other. They may have
  • 12. 12 different formats, minor differences of content, different usage restrictions, and so on, but for some purposes users are willing to consider them as equivalent. The issues to be addressed in structuring information include the following.
a. Digital materials are frequently related to other materials by relationships such as part/whole, sequence, etc. For example, a digitized text may consist of pages, chapters, front matter, an index, illustrations, and so on. In the World Wide Web, a typical item may include several pages of text, with embedded images, and links to other information. A single computer program is assembled from many files, both source and binary, with complex rules of inclusion.
b. Materials belong to collections. These may be collections in the traditional, custodial sense; they may be the on-line groupings provided by a publisher; or they may be the pages maintained by a Webmaster.
c. The same item may be stored in several digital formats. Sometimes, these formats are exactly equivalent and it is possible to convert from one to the other (e.g., an uncompressed image and the same image stored with a lossless compression). At other times, the different formats contain different information (e.g., differing representations of a page of text in SGML and PostScript formats).
d. Because digital objects are easy to change, different versions are created continually. (Some organizations change their Web home page several times per month.) Versions may differ by a single bit or may be very different. When existing material is converted to digital form, the same physical item may be converted several times. For example, a scanned photograph may have a high resolution archival version, a medium quality version, and a thumbnail.
e. Each element of digital information may have different rights and permissions associated with it.
f. The manner in which the user wishes to access material may depend upon the characteristics of computer systems and networks, and the size of the material. For example, a user connected to the digital library over a high speed network may have a different pattern of work from the same user when using a dial-up line.
The information architecture described here provides a general approach to organizing the material within the digital library in such a manner that computer programs can understand the structure of the material and carry out the interactions that the user wishes.
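The structuring issues listed above (parts, multiple formats and versions, per-element rights, access that depends on the network) can be made concrete with a small data-structure sketch. It is only an illustration under assumptions: the names `Rendition`, `DigitalObject` and `best_rendition`, and the idea of choosing by a byte budget, are hypothetical and are not taken from the architecture being described.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rendition:
    """One stored form of an item, e.g. an archival scan or a thumbnail."""
    media_type: str
    size_bytes: int
    rights: str  # each element may carry its own rights and permissions


@dataclass
class DigitalObject:
    handle: str                      # unique, location-independent name
    properties: Dict[str, str]       # metadata describing the object
    renditions: List[Rendition] = field(default_factory=list)
    parts: List[str] = field(default_factory=list)  # handles of component objects

    def best_rendition(self, max_bytes: int) -> Rendition:
        """Pick the largest rendition that fits the caller's byte budget."""
        fitting = [r for r in self.renditions if r.size_bytes <= max_bytes]
        if not fitting:
            raise LookupError("no rendition fits the requested budget")
        return max(fitting, key=lambda r: r.size_bytes)
```

A client on a dial-up line could call `best_rendition` with a small budget and receive the thumbnail, while a client on a fast network passes a large budget and receives the archival version of the same object.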
  • 13. 13 Repository
A repository stores digital objects, both the content and the metadata. A digital object as stored in a repository may be very different from the digital object that is made available to users’ computers. Different repositories will have very different internal organizations, but for each digital object every repository will have a properties record, which holds attributes of the object, and a transaction log. Since digital objects contain valuable intellectual property, the stored form of a digital object within the repository includes information that allows it to be managed within economic and social frameworks. The repository maintains this information, provides basic reference information, and provides security to ensure that only valid operations are carried out on the digital objects. The internal organization of a repository and the way that digital objects are stored are hidden from the user. Access is through a simple protocol called the “repository access protocol.” The basic commands in this protocol are those to access a digital object and its metadata, and the service request to disseminate a digital object. In addition there are commands to add and delete digital objects.
Electronic theses and dissertation
Electronic theses and dissertations (ETD) are defined as those theses and dissertations submitted, archived, or accessed primarily in electronic formats. That includes traditional word-processed documents made available in PDF, as well as less traditional hypertext and multimedia formats published electronically on CD-ROM or the World Wide Web.
Needs of ETD
a. Almost all theses and dissertations are produced as electronic documents, and if researchers know in advance how to prepare an ETD, then creating their own ETD is usually a very simple process.
b. Minimize duplication of effort.
c. Improve visibility.
d. Make ETDs available faster to outside audiences.
e. Cost and benefits.
f. Enhancing access to university research.
g. Helping universities develop digital library services and infrastructure.
h. Increasing sharing and collaboration among universities and students.
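The repository described under "Repository" above (a properties record per object, a transaction log, and commands to access, add and delete objects) can be sketched as follows. This is a toy model under stated assumptions, not the repository access protocol itself: the class `Repository`, the method names `deposit`, `access` and `delete`, and the shape of the log entries are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


class Repository:
    """Toy repository; storage details stay hidden behind three commands."""

    def __init__(self) -> None:
        self._content: Dict[str, bytes] = {}
        self._properties: Dict[str, Dict[str, str]] = {}
        # Each log entry is (UTC timestamp, command, handle).
        self.transaction_log: List[Tuple[str, str, str]] = []

    def _log(self, command: str, handle: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self.transaction_log.append((stamp, command, handle))

    def deposit(self, handle: str, content: bytes, properties: Dict[str, str]) -> None:
        # Add a digital object together with its properties record.
        if handle in self._content:
            raise ValueError(f"object already stored: {handle}")
        self._content[handle] = content
        self._properties[handle] = dict(properties)
        self._log("deposit", handle)

    def access(self, handle: str) -> Tuple[bytes, Dict[str, str]]:
        # Return the object's content and a copy of its metadata.
        content = self._content[handle]  # raises KeyError if unknown
        self._log("access", handle)
        return content, dict(self._properties[handle])

    def delete(self, handle: str) -> None:
        # Remove the object and its properties record.
        del self._content[handle]  # raises KeyError if unknown
        del self._properties[handle]
        self._log("delete", handle)
```

Keeping the dictionaries private mirrors the point that the internal organization is hidden from the user; a real repository would also check rights before carrying out each command, which this sketch omits.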
  • 14. 14 Objectives for ETD
The traditional methods of archiving and storing theses and dissertations are inefficient and unwieldy. Many theses and dissertations lie mouldering in library stacks, with no efficient way for researchers to locate the information that may be contained in them. Further, the time and cost involved in procuring copies of those works may often be prohibitive. The main objectives are as follows:
a) To advance digital library technology.
b) To empower students to convey a richer message through the use of multimedia and hypermedia technologies.
c) To empower universities to unlock their information resources.
d) To improve education and research by allowing students to produce electronic documents.
e) To lower the cost of submitting and handling theses and dissertations.
Technical issue involved
a. Tools for creation
b. Management
c. Access
d. Archiving and storage
Metadata:
a) Capable of complete full text retrieval.
b) Copyright and publication multilingual system.
c) Document format of ETD (PDF or XML).
d) Dublin Core and resource description format.
e) The information retrieval engine.
f) VTLS union metadata service for NDLTD format for ETD.
g) What information regarding ETDs can be collected and shared.
h) XML and ETD metadata (ETD-MS: an interoperability metadata standard for electronic theses and dissertations).
ETD in India:
By conducting research and producing PhD theses as a unique source of information, Indian universities play a major role in the generation and dissemination of knowledge. UGC INFONET, an ambitious programme of the UGC, is in place, and university libraries can best utilize it for content creation and management.
  • 15. 15 As part of the ongoing international efforts towards a networked digital library of theses and dissertations, Indian university libraries can also develop digital collections of electronic theses and dissertations (ETDs). Fifteen universities have registered and started contributing ETDs at UGC INFONET Shodhganga. While ETDs are owned and maintained by the institutions at which they were produced or archived, it is possible to give searchers the appearance of a single collection by gathering all the metadata (title, author etc.) into a central search engine. Then, when a potentially relevant document is found, the user is redirected to the institution that holds the actual document. Otherwise, theses in electronic form can be sent to INFLIBNET, where they can be hosted and users can browse and download them. INFLIBNET already hosts an online database of PhD theses submitted to Indian universities. The full text of existing theses collections can also be made available by converting them into digital form.
ISSUES IN DIGITAL LIBRARY DEVELOPMENT
Digital library development teams in India face numerous problems, both when they embark on digital development and while the work is in progress. Some of the most prominent among them include the following.
1. Lack of Proper ICT Infrastructure
Digital libraries demand cutting-edge IT and communication infrastructure such as:
a. High-end, powerful servers; structured LAN with broadband intranet facilities, ideally optical fiber based Gigabit networks; and the required number of workstations capable of providing online information services, computing and multimedia applications.
b. Internet connectivity with sufficient bandwidth, capable of meeting the informational and computational requirements of the user community.
c. Lack of proper planning and integration of information resources: presently the library acquisitions in India are either paper based or electronic. 
Some of the libraries also need retro-conversion and digitization of their holdings. The literature on related studies shows that there is a severe lapse on the part of libraries with regard to proper planning of the information resources that are conducive to developing digital libraries. There is a dire need for proper planning and a meticulously framed content integration model which is achieved and implemented through world standard digital library technologies.
  • 16. 16 2. Rigidity in the Publisher’s Policies and Data Formats
Having successfully installed and configured a digital library does not qualify a library to automatically load all of its digital collection into it. One has to obtain the publisher’s consent and copyright permissions for the same. Digital library software usually accepts and processes all popular and standard digital formats such as HTML, Word, RTF, PPT, or PDF. Most publishers put their materials in their own proprietary e-book reader formats, from which text extraction becomes almost impossible.
3. Lack of ICT Strategies and Policies
A vast majority of the libraries in India do not have laid-down policies on ICT planning and strategies to meet the challenges posed by the technology push, the information overload, and the demand pull from users.
4. Lack of Technical Skill
The human resources available in libraries need periodic professional enrichment and rigorous training on the latest technologies in use in the new information environment. The training programmes being imparted in India at the moment are not able to meet the demand in terms of quantity as well as quality.
5. Management Support
To provide world class information systems, resources and services, libraries need wholehearted support from their respective managements. Institutional support in terms of proper funding, human resources and IT skill enrichment are prerequisites for the development and maintenance of state-of-the-art digital library systems and services.
6. Copyright Issues
Issues of copyright, intellectual property and fair use are posing an unprecedented array of problems to libraries, and librarians are struggling to cope with all these related issues in the new digital environment.
  • 17. 17 Demonstration:
To demonstrate the viability of integrating file system and digital library technology, we implemented an object-oriented extension to the file system called Synopsis. The traditional file system interface is augmented with a uniform, logical interface for secure, scalable, distributed information sharing. In addition to traditional untyped files, Synopsis defines an interface to a typed file object called a synopsis. The file system uses static directories to group similar files. Synopsis defines a meta-object, called a digest, to classify synopses dynamically. A digest is very similar to a database view. Path names serve to identify files. To discover files, Synopsis adds content-based addressing through queries on synopsis properties. For operational encapsulation, Synopsis adds method invocation on a synopsis as a way of accessing a file.
In Synopsis, the metadata associated with the file object is represented by a collection of attributes. The attributes associated with a synopsis are partitioned into two sets, search and state, according to purpose. The purpose of search attributes is to store metadata useful in finding and classifying the file. Typically, search attributes are derived from properties of the file; hence the name "synopsis" implies that the file object is intended as a summary of the file. The purpose of state attributes is to store information that is necessary for method implementation. Minimally, the state contains information about the type of the synopsis, its identifier, access controls, and the associated file.
Synopsis is currently deployed within Transarc. It manages approximately 200,000 files which include program source, product documentation, Web pages, news postings, customer information files, and defect reports. 
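The split between search and state attributes, and the idea of a digest as a view that classifies synopses dynamically, can be sketched in a few lines. This is not the Synopsis system's actual interface: the classes `Synopsis` and `Digest`, the `members` method and the attribute keys used in the example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Synopsis:
    search: Dict[str, Any]  # metadata used to find and classify the file
    state: Dict[str, Any]   # type, identifier, access controls, associated file


@dataclass
class Digest:
    """A named predicate over search attributes, evaluated on demand like a view."""
    name: str
    predicate: Callable[[Dict[str, Any]], bool]

    def members(self, synopses: List[Synopsis]) -> List[Synopsis]:
        # Membership is recomputed on every call, so it tracks new synopses.
        return [s for s in synopses if self.predicate(s.search)]


catalog = [
    Synopsis({"kind": "defect report", "product": "dfs"},
             {"id": 1, "file": "/reports/1001.txt"}),
    Synopsis({"kind": "source", "product": "dfs"},
             {"id": 2, "file": "/src/main.c"}),
]
defects = Digest("defect-reports", lambda attrs: attrs.get("kind") == "defect report")
```

Unlike a static directory, the digest stores no list of files: calling `defects.members(catalog)` after appending a new defect report to `catalog` includes it without any reclassification step.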
Through a secure HTTP gateway, customers, consultants and product support specialists invoke methods on synopses approximately 10,000 times each week to solve critical (and not-so-critical) customer issues. In addition to the file location services provided by the HTTP gateway, we implemented several applications that simplify the interface to legacy source code control software, improve processing of electronic messages, and integrate diverse scientific data files. We implemented task and annotation services that simplify information sharing across all applications. Clients of the annotation service use it to leave hints about the relationships that exist between documents.
Other systems attempt to enhance the file system interface with content-management services. The Semantic File System (SFS) uses types to identify transducers that extract and index document summaries to improve file location.
  • 18. 18 Garlic uses typed object wrappers that are similar to a synopsis to hide heterogeneity among a collection of data sources (including files). Shore is an object database that provides a shell extension for legacy access through a file system interface.
Future Enhancement
Large scale digitization projects are underway at Google, the Million Book Project, and the Internet Archive. With continued improvements in book handling and presentation technologies such as optical character recognition, and the development of alternative depositories and business models, digital libraries are rapidly growing in popularity. Just as libraries have ventured into audio and video collections, so have digital libraries such as the Internet Archive. The Google Books project recently received a court victory allowing it to proceed with the book-scanning project that had been halted by the Authors Guild. This helped open the road for libraries to work with Google to better reach patrons who are accustomed to computerized information.
According to Larry Lannom, Director of Information Management Technology at the nonprofit Corporation for National Research Initiatives (CNRI), "all the problems associated with digital libraries are wrapped up in archiving." He goes on to state, "If in 100 years people can still read your article, we'll have solved the problem." Daniel Akst, author of The Webster Chronicle, proposes that "the future of libraries, and of information, is digital." Peter Lyman and Hal Varian, information scientists at the University of California, Berkeley, estimate that "the world's total yearly production of print, film, optical, and magnetic content would require roughly 1.5 billion gigabytes of storage." Therefore, they believe that "soon it will be technologically possible for an average person to access virtually all recorded information."
  • 19. 19 Conclusion:
Appropriate infrastructure, tools, techniques and manpower are the basic needs for the development of a digital library. The concept of a digital library is a new phenomenon in developing countries, and there is a lack of capable library experts who are also well trained in the digitization process. Converting a traditional library into a digital library is a very complex task, and it requires an adequate number of highly trained staff. In India, training for library professionals in the use of digital resources and the development of digital libraries in a networked environment is given by institutes such as INSDOC, NISCAIR, INFLIBNET and DELNET. In universities all over the country, departments of library and information science have been providing some basic training in library automation, which has not been sufficient to equip library professionals to handle library automation work. Greenstone, DSpace and EPrints installations are picking up quite fast in India, and institutions like DRTC, INFLIBNET, NCSI, the IITs, IIMK and many others are popularizing this software and providing training on it. India has recognized the power of digital libraries, and many initiatives are under way to develop them.
  • 20. 20 References
1. Arms, William Y. Key Concepts in the Architecture of the Digital Library. D-Lib Magazine, July 1995.
2. Arms, William Y.; Dushay, Naomi; Fulker, Dave and Lagoze, Carl. A Case Study in Metadata Harvesting: The NSDL.
3. Digital Library.
4. http://www.dlib.org/metrics/public/papers/dig-lib-scope.html
5. http://www.cnri.reston.va.us/kahn-cerf-88.pdf