1. Devi Ahilya Vishva Vidhyalya
Taksha Shila Campus, Khandwa Road, Indore
DATA MINING
(concepts, need and application in library)
A presentation submitted against the unit test-3 for the completion of M. Phil. in Library & Information Science, Second Semester (Session 2010-11)
Submitted by: Ritesh Tiwari, Roll No. 16
Guided by: Mr. Bhupendra Ratha, Lecturer (SLIS, UTD, DAVV)
2. Contents
What is Data Mining
Technical Definition
Concept
Terms Included in Data Mining
Data Warehouse
Metadata
Example of Metadata
Metadata Schema
Processes Included in Data Mining
Need of Data Mining
Applications of Data Mining in Libraries
Conclusion
References
3. What is Data Mining?
Just as minerals such as diamond and coal are extracted from mines by a process called mining, extracting exactly the required data or information from a large amount of data is known as Data Mining.
4. Technical Definition:
Data mining, a branch of Computer Science, is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. It simply means finding the required information out of huge data sets.
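As a rough illustration of "extracting patterns from large data sets", the sketch below counts which pairs of subjects are borrowed together most often. It is only a minimal example under stated assumptions: the loan records, the function name frequent_pairs and the min_support threshold are all hypothetical and do not come from any particular library system.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so (a, b) and (b, a) count as the same pair.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical loan records: each inner list holds the subjects one reader borrowed.
loans = [
    ["statistics", "databases", "ai"],
    ["statistics", "databases"],
    ["ai", "databases"],
    ["statistics", "history"],
]

print(frequent_pairs(loans, min_support=2))
```

Pairs that pass the threshold are the "patterns" here; real data mining tools use more scalable algorithms, but the idea of counting co-occurrences is the same.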
5. Concept:
The subject “Data Mining” is a part of Computer Science concerned with creating a system for capturing data or information, and for its editing, formatting, storage, naming, indexing or metadata creation, preservation, security and maintenance, in a manner that makes the searching and retrieval of data and information effective and user friendly. Data Mining is a broad term covering many processes that share the goal of fast and effective retrieval of information and data.
6. Terms included in Data Mining:
Data Warehouse
Metadata
7. Data Warehouse:
Data warehousing is the combining of data from multiple, usually varied, sources into one comprehensive and easily manipulated database. Common ways of accessing a data warehouse include queries, analysis and reporting. Because data warehousing creates a single database in the end, any number of sources can be combined, provided that the system can handle the volume. The final result is homogeneous data, which can be more easily manipulated.
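The idea of merging varied sources into one homogeneous database can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the two sources, their field names (title/yr and name/published) and the items table are assumptions invented for the example.

```python
import sqlite3

# Two hypothetical sources that describe items with different field names.
catalogue_rows = [
    {"title": "Data Mining", "yr": 2006},
    {"title": "Cataloguing", "yr": 1998},
]
ejournal_rows = [{"name": "Library Trends", "published": "2010"}]


def build_warehouse(catalogue, ejournals):
    """Load both sources into one table with a single, uniform layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (title TEXT, year INTEGER, source TEXT)")
    for row in catalogue:
        conn.execute(
            "INSERT INTO items VALUES (?, ?, ?)",
            (row["title"], row["yr"], "catalogue"),
        )
    for row in ejournals:
        # Normalise the differing field names and the year stored as text.
        conn.execute(
            "INSERT INTO items VALUES (?, ?, ?)",
            (row["name"], int(row["published"]), "ejournal"),
        )
    conn.commit()
    return conn


warehouse = build_warehouse(catalogue_rows, ejournal_rows)
# A simple report: how many items each source contributed.
report = warehouse.execute(
    "SELECT source, COUNT(*) FROM items GROUP BY source ORDER BY source"
).fetchall()
print(report)
```

Once the data sits in one uniform table, the same query language serves for querying, analysis and reporting regardless of where each row originally came from.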
8. Metadata:
Metadata is structured data which describes the characteristics of a resource. It shares many characteristics with the cataloguing that takes place in libraries, museums and archives. The term "meta" derives from the Greek word denoting a nature of a higher order or more fundamental kind. A metadata record consists of a number of pre-defined elements representing specific attributes of a resource, and each element can have one or more values.
9. Example of Metadata:
Element name    Value
Title           Web catalogue
Creator         Dagnija McAuliffe
Publisher       University of Queensland Library
Format          Text/html
Relation        Library Web site
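The record above could be held in a program as a mapping from element name to a list of values, since each element may carry one or more values. The sketch below is only one possible representation; the helper name has_value is hypothetical.

```python
# The example record, with each element mapped to a list of one or more values.
record = {
    "Title": ["Web catalogue"],
    "Creator": ["Dagnija McAuliffe"],
    "Publisher": ["University of Queensland Library"],
    "Format": ["Text/html"],
    "Relation": ["Library Web site"],
}


def has_value(record, element, term):
    """Return True if any value of the given element contains term (case-insensitive)."""
    return any(term.lower() in value.lower() for value in record.get(element, []))


print(has_value(record, "Publisher", "queensland"))
print(has_value(record, "Creator", "queensland"))
```

Using lists even for single values keeps the code the same when an element, such as Creator, later needs a second value.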
10. Metadata Schema:
The format or schema of metadata may vary between organizations according to their requirements.
Each metadata schema will usually have the following characteristics:
A limited number of elements
The name of each element
The meaning of each element
Location or Address of each
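A schema with a limited set of named elements, each with a stated meaning, can be sketched as a simple lookup table against which records are checked. The element names and meanings below are illustrative assumptions and are not taken from any published schema; the function name unknown_elements is likewise hypothetical.

```python
# Hypothetical schema: element name -> meaning of that element.
SCHEMA = {
    "Title": "Name given to the resource",
    "Creator": "Person or body responsible for the content",
    "Publisher": "Body that makes the resource available",
    "Format": "File format or physical medium",
    "Relation": "A related resource",
}


def unknown_elements(record, schema):
    """List the element names in a record that the schema does not define."""
    return sorted(name for name in record if name not in schema)


candidate = {"Title": ["Web catalogue"], "Colour": ["Blue"]}
print(unknown_elements(candidate, SCHEMA))
```

A check like this is what keeps records from different cataloguers consistent: anything outside the agreed elements is flagged before it enters the system.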
11. Metadata Schema:
Some of the most popular metadata schemas include:
Dublin Core
AACR2 (Anglo-American Cataloguing Rules)
GILS (Government Information Locator Service)
EAD (Encoded Archival Description)
IMS (IMS Global Learning Consortium)
AGLS (Australian Government Locator Service)
12. Processes included in Data Mining:
Creating the databases.
Integrating the different databases.
Editing and formatting of data for creating the data warehouse.
Organizing the data systematically in a helpful sequence.
Proper naming of the data/document.
Creating the metadata.
Searching/retrieving the required and useful information or data out of the data warehouse with the help of metadata.
Requesting the users for their feedback.
Evaluation of the system.
Making arrangements for modification.
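The retrieval step in the list above, finding documents with the help of metadata, can be sketched as a small inverted index over metadata values. This is a minimal sketch and not a description of any real library system: the document identifiers, the sample records and the function names build_index and search are all hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Map each lower-cased word found in any metadata value to the documents containing it."""
    index = defaultdict(set)
    for doc_id, metadata in records.items():
        for values in metadata.values():
            for value in values:
                for word in value.lower().split():
                    index[word].add(doc_id)
    return index


def search(index, query):
    """Return the documents whose metadata contains every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical metadata records keyed by document identifier.
records = {
    "doc1": {"Title": ["Web catalogue"], "Publisher": ["University of Queensland Library"]},
    "doc2": {"Title": ["Library annual report"], "Publisher": ["DAVV"]},
}

index = build_index(records)
print(search(index, "library"))
print(search(index, "web catalogue"))
```

Because the index is built once from the metadata, each search only looks up a few words instead of scanning every document in the warehouse.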
13. Need of Data Mining:
In earlier times, “availability of information” was the greatest problem of researchers and library users; nowadays the “huge quantity of information” is the greatest problem. A solution is needed for finding the required information within this huge amount of information, which is why “Data Mining” came into existence.
Some other needs of “Data Mining” are as follows:
For satisfying the user’s need.
For a better storage and retrieval system, especially in the digital environment.
For increasing accuracy in search.
For protecting users from data garbage.
For solving the problems raised by the information explosion.
For creating a user-friendly search environment.
For saving the time of readers.
For providing global access to information.
14. Application of Data Mining in Libraries:
Data Mining is a broad term that includes “Data Warehousing” and “Metadata creation”, and nowadays it is applied in Library and Information Science in the following ways:
In capturing information or data.
In creating databases.
In integrating databases.
In proper naming of data or information.
In indexing and cataloguing of digital documents.
In searching for information within huge volumes of data.
In improving the services of libraries and information centers.
15. Conclusion:
Data Mining, including Data Warehousing and Metadata creation, is making modern libraries capable of creating a better information storage and retrieval system, especially in the digital era. “Unavailability of information” is no longer the problem; the “huge amount of information” arising from the “information explosion” is. It is very difficult to find a specific piece of information or data within such a huge amount. Data Mining has helped libraries and information centers to handle the information explosion and to face this situation.
16. References:
Data Mining: Concepts & Techniques by Han and Kamber. Publisher: Elsevier
Data Mining: Introductory & Advanced Topics by Dunham & Sridhar. Publisher: Pearson Education
http://www.wiseqeek.com
http://www.library.uq.edu.au
http://en.wikipedia.org