KNOWLEDGE DISCOVERY PROCESS
SUBMITTED BY: SHUVRA GHOSH
ROLL NO: 07
COURSE: MLIS
GUIDED BY: PROF. UDAYAN BHATTACHARYA
DEPARTMENT OF LIBRARY AND
INFORMATION SCIENCE
JADAVPUR UNIVERSITY
Knowledge discovery is the process of discovering valuable
information in a collection of data, that is, the process of
converting raw data into useful information.
Knowledge discovery is an activity that produces
knowledge by discovering it or deriving it from existing
information.
Knowledge Discovery refers to the overall process of
discovering useful knowledge from data, and data mining
refers to a particular step in this process.
Why do we need a knowledge discovery process?
• Database data
• Data Warehouse
• Transactional data
• Other kinds of data:
    Time-related data
    Sequence data (historical data records, stock exchange data)
    Data streams (video surveillance, sensor data)
    Spatial data (maps)
    Hypertext and multimedia data (text, video, audio)
    Graph and networked data
    Engineering design data (AutoCAD)
    Web
• Interactive
• Iterative
• Procedure to extract knowledge from data
• Knowledge being searched for is:
    implicit
    previously unknown
    potentially useful
Data Cleaning − in this step, noise and inconsistent data are
removed. Example: parsing the data.
Parsing is performed to detect syntax errors. A parser decides
whether a given string of data is acceptable within the data
specification.
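The parsing idea above can be sketched as follows. This is a minimal illustration, assuming a hypothetical data specification in which every record is `id,name,amount` with a whole-number id, a non-empty name, and a non-negative amount. The field names and the `parse_record` helper are assumptions made for the example, not part of the slides.

```python
from typing import Optional

def parse_record(line: str) -> Optional[dict]:
    """Return the parsed record, or None if the line violates the assumed specification."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 3:
        return None                      # wrong number of fields: syntax error
    raw_id, name, raw_amount = parts
    if not raw_id.isdecimal() or not name:
        return None                      # id must be a whole number, name must be present
    try:
        amount = float(raw_amount)
    except ValueError:
        return None                      # amount is not a number
    if amount < 0:
        return None                      # inconsistent value under the assumed specification
    return {"id": int(raw_id), "name": name, "amount": amount}

# Hypothetical raw input: only the first and last lines satisfy the specification.
raw_lines = ["1, Asha, 250.0", "2, , 90", "x3, Ravi, 10", "4, Mina, -5", "5, Joy, 40"]
clean = [r for r in (parse_record(line) for line in raw_lines) if r is not None]
```

Rejected lines are simply dropped here; a real cleaning step might instead log or repair them.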
Data Integration − in this step, multiple data sources are combined.
Example: retail loan applications, commercial loan applications,
and demand deposit applications are combined in a bank's data
warehouse.
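A minimal sketch of the bank example, assuming each source system exports records that share a `customer_id` key. The record layouts and the `integrate` helper are hypothetical and only illustrate the idea of combining sources into one view.

```python
from collections import defaultdict

# Hypothetical extracts from three separate banking applications.
retail_loans = [{"customer_id": 101, "retail_loan": 5000}]
commercial_loans = [{"customer_id": 102, "commercial_loan": 80000}]
demand_deposits = [
    {"customer_id": 101, "deposit_balance": 1200},
    {"customer_id": 102, "deposit_balance": 300},
]

def integrate(*sources):
    """Merge the records of all sources into one row per customer_id."""
    warehouse = defaultdict(dict)
    for source in sources:
        for record in source:
            # Each source contributes its own columns to the customer's row.
            warehouse[record["customer_id"]].update(record)
    return dict(warehouse)

warehouse = integrate(retail_loans, commercial_loans, demand_deposits)
```

The sketch assumes the sources agree on the key and do not share other column names; resolving conflicting values is a separate integration problem.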
Data Selection − in this step, data relevant to the analysis task
are retrieved from the database.
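Data selection is often just a database query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `loans` table, its columns, and the analysis task (Kolkata loans from 2023) are assumptions invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loans (customer_id INTEGER, branch TEXT, amount REAL, year INTEGER)"
)
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?, ?)",
    [
        (101, "Kolkata", 5000.0, 2022),
        (102, "Delhi", 80000.0, 2023),
        (103, "Kolkata", 12000.0, 2023),
    ],
)

# Retrieve only the attributes and rows relevant to the assumed analysis task.
rows = conn.execute(
    "SELECT customer_id, amount FROM loans WHERE branch = ? AND year = ?",
    ("Kolkata", 2023),
).fetchall()
conn.close()
```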
Data Transformation − in this step, data are transformed or consolidated into
forms appropriate for mining by performing summary or aggregation
operations.
Aggregation operators perform mathematical operations such as Average,
Aggregate, Count, Max, Min and Sum on a numeric property of the
elements in a collection.
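As an illustration of summary and aggregation operations, the sketch below groups hypothetical transaction records by customer and computes Count, Sum, Min, Max and Average for each group. The data and the `summarize` helper are assumptions for the example.

```python
from statistics import mean

# Hypothetical transaction records: (customer_id, amount).
transactions = [(101, 200.0), (101, 50.0), (102, 75.0), (101, 350.0), (102, 25.0)]

def summarize(records):
    """Group amounts by customer and compute summary aggregates per group."""
    groups = {}
    for customer_id, amount in records:
        groups.setdefault(customer_id, []).append(amount)
    return {
        customer_id: {
            "count": len(amounts),
            "sum": sum(amounts),
            "min": min(amounts),
            "max": max(amounts),
            "average": mean(amounts),
        }
        for customer_id, amounts in groups.items()
    }

summary = summarize(transactions)
```

The result has one consolidated row per customer, which is a form better suited to mining than the raw transaction list.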
Data Mining − in this step, intelligent methods are applied in order to
extract data patterns.
Intelligent methods include:
• Association
• Classification
    Decision tree
• Clustering
• Regression
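Of the methods listed, association is the simplest to sketch. The example below computes the two standard measures of an association rule, support and confidence, over a handful of hypothetical market-basket transactions. It is an illustration of the measures only, not a full rule-mining algorithm.

```python
# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Share of transactions containing the antecedent that also contain the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

s = support({"bread", "milk"}, baskets)        # 2 of 4 baskets
c = confidence({"bread"}, {"milk"}, baskets)   # 2 of the 3 bread baskets
```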
Pattern Evaluation − in this step, the extracted data patterns are
evaluated to identify those that are truly interesting.
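One common way to evaluate patterns is to filter them by interestingness measures. The sketch below assumes association rules whose support and confidence were already computed by a mining step, and keeps those with enough support and a lift above 1. The candidate rules, the thresholds, and the `evaluate` helper are all hypothetical.

```python
# Hypothetical candidate rules:
# (antecedent, consequent, support, confidence, support of the consequent alone)
candidates = [
    ("bread", "milk", 0.50, 0.67, 0.75),
    ("butter", "bread", 0.50, 1.00, 0.75),
    ("milk", "butter", 0.25, 0.33, 0.50),
]

def evaluate(rules, min_support=0.3, min_lift=1.0):
    """Keep rules that are frequent enough and whose lift exceeds min_lift."""
    kept = []
    for antecedent, consequent, supp, conf, consequent_supp in rules:
        # Lift above 1 means the antecedent raises the chance of the consequent.
        lift = conf / consequent_supp
        if supp >= min_support and lift > min_lift:
            kept.append((antecedent, consequent, round(lift, 2)))
    return kept

interesting = evaluate(candidates)
```

Here only the rule from butter to bread survives: the first rule has a lift below 1 and the third falls under the support threshold.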
Knowledge Presentation − in this step, knowledge is
presented using various visualization tools:
• Table
• Chart
• Graph
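A minimal sketch of the table and chart forms of presentation, using plain text only so that it needs no plotting library. The `render_table` and `render_bar_chart` helpers and the sample values are assumptions made for illustration.

```python
def render_table(rows, headers):
    """Format rows as a fixed-width plain-text table."""
    widths = [max(len(str(v)) for v in column) for column in zip(headers, *rows)]

    def format_row(values):
        return " | ".join(str(v).ljust(w) for v, w in zip(values, widths))

    separator = "-+-".join("-" * w for w in widths)
    return "\n".join([format_row(headers), separator] + [format_row(r) for r in rows])

def render_bar_chart(values, scale=20):
    """Draw one bar of '#' marks per label; scale is the bar length for a value of 1.0."""
    return "\n".join(
        f"{label.ljust(8)} {'#' * round(v * scale)}" for label, v in values.items()
    )

table = render_table([("butter -> bread", 1.33)], ["rule", "lift"])
chart = render_bar_chart({"bread": 0.75, "milk": 0.75, "butter": 0.5})
print(table)
print(chart)
```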
Knowledge discovery process models fall into three groups:
• Academic research models
• Industrial models
• Hybrid models
• Efforts to establish a knowledge discovery process (KDP) model were
initiated in academia.
• In the mid-1990s, when the data mining (DM) field was being shaped,
researchers started defining multistep procedures to guide users of
DM tools in the complex knowledge discovery world.
• The two process models developed in 1996 and 1998 are the
nine-step model by Fayyad et al. and the eight-step model by
Anand and Buchner.
The first five steps of the nine-step model by Fayyad et al.:
1. Developing and understanding the application domain. This step
includes learning the relevant prior knowledge and the goals of the end user of
the discovered knowledge.
2. Creating a target data set. Here the data miner selects a subset of variables
(attributes) and data points (examples) that will be used to perform discovery
tasks. This step usually includes querying the existing data to select the desired
subset.
3. Data cleaning and pre-processing. This step consists of removing outliers,
dealing with noise and missing values in the data, and accounting for time
sequence information and known changes.
4. Data reduction and projection. This step consists of finding useful
attributes by applying dimension reduction and transformation methods, and
finding an invariant representation of the data.
5. Choosing the data mining task. Here the data miner matches the goals
defined in Step 1 with a particular DM method, such as classification,
regression, clustering, etc.
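Step 3 above (removing outliers and dealing with missing values) can be sketched as follows. The sketch makes two assumptions that the slides do not specify: missing values are filled with the median, and an outlier is any point further than 3.5 median absolute deviations (MAD) from the median. The `preprocess` helper and the sample readings are hypothetical.

```python
from statistics import median

def preprocess(values, max_deviations=3.5):
    """Fill missing values with the median, then drop points far from it.

    The MAD-based outlier rule is an assumed choice, not one given in the source.
    Expects at least one non-missing value.
    """
    present = [v for v in values if v is not None]
    center = median(present)
    filled = [center if v is None else v for v in values]   # impute missing values
    mad = median(abs(v - center) for v in filled)
    if mad == 0:
        return filled                                       # no spread: nothing to remove
    return [v for v in filled if abs(v - center) / mad <= max_deviations]

# Hypothetical sensor readings with one missing value and one obvious outlier.
readings = [10.0, 12.0, None, 9.0, 11.0, 250.0, 10.0]
cleaned = preprocess(readings)
```

With these readings the missing value becomes 10.5 and the reading of 250.0 is removed, while the ordinary values are kept.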
Two representative industrial models are the five-step model by
Cabena et al., with support from IBM, and the industrial six-step
CRISP-DM model, developed by a large consortium of
European companies.
The CRISP-DM (Cross-Industry Standard Process for Data Mining)
was first established in the late 1990s by four companies: Integral
Solutions Ltd. (a provider of commercial data mining solutions),
NCR (a database provider), DaimlerChrysler (an automobile
manufacturer), and OHRA (an insurance company).
The development of academic and industrial models has led to the
development of hybrid models, i.e., models that combine aspects of both.
One such model is a six-step KDP model developed by Cios et al.
Compared with the CRISP-DM model, the main differences and extensions include
• providing a more general, research-oriented description of the steps,
• introducing a data mining step instead of the modeling step,
• introducing several new explicit feedback mechanisms (the CRISP-DM
model has only three major feedback sources, while the hybrid
model has more detailed feedback mechanisms), and
• modifying the last step, since in the hybrid model the
knowledge discovered for a particular domain may be applied in other
domains.
The first two steps of the six-step hybrid model:
1. Understanding of the problem domain. This initial step involves
working closely with domain experts to define the problem and
determine the project goals, identifying key people, and learning about
current solutions to the problem. It also involves learning
domain-specific terminology. A description of the problem, including its
restrictions, is prepared. Finally, project goals are translated into DM
goals, and the initial selection of DM tools to be used later in the process
is performed.
2. Understanding of the data. This step includes collecting sample data
and deciding which data, including format and size, will be needed.
Background knowledge can be used to guide these efforts. Data are
checked for completeness, redundancy, missing values, plausibility of
attribute values, etc. Finally, the step includes verification of the
usefulness of the data with respect to the DM goals.
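The data checks named in step 2 (completeness, redundancy, missing values, plausibility of attribute values) can be sketched as a small profiling routine. The `profile` helper, the sample records, and the plausible ranges are assumptions for illustration only.

```python
def profile(records, plausible_ranges):
    """Report missing values, duplicate records, and implausible attribute values."""
    report = {"rows": len(records), "missing": {}, "duplicates": 0, "implausible": {}}
    seen = set()
    for record in records:
        key = tuple(sorted(record.items(), key=lambda kv: kv[0]))
        if key in seen:
            report["duplicates"] += 1                      # redundancy check
        seen.add(key)
        for attribute, value in record.items():
            if value is None:                              # completeness check
                report["missing"][attribute] = report["missing"].get(attribute, 0) + 1
            elif attribute in plausible_ranges:            # plausibility check
                low, high = plausible_ranges[attribute]
                if not low <= value <= high:
                    counts = report["implausible"]
                    counts[attribute] = counts.get(attribute, 0) + 1
    return report

# Hypothetical sample: one missing age, one implausible age, one exact duplicate.
sample = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 210, "income": 51000},
    {"age": 34, "income": 52000},
]
report = profile(sample, {"age": (0, 120), "income": (0, 10_000_000)})
```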
Knowledge Discovery in Databases is the process by which a task is
identified and performed upon a database in order to extract
information about the elements of the database. This process involves
first collecting the data to be analysed, cleaning the data, and
reducing it to those features of interest to the process. At that point, the
tool or tools to be used upon the data are identified. These tools are
then used to mine the data for information. Once the information has
been created, it must be evaluated as to its efficacy for the process. Any
knowledge thereupon gained is then re-incorporated into the process, as
well as used for purposes outside the scope of the process.
This is a very complex process, but it is one that lends itself to a fair
degree of automation. As such, it enters into the field of artificial
intelligence, not just for the tools it employs, but for the fact that the
process tries to re-incorporate the knowledge it has created.
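The iterative character of the process, in which the result of one pass feeds the next, can be sketched as a loop. This is a toy illustration under stated assumptions: mining means finding frequent single items, evaluation means checking that enough patterns were found, and feedback means lowering the support threshold. The function names, the thresholds, and the data are all hypothetical.

```python
from collections import Counter

def mine(transactions, min_support):
    """Return items whose relative frequency is at least min_support."""
    counts = Counter(item for t in transactions for item in set(t))
    n = len(transactions)
    return {item: c / n for item, c in counts.items() if c / n >= min_support}

def discover(transactions, min_support=0.9, step=0.2, wanted=2, floor=0.1):
    """Repeat mining, relaxing the threshold until enough patterns are found."""
    history = []                                    # record of earlier passes
    while min_support >= floor:
        patterns = mine(transactions, min_support)
        history.append((round(min_support, 2), sorted(patterns)))
        if len(patterns) >= wanted:                 # evaluation: enough patterns found
            return patterns, history
        min_support -= step                         # feedback: adjust and mine again
    return {}, history

baskets = [["bread", "milk"], ["bread", "butter", "milk"], ["bread", "butter"], ["milk"]]
patterns, history = discover(baskets)
```

The first pass at a threshold of 0.9 finds nothing, so the loop relaxes the threshold and the second pass returns bread and milk.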
Thank you
