SlideShare a Scribd company logo
INTRODUCTION TO
DATA MINING
Samrat Devidas Tayade
TE IT
ARMIET
prerequisites
• Knowledge of databases
• Data warehousing
• Olap
Knowledge of databases
• Database : A database is an organized collection of data, generally
stored and accessed electronically from a computer system.
Where databases are more complex they are often developed
using formal design and modeling techniques.
• Database Management System (DBMS) – add, remove, update
records – retrieve data that match certain criteria – cross-reference
data in different tables – perform complex aggregate calculation •
Database consists of columns (attributes) and rows (records).
Data warehousing
Data warehousing is the process of
constructing and using a data
warehouse. A data warehouse is
constructed by integrating data from
multiple heterogeneous sources that
support analytical reporting,
structured and/or ad hoc queries, and
decision making. Data warehousing
involves data cleaning, data
integration, and data consolidations.
OLAP
Online Analytical Processing Server (OLAP) is based on the
multidimensional data model. It allows managers, and analysts to get
an insight of the information through fast, consistent, and interactive
access to information.
TYPES OF OLAP :
1. Relational OLAP (ROLAP)
2. Multidimensional OLAP (MOLAP)
3. Hybrid OLAP (HOLAP)
4. Specialized SQL Servers
Content
• What is data mining
• Kind to be mined
• Technologies used
• Major issues in data
mining
What is data mining
• The practice of examining large pre-existing databases in order to generate new
information.
• Data Mining is defined as extracting information from huge sets of data. In other
words, we can say that data mining is the procedure of mining knowledge from
data.
• The information or knowledge extracted so can be used for any of the following
applications −
1. Market Analysis
2. Fraud Detection
3. Customer Retention
4. Production Control
5. Science Exploration
Kind to be mined
• Kind of knowledge to be mined
• It refers to the kind of functions to be performed.
• These functions are −
I. Characterization
II. Discrimination
III. Association and Correlation Analysis
IV. Classification
V. Prediction
VI. Clustering
VII. Outlier Analysis
VIII.Evolution Analysis
Kind of data mined
1.Flat Files
2.Relational Databases
3.DataWarehouse
4.Transactional Databases
5.Multimedia Databases
6.Spatial Databases
7.Time Series Databases
8.World Wide Web(WWW)
Technologies used
Major issues in data mining
References
• Google
• Wikipedia
• Tutorials point
• www.lsamratl.tk

More Related Content

What's hot

Data Mining: Key definitions
Data Mining: Key definitionsData Mining: Key definitions
Data Mining: Key definitions
DataminingTools Inc
 
Data Warehouse Architectures
Data Warehouse ArchitecturesData Warehouse Architectures
Data Warehouse ArchitecturesTheju Paul
 
Data flow ii extract
Data flow   ii extractData flow   ii extract
Data flow ii extract
Dr. Dipti Patil
 
Over view of data structures
Over view of data structuresOver view of data structures
Over view of data structures
NagajothiN1
 
Data warehouse 13 data transformation
Data warehouse 13 data transformationData warehouse 13 data transformation
Data warehouse 13 data transformation
Vaibhav Khanna
 
Mis chapter5
Mis chapter5Mis chapter5
Mis chapter5Poleak
 
Tatyana Matvienko,Senior Java Developer, Big data storages
 Tatyana Matvienko,Senior Java Developer, Big data storages Tatyana Matvienko,Senior Java Developer, Big data storages
Tatyana Matvienko,Senior Java Developer, Big data storages
Alina Vilk
 
Big data storages
Big data storagesBig data storages
Big data storages
DataArt
 
Final presentation
Final presentationFinal presentation
Final presentation
Dave Nawazish Ali
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technology
DataminingTools Inc
 
Dw Concepts
Dw ConceptsDw Concepts
Dw Concepts
dataware
 
Data Management_TL III Annual Meet
Data Management_TL III Annual MeetData Management_TL III Annual Meet
Data Management_TL III Annual Meet
Tropical Legumes III
 
data warehousing and data mining
data warehousing and data mining data warehousing and data mining
data warehousing and data mining
E2MATRIX
 

What's hot (16)

Data Mining: Key definitions
Data Mining: Key definitionsData Mining: Key definitions
Data Mining: Key definitions
 
Data Warehouse Architectures
Data Warehouse ArchitecturesData Warehouse Architectures
Data Warehouse Architectures
 
Data flow ii extract
Data flow   ii extractData flow   ii extract
Data flow ii extract
 
Over view of data structures
Over view of data structuresOver view of data structures
Over view of data structures
 
Data warehouse 13 data transformation
Data warehouse 13 data transformationData warehouse 13 data transformation
Data warehouse 13 data transformation
 
Session 10 data
Session 10 dataSession 10 data
Session 10 data
 
Mis chapter5
Mis chapter5Mis chapter5
Mis chapter5
 
Mis chapter5
Mis chapter5Mis chapter5
Mis chapter5
 
Tatyana Matvienko,Senior Java Developer, Big data storages
 Tatyana Matvienko,Senior Java Developer, Big data storages Tatyana Matvienko,Senior Java Developer, Big data storages
Tatyana Matvienko,Senior Java Developer, Big data storages
 
Big data storages
Big data storagesBig data storages
Big data storages
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technology
 
Dw Concepts
Dw ConceptsDw Concepts
Dw Concepts
 
Data Management_TL III Annual Meet
Data Management_TL III Annual MeetData Management_TL III Annual Meet
Data Management_TL III Annual Meet
 
data warehousing and data mining
data warehousing and data mining data warehousing and data mining
data warehousing and data mining
 
Datawarehouse
DatawarehouseDatawarehouse
Datawarehouse
 

Similar to Introduction to data mining

Data warehouse introduction
Data warehouse introductionData warehouse introduction
Data warehouse introduction
Murli Jha
 
Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.
Vibrant Technologies & Computers
 
DATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptxDATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptx
GraceJoyMoleroCarwan
 
Unit 3 part i Data mining
Unit 3 part i Data miningUnit 3 part i Data mining
Unit 3 part i Data mining
Dhilsath Fathima
 
Data mining techniques unit 1
Data mining techniques  unit 1Data mining techniques  unit 1
Data mining techniques unit 1
malathieswaran29
 
Introduction to data mining and data warehousing
Introduction to data mining and data warehousingIntroduction to data mining and data warehousing
Introduction to data mining and data warehousing
Er. Nawaraj Bhandari
 
Data warehousing
Data warehousingData warehousing
Data warehousing
Devyani Vaidya
 
Data warehousing
Data warehousingData warehousing
Data warehousing
Devyani Vaidya
 
Data warehousing
Data warehousingData warehousing
Data warehousing
Devyani Vaidya
 
Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.ppt
RafiulHasan19
 
Online analytical processing
Online analytical processingOnline analytical processing
Online analytical processing
Samraiz Tejani
 
DW (1).ppt
DW (1).pptDW (1).ppt
DW (1).ppt
RahulSingh986955
 
ETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptx
ParnalSatle
 
3 OLAP.pptx
3 OLAP.pptx3 OLAP.pptx
3 OLAP.pptx
Priyanshu931034
 
Lecture2 (1).ppt
Lecture2 (1).pptLecture2 (1).ppt
Lecture2 (1).ppt
Minakshee Patil
 
Dw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhanDw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhan
Dr Pradhan PL Pradhan
 
OLAP OnLine Analytical Processing
OLAP OnLine Analytical ProcessingOLAP OnLine Analytical Processing
OLAP OnLine Analytical Processing
Walid Elbadawy
 
data warehousing
data warehousingdata warehousing
data warehousing
143sohil
 

Similar to Introduction to data mining (20)

Data warehouse introduction
Data warehouse introductionData warehouse introduction
Data warehouse introduction
 
Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.
 
data warehousing
data warehousingdata warehousing
data warehousing
 
DATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptxDATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptx
 
Unit 3 part i Data mining
Unit 3 part i Data miningUnit 3 part i Data mining
Unit 3 part i Data mining
 
Data mining techniques unit 1
Data mining techniques  unit 1Data mining techniques  unit 1
Data mining techniques unit 1
 
Introduction to data mining and data warehousing
Introduction to data mining and data warehousingIntroduction to data mining and data warehousing
Introduction to data mining and data warehousing
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.ppt
 
Online analytical processing
Online analytical processingOnline analytical processing
Online analytical processing
 
DW (1).ppt
DW (1).pptDW (1).ppt
DW (1).ppt
 
ETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptx
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
3 OLAP.pptx
3 OLAP.pptx3 OLAP.pptx
3 OLAP.pptx
 
Lecture2 (1).ppt
Lecture2 (1).pptLecture2 (1).ppt
Lecture2 (1).ppt
 
Dw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhanDw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhan
 
OLAP OnLine Analytical Processing
OLAP OnLine Analytical ProcessingOLAP OnLine Analytical Processing
OLAP OnLine Analytical Processing
 
data warehousing
data warehousingdata warehousing
data warehousing
 

Recently uploaded

Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
Kamal Acharya
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
AafreenAbuthahir2
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
ankuprajapati0525
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 

Recently uploaded (20)

Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 

Introduction to data mining

  • 1. INTRODUCTION TO DATA MINING Samrat Devidas Tayade TE IT ARMIET
  • 2. prerequisites • Knowledge of databases • Data warehousing • Olap
  • 3. Knowledge of databases • Database : A database is an organized collection of data, generally stored and accessed electronically from a computer system. Where databases are more complex they are often developed using formal design and modeling techniques. • Database Management System (DBMS) – add, remove, update records – retrieve data that match certain criteria – cross-reference data in different tables – perform complex aggregate calculation • Database consists of columns (attributes) and rows (records).
  • 4. Data warehousing Data warehousing is the process of constructing and using a data warehouse. A data warehouse is constructed by integrating data from multiple heterogeneous sources that support analytical reporting, structured and/or ad hoc queries, and decision making. Data warehousing involves data cleaning, data integration, and data consolidations.
  • 5. OLAP Online Analytical Processing Server (OLAP) is based on the multidimensional data model. It allows managers, and analysts to get an insight of the information through fast, consistent, and interactive access to information. TYPES OF OLAP : 1. Relational OLAP (ROLAP) 2. Multidimensional OLAP (MOLAP) 3. Hybrid OLAP (HOLAP) 4. Specialized SQL Servers
  • 6. Content • What is data mining • Kind to be mined • Technologies used • Major issues in data mining
  • 7. What is data mining • The practice of examining large pre-existing databases in order to generate new information. • Data Mining is defined as extracting information from huge sets of data. In other words, we can say that data mining is the procedure of mining knowledge from data. • The information or knowledge extracted so can be used for any of the following applications − 1. Market Analysis 2. Fraud Detection 3. Customer Retention 4. Production Control 5. Science Exploration
  • 8. Kind to be mined • Kind of knowledge to be mined • It refers to the kind of functions to be performed. • These functions are − I. Characterization II. Discrimination III. Association and Correlation Analysis IV. Classification V. Prediction VI. Clustering VII. Outlier Analysis VIII.Evolution Analysis
  • 9. Kind of data mined 1.Flat Files 2.Relational Databases 3.DataWarehouse 4.Transactional Databases 5.Multimedia Databases 6.Spatial Databases 7.Time Series Databases 8.World Wide Web(WWW)
  • 11. Major issues in data mining
  • 12. References • Google • Wikipedia • Tutorials point • www.lsamratl.tk