By
M.LAVANYA, M.Sc(cs)
NADAR SARASWATHI COLLEGE OF
ARTS & SCIENCE,THENI.
Issues in Data Integration
There are a number of issues to consider during data integration:
Schema integration
Redundancy
Detection and resolution of data value conflicts
Schema Integration:
Metadata from different sources must be integrated.
Real-world entities from multiple sources must be matched; this is referred to as the entity identification problem.
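As a rough illustration of the entity identification problem, the sketch below maps one source's attribute names onto another's and then matches records on a shared key. The record layouts, the field names (customer_id, cust_number, full_name) and the mapping table are all hypothetical; real schema integration usually relies on richer metadata than a simple rename table.

```python
# Hypothetical records from two sources; field names are illustrative only.
source_a = [
    {"customer_id": 101, "name": "Asha Rao"},
    {"customer_id": 102, "name": "Ben Lee"},
]
source_b = [
    {"cust_number": "102", "full_name": "ben lee"},
    {"cust_number": "103", "full_name": "Chen Wu"},
]

# Assumed mapping from source B's attribute names to source A's schema.
FIELD_MAP = {"cust_number": "customer_id", "full_name": "name"}


def to_common_schema(record, field_map):
    """Rename attributes and normalise types so records are comparable."""
    renamed = {field_map.get(key, key): value for key, value in record.items()}
    renamed["customer_id"] = int(renamed["customer_id"])
    renamed["name"] = renamed["name"].strip().title()
    return renamed


def match_entities(records_a, records_b, field_map):
    """Return (record_a, record_b) pairs that share the same customer_id."""
    by_id = {rec["customer_id"]: rec for rec in records_a}
    matches = []
    for rec in records_b:
        common = to_common_schema(rec, field_map)
        if common["customer_id"] in by_id:
            matches.append((by_id[common["customer_id"]], common))
    return matches


matched = match_entities(source_a, source_b, FIELD_MAP)
```

Here only customer 102 appears in both sources, so matched holds a single pair.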
Redundancy:
An attribute may be redundant if it can be derived from another attribute or set of attributes.
Inconsistencies in attribute naming can also cause redundancies in the resulting data set.
Some redundancies can be detected by correlation analysis.
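One common form of correlation analysis for numeric attributes is the Pearson correlation coefficient. The sketch below computes it with the standard library only; the attribute names and values are made up for illustration, and a coefficient near +1 or -1 is only a hint that one attribute may be derivable from the other, not proof.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical attributes: monthly_income is annual_income / 12, so it is redundant.
annual_income = [12000.0, 24000.0, 36000.0, 60000.0]
monthly_income = [1000.0, 2000.0, 3000.0, 5000.0]
age = [25, 52, 31, 40]

r_redundant = pearson(annual_income, monthly_income)  # very close to 1
r_unrelated = pearson(annual_income, age)             # much weaker
```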
Detection and resolution of data value conflicts:
This is the third important issue in data integration.
Attribute values from different sources may differ for the same real-world entity, for example because of differences in representation, scaling, or encoding.
An attribute in one system may be recorded at a lower level of abstraction than the "same" attribute in another.
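The two kinds of conflict described above can be sketched as follows. The first part assumes one source stores a weight in kilograms and the other in pounds, and flags values that still disagree after conversion. The second part assumes one source records sales per branch while the other records them per region, and rolls the branch figures up so the two can be compared. All names, values, and the tolerance are illustrative assumptions.

```python
LB_PER_KG = 2.20462  # approximate conversion factor

# Hypothetical: source A stores weight in kilograms, source B in pounds.
weight_a_kg = {"item-1": 10.0, "item-2": 5.0}
weight_b_lb = {"item-1": 22.05, "item-2": 15.0}


def find_unit_conflicts(a_kg, b_lb, tolerance_kg=0.05):
    """Return keys whose values disagree after converting B to kilograms."""
    conflicts = []
    for key in a_kg.keys() & b_lb.keys():
        b_kg = b_lb[key] / LB_PER_KG
        if abs(a_kg[key] - b_kg) > tolerance_kg:
            conflicts.append(key)
    return sorted(conflicts)


# Hypothetical: source A records sales per branch, source B per region.
branch_sales = [
    ("north", "branch-1", 120),
    ("north", "branch-2", 80),
    ("south", "branch-3", 50),
]


def roll_up_to_region(rows):
    """Aggregate branch-level rows to the coarser region level."""
    totals = {}
    for region, _branch, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals


unit_conflicts = find_unit_conflicts(weight_a_kg, weight_b_lb)
region_totals = roll_up_to_region(branch_sales)
```

In this data, item-1 agrees once the units are reconciled, while item-2 remains a genuine conflict that needs manual resolution.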
DATA PREPROCESSING IN DATA MINING
Preprocessing in data mining:
Data preprocessing is a data mining technique used to transform raw data into a useful and efficient format.
Steps involved in data preprocessing:
1. Data cleaning:
The data can have many irrelevant and missing parts. Data cleaning is done to handle them.
(a) Missing data:
This situation arises when some values are missing from the data set. It can be handled in various ways, some of which are:
1. Ignore the tuples:
This approach is suitable only when the data set is quite large and multiple values are missing within a tuple.
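A minimal sketch of ignoring tuples, assuming records are plain dictionaries and a missing value is represented by None. The threshold max_missing is a hypothetical parameter: tuples with more missing values than this are dropped, the rest are kept.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "age": 34, "city": "Theni"},
    {"id": 2, "age": None, "city": None},       # two values missing
    {"id": 3, "age": None, "city": "Madurai"},  # one value missing
    {"id": 4, "age": 29, "city": "Chennai"},
]


def drop_sparse_tuples(records, max_missing=1):
    """Keep only records with at most max_missing missing values."""
    kept = []
    for rec in records:
        missing = sum(1 for value in rec.values() if value is None)
        if missing <= max_missing:
            kept.append(rec)
    return kept


cleaned = drop_sparse_tuples(rows)
kept_ids = [rec["id"] for rec in cleaned]
```

Dropping tuples discards the values that were present in them as well, which is why the approach is best reserved for large data sets.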
2. Fill the missing values:
There are various ways to do this. You can choose to fill in the missing values manually.
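The slide names only manual filling. As an illustration of an automatic alternative, the sketch below replaces missing numeric values with the mean of the observed ones; this is an assumed technique chosen for the example, not one stated in the source, and None is assumed to mark a missing value.

```python
from statistics import mean


def fill_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot compute a mean: every value is missing")
    fill_value = mean(observed)
    return [fill_value if v is None else v for v in values]


# Hypothetical attribute with two missing entries.
ages = [20, None, 30, None, 40]
filled_ages = fill_with_mean(ages)
```

Mean filling keeps every tuple but reduces the variance of the attribute, so it is worth checking whether that matters for the analysis that follows.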
(b) Noisy data:
Noisy data is meaningless data that cannot be interpreted by machines. It can be generated by faulty data collection, data entry errors, etc. It can be handled in the following ways:
