SlideShare a Scribd company logo
1 of 3
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2222
Missing Data Imputation by Evidence Chain
Anaswara R1, Sruthy S2
1M. Tech. Student, Computer Science and Engineering, Sree Buddha College of Engineering, Kerala, India.
2Assistant Professor, Computer Science and Engineering, Sree Buddha College of Engineering, Kerala, India.
----------------------------------------------------------------------***---------------------------------------------------------------------
Abstract –Missing data in a dataset decrease the accuracy
of data analysis. Missing value occurs when no data value is
stored for a variable in an observation. Adataminingproblem
that adversely affects data analysis and decision making
processes. Missing data handling is an important step in data
pre-processing. Imputation technique is used to fill missing
data. Imputation is a technique of replacing missingdata with
substituted values. This will increase the accuracy and
efficiency of the dataset. Here use a new method an evidence
chain for missing data imputation.
Key Words: Imputation, MapReduce, Attribute
combination, Possible value, Marked dataset, Data pre-
processing.
1. INTRODUCTION
Missing data handling is an important step in data pre-
processing. Data pre-processing is an indispensable step
before data mining. Data preprocessing includesthemissing
data imputation, entity identification, and outlier detection.
Missing data padding is an important problem to be solved
in data preprocessing. Because of lack of information, the
omissionofinformationand man-madeoperation,some data
are missing in the dataset. These incomplete data will affect
the quality of data mining and evenleadtotheestablishment
of the wrong data mining model, making the data mining
results deviate from the actual data. The preprocessing
phase helps users to understand the data and to make
appropriate decisions to create the mining models and to
make appropriate decisions. Thisphasewill identifythedata
inconsistencies like missing data and fix the missing or
wrong data using appropriate techniques. This will increase
the accuracy and integrity of data mining.
Missing data is a common problem that and is the current
focus of this research. This is a rapidly growing area. There
have been many techniques for finding missing data and
solving missing data problems [1]. If a dataset that contain a
small number of missing data, then the simplest method to
handle missing data is “Deletion technique”. In deletion
techniques eliminate the attributes or cases. This isa default
method for handling missing data. This methodisnotgoodif
the data set contains a large number of missing data. In this
case use imputation techniques for handling missing data.
Imputation technique is a process of replacing missing data
with substituted values. If an important data is missing for a
particular instance, then it can be estimated from the data
that are present by using these imputations.
There are many studies on imputation methods, including
the use of neural network methods, using regression, and
Bayesian network. A new imputationtechniqueisusedhere.
This method identifies the missing value and imputes the
value to the missing place.
The new method used here use evidence chain for missing
data imputation and this will combine with the MapReduce
programming model to handle large amount of data [7].
Here the new imputationmethod isimplementedbycreating
an application. That generates numerical data automatically
and in between that generates missing data fields. This is
used as the dataset for missing data imputation.
2. RELATED WORKS
There is various imputation techniques are used in data
mining. The selection of imputation technique may be
depending on datasets or may be related tothemechanisms.
Some of the methods used in differentareasaregiven below:
Xiaofeng Zhu et al. [2] suggest a method to handle missing
value estimation in mixed-attribute datasets. Various
techniques have been used to dealing missing value in a
dataset with homogeneous attributes. A mixture-kernel
based iterative estimator is advocated for missing value
imputation.Mixture-kernel-basediterativeestimatorutilizes
all the available observed information, including observed
information in incomplete instances. Author propose a new
algorithm which has better classification accuracy and the
convergence speed of the algorithm than extant methods,
such as the nonparametric imputation method with a single
kernel, frequency estimator (FE), and the nonparametric
method for continuous attributes. The imputed values are
used to impute subsequent missing values innewalgorithm.
Bankat M. Patil et al. [3] provide a novel approach for
estimation of missing data method using cluster based k-
mean weighted distance algorithm(CMIWD).Hereproposed
an efficient missing value imputation method which on
clustering with weighted distance. Based on the user
specified value K, divide the data set into clusters. Then find
a complete valued neighbour that is nearest to the missing
valued instance. Then compute missing value by taking the
average of the centroid value, and the centroidal distance of
the neighbour and this is used as impute value.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2223
Ingunn Myrtveit et al. [4] provide a Full Information
Maximum Likelihood (FIML) method. The FIML is a model-
based method. FIML method can analyze incomplete data
sets directly. This is an incomplete case analysis method.
Thirukumaran. S et al. [5] suggest a method to improving
accuracy rate of imputation of missing data using classifier
methods. Mean method by step Digression (MMSD) impute
missing data containing varying amount of missing values.
MMSD is alternate to mean method. The MMSD overcomes
the limitation of mean method.
Dan Li et al. [6] provide a Fuzzy K-means Clustering method
for missing data imputation. Author present a missing data
imputation method based on one of the most popular
techniques in Knowledge Discovery in Databases (KDD),i.e.,
clustering technique.
Yongsong Qin et al. [8] suggest a semi-parametric
optimization for missing data imputation. This method
overcomes some limitations in linear models and non-
parametric models by making an optimal inference.
The Prereplacing method replaces the missingvaluesbefore
the data mining process [9]. It works in the pre-process
phase. The Embedded method is able to handle missing
values during the data mining process. The missing value is
imputed in the same time while creating the model. There
are many imputations with different features to impute the
missing data for these methods. The pre-replacing method
contains Mean-and Mode, Linear Regression, K-Nearest
Neighbor (KNN), Expectation Maximization (EM), Hot-Deck
(HD), and Autoassociative Neural Network techniques. The
embedded method contains the Casewise deletion, Lazy
decision tree, Dynamic path generation, C5.0, and Surrogate
split techniques.
3. METHODOLOGY
A new method is used for missing data imputation. The new
imputation method is implemented by creating an
application. That generates numerical data automatically
and in between that generates missing data fields. This is
used as a dataset for missing data imputation. This first
identifying the presence of missing data and replaces it with
possible value. Here missing value imputation is performed
with the help of a combination of combination attribute and
possible value. In this method first estimate the missing
value of the imputation value, the algorithm first scans the
each data tuple in the entire dataset. Then marking tuple
with missing value as “-1” as incomplete data tuples, and
combine different associated attributevaluesofmissingdata
in the incomplete data tuple as the evidence of the estimated
missing value. So that many of combination of the relevant
attributes for missing data in theincompletetupleconstitute
the estimated missing data value of the chain of evidence.
The repeated evidence with maximum possible value is
taken as the value for missing value [7].
The imputation process is done through the following steps:
Step 1: First scan each data tuple in the entire dataset to
identify the position of missing value.Thenmark the missing
value as “-1”. This dataset is called marked dataset. This
dataset is used for further operations.
Step 2: Scan each tuple in the dataset and combinethe entire
associated attribute with the missing value. This will serve
as an evidence for estimating the missing data values. The
result will be used as the chain of evidence to estimate the
missing data. In this stage use a map function calculate the
associated attribute value combinations of the missing data
in the incomplete data tuple combi_attri.
The Reduce function gets a set of associated attribute value
combinations.
Step 3: Find all the possible values in the dataset. The
possible values are takenfromthecorrespondingcolumnsof
missing data. Then calculatetheprobabilityvalueofpossible
values.
Step 4: Check the missing data’s attribute combination with
the combi_attri table if corresponding combination is
present in the database then take its corresponding possible
value. If there more than one possible value for same
combi_attri then takesthecorrespondingpossiblevaluewith
maximum probability. If there is no repeated attribute
combination, then use the possible value with the highest
possible value for missing value imputation.
Stage 5: The value of the missing value estimate of step 4 is
filled into the original missing dataset.
Modules present in this project are:
1) Data capturing: This module includes a control panel
that can activate/deactivate the sensors periodically.
The sequence of sensor operations causes missing data.
The captured data undergo a MapReduce algorithm for
handling efficient storage and quick retrieval.
2) Dataset management: The captured data is used as the
dataset for missing data imputation.
3) Missing data imputation: In this module apply above
method to impute missing data.
4. CONCLUSION
The processing of missing data hasbecomea veryimportant
step in the process of data preprocessing. The limitations of
data acquisition or improper operation of the data lead to
data errors, incompleteresults, andinconsistencies.This will
reduce accuracy of data mining. To overcome these
limitations missing value imputation methods are used.
Evidence chain is used to predict the value of missing data.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2224
Estimation of missing value by using the set of combinations
of missing attributes values. The Map-Reduce framework is
used to process large data.
REFERENCES
[1] T. Aljuaid and S. Sasi, “Proper imputation techniquesfor
missing values in data sets,”2016 International
Conference on Data Science and Engineering (ICDSE),
Cochin, 2016, pp. 1-5. doi:
10.1109/ICDSE.2016.7823957.
[2] X. Zhu, S. Zhang, Z. Jin, Z. Zhang, and Z. Xu, “Missingvalue
estimation for mixed- attribute datasets,” Knowledge
and Data Engineering, IEEE Transactions on, vol. 23, pp.
110-121, 2011.
[3] B. M. Patil, R. C. Joshi, and D. Toshniwal, “Missing Value
Imputation Based on K-Mean Clustering with Weighted
Distance”, Communications in Computer and
Information Science, 1, Contemporary Computing, Part
11. 94, pp. 600-609.
[4] Ingunn Myrtveit, Erik Stensrud, and Ulf H. Olsson,
“Analyzing Data Sets with Missing Data: An Empirical
Evaluation ofImputationMethodsandLikelihood-Based
Methods”, IEEE Transactions On Software Engineering,
Vol. 27, No. 11, November 2001.
[5] S. Thirukumaran and A. Sumathi, “Improving accuracy
rate of imputation of missing data using classifier
methods”, 10th International Conference on Intelligent
Systems and Control (ISCO), Coimbatore, India, pp. 1-7,
2016.
[6] Li, D., Deogun, J., Spaulding, W., Shuart, B.: Towards
“Missing Data Imputation: A Study of Fuzzy K-means
Clustering Method”. In: Tsumoto, S., Słowiński, R.,
Komorowski, J., Grzymała-Busse, J.W. (eds.) RSCTC
2004. LNCS (LNAI), vol. 3066, pp. 573–579.
[7] X. Xu, W. Chong, S. Li, A. Arabo and J. Xiao, “MIAEC:
Missing Data Imputation Based on the Evidence Chain,”
in IEEE Access, vol. 6, pp. 12983-12992, 2018.
doi: 10.1109/ACCESS.2018.2803755.
[8] Y. Qin, S. Zhang, X. Zhu, J. Zhang, and C. Zhang, “Semi-
parametric optimization for missing data imputation,”
Applied Intelligence, vol. 27, pp. 79-88, 2007.
[9] T. Aljuaid and S. Sasi, “Proper imputation techniquesfor
missing values in data sets,” 2016 International
Conference on Data Science and Engineering (ICDSE),
Cochin, 2016, pp. 1-5.
doi: 10.1109/ICDSE.2016.7823957.
BIOGRAPHIES
Anaswara R, she is currently pursing Master’s Degree in
Computer Science and Engineering from Sree Buddha
college of Engineering, Elavumthitta, India. Her research
area of interest includes the field Data mining.
Sruthy. S, she is an Assistant ProfessorintheDepartment of
Computer Science and Engineering, Sree Buddha College of
Engineering. Her main area of interest is Computer Vision
and Image Processing and Data mining.

More Related Content

What's hot

Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...ertekg
 
Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...eSAT Publishing House
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYEditor IJMTER
 
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...IJDKP
 
A statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environmentA statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environmentIJDKP
 
A REVIEW ON PREDICTIVE ANALYTICS IN DATA MINING
A REVIEW ON PREDICTIVE ANALYTICS IN DATA MININGA REVIEW ON PREDICTIVE ANALYTICS IN DATA MINING
A REVIEW ON PREDICTIVE ANALYTICS IN DATA MININGijccmsjournal
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetIRJET Journal
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceIJDKP
 
IRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET Journal
 
The pertinent single-attribute-based classifier for small datasets classific...
The pertinent single-attribute-based classifier  for small datasets classific...The pertinent single-attribute-based classifier  for small datasets classific...
The pertinent single-attribute-based classifier for small datasets classific...IJECEIAES
 
Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...eSAT Journals
 
Hypothesis on Different Data Mining Algorithms
Hypothesis on Different Data Mining AlgorithmsHypothesis on Different Data Mining Algorithms
Hypothesis on Different Data Mining AlgorithmsIJERA Editor
 
IRJET - Survey on Clustering based Categorical Data Protection
IRJET - Survey on Clustering based Categorical Data ProtectionIRJET - Survey on Clustering based Categorical Data Protection
IRJET - Survey on Clustering based Categorical Data ProtectionIRJET Journal
 
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...RSIS International
 

What's hot (15)

Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
 
Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
 
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
 
A statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environmentA statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environment
 
A REVIEW ON PREDICTIVE ANALYTICS IN DATA MINING
A REVIEW ON PREDICTIVE ANALYTICS IN DATA MININGA REVIEW ON PREDICTIVE ANALYTICS IN DATA MINING
A REVIEW ON PREDICTIVE ANALYTICS IN DATA MINING
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease Dataset
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduce
 
IRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining Techniques
 
M033059064
M033059064M033059064
M033059064
 
The pertinent single-attribute-based classifier for small datasets classific...
The pertinent single-attribute-based classifier  for small datasets classific...The pertinent single-attribute-based classifier  for small datasets classific...
The pertinent single-attribute-based classifier for small datasets classific...
 
Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...
 
Hypothesis on Different Data Mining Algorithms
Hypothesis on Different Data Mining AlgorithmsHypothesis on Different Data Mining Algorithms
Hypothesis on Different Data Mining Algorithms
 
IRJET - Survey on Clustering based Categorical Data Protection
IRJET - Survey on Clustering based Categorical Data ProtectionIRJET - Survey on Clustering based Categorical Data Protection
IRJET - Survey on Clustering based Categorical Data Protection
 
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
 

Similar to IRJET- Missing Data Imputation by Evidence Chain

Machine Learning Approaches and its Challenges
Machine Learning Approaches and its ChallengesMachine Learning Approaches and its Challenges
Machine Learning Approaches and its Challengesijcnes
 
Survey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy AlgorithmsSurvey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy AlgorithmsIRJET Journal
 
Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computingijtsrd
 
IRJET- Error Reduction in Data Prediction using Least Square Regression Method
IRJET- Error Reduction in Data Prediction using Least Square Regression MethodIRJET- Error Reduction in Data Prediction using Least Square Regression Method
IRJET- Error Reduction in Data Prediction using Least Square Regression MethodIRJET Journal
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...IJDKP
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET Journal
 
IRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current ApproachesIRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current ApproachesIRJET Journal
 
A Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmA Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmIRJET Journal
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionIRJET Journal
 
Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...nalini manogaran
 
IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...
IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...
IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...IRJET Journal
 
Detection of Outliers in Large Dataset using Distributed Approach
Detection of Outliers in Large Dataset using Distributed ApproachDetection of Outliers in Large Dataset using Distributed Approach
Detection of Outliers in Large Dataset using Distributed ApproachEditor IJMTER
 
Analysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOTAnalysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOTIJERA Editor
 
UNIT 2: Part 2: Data Warehousing and Data Mining
UNIT 2: Part 2: Data Warehousing and Data MiningUNIT 2: Part 2: Data Warehousing and Data Mining
UNIT 2: Part 2: Data Warehousing and Data MiningNandakumar P
 
Comparison of Data Mining Techniques used in Anomaly Based IDS
Comparison of Data Mining Techniques used in Anomaly Based IDS  Comparison of Data Mining Techniques used in Anomaly Based IDS
Comparison of Data Mining Techniques used in Anomaly Based IDS IRJET Journal
 
AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONIRJET Journal
 
IRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET Journal
 
IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...
IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...
IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...IRJET Journal
 

Similar to IRJET- Missing Data Imputation by Evidence Chain (20)

Machine Learning Approaches and its Challenges
Machine Learning Approaches and its ChallengesMachine Learning Approaches and its Challenges
Machine Learning Approaches and its Challenges
 
Survey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy AlgorithmsSurvey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy Algorithms
 
Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computing
 
IRJET- Error Reduction in Data Prediction using Least Square Regression Method
IRJET- Error Reduction in Data Prediction using Least Square Regression MethodIRJET- Error Reduction in Data Prediction using Least Square Regression Method
IRJET- Error Reduction in Data Prediction using Least Square Regression Method
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
 
IRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current ApproachesIRJET- A Review of Data Cleaning and its Current Approaches
IRJET- A Review of Data Cleaning and its Current Approaches
 
A Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmA Firefly based improved clustering algorithm
A Firefly based improved clustering algorithm
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & Prediction
 
Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...
 
IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...
IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...
IRJET- Hybrid Architecture of Heart Disease Prediction System using Genetic N...
 
Detection of Outliers in Large Dataset using Distributed Approach
Detection of Outliers in Large Dataset using Distributed ApproachDetection of Outliers in Large Dataset using Distributed Approach
Detection of Outliers in Large Dataset using Distributed Approach
 
Analysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOTAnalysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOT
 
UNIT 2: Part 2: Data Warehousing and Data Mining
UNIT 2: Part 2: Data Warehousing and Data MiningUNIT 2: Part 2: Data Warehousing and Data Mining
UNIT 2: Part 2: Data Warehousing and Data Mining
 
B0930610
B0930610B0930610
B0930610
 
Comparison of Data Mining Techniques used in Anomaly Based IDS
Comparison of Data Mining Techniques used in Anomaly Based IDS  Comparison of Data Mining Techniques used in Anomaly Based IDS
Comparison of Data Mining Techniques used in Anomaly Based IDS
 
AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTION
 
IRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data Mining
 
A1802050102
A1802050102A1802050102
A1802050102
 
IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...
IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...
IRJET- Predicting Customers Churn in Telecom Industry using Centroid Oversamp...
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAbhinavSharma374939
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 

IRJET- Missing Data Imputation by Evidence Chain

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2222 Missing Data Imputation by Evidence Chain Anaswara R1, Sruthy S2 1M. Tech. Student, Computer Science and Engineering, Sree Buddha College of Engineering, Kerala, India. 2Assistant Professor, Computer Science and Engineering, Sree Buddha College of Engineering, Kerala, India. ----------------------------------------------------------------------***--------------------------------------------------------------------- Abstract –Missing data in a dataset decrease the accuracy of data analysis. Missing value occurs when no data value is stored for a variable in an observation. Adataminingproblem that adversely affects data analysis and decision making processes. Missing data handling is an important step in data pre-processing. Imputation technique is used to fill missing data. Imputation is a technique of replacing missingdata with substituted values. This will increase the accuracy and efficiency of the dataset. Here use a new method an evidence chain for missing data imputation. Key Words: Imputation, MapReduce, Attribute combination, Possible value, Marked dataset, Data pre- processing. 1. INTRODUCTION Missing data handling is an important step in data pre- processing. Data pre-processing is an indispensable step before data mining. Data preprocessing includesthemissing data imputation, entity identification, and outlier detection. Missing data padding is an important problem to be solved in data preprocessing. Because of lack of information, the omissionofinformationand man-madeoperation,some data are missing in the dataset. These incomplete data will affect the quality of data mining and evenleadtotheestablishment of the wrong data mining model, making the data mining results deviate from the actual data. The preprocessing phase helps users to understand the data and to make appropriate decisions to create the mining models and to make appropriate decisions. Thisphasewill identifythedata inconsistencies like missing data and fix the missing or wrong data using appropriate techniques. This will increase the accuracy and integrity of data mining. Missing data is a common problem that and is the current focus of this research. This is a rapidly growing area. There have been many techniques for finding missing data and solving missing data problems [1]. If a dataset that contain a small number of missing data, then the simplest method to handle missing data is “Deletion technique”. In deletion techniques eliminate the attributes or cases. This isa default method for handling missing data. This methodisnotgoodif the data set contains a large number of missing data. In this case use imputation techniques for handling missing data. Imputation technique is a process of replacing missing data with substituted values. If an important data is missing for a particular instance, then it can be estimated from the data that are present by using these imputations. There are many studies on imputation methods, including the use of neural network methods, using regression, and Bayesian network. A new imputationtechniqueisusedhere. This method identifies the missing value and imputes the value to the missing place. The new method used here use evidence chain for missing data imputation and this will combine with the MapReduce programming model to handle large amount of data [7]. Here the new imputationmethod isimplementedbycreating an application. That generates numerical data automatically and in between that generates missing data fields. This is used as the dataset for missing data imputation. 2. RELATED WORKS There is various imputation techniques are used in data mining. The selection of imputation technique may be depending on datasets or may be related tothemechanisms. Some of the methods used in differentareasaregiven below: Xiaofeng Zhu et al. [2] suggest a method to handle missing value estimation in mixed-attribute datasets. Various techniques have been used to dealing missing value in a dataset with homogeneous attributes. A mixture-kernel based iterative estimator is advocated for missing value imputation.Mixture-kernel-basediterativeestimatorutilizes all the available observed information, including observed information in incomplete instances. Author propose a new algorithm which has better classification accuracy and the convergence speed of the algorithm than extant methods, such as the nonparametric imputation method with a single kernel, frequency estimator (FE), and the nonparametric method for continuous attributes. The imputed values are used to impute subsequent missing values innewalgorithm. Bankat M. Patil et al. [3] provide a novel approach for estimation of missing data method using cluster based k- mean weighted distance algorithm(CMIWD).Hereproposed an efficient missing value imputation method which on clustering with weighted distance. Based on the user specified value K, divide the data set into clusters. Then find a complete valued neighbour that is nearest to the missing valued instance. Then compute missing value by taking the average of the centroid value, and the centroidal distance of the neighbour and this is used as impute value.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2223 Ingunn Myrtveit et al. [4] provide a Full Information Maximum Likelihood (FIML) method. The FIML is a model- based method. FIML method can analyze incomplete data sets directly. This is an incomplete case analysis method. Thirukumaran. S et al. [5] suggest a method to improving accuracy rate of imputation of missing data using classifier methods. Mean method by step Digression (MMSD) impute missing data containing varying amount of missing values. MMSD is alternate to mean method. The MMSD overcomes the limitation of mean method. Dan Li et al. [6] provide a Fuzzy K-means Clustering method for missing data imputation. Author present a missing data imputation method based on one of the most popular techniques in Knowledge Discovery in Databases (KDD),i.e., clustering technique. Yongsong Qin et al. [8] suggest a semi-parametric optimization for missing data imputation. This method overcomes some limitations in linear models and non- parametric models by making an optimal inference. The Prereplacing method replaces the missingvaluesbefore the data mining process [9]. It works in the pre-process phase. The Embedded method is able to handle missing values during the data mining process. The missing value is imputed in the same time while creating the model. There are many imputations with different features to impute the missing data for these methods. The pre-replacing method contains Mean-and Mode, Linear Regression, K-Nearest Neighbor (KNN), Expectation Maximization (EM), Hot-Deck (HD), and Autoassociative Neural Network techniques. The embedded method contains the Casewise deletion, Lazy decision tree, Dynamic path generation, C5.0, and Surrogate split techniques. 3. METHODOLOGY A new method is used for missing data imputation. The new imputation method is implemented by creating an application. That generates numerical data automatically and in between that generates missing data fields. This is used as a dataset for missing data imputation. This first identifying the presence of missing data and replaces it with possible value. Here missing value imputation is performed with the help of a combination of combination attribute and possible value. In this method first estimate the missing value of the imputation value, the algorithm first scans the each data tuple in the entire dataset. Then marking tuple with missing value as “-1” as incomplete data tuples, and combine different associated attributevaluesofmissingdata in the incomplete data tuple as the evidence of the estimated missing value. So that many of combination of the relevant attributes for missing data in theincompletetupleconstitute the estimated missing data value of the chain of evidence. The repeated evidence with maximum possible value is taken as the value for missing value [7]. The imputation process is done through the following steps: Step 1: First scan each data tuple in the entire dataset to identify the position of missing value.Thenmark the missing value as “-1”. This dataset is called marked dataset. This dataset is used for further operations. Step 2: Scan each tuple in the dataset and combinethe entire associated attribute with the missing value. This will serve as an evidence for estimating the missing data values. The result will be used as the chain of evidence to estimate the missing data. In this stage use a map function calculate the associated attribute value combinations of the missing data in the incomplete data tuple combi_attri. The Reduce function gets a set of associated attribute value combinations. Step 3: Find all the possible values in the dataset. The possible values are takenfromthecorrespondingcolumnsof missing data. Then calculatetheprobabilityvalueofpossible values. Step 4: Check the missing data’s attribute combination with the combi_attri table if corresponding combination is present in the database then take its corresponding possible value. If there more than one possible value for same combi_attri then takesthecorrespondingpossiblevaluewith maximum probability. If there is no repeated attribute combination, then use the possible value with the highest possible value for missing value imputation. Stage 5: The value of the missing value estimate of step 4 is filled into the original missing dataset. Modules present in this project are: 1) Data capturing: This module includes a control panel that can activate/deactivate the sensors periodically. The sequence of sensor operations causes missing data. The captured data undergo a MapReduce algorithm for handling efficient storage and quick retrieval. 2) Dataset management: The captured data is used as the dataset for missing data imputation. 3) Missing data imputation: In this module apply above method to impute missing data. 4. CONCLUSION The processing of missing data hasbecomea veryimportant step in the process of data preprocessing. The limitations of data acquisition or improper operation of the data lead to data errors, incompleteresults, andinconsistencies.This will reduce accuracy of data mining. To overcome these limitations missing value imputation methods are used. Evidence chain is used to predict the value of missing data.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2224 Estimation of missing value by using the set of combinations of missing attributes values. The Map-Reduce framework is used to process large data. REFERENCES [1] T. Aljuaid and S. Sasi, “Proper imputation techniquesfor missing values in data sets,”2016 International Conference on Data Science and Engineering (ICDSE), Cochin, 2016, pp. 1-5. doi: 10.1109/ICDSE.2016.7823957. [2] X. Zhu, S. Zhang, Z. Jin, Z. Zhang, and Z. Xu, “Missingvalue estimation for mixed- attribute datasets,” Knowledge and Data Engineering, IEEE Transactions on, vol. 23, pp. 110-121, 2011. [3] B. M. Patil, R. C. Joshi, and D. Toshniwal, “Missing Value Imputation Based on K-Mean Clustering with Weighted Distance”, Communications in Computer and Information Science, 1, Contemporary Computing, Part 11. 94, pp. 600-609. [4] Ingunn Myrtveit, Erik Stensrud, and Ulf H. Olsson, “Analyzing Data Sets with Missing Data: An Empirical Evaluation ofImputationMethodsandLikelihood-Based Methods”, IEEE Transactions On Software Engineering, Vol. 27, No. 11, November 2001. [5] S. Thirukumaran and A. Sumathi, “Improving accuracy rate of imputation of missing data using classifier methods”, 10th International Conference on Intelligent Systems and Control (ISCO), Coimbatore, India, pp. 1-7, 2016. [6] Li, D., Deogun, J., Spaulding, W., Shuart, B.: Towards “Missing Data Imputation: A Study of Fuzzy K-means Clustering Method”. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds.) RSCTC 2004. LNCS (LNAI), vol. 3066, pp. 573–579. [7] X. Xu, W. Chong, S. Li, A. Arabo and J. Xiao, “MIAEC: Missing Data Imputation Based on the Evidence Chain,” in IEEE Access, vol. 6, pp. 12983-12992, 2018. doi: 10.1109/ACCESS.2018.2803755. [8] Y. Qin, S. Zhang, X. Zhu, J. Zhang, and C. Zhang, “Semi- parametric optimization for missing data imputation,” Applied Intelligence, vol. 27, pp. 79-88, 2007. [9] T. Aljuaid and S. Sasi, “Proper imputation techniquesfor missing values in data sets,” 2016 International Conference on Data Science and Engineering (ICDSE), Cochin, 2016, pp. 1-5. doi: 10.1109/ICDSE.2016.7823957. BIOGRAPHIES Anaswara R, she is currently pursing Master’s Degree in Computer Science and Engineering from Sree Buddha college of Engineering, Elavumthitta, India. Her research area of interest includes the field Data mining. Sruthy. S, she is an Assistant ProfessorintheDepartment of Computer Science and Engineering, Sree Buddha College of Engineering. Her main area of interest is Computer Vision and Image Processing and Data mining.