SlideShare a Scribd company logo
1 of 4
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 948
An User Friendly Interface for Data Preprocessing and Visualization
using Machine Learning Models
Mr. S. Yoganand1, Bharathi Kannan R2, Daya Meenakshi B2
1Assistant Professor, Department of Computer Science and Engineering, Agni College of Technology Chennai-130,
Tamil Nadu, India.
2,3UG Student, Department of Computer Science and Engineering,Agni College of Technology Chennai-130,
Tamil Nadu, India.
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – Machine learning is one of the most efficient
techniques for prediction and classification related problems.
In this modern era, most of the industries all over the world
depend upon the machine learning models which leadintothe
data analytics century. There is no properandefficienttool for
handling the datasets which use machine learning models for
data prediction and Visualization. So, in this paper a novel
idea is proposed for making the user-friendly approach to
handle the machine learning models for data prediction and
visualisation. A tool is developed, such that it performs data
cleaning which will be a prerequisite for data analysis and
then provides a visible representation of the cleansed data.
The developed tool will take the input as structured dataset
that contains both textual and numerical data which are then
processed using machine learning algorithms to obtain a pre-
processed dataset. This process may undergo series of steps to
produce visualized and predicted data as per the chosen
effective algorithm to obtain efficient result.
Key Words: Machine learning, visualization, pre-
processing, Tool, user interface.
1. INTRODUCTION
An organisation uses the dataset for predictive
analysis and an important concern in these cases is data
quality. Using noisy data can hamper with the correctness of
analysis. The common errors are missing values, duplicates
and other errors. These errors need to be corrected for
reliable decisions and analytics. The users must know that
the effects of using the noisy data before proceeding with the
cleaning process. Noise removal will improve the model
performance, due to the fact that noises may disturb the
discovery of important information.
Machine learning is the appreciated application of
Artificial Intelligence. It is used to learn automatically
without any human assistancethatprovideshugedataset for
analysing with a large number of data fields. With the data
provided by the system after implementing the machine
learning algorithms, organizations are able to work more
effective and acquire profit over their competitors. The
system that uses machine learning technique will be able to
predict how the structure looks like and adjust the data
according to their structure. The mainchallengesinmachine
learning model is to deal with large data sources for data
cleaning process. Data cleaning process is carried by taking
in huge datasets which are checkedforthepossible errorsby
using data pre-processing techniques. The other challenges
include avoiding learning process from noisy data, avoiding
building a prejudiced model, not giving reasons for
compromising with the qualityofthedata.The bestpractices
for data cleaning using machine learning techniquesthatare
filling missing values, removing unnecessary rows,reducing
the size of the data and implementing a good quality plan.
The success of machine learning applications
depends on the amount of good quality data that is given to
it. But this process of cleaning may not be considered as a
main area in data pre-processing. The system that uses
powerful algorithms to process the noisy data can yield bad
results if irrelevant or wrong training set of data is given. In
the proposed model ML algorithms to find out the different
patterns in the data and group it by itself into clean and
noisy data which will help in reducing execution time.
2. Related Work:
Data Pre-processing is used to convert the raw data
into pre-processed data set. [1] In Machine Learning, the
data pre-processing is used to transform or encode the data
easily by their algorithm. It consists of interactive steps as
follows. Data cleaning is used to detect and correct
inaccurate records from a record or tables, and then
replacing, modifying or deleting this noisy data. Data
integration will combines the data residing indifferent
sources that provides user with a unified view of these data
[2]. The process of selecting suitable data for a research
project will impact data integritywhereData transformation
converts data from a source data format into resultant data
[3].
The tools which are available to process the data in
data processing and visualizing are Knime, Shogun, Oryx 2,
Tensor flow, Weka, RapidMiner, Trifacta Wrangler, Python
[12] [13]. In this paper, we will focus on removing the noisy
data that identifies the numerical values, predicting and
filling in missing values and detect outliers which hamper
with data analysis [11]. We propose a system that simplifies
the process for the user and allows for better processing. In
summary, Machine learning for data cleaning might be the
only way to provide complete and trustworthy data sets for
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 949
effective analytics, so we provide an user friendly interface
for pre-processing and model analysis with visualizationfor
the ease of user.
3. System Design:
The Data Pre-processingisdone withthreemethods
they are Data Cleaning, Data Transformation and Data
Reduction. The data cleaning application is to process the
raw dataset containing both textual and numerical data that
convert it into a cleaned dataset which can be used for data
analysis. Initially, users must upload the dataset in which
they perform the analysis. They can choose the operations
that they want to perform on their dataset from themodules
provided. This application performs a series of operations
which includes removing columns with less information or
no information, removing unnecessary rows, identifyingthe
numerical values, filling in the missing fields and identifying
the outliers. Some columns may contain less information or
no information that makes it hard to rely on such columns
for analysis and so such columns can be removed and they
don’t cause significant damage to the data.
Some rows may contain empty fields which will
again tamper with the proper pre-processing of the dataset.
Hence such values are identified and removed. The dataset
will contain categorical features ranging from numerical to
non-numerical values. This application requires only
numerical data which is used for analysis and prediction,
such that the fields containing numeric values areidentified.
If you try to remove them, you might reduce the amount of
data that is available. So, these fields need to be filled in
appropriate values.
4. Implementation:
The outliers with data points are really far from the
rest of your data points. Mathematically, an outlierisusually
defined as an observation over three standard deviations
from the mean. They can show up due to errors in data entry
or measurement, or just because there's a variation in the
population. Identifying and handlingoutliersisanimportant
part of data cleaning.
In Data Analysis we are using the subsequent
algorithms to analyse the cleansed data. Linear regression,
SVM (Support Vector Machine), KNN (K-Nearest
Neighbours), Logistic Regression, Decision Tree, K-Means,
Random Forest, Naive Bayes, Dimensional Reduction
Algorithms, Gradient Boosting Algorithms.
Linear Regression algorithm will use the
info points to seek out the simplest fit line to model the
info. A line can be represented by the equation, y = m*x +
c where y is the dependent variable and x is the
independent variable. Basic calculus theories are
applied to seek out the values for m and c using the given
data set. The SVM will separate the data points using a line.
The KNN will predict unknown data point with its k nearest
neighbours. The value of k is a critical factor regarding the
accuracy of prediction. It determines the nearest distance
using basic distance functions like Euclidean. Thisalgorithm
has to be a high computation power and that we have to
normalize the information initially to bring every datum
within the same range. The Decision Tree algorithm is used
to solve classification problems.Sometechniquesareusedto
categorize the data they are Gini, Chi-square, entropy etc. K-
Mean is an unsupervised algorithm that provides a solution
for clustering problem. The algorithm will follow the
procedure to form a cluster which contains homogeneous
data.
Random forest is identified as a collection of
decision trees. Every tree will try to estimate a classification
and this is called as a vote. We consider each votefromevery
tree and chose the maximum voted classification. Naive
Bayes can be applied only if the features are independent to
each other. Gradient Boosting Algorithm usesmultipleweak
algorithms to form accurate algorithm. Instead of using the
single estimator, will create a more stable and robust
algorithm. Based on the data set the algorithm is predicted
and provides an efficient result for data analysing process.
5. Results:
The user can click on the Submit button that is
provided and then select the operations they wish to
perform on their dataset from the list of operations
provided. The user can then upload the dataset into the
application by click on the Upload button to start the pre-
processing. Initially the original dataset is displayed and
then dataset after operation 1 will be displayed as cleansed
dataset. The selected operations are performed with the
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 950
Cleansed dataset; finally the user will perform the data
analysis with the required algorithm to obtain the result in
visualization and it can be download by the user.
Upload Noisy Dataset:
Displaying the noisy Dataset:
Preprocessing the Data:
Applying Machine Learning Modal:
Output:
6. Conclusion:
Our developed systemperformsData Cleaning,Data
Transformation and Data Reduction in data pre-processing.
Our system which takes the rawdatasetsintotheapplication
which are then pre-processed to clean up all the noisy data
using pre-processing techniques and the cleansed data is
visualized to the users after all the pre-processing is done.
This system saves a lot of time since manual cleaning can be
avoided. After cleansing the user can choose or select the
machine learning model which will provide efficient results
as plots. This serves as an effective purpose for the users
who wants to clean huge datasets and visualizestheanalysis
of pre-processed data. In future the accuracy and
comparison of the machine learning algorithms can be done
within the friendly user interface.
REFERENCES
[1] Cristian Felix, Anshul Vikram Pandey, and EnricoBertini,
“TextTile: An Interactive Visualization Tool for Seamless
Exploratory, Analysis of Structured Dataand Unstructured
Text“, IEEE-2018.
[2] Data,Huawen Liu, Xuelong Li, Jiuyong Li, andShichao
Zhang, “Efficient Outlier Detection for High-Dimensional“,
IEEE-2019.
[3] M. Bostock, V. Ogievetsky, and J. Heer, “Datadriven
documents,” IEEE-2011.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 951
[4] F. Beck, S. Koch, and D. Weiskopf, “Visual Analysis and
Dissemination of Scientific Literature Collections with
SurVis”, IEEE-2016.
[5] Parke Godfrey, JarekGryz and PioterLasek,“Interactive
visualisation of large datasets”, IEEE-2016.
[6] Dileep kumarkoshleyand RajuHadler,“Data Cleaning: An
Abstraction-based approach”, IEEE-2015.
[7] Mehmet Adil Yalçın;NiklasElmqvist; Benjamin B.
Bederson,“Keshif :Rapid and Expressive Tabular Data
Exploration for Novices”, IEEE-2018.
[8] S. Papadimitriou, H. Kitagawa, P. B. Gibbons, and C.
Faloutsosk, “LOCI: Fast outlier detection using the local
correlation integral,” IEEE 19th Int. Conf. Data Eng. (ICDE),
Bengaluru, India, 2003, pp. 315–326.
[9] Y. Pang, J. Cao, and X. Li, “Learning samplingdistributions
for efficient object detection”, IEEE Trans. Cybern., vol. 47,
no. 1, pp. 117–129, Jan. 2017.
[10] M. Gupta, J. Gao, C. C. Aggarwal, and J. Han, “Outlier
detection for temporal data: A survey”, IEEE Trans. Knowl.
Data Eng., vol. 26, no. 9, pp. 2250–2267, Sep. 2014.
[11] S. F. Roth and J. Mattis, “Automating the presentation of
information,” in Artificial Intelligence Applications, 1991.
Pro-ceedings. , Seventh IEEE Conference on, vol. 1.IEEE,
1991, pp. 90–97.
[12] M. Bostock and J. Heer, “Protovis: A graphical toolkit
for visualization,” Visualization and Computer Graphics,
IEEE Transactions on, vol. 15, no. 6, pp. 1121–1128, 2009.
[13] A. Dziedzic, J. Duggan, A. J. Elmore, V. Gadepally, and M.
Stonebraker, “Bigdawg: a polystore for diverse interactive
applications,” in IEEE Viz Data Systems for Interactive
Analysis, 2015.
[14] P. Bohannon, W. Fan, F. Geerts, X. Jia, and A.
Kementsietsidis “Conditional functional dependencies for
data cleaning. In Data Engineering”, IEEE 23rd International
Conference on, pages 746–755. IEEE, 2007.

More Related Content

What's hot

IRJET - An Overview of Machine Learning Algorithms for Data Science
IRJET - An Overview of Machine Learning Algorithms for Data ScienceIRJET - An Overview of Machine Learning Algorithms for Data Science
IRJET - An Overview of Machine Learning Algorithms for Data ScienceIRJET Journal
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYEditor IJMTER
 
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERSN ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERScsandit
 
Distributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebDistributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebEditor IJCATR
 
A Study on Machine Learning and Its Working
A Study on Machine Learning and Its WorkingA Study on Machine Learning and Its Working
A Study on Machine Learning and Its WorkingIJMTST Journal
 
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...IRJET Journal
 
A02610104
A02610104A02610104
A02610104theijes
 
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATION
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATIONEDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATION
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATIONIJEEE
 
IRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of ClusteringIRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of ClusteringIRJET Journal
 
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET Journal
 
IRJET-Scaling Distributed Associative Classifier using Big Data
IRJET-Scaling Distributed Associative Classifier using Big DataIRJET-Scaling Distributed Associative Classifier using Big Data
IRJET-Scaling Distributed Associative Classifier using Big DataIRJET Journal
 
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Sagar Deogirkar
 
Ijatcse71852019
Ijatcse71852019Ijatcse71852019
Ijatcse71852019loki536577
 
Survey on semi supervised classification methods and feature selection
Survey on semi supervised classification methods and feature selectionSurvey on semi supervised classification methods and feature selection
Survey on semi supervised classification methods and feature selectioneSAT Journals
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceIJDKP
 
Data mining techniques
Data mining techniquesData mining techniques
Data mining techniqueseSAT Journals
 
IRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET Journal
 
Survey on semi supervised classification methods and
Survey on semi supervised classification methods andSurvey on semi supervised classification methods and
Survey on semi supervised classification methods andeSAT Publishing House
 
Survey on Feature Selection and Dimensionality Reduction Techniques
Survey on Feature Selection and Dimensionality Reduction TechniquesSurvey on Feature Selection and Dimensionality Reduction Techniques
Survey on Feature Selection and Dimensionality Reduction TechniquesIRJET Journal
 

What's hot (20)

IRJET - An Overview of Machine Learning Algorithms for Data Science
IRJET - An Overview of Machine Learning Algorithms for Data ScienceIRJET - An Overview of Machine Learning Algorithms for Data Science
IRJET - An Overview of Machine Learning Algorithms for Data Science
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
 
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERSN ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
 
Distributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebDistributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic Web
 
A Study on Machine Learning and Its Working
A Study on Machine Learning and Its WorkingA Study on Machine Learning and Its Working
A Study on Machine Learning and Its Working
 
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
 
A02610104
A02610104A02610104
A02610104
 
[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...
[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...
[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...
 
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATION
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATIONEDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATION
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATION
 
IRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of ClusteringIRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of Clustering
 
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
 
IRJET-Scaling Distributed Associative Classifier using Big Data
IRJET-Scaling Distributed Associative Classifier using Big DataIRJET-Scaling Distributed Associative Classifier using Big Data
IRJET-Scaling Distributed Associative Classifier using Big Data
 
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
 
Ijatcse71852019
Ijatcse71852019Ijatcse71852019
Ijatcse71852019
 
Survey on semi supervised classification methods and feature selection
Survey on semi supervised classification methods and feature selectionSurvey on semi supervised classification methods and feature selection
Survey on semi supervised classification methods and feature selection
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduce
 
Data mining techniques
Data mining techniquesData mining techniques
Data mining techniques
 
IRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
 
Survey on semi supervised classification methods and
Survey on semi supervised classification methods andSurvey on semi supervised classification methods and
Survey on semi supervised classification methods and
 
Survey on Feature Selection and Dimensionality Reduction Techniques
Survey on Feature Selection and Dimensionality Reduction TechniquesSurvey on Feature Selection and Dimensionality Reduction Techniques
Survey on Feature Selection and Dimensionality Reduction Techniques
 

Similar to IRJET - An User Friendly Interface for Data Preprocessing and Visualization using Machine Learning Models

IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET Journal
 
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningEmail Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningIRJET Journal
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET Journal
 
IRJET- Road Accident Prediction using Machine Learning Algorithm
IRJET- Road Accident Prediction using Machine Learning AlgorithmIRJET- Road Accident Prediction using Machine Learning Algorithm
IRJET- Road Accident Prediction using Machine Learning AlgorithmIRJET Journal
 
Efficiently Detecting and Analyzing Spam Reviews Using Live Data Feed
Efficiently Detecting and Analyzing Spam Reviews Using Live Data FeedEfficiently Detecting and Analyzing Spam Reviews Using Live Data Feed
Efficiently Detecting and Analyzing Spam Reviews Using Live Data FeedIRJET Journal
 
IRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET Journal
 
Comparative Study of Enchancement of Automated Student Attendance System Usin...
Comparative Study of Enchancement of Automated Student Attendance System Usin...Comparative Study of Enchancement of Automated Student Attendance System Usin...
Comparative Study of Enchancement of Automated Student Attendance System Usin...IRJET Journal
 
Machine Learning Approaches and its Challenges
Machine Learning Approaches and its ChallengesMachine Learning Approaches and its Challenges
Machine Learning Approaches and its Challengesijcnes
 
IRJET - House Price Prediction using Machine Learning and RPA
 IRJET - House Price Prediction using Machine Learning and RPA IRJET - House Price Prediction using Machine Learning and RPA
IRJET - House Price Prediction using Machine Learning and RPAIRJET Journal
 
A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...IOSR Journals
 
AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONIRJET Journal
 
IRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its AnalysisIRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its AnalysisIRJET Journal
 
A Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmA Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmIRJET Journal
 
IRJET- Machine Learning
IRJET- Machine LearningIRJET- Machine Learning
IRJET- Machine LearningIRJET Journal
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET Journal
 
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...IRJET Journal
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisIRJET Journal
 
IRJET- Intelligence Extraction using Various Machine Learning Algorithms
IRJET- Intelligence Extraction using Various Machine Learning AlgorithmsIRJET- Intelligence Extraction using Various Machine Learning Algorithms
IRJET- Intelligence Extraction using Various Machine Learning AlgorithmsIRJET Journal
 
IRJET - A Novel Approach for Software Defect Prediction based on Dimensio...
IRJET -  	  A Novel Approach for Software Defect Prediction based on Dimensio...IRJET -  	  A Novel Approach for Software Defect Prediction based on Dimensio...
IRJET - A Novel Approach for Software Defect Prediction based on Dimensio...IRJET Journal
 
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...IRJET Journal
 

Similar to IRJET - An User Friendly Interface for Data Preprocessing and Visualization using Machine Learning Models (20)

IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
 
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningEmail Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
 
IRJET- Road Accident Prediction using Machine Learning Algorithm
IRJET- Road Accident Prediction using Machine Learning AlgorithmIRJET- Road Accident Prediction using Machine Learning Algorithm
IRJET- Road Accident Prediction using Machine Learning Algorithm
 
Efficiently Detecting and Analyzing Spam Reviews Using Live Data Feed
Efficiently Detecting and Analyzing Spam Reviews Using Live Data FeedEfficiently Detecting and Analyzing Spam Reviews Using Live Data Feed
Efficiently Detecting and Analyzing Spam Reviews Using Live Data Feed
 
IRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom IndustryIRJET - Customer Churn Analysis in Telecom Industry
IRJET - Customer Churn Analysis in Telecom Industry
 
Comparative Study of Enchancement of Automated Student Attendance System Usin...
Comparative Study of Enchancement of Automated Student Attendance System Usin...Comparative Study of Enchancement of Automated Student Attendance System Usin...
Comparative Study of Enchancement of Automated Student Attendance System Usin...
 
Machine Learning Approaches and its Challenges
Machine Learning Approaches and its ChallengesMachine Learning Approaches and its Challenges
Machine Learning Approaches and its Challenges
 
IRJET - House Price Prediction using Machine Learning and RPA
 IRJET - House Price Prediction using Machine Learning and RPA IRJET - House Price Prediction using Machine Learning and RPA
IRJET - House Price Prediction using Machine Learning and RPA
 
A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...
 
AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTION
 
IRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its AnalysisIRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its Analysis
 
A Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmA Firefly based improved clustering algorithm
A Firefly based improved clustering algorithm
 
IRJET- Machine Learning
IRJET- Machine LearningIRJET- Machine Learning
IRJET- Machine Learning
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
 
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data Analysis
 
IRJET- Intelligence Extraction using Various Machine Learning Algorithms
IRJET- Intelligence Extraction using Various Machine Learning AlgorithmsIRJET- Intelligence Extraction using Various Machine Learning Algorithms
IRJET- Intelligence Extraction using Various Machine Learning Algorithms
 
IRJET - A Novel Approach for Software Defect Prediction based on Dimensio...
IRJET -  	  A Novel Approach for Software Defect Prediction based on Dimensio...IRJET -  	  A Novel Approach for Software Defect Prediction based on Dimensio...
IRJET - A Novel Approach for Software Defect Prediction based on Dimensio...
 
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIkoyaldeepu123
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage examplePragyanshuParadkar1
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoĂŁo Esperancinha
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 

Recently uploaded (20)

POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AI
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage example
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 

IRJET - An User Friendly Interface for Data Preprocessing and Visualization using Machine Learning Models

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 948 An User Friendly Interface for Data Preprocessing and Visualization using Machine Learning Models Mr. S. Yoganand1, Bharathi Kannan R2, Daya Meenakshi B2 1Assistant Professor, Department of Computer Science and Engineering, Agni College of Technology Chennai-130, Tamil Nadu, India. 2,3UG Student, Department of Computer Science and Engineering,Agni College of Technology Chennai-130, Tamil Nadu, India. ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract – Machine learning is one of the most efficient techniques for prediction and classification related problems. In this modern era, most of the industries all over the world depend upon the machine learning models which leadintothe data analytics century. There is no properandefficienttool for handling the datasets which use machine learning models for data prediction and Visualization. So, in this paper a novel idea is proposed for making the user-friendly approach to handle the machine learning models for data prediction and visualisation. A tool is developed, such that it performs data cleaning which will be a prerequisite for data analysis and then provides a visible representation of the cleansed data. The developed tool will take the input as structured dataset that contains both textual and numerical data which are then processed using machine learning algorithms to obtain a pre- processed dataset. This process may undergo series of steps to produce visualized and predicted data as per the chosen effective algorithm to obtain efficient result. Key Words: Machine learning, visualization, pre- processing, Tool, user interface. 1. INTRODUCTION An organisation uses the dataset for predictive analysis and an important concern in these cases is data quality. Using noisy data can hamper with the correctness of analysis. The common errors are missing values, duplicates and other errors. These errors need to be corrected for reliable decisions and analytics. The users must know that the effects of using the noisy data before proceeding with the cleaning process. Noise removal will improve the model performance, due to the fact that noises may disturb the discovery of important information. Machine learning is the appreciated application of Artificial Intelligence. It is used to learn automatically without any human assistancethatprovideshugedataset for analysing with a large number of data fields. With the data provided by the system after implementing the machine learning algorithms, organizations are able to work more effective and acquire profit over their competitors. The system that uses machine learning technique will be able to predict how the structure looks like and adjust the data according to their structure. The mainchallengesinmachine learning model is to deal with large data sources for data cleaning process. Data cleaning process is carried by taking in huge datasets which are checkedforthepossible errorsby using data pre-processing techniques. The other challenges include avoiding learning process from noisy data, avoiding building a prejudiced model, not giving reasons for compromising with the qualityofthedata.The bestpractices for data cleaning using machine learning techniquesthatare filling missing values, removing unnecessary rows,reducing the size of the data and implementing a good quality plan. The success of machine learning applications depends on the amount of good quality data that is given to it. But this process of cleaning may not be considered as a main area in data pre-processing. The system that uses powerful algorithms to process the noisy data can yield bad results if irrelevant or wrong training set of data is given. In the proposed model ML algorithms to find out the different patterns in the data and group it by itself into clean and noisy data which will help in reducing execution time. 2. Related Work: Data Pre-processing is used to convert the raw data into pre-processed data set. [1] In Machine Learning, the data pre-processing is used to transform or encode the data easily by their algorithm. It consists of interactive steps as follows. Data cleaning is used to detect and correct inaccurate records from a record or tables, and then replacing, modifying or deleting this noisy data. Data integration will combines the data residing indifferent sources that provides user with a unified view of these data [2]. The process of selecting suitable data for a research project will impact data integritywhereData transformation converts data from a source data format into resultant data [3]. The tools which are available to process the data in data processing and visualizing are Knime, Shogun, Oryx 2, Tensor flow, Weka, RapidMiner, Trifacta Wrangler, Python [12] [13]. In this paper, we will focus on removing the noisy data that identifies the numerical values, predicting and filling in missing values and detect outliers which hamper with data analysis [11]. We propose a system that simplifies the process for the user and allows for better processing. In summary, Machine learning for data cleaning might be the only way to provide complete and trustworthy data sets for
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 949 effective analytics, so we provide an user friendly interface for pre-processing and model analysis with visualizationfor the ease of user. 3. System Design: The Data Pre-processingisdone withthreemethods they are Data Cleaning, Data Transformation and Data Reduction. The data cleaning application is to process the raw dataset containing both textual and numerical data that convert it into a cleaned dataset which can be used for data analysis. Initially, users must upload the dataset in which they perform the analysis. They can choose the operations that they want to perform on their dataset from themodules provided. This application performs a series of operations which includes removing columns with less information or no information, removing unnecessary rows, identifyingthe numerical values, filling in the missing fields and identifying the outliers. Some columns may contain less information or no information that makes it hard to rely on such columns for analysis and so such columns can be removed and they don’t cause significant damage to the data. Some rows may contain empty fields which will again tamper with the proper pre-processing of the dataset. Hence such values are identified and removed. The dataset will contain categorical features ranging from numerical to non-numerical values. This application requires only numerical data which is used for analysis and prediction, such that the fields containing numeric values areidentified. If you try to remove them, you might reduce the amount of data that is available. So, these fields need to be filled in appropriate values. 4. Implementation: The outliers with data points are really far from the rest of your data points. Mathematically, an outlierisusually defined as an observation over three standard deviations from the mean. They can show up due to errors in data entry or measurement, or just because there's a variation in the population. Identifying and handlingoutliersisanimportant part of data cleaning. In Data Analysis we are using the subsequent algorithms to analyse the cleansed data. Linear regression, SVM (Support Vector Machine), KNN (K-Nearest Neighbours), Logistic Regression, Decision Tree, K-Means, Random Forest, Naive Bayes, Dimensional Reduction Algorithms, Gradient Boosting Algorithms. Linear Regression algorithm will use the info points to seek out the simplest fit line to model the info. A line can be represented by the equation, y = m*x + c where y is the dependent variable and x is the independent variable. Basic calculus theories are applied to seek out the values for m and c using the given data set. The SVM will separate the data points using a line. The KNN will predict unknown data point with its k nearest neighbours. The value of k is a critical factor regarding the accuracy of prediction. It determines the nearest distance using basic distance functions like Euclidean. Thisalgorithm has to be a high computation power and that we have to normalize the information initially to bring every datum within the same range. The Decision Tree algorithm is used to solve classification problems.Sometechniquesareusedto categorize the data they are Gini, Chi-square, entropy etc. K- Mean is an unsupervised algorithm that provides a solution for clustering problem. The algorithm will follow the procedure to form a cluster which contains homogeneous data. Random forest is identified as a collection of decision trees. Every tree will try to estimate a classification and this is called as a vote. We consider each votefromevery tree and chose the maximum voted classification. Naive Bayes can be applied only if the features are independent to each other. Gradient Boosting Algorithm usesmultipleweak algorithms to form accurate algorithm. Instead of using the single estimator, will create a more stable and robust algorithm. Based on the data set the algorithm is predicted and provides an efficient result for data analysing process. 5. Results: The user can click on the Submit button that is provided and then select the operations they wish to perform on their dataset from the list of operations provided. The user can then upload the dataset into the application by click on the Upload button to start the pre- processing. Initially the original dataset is displayed and then dataset after operation 1 will be displayed as cleansed dataset. The selected operations are performed with the
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 950 Cleansed dataset; finally the user will perform the data analysis with the required algorithm to obtain the result in visualization and it can be download by the user. Upload Noisy Dataset: Displaying the noisy Dataset: Preprocessing the Data: Applying Machine Learning Modal: Output: 6. Conclusion: Our developed systemperformsData Cleaning,Data Transformation and Data Reduction in data pre-processing. Our system which takes the rawdatasetsintotheapplication which are then pre-processed to clean up all the noisy data using pre-processing techniques and the cleansed data is visualized to the users after all the pre-processing is done. This system saves a lot of time since manual cleaning can be avoided. After cleansing the user can choose or select the machine learning model which will provide efficient results as plots. This serves as an effective purpose for the users who wants to clean huge datasets and visualizestheanalysis of pre-processed data. In future the accuracy and comparison of the machine learning algorithms can be done within the friendly user interface. REFERENCES [1] Cristian Felix, Anshul Vikram Pandey, and EnricoBertini, “TextTile: An Interactive Visualization Tool for Seamless Exploratory, Analysis of Structured Dataand Unstructured Text“, IEEE-2018. [2] Data,Huawen Liu, Xuelong Li, Jiuyong Li, andShichao Zhang, “Efficient Outlier Detection for High-Dimensional“, IEEE-2019. [3] M. Bostock, V. Ogievetsky, and J. Heer, “Datadriven documents,” IEEE-2011.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 951 [4] F. Beck, S. Koch, and D. Weiskopf, “Visual Analysis and Dissemination of Scientific Literature Collections with SurVis”, IEEE-2016. [5] Parke Godfrey, JarekGryz and PioterLasek,“Interactive visualisation of large datasets”, IEEE-2016. [6] Dileep kumarkoshleyand RajuHadler,“Data Cleaning: An Abstraction-based approach”, IEEE-2015. [7] Mehmet Adil Yalçın;NiklasElmqvist; Benjamin B. Bederson,“Keshif :Rapid and Expressive Tabular Data Exploration for Novices”, IEEE-2018. [8] S. Papadimitriou, H. Kitagawa, P. B. Gibbons, and C. Faloutsosk, “LOCI: Fast outlier detection using the local correlation integral,” IEEE 19th Int. Conf. Data Eng. (ICDE), Bengaluru, India, 2003, pp. 315–326. [9] Y. Pang, J. Cao, and X. Li, “Learning samplingdistributions for efficient object detection”, IEEE Trans. Cybern., vol. 47, no. 1, pp. 117–129, Jan. 2017. [10] M. Gupta, J. Gao, C. C. Aggarwal, and J. Han, “Outlier detection for temporal data: A survey”, IEEE Trans. Knowl. Data Eng., vol. 26, no. 9, pp. 2250–2267, Sep. 2014. [11] S. F. Roth and J. Mattis, “Automating the presentation of information,” in Artificial Intelligence Applications, 1991. Pro-ceedings. , Seventh IEEE Conference on, vol. 1.IEEE, 1991, pp. 90–97. [12] M. Bostock and J. Heer, “Protovis: A graphical toolkit for visualization,” Visualization and Computer Graphics, IEEE Transactions on, vol. 15, no. 6, pp. 1121–1128, 2009. [13] A. Dziedzic, J. Duggan, A. J. Elmore, V. Gadepally, and M. Stonebraker, “Bigdawg: a polystore for diverse interactive applications,” in IEEE Viz Data Systems for Interactive Analysis, 2015. [14] P. Bohannon, W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis “Conditional functional dependencies for data cleaning. In Data Engineering”, IEEE 23rd International Conference on, pages 746–755. IEEE, 2007.