International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 03 | Mar -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1341
SVM CLASSIFIER ALGORITHM FOR DATA STREAM MINING USING HIVE
AND R
Mrs. Pranamita Nanda1, B. Sandhiya2, R. Sandhiya3, A. S. Vanaja4
1Assistant Professor, 2,3,4Students
Department of Computer Science and Engineering
Velammal Institute Of Technology, Ponneri, Tiruvallur.
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract: Big data analysis is challenging because of the large
volume of data that IT deployments generate along many
dimensions. To make the analysis more efficient, we use the
Hive tool for query processing and RStudio for producing
statistical reports. The processing load in data stream mining
is reduced by the technique known as feature selection.
However, when mining high-dimensional data, the search space
from which an optimal feature subset is derived grows
exponentially in size, leading to an intractable computational
demand. To reduce the complexity of using accelerated particle
swarm optimization (APSO), we store the data using Hadoop
technology, which makes it easier to store and retrieve data in
a big data environment. The dataset is analysed and a
statistical report is produced using the SVM algorithm in the R
software environment, which provides statistical computing and
graphics. The statistical report compares the accuracy of the
linear and nonlinear models, and the one with the higher
accuracy is considered efficient. The final graph combines the
linear and nonlinear results with respect to cost and sigma,
which are user-defined values. PSO with the SVM algorithm
improves the performance of analysing the data.
INTRODUCTION:
Handling large volumes of data, and storing and retrieving
them, is challenging. Data stream mining is the process of
extracting knowledge structures from continuous, rapid data
records. A data stream is an ordered sequence of instances
that, in many applications of data stream mining, can be read
only once or a small number of times using limited computing
and storage capabilities. We therefore use data stream mining
techniques for retrieving the data. To make retrieval
efficient, we use the Hadoop Hive tool for query processing,
which takes less time to process. The processing includes
converting unstructured data into structured data by creating
a schema. The Hadoop environment provides a storage layer, the
Hadoop Distributed File System (HDFS), into which our database
is imported with a Hive query from an internal or external
source such as a server or the system we are working on. In
Hive's LOAD DATA statement, LOCAL INPATH names a file on the
local file system, while INPATH alone names a file already in
HDFS. The data extracted from the database is then divided
into test data and training data. The training data is
existing data whose classes are already known. The model built
from the training data is checked against the test data. Both
sets are used with the classification algorithm known as the
Support Vector Machine (SVM). For a dataset consisting of a
feature set and a label set, a classifier builds a model to
predict classes. The parameter used to evaluate this process
is accuracy. The predictions of the SVM classifier are
evaluated to obtain the accuracy, and the model with the
higher accuracy is taken into consideration.
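The accuracy measure mentioned above can be illustrated with a
small sketch. The paper computes accuracy in R; Python is used
here only for illustration, and the function name `accuracy` and
the example labels are hypothetical rather than taken from the
paper.

```python
def accuracy(actual, predicted):
    """Fraction of test instances whose predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("label lists must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels for five test instances (illustration only).
actual_labels = ["safe", "unsafe", "safe", "safe", "unsafe"]
predicted_labels = ["safe", "unsafe", "unsafe", "safe", "unsafe"]
print(accuracy(actual_labels, predicted_labels))  # 4 of 5 correct -> 0.8
```

Accuracy computed this way is only meaningful on test instances
that were not used to build the model.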
EXISTING SYSTEM:
The lightweight feature selection technique known as swarm
search is used for classifying the dataset. There are many
feature selection techniques, such as CCV and improved PSO.
The amount of data fed in is potentially infinite, and data
delivery is continuous, like a high-speed train of
information. The processing is hence expected to be real-time
and instantly responsive. Retrieving data from a large volume
and maintaining it is difficult, and the accuracy is somewhat
low, which is overcome by using a better classifier algorithm.
The complication, on top of quantitatively computing the
nonlinear relations between the feature values and target
classes, is the temporal nature of such data streams: one must
process the data stream long enough to accurately model
seasonal cycles or regular patterns, if they exist at all.
There are no straightforward relations that can easily map the
attribute data into a specific class without long-term
observation. This has a considerable impact on the design of
the data mining algorithm, which should be capable of just
reading and forgetting the data stream.
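The "read and forget" requirement can be illustrated with a
one-pass summary that keeps only a few numbers in memory. The
sketch below uses Welford's running mean and variance, a
standard technique chosen here for illustration; it is not an
algorithm described in this paper, and the class and function
names are hypothetical.

```python
class RunningStats:
    """One-pass mean and variance (Welford's algorithm) using O(1) memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far."""
        return self._m2 / self.count if self.count else 0.0


def summarize(stream):
    stats = RunningStats()
    for value in stream:  # each value is read once and then discarded
        stats.update(value)
    return stats


stats = summarize(iter([2, 4, 4, 4, 5, 5, 7, 9]))
print(stats.count, stats.mean, stats.variance)
```

The stream is consumed through an iterator, so no instance is
stored after it has been processed.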
LITERATURE SURVEY:
Although Big Data is a hype that has raised many technical
challenges confronting both academic research communities and
commercial IT deployment, the root sources of Big Data are
founded on data streams and the curse of dimensionality. It is
generally known that data sourced from data streams accumulate
continuously, making traditional batch-based model induction
algorithms infeasible for real-time data mining. To tackle
this problem, which stems mainly from the high dimensionality
and streaming format of data feeds in Big Data, a novel
lightweight feature selection is proposed. The feature
selection is designed particularly for mining streaming data
on the fly, by using an accelerated particle swarm
optimization (APSO) type of swarm search that achieves
enhanced analytical accuracy within reasonable processing
time. In that paper, a collection of Big Data with an
exceptionally large degree of dimensionality is put under test
with the new feature selection algorithm for performance
evaluation.[1]
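To give a feel for the swarm search mentioned above, the sketch
below applies the simplified accelerated PSO update, in which
each particle moves toward the global best and receives a random
perturbation that shrinks over time. It minimizes a continuous
test function and is not the feature selection procedure of [1];
the function names, the parameter values (alpha0, gamma, beta)
and the test function are assumptions made for illustration.

```python
import random


def apso_minimize(objective, dim, bounds, n_particles=20, n_iter=100,
                  alpha0=0.5, gamma=0.9, beta=0.5, seed=0):
    """Simplified accelerated PSO: particles move toward the global best
    plus a random perturbation whose size decays over time."""
    rng = random.Random(seed)
    low, high = bounds
    particles = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(n_particles)]
    best = min(particles, key=objective)[:]
    best_value = objective(best)
    history = [best_value]

    for t in range(n_iter):
        alpha = alpha0 * gamma ** t  # shrinking randomness
        for p in particles:
            for d in range(dim):
                # x <- (1 - beta) * x + beta * g_best + alpha * noise
                p[d] = (1 - beta) * p[d] + beta * best[d] + alpha * rng.gauss(0, 1)
                p[d] = min(max(p[d], low), high)  # keep inside the bounds
            value = objective(p)
            if value < best_value:
                best, best_value = p[:], value
        history.append(best_value)
    return best, best_value, history


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_value, history = apso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_value)
```

Unlike standard PSO, this variant keeps no per-particle best and
no velocity, which is what keeps the memory footprint small.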
The energy-saving research on virtualization of the cloud
computing platform shows that there are problems in the
management mode of the existing virtualization platform. This
mode is based on a single node managing the whole platform,
and the single node is responsible for migrating as well as
scheduling all of the virtual machines. A double management
model of the virtual machines is therefore proposed to solve
the problems of the single management node bottleneck and the
scope of migration. At the same time, the improved PSO
algorithm is used to make the plan for virtual machine
migration. On the premise of meeting the service performance,
the plan achieves energy saving by keeping the number of
booted servers to a minimum. The experiment shows that the
proposed management mode not only solves the bottleneck
problem of the single management node, but also reduces the
migration scope and the difficulty of the problem. The
improved PSO algorithm clearly raises the speed of migration
and the overall energy efficiency of the scheme.[2]
The cloud storage problem is one of the interesting and
important topics in the fields of cloud computing and big
data. From the viewpoint of optimization, a discrete PSO
algorithm is used to handle the cloud storage problem of the
distributed data centers in China's railway and to copy the
data between two data centers. To achieve good performance
with the smallest transmitting distance, the discrete PSO
algorithm essentially pairs up the elements of two data center
sets. Numerical results highlight that the discrete PSO
algorithm can provide a guideline for the suboptimal cloud
storage strategy of China's railway when the number of
distributed data centers is equal to 15, 17 and 18.[3]
One of the challenges in inferring a classification model with
good prediction accuracy is to select the relevant features
that contribute to maximum predictive power. Many feature
selection techniques have been proposed and studied in the
past, but none so far has been claimed to be the best. In that
paper, a novel and efficient feature selection method called
Clustering Coefficients of Variation (CCV) is proposed. CCV is
based on a very simple variance-based principle which finds an
optimal balance between generalization and overfitting. By the
simplicity of its design, it is anticipated that CCV will be a
useful alternative pre-processing method for classification,
especially with datasets that are characterized by many
features.[4]
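The statistic that gives CCV its name, the coefficient of
variation, is the standard deviation of a feature divided by its
mean. The sketch below computes only this statistic per feature;
how [4] clusters the coefficients and selects features is not
reproduced here, and the feature names and values are
hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the absolute mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / abs(mean)


# Hypothetical feature columns (illustration only).
features = {
    "varying_feature": [2, 4, 4, 4, 5, 5, 7, 9],
    "constant_feature": [3, 3, 3, 3, 3, 3, 3, 3],
}
cv = {name: coefficient_of_variation(col) for name, col in features.items()}
ranked = sorted(cv, key=cv.get, reverse=True)
print(cv, ranked)
```

A feature that never varies has a coefficient of zero and
carries no information for separating classes.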
In a series of recent papers, Prof. Olariu and his co-workers
have promoted the vision of vehicular clouds (VCs), a
nontrivial extension, along several dimensions, of
conventional cloud computing. The main contribution of that
work is to identify and analyze a number of security
challenges and potential privacy threats in VCs. Although
security issues have received attention in cloud computing
and vehicular networks, the authors identify security
challenges that are specific to VCs, e.g., challenges of
authentication of high-mobility vehicles, scalability and
single interface, tangled identities and locations, and the
complexity of establishing trust relationships among multiple
players caused by intermittent short-range communications.
Additionally, they provide a security scheme that addresses
several of the challenges discussed.[5]
PROPOSED SYSTEM
We propose an approach for data stream mining using Hadoop
Hive technology. To implement big data analytics in a highly
scalable manner, big data needs Hadoop for processing the
data. The main research challenge here is finding the most
appropriate model induction algorithm for mining data streams.
As an additional requirement, given the possibility of
embedding the data miner module into small devices, the memory
requirement should be as small as possible, for the obvious
reasons of saving energy and fitting into a tiny device. In
other words, the learned model, probably in the form of
generalized nonlinear mappings between the values of the
features and the predicted target classes, must be compact
enough to execute in a small run-time memory. No room should
be wasted on storing features and relations that are not
significant or contribute little to the model accuracy. To
this end, omitting feature selection is out of the question,
given the number of original features extracted from the data
streams. Since conventional models are built on a stationary
dataset, a model update needs to repeat the whole training
process whenever new samples arrive, adding them to
incorporate the changing underlying patterns. In a dynamic
stream processing environment, however, the data
classification model would have to be updated frequently.
ARCHITECTURAL DIAGRAM AND EXPLANATION:
FIG: 2: PROPOSED ARCHITECTURE
The datasets related to the user's needs are retrieved from
the database using a Hive query. Hive is the data warehouse
used to analyse and retrieve the data. For this, we first need
to continuously upload the data into the database. The
database may hold, for example, medical datasets, traffic
light datasets and weather forecasting datasets; from these
multiple datasets the required one is retrieved using the Hive
query. The datasets have multiple fields, such as age, name
and sex, and the retrieval of data is based on these fields.
The retrieved dataset is analysed and divided into two
segments, known as the training dataset and the test dataset.
The training dataset is larger than the test dataset. The
training dataset undergoes a filtering process, while the test
dataset undergoes classification, where the data is sliced.
Both the sliced data and the training data are then passed to
the SVM.
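The division into a larger training segment and a smaller test
segment can be sketched as follows. The 70/30 ratio, the random
shuffling, the fixed seed and the record layout are assumptions
made for illustration; the paper does not state how the split
was performed.

```python
import random


def split_dataset(records, train_fraction=0.7, seed=42):
    """Shuffle the records and split them into a larger training part
    and a smaller test part."""
    if not 0.5 < train_fraction < 1.0:
        raise ValueError("train_fraction should be between 0.5 and 1.0")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records with the kind of fields mentioned in the text.
records = [{"id": i, "age": 20 + i} for i in range(10)]
train, test = split_dataset(records)
print(len(train), len(test))  # 7 3
```

Shuffling before the cut avoids a split that simply follows the
order in which the records were stored.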
The SVM algorithm is used for binary and multi-class problems
and for anomaly detection. A hyperplane divides the classes,
and the critical points closest to it are known as support
vectors. The separating hyperplane is then the perpendicular
bisector of the line joining two such support vectors, one
from each class. The data is loaded into R input data frames.
These frames are used to analyse the data using R's
statistical computing and graphics, and to produce the
statistical report. The statistical report is produced for the
linear and nonlinear cases and gives the accuracy of both. The
linear and nonlinear accuracies are then compared to see which
is more efficient. A grid analysis is then done, which
combines both accuracies and produces the graph. With that,
positive and negative data are identified. Positive data is
safe, whereas negative data is unsafe. The approach increases
efficiency and takes less time for analysing and retrieving
the data. It improves the data processing speed. It is able to
analyse a large volume of data in a short time compared to
other tools. It provides large-scale integration of data.
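The perpendicular-bisector description can be worked through for
the simplest case of one support vector per class. The sketch
below is an illustration under that assumption only; a real SVM
finds the hyperplane by solving an optimization problem over all
training points. The function names and coordinates are
hypothetical.

```python
def bisector_hyperplane(positive_sv, negative_sv):
    """Return (w, b) of the hyperplane w.x + b = 0 that perpendicularly
    bisects the segment between two support vectors."""
    w = [p - n for p, n in zip(positive_sv, negative_sv)]
    midpoint = [(p + n) / 2 for p, n in zip(positive_sv, negative_sv)]
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, b


def classify(point, w, b):
    """Return 1 on the positive side of the hyperplane, otherwise -1."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    return 1 if score > 0 else -1


w, b = bisector_hyperplane(positive_sv=(3.0, 3.0), negative_sv=(1.0, 1.0))
print(w, b)                        # [2.0, 2.0] -8.0
print(classify((4.0, 2.5), w, b))  # 1
print(classify((0.5, 1.0), w, b))  # -1
```

The normal vector w points from the negative support vector to
the positive one, and b is chosen so that the midpoint of the
two lies exactly on the hyperplane.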
MODULES:
• Create schema in data warehouse
• Importing the data to HDFS
• Extracting the data
• Performance evaluation
• Statistical report
MODULE DESCRIPTION:
A) CREATE SCHEMA IN DATA WAREHOUSE:
In the database the data is in an unstructured format, which
cannot be queried directly. The database is uploaded to the
system, and a schema is created in order to process the
unstructured data in Hive. The schema is built from the
attributes, which are treated as fields in Hive. These fields
can be used to divide the dataset into test data and training
data, where the test data is not yet classified and the
training data is already classified.
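A schema of this kind is declared in Hive with a CREATE TABLE
statement. The sketch below only assembles such a statement as a
string and does not connect to Hive; the table name, field names
and comma delimiter are hypothetical, since the paper does not
list the actual schema.

```python
def build_create_table(table, columns, delimiter=","):
    """Build a HiveQL CREATE TABLE statement for a delimited text file."""
    column_sql = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({column_sql}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        "STORED AS TEXTFILE"
    )


# Hypothetical table and field names, chosen only for illustration.
schema = [("name", "STRING"), ("age", "INT"), ("sex", "STRING")]
statement = build_create_table("medical_records", schema)
print(statement)
```

The resulting string could be passed to the Hive command line or
to any Hive client; the identifiers are not validated here, so
they should come from trusted configuration only.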
B) IMPORTING THE DATA INTO HDFS:
The Hadoop Distributed File System (HDFS) is designed to
store very large datasets and to stream those datasets at high
bandwidth to user applications. The database is converted from
unstructured to structured format by creating the schema, and
is then loaded into HDFS. If the data file is stored on the
local file system, Hive's LOAD DATA statement takes LOCAL
INPATH; if the file is already in HDFS, INPATH alone is used.
The keyword OVERWRITE is used to replace old data with new
data.
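In Hive's documented syntax the import is written as LOAD DATA
[LOCAL] INPATH '...' [OVERWRITE] INTO TABLE .... The sketch
below assembles that statement as a string without contacting
Hive; the function name, file path and table name are
hypothetical.

```python
def build_load_statement(path, table, local=True, overwrite=False):
    """Build a HiveQL LOAD DATA statement.

    local=True reads from the local file system (LOCAL INPATH);
    local=False expects the path to be in HDFS already (INPATH).
    """
    parts = ["LOAD DATA"]
    if local:
        parts.append("LOCAL")
    parts.append(f"INPATH '{path}'")
    if overwrite:
        parts.append("OVERWRITE")
    parts.append(f"INTO TABLE {table}")
    return " ".join(parts)


# Hypothetical path and table name.
load_sql = build_load_statement("/home/user/medical.csv", "medical_records",
                                local=True, overwrite=True)
print(load_sql)
```

Without OVERWRITE the loaded file is added to the data already
in the table instead of replacing it.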
C) EXTRACTING THE DATA:
Hive is used for data summarization, query and analysis. It
gives an SQL-like interface to query data stored in various
databases and file systems that integrate with Hadoop. Hive
provides the necessary SQL abstraction to integrate HiveQL
queries into the underlying Java API without the need to
implement queries in the low-level API. Hive supports easy
portability of SQL-based applications to Hadoop. It returns
the slice of the dataset that is relevant to the user query.
Using Hive, the data is retrieved faster and large volumes of
data can be handled. As the database is stored on the same
system where the processing takes place, the system acts as
both client and server.
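The kind of field-based slicing described above can be tried
without a Hadoop installation. In the sketch below an in-memory
SQLite database stands in for Hive, because the SELECT ... WHERE
form shown is common to both; it is not Hive itself, and the
table, fields and rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for Hive here; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medical_records (name TEXT, age INTEGER, sex TEXT)")
conn.executemany(
    "INSERT INTO medical_records VALUES (?, ?, ?)",
    [("A", 34, "F"), ("B", 61, "M"), ("C", 47, "F"), ("D", 29, "M")],
)

# Slice the dataset by field values, as a HiveQL SELECT ... WHERE would.
rows = conn.execute(
    "SELECT name, age FROM medical_records WHERE sex = ? AND age > ? ORDER BY age",
    ("F", 30),
).fetchall()
conn.close()
print(rows)  # [('A', 34), ('C', 47)]
```

Only the rows and columns named in the query are returned, which
is the "sliced data" that is passed on for classification.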
[Architecture diagram blocks: DATA STORAGE, DATA SETS, TRAINED DATA, TEST DATA, SVM, STATISTICAL REPORT]
D) PERFORMANCE EVALUATION:
In this approach, the Support Vector Machine (SVM) algorithm
is used for analysing and retrieving the data. It is a
supervised learning approach from the field of machine
learning (ML), and in its basic form it is a linear
classifier. It reduces time complexity and code complexity.
RStudio is adaptable to many types of data and produces the
results efficiently. The SVM is run with two kinds of kernel,
linear and radial. Accuracy is the parameter determined for
the SVM models. The linear kernel gives one accuracy and the
radial kernel gives another. Comparing these two accuracies,
the kernel with the higher accuracy is considered efficient.
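The comparison of a linear and a radial kernel can be
illustrated on a toy problem. The sketch below trains a kernel
perceptron, a much simpler stand-in for an SVM solver, with each
kernel and compares the resulting accuracies. The data set, the
radial kernel form exp(-sigma * squared distance) and the value
of sigma are assumptions made for illustration, and accuracy is
measured on the training points only.

```python
import math


def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, sigma=2.0):
    # Assumed form: exp(-sigma * squared Euclidean distance).
    return math.exp(-sigma * sum((a - b) ** 2 for a, b in zip(x, y)))


def decision(x, points, labels, alphas, kernel):
    return sum(a * y * kernel(p, x) for a, y, p in zip(alphas, labels, points))


def train_kernel_perceptron(points, labels, kernel, epochs=10):
    """Return mistake counts (alphas); a simple stand-in for an SVM solver."""
    alphas = [0] * len(points)
    for _ in range(epochs):
        for i, (x, y) in enumerate(zip(points, labels)):
            if y * decision(x, points, labels, alphas, kernel) <= 0:
                alphas[i] += 1  # misclassified: give this point more weight
    return alphas


def training_accuracy(points, labels, alphas, kernel):
    predictions = [1 if decision(x, points, labels, alphas, kernel) > 0 else -1
                   for x in points]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# XOR-style toy data: no straight line separates the two classes.
points = [(0, 0), (1, 1), (0, 1), (1, 0)]
labels = [-1, -1, 1, 1]

results = {}
for name, kernel in (("linear", linear_kernel), ("radial", rbf_kernel)):
    alphas = train_kernel_perceptron(points, labels, kernel)
    results[name] = training_accuracy(points, labels, alphas, kernel)

best_kernel = max(results, key=results.get)
print(results, best_kernel)
```

On this data the radial kernel separates the classes while the
linear kernel cannot, so the radial kernel is the one selected;
on other data the outcome may be reversed.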
E) STATISTICAL REPORT:
The statistical report is produced in RStudio as per the
user's needs, with the R programming language used for
analysing the data. RStudio provides a graphical
representation of the input data. The linear and radial
results are combined to produce a grid graph, which helps to
identify the highly positive and negative values.
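A grid graph of this kind comes from evaluating the model at
every combination of cost and sigma. The sketch below shows only
the grid loop. The scoring function is a made-up placeholder, not
a trained SVM and not a result from this paper, and the grid
values and names are hypothetical.

```python
from itertools import product


def grid_search(costs, sigmas, evaluate):
    """Evaluate every (cost, sigma) pair and return the best pair,
    its score, and the full table of scores."""
    scores = {(c, s): evaluate(c, s) for c, s in product(costs, sigmas)}
    best_pair = max(scores, key=scores.get)
    return best_pair, scores[best_pair], scores


def placeholder_accuracy(cost, sigma):
    # Stand-in for "train an SVM with these values and measure accuracy".
    # A made-up smooth function peaking at cost=1, sigma=0.1; NOT real results.
    return max(0.0, 1.0 - ((cost - 1) ** 2 + (sigma - 0.1) ** 2))


best_pair, best_score, scores = grid_search(
    costs=[0.25, 0.5, 1, 2], sigmas=[0.01, 0.1, 1], evaluate=placeholder_accuracy
)
print(best_pair, best_score)  # (1, 0.1) 1.0
```

In practice `evaluate` would train a model for the given pair
and return its accuracy on held-out data, and the `scores` table
would supply the values plotted in the grid graph.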
SCREENSHOTS:
A) CREATING SCHEMA IN DATA WAREHOUSE:
B) IMPORTING THE DATA INTO HDFS:
C) LINEAR KERNEL GRAPH:
E) RADIAL KERNEL GRAPH:
F) RADIAL GRID GRAPH:
CONCLUSION:
The Hive tool is used for storing and retrieving large volumes
of data at higher speed. Compared to other data mining and
cloud methodologies, the Hive tool can be used to process and
store the exact data in a large database. RStudio is used to
provide the statistical report by analysing the data in the
database as per the user's requirement. PSO with the SVM
algorithm improves the throughput efficiency.
FUTURE ENHANCEMENT:
In this paper the analysis is performed using the Hive tool,
and the statistical report is produced using the R software
and the R language. The statistical report provides
positive and negative values in the database. In future,
prediction can be done using these values. Such a prediction
indicates what future problems may arise, with the help of
past analysed data. New algorithms can be derived to improve
the efficiency parameter, i.e. accuracy, and also to reduce
the time taken to retrieve data from the database.
REFERENCES:
[1] Simon Fong, Raymond Wong, V. Vasilakos, "Accelerated PSO
swarm search feature selection for data stream mining big
data", IEEE Transactions on Data Engineering, Vol. 10, No. 7,
July 2016.
[2] Ge Rietai, Gao Jing, "Improved PSO algorithm for energy
saving research in the double layer management mode of the
cloud platform", Cloud Computing and Big Data Analysis (2016).
[3] Jun Liu, Tianyun Shi, Ping Li, "Optimal cloud storage
problem in the distributed cloud data centres by the discrete
PSO algorithm", Institute of Computing Technologies, China
(2015).
[4] S. Fong, J. Liang, R. Wong, M. Ghanavati, "A novel feature
selection by clustering coefficients of variations", 2014
Ninth International Conference on Digital Information
Management (ICDIM), Sept. 29, 2014, pp. 205-213.
[5] Gong Jun Yan, Ding Wen, Stephan Olariu, Michael C. Weigle,
"Security challenges in vehicular cloud computing", IEEE
Transactions on Intelligent Transportation Systems, Vol. 14,
No. 1, March 2013.

More Related Content

What's hot

A Survey on Batch Auditing Systems for Cloud Storage
A Survey on Batch Auditing Systems for Cloud StorageA Survey on Batch Auditing Systems for Cloud Storage
A Survey on Batch Auditing Systems for Cloud Storage
IRJET Journal
 
A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...
IJECEIAES
 
data Fusion and log correlation
data Fusion and log correlationdata Fusion and log correlation
data Fusion and log correlation
Mahdi Sayyad
 
Peer-to-Peer Data Sharing and Deduplication using Genetic Algorithm
Peer-to-Peer Data Sharing and Deduplication using Genetic AlgorithmPeer-to-Peer Data Sharing and Deduplication using Genetic Algorithm
Peer-to-Peer Data Sharing and Deduplication using Genetic Algorithm
IRJET Journal
 
Qo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentQo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentAlexander Decker
 
Parallel and distributed system projects for java and dot net
Parallel and distributed system projects for java and dot netParallel and distributed system projects for java and dot net
Parallel and distributed system projects for java and dot net
redpel dot com
 
IRJET-Auditing and Resisting Key Exposure on Cloud Storage
IRJET-Auditing and Resisting Key Exposure on Cloud StorageIRJET-Auditing and Resisting Key Exposure on Cloud Storage
IRJET-Auditing and Resisting Key Exposure on Cloud Storage
IRJET Journal
 
IRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop FrameworkIRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET Journal
 
FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management
IJECEIAES
 
Target Response Electrical usage Profile Clustering using Big Data
Target Response Electrical usage Profile Clustering using Big DataTarget Response Electrical usage Profile Clustering using Big Data
Target Response Electrical usage Profile Clustering using Big Data
IRJET Journal
 
Journals analysis ppt
Journals analysis pptJournals analysis ppt
Journals analysis ppt
Muhammad Heikal
 
Differentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various ParametersDifferentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various Parameters
iosrjce
 
Use of genetic algorithm for
Use of genetic algorithm forUse of genetic algorithm for
Use of genetic algorithm for
ijitjournal
 
High performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayesHigh performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayes
eSAT Journals
 
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLELA TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
Jenny Liu
 
SECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDS
SECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDSSECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDS
SECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDS
Gyan Prakash
 
Efficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data ProcessingEfficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data Processing
IRJET Journal
 
Ay4201347349
Ay4201347349Ay4201347349
Ay4201347349
IJERA Editor
 

What's hot (19)

A Survey on Batch Auditing Systems for Cloud Storage
A Survey on Batch Auditing Systems for Cloud StorageA Survey on Batch Auditing Systems for Cloud Storage
A Survey on Batch Auditing Systems for Cloud Storage
 
A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...A time efficient and accurate retrieval of range aggregate queries using fuzz...
A time efficient and accurate retrieval of range aggregate queries using fuzz...
 
data Fusion and log correlation
data Fusion and log correlationdata Fusion and log correlation
data Fusion and log correlation
 
Peer-to-Peer Data Sharing and Deduplication using Genetic Algorithm
Peer-to-Peer Data Sharing and Deduplication using Genetic AlgorithmPeer-to-Peer Data Sharing and Deduplication using Genetic Algorithm
Peer-to-Peer Data Sharing and Deduplication using Genetic Algorithm
 
Qo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentQo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environment
 
Parallel and distributed system projects for java and dot net
Parallel and distributed system projects for java and dot netParallel and distributed system projects for java and dot net
Parallel and distributed system projects for java and dot net
 
IRJET-Auditing and Resisting Key Exposure on Cloud Storage
IRJET-Auditing and Resisting Key Exposure on Cloud StorageIRJET-Auditing and Resisting Key Exposure on Cloud Storage
IRJET-Auditing and Resisting Key Exposure on Cloud Storage
 
IRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop FrameworkIRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop Framework
 
FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management FDMC: Framework for Decision Making in Cloud for EfficientResource Management
FDMC: Framework for Decision Making in Cloud for EfficientResource Management
 
Target Response Electrical usage Profile Clustering using Big Data
Target Response Electrical usage Profile Clustering using Big DataTarget Response Electrical usage Profile Clustering using Big Data
Target Response Electrical usage Profile Clustering using Big Data
 
Journals analysis ppt
Journals analysis pptJournals analysis ppt
Journals analysis ppt
 
Differentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various ParametersDifferentiating Algorithms of Cloud Task Scheduling Based on various Parameters
Differentiating Algorithms of Cloud Task Scheduling Based on various Parameters
 
Use of genetic algorithm for
Use of genetic algorithm forUse of genetic algorithm for
Use of genetic algorithm for
 
High performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayesHigh performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayes
 
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLELA TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
 
V3 i35
V3 i35V3 i35
V3 i35
 
SECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDS
SECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDSSECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDS
SECURE & EFFICIENT AUDIT SERVICE OUTSOURCING FOR DATA INTEGRITY IN CLOUDS
 
Efficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data ProcessingEfficient Cost Minimization for Big Data Processing
Efficient Cost Minimization for Big Data Processing
 
Ay4201347349
Ay4201347349Ay4201347349
Ay4201347349
 

Similar to Svm Classifier Algorithm for Data Stream Mining Using Hive and R

Cloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic AlgorithmCloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
Cloud Computing Task Scheduling Algorithm Based on Modified Genetic Algorithm
IRJET Journal
 
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTIONSECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
SECURE FILE STORAGE IN THE CLOUD WITH HYBRID ENCRYPTION
IRJET Journal
 
IRJET- Scheduling of Independent Tasks over Virtual Machines on Computati...
IRJET-  	  Scheduling of Independent Tasks over Virtual Machines on Computati...IRJET-  	  Scheduling of Independent Tasks over Virtual Machines on Computati...
IRJET- Scheduling of Independent Tasks over Virtual Machines on Computati...
IRJET Journal
 
Departure Delay Prediction using Machine Learning.
Departure Delay Prediction using Machine Learning.Departure Delay Prediction using Machine Learning.
Departure Delay Prediction using Machine Learning.
IRJET Journal
 
Effective Information Flow Control as a Service: EIFCaaS
Effective Information Flow Control as a Service: EIFCaaSEffective Information Flow Control as a Service: EIFCaaS
Effective Information Flow Control as a Service: EIFCaaS
IRJET Journal
 
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
IRJET Journal
 
IRJET- An Efficient Data Replication in Salesforce Cloud Environment
IRJET-  	  An Efficient Data Replication in Salesforce Cloud EnvironmentIRJET-  	  An Efficient Data Replication in Salesforce Cloud Environment
IRJET- An Efficient Data Replication in Salesforce Cloud Environment
IRJET Journal
 
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET Journal
 
Smart E-Logistics for SCM Spend Analysis
Smart E-Logistics for SCM Spend AnalysisSmart E-Logistics for SCM Spend Analysis
Smart E-Logistics for SCM Spend Analysis
IRJET Journal
 
IRJET- Efficient Privacy-Preserving using Novel Based Secure Protocol in SVM
Svm Classifier Algorithm for Data Stream Mining Using Hive and R

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 04 Issue: 03 | Mar-2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1341
SVM CLASSIFIER ALGORITHM FOR DATA STREAM MINING USING HIVE AND R
Mrs. Pranamita Nanda1, B. Sandhiya2, R. Sandhiya3, A. S. Vanaja4
1Assistant Professor, 2,3,4Students, Department of Computer Science and Engineering, Velammal Institute of Technology, Ponneri, Tiruvallur.
Abstract: Big data analysis is challenging because of the large volume of data that IT deployments produce across many dimensions. To make the analysis more efficient, we use the Hive tool for query processing and RStudio to produce the statistical report. The processing load in data stream mining is reduced by the technique known as feature selection. However, when mining high-dimensional data, the search space from which an optimal feature subset is derived grows exponentially in size, leading to an intractable demand in computation. To reduce the complexity of using accelerated particle swarm optimization (APSO), we connect the data using Hadoop technology, which makes it easier to store and retrieve data in a big data environment. The dataset is analysed and the statistical report is produced with the SVM algorithm in the R software environment, which provides statistical computing and graphics. The report compares the accuracy of the linear and non-linear grids, and the one with the higher accuracy is taken as the more efficient. The final graph combines the linear and non-linear results with respect to cost and sigma, which are user-defined values. PSO with the SVM algorithm increases the performance of analysing the data.
INTRODUCTION: Handling, storing, and retrieving large volumes of data is challenging. Data stream mining is the process of extracting knowledge structures from continuous, rapid data records. A data stream is an ordered sequence of instances that, in many data stream mining applications, can be read only once or a small number of times using limited computing and storage capabilities. We therefore use a data stream mining technique for retrieving the data. To make retrieval efficient, we use the Hadoop Hive tool for query processing, which takes less time to process. Processing includes converting unstructured data into structured data by creating a schema. The Hadoop environment provides a storage layer, the Hadoop Distributed File System (HDFS), into which our database is imported with a Hive query from an external device or from an internal device such as the server or system we are working on. The keyword inpath or externalpath is used for importing data from an internal device and an external device. The data is then extracted from the database as test data and training data. The training data is existing data whose outcome is already known, and testing is carried out against it for analysis. Both the test data and the training data are used by the classification algorithm known as the Support Vector Machine (SVM). For a dataset consisting of a feature set and a label set, a classifier builds a model to predict classes. The parameter used to assess this process is accuracy: the SVM classifier evaluates the predicted data and reports the accuracy, and the result with the best accuracy is taken into consideration. EXISTING SYSTEM: The lightweight feature selection technique known as swarm search is used for classifying the dataset.
There are many feature selection techniques, such as CCV and improved PSO. The amount of data fed in is potentially infinite, and data delivery is continuous, like a high-speed train of information. Processing is therefore expected to be real-time and instantly responsive. Retrieving data from a large volume and maintaining it is difficult, and the accuracy is somewhat low, which is overcome by using the best classifier algorithm. A complication on top of quantitatively computing the non-linear relations between the feature values and the target classes is the temporal nature of such data streams: one must crunch on the data stream long enough to accurately model seasonal cycles or regular patterns, if they exist. There are no straightforward relations that can easily map the attribute data into a specific class without long-term observation. This has a considerable impact on the design of the data mining algorithm, which should be capable of just reading and forgetting the data stream.
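The "read and forget" constraint on stream processing can be illustrated with a single-pass summary. The sketch below is not taken from the paper: it assumes a numeric attribute arriving one value at a time and uses Welford's running mean and variance, a standard technique swapped in here purely for illustration. The names RunningStats and summarize are hypothetical.

```python
class RunningStats:
    """Single-pass mean and population variance (Welford's method).

    Each value is folded into three numbers and then discarded, so memory
    use stays constant however long the stream runs.
    """

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; defined as 0.0 for an empty stream.
        return self._m2 / self.n if self.n else 0.0


def summarize(stream):
    """Consume an iterable exactly once and return its running statistics."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats
```

Because summarize only iterates over its argument, it works equally well on a generator that yields records as they arrive, and nothing in it needs to revisit earlier values.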
  • 2. LITERATURE SURVEY: Although Big Data is hyped, it raises many technical challenges that confront both academic research communities and commercial IT deployment; the root sources of Big Data are founded on data streams and the curse of dimensionality. It is generally known that data sourced from data streams accumulate continuously, making traditional batch-based model induction algorithms infeasible for real-time data mining. To tackle this problem, which stems mainly from the high dimensionality and streaming format of data feeds in Big Data, a novel lightweight feature selection is proposed. The feature selection is designed particularly for mining streaming data on the fly, using an accelerated particle swarm optimization (APSO) type of swarm search that achieves enhanced analytical accuracy within reasonable processing time.
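The survey text does not spell out the APSO update rule. As a rough illustration only, the sketch below uses the commonly cited simplified APSO step, in which each particle moves towards the global best with a decaying random perturbation: x <- (1 - beta) * x + beta * g_best + alpha * noise. It minimises a toy continuous objective rather than selecting a feature subset, and the function name apso_minimize and all parameter values (alpha0, beta, gamma, bounds) are assumptions, not values from the paper.

```python
import random


def apso_minimize(objective, dim, n_particles=20, n_iters=50,
                  alpha0=0.5, beta=0.5, gamma=0.95,
                  bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective` with a simplified accelerated PSO (sketch).

    Unlike standard PSO, no velocities or personal bests are stored;
    only the global best position guides the swarm.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)]
             for _ in range(n_particles)]
    best = min(swarm, key=objective)[:]   # copy of the best initial particle
    best_val = objective(best)
    history = [best_val]

    for t in range(n_iters):
        alpha = alpha0 * (gamma ** t)     # randomness decays over time
        for p in swarm:
            for d in range(dim):
                moved = ((1.0 - beta) * p[d] + beta * best[d]
                         + alpha * rng.gauss(0.0, 1.0))
                p[d] = min(hi, max(lo, moved))   # clamp to the search box
        for p in swarm:
            val = objective(p)
            if val < best_val:            # keep the best position seen so far
                best, best_val = p[:], val
        history.append(best_val)

    return best, best_val, history
```

For feature selection, the objective would instead score a candidate feature subset, for example by the error of a classifier trained on it; that mapping is left out here because the source does not describe it.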
In that work, a collection of Big Data sets with an exceptionally large degree of dimensionality is tested with the new feature selection algorithm for performance evaluation. [1]

Energy-saving research on virtualization in cloud computing platforms shows that there are problems in the management mode of existing virtualization platforms. That mode relies on a single node that manages the whole platform and is responsible for migrating and scheduling all of the virtual machines. A double management model of virtual machines is therefore proposed to solve the single-management-node bottleneck and to limit the scope of migration. At the same time, an improved PSO algorithm is used to plan virtual machine migration. On the premise of meeting the service performance, the plan saves energy by keeping the number of booted servers to a minimum. Experiments show that the proposed management mode not only solves the bottleneck problem of a single management node but also reduces the migration scope and the difficulty of the problem. The improved PSO algorithm clearly raises the migration speed and the overall energy efficiency of the scheme. [2]

The cloud storage problem is one of the interesting and important topics in the fields of cloud computing and big data. From the viewpoint of optimization, a discrete PSO algorithm is used to handle the cloud storage problem of the distributed data centers of China's railway and to copy the data between two data centers. To achieve good performance with the smallest transmission distance, the discrete PSO algorithm essentially pairs the members of two data center sets with each other.
Numerical results show that the discrete PSO algorithm can provide a guideline for a suboptimal cloud storage strategy for China's railway when the number of distributed data centers is 15, 17 and 18. [3]

One of the challenges in inferring a classification model with good prediction accuracy is selecting the relevant features that contribute the most predictive power. Many feature selection techniques have been proposed and studied in the past, but none so far has been claimed to be the best. The cited paper proposes a novel and efficient feature selection method called Clustering Coefficients of Variation (CCV). CCV is based on a very simple variance-based principle that finds an optimal balance between generalization and overfitting. Given its simplicity of design, CCV is anticipated to be a useful alternative pre-processing method for classification, especially with datasets that are characterized by many features. [4]

In a series of recent papers, Prof. Olariu and his co-workers have promoted the vision of vehicular clouds (VCs), a nontrivial extension, along several dimensions, of conventional cloud computing. The main contribution of the cited work is to identify and analyze a number of security challenges and potential privacy threats in VCs. Although security issues have received attention in cloud computing and vehicular networks, the authors identify security challenges that are specific to VCs, e.g., authentication of high-mobility vehicles, scalability and single interface, tangled identities and locations, and the complexity of establishing trust relationships among multiple players caused by intermittent short-range communications. They also provide a security scheme that addresses several of the challenges discussed. [5]

PROPOSED SYSTEM:

We propose an approach called data stream mining using Hadoop-Hive technology. To implement big data analytics in a highly scalable manner, big data needs Hadoop to process the data.
The main research challenge here is to find the most appropriate model induction algorithm for mining data streams. As an additional requirement, arising from the possibility of embedding the data miner module into small devices, the memory footprint should be as small as possible, both to save energy and to fit into a tiny device. In other words, the learned model, probably in the form of generalized non-linear mappings from the feature values to the predicted target classes, must be compact enough to execute in a small run-time memory. No room should be wasted on storing features and relations that are insignificant or contribute little to the model accuracy. To this end, omitting feature selection is out of the question, given the number of original features extracted from the data streams. Conventional models are built on a stationary dataset, so a model update must repeat the whole training process whenever new samples arrive in order to incorporate the changing underlying patterns. In a dynamic stream processing environment, however, the data classification model has to be updated frequently.
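To make the role of feature selection concrete, the following is a minimal sketch of a swarm search in the style of accelerated PSO, where each particle is pulled toward the global best and perturbed by noise that decays over time. It is an illustration under stated assumptions, not the implementation of [1] or of this paper: the relevance scores, the penalty, the thresholding of positions into a binary mask and all parameter values are hypothetical, and a real wrapper would score each subset by classifier accuracy instead of the toy fitness used here.

```python
import random

# Hypothetical per-feature relevance scores; a real wrapper would instead
# score a subset by the accuracy of a classifier trained on it.
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02]
PENALTY = 0.3  # cost charged for every selected feature


def to_mask(position):
    """Threshold a continuous position into a binary feature mask."""
    return [1 if x > 0.5 else 0 for x in position]


def fitness(mask):
    """Toy objective: total relevance minus a per-feature penalty."""
    return sum(r - PENALTY for r, keep in zip(RELEVANCE, mask) if keep)


def apso_select(n_particles=10, n_iters=30, beta=0.5,
                alpha0=0.3, gamma=0.9, seed=0):
    rng = random.Random(seed)
    dim = len(RELEVANCE)
    swarm = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    best = max(swarm, key=lambda p: fitness(to_mask(p)))[:]
    for t in range(n_iters):
        alpha = alpha0 * gamma ** t  # noise shrinks as the search proceeds
        for particle in swarm:
            for d in range(dim):
                # Accelerated PSO step: blend with the global best, add noise.
                step = ((1 - beta) * particle[d] + beta * best[d]
                        + alpha * rng.gauss(0, 1))
                particle[d] = min(1.0, max(0.0, step))
            if fitness(to_mask(particle)) > fitness(to_mask(best)):
                best = particle[:]
    return to_mask(best)


selected = apso_select()  # list of 0/1 flags, one per feature
```

Only the features flagged 1 would be kept in the model, which is how feature selection keeps the run-time memory small.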
ARCHITECTURAL DIAGRAM AND EXPLANATION:

FIG. 2: PROPOSED ARCHITECTURE

The datasets relevant to the user's needs are retrieved from the database using a Hive query. Hive is the data warehouse used to analyse and retrieve the data. For this, the data must first be uploaded continuously into the database. The database may hold, for example, medical datasets, traffic light datasets and weather forecasting datasets; from these, the required one is retrieved using the Hive query. Each dataset has multiple fields, such as age, name and sex, and retrieval is based on these fields. The retrieved dataset is analysed and divided into two segments, the training dataset and the test dataset, with the training dataset being the larger of the two. The training dataset undergoes a filtering process, while the test dataset undergoes classification, in which the data are sliced. Both the sliced data and the training data then enter the SVM. The SVM algorithm is used for binary and multi-class problems and for anomaly detection. A hyperplane divides the classes, and the critical points nearest to it are known as support vectors. In the simplest case, with one support vector per class, the separation is the perpendicular bisector of the line joining these two support vectors. These data are entered into the R input frames, which are used to extract the data through statistical computing and graphics and to produce the statistical report. The statistical report is produced for both the linear and the non-linear case and gives the accuracy of each. The linear and non-linear accuracies are then compared to assess efficiency. Finally, the grid analysis is performed, which combines both accuracies and produces the graph.
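The perpendicular-bisector construction mentioned above can be worked through for the two-support-vector case. This is a minimal sketch assuming exactly one support vector per class; the function names and the sample points are hypothetical, and a trained SVM with more support vectors would obtain its hyperplane by optimization instead.

```python
def bisecting_hyperplane(positive_sv, negative_sv):
    """Return (w, b) of the hyperplane w.x + b = 0 that perpendicularly
    bisects the segment joining the two support vectors."""
    # The normal vector points from the negative to the positive support vector.
    w = [p - n for p, n in zip(positive_sv, negative_sv)]
    midpoint = [(p + n) / 2 for p, n in zip(positive_sv, negative_sv)]
    # Choose b so that the midpoint lies exactly on the hyperplane.
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, b


def classify(point, w, b):
    """Label a point by the side of the hyperplane it falls on."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    return 1 if score >= 0 else -1


# Hypothetical support vectors: one positive, one negative.
w, b = bisecting_hyperplane([3.0, 3.0], [1.0, 1.0])
label_far_positive = classify([4.0, 4.0], w, b)
label_far_negative = classify([0.0, 1.0], w, b)
```

Here the normal vector is (2, 2) and the offset is -8, so the boundary passes through the midpoint (2, 2) of the two support vectors.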
With the grid graph, the positive and negative data are identified: positive data are safe, whereas negative data are unsafe. The approach increases efficiency and takes less time to analyse and retrieve the data. It improves the data processing speed, can analyse a large volume of data in less time than other tools, and provides large-scale integration of data.

MODULES:

• Create schema in data warehouse
• Importing the data to HDFS
• Extracting the data
• Performance evaluation
• Statistical report

MODULE DESCRIPTION:

A) CREATE SCHEMA IN DATA WAREHOUSE:

In the database, the data are in an unstructured format, which is unreadable. The database is uploaded to the system and, to process the unstructured data in Hive, a schema is created. The schema is built from the attributes, which are treated as fields in Hive. These fields can be used to divide the dataset into test data and training data, where the test data are the data still to be predicted and the training data are the data whose classes are already known.

B) IMPORTING THE DATA INTO HDFS:

The Hadoop Distributed File System is designed to store very large datasets and to stream those datasets at high bandwidth to user applications. The database is converted from unstructured to structured format by creating the schema, and is then loaded into HDFS. In Hive's LOAD DATA statement, the keywords LOCAL INPATH are used when the file is on the local file system, whereas INPATH alone is used when the file is already in HDFS. The keyword OVERWRITE replaces the old data with the new data.

C) EXTRACTING THE DATA:

The Hive query is used for data summarization, query and analysis. It gives an SQL-like interface for querying data stored in various databases and file systems that integrate with Hadoop. Hive provides the necessary SQL abstraction to integrate HiveQL queries into the underlying Java API without the need to implement queries in the low-level API. Hive supports easy portability of SQL-based applications to Hadoop.
It provides the sliced data from the datasets that are relevant to the user query. Using Hive, the data are retrieved faster, and a large volume of data can be handled. As the database is stored on the system and the processing also takes place on the same system, the system acts as both client and server.

[Figure labels: DATA STORAGE, DATA SETS, TRAINED DATA, TEST DATA, SVM, STATISTICAL REPORT]
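The sequence of modules A to C (create a schema, load the data, extract a slice) can be pictured with a small runnable sketch. Note the assumption: Python's built-in sqlite3 module is used purely as a stand-in because its SQL resembles HiveQL; it is not Hive, and the table name, column names and rows below are hypothetical.

```python
import sqlite3

# sqlite3 is only a stand-in here: it is NOT Hive, but its SQL is close
# enough to HiveQL to show the schema -> load -> query sequence.
conn = sqlite3.connect(":memory:")

# Module A: create the schema (table and column names are hypothetical).
conn.execute("CREATE TABLE patients (name TEXT, age INTEGER, sex TEXT)")

# Module B: load rows (Hive would bulk-load a file with LOAD DATA instead).
rows = [
    ("Asha", 34, "F"),
    ("Ravi", 61, "M"),
    ("Meena", 47, "F"),
    ("Kumar", 29, "M"),
]
conn.executemany("INSERT INTO patients VALUES (?, ?, ?)", rows)

# Module C: extract only the slice relevant to the user's query.
sliced = conn.execute(
    "SELECT name, age FROM patients WHERE sex = ? AND age > ? ORDER BY age",
    ("F", 30),
).fetchall()
conn.close()
```

The query returns only the rows and fields the user asked for, which is the "sliced data" that would then be passed on for classification.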
D) PERFORMANCE EVALUATION:

In this approach, the Support Vector Machine (SVM) algorithm is used for analysing and retrieving the data. It is a supervised machine learning (ML) approach that, in its basic form, learns a linear decision boundary. It helps reduce both time complexity and code complexity. RStudio adapts to many types of data and produces the results efficiently. Two SVM kernels are used here: linear and radial. Accuracy is the parameter determined using the SVM algorithm. The linear kernel gives one accuracy value and the radial kernel gives another; of the two, the higher accuracy is considered the more efficient.

E) STATISTICAL REPORT:

The statistical report is produced in RStudio according to the user's needs, with the R programming language used for analysing the data. RStudio provides a graphical representation of the input data. The linear and radial results are combined to produce the grid graph, which helps to identify the highly positive and negative values.

SCREENSHOTS:

A) CREATING SCHEMA IN DATA WAREHOUSE:
B) IMPORTING THE DATA INTO HDFS:
C) LINEAR KERNEL GRAPH:
E) RADIAL KERNEL GRAPH:
F) RADIAL GRID GRAPH:

CONCLUSION:

The Hive tool is used for storing and retrieving large volumes of data at high speed. Compared with other data mining and cloud methodologies, it can process and store exact data in a large database. RStudio is used to produce the statistical report by analysing the data in the database according to the user's requirements. PSO combined with the SVM algorithm improves throughput efficiency.
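The comparison of linear and radial accuracy described under Performance Evaluation can be sketched as follows. The labels and predictions are hypothetical values invented for illustration (+1 for positive, -1 for negative); they are not results from this paper, and the helper name accuracy is likewise an assumption.

```python
def accuracy(predicted, actual):
    """Fraction of test samples whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and label lists must have equal length")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)


# Hypothetical test labels and kernel outputs, for illustration only.
actual = [1, -1, 1, 1, -1, -1, 1, -1]
linear_predicted = [1, -1, -1, 1, -1, 1, 1, -1]
radial_predicted = [1, -1, 1, 1, -1, 1, 1, -1]

scores = {
    "linear": accuracy(linear_predicted, actual),
    "radial": accuracy(radial_predicted, actual),
}

# The kernel with the higher accuracy is the one treated as more efficient.
best_kernel = max(scores, key=scores.get)
```

With these made-up predictions the linear kernel scores 0.75 and the radial kernel 0.875, so the radial kernel would be preferred.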
FUTURE ENHANCEMENT:

In this paper, the analysis is performed using the Hive tool, and the statistical report is produced using R software, in which the R language is used. The statistical report gives the positive and negative values in the database. In future, these values can be used for prediction, indicating what problems may arise on the basis of previously analysed data. New algorithms could be derived to improve the efficiency parameter, i.e. accuracy, and to reduce the time taken to retrieve data from the database.

REFERENCES:

[1] Simon Fong, Raymond Wong, V. Vasilakos, "Accelerated PSO swarm search feature selection for data stream mining big data", IEEE Transaction on Data Engineering, Vol. 10, No. 7, July 2016.
[2] Ge Rietai, Gao Jing, "Improved PSO algorithm for energy saving research in the double layer management mode of the cloud platform", Cloud Computing and Big Data Analysis (2016).
[3] Jun Liu, Tianyun Shi, Ping Li, "Optimal cloud storage problem in the distributed cloud data centres by the discrete PSO algorithm", Institute of Computing Technologies, China (2015).
[4] S. Fong, J. Liang, R. Wong, M. Ghanavati, "A novel feature selection by clustering coefficients of variations", 2014 Ninth International Conference on Digital Information Management (ICDIM), Sept. 29, 2014, pp. 205-213.
[5] Gongjun Yan, Ding Wen, Stephan Olariu, Michele C. Weigle, "Security challenges in vehicular cloud computing", IEEE Transaction on Intelligent Transportation Systems, Vol. 14, No. 1, March 2013.