INSY 5339 Principles of Business Data Mining Dr.Sikora
PREDICTIVE ANALYSIS ON
ACTIVITY RECOGNITION
SYSTEM
PROJECT REPORT - INSY 5339 – PRINCIPLES OF BUSINESS
DATA MINING
TABLE OF CONTENTS
1. DATASET INTRODUCTION
1.1 DATA MINING INTRODUCTION
1.2 OBJECTIVE
1.3 DATA BACKGROUND
1.4 DATASET INFORMATION
2. DATA PREPARATION
2.1 DATA CLEANING
3. ALGORITHMS USED
3.1 ACCURACY ON FULL TRAINING SET
3.2 ACCURACY ON CROSS FOLDS
3.3 ACCURACY ON PERCENTAGE SPLIT
4. EXPERIMENTAL DESIGN
4.1 RESULTS FOR EACH CLASSIFIER
4.2 RELATIVE ACCURACY OF EXPERIMENTAL DESIGN
5. ROC CURVES
5.1 ROC CURVE – KNOWLEDGE FLOW
5.2 SINGLE CLASS VS 3 CLASSIFIERS
5.3 ALL CLASSES VS ALL CLASSIFIERS
6. PRINCIPAL COMPONENT ANALYSIS
7. CONCLUSION
8. REFERENCES
1. DATA SET INFORMATION:
1.1 DATA MINING INTRODUCTION:
Data mining is the nontrivial extraction of implicit, previously unknown, and
potentially useful information from data. It is an interdisciplinary subfield of computer
science. The overall goal of the data mining process is to extract information from a data set
and transform it into an understandable structure for further use. Data mining can also be
defined as the semi-automatic or automatic analysis of large quantities of data to extract
previously unknown, interesting patterns such as groups of data records (cluster analysis),
unusual records (anomaly detection), and dependencies (association rule mining, sequential
pattern mining). This usually involves using database techniques such as spatial indices.
These patterns can then be seen as a kind of summary of the input data, and may be used in
further analysis or, for example, in machine learning and predictive analytics.
1.2 OBJECTIVE:
The main objective of this project is to determine the type of activity performed by the
user (bending, walking, sitting, standing, lying, cycling). This is determined using the
Activity Recognition system based on Multisensor data fusion (AReM) sensors. Our other
objective is to build and train the model using training and test data sets. The output
obtained in these steps will be used to judge how credible the AReM sensors are for
further experiments in activity sensing.
1.3. DATA BACKGROUND:
Activity Recognition (AR) is an emerging research topic, which is founded on
established research fields such as ubiquitous computing, context-aware computing and
multimedia, and machine learning for pattern recognition. Recognizing everyday life
activities is a challenging application in pervasive computing, with a lot of interesting
developments in the health care domain, the human behavior modeling domain and the
human-machine interaction domain. Inferring the activity of users in their own
domestic environments becomes even more useful in the Ambient Assisted Living (AAL)
scenario, where facilities provide assistance and care for the elderly and knowledge of
their daily activities can help ensure safety and successful aging.
From the point of view of the deployment of activity recognition solutions, we recognize
three main approaches. The first kind of solution generally uses wearable sensors (embedding
accelerometers, or transducers for physiological measures) that directly measure
the user's movements. The disadvantage of this approach is that wearable devices can
be intrusive for the user, even though recent advances in embedded systems
keep making sensors smaller. Solutions that avoid the use of wearable
devices are instead motivated by the need for less intrusive activity recognition systems.
Among these solutions, those based on cameras are probably the most common. These form
the second approach. More recently, a third approach, a new generation of non-wearable
solutions, has been emerging. These solutions exploit the implicit alteration of the wireless
channel caused by the movements of the user, which is measured by devices placed in the
environment that record the Received Signal Strength (RSS) of the beacon packets they
exchange among themselves.
1.4. DATA SET INFORMATION:
This dataset contains temporal data from a Wireless Sensor Network (WSN) worn by an actor
performing the activities: bending, cycling, lying down, sitting, standing, walking. The
classification task consists of predicting the activity performed by the user from the
time series generated by the WSN. In our activity recognition system we use
information coming from the implicit alteration of the wireless channel due to the movements
of the user. The devices measure the RSS of the beacon packets they exchange among
themselves in the WSN. They are placed on the user's chest and ankles. For the purpose of
communications, the beacon packets are exchanged by using a simple virtual token protocol
that completes its execution in a time slot of 50 milliseconds.
From the raw data we extract time-domain features to compress the time series and
reduce noise and correlations. We choose an epoch time of 250 milliseconds. In
each such time slot we process 5 samples of RSS (sampled at 20 Hz) for each of the three
pairs of WSN nodes (i.e. Chest-Right Ankle, Chest-Left Ankle, Right Ankle-Left Ankle). The
features include the mean value and standard deviation for each reciprocal RSS reading
from worn WSN sensors. For each activity 15 temporal sequences of input RSS data are
present. Each sequence contains 480 instances, for a total of 42240 instances. The
positions of the sensor nodes, with their identifiers, are shown in the figure.
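The feature extraction described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the RSS readings are made up, the function name is hypothetical, and we assume the population standard deviation, since the dataset description does not say which form was used.

```python
from statistics import mean, pstdev

SAMPLES_PER_EPOCH = 5  # one 250 ms epoch at a 20 Hz sampling rate


def epoch_features(rss, samples_per_epoch=SAMPLES_PER_EPOCH):
    """Return one (mean, standard deviation) pair per complete epoch."""
    features = []
    # Step through the series one full epoch at a time; a trailing partial epoch is dropped.
    for start in range(0, len(rss) - samples_per_epoch + 1, samples_per_epoch):
        window = rss[start:start + samples_per_epoch]
        # Assumption: population standard deviation (pstdev), not the sample form.
        features.append((mean(window), pstdev(window)))
    return features


# Hypothetical RSS readings for one node pair: two epochs of five samples each.
chest_right_ankle = [40, 42, 41, 39, 38, 35, 35, 36, 34, 35]
print(epoch_features(chest_right_ankle))
```

Applying the same function to each of the three node pairs would give the six numeric attributes (three means and three standard deviations) of one instance.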
2. DATA PREPARATION:
2.1 DATA CLEANING:
Data cleansing, data cleaning or data scrubbing is the process of detecting and correcting (or
removing) corrupt or inaccurate records from a data set, table, or database. Used mainly in
databases, the term refers to identifying incomplete, incorrect, inaccurate, or irrelevant
parts of the data and then replacing, modifying, or deleting this dirty or coarse data.
1. Merging all the data sets:
The data set was split into several files according to the activity
performed. Without changing the attributes, we merged all the files into one
Excel dataset and added one extra attribute (the class attribute), named Activity, to
categorize the different actions performed.
2. Macros:
Definition: An Excel macro is a set of programming instructions, stored as VBA
code, that removes the need to repeat the steps of commonly performed tasks
by hand. These repetitive tasks might involve
complex calculations that require the use of formulas, or they might be simple
formatting tasks, such as adding number formatting to new data or applying cell
and worksheet formats such as borders and shading. The macros have been used
to find the missing values and amend them.
The Final Class Attribute values are categorized as Bending1, Bending2, Cycling,
Lying, Sitting, Standing and Walking.
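The two preparation steps above were carried out in Excel. As an illustration only, the same idea can be sketched in Python. The file contents, column names and function names below are hypothetical, and replacing a missing value by the column mean is our assumption: the report does not state how the macros amended missing values.

```python
import csv
import io
from statistics import mean

# Hypothetical per-activity file contents (the real files hold more columns and rows).
FILES = {
    "Cycling": "avg_rss12,var_rss12\n36.0,2.5\n,3.0\n",
    "Walking": "avg_rss12,var_rss12\n30.0,4.0\n32.0,\n",
}


def merge_with_activity(files):
    """Concatenate the rows of every file and tag each row with its activity."""
    merged = []
    for activity, text in files.items():
        for row in csv.DictReader(io.StringIO(text)):
            row["Activity"] = activity  # the added class attribute
            merged.append(row)
    return merged


def fill_missing_with_mean(rows, column):
    """Replace empty cells of one numeric column by the mean of the known values."""
    known = [float(r[column]) for r in rows if r[column] != ""]
    fill = mean(known)  # assumption: mean imputation
    for r in rows:
        r[column] = float(r[column]) if r[column] != "" else fill
    return rows


rows = merge_with_activity(FILES)
for col in ("avg_rss12", "var_rss12"):
    fill_missing_with_mean(rows, col)
print(rows)
```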
3. ALGORITHMS USED FOR THIS EXPERIMENT:
After trying out various algorithms, the following algorithms have yielded the best results for
our experiment.
J48
J48 is Weka's implementation of C4.5, an algorithm developed by Ross Quinlan
that generates a decision tree. C4.5 is an extension of Quinlan's earlier ID3
algorithm. The decision trees generated by C4.5 can be used for classification,
and for this reason C4.5 is often referred to as a statistical classifier.
Naïve Bayes
A naive Bayes classifier is an algorithm that uses Bayes' theorem to classify
objects. Naive Bayes classifiers assume strong, or naive, independence between
attributes of data points. These classifiers are widely used for machine learning
because they are simple to implement.
Decision Table
Decision Table belongs to the family of supervised learning algorithms. It builds
a simple decision table majority classifier: it selects a subset of the
attributes and stores, for each combination of their values seen in the training
data, the majority class. A new instance is classified by looking up its matching
entry; if there is none, the overall majority class is predicted.
Random Tree
Random Tree is a supervised classifier that builds a single decision tree,
considering only a randomly chosen subset of the attributes at each node and
performing no pruning. It is the kind of tree that Random Forest combines,
through bagging, into an ensemble of many individual learners.
OneR
OneR, short for "One Rule", is a simple, yet accurate, classification algorithm that
generates one rule for each predictor in the data, then selects the rule with the
smallest total error as its "one rule". To create a rule for a predictor, we construct
a frequency table for each predictor against the target. It has been shown that
OneR produces rules only slightly less accurate than state-of-the-art classification
algorithms while producing rules that are simple for humans to interpret.
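The OneR procedure described above can be sketched as follows. This is a minimal illustration on made-up nominal data, with hypothetical attribute names; it assumes the attributes are already discretised, whereas our sensor attributes are numeric and would have to be binned first.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (best_attribute, rule, errors) for nominal data given as dicts."""
    best = None
    for attr in rows[0]:
        if attr == target:
            continue
        # Frequency table: attribute value -> counts of each target class.
        table = defaultdict(Counter)
        for row in rows:
            table[row[attr]][row[target]] += 1
        # The rule maps every attribute value to its most frequent class.
        rule = {value: counts.most_common(1)[0][0] for value, counts in table.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        # Keep the attribute whose rule makes the fewest training errors.
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical discretised readings, not taken from the project data.
data = [
    {"signal": "high", "spread": "low", "activity": "Sitting"},
    {"signal": "high", "spread": "high", "activity": "Sitting"},
    {"signal": "low", "spread": "high", "activity": "Walking"},
    {"signal": "low", "spread": "low", "activity": "Walking"},
    {"signal": "low", "spread": "high", "activity": "Walking"},
]
attribute, rule, errors = one_r(data, "activity")
print(attribute, rule, errors)
```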
ZeroR
ZeroR is the simplest classification method which relies on the target and ignores
all predictors. ZeroR classifier simply predicts the majority category (class).
Although ZeroR has no predictive power, it is useful for determining a
baseline performance as a benchmark for other classification methods.
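As a minimal sketch of this baseline, ZeroR only needs the class labels. The label counts below are made up for illustration and are not the class distribution of our data set.

```python
from collections import Counter


def zero_r(labels):
    """Return the majority class and the accuracy (%) of always predicting it."""
    majority, count = Counter(labels).most_common(1)[0]
    return majority, 100.0 * count / len(labels)


# Hypothetical labels: 3 Cycling, 2 Lying, 2 Walking.
labels = ["Cycling"] * 3 + ["Lying"] * 2 + ["Walking"] * 2
print(zero_r(labels))
```

The accuracy returned equals the share of the largest class, which is why the ZeroR figure serves as the floor that every other classifier should beat.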
3.1 ACCURACY ON FULL TRAINING SET:
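In this step we trained and tested every algorithm on the full training set to determine
which one is best suited to our analysis. Because the same instances are used for training
and testing, these figures are optimistic estimates of accuracy.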
Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
ZeroR            17.0459                              82.9541
OneR             48.0383                              51.9615
Naïve Bayes      64.3386                              35.6614
J48              87.8974                              12.1026
Decision Table   71.5121                              28.4879
Random Tree      98.8612                              1.1388
3.2 ACCURACY ON CROSS FOLDS:
Here we used 10-fold cross-validation to test all the algorithms. The results are given
both as a table and as a graph.
Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
ZeroR            17.0459                              82.9541
OneR             47.2881                              52.7119
Naïve Bayes      64.3955                              35.6045
J48              78.9578                              21.0422
Decision Table   65.314                               34.686
Random Tree      75.3687                              24.6313
[Figure: bar chart of correctly and incorrectly classified instances (%) for ZeroR, OneR, Naïve Bayes, J48, Decision Table and Random Tree under 10-fold cross-validation]
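The way 10-fold cross-validation partitions the instances can be sketched as below. This is an illustrative, unstratified split with a hypothetical function name; Weka performs the split itself, so the code is not what produced the figures above.

```python
import random


def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Train on the other k - 1 folds, test on the held-out fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
print([len(test) for _, test in splits])
```

Every instance is used for testing exactly once, and the reported accuracy is computed over all ten held-out folds.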
3.3 ACCURACY ON PERCENTAGE SPLIT:
In this step, we used a percentage split of 66% (66% of the instances for training and the
remaining 34% for testing) to estimate the accuracy of each algorithm. The results are
displayed both as a table and as a graph.
Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
ZeroR            17.0253                              82.9747
OneR             47.3226                              52.6774
Naïve Bayes      63.6864                              36.3136
J48              77.5503                              22.4497
Decision Table   65.2253                              34.7747
Random Tree      72.9058                              27.0942
[Figure: bar chart of correctly and incorrectly classified instances (%) for ZeroR, OneR, Naïve Bayes, J48, Decision Table and Random Tree under a 66% percentage split]
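A percentage split can be sketched as follows. The function name and the rounding of the cut point are our assumptions for illustration; Weka performs the split internally.

```python
import random


def percentage_split(instances, train_percent=66, seed=1):
    """Shuffle with the given seed, then cut the list into train and test parts."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]


train, test = percentage_split(range(100), train_percent=66, seed=1)
print(len(train), len(test))  # 66 34
```

The seed decides which instances fall on each side of the cut, so two runs give different test sets only if the seed actually reaches the shuffle.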
Based on the various tests, we concluded that the following three algorithms predict the
CLASS attribute most accurately:
• J48
• Decision Table
• Random Tree
We confirmed the above results with the Receiver Operating Characteristic (ROC) graph,
which plots the true positive rate against the false positive rate, for the 6 algorithms, and
found that these 3 algorithms have both higher accuracy and a larger area under the curve.
4. EXPERIMENTAL DESIGN:
A full factorial experiment is an experiment whose design consists of two or more factors,
each with a discrete set of possible levels, and whose experimental units take on all
possible combinations of these levels across all factors. Such an experiment allows the
investigator to study the effect of each factor on the response variable.
We selected the following classifiers for our experimental design:
• J48
• Decision Table
• Random Tree
[Figure: bar chart of correctly and incorrectly classified instances (%) for ZeroR, OneR, Naïve Bayes, J48, Decision Table and Random Tree]
Four Cell Experimental Design:
The design consists of 2 factors, each with 2 levels:
• Noise: without noise, with 10% noise
• Percentage split: 66%, 75%
                 % Split - 66%   % Split - 75%
Without Noise    C1              C3
With Noise       C2              C4
The four cells are:
• C1 – Percentage split 66% without noise
• C2 – Percentage split 66% with noise
• C3 – Percentage split 75% without noise
• C4 – Percentage split 75% with noise
Total Number of Experiments = Number of conditions × Number of classifiers × 10 runs =
4 × 3 × 10 = 120 runs
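The run count above can be checked by enumerating the design. This is a minimal sketch: the constant and function names are ours, and the use of the population variance is an assumption, since the report does not say which form of variance was computed.

```python
from itertools import product
from statistics import mean, pvariance

SPLITS = (66, 75)                        # levels of the percentage split factor
NOISE = ("without noise", "with noise")  # levels of the noise factor
CLASSIFIERS = ("J48", "Decision Table", "Random Tree")
SEEDS = range(1, 11)                     # ten runs per cell and classifier

# Every run is one (split, noise, classifier, seed) combination.
runs = list(product(SPLITS, NOISE, CLASSIFIERS, SEEDS))
print(len(runs))  # 120


def summarise(accuracies):
    """Average and population variance of the accuracies from one cell's runs."""
    return mean(accuracies), pvariance(accuracies)
```

Calling `summarise` on the ten accuracies of one classifier in one cell gives the AVERAGE and VARIANCE rows reported for that cell.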
4.1 RESULTS FOR EACH CLASSIFIER:
The tables below cover the 12 possible combinations of our 4 conditions with the 3 selected
classifiers. We ran each of these combinations 10 times and computed the average and
variance of the accuracy:
For each of the three classifiers (J48, Decision Table, Random Tree), the four experiments are:
E1 = Performance of the classifier with attributes without noise + percentage split of 66%:34%
E2 = Performance of the classifier with attributes with noise + percentage split of 66%:34%
E3 = Performance of the classifier with attributes without noise + percentage split of 75%:25%
E4 = Performance of the classifier with attributes with noise + percentage split of 75%:25%
J48 – In J48, we ran four experiments, E1 to E4:
E1- 66 -34 split, without noise
E2- 66-34 split, with noise
E3- 75-25 split, without noise
E4- 75-25 split, with noise
E1- 66 -34 split, without noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 J48 66 77.5503
2 J48 66 77.5503
3 J48 66 77.5503
4 J48 66 77.5503
5 J48 66 77.5503
6 J48 66 77.5503
7 J48 66 77.5503
8 J48 66 77.5503
9 J48 66 77.5503
10 J48 66 77.5503
AVERAGE 77.5503
VARIANCE 0
E2 – 66-34 split, with noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 J48 66 68.4145
2 J48 66 68.4145
3 J48 66 68.4145
4 J48 66 68.4145
5 J48 66 68.4145
6 J48 66 68.4145
7 J48 66 68.4145
8 J48 66 68.4145
9 J48 66 68.4145
10 J48 66 68.4145
AVERAGE 68.4145
VARIANCE 0
E3- 75-25 split, without noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 J48 75 78.3712
2 J48 75 78.3712
3 J48 75 78.3712
4 J48 75 78.3712
5 J48 75 78.3712
6 J48 75 78.3712
7 J48 75 78.3712
8 J48 75 78.3712
9 J48 75 78.3712
10 J48 75 78.3712
AVERAGE 78.3712
VARIANCE 0
E4- 75-25 split, with noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 J48 75 68.2008
2 J48 75 68.2008
3 J48 75 68.2008
4 J48 75 68.2008
5 J48 75 68.2008
6 J48 75 68.2008
7 J48 75 68.2008
8 J48 75 68.2008
9 J48 75 68.2008
10 J48 75 68.2008
AVERAGE 68.2008
VARIANCE 0
DECISION TABLE - In Decision Table, we ran four experiments, E1 to E4:
E1- 66 -34 split, without noise
E2- 66-34 split, with noise
E3- 75-25 split, without noise
E4- 75-25 split, with noise
E1- 66 -34 split, without noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Decision Table 66 65.2253
2 Decision Table 66 65.2253
3 Decision Table 66 65.2253
4 Decision Table 66 65.2253
5 Decision Table 66 65.2253
6 Decision Table 66 65.2253
7 Decision Table 66 65.2253
8 Decision Table 66 65.2253
9 Decision Table 66 65.2253
10 Decision Table 66 65.2253
AVERAGE 65.2253
VARIANCE 0
[Figure: J48 accuracy (%) by percentage split (66%, 75%), with noise and without noise]
E2 – 66-34 split, with noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Decision Table 66 60.3161
2 Decision Table 66 60.3161
3 Decision Table 66 60.3161
4 Decision Table 66 60.3161
5 Decision Table 66 60.3161
6 Decision Table 66 60.3161
7 Decision Table 66 60.3161
8 Decision Table 66 60.3161
9 Decision Table 66 60.3161
10 Decision Table 66 60.3161
AVERAGE 60.3161
VARIANCE 0
E3- 75-25 split, without noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Decision Table 75 64.1951
2 Decision Table 75 64.1951
3 Decision Table 75 64.1951
4 Decision Table 75 64.1951
5 Decision Table 75 64.1951
6 Decision Table 75 64.1951
7 Decision Table 75 64.1951
8 Decision Table 75 64.1951
9 Decision Table 75 64.1951
10 Decision Table 75 64.1951
AVERAGE 64.1951
VARIANCE 0
E4- 75-25 split, with noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Decision Table 75 60.4924
2 Decision Table 75 60.4924
3 Decision Table 75 60.4924
4 Decision Table 75 60.4924
5 Decision Table 75 60.4924
6 Decision Table 75 60.4924
7 Decision Table 75 60.4924
8 Decision Table 75 60.4924
9 Decision Table 75 60.4924
10 Decision Table 75 60.4924
AVERAGE 60.4924
VARIANCE 0
[Figure: Decision Table accuracy (%) by percentage split (66%, 75%), with noise and without noise]
RANDOM TREE - In Random Tree, we ran four experiments, E1 to E4:
E1- 66 -34 split, without noise
E2- 66-34 split, with noise
E3- 75-25 split, without noise
E4- 75-25 split, with noise
E1- 66 -34 split, without noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Random Tree 66 72.9058
2 Random Tree 66 72.9058
3 Random Tree 66 72.9058
4 Random Tree 66 72.9058
5 Random Tree 66 72.9058
6 Random Tree 66 72.9058
7 Random Tree 66 72.9058
8 Random Tree 66 72.9058
9 Random Tree 66 72.9058
10 Random Tree 66 72.9058
AVERAGE 72.9058
VARIANCE 0
E2 – 66-34 split, with noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Random Tree 66 59.6546
2 Random Tree 66 59.6546
3 Random Tree 66 59.6546
4 Random Tree 66 59.6546
5 Random Tree 66 59.6546
6 Random Tree 66 59.6546
7 Random Tree 66 59.6546
8 Random Tree 66 59.6546
9 Random Tree 66 59.6546
10 Random Tree 66 59.6546
AVERAGE 59.6546
VARIANCE 0
E3- 75-25 split, without noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Random Tree 75 73.3807
2 Random Tree 75 73.3807
3 Random Tree 75 73.3807
4 Random Tree 75 73.3807
5 Random Tree 75 73.3807
6 Random Tree 75 73.3807
7 Random Tree 75 73.3807
8 Random Tree 75 73.3807
9 Random Tree 75 73.3807
10 Random Tree 75 73.3807
AVERAGE 73.3807
VARIANCE 0
E4- 75-25 split, with noise
SEED  CLASSIFIER  PERCENTAGE SPLIT  ACCURACY
1 Random Tree 75 59.4034
2 Random Tree 75 59.4034
3 Random Tree 75 59.4034
4 Random Tree 75 59.4034
5 Random Tree 75 59.4034
6 Random Tree 75 59.4034
7 Random Tree 75 59.4034
8 Random Tree 75 59.4034
9 Random Tree 75 59.4034
10 Random Tree 75 59.4034
AVERAGE 59.4034
VARIANCE 0
4.2 RELATIVE ACCURACY OF EXPERIMENTAL DESIGN:
[Figure: Random Tree accuracy (%) by percentage split (66%, 75%), with noise and without noise]
[Figure: "Accuracy vs Factors", accuracy (%) of J48, Decision Table and Random Tree across the four conditions 1 to 4]
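To compare the cells numerically, the short sketch below recomputes the effect of noise from the average accuracies reported in section 4.1. The dictionary layout and variable names are ours; the numbers are the AVERAGE rows of the tables above.

```python
# Average accuracies (%) from section 4.1, keyed by (classifier, percentage split).
WITHOUT_NOISE = {
    ("J48", 66): 77.5503, ("J48", 75): 78.3712,
    ("Decision Table", 66): 65.2253, ("Decision Table", 75): 64.1951,
    ("Random Tree", 66): 72.9058, ("Random Tree", 75): 73.3807,
}
WITH_NOISE = {
    ("J48", 66): 68.4145, ("J48", 75): 68.2008,
    ("Decision Table", 66): 60.3161, ("Decision Table", 75): 60.4924,
    ("Random Tree", 66): 59.6546, ("Random Tree", 75): 59.4034,
}

# Drop in percentage points caused by noise, per classifier and split.
drops = {key: WITHOUT_NOISE[key] - WITH_NOISE[key] for key in WITHOUT_NOISE}
mean_drop = sum(drops.values()) / len(drops)

for (classifier, split), drop in drops.items():
    print(f"{classifier:15s} {split}% split: -{drop:.4f}")
print(f"mean drop: {mean_drop:.4f}")
```

Changing the split from 66% to 75% moves each accuracy by roughly one percentage point or less, so noise is by far the stronger of the two factors.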
5. ROC CURVES:
The ROC curve is a fundamental tool for diagnostic test evaluation. The ROC curve is created
by plotting the true positive rate (TPR) against the false positive rate (FPR) at various
threshold settings.
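How such a curve and its area are obtained can be sketched as below. This is an illustration only: the scores and labels are made up, the function names are ours, and one class value is treated as positive against all the others, as in the per-class curves of this section.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by lowering the threshold over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last instance sharing this score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classifier scores for one class value (1 = that class, 0 = any other).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve, auc(curve))
```

An area of 1 means the positive instances are always ranked above the negative ones, while an area of 0.5 corresponds to random ranking.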
5.1 ROC CURVE – KNOWLEDGE FLOW
Knowledge Flow in Weka is used to create multiple ROC curves for different class
values against different classifiers.
• The ArffLoader component is used to load the data set into the Knowledge Flow.
• The ClassAssigner component is used to choose the class attribute from the data set.
• The ClassValuePicker component is used to choose a class value.
• The TrainTestSplitMaker component is used because we are using a percentage split (66%)
on the data set.
• The final list of classifiers is added, and for each classifier a
ClassifierPerformanceEvaluator component is added to evaluate it.
• Finally, the ROC curve chart is produced using the ModelPerformanceChart component.
The following graphs show the ROC curves obtained for the three selected classifiers.
5.2 SINGLE CLASS VS 3 CLASSIFIERS:
ROC CURVE – BENDING1 CLASS VS 3 CLASSIFIERS
ROC CURVE – BENDING2 CLASS VS 3 CLASSIFIERS
ROC CURVE – CYCLING CLASS VS 3 CLASSIFIERS
ROC CURVE – LYING CLASS VS 3 CLASSIFIERS
ROC CURVE – SITTING CLASS VS 3 CLASSIFIERS
ROC CURVE – STANDING CLASS VS 3 CLASSIFIERS
ROC CURVE – WALKING CLASS VS 3 CLASSIFIERS
5.3 ALL CLASSES VS ALL CLASSIFIERS:
ROC CURVE – 7 CLASSES VS 3 CLASSIFIERS
6. Principal Component Analysis:
As a further check, we performed Principal Component Analysis (PCA), which created a new
data set. Using PCA, we obtained new attributes whose values are functions of the original
attribute values. Tests performed on this data set did not yield any consistent improvement
in accuracy over the original attributes. The results of PCA are given
below.
FULL TRAINING SET ACCURACY
Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
ZeroR            17.0459                              82.9541
OneR             54.2106                              45.7894
Naïve Bayes      64.2487                              35.7515
J48              88.4964                              11.5036
Decision Table   72.4118                              27.5882
Random Tree      98.8612                              1.1388
CROSS VALIDATION ACCURACY
Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
ZeroR            17.0459                              82.9541
OneR             41.8973                              58.1027
Naïve Bayes      64.1658                              35.8342
J48              78.4536                              21.5464
Decision Table   66.7606                              33.2394
Random Tree      74.5306                              25.4694
PERCENTAGE SPLIT ACCURACY
Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
ZeroR            17.0253                              82.9747
OneR             41.1531                              58.8469
Naïve Bayes      63.5401                              36.4599
J48              77.1743                              22.8257
Decision Table   65.6013                              34.3987
Random Tree      71.0673                              28.9325
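To illustrate how PCA derives a new attribute from the original ones, the sketch below finds the first principal component by power iteration. It is not the procedure Weka applies: the data are made up, the function name is ours, and we assume the covariance matrix of the centred attributes without standardisation.

```python
from statistics import mean


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first principal component."""
    n_cols = len(rows[0])
    means = [mean(row[j] for row in rows) for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in rows]
    # Sample covariance matrix of the centred attributes.
    cov = [[sum(r[a] * r[b] for r in centered) / (len(rows) - 1) for b in range(n_cols)]
           for a in range(n_cols)]
    # Power iteration: repeated multiplication converges to the dominant eigenvector,
    # provided the starting vector is not orthogonal to it.
    vec = [1.0] * n_cols
    for _ in range(iterations):
        vec = [sum(cov[a][b] * vec[b] for b in range(n_cols)) for a in range(n_cols)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    # Each new attribute value is a weighted sum of the centred original values.
    scores = [sum(r[j] * vec[j] for j in range(n_cols)) for r in centered]
    return vec, scores


# Hypothetical two-attribute data lying close to the line y = x.
data = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8]]
direction, scores = first_principal_component(data)
print(direction, scores)
```

The new attribute rearranges the information already present in the original attributes, which is consistent with the small differences between the PCA tables and the earlier ones.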
7. CONCLUSION:
Accuracy:
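By looking at the accuracy of the classifiers under cross-validation and percentage split, we
were able to conclude that J48 has the best accuracy: 77.5503% on the original attributes
and 77.1743% on the PCA attributes with a 66% percentage split.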
Area under ROC:
By looking at the various ROC curves plotted for the 3 classifiers and the various class
values, the J48 classifier appears to perform best, because it has a larger area under the
curve than Decision Table and Random Tree. Among the values of the class attribute,
Activity, the LYING class appears to be predicted most accurately, because its curve has
a larger area under it than the curves of the remaining class values.
Experimental Design:
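The results from the experimental design show that adding 10% noise to our dataset
lowered accuracy by about 9 percentage points on average: about 4 to 5 points for
Decision Table, 9 to 10 points for J48 and 13 to 14 points for Random Tree.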
Overall:
The conclusions drawn from the original dataset also hold for the PCA data. Taking into
consideration the results of the factorial experimental design, cross-validation and the
percentage split test, the algorithm J48 yielded the maximum accuracy. On the full training
set Random Tree scores higher (98.8612%), but its accuracy falls to about 75% under
cross-validation, which suggests overfitting. The ROC curves also show that the J48
classifier gives the best predictions, and its area under the ROC curve tends to be the
largest of the three classifiers.
8. REFERENCES:
• F. Palumbo, C. Gallicchio, R. Pucci and A. Micheli, Human activity recognition using
multisensor data fusion based on Reservoir Computing, Journal of Ambient Intelligence
and Smart Environments, 2016.
https://www.researchgate.net/publication/298911566_Human_activity_recognition_using_multisensor_data_fusion_based_on_Reservoir_Computing
• F. Palumbo, P. Barsocchi, C. Gallicchio, S. Chessa and A. Micheli, Multisensor data
fusion for activity recognition based on reservoir computing, in: Evaluating AAL Systems
Through Competitive Benchmarking, Communications in Computer and Information Science.
https://www.researchgate.net/publication/258029665_Multisensor_Data_Fusion_for_Activity_Recognition_Based_on_Reservoir_Computing

IRJET- Detection and Classification of Leaf Diseases
 

1. DATASET INTRODUCTION:

1.1 DATA MINING INTRODUCTION:

Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. It is an interdisciplinary subfield of computer science. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Data mining can also be defined as the semi-automatic or automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data and may be used in further analysis or, for example, in machine learning and predictive analytics.

1.2 OBJECTIVE:

The main objective of this project is to determine the type of activity performed by the user (bending, walking, sitting, standing, lying, cycling). The activity is determined using the Activity Recognition system based on Multisensor data fusion (AReM) sensors. Our other objective is to build and train the model using training and test data sets. The output obtained in these steps will be used to judge how credible the AReM sensors are for further activity-sensing experiments.

1.3 DATA BACKGROUND:

Activity Recognition (AR) is an emerging research topic founded on established research fields such as ubiquitous computing, context-aware computing and multimedia, and machine learning for pattern recognition. Recognizing everyday life activities is a challenging application in pervasive computing, with many interesting developments in the health care domain, the human behavior modeling domain and the human-machine interaction domain. Inferring the activity of users in their own domestic environments becomes even more useful in the Ambient Assisted Living (AAL) scenario, where facilities provide assistance and care for the elderly, and knowledge of their daily activities can ensure safety and successful aging.

From the point of view of deploying activity recognition solutions, we recognize three main approaches. The first kind of solution generally uses wearable sensors (embedding accelerometers, or transducers for physiological measures) that directly measure the user's movements. The disadvantage of this approach is that wearable devices can be intrusive for the user, even if, with recent advances in embedded systems, sensors tend to be smaller and smaller. Solutions that avoid wearable devices are instead motivated by the need for less intrusive activity recognition systems. Among these, solutions based on cameras are probably the most common; they form the second approach. More recently, a new generation of non-wearable solutions has been emerging. These solutions exploit the implicit alteration of the wireless channel caused by the movements of the user, which is measured by devices placed in the environment that record the Received Signal Strength (RSS) of the beacon packets they exchange among themselves.

1.4 DATA SET INFORMATION:

This dataset contains temporal data from a Wireless Sensor Network (WSN) worn by an actor performing the following activities: bending, cycling, lying down, sitting, standing, walking. The classification task consists of predicting the activity performed by the user from the time series generated by the WSN.

In our activity recognition system we use information coming from the implicit alteration of the wireless channel due to the movements of the user. The devices measure the RSS of the beacon packets they exchange among themselves in the WSN. They are placed on the user's chest and ankles. For the purpose of communication, the beacon packets are exchanged using a simple virtual token protocol that completes its execution in a time slot of 50 milliseconds.

From the raw data we extract time-domain features to compress the time series and to reduce noise and correlations slightly. We choose an epoch time of 250 milliseconds. In each such time slot we process 5 samples of RSS (sampled at 20 Hz) for each of the three pairs of WSN nodes (i.e. Chest-Right Ankle, Chest-Left Ankle, Right Ankle-Left Ankle). The features include the mean value and standard deviation for each reciprocal RSS reading from the worn WSN sensors.

For each activity 15 temporal sequences of input RSS data are present. The dataset contains 480 sequences, for a total number of 42240 instances. The positions of the sensor nodes with the related identifiers are shown in the figure.
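The feature extraction described in Section 1.4 (mean and standard deviation over 250 ms epochs of 5 RSS samples) can be sketched as follows. This is a minimal illustration, not the dataset authors' code: the function name `epoch_features`, the sample readings, and the choice of the population standard deviation are all assumptions, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev

SAMPLES_PER_EPOCH = 5  # 250 ms epoch at 20 Hz, as stated in the text


def epoch_features(rss, samples_per_epoch=SAMPLES_PER_EPOCH):
    """Return one (mean, standard deviation) pair per complete epoch."""
    features = []
    last_start = len(rss) - samples_per_epoch
    for start in range(0, last_start + 1, samples_per_epoch):
        window = rss[start:start + samples_per_epoch]
        # Assumption: population standard deviation (pstdev); the report
        # does not specify population vs. sample estimator.
        features.append((mean(window), pstdev(window)))
    return features


# Hypothetical RSS readings for one node pair (values invented for illustration).
chest_right_ankle = [40, 41, 39, 40, 40, 35, 36, 35, 34, 35]
features = epoch_features(chest_right_ankle)
```

Each of the three node pairs would contribute one such (mean, standard deviation) pair per epoch, giving six numeric features per instance.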
2. DATA PREPARATION:

2.1 DATA CLEANING:

Data cleansing, data cleaning or data scrubbing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a data set, table, or database. Used mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting this dirty or coarse data.

1. Merging all the data sets: The data set was split into several files, one per activity performed. Without changing the attributes, we merged all the files into one Excel dataset and added one extra attribute (the class attribute), named Activity, to categorize the different actions performed.

2. Macros: An Excel macro is a set of programming instructions, stored as VBA code, that removes the need to repeat the steps of commonly performed tasks by hand. These repetitive tasks might involve complex calculations that require formulas, or they might be simple formatting tasks such as adding number formatting to new data or applying cell and worksheet formats such as borders and shading. Macros were used to find the missing values and amend them.

The final class attribute values are categorized as Bending1, Bending2, Cycling, Lying, Sitting, Standing and Walking.

3. ALGORITHMS USED FOR THIS EXPERIMENT:

After trying out various algorithms, we found that the following algorithms yielded the best results for our experiment.

J48
C4.5 (implemented in Weka as J48) is an algorithm, developed by Ross Quinlan, that is used to generate a decision tree. C4.5 is an extension of Quinlan's earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason C4.5 is often referred to as a statistical classifier.

Naïve Bayes
A naive Bayes classifier is an algorithm that uses Bayes' theorem to classify objects. Naive Bayes classifiers assume strong, or naive, independence between the attributes of data points. These classifiers are widely used for machine learning because they are simple to implement.

Decision Table
The Decision Table algorithm belongs to the family of supervised learning algorithms. It represents the model as a lookup table: each row maps a combination of values of a selected subset of attributes to a class label, and a new instance is assigned the class of the matching row.

Random Tree
Random Tree is a supervised classifier. It is described as an ensemble learning algorithm that generates many individual learners, employing a bagging idea to construct a random set of data from which each decision tree is built.
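The bagging idea mentioned for Random Tree can be illustrated with a bootstrap sample: draw as many rows as the data set contains, with replacement, so that some rows repeat and others are left out. This is a minimal sketch under that assumption, not the implementation used by the classifier in the experiment; `bootstrap_sample` and the toy row list are hypothetical names.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bagging sample)."""
    return [rng.choice(rows) for _ in rows]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
rows = list(range(10))          # stand-in for 10 training instances
sample = bootstrap_sample(rows, rng)

# Rows never drawn are "out of bag" and can serve as a held-out check.
out_of_bag = sorted(set(rows) - set(sample))
```

Each learner in a bagged ensemble would be trained on its own such sample.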
OneR
OneR, short for "One Rule", is a simple yet accurate classification algorithm that generates one rule for each predictor in the data and then selects the rule with the smallest total error as its "one rule". To create a rule for a predictor, we construct a frequency table of that predictor against the target. It has been shown that OneR produces rules only slightly less accurate than state-of-the-art classification algorithms, while the rules remain simple for humans to interpret.

ZeroR
ZeroR is the simplest classification method; it relies on the target and ignores all predictors. The ZeroR classifier simply predicts the majority category (class). Although ZeroR has no predictive power, it is useful for determining a baseline performance as a benchmark for other classification methods.

3.1 ACCURACY ON FULL TRAINING SET:

In this step we used the full training set with all algorithms to determine which one is best for our analysis.

Classifier        Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
ZeroR             17.0459                              82.9541
OneR              48.0383                              51.9615
Naïve Bayes       64.3386                              35.6614
J48               87.8974                              12.1026
Decision Table    71.5121                              28.4879
Random Tree       98.8612                              1.1388
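The ZeroR and OneR descriptions above can be made concrete with a small sketch. It assumes nominal (already discretized) attributes and uses invented toy rows; the function names `zero_r` and `one_r` are hypothetical and this is not the code of the tool used in the experiment. Because ZeroR always predicts the majority class, its accuracy on the training set equals the share of that class.

```python
from collections import Counter, defaultdict


def zero_r(labels):
    """Predict the majority class, ignoring all predictors."""
    return Counter(labels).most_common(1)[0][0]


def one_r(rows, labels):
    """Return (attribute index, rule, errors) for the rule with fewest errors."""
    best = None
    for attr in range(len(rows[0])):
        # Frequency table: attribute value -> counts of each class.
        table = defaultdict(Counter)
        for row, label in zip(rows, labels):
            table[row[attr]][label] += 1
        # One rule per attribute: each value predicts its most frequent class.
        rule = {value: counts.most_common(1)[0][0]
                for value, counts in table.items()}
        errors = sum(rule[row[attr]] != label
                     for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Toy data, invented for illustration: (signal level, variability) -> activity.
rows = [("high", "low"), ("high", "high"), ("low", "low"), ("low", "high"),
        ("high", "low"), ("low", "high"), ("low", "low")]
labels = ["sitting", "walking", "sitting", "walking",
          "sitting", "walking", "sitting"]

baseline = zero_r(labels)
best_attr, best_rule, best_errors = one_r(rows, labels)
```

On this toy data the second attribute separates the two classes perfectly, so OneR selects it, while ZeroR can only ever be right for the majority class.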
  • 8. 3.2 ACCURACY ON CROSS FOLDS:
    Here we used cross validation with 10 folds and performed the test with all algorithms. The result is given as a graph as well as a table.

    Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
    ZeroR            17.0459                              82.9541
    OneR             47.2881                              52.7119
    Naïve Bayes      64.3955                              35.6045
    J48              78.9578                              21.0422
    Decision Table   65.314                               34.686
    Random Tree      75.3687                              24.6313

    [Figure: bar chart of correctly and incorrectly classified instances (%) for each classifier, 10-fold cross validation]
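As an illustration of what 10-fold cross validation does, the following sketch splits the instances into k folds, tests on each fold once while training on the rest, and reports the percentage classified correctly. It is a simplified stand-in, not Weka's procedure: the folds here are not stratified, and the function names and the majority-class predictor are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(instances, labels, train_and_predict, k=10, seed=1):
    """Percentage of instances classified correctly across all k test folds."""
    correct = 0
    for train, test in k_fold_indices(len(instances), k, seed):
        predictions = train_and_predict(
            [instances[i] for i in train],
            [labels[i] for i in train],
            [instances[i] for i in test],
        )
        correct += sum(p == labels[i] for p, i in zip(predictions, test))
    return 100.0 * correct / len(instances)


def majority_predictor(train_x, train_y, test_x):
    """Hypothetical ZeroR-style learner: always predict the majority class."""
    majority = Counter(train_y).most_common(1)[0][0]
    return [majority for _ in test_x]


# Hypothetical toy data: 15 instances of class "a" and 5 of class "b".
toy_instances = list(range(20))
toy_labels = ["a"] * 15 + ["b"] * 5
toy_accuracy = cross_validated_accuracy(toy_instances, toy_labels, majority_predictor)
```

Every instance is used for testing exactly once, so the reported figure is computed over the whole data set even though no model is ever tested on its own training instances.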
  • 9. 3.3 ACCURACY ON PERCENTAGE SPLIT:
    In this step, we used a percentage split of 66% to estimate the accuracy of each algorithm. The results are displayed as a graph as well as a table.

    Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
    ZeroR            17.0253                              82.9747
    OneR             47.3226                              52.6774
    Naïve Bayes      63.6864                              36.3136
    J48              77.5503                              22.4497
    Decision Table   65.2253                              34.7747
    Random Tree      72.9058                              27.0942

    [Figure: bar chart of correctly and incorrectly classified instances (%) for each classifier, 66% percentage split]
  • 10. Based on the various tests, we concluded that the following three algorithms are good enough to predict the CLASS attribute:
    • J48
    • Decision Table
    • Random Tree
    We confirmed the above results with the Receiver Operating Characteristic (ROC) graph, which plots the true positive rate against the false positive rate, for the 6 algorithms and found that these 3 algorithms have better accuracy and a larger area under the curve.
    4. EXPERIMENTAL DESIGN:
    A full factorial experiment is an experiment that consists of two or more factors, each of which has discrete levels. The experimental units take on all possible combinations of these levels across all factors. Such an experiment allows the investigator to study the effect of each factor on the response variable.
    We selected the following classifiers for our experimental design:
    • J48
    • Decision Table
    • Random Tree

    [Figure: bar chart of correctly and incorrectly classified instances (%) for each classifier]
  • 11. Four Cell Experimental Design:
    It consists of 2 factors, each with 2 levels:
    • Noise: without noise, with 10% noise
    • Percentage split: 66%, 75%

                     % Split - 66%   % Split - 75%
    Without Noise    C1              C3
    With Noise       C2              C4

    The four cells are:
    • C1: Percentage split 66% without noise
    • C2: Percentage split 66% with noise
    • C3: Percentage split 75% without noise
    • C4: Percentage split 75% with noise
    Total number of experiments = number of conditions * number of classifiers * 10 = 4 * 3 * 10 = 120 runs
    4.1 RESULTS FOR EACH CLASSIFIER:
    The table below describes the 12 possible combinations of our 4 conditions with the 3 selected classifiers. We ran each of these combinations 10 times and computed the average and variance of their accuracy:
  • 12. J48:
    E1 = Performance of J48 with attributes without noise + percentage split of 66%:34%
    E2 = Performance of J48 with attributes with noise + percentage split of 66%:34%
    E3 = Performance of J48 with attributes without noise + percentage split of 75%:25%
    E4 = Performance of J48 with attributes with noise + percentage split of 75%:25%
    Decision Table:
    E1 = Performance of Decision Table with attributes without noise + percentage split of 66%:34%
    E2 = Performance of Decision Table with attributes with noise + percentage split of 66%:34%
    E3 = Performance of Decision Table with attributes without noise + percentage split of 75%:25%
    E4 = Performance of Decision Table with attributes with noise + percentage split of 75%:25%
    Random Tree:
    E1 = Performance of Random Tree with attributes without noise + percentage split of 66%:34%
    E2 = Performance of Random Tree with attributes with noise + percentage split of 66%:34%
    E3 = Performance of Random Tree with attributes without noise + percentage split of 75%:25%
    E4 = Performance of Random Tree with attributes with noise + percentage split of 75%:25%

    J48: In J48, we ran four experiments, E1 to E4:
    E1: 66-34 split, without noise
    E2: 66-34 split, with noise
    E3: 75-25 split, without noise
    E4: 75-25 split, with noise
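The run count of the four-cell design can be checked by enumerating the full factorial combinations. The sketch below is only an illustration; the variable names are assumptions, and the level labels are taken from the design described in the report.

```python
from itertools import product

classifiers = ["J48", "Decision Table", "Random Tree"]
split_levels = [66, 75]                              # factor 1: percentage split
noise_levels = ["without noise", "with 10% noise"]   # factor 2: noise
seeds = range(1, 11)                                 # 10 repetitions per combination

# The four cells C1..C4: every combination of the two factors' levels.
conditions = list(product(split_levels, noise_levels))

# One run per (classifier, condition, seed).
runs = list(product(classifiers, conditions, seeds))

for cell, (split, noise) in enumerate(conditions, start=1):
    print(f"C{cell}: percentage split {split}% {noise}")
print("total runs:", len(runs))
```

Listing the combinations explicitly makes it easy to confirm that no cell is missed and that each classifier is run the same number of times in each cell.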
  • 13. E1: 66-34 split, without noise

    SEED   CLASSIFIER   PERCENTAGE SPLIT   ACCURACY
    1      J48          66                 77.5503
    2      J48          66                 77.5503
    3      J48          66                 77.5503
    4      J48          66                 77.5503
    5      J48          66                 77.5503
    6      J48          66                 77.5503
    7      J48          66                 77.5503
    8      J48          66                 77.5503
    9      J48          66                 77.5503
    10     J48          66                 77.5503
    AVERAGE: 77.5503
    VARIANCE: 0

    E2: 66-34 split, with noise

    SEED   CLASSIFIER   PERCENTAGE SPLIT   ACCURACY
    1      J48          66                 68.4145
    2      J48          66                 68.4145
    3      J48          66                 68.4145
    4      J48          66                 68.4145
    5      J48          66                 68.4145
    6      J48          66                 68.4145
    7      J48          66                 68.4145
    8      J48          66                 68.4145
    9      J48          66                 68.4145
    10     J48          66                 68.4145
    AVERAGE: 68.4145
    VARIANCE: 0
  • 14. E3: 75-25 split, without noise

    SEED   CLASSIFIER   PERCENTAGE SPLIT   ACCURACY
    1      J48          75                 78.3712
    2      J48          75                 78.3712
    3      J48          75                 78.3712
    4      J48          75                 78.3712
    5      J48          75                 78.3712
    6      J48          75                 78.3712
    7      J48          75                 78.3712
    8      J48          75                 78.3712
    9      J48          75                 78.3712
    10     J48          75                 78.3712
    AVERAGE: 78.3712
    VARIANCE: 0

    E4: 75-25 split, with noise

    SEED   CLASSIFIER   PERCENTAGE SPLIT   ACCURACY
    1      J48          75                 68.2008
    2      J48          75                 68.2008
    3      J48          75                 68.2008
    4      J48          75                 68.2008
    5      J48          75                 68.2008
    6      J48          75                 68.2008
    7      J48          75                 68.2008
    8      J48          75                 68.2008
    9      J48          75                 68.2008
    10     J48          75                 68.2008
    AVERAGE: 68.2008
    VARIANCE: 0
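The AVERAGE and VARIANCE rows of these tables can be reproduced with Python's statistics module. The sketch uses the ten J48 accuracies reported for E1; whether the report used the population or the sample variance is not stated, so the choice of pvariance here is an assumption (both give 0 for identical values).

```python
from statistics import mean, pvariance

# Accuracy of J48 for seeds 1 to 10 in experiment E1 (66-34 split, without noise).
accuracies = [77.5503] * 10

average = mean(accuracies)        # arithmetic mean of the ten runs
variance = pvariance(accuracies)  # population variance (assumed; see lead-in)

print(f"AVERAGE  {average:.4f}")
print(f"VARIANCE {variance:.4f}")
```

A variance of 0 means that all ten seeds produced exactly the same accuracy for this combination.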
  • 15. DECISION TABLE: In Decision Table, we ran four experiments, E1 to E4:
    E1: 66-34 split, without noise
    E2: 66-34 split, with noise
    E3: 75-25 split, without noise
    E4: 75-25 split, with noise

    E1: 66-34 split, without noise

    SEED   CLASSIFIER       PERCENTAGE SPLIT   ACCURACY
    1      Decision Table   66                 65.2253
    2      Decision Table   66                 65.2253
    3      Decision Table   66                 65.2253
    4      Decision Table   66                 65.2253
    5      Decision Table   66                 65.2253
    6      Decision Table   66                 65.2253
    7      Decision Table   66                 65.2253
    8      Decision Table   66                 65.2253
    9      Decision Table   66                 65.2253
    10     Decision Table   66                 65.2253
    AVERAGE: 65.2253
    VARIANCE: 0

    [Figure: J48 accuracy at % Split 66% and % Split 75%, with noise vs. without noise]
  • 16. E2: 66-34 split, with noise

    SEED   CLASSIFIER       PERCENTAGE SPLIT   ACCURACY
    1      Decision Table   66                 60.3161
    2      Decision Table   66                 60.3161
    3      Decision Table   66                 60.3161
    4      Decision Table   66                 60.3161
    5      Decision Table   66                 60.3161
    6      Decision Table   66                 60.3161
    7      Decision Table   66                 60.3161
    8      Decision Table   66                 60.3161
    9      Decision Table   66                 60.3161
    10     Decision Table   66                 60.3161
    AVERAGE: 60.3161
    VARIANCE: 0

    E3: 75-25 split, without noise

    SEED   CLASSIFIER       PERCENTAGE SPLIT   ACCURACY
    1      Decision Table   75                 64.1951
    2      Decision Table   75                 64.1951
    3      Decision Table   75                 64.1951
    4      Decision Table   75                 64.1951
    5      Decision Table   75                 64.1951
    6      Decision Table   75                 64.1951
    7      Decision Table   75                 64.1951
    8      Decision Table   75                 64.1951
    9      Decision Table   75                 64.1951
    10     Decision Table   75                 64.1951
    AVERAGE: 64.1951
    VARIANCE: 0
  • 17. E4: 75-25 split, with noise

    SEED   CLASSIFIER       PERCENTAGE SPLIT   ACCURACY
    1      Decision Table   75                 60.4924
    2      Decision Table   75                 60.4924
    3      Decision Table   75                 60.4924
    4      Decision Table   75                 60.4924
    5      Decision Table   75                 60.4924
    6      Decision Table   75                 60.4924
    7      Decision Table   75                 60.4924
    8      Decision Table   75                 60.4924
    9      Decision Table   75                 60.4924
    10     Decision Table   75                 60.4924
    AVERAGE: 60.4924
    VARIANCE: 0

    [Figure: Decision Table accuracy at % Split 66% and % Split 75%, with noise vs. without noise]
  • 18. RANDOM TREE: In Random Tree, we ran four experiments, E1 to E4:
    E1: 66-34 split, without noise
    E2: 66-34 split, with noise
    E3: 75-25 split, without noise
    E4: 75-25 split, with noise

    E1: 66-34 split, without noise

    SEED   CLASSIFIER    PERCENTAGE SPLIT   ACCURACY
    1      Random Tree   66                 72.9058
    2      Random Tree   66                 72.9058
    3      Random Tree   66                 72.9058
    4      Random Tree   66                 72.9058
    5      Random Tree   66                 72.9058
    6      Random Tree   66                 72.9058
    7      Random Tree   66                 72.9058
    8      Random Tree   66                 72.9058
    9      Random Tree   66                 72.9058
    10     Random Tree   66                 72.9058
    AVERAGE: 72.9058
    VARIANCE: 0

    E2: 66-34 split, with noise

    SEED   CLASSIFIER    PERCENTAGE SPLIT   ACCURACY
    1      Random Tree   66                 59.6546
    2      Random Tree   66                 59.6546
    3      Random Tree   66                 59.6546
    4      Random Tree   66                 59.6546
    5      Random Tree   66                 59.6546
    6      Random Tree   66                 59.6546
    7      Random Tree   66                 59.6546
    8      Random Tree   66                 59.6546
    9      Random Tree   66                 59.6546
    10     Random Tree   66                 59.6546
    AVERAGE: 59.6546
    VARIANCE: 0
  • 19. E3: 75-25 split, without noise

    SEED   CLASSIFIER    PERCENTAGE SPLIT   ACCURACY
    1      Random Tree   75                 73.3807
    2      Random Tree   75                 73.3807
    3      Random Tree   75                 73.3807
    4      Random Tree   75                 73.3807
    5      Random Tree   75                 73.3807
    6      Random Tree   75                 73.3807
    7      Random Tree   75                 73.3807
    8      Random Tree   75                 73.3807
    9      Random Tree   75                 73.3807
    10     Random Tree   75                 73.3807
    AVERAGE: 73.3807
    VARIANCE: 0

    E4: 75-25 split, with noise

    SEED   CLASSIFIER    PERCENTAGE SPLIT   ACCURACY
    1      Random Tree   75                 59.4034
    2      Random Tree   75                 59.4034
    3      Random Tree   75                 59.4034
    4      Random Tree   75                 59.4034
    5      Random Tree   75                 59.4034
    6      Random Tree   75                 59.4034
    7      Random Tree   75                 59.4034
    8      Random Tree   75                 59.4034
    9      Random Tree   75                 59.4034
    10     Random Tree   75                 59.4034
    AVERAGE: 59.4034
    VARIANCE: 0
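The report does not say how the 10% noise was introduced. One common approach, assumed here purely for illustration, is class-label noise: a chosen percentage of instances has its class value replaced by a different one. The function name and the toy labels below are hypothetical.

```python
import random


def add_class_noise(labels, percent=10, seed=1):
    """Return a copy of labels with `percent` % of entries changed to another class.

    Assumes at least two distinct class values are present.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    noisy = list(labels)
    n_changed = round(len(labels) * percent / 100)
    # Pick distinct positions so exactly n_changed labels are altered.
    for i in rng.sample(range(len(labels)), n_changed):
        noisy[i] = rng.choice([c for c in classes if c != labels[i]])
    return noisy


# Hypothetical toy labels: 100 instances spread over four activities.
clean = ["walking", "sitting", "lying", "cycling"] * 25
noisy = add_class_noise(clean, percent=10)
changed = sum(a != b for a, b in zip(clean, noisy))
```

Because each selected label is replaced by a different class, the number of changed labels equals the requested percentage of the data set.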
  • 20. 4.2 RELATIVE ACCURACY OF EXPERIMENTAL DESIGN:
    [Figure: Random Tree accuracy at % Split 66% and % Split 75%, with noise vs. without noise]
    [Figure: Accuracy vs Factors, conditions 1 to 4, for J48, Decision Table and Random Tree]
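The effect of noise in each cell can be worked out directly from the AVERAGE values reported in the tables of section 4.1. The sketch below subtracts the with-noise average from the without-noise average for each classifier and split; the dictionary layout and variable names are assumptions made for the example.

```python
# Average accuracy (%) per classifier and cell, copied from the tables in section 4.1.
averages = {
    "J48": {"C1": 77.5503, "C2": 68.4145, "C3": 78.3712, "C4": 68.2008},
    "Decision Table": {"C1": 65.2253, "C2": 60.3161, "C3": 64.1951, "C4": 60.4924},
    "Random Tree": {"C1": 72.9058, "C2": 59.6546, "C3": 73.3807, "C4": 59.4034},
}

# Drop in accuracy (percentage points) caused by noise, for each split.
drops = {
    name: {
        "66%": round(cells["C1"] - cells["C2"], 4),
        "75%": round(cells["C3"] - cells["C4"], 4),
    }
    for name, cells in averages.items()
}

all_drops = [d for per_split in drops.values() for d in per_split.values()]
mean_drop = round(sum(all_drops) / len(all_drops), 2)

for name, per_split in drops.items():
    print(name, per_split)
print("mean drop:", mean_drop)
```

With these figures the drop ranges from about 3.7 percentage points (Decision Table, 75% split) to about 14.0 (Random Tree, 75% split), and averages roughly 9.2 points across the six comparisons.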
  • 21. 5. ROC CURVES:
    The ROC curve is a fundamental tool for diagnostic test evaluation. The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.
    5.1 ROC CURVE – KNOWLEDGE FLOW
    The Knowledge Flow interface in Weka is used to create multiple ROC curves for different class values against different classifiers.
    • The ArffLoader component is used to load the data set into the knowledge flow.
    • The ClassAssigner component is used to choose the class attribute from the data set.
    • The ClassValuePicker component is used to choose a class value.
    • The TrainTestSplitMaker component is used because we are using a percentage split (66%) on the data set.
    • The final list of classifiers is added, and for each classifier a ClassifierPerformanceEvaluator component is added to evaluate it.
    • Finally, the ROC curve chart is produced using the ModelPerformanceChart component.
    The following graphs show the ROC curves obtained.
  • 22. 5.2 SINGLE CLASS VS 3 CLASSIFIERS:
    [Figure: ROC CURVE – BENDING1 CLASS VS 3 CLASSIFIERS]
    [Figure: ROC CURVE – BENDING2 CLASS VS 3 CLASSIFIERS]
  • 23. [Figure: ROC CURVE – CYCLING CLASS VS 3 CLASSIFIERS]
    [Figure: ROC CURVE – LYING CLASS VS 3 CLASSIFIERS]
  • 24. [Figure: ROC CURVE – SITTING CLASS VS 3 CLASSIFIERS]
    [Figure: ROC CURVE – STANDING CLASS VS 3 CLASSIFIERS]
  • 25. [Figure: ROC CURVE – WALKING CLASS VS 3 CLASSIFIERS]
    5.3 ALL CLASSES VS ALL CLASSIFIERS:
    [Figure: ROC CURVE – 7 CLASSES VS 3 CLASSIFIERS]
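To make the construction of these curves concrete, the sketch below computes ROC points (FPR, TPR) by sweeping a threshold over a classifier's scores for one chosen class value, treating that class as positive and all others as negative, and then estimates the area under the curve with the trapezoidal rule. The function names and the scores are hypothetical; this is not how Weka computes its charts internally.

```python
def roc_points(scores, is_positive):
    """Return (FPR, TPR) points obtained by lowering the threshold over the scores.

    Assumes at least one positive and one negative instance.
    """
    positives = sum(is_positive)
    negatives = len(is_positive) - positives
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, positive) in enumerate(ranked):
        if positive:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last instance sharing this score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for the chosen class value (higher = more likely positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
is_positive = [True, True, False, True, False, False]

points = roc_points(scores, is_positive)
auc = area_under_curve(points)
```

A curve that rises quickly towards the top-left corner encloses a larger area, which is why the area under the curve is used to compare classifiers.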
  • 26. 6. PRINCIPAL COMPONENT ANALYSIS:
    As a further check, we performed Principal Component Analysis (PCA), which created a new data set. Using PCA, we obtained new attributes whose values are functions of the previous attribute values. Tests performed on this data set did not yield any consistent improvement in accuracy with the introduction of the new attributes. The results of PCA are given below.

    FULL TRAINING SET ACCURACY
    Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
    ZeroR            17.0459                              82.9541
    OneR             54.2106                              45.7894
    Naïve Bayes      64.2487                              35.7515
    J48              88.4964                              11.5036
    Decision Table   72.4118                              27.5882
    Random Tree      98.8612                              1.1388

    CROSS VALIDATION ACCURACY
    Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
    ZeroR            17.0459                              82.9541
    OneR             41.8973                              58.1027
    Naïve Bayes      64.1658                              35.8342
    J48              78.4536                              21.5464
    Decision Table   66.7606                              33.2394
    Random Tree      74.5306                              25.4694
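To illustrate how PCA produces new attributes that are functions of the original ones, the following sketch handles the two-attribute case with the closed-form eigen-decomposition of the 2x2 covariance matrix. It is a simplified illustration under stated assumptions: the data is only centred, not standardised, and the function name and points are hypothetical rather than taken from the report.

```python
import math


def pca_2d(points):
    """PCA for 2-D points via the eigen-decomposition of their covariance matrix.

    Returns (eigenvalues, projected_points); each projected point holds the
    values of the two new attributes (principal components).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = (half_trace + root, half_trace - root)

    # Direction of the first principal axis, then rotate the data onto it.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    projected = [(x * c + y * s, -x * s + y * c) for x, y in centred]
    return eigenvalues, projected


# Hypothetical points lying exactly on the line y = 2x.
eigenvalues, projected = pca_2d([(0, 0), (1, 2), (2, 4), (3, 6)])
```

Because the example points lie on a single line, the first new attribute carries all of the variance and the second is zero for every point; each new attribute is a linear combination of the two original ones.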
  • 27. PERCENTAGE SPLIT ACCURACY
    Classifier       Correctly Classified Instances (%)   Incorrectly Classified Instances (%)
    ZeroR            17.0253                              82.9747
    OneR             41.1531                              58.8469
    Naïve Bayes      63.5401                              36.4599
    J48              77.1743                              22.8257
    Decision Table   65.6013                              34.3987
    Random Tree      71.0673                              28.9325

    7. CONCLUSION:
    Accuracy: By looking at the accuracy of the classifiers, we were able to conclude that J48 has the best accuracy, 77.1743% (the percentage split result on the PCA data set).
    Area under ROC: Looking at the ROC curves plotted for the 3 classifiers across the class values, the J48 classifier appears to be the most effective because it has a larger area under the curve than Decision Table and Random Tree. Among the values of the class attribute, which is the activity, the LYING class appears to be predicted most accurately because it has a larger area under the curve than the remaining class values.
    Experimental Design: The results from the experimental design show that adding 10% noise to our dataset caused an average dip of roughly 9 percentage points in accuracy, ranging from about 4 points for Decision Table to about 14 points for Random Tree.
    Overall: The results previously concluded on the original dataset hold in this case as well, i.e. taking into consideration the results of the factorial experimental design, cross validation and the percentage split test, the algorithm J48 yielded the maximum accuracy; only on the full training set did Random Tree score higher (98.8612%). The ROC curves also show that the J48 classifier gives the maximum prediction accuracy, and its area under the ROC curve tends to be the largest.
  • 28. 8. REFERENCES:
    • F. Palumbo, C. Gallicchio, R. Pucci and A. Micheli, Human activity recognition using multisensor data fusion based on Reservoir Computing, Journal of Ambient Intelligence and Smart Environments, 2016. https://www.researchgate.net/publication/298911566_Human_activity_recognition_using_multisensor_data_fusion_based_on_Reservoir_Computing
    • F. Palumbo, P. Barsocchi, C. Gallicchio, S. Chessa and A. Micheli, Multisensor data fusion for activity recognition based on reservoir computing, in: Evaluating AAL Systems Through Competitive Benchmarking, Communications in Computer and Information Science. https://www.researchgate.net/publication/258029665_Multisensor_Data_Fusion_for_Activity_Recognition_Based_on_Reservoir_Computing