IT
@gsantosgo
Information Technology
Data Analysis
Title: Predicting the activity a subject performs from measurements obtained from the accelerometer and
gyroscope of smartphones
Introduction:
Smartphones have recently become a constant presence in our lives. These devices are small mobile computers:
they run an operating system that can launch applications, and they include applications to manage contacts
and an address book, to create, edit or view different types of documents, to browse the Web, and to provide
telephony and messaging services. Beyond these features, most smartphones now also incorporate cameras, GPS
and various types of sensors.
In this analysis, we used data obtained from the accelerometer [1] and gyroscope [2] sensor signals of
smartphones. The accelerometer measures 3-axial linear acceleration and the gyroscope measures 3-axial angular
velocity; together, these two sensors can monitor device acceleration, position, orientation, rotation and
angular motion. All these data can be stored and used to recognize a user’s activity. Here we refer to physical
activities that a person can perform daily, such as walking, walking up, jogging, sitting, laying, etc.
The aim of this analysis was to perform a classification task. We took a dataset with its attributes
(acceleration, orientation, …) and its labeled variable (in this case, the activity), and then created various
classification models, also known as classifiers. These models can be built with various classification
algorithms, which use all the available information in a dataset to help us classify or predict which activity
a person is performing.
To create the classification models, we first chose different classification algorithms or techniques. For each
of them we then applied what is called cross-validation [3]: we trained the algorithm with a set of training
data that corresponds to several observations of our available dataset. Next, we tested the classification
algorithm to observe its accuracy, that is, whether our predictive model can correctly classify a person’s
activity according to the knowledge acquired in the training stage. This whole process is known as supervised
learning [4].
Methods:
Data Collection
For this analysis we used a dataset on Human Activity Recognition. The dataset was downloaded from
coursera.org [5] in the Data Analysis course on March 03, 2013 using the R programming language. The data had
been previously processed to make them easier to load into R; they were derived from raw data in the UC Irvine
Machine Learning Repository [6], which hosts a Human Activity Recognition dataset [7] built from the recordings
of 30 subjects performing activities of daily living (ADL) while carrying a waist-mounted smartphone with
embedded inertial sensors.
The dataset for this analysis contains 7352 observations and 563 variables. For each observation, there is a
categorical (factor) variable called “activity” (our labeled variable or class) that indicates the activity
carried out by a person. There are only six possible values for this variable: laying, sitting, standing, walk,
walkdown and walkup. There is also an integer variable called “subject” that identifies the person who
performed the activity. Finally, the remaining 561 variables are numeric (quantitative) variables that contain
time- and frequency-domain features (mean, standard deviation, energy, correlation, etc.) of the triaxial
acceleration from the accelerometer, the triaxial angular velocity from the gyroscope, etc.
For more information about all these variables, see the compressed file [8]. It contains descriptive files with
information about the variables used in this dataset, including all features and the labeled variable or class.
Exploratory Analysis
Exploratory analysis was performed by examining data and plots of the observed data. It was used to
(1) identify missing values, (2) verify the quality of the data, (3) check that variable names are
syntactically correct and (4) identify possible differences in patterns between the activities, so as to be
able to distinguish when a user performs one activity or another.
Our predictive model [9] should be able to recognize the patterns corresponding to each activity. Figure 1
shows the different patterns for different activities according to the analysis of acceleration on the X-axis.
We can observe that there are different patterns depending on which activity is carried out by a user.
Figure 2 shows the different patterns for different activities according to the analysis of acceleration on the Y-axis.
Figure 3 shows the different patterns for different activities according to the analysis of acceleration on the Z-axis.
It is important to keep in mind that, if there are activities with common patterns, our predictive model will
have more difficulty classifying these activities correctly and will therefore have lower accuracy; that is, it
has more difficulty distinguishing among activities.
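The per-activity comparison described in this section can also be approximated numerically. The following is a minimal sketch in base R on synthetic data: the column name `acc_x` and every value in `toy` are hypothetical stand-ins, not the real features or measurements.

```r
# Synthetic stand-in for the real data: the column name `acc_x` and all values
# are hypothetical, chosen only to illustrate the comparison.
set.seed(1)
toy <- data.frame(
  activity = factor(rep(c("laying", "sitting", "walk"), each = 50)),
  acc_x    = c(rnorm(50, mean = -0.9, sd = 0.05),
               rnorm(50, mean = -0.2, sd = 0.05),
               rnorm(50, mean =  0.3, sd = 0.30))
)

# Mean and standard deviation of the feature within each activity
means <- tapply(toy$acc_x, toy$activity, mean)
sds   <- tapply(toy$acc_x, toy$activity, sd)
print(round(cbind(mean = means, sd = sds), 3))

# Box plots give the same comparison visually
boxplot(acc_x ~ activity, data = toy, ylab = "acc_x")
```

Activities whose summaries or box plots overlap heavily are the ones a classifier is likely to confuse.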
Statistical Modeling
To classify the activity performed by a subject, we used various classification techniques or algorithms to
recognize and predict our labeled variable (activity). The techniques (classifiers) employed for this data
analysis are the following:
Decision Trees [10]
CART [11]
Bagging [12]
Random Forest [13]
SVM [14]
We performed cross-validation for each of these techniques (classifiers). We also evaluated the performance,
the accuracy and the error rate of these classifiers.
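As an illustration of the fit-then-evaluate procedure, here is a minimal sketch for one of the listed techniques (CART, through the rpart package). It is not the code used for this analysis: the built-in `iris` data set stands in for the activity data, the odd/even row split is arbitrary, and the sketch assumes rpart is installed.

```r
library(rpart)  # CART implementation; assumed to be installed

# Stand-in data: the built-in iris data set replaces the activity data here.
# Odd rows are used for training and even rows for testing (an arbitrary split).
train <- iris[seq(1, nrow(iris), by = 2), ]
test  <- iris[seq(2, nrow(iris), by = 2), ]

# Fit a classification tree on the training rows
fit <- rpart(Species ~ ., data = train, method = "class")

# Predict the class of the held-out rows
pred <- predict(fit, newdata = test, type = "class")

# Accuracy and error rate on the test rows
accuracy   <- mean(pred == test$Species)
error_rate <- 1 - accuracy
cat(sprintf("accuracy = %.4f, error rate = %.4f\n", accuracy, error_rate))
```

The other techniques can be evaluated in the same way by swapping the fitting call, although each package has its own arguments.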
Reproducibility
All analyses performed in this manuscript are reproduced in the R markdown file samsungPredictive.Rmd [15].
Note. Due to security concerns with the exchange of R code, we do not submit the code to reproduce this data
analysis.
Results:
As stated above, the dataset for this analysis contains a total of 7352 observations with 563 variables; these
observations correspond to a total of 21 people. Table 1 shows the number of samples per subject and type of
activity, and also the percentage of the total per activity in our dataset.
We found variables with syntactically incorrect names, that is, names that use invalid characters such as
commas (“,”), brackets (“(”), etc. It was therefore necessary to make the variable names valid and not
duplicated in our dataset (or data frame). We checked the dataset for missing values and found none.
Our class or labeled variable was transformed from a character variable to a factor variable with 6 levels:
“laying”, “sitting”, “standing”, “walk”, “walkdown” and “walkup”.
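The clean-up steps described above can be sketched in base R as follows. The miniature data frame `raw` and its column names are hypothetical; they only imitate the kind of invalid and duplicated names mentioned in the text.

```r
# Hypothetical miniature of the raw data frame; the column names imitate the
# kind of problematic names described above and are not the real ones.
raw <- data.frame(
  c(0.28, 0.27, 0.29),
  c(-0.99, -0.98, -0.97),
  c(-0.99, -0.98, -0.97),
  c("laying", "walk", "sitting"),
  stringsAsFactors = FALSE
)
names(raw) <- c("acc-mean()-X", "band(1,8)", "band(1,8)", "activity")

# make.names() replaces invalid characters with "." and, with unique = TRUE,
# appends suffixes so that no two columns share a name
names(raw) <- make.names(names(raw), unique = TRUE)
print(names(raw))

# Check for missing values anywhere in the data frame
cat("any missing values:", any(is.na(raw)), "\n")

# Convert the class label from character to factor
raw$activity <- factor(raw$activity)
print(levels(raw$activity))
```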
According to the assignment, for this data analysis we used a training set that includes the data from
subjects 1, 3, 5 and 6 and a test set that includes the data from subjects 27, 28, 29 and 30. Table 2 shows the
number of samples per activity that we used in the training stage, and Table 3 shows the number of samples per
activity that we used in the testing stage.
id laying sitting standing walk walkdown walkup Total
1 50 47 53 95 49 53 347
3 62 52 61 58 49 59 341
5 52 44 56 56 47 47 302
6 57 55 57 57 48 51 325
7 52 48 53 57 47 51 308
8 54 46 54 48 38 41 281
11 57 53 47 59 46 54 316
14 51 54 60 59 45 54 323
15 72 59 53 54 42 48 328
16 70 69 78 51 47 51 366
17 71 64 78 61 46 48 368
19 83 73 73 52 39 40 360
21 90 85 89 52 45 47 408
22 72 62 63 46 36 42 321
23 72 68 68 59 54 51 372
25 73 65 74 74 58 65 409
26 76 78 74 59 50 55 392
27 74 70 80 57 44 51 376
28 80 72 79 54 46 51 382
29 69 60 65 53 48 49 344
30 70 62 59 65 62 65 383
Sum 1407 1286 1374 1226 986 1073 7352
% 19.14 17.49 18.69 16.68 13.41 14.59 100
Table 1. Number of samples per subject and type of activity
laying sitting standing walk walkdown walkup
55 50 57 64 49 53
Table 2. Number of samples per activity for training
laying sitting standing walk walkdown walkup
74 64 71 56 52 54
Table 3. Number of samples per activity for testing
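The split by subject described above can be sketched in base R as follows. The data frame `toy` is a hypothetical miniature: `subject` and `activity` mirror the variables described in the text, while `feature` and all values are made up for illustration.

```r
# Hypothetical miniature data frame: `subject` and `activity` mirror the
# variables described in the text, `feature` stands in for the 561 numeric ones.
set.seed(42)
subjects <- c(1, 3, 5, 6, 7, 27, 28, 29, 30)
toy <- data.frame(
  subject  = rep(subjects, each = 4),
  activity = factor(sample(c("laying", "sitting", "walk"), 36, replace = TRUE)),
  feature  = rnorm(36)
)

train_ids <- c(1, 3, 5, 6)
test_ids  <- c(27, 28, 29, 30)

# Split by subject so that no person contributes to both sets
train <- toy[toy$subject %in% train_ids, ]
test  <- toy[toy$subject %in% test_ids, ]

# Samples per activity in each set
print(table(train$activity))
print(table(test$activity))
```

Splitting by subject rather than by row means the test set measures how well a model generalizes to people it has not seen.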
We performed the cross-validation process for each of the previous classifiers using the training set and test
set specified earlier.
The results obtained for the different classification techniques (predictive models) using the R programming
language are presented in Table 4. This table shows the accuracy of each classification technique per activity.
The cells in bold and underline indicate the best accuracy.
It is important to take into account that we used all the quantitative variables (561 variables) to predict the
activity carried out by a subject in these 5 classification techniques. Recall that, with a large number of
variables, the performance of a classification algorithm may be strongly affected: many of these quantitative
variables could add noise that hinders correct classification, and others may not provide useful information to
distinguish among activities. On the other hand, it would be very interesting to measure how much the
classifiers are overfitting [16].
In general, most of the classification techniques used in this analysis achieve high levels of accuracy.
However, we observe lower accuracy for some activities and for some classification techniques.
% Correctly Predicted
Model     Tree           CART            BAGGING         Random Forest          SVM
          library(tree)  library(rpart)  library(ipred)  library(randomForest)  library(e1071)
laying    100.00         100.00          100.00          100.00                 100.00
sitting    70.31          67.19           67.19           82.81                  82.81
standing   85.92          88.73           88.73           88.73                  88.73
walk       50.00          57.14           80.30           92.86                  92.86
walkdown   84.61          86.54           94.23           86.54                  86.54
walkup     85.19          85.19           87.03           96.30                  98.15
All        79.34          80.80           86.25           91.21                  91.52
Table 4. Accuracies of the Classification Techniques
Tables 5-9 show the confusion matrices for each of the classification techniques.
Predicted Class
Actual Class laying sitting standing walk walkdown walkup
laying 74 0 0 0 0 0
sitting 0 45 19 0 0 0
standing 0 10 61 0 0 0
walk 0 0 0 28 6 22
walkdown 0 0 0 0 44 8
walkup 0 0 0 1 7 46
Table 5. Confusion matrix for the Decision Tree
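To make the link between a confusion matrix and the accuracy figures explicit, the sketch below recomputes the per-activity percentages from the counts in Table 5 using base R. The variable names are illustrative. For the Decision Tree, the per-activity values agree with the Tree column of Table 4 up to rounding, and the figure in the All row appears to correspond to the unweighted mean of the per-activity values rather than to the share of all test samples classified correctly.

```r
classes <- c("laying", "sitting", "standing", "walk", "walkdown", "walkup")

# Confusion matrix copied from Table 5 (rows: actual class, columns: predicted)
cm <- matrix(c(74,  0,  0,  0,  0,  0,
                0, 45, 19,  0,  0,  0,
                0, 10, 61,  0,  0,  0,
                0,  0,  0, 28,  6, 22,
                0,  0,  0,  0, 44,  8,
                0,  0,  0,  1,  7, 46),
             nrow = 6, byrow = TRUE,
             dimnames = list(actual = classes, predicted = classes))

# Per-class accuracy: correctly predicted samples divided by the row total
per_class <- 100 * diag(cm) / rowSums(cm)
print(round(per_class, 2))

# Unweighted mean of the per-class values
macro_avg <- mean(per_class)

# Share of all test samples that were predicted correctly
overall <- 100 * sum(diag(cm)) / sum(cm)

cat(sprintf("mean of per-class accuracies: %.2f%%\n", macro_avg))
cat(sprintf("overall accuracy: %.2f%%\n", overall))
```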
Predicted Class
Actual Class laying sitting standing walk walkdown walkup
laying 74 0 0 0 0 0
sitting 0 43 21 0 0 0
standing 0 8 63 0 0 0
walk 0 0 0 32 4 20
walkdown 0 0 0 0 45 7
walkup 0 0 0 1 7 46
Table 6. Confusion matrix for CART
Predicted Class
Actual Class laying sitting standing walk walkdown walkup
laying 74 0 0 0 0 0
sitting 0 43 21 0 0 0
standing 0 8 63 0 0 0
walk 0 0 0 53 0 3
walkdown 0 0 0 0 49 3
walkup 0 0 0 1 6 47
Table 7. Confusion matrix for Bagging
Predicted Class
Actual Class laying sitting standing walk walkdown walkup
laying 74 0 0 0 0 0
sitting 0 53 11 0 0 0
standing 0 8 63 0 0 0
walk 0 0 0 52 0 4
walkdown 0 0 0 0 47 5
walkup 0 0 0 0 2 52
Table 8. Confusion matrix for Random Forest
Predicted Class
Actual Class laying sitting standing walk walkdown walkup
laying 74 0 0 0 0 0
sitting 0 53 11 0 0 0
standing 0 8 63 0 0 0
walk 0 0 0 52 0 4
walkdown 0 0 0 0 47 5
walkup 0 0 0 0 1 53
Table 9. Confusion matrix for SVM
In general, we observed that the classification techniques identify laying correctly (100%). It appears much
more difficult to distinguish between sitting and standing, and also among walk, walkdown and walkup.
Bagging, Random Forest and SVM are classifiers that require more computing and memory resources, and therefore
more classification time, than Tree and CART.
Conclusions:
In this analysis, we employed various classification techniques to obtain different predictive models. The SVM
classifier achieved the highest level of accuracy in this analysis (91.52% accuracy). It would be advisable to
increase the number of observations, enlarging both the training set and the test set, and to observe whether
the accuracy increases or decreases. On the other hand, some activities are hard to tell apart, because their
patterns are very similar, and the classifier then misclassifies them.
References
[1] Accelerometer
http://en.wikipedia.org/wiki/Accelerometer. Accessed 03/04/2013
[2] Gyroscope
http://en.wikipedia.org/wiki/Gyroscope. Accessed 03/04/2013
[3] Cross Validation
http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29. Accessed 03/10/2013
[4] Supervised Learning
http://en.wikipedia.org/wiki/Supervised_learning. Accessed 03/05/2013
[5] Dataset of Human Activity Recognition Coursera
https://spark-public.s3.amazonaws.com/dataanalysis/samsungData.rda. Accessed 03/03/2013
[6] UC Irvine Machine Learning Repository
http://archive.ics.uci.edu/ml/. Accessed 03/06/2013
[7] Dataset of Human Activity Recognition Using Smartphones Data Set
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones. Accessed 03/06/2013
[8] File of Human Activity Recognition UCI
http://archive.ics.uci.edu/ml/machine-learning-databases/00240/UCI%20HAR%20Dataset.zip. Accessed 03/06/2013
[9] Predictive Modelling
http://en.wikipedia.org/wiki/Predictive_modelling. Accessed 03/10/2013
[10] Tree Learning
http://en.wikipedia.org/wiki/Decision_tree_learning. Accessed 03/10/2013
[11] CART
http://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees. Accessed 03/10/2013
[12] Bagging
http://en.wikipedia.org/wiki/Bootstrap_aggregating. Accessed 03/10/2013
[13] Random Forest (RF)
http://en.wikipedia.org/wiki/Random_forest. Accessed 03/10/2013
[14] Support Vector Machine (SVM)
http://en.wikipedia.org/wiki/Support_vector_machine. Accessed 03/10/2013
[15] R Markdown Page.
http://www.rstudio.com/ide/docs/authoring/using_markdown. Accessed 03/06/2013
[16] Overfitting
http://en.wikipedia.org/wiki/Overfitting. Accessed 03/10/2013

More Related Content

What's hot

Column store decision tree classification of unseen attribute set
Column store decision tree classification of unseen attribute setColumn store decision tree classification of unseen attribute set
Column store decision tree classification of unseen attribute set
ijma
 
IRJET- Plant Disease Detection and Classification using Image Processing a...
IRJET- 	  Plant Disease Detection and Classification using Image Processing a...IRJET- 	  Plant Disease Detection and Classification using Image Processing a...
IRJET- Plant Disease Detection and Classification using Image Processing a...
IRJET Journal
 
Histogram-based multilayer reversible data hiding method for securing secret ...
Histogram-based multilayer reversible data hiding method for securing secret ...Histogram-based multilayer reversible data hiding method for securing secret ...
Histogram-based multilayer reversible data hiding method for securing secret ...
journalBEEI
 
Data Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological DataData Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological Data
IRJET Journal
 
A real time filtering method of positioning data with moving window mechanism
A real time filtering method of positioning data with moving window mechanismA real time filtering method of positioning data with moving window mechanism
A real time filtering method of positioning data with moving window mechanism
Alexander Decker
 
Improved target recognition response using collaborative brain-computer inter...
Improved target recognition response using collaborative brain-computer inter...Improved target recognition response using collaborative brain-computer inter...
Improved target recognition response using collaborative brain-computer inter...
Kyongsik Yun
 
Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...
Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...
Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...
idescitation
 
Survey on evolutionary computation tech techniques and its application in dif...
Survey on evolutionary computation tech techniques and its application in dif...Survey on evolutionary computation tech techniques and its application in dif...
Survey on evolutionary computation tech techniques and its application in dif...
ijitjournal
 
A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...
A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...
A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...
IRJET Journal
 
Kaggle digits analysis_final_fc
Kaggle digits analysis_final_fcKaggle digits analysis_final_fc
Kaggle digits analysis_final_fc
Zachary Combs
 
From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...
Manuel Martín
 
On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...
UniversitasGadjahMada
 
A study on rough set theory based
A study on rough set theory basedA study on rough set theory based
A study on rough set theory based
ijaia
 
28 01-2021-05
28 01-2021-0528 01-2021-05
28 01-2021-05
AdemarAlves7
 
HII: Histogram Inverted Index for Fast Images Retrieval
HII: Histogram Inverted Index for Fast Images Retrieval  HII: Histogram Inverted Index for Fast Images Retrieval
HII: Histogram Inverted Index for Fast Images Retrieval
IJECEIAES
 
Fuzzy Type Image Fusion Using SPIHT Image Compression Technique
Fuzzy Type Image Fusion Using SPIHT Image Compression TechniqueFuzzy Type Image Fusion Using SPIHT Image Compression Technique
Fuzzy Type Image Fusion Using SPIHT Image Compression Technique
IJERA Editor
 
IRJET- Proposed System for Animal Recognition using Image Processing
IRJET-  	  Proposed System for Animal Recognition using Image ProcessingIRJET-  	  Proposed System for Animal Recognition using Image Processing
IRJET- Proposed System for Animal Recognition using Image Processing
IRJET Journal
 
Draft activity recognition from accelerometer data
Draft activity recognition from accelerometer dataDraft activity recognition from accelerometer data
Draft activity recognition from accelerometer data
Raghu Palakodety
 
Bs31267274
Bs31267274Bs31267274
Bs31267274IJMER
 

What's hot (20)

Column store decision tree classification of unseen attribute set
Column store decision tree classification of unseen attribute setColumn store decision tree classification of unseen attribute set
Column store decision tree classification of unseen attribute set
 
IRJET- Plant Disease Detection and Classification using Image Processing a...
IRJET- 	  Plant Disease Detection and Classification using Image Processing a...IRJET- 	  Plant Disease Detection and Classification using Image Processing a...
IRJET- Plant Disease Detection and Classification using Image Processing a...
 
Histogram-based multilayer reversible data hiding method for securing secret ...
Histogram-based multilayer reversible data hiding method for securing secret ...Histogram-based multilayer reversible data hiding method for securing secret ...
Histogram-based multilayer reversible data hiding method for securing secret ...
 
Data Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological DataData Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological Data
 
A real time filtering method of positioning data with moving window mechanism
A real time filtering method of positioning data with moving window mechanismA real time filtering method of positioning data with moving window mechanism
A real time filtering method of positioning data with moving window mechanism
 
Ijetcas14 329
Ijetcas14 329Ijetcas14 329
Ijetcas14 329
 
Improved target recognition response using collaborative brain-computer inter...
Improved target recognition response using collaborative brain-computer inter...Improved target recognition response using collaborative brain-computer inter...
Improved target recognition response using collaborative brain-computer inter...
 
Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...
Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...
Energy Efficient Mobile Targets Classification and Tracking in WSNs based on ...
 
Survey on evolutionary computation tech techniques and its application in dif...
Survey on evolutionary computation tech techniques and its application in dif...Survey on evolutionary computation tech techniques and its application in dif...
Survey on evolutionary computation tech techniques and its application in dif...
 
A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...
A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...
A Hybrid Auto Surveillance Model Using Scale Invariant Feature Transformation...
 
Kaggle digits analysis_final_fc
Kaggle digits analysis_final_fcKaggle digits analysis_final_fc
Kaggle digits analysis_final_fc
 
From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...
 
On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...On comprehensive analysis of learning algorithms on pedestrian detection usin...
On comprehensive analysis of learning algorithms on pedestrian detection usin...
 
A study on rough set theory based
A study on rough set theory basedA study on rough set theory based
A study on rough set theory based
 
28 01-2021-05
28 01-2021-0528 01-2021-05
28 01-2021-05
 
HII: Histogram Inverted Index for Fast Images Retrieval
HII: Histogram Inverted Index for Fast Images Retrieval  HII: Histogram Inverted Index for Fast Images Retrieval
HII: Histogram Inverted Index for Fast Images Retrieval
 
Fuzzy Type Image Fusion Using SPIHT Image Compression Technique
Fuzzy Type Image Fusion Using SPIHT Image Compression TechniqueFuzzy Type Image Fusion Using SPIHT Image Compression Technique
Fuzzy Type Image Fusion Using SPIHT Image Compression Technique
 
IRJET- Proposed System for Animal Recognition using Image Processing
IRJET-  	  Proposed System for Animal Recognition using Image ProcessingIRJET-  	  Proposed System for Animal Recognition using Image Processing
IRJET- Proposed System for Animal Recognition using Image Processing
 
Draft activity recognition from accelerometer data
Draft activity recognition from accelerometer dataDraft activity recognition from accelerometer data
Draft activity recognition from accelerometer data
 
Bs31267274
Bs31267274Bs31267274
Bs31267274
 

Similar to Data Analysis. Predictive Analysis. Activity Prediction that a subject performs based in measurements obtained from the accelerometer and gyroscope of the Smartphones

IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET Journal
 
Human activity recognition with self-attention
Human activity recognition with self-attentionHuman activity recognition with self-attention
Human activity recognition with self-attention
IJECEIAES
 
Hidalgo jairo, yandun marco 595
Hidalgo jairo, yandun marco 595Hidalgo jairo, yandun marco 595
Hidalgo jairo, yandun marco 595
Marco Yandun
 
130509
130509130509
130509
130509130509
Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...
ISA Interchange
 
Performance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various ClassifiersPerformance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various Classifiers
amreshkr19
 
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
International Research Journal of Modernization in Engineering Technology and Science
 
Performance Evaluation of Different Data Mining Classification Algorithm and ...
Performance Evaluation of Different Data Mining Classification Algorithm and ...Performance Evaluation of Different Data Mining Classification Algorithm and ...
Performance Evaluation of Different Data Mining Classification Algorithm and ...
IOSR Journals
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
Editor IJMTER
 
4_7268-76_IIOABJournal.pdf
4_7268-76_IIOABJournal.pdf4_7268-76_IIOABJournal.pdf
4_7268-76_IIOABJournal.pdf
RiyaDadlani1
 
4_7268-76_IIOABJournal.pdf
4_7268-76_IIOABJournal.pdf4_7268-76_IIOABJournal.pdf
4_7268-76_IIOABJournal.pdf
RiyaDadlani1
 
Improving the accuracy of fingerprinting system using multibiometric approach
Improving the accuracy of fingerprinting system using multibiometric approachImproving the accuracy of fingerprinting system using multibiometric approach
Improving the accuracy of fingerprinting system using multibiometric approach
IJERA Editor
 
Parametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithmParametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithmIAEME Publication
 
A Survey on Machine Learning Algorithms
A Survey on Machine Learning AlgorithmsA Survey on Machine Learning Algorithms
A Survey on Machine Learning Algorithms
AM Publications
 
Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397Editor IJARCET
 
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUESGI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
AM Publications
 
Regression with Microsoft Azure & Ms Excel
Regression with Microsoft Azure & Ms ExcelRegression with Microsoft Azure & Ms Excel
Regression with Microsoft Azure & Ms Excel
Dr. Abdul Ahad Abro
 

Similar to Data Analysis. Predictive Analysis. Activity Prediction that a subject performs based in measurements obtained from the accelerometer and gyroscope of the Smartphones (20)

IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
 
Human activity recognition with self-attention
Human activity recognition with self-attentionHuman activity recognition with self-attention
Human activity recognition with self-attention
 
Hidalgo jairo, yandun marco 595
Hidalgo jairo, yandun marco 595Hidalgo jairo, yandun marco 595
Hidalgo jairo, yandun marco 595
 
130509
130509130509
130509
 
130509
130509130509
130509
 
Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...
 
Performance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various ClassifiersPerformance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various Classifiers
 
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
Data Analysis. Predictive Analysis. Activity Prediction that a subject performs based in measurements obtained from the accelerometer and gyroscope of the Smartphones

  • 1. IT @gsantosgo Information Technology
Data Analysis
Title: Predicting the activity a subject performs from measurements obtained by the accelerometer and gyroscope of smartphones

Introduction:
In recent years, small mobile devices known as smartphones have become part of our everyday lives. These devices are miniature mobile computers: they have an operating system that can launch applications, and they include applications to manage contacts and an address book, to create, edit or view different types of documents, to access or browse the Web, and to provide telephony and messaging services. Beyond these features, most smartphones now also incorporate cameras, GPS and various types of sensors.

In this analysis, we used data obtained from the accelerometer [1] and gyroscope [2] sensor signals of smartphones. The accelerometer measures 3-axial linear acceleration and the gyroscope measures 3-axial angular velocity; together, these two sensors can monitor device acceleration, position, orientation, rotation and angular motion. All these data can be stored and used to recognize a user’s activity. Here we refer to physical activities that a person can perform daily, such as walking, walking up, jogging, sitting, laying, etc.

The aim of this analysis was to perform a classification task. We took a dataset with its attributes (acceleration, orientation, …) and its labeled variable (in this case, the activity), and then created several classification models, also known as classifiers. Various classification algorithms can be used to create these models. These algorithms use all the available information in a dataset to help us classify or predict which activity a person is performing.
To create the classification models, we first chose different classification algorithms or techniques. For each of them we then applied what is called cross-validation [3]; that is, we trained the algorithm on a set of training data made up of several observations from our available dataset. The next task was to test each classification algorithm and observe its accuracy, that is, whether the predictive model can correctly classify a person’s activity using the knowledge acquired in the training stage. This whole process is known as supervised learning [4].
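As an illustration of the cross-validation idea [3], the following is a minimal sketch of k-fold index splitting in Python. It is not the code used in this analysis (which was written in R and is not published); the function name `k_fold_indices` and its parameters are hypothetical, and the sketch shows only the general technique of holding out each fold in turn for testing.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Every fold except the i-th one is used for training.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Example: 10 observations split into 5 folds.
for train_idx, test_idx in k_fold_indices(n_samples=10, k=5):
    print(len(train_idx), "training rows,", len(test_idx), "test rows")
```

Each observation is used for testing exactly once and for training in the remaining rounds, so the accuracy can be averaged over the k rounds.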
  • 2. Methods:

Data Collection
For this analysis we used a dataset on Human Activity Recognition. The dataset was downloaded from coursera.org [5], as part of the Data Analysis course, on March 03, 2013 using the R programming language. The data had been processed beforehand to make them easier to load into R; they derive from raw data in the UC Irvine Machine Learning Repository [6], which hosts a Human Activity Recognition dataset [7] built from the recordings of 30 subjects performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors.

The dataset for this analysis contains 7352 observations and 563 variables. For each observation there is a categorical (factor) variable called “activity” (our labeled variable or class) that indicates the activity carried out by the person; it has only six possible values: laying, sitting, standing, walk, walkdown and walkup. There is also an integer variable called “subject” that identifies the person who performed the activity. Finally, the remaining 561 variables are numeric (quantitative) and contain time- and frequency-domain features (mean, standard deviation, energy, correlation, etc.) of the triaxial acceleration from the accelerometer, the triaxial angular velocity from the gyroscope, etc.

More information about all these variables can be found in this compressed file [8]. It contains descriptive files with information about the variables used in this dataset, all the features, and the labeled variable or class.

Exploratory Analysis
Exploratory analysis was performed by examining data and plots of the observed data.
Exploratory analysis was used to (1) identify missing values, (2) verify the quality of the data, (3) check that the variable names are syntactically correct and (4) identify possible differences in patterns between the activities, so as to be able to distinguish when a user performs one activity or another. Our predictive model [9] should be able to recognize the patterns corresponding to every activity.

Figure 1 shows the different patterns for the different activities according to the analysis of X-axis acceleration. We can observe that the patterns differ according to the activity carried out by the user.
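Two of the exploratory checks described above, counting missing values and comparing a feature across activities, can be sketched as follows. This is only an illustration in Python: the rows and their numeric values are made up, and the helper names `count_missing` and `mean_by_activity` are hypothetical rather than taken from the original R analysis.

```python
import math
from collections import defaultdict

# Hypothetical rows: (activity, X-axis acceleration feature). Values are invented.
rows = [
    ("laying", 0.28), ("laying", 0.27),
    ("walk", 0.31), ("walk", 0.25),
    ("sitting", 0.27), ("sitting", float("nan")),
]


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or math.isnan(value)


def count_missing(rows):
    """Number of rows whose feature value is missing."""
    return sum(1 for _, value in rows if is_missing(value))


def mean_by_activity(rows):
    """Mean of the feature per activity, ignoring missing values."""
    groups = defaultdict(list)
    for activity, value in rows:
        if not is_missing(value):
            groups[activity].append(value)
    return {activity: sum(vs) / len(vs) for activity, vs in groups.items()}


print("missing values:", count_missing(rows))
print(mean_by_activity(rows))
```

Per-activity summaries of this kind are a numeric counterpart to the plots: activities whose summaries are close together are likely to be harder to tell apart.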
  • 3. Figure 2 shows the different patterns for the different activities according to the analysis of Y-axis acceleration.

Figure 3 shows the different patterns for the different activities according to the analysis of Z-axis acceleration.
  • 4. It is important to keep in mind that if some activities share common patterns, our predictive model will obviously find it harder to classify these activities correctly and will therefore have lower accuracy; that is, it will have more difficulty distinguishing among activities.

Statistical Modeling
To classify the activity performed by a subject, we used various classification techniques or algorithms to recognize and predict our labeled variable (activity). The techniques (classifiers) employed for this data analysis are the following:
Decision Trees [10]
CART [11]
Bagging [12]
Random Forest [13]
SVM [14]
We performed cross-validation for each of these techniques (classifiers). We also evaluated the performance, the accuracy and the error rate of these classifiers.

Reproducibility
All analyses performed in this manuscript are reproduced in the R markdown file samsungPredictive.Rmd [15].
Note: due to security concerns with the exchange of R code, we do not submit the code to reproduce the analysis.

Results:
As stated above, the dataset for this analysis contains a total of 7352 observations with 563 variables; these observations correspond to a total of 21 people. Table 1 shows the number of samples per subject and type of activity, and also the percentage of the total per activity in our dataset.

We found variables with syntactically incorrect names, that is, names containing invalid characters such as commas (“,”), brackets (“(”), etc., so it was necessary to make the variable names valid and not duplicated in our dataset (or data frame). We checked for missing values in the dataset, and there were none. Our class or labeled variable was transformed from a character variable to a factor variable with 6 levels: “laying”, “sitting”, “standing”, “walk”, “walkdown” and “walkup”.
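The clean-up of variable names described above can be sketched as follows. This is a Python illustration of the idea, not the original R code: the helper `make_valid_names` is hypothetical, the rule it applies (replace every character other than letters, digits, "_" and "." with ".", then number repeated names) is an assumption, and the example feature names are given only for illustration.

```python
import re


def make_valid_names(names):
    """Return syntactically valid, non-duplicated versions of the given names."""
    seen = {}
    result = []
    for name in names:
        # Replace characters such as "-", "(", ")" and "," with a dot.
        clean = re.sub(r"[^A-Za-z0-9_.]", ".", name)
        # Make sure the name starts with a letter or a dot.
        if not re.match(r"[A-Za-z.]", clean):
            clean = "X" + clean
        # Append a counter to repeated names (a simplification: the suffixed
        # name is not re-checked against names that already exist).
        count = seen.get(clean, 0)
        seen[clean] = count + 1
        result.append(clean if count == 0 else f"{clean}.{count}")
    return result


raw = [
    "tBodyAcc-mean()-X",
    "fBodyAcc-bandsEnergy()-1,8",
    "fBodyAcc-bandsEnergy()-1,8",
]
print(make_valid_names(raw))
```

After this step every column can be referenced by name without quoting, and no two columns share a name.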
  • 5. As required by the assignment, for this data analysis we used a training set that includes the data from subjects 1, 3, 5 and 6 and a test set that includes the data from subjects 27, 28, 29 and 30. Table 2 shows the number of samples per activity used in the training stage, and Table 3 shows the number of samples per activity used in the testing stage.

id     laying   sitting   standing   walk    walkdown   walkup   Total
1      50       47        53         95      49         53       347
3      62       52        61         58      49         59       341
5      52       44        56         56      47         47       302
6      57       55        57         57      48         51       325
7      52       48        53         57      47         51       308
8      54       46        54         48      38         41       281
11     57       53        47         59      46         54       316
14     51       54        60         59      45         54       323
15     72       59        53         54      42         48       328
16     70       69        78         51      47         51       366
17     71       64        78         61      46         48       368
19     83       73        73         52      39         40       360
21     90       85        89         52      45         47       408
22     72       62        63         46      36         42       321
23     72       68        68         59      54         51       372
25     73       65        74         74      58         65       409
26     76       78        74         59      50         55       392
27     74       70        80         57      44         51       376
28     80       72        79         54      46         51       382
29     69       60        65         53      48         49       344
30     70       62        59         65      62         65       383
Sum    1407     1286      1374       1226    986        1073     7352
%      19,14    17,49     18,69      16,68   13,41      14,59    100
Table 1. Number of samples per subject and type of activity

laying   sitting   standing   walk   walkdown   walkup
55       50        57         64     49         53
Table 2. Number of samples per activity for training

laying   sitting   standing   walk   walkdown   walkup
74       64        71         56     52         54
Table 3. Number of samples per activity for testing
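The split by subject described above can be sketched as follows. The subject numbers come from the text; everything else (the Python language, the `split_by_subject` helper and the example records) is an assumption made for illustration, since the original R code is not available.

```python
# Subjects assigned to each set, as stated in the text.
TRAIN_SUBJECTS = {1, 3, 5, 6}
TEST_SUBJECTS = {27, 28, 29, 30}

# Hypothetical records: (subject id, activity). The feature columns are omitted.
records = [
    (1, "walk"), (3, "sitting"), (7, "laying"),
    (27, "standing"), (30, "walkup"), (14, "walkdown"),
]


def split_by_subject(records):
    """Return (train, test) lists; records of other subjects are left out."""
    train = [r for r in records if r[0] in TRAIN_SUBJECTS]
    test = [r for r in records if r[0] in TEST_SUBJECTS]
    return train, test


train, test = split_by_subject(records)
print(len(train), "training records,", len(test), "test records")
```

Splitting by subject rather than by row keeps every person entirely in one set, so the test accuracy reflects how well a model generalizes to people it has never seen.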
  • 6. We performed the cross-validation process for each of the previous classifiers using the training set and the test set specified earlier.

The results obtained for the different classification techniques (predictive models) using the R programming language are presented in Table 4. This table shows the accuracy of each classification technique per activity. The cells in bold and underline indicate the best accuracy.

It is important to take into account that we used all the quantitative variables (561 variables) to predict the activity carried out by a subject in these 5 classification techniques. Recall that with a large number of variables the performance of a classification algorithm may be severely affected; moreover, many of these quantitative variables may add noise that hampers correct classification, and others may provide little useful information for distinguishing among activities. On the other hand, it would be very interesting to measure how much the classifiers are overfitting [16].

In general, most of the classification techniques used in this analysis reach high levels of accuracy. However, accuracy is lower for some activities and for some classification techniques.

% Correctly Predicted
Model      Tree            CART             BAGGING          Random Forest           SVM
           library(tree)   library(rpart)   library(ipred)   library(randomForest)   library(e1071)
laying     100,00          100,00           100,00           100,00                  100,00
sitting    70,31           67,19            67,19            82,81                   82,81
standing   85,92           88,73            88,73            88,73                   88,73
walk       50,00           57,14            80,30            92,86                   92,86
walkdown   84,61           86,54            94,23            86,54                   86,54
walkup     85,19           85,19            87,03            96,30                   98,15
All        79,34           80,80            86,25            91,21                   91,52
Table 4. Accuracies of the classification techniques

Tables 5 to 9 show the confusion matrices for each of the classification techniques.

               Predicted Class
Actual Class   laying   sitting   standing   walk   walkdown   walkup
laying         74       0         0          0      0          0
sitting        0        45        19         0      0          0
standing       0        10        61         0      0          0
walk           0        0         0          28     6          22
walkdown       0        0         0          0      44         8
walkup         0        0         0          1      7          46
Table 5. Confusion matrix for the Decision Tree
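The per-activity percentages in Table 4 can be derived from a confusion matrix by dividing each diagonal count by its row total. The Python sketch below does this for the Decision Tree matrix of Table 5. Reading the "All" row as the unweighted mean of the six per-activity percentages is an assumption; it does reproduce the Decision Tree value of 79,34, whereas the share of all test samples classified correctly is slightly different. The function names are hypothetical.

```python
LABELS = ["laying", "sitting", "standing", "walk", "walkdown", "walkup"]

# Table 5: rows are actual classes, columns are predicted classes.
TREE_CONFUSION = [
    [74, 0, 0, 0, 0, 0],
    [0, 45, 19, 0, 0, 0],
    [0, 10, 61, 0, 0, 0],
    [0, 0, 0, 28, 6, 22],
    [0, 0, 0, 0, 44, 8],
    [0, 0, 0, 1, 7, 46],
]


def per_class_accuracy(matrix):
    """Percentage of each actual class that was predicted correctly."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


def macro_average(values):
    """Unweighted mean of the per-class percentages."""
    return sum(values) / len(values)


def overall_accuracy(matrix):
    """Percentage of all samples on the diagonal of the matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


per_class = per_class_accuracy(TREE_CONFUSION)
for label, value in zip(LABELS, per_class):
    print(f"{label:9s} {value:6.2f}")
print(f"macro average    {macro_average(per_class):6.2f}")
print(f"overall accuracy {overall_accuracy(TREE_CONFUSION):6.2f}")
```

The macro average weights every activity equally regardless of how many test samples it has, which is why it can differ from the overall share of correct predictions.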
  • 7.
               Predicted Class
Actual Class   laying   sitting   standing   walk   walkdown   walkup
laying         74       0         0          0      0          0
sitting        0        43        21         0      0          0
standing       0        8         63         0      0          0
walk           0        0         0          32     4          20
walkdown       0        0         0          0      45         7
walkup         0        0         0          1      7          46
Table 6. Confusion matrix for CART

               Predicted Class
Actual Class   laying   sitting   standing   walk   walkdown   walkup
laying         74       0         0          0      0          0
sitting        0        43        21         0      0          0
standing       0        8         63         0      0          0
walk           0        0         0          53     0          3
walkdown       0        0         0          0      49         3
walkup         0        0         0          1      6          47
Table 7. Confusion matrix for Bagging

               Predicted Class
Actual Class   laying   sitting   standing   walk   walkdown   walkup
laying         74       0         0          0      0          0
sitting        0        53        11         0      0          0
standing       0        8         63         0      0          0
walk           0        0         0          52     0          4
walkdown       0        0         0          0      47         5
walkup         0        0         0          0      2          52
Table 8. Confusion matrix for Random Forest

               Predicted Class
Actual Class   laying   sitting   standing   walk   walkdown   walkup
laying         74       0         0          0      0          0
sitting        0        53        11         0      0          0
standing       0        8         63         0      0          0
walk           0        0         0          52     0          4
walkdown       0        0         0          0      47         5
walkup         0        0         0          0      1          53
Table 9. Confusion matrix for SVM

In general, we observed that all the classification techniques identify laying correctly (100%). It appears much more difficult to distinguish between sitting and standing, and also among walk, walkdown and walkup.
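The observation about which activities are confused with each other can be read directly from the off-diagonal cells of a confusion matrix. The following Python sketch ranks those cells for the Random Forest matrix of Table 8; the function name `top_confusions` is hypothetical and the code is only an illustration, not part of the original analysis.

```python
LABELS = ["laying", "sitting", "standing", "walk", "walkdown", "walkup"]

# Table 8: rows are actual classes, columns are predicted classes.
RF_CONFUSION = [
    [74, 0, 0, 0, 0, 0],
    [0, 53, 11, 0, 0, 0],
    [0, 8, 63, 0, 0, 0],
    [0, 0, 0, 52, 0, 4],
    [0, 0, 0, 0, 47, 5],
    [0, 0, 0, 0, 2, 52],
]


def top_confusions(matrix, labels, n=3):
    """Return the n largest off-diagonal cells as (count, actual, predicted)."""
    pairs = [
        (matrix[i][j], labels[i], labels[j])
        for i in range(len(matrix))
        for j in range(len(matrix))
        if i != j and matrix[i][j] > 0
    ]
    # Sort by count, largest first.
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs[:n]


for count, actual, predicted in top_confusions(RF_CONFUSION, LABELS):
    print(f"{count:3d} samples of {actual} predicted as {predicted}")
```

For this matrix the two largest cells are sitting predicted as standing and standing predicted as sitting, followed by errors among the walking activities.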
  • 8. Bagging, Random Forest and SVM are classifiers that require more computing and memory resources, and therefore more classification time, than Tree and CART.

Conclusions:
In this analysis, we employed various classification techniques to obtain different predictive models. The SVM classifier achieved the highest accuracy in this analysis (91,52% accuracy). It would be advisable to increase the number of observations, and also to increase the number of samples in both the training set and the test set, and to observe whether the accuracy increases or decreases. On the other hand, some activities are hard to tell apart because many of their patterns are similar, and the classifier then fails to classify them correctly.

References
[1] Accelerometer. http://en.wikipedia.org/wiki/Accelerometer. Accessed 03/04/2013
[2] Gyroscope. http://en.wikipedia.org/wiki/Gyroscope. Accessed 03/04/2013
[3] Cross Validation. http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29. Accessed 03/10/2013
[4] Supervised Learning. http://en.wikipedia.org/wiki/Supervised_learning. Accessed 03/05/2013
[5] Dataset of Human Activity Recognition, Coursera. https://spark-public.s3.amazonaws.com/dataanalysis/samsungData.rda. Accessed 03/03/2013
[6] UC Irvine Machine Learning Repository. http://archive.ics.uci.edu/ml/. Accessed 03/06/2013
[7] Dataset of Human Activity Recognition Using Smartphones Data Set. http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones. Accessed 03/06/2013
[8] File of Human Activity Recognition, UCI. http://archive.ics.uci.edu/ml/machine-learning-databases/00240/UCI%20HAR%20Dataset.zip. Accessed 03/06/2013
[9] Predictive Modelling. http://en.wikipedia.org/wiki/Predictive_modelling. Accessed 03/10/2013
[10] Tree Learning. http://en.wikipedia.org/wiki/Decision_tree_learning. Accessed 03/10/2013
[11] CART. http://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees. Accessed 03/10/2013
  • 9. [12] Bagging. http://en.wikipedia.org/wiki/Bootstrap_aggregating. Accessed 03/10/2013
[13] Random Forest (RF). http://en.wikipedia.org/wiki/Random_forest. Accessed 03/10/2013
[14] Support Vector Machine (SVM). http://en.wikipedia.org/wiki/Support_vector_machine. Accessed 03/10/2013
[15] R Markdown Page. http://www.rstudio.com/ide/docs/authoring/using_markdown. Accessed 03/06/2013
[16] Overfitting. http://en.wikipedia.org/wiki/Overfitting. Accessed 03/10/2013