A PROJECT REPORT
ON
A Study on Physical Activity Recognition from
Accelerometer Data using Smartphones
Submitted in partial fulfilment of the requirements for the award of the
degree of Bachelor of Science
in
Computer Science and Engineering
Project Supervisor:
Dr. Md Aktaruzzaman
Associate Professor
Department of Computer Science & Engineering
Islamic University-Bangladesh
Submitted By:
Diponkor Bala
Roll No: 1314021
Reg No: 1136
Session: 2013-2014
Department of Computer Science & Engineering
Islamic University-Bangladesh
Certificate
This is to certify that the project report entitled “A Study on Physical
Activity Recognition from Accelerometer Data using
Smartphones”, submitted by Diponkor Bala, Roll No. 1314021,
is an authentic work carried out by him at the Department of Computer
Science & Engineering, Islamic University-Bangladesh, under my
guidance. It is my pleasure to declare that, to the best of my knowledge,
this work is original and that this is the first time it has been carried out
using this procedure.
Signature: ………………………
Date: ……………………………
Dr. Md Aktaruzzaman
Associate Professor
Department of Computer Science & Engineering
Islamic University-Bangladesh
Acknowledgement
On the submission of my project report on “A Study on Physical
Activity Recognition from Accelerometer Data using
Smartphones”, I would like to extend my gratitude and sincere thanks
to my supervisor Dr. Md Aktaruzzaman, Associate Professor, Department of
Computer Science and Engineering, for his constant motivation and support,
which helped me to complete this work.
I truly appreciate and value his esteemed guidance and encouragement from
the beginning to the end of this project. I am indebted to him for having helped
me shape the problem and for providing insights towards the solution.
I want to thank all my teachers for providing a solid background for my
studies. They have been great sources of inspiration to me and
I thank them from the bottom of my heart.
Above all, I would like to thank all of my friends whose direct and indirect
support helped me complete my project in time. The project would have
been impossible without their perpetual moral support.
The Author
Diponkor Bala
Department of Computer Science & Engineering
Islamic University-Bangladesh
Abstract
Physical-activity recognition via wearable sensors can provide valuable information
regarding an individual's degree of functional ability and lifestyle. In this report, I
present an accelerometer-based approach to human-activity recognition.
An insufficient amount of physical activity, and hence the storage of unspent calories,
may lead to depression, obesity, cardiovascular diseases, and diabetes. The number of
calories consumed depends on the type of activity, so recognizing physical activity is
important for estimating the calories spent by a subject every day. Several
research works on activity recognition through accelerometers (body-worn sensors)
have already been published in the literature. The accuracy of any recognition
system depends on the robustness of the selected features and classifiers. For this work,
I extracted the following features: mean, median absolute deviation (MAD), standard
deviation (STD), minimum (min), maximum (max), signal energy, signal magnitude
area (SMA), tilt angle (TA), and autoregressive coefficients (ARcoeffs). The system was
trained and tested in an experiment with multiple human subjects in real-world
conditions. For classification, I selected five classifiers, each offering good
performance for recognizing the set of activities, and investigated how to combine
them into an optimal set of classifiers. The best classification rate in the experiment
was 92.0%.
Keywords: Activity Recognition, Smartphone, Accelerometer, Classification.
Contents
Certificate
Acknowledgement
Abstract
List of Figures and Tables
Contents
INTRODUCTION
1.1 Introduction
1.2 Literature Review
BACKGROUND
2.1 Background
2.2 Accelerometers
2.2.1 The purpose of the accelerometer
2.2.2 How they work
2.3 Machine Learning
2.3.1 Types of Machine Learning Algorithms
2.3.1.1 Supervised Learning
2.3.1.2 Unsupervised Learning
2.3.1.3 Semi-supervised Learning
2.3.1.4 Reinforcement Learning
2.3.2 Machine Learning Approaches
METHODOLOGY
3.1 Methodology
3.1.1 Sensor Data Acquisition
3.1.2 Preprocessing
3.1.3 Feature Selection and Extraction
3.1.4 Estimate Calorie Consumption
CLASSIFICATION
4.1 Classification
4.2 Confusion Matrix
RESULTS AND DISCUSSIONS
5.1 Result and Discussion
CONCLUSIONS
6.1 Conclusions
6.2 Future Work
REFERENCES
LISTS OF FIGURES AND TABLES
Figure 1: Example of a binary classification problem in a two-dimensional space
Figure 2: Activity Recognition process pipeline
Figure 3: The polarity and position of the accelerometer on the human body
Figure 4: Sample raw output of the tri-axial accelerometer during five different
types of physical activity
Figure 5: Some examples of windows of an activity (walking)
Figure 6: The Human Activity Recognition Process Pipeline with its four main
blocks
Figure 7: An example of zero-crossings for counting peaks to get step counts
Table 1: Using five types of classifier
Table 2: Trained accuracy and Test accuracy results
Figure 8: Trained accuracy by classification learner app
Figure 9: Scatter plot for Decision Tree
Figure 10: Scatter plot for Linear SVM
Figure 11: Scatter plot for Cubic SVM
Figure 12: Scatter plot for Quadratic SVM
Figure 13: Scatter plot for Weighted KNN
Figure 14: Confusion Matrix for Simple Tree
Figure 15: Confusion Matrix for Linear SVM
Figure 16: Confusion Matrix for Cubic SVM
Figure 17: Confusion Matrix for Quadratic SVM
Figure 18: Confusion Matrix for Weighted KNN
Figure 19: ROC curve for Simple Tree
Figure 20: ROC curve for Linear SVM
Figure 21: ROC curve for Cubic SVM
Figure 22: ROC curve for Quadratic SVM
Figure 23: ROC curve for Weighted KNN
CHAPTER 1
INTRODUCTION
1.1. Introduction
Human activity recognition is an important yet challenging research area with many
applications in healthcare, smart environments, and homeland security. Computer
vision-based techniques have widely been used for human activity tracking, but they
mostly require infrastructure support. Alternatively, a more efficient approach is to
process the data from inertial measurement unit sensors worn on a user’s body or
built into a user’s smartphone to track his or her motion.
We aim to develop a model that is capable of recognizing multiple sets of daily
activities under real-world conditions, using data collected by a single tri-axial
accelerometer built into a cell phone (in our study, an Android smartphone). A
tri-axial accelerometer is a sensor that returns an estimate of acceleration along the x, y
and z axes, from which velocity and displacement can also be estimated. Activity
recognition is formulated as a supervised classification problem, whose training data
is obtained via an experiment in which human subjects perform each of the activities.
We aim at a classification methodology that is robust regardless of the classifier
in use.
Our model has been tested in an experiment having three users each performing one
of the following five physical activity patterns: Walking, Stairs-Up, Stairs-Down,
Sitting, Standing. In the following sections, we discuss the related work, describe
our data collection methodology and our approach to recognizing activity from
accelerometer data, and present the results of our experiment.
1.2 Literature Review
Human activity recognition has been studied for years, and researchers have
proposed different solutions to the problem. Existing approaches typically use
vision sensors, inertial sensors, or a mixture of both. Machine learning and
threshold-based algorithms are often applied. Machine learning usually produces
more accurate and reliable results, while threshold-based algorithms are faster and
simpler. A single accelerometer attached to one of several body positions is the most
common solution. Approaches that combine both vision and inertial sensors have
also been proposed. Another essential part of all these algorithms is data processing.
The quality of the input features has a great impact on performance. Some
previous works focus on generating the most useful features from the time
series data set. The common approach is to analyze the signal in both the time and
frequency domains.
Active learning has been applied to many machine learning problems in which
labeling samples is time-consuming and labor-intensive. Applications
include speech recognition, information extraction, and handwritten character
recognition.
CHAPTER 2
BACKGROUND
2.1 Background
2.2 Accelerometers
The accelerometer in Android phones measures the acceleration of the device on the
x (lateral), y (longitudinal), and z (vertical) axes. Accelerometers can be used to
detect movement and the rate of change of the speed of movement. The use of the
accelerometer in Android applications does not require the application to
hold a permission. Therefore, it is possible for an application to collect a
user’s accelerometer data without the user’s knowledge. With accelerometer data
and a server to collect it, it may be possible for someone
to infer a user’s personal information, their location, or what the user is
doing or typing.
2.2.1 The purpose of the accelerometer
The application of accelerometers extends to multiple disciplines, both academic and
consumer-driven. For example, accelerometers in laptops protect hard drives from
damage. If the laptop were to suddenly drop while in use, the accelerometer would
detect the free fall and immediately turn off the hard drive so that the read heads
do not hit the platter. Without this, the two would collide and scratch the platter,
causing extensive file and read damage. Accelerometers
are likewise used in cars as the industry-standard way of detecting crashes and
deploying airbags almost instantaneously.
In another example, a dynamic accelerometer measures gravitational pull to
determine the angle at which a device is tilted with respect to the Earth. By sensing
the amount of acceleration, users analyze how the device is moving.
Accelerometers allow the user to understand the surroundings of an item better. With
this small device, you can determine if an object is moving uphill, whether it will
fall over if it tilts any more, or whether it’s flying horizontally or angling downward.
For example, smartphones rotate their display between portrait and landscape mode
depending on how you tilt the phone.
2.2.2 How they work
An accelerometer looks like a simple circuit for some larger electronic device. Despite
its humble appearance, the accelerometer consists of many different parts and can work
in several ways, two of which are the piezoelectric effect and the capacitance sensor.
The piezoelectric accelerometer is the most common form and uses
microscopic crystal structures that become stressed due to accelerative forces. These
crystals create a voltage from the stress, and the accelerometer interprets the voltage
to determine velocity and orientation.
The capacitance accelerometer senses changes in capacitance between
microstructures located next to the device. If an accelerative force moves one of
these structures, the capacitance will change and the accelerometer will translate that
capacitance to voltage for interpretation.
Accelerometers are made up of many different components, and can be purchased
as a separate device. Analog and digital displays are available, though for most
technology devices, these components are integrated into the main technology and
accessed using the governing software or operating system.
Typical accelerometers measure along multiple axes: two to determine most
two-dimensional movement, with the option of a third for 3D positioning. Most
smartphones use three-axis models, whereas cars use only
a two-axis model to determine the moment of impact. The sensitivity of these devices is
quite high as they’re intended to measure even very minute shifts in acceleration.
The more sensitive the accelerometer, the more easily it can measure acceleration.
Accelerometers, while actively used in many electronics in today’s world, are also
available for use in custom projects. Whether you’re an engineer or tech geek, the
accelerometer plays a very active role in a wide range of functionalities. In many
cases you may not notice the presence of this simple sensor, but odds are you may
already be using a device with it.
2.3 Machine Learning
Machine learning is the area of study concerned with the design, development and
evaluation of systems capable of learning from data. In many common situations where
we need, for instance, to complete a particular task, or to make a
prediction regarding a given issue, it is possible to find solutions by inspecting
and analyzing previous observations with characteristics similar to the addressed
problem. In other words, machine learning systems are capable of predicting future
outcomes based on past experience.
Data are used as the input of the learning process and their representation is
fundamental for the performance of machine learning systems. They must describe
the situation of interest well enough to predict future data in a meaningful way. The
property that allows a system to correctly predict unseen samples is known as
generalization, and it is
highly desirable in any learning machine as it is directly related to its performance.
2.3.1 Types of Machine Learning Algorithms
Machine learning algorithms have been categorized according to the type of input
used for training and its expected outcome. In this section, we describe the most
relevant categories.
2.3.1.1 Supervised Learning
In this type of learning, input data are usually composed of a pair of elements,
namely the input vector (x) together with its target (y). This can be clarified
with an example: assume a system that learns handwritten numbers from 0 to 9. The
input vectors would be the set of images of all the numbers (usually several samples
of each one) and the target vector the actual labels that correspond to each sample.
If the output of the desired system is categorical (only a set of discrete classes is
considered), then it is a classification problem, as in the example presented above.
Otherwise, if the output data are continuous variables, such as in temperature
forecasting or stock market prediction, then it is a regression problem.
This type of algorithm is the most commonly used in ML and it is also the one used
in our research. However, it is not suitable for all kinds of problems. In fact, one
of its disadvantages is that in some applications it is not always possible to have
target information for all the available input samples. Other techniques,
such as unsupervised and semi-supervised learning, can cope with these situations
and are described below.
When the learning is performed gradually, for instance, by adding one new sample
and its target at a time to the model, we refer to Online Machine Learning. This
supervised approach has the advantage of making the model adaptive and flexible
with respect to new inputs. This type of learning is required in applications with
high output variability and where a stream of new samples is available and can be
progressively added to the model for learning. This is the case for online web ranking
and stock market prediction applications.
2.3.1.2 Unsupervised Learning
In an unsupervised learning problem, the training data consists of only input vectors
without their associated targets. It aims to find similarities or discover
distinguishable structure within the input data (e.g. clustering). It can also be used
for density estimation to describe the distribution of the data in its space. Moreover,
this learning approach can be exploited for data visualization using dimensionality
reduction methods, which project high-dimensional data into lower-dimensional
spaces. Unsupervised learning approaches have already been applied in several areas,
such as medical imaging, where 3D Positron Emission Tomography (PET) scans
use cluster analysis to find dissimilarities between different organs and types of
tissue in order to correctly segment the scanned area (George et al. 2011). It has
also been applied in the automatic grouping of similar shopping items (e.g. books,
movies, music), particularly in recommender systems for online stores that aim to
predict user preferences based on product similarities and previous purchases.
2.3.1.3 Semi-supervised Learning
This learning approach combines labeled and unlabeled data for learning. Therefore,
it takes aspects from both supervised and unsupervised approaches. In general, small
amounts of labeled data are integrated with a large number of unlabeled samples for
learning. For example, it is useful for datasets where it is not always possible to have
a label for each sample. Evidence has shown that semi-supervised learning can
greatly improve the learning performance when compared with supervised learning,
which does not take unlabeled data into account. This is feasible if assumptions
such as data smoothness hold. Vast digital image collections on the
internet for content retrieval are an example application where this type of learning
can be exploited. Not all the images have an associated target, and it would be
humanly impossible to perform this labeling manually.
2.3.1.4 Reinforcement Learning
This learning approach is oriented toward finding an appropriate set of actions to solve a
particular problem, with the purpose of maximizing a reward. Unlike supervised learning,
optimal solutions are not found by learning a model from a given set of input-target pairs.
2.3.2 Machine Learning Approaches
Several ML modeling approaches have been developed throughout the years in order
to solve different tasks such as classification, regression and clustering. Some of
them are based on deterministic models which aim to find fixed causal relationships
between events. Other approaches are probabilistic and assume that
occurring events are generated from a probability distribution.
In the following list, the most popular ML algorithms are briefly described, with
particular focus on SVMs, as they are the central ML
algorithm employed in this project.
• Decision Tree (DT): is a predictive model based on decision trees which makes
choices from a set of hierarchical rules related to the input data. It is a common
approach for classification particularly because the resulting models are easily
interpretable by humans (due to its intrinsic tree structure).
• Random Forest (RF): is an ML meta-classifier which is built using an ensemble
of DTs. The predicted class is chosen as the most frequently occurring amongst the
output of each DT.
• k-Nearest Neighbors (k-NN): this deterministic learning approach exploits
similarity measures between data for classification and regression tasks. Given a new
sample, the approach finds the k closest samples in the training set and decides the
prediction outcome from their targets. Its main disadvantage lies in the size of its model,
which is data-dependent and makes it impractical for large datasets. There are, however,
versions which use data reduction techniques to alleviate this issue.
• Naive Bayes (NB): is a popular probabilistic classifier based on Bayes’s theorem
that predicts the class of a given sample by assuming an underlying probability
model of the data and making strong independence assumptions between its features.
Even though its formulation is quite simple, it has been shown to perform well in various
applications. For example, when data are assumed to be Gaussian-distributed, it is
possible to learn the model only by calculating the mean and variance of the input
data.
• Artificial Neural Networks (ANN): is an ML approach with a biological
inspiration. It simulates how the brain and its nervous system, composed of interconnected
neurons, are able to learn from experience and capture the underlying structure of the
data. Neurons are set in a layered structure and have associated weights which
adapt based on the training data and the network output through a cost
function. This approach has been shown to perform well in many problems, including
non-linear ones. Its main disadvantage lies in the need for a large dataset for its training
stage. Multilayer Perceptron (MLP) is a popular ANN model that maps the input
through multiple layers of neurons in a fully connected directed graph until reaching
the output.
• Logistic Regression (LR): is a probabilistic algorithm used for solving
classification and regression problems. It estimates the probability that a given sample
belongs to a particular class. This is achieved through the use of a logistic
function which is modeled by fitting the training data, generally using maximum
likelihood estimation.
• Support Vector Machine (SVM):
A Support Vector Machine is one of the most commonly used supervised Machine
Learning algorithms. It was initially proposed by Vladimir Vapnik and his
colleagues with the aim of solving linear and non-linear binary classification
problems. Afterward, the algorithm was adapted for
multiclass classification and regression analysis.
The SVM for classification is a deterministic approach that aims to find the
hyperplanes that best separate the data into classes. These subspaces are the ones
that provide the largest margin separation from the classes of the training data with
the intention of providing a model with low generalization error for its use with
unseen data samples.
SVMs are the basis for the classification of activities in this work. For this reason
we now introduce them, starting from the binary SVM model which is its simplest
representation, to the extended case that allows the classification of more than two
classes: the multiclass SVM. This algorithm will be further revised throughout the
development of this research to tackle specific requirements for our application in
aspects such as kernel type, arithmetic used and algorithm output type.
Figure 1: Example of a binary classification problem in a two-dimensional space
(circles and crosses). The line represents a possible solution to the problem.
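The k-NN entry in the list above can be illustrated with a short sketch. This is only a minimal illustration in Python, not the implementation used in this project: the function name knn_predict, the choice of Euclidean distance, the unweighted majority vote and the toy feature values are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, sample, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    # Euclidean distance from the new sample to every training sample
    distances = [
        (math.dist(features, sample), label)
        for features, label in zip(train_x, train_y)
    ]
    # Keep the k closest samples and vote on their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy two-feature data (hypothetical values, e.g. mean and STD of a window)
train_x = [(0.1, 0.0), (0.2, 0.1), (0.0, 0.1), (1.0, 0.9), (1.1, 1.0), (0.9, 1.1)]
train_y = ["sitting", "sitting", "sitting", "walking", "walking", "walking"]
print(knn_predict(train_x, train_y, (0.95, 1.0)))
```

Because the whole training set must be kept and scanned for every prediction, the sketch also shows why the model size grows with the data, as noted in the k-NN description.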
CHAPTER 3
METHODOLOGY
3.1 Methodology
3.1.1 Sensor Data Acquisition
A pictorial representation of the physical activity recognition process is shown below:
Figure 2: Activity Recognition process pipeline
In order to collect data, we used the tri-axial accelerometer in an Android phone to
measure acceleration. Data from this accelerometer includes the acceleration along
the x-axis, y-axis and z-axis. These axes capture the medio-lateral (ML) movement
of the user (x-axis), vertical (V) movement (y-axis), and anterior-posterior (AP)
movement (z-axis). Figure 3 shows these axes relative to a user.
Figure 3: The polarity and position of the accelerometer on the human body (image
courtesy of GENEActive, 2012)
In this study, the acceleration signals were recorded from 9 subjects (age: 25±2
years). The subjects were asked to perform five activities: walking, stairs-up,
stairs-down, sitting, and standing. The duration of the stairs-up and stairs-down
activities is about 60±5 sec, and that of walking, sitting, and standing is 120±5 sec.
The sampling frequency used for
collecting data was 100 Hz. A sample of the acceleration signals of all activities is
shown in Figure 4.
Figure 4: Sample raw output of the tri-axial accelerometer during five different types of physical
activity
3.1.2 Preprocessing
The accelerometer's real-time output may contain random noise that should be
filtered out before it is used for activity recognition. A low-pass moving average
filter was applied to remove the noise from the acceleration data.
Before feature extraction, the sliding window segmentation technique is used to
divide the signal into segments of a fixed size. In this project, I used
a window size of 300. The raw signals from each dimension (x-axis, y-axis, and
z-axis) are split into a number of window segments. Two approaches
are commonly used: with or without overlapping. In the first
approach, two consecutive window segments overlap. In the second approach,
there is no overlap between
consecutive window segments.
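The two preprocessing steps described above can be sketched as follows. This is a hedged illustration in Python rather than the code used in the project: the filter width of 5 samples, the handling of the signal edges, the function names and the synthetic signal are assumptions, since the report does not specify them. Only the window size of 300 comes from the text.

```python
def moving_average(signal, width=5):
    """Low-pass filter: replace each sample by the mean of a centred window."""
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        # Near the edges the window is shortened instead of padded
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

def sliding_windows(signal, size=300, overlap=0.0):
    """Split a signal into fixed-size windows; overlap is a fraction in [0, 1)."""
    step = max(1, int(size * (1.0 - overlap)))
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, step)]

# 1200 synthetic samples stand in for one axis of a recording
axis = [float(i % 50) for i in range(1200)]
filtered = moving_average(axis, width=5)
print(len(sliding_windows(filtered, 300, overlap=0.0)))   # non-overlapping windows
print(len(sliding_windows(filtered, 300, overlap=0.5)))   # 50% overlapping windows
```

Overlapping windows produce more training segments from the same recording, at the cost of correlated samples between neighbouring windows.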
Some examples of windowed data are given below.
Figure 5: Some examples of windows of an activity (walking), showing Window (1), Window (2), Window (3) and Window (4)
3.1.3 Feature Selection and Extraction
In a machine learning problem, feature selection refers to the process of selecting a
significant subset of features that largely determines the discrimination ability of a
learning algorithm. Feature extraction, on the other hand, is an approach to reduce the
dimensionality of an available set of features by performing inter-feature
transformations in order to obtain a new, dimensionally reduced representation
without sacrificing much relevant information from the original set. The curse of
dimensionality, which describes the difficulty in understanding and dealing with
high-dimensional data, is closely linked with these two reduction mechanisms, as
they can alleviate the problems that may arise when working in high-dimensional
spaces.
Feature selection and extraction also reduce training times and can
improve the generalization performance in ML problems. They differ, however,
in that models built with feature selection are much easier to interpret.
In this case, features remain distinct from each other and are not merged, as
in feature-extraction-based approaches. Depending on the application, the features
required for the extraction of relevant information may vary. In the particular case
of human activity recognition (HAR), a reduced representation of the sensor data can
be used as the input of a
recognition algorithm. This is obtained by estimating various measures from the
sensor signals in different domains (e.g. in time and frequency). Other
time-frequency representations, such as wavelet transforms, are also
applicable. Once obtained, the features can be further reduced using feature selection (e.g.
exhaustive search, or wrapper, filter and embedded methods), extraction
approaches, or a combination of both.
Figure 6: The Human Activity Recognition Process Pipeline with its four main
blocks
The extracted features are described below:
Mean: Average or mean value of array
M = mean(A) returns the mean of the elements of A along the first array dimension
whose size does not equal 1.
• If A is a vector, then mean(A) returns the mean of the elements.
• If A is a matrix, then mean(A) returns a row vector containing the mean of
each column.
• If A is a multidimensional array, then mean(A) operates along the first array
dimension whose size does not equal 1, treating the elements as vectors. This
dimension becomes 1 while the sizes of all other dimensions remain the same.
Min: Minimum elements of an array
M = min(A) returns the minimum elements of an array.
• If A is a vector, then min(A) returns the minimum of A.
• If A is a matrix, then min(A) is a row vector containing the minimum value
of each column.
• If A is a multidimensional array, then min(A) operates along the first array
dimension whose size does not equal 1, treating the elements as vectors. The
size of this dimension becomes 1 while the sizes of all other dimensions
remain the same. If A is an empty array with first dimension 0,
then min(A) returns an empty array with the same size as A.
Max: Maximum elements of an array
M = max(A) returns the maximum elements of an array.
• If A is a vector, then max(A) returns the maximum of A.
• If A is a matrix, then max(A) is a row vector containing the maximum value
of each column.
• If A is a multidimensional array, then max(A) operates along the first array
dimension whose size does not equal 1, treating the elements as vectors. The
size of this dimension becomes 1 while the sizes of all other dimensions
remain the same. If A is an empty array whose first dimension has zero length,
then max(A) returns an empty array with the same size as A.
Mad: Mean or median absolute deviation
y = mad(X) returns the mean absolute deviation of the values in X, and
y = mad(X,1) returns the median absolute deviation, which is the feature (MAD) used in this work.
• If X is a vector, then mad returns the mean or median absolute deviation of
the values in X.
• If X is a matrix, then mad returns a row vector containing the mean or median
absolute deviation of each column of X.
• If X is a multidimensional array, then mad operates along the first
nonsingleton dimension of X.
STD: Standard deviation of the elements of the selected array.
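The statistical features listed above can be computed for a single window as in the following sketch. It is written in Python with the standard library only, as an assumed stand-in for the MATLAB functions named in the text; the function name window_features and the sample values are hypothetical.

```python
import statistics

def window_features(window):
    """Return mean, min, max, median absolute deviation and STD of one window."""
    med = statistics.median(window)
    return {
        "mean": statistics.fmean(window),
        "min": min(window),
        "max": max(window),
        # Median absolute deviation: median of |x - median(x)|
        "mad": statistics.median(abs(x - med) for x in window),
        # Sample standard deviation (n - 1 in the denominator)
        "std": statistics.stdev(window),
    }

features = window_features([1.0, 2.0, 3.0, 4.0, 10.0])
print(features)
```

In the example the single large value (10.0) pulls the mean and STD upward, while the median absolute deviation stays small, which is why MAD is often regarded as the more robust spread measure.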
Signal Energy: In signal processing, the total energy of a signal x(t) is defined as
\[ E = \int_{-\infty}^{\infty} |x(t)|^2 \, dt \]
where |x(t)| denotes the magnitude of x(t). Taking the magnitude is necessary to obtain
a real scalar quantity for a complex signal, because the magnitude of a complex number
z = a + jb is defined as
\[ |z| = \sqrt{a^2 + b^2} \]
The magnitude is squared by convention, so that the same definition applies
to any signal. For a sampled signal, the energy is therefore the sum of the squared
magnitudes of the samples.
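For a sampled signal, the sum-of-squared-magnitudes definition can be written in a few lines. The following Python sketch is illustrative only; the function name signal_energy and the sample values are assumptions.

```python
def signal_energy(samples):
    """Energy of a discrete signal: the sum of squared magnitudes."""
    # abs() also handles complex samples, where |a + jb|^2 = a^2 + b^2
    return sum(abs(x) ** 2 for x in samples)

print(signal_energy([1.0, -2.0, 3.0]))   # 1 + 4 + 9
print(signal_energy([3 + 4j]))           # |3 + 4j|^2
```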
Tilt angle (𝜽): The tilt angle is defined by the relative tilt of the body with respect
to the acceleration of gravity 𝑔⃗. 𝜃 has been expressed as the angle (in radians) made
by the V-axis with 𝑔⃗, and hence:
\[ \theta = \cos^{-1}\left( \frac{V}{\sqrt{V^2 + ML^2 + AP^2}} \right) \]
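The tilt angle can be evaluated directly from the three axis values. The Python sketch below is an illustration under assumed names (tilt_angle, v, ml, ap); the clamping step is an added safeguard against rounding errors and is not described in the text.

```python
import math

def tilt_angle(v, ml, ap):
    """Angle in radians between the vertical (V) axis and the gravity vector."""
    magnitude = math.sqrt(v * v + ml * ml + ap * ap)
    if magnitude == 0.0:
        raise ValueError("acceleration vector has zero length")
    # Clamp to [-1, 1] to guard against rounding errors before acos
    ratio = max(-1.0, min(1.0, v / magnitude))
    return math.acos(ratio)

print(tilt_angle(9.81, 0.0, 0.0))                 # gravity entirely on the V axis
print(math.degrees(tilt_angle(0.0, 9.81, 0.0)))   # gravity entirely on the ML axis
```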
ARcoeffs: The acceleration signal of each axis was fitted to an autoregressive (AR)
model. The model order was selected using the Akaike Information Criterion (AIC). The
first 3 coefficients were taken for each axis, giving 9 coefficients overall.
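As an illustration of fitting an AR model to one axis, the sketch below estimates three coefficients by ordinary least squares. The report does not state which estimation method was used, so the least-squares fit, the fixed order of 3 (no AIC search), the helper solve_linear and the synthetic signal are all assumptions. Sign conventions for AR coefficients also differ between tools; here the model is x[n] = a1*x[n-1] + a2*x[n-2] + a3*x[n-3].

```python
def solve_linear(a, b):
    """Solve a small square system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ar_coefficients(x, order=3):
    """Least-squares fit of x[n] = a1*x[n-1] + ... + ap*x[n-p]."""
    rows = [[x[n - k] for k in range(1, order + 1)] for n in range(order, len(x))]
    targets = x[order:]
    # Normal equations (R^T R) a = R^T y
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(order)]
            for i in range(order)]
    rhs = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(order)]
    return solve_linear(gram, rhs)

# Synthetic noise-free AR(3) sequence with known coefficients
true_a = [0.5, -0.3, 0.1]
x = [1.0, 0.0, -1.0]
for n in range(3, 30):
    x.append(sum(a * x[n - k] for k, a in enumerate(true_a, start=1)))
print(ar_coefficients(x, order=3))
```

Applying ar_coefficients to each of the three axes of a window would yield the 9 values mentioned in the text.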
Sample Entropy: Sample Entropy measures the regularity (or complexity) of a time
series by matching patterns of length m within an error tolerance r, then extending the
comparison to matching patterns of length m + 1. In this study, the values
m = 1 and r = 20% of the STD were used.
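A direct, unoptimized sketch of sample entropy with m = 1 and r = 0.2 x STD is given below. It is an assumed Python implementation of the standard definition (Chebyshev distance, self-matches excluded), not the code used in the project; the function name and test signal are hypothetical.

```python
import math
import statistics

def sample_entropy(x, m=1, r_factor=0.2):
    """Sample entropy with tolerance r = r_factor * STD and Chebyshev distance."""
    n = len(x)
    r = r_factor * statistics.pstdev(x)

    def count_matches(length):
        # The same n - m start positions are used for both template lengths
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):   # i < j excludes self-matches
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)        # matching pairs of length m
    a = count_matches(m + 1)    # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")     # undefined: no matches within tolerance
    return math.log(b / a)      # equal to -ln(a / b)

print(sample_entropy([0.0, 1.0] * 20))   # strictly periodic signal
```

A strictly periodic signal is fully predictable, so every match of length m extends to length m + 1 and the entropy is zero; irregular signals give larger values.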
3.1.4 Estimate Calorie Consumption
Stride Length: A rough estimate of the stride length of each user can be obtained
from the subject’s height. For this work, an average stride length of 75 cm was used.
Step Count: Step counting is performed by a zero-crossing detector which is
activated only for walking, going up-stairs, or going down-stairs. To reduce the
influence of noise, a threshold of three times the standard deviation of the static
activities was used. The number of steps is computed by Eq. (1), and an example
of detected zero crossings is shown in Figure 7.
Number of Steps = Number of Zero Crossings / 2 (1)
Figure 7: An example of zero-crossings for counting peaks to get step counts.
Distance: The total walking distance is computed by:
Distance = Stride Length × Step Counts (2)
Speed: The walking speed is computed by:
Speed = Distance / Duration (Walking) (3)
Energy Expenditure: In our system, we used Metabolic Equivalent (METS)
values, which are the values most frequently used for calorie counting, to compute the
energy consumed during each activity. We used the expression defined by the American
College of Sports Medicine (ACSM), given in Eq. (4).
Energy Expenditure (kcal) = 1.05 × METS × Duration (hour) × Weight (kg) (4)
The following METS values were used for the five activities: sitting = 1.3,
standing = 1.8, up-stairs = 4.0, down-stairs = 3.8, walking = 3.6. The METS value for
walking, however, can vary considerably with speed.
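Equations (1) to (4) can be chained as in the sketch below. It is illustrative only: all function names are hypothetical, the signal is assumed to be zero-mean (gravity removed), and the way the noise threshold is applied (samples smaller in magnitude than the threshold are ignored) is an assumption, since the report does not spell out that detail.

```python
STRIDE_LENGTH_M = 0.75  # average stride length used in this work (75 cm)


def count_zero_crossings(signal, threshold):
    """Count sign changes, ignoring samples below the noise threshold."""
    crossings = 0
    previous_sign = 0
    for value in signal:
        if abs(value) < threshold:
            continue  # treated as noise (assumed handling)
        sign = 1 if value > 0 else -1
        if previous_sign != 0 and sign != previous_sign:
            crossings += 1
        previous_sign = sign
    return crossings


def step_count(signal, threshold):
    return count_zero_crossings(signal, threshold) // 2  # Eq. (1)


def walking_distance(steps, stride_length_m=STRIDE_LENGTH_M):
    return stride_length_m * steps  # Eq. (2)


def walking_speed(distance_m, duration_s):
    return distance_m / duration_s  # Eq. (3)


def energy_expenditure_kcal(mets, duration_h, weight_kg):
    return 1.05 * mets * duration_h * weight_kg  # Eq. (4)


# Hypothetical zero-mean signal: four crossings, hence two steps.
signal = [1.0, -1.0, 1.0, -1.0, 1.0, 0.1, -0.1]
steps = step_count(signal, threshold=0.5)
distance = walking_distance(steps)
print(steps, distance, walking_speed(distance, duration_s=2.0))  # 2 1.5 0.75

# Walking (METS = 3.6) for half an hour by a 70 kg subject: about 132.3 kcal.
print(energy_expenditure_kcal(3.6, 0.5, 70.0))
```

The weight of 70 kg and the durations are made-up inputs chosen to keep the arithmetic easy to follow.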
CHAPTER 4
CLASSIFICATION
4.1 Classification
Classification is a technique for categorizing data into a desired number of distinct
classes, where a label is assigned to each class. Five activities were studied, as
listed above. Subjects were asked to stop between two activities so that the
start and end times of each activity could be noted. We used this information to label
the activities for our supervised classification problem. Activity labels
were chosen to reflect the content and style of the action.
The classifiers used are listed below:
Classifier Type | Prediction Speed | Memory Usage | Interpretability | Model Flexibility
Coarse Tree | Fast | Small | Easy | Low. Few leaves to make coarse distinctions between classes (maximum number of splits is 4).
Linear SVM | Binary: Fast; Multiclass: Medium | Medium | Easy | Low. Makes a simple linear separation between classes.
Cubic SVM | Binary: Fast; Multiclass: Slow | Binary: Medium; Multiclass: Large | Hard | Medium
Quadratic SVM | Binary: Fast; Multiclass: Slow | Binary: Medium; Multiclass: Large | Hard | Medium
Weighted KNN | Medium | Medium | Hard | Medium. Medium distinctions between classes, using a distance weight. The number of neighbors is set to 10.
Table 1: Using five types of classifier
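To make the Weighted KNN row of Table 1 concrete, the sketch below shows one way a distance-weighted vote over the k nearest neighbors can work. It is not the Classification Learner implementation: the inverse-distance weight, the function name weighted_knn_predict and the toy feature vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_points, train_labels, query, k=10):
    """Predict a label by an inverse-distance-weighted vote of k neighbors."""
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        # Closer neighbors get a larger say; the small constant avoids
        # division by zero when the query coincides with a training point.
        votes[label] += 1.0 / (distance + 1e-12)
    return max(votes, key=votes.get)


# Hypothetical two-dimensional feature vectors (not project data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["sitting", "sitting", "walking", "walking"]
print(weighted_knn_predict(points, labels, (0.2, 0.1), k=3))  # sitting
```

With weighting, a single very close neighbor can outvote several distant ones, which is the main difference from a plain majority vote.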
Data from six subjects were used for training and data from three subjects for testing.
The training and test accuracies are given below:
Classifier Trained Accuracy Test Accuracy
Simple tree 71.7% 64.40%
Linear SVM 89.9% 74.10%
Cubic SVM 90.5% 64.21%
Quadratic SVM 92.0% 66.17%
Weighted KNN 88.9% 68.22%
Table 2: Trained accuracy and Test accuracy results
Figure 8: Trained accuracy by classification learner app
Figure 9: Scatter plot for Decision Tree
Figure 10: Scatter plot for Linear SVM
Figure 11: Scatter plot for Cubic SVM
Figure 12: Scatter plot for Quadratic SVM
Figure 13: Scatter plot for Weighted KNN
4.2 Confusion Matrix
A common method to visualize the performance of a machine learning algorithm is
the confusion matrix C, also called a contingency table. Assuming there are
m classes, a typical confusion matrix is a square matrix of size
m × m in which misclassifications appear off the diagonal. For two classes a and b,
with a taken as the positive class, the entries are:
• True Positives (TP): actual samples of class a correctly predicted as class a
• True Negatives (TN): actual samples of class b correctly predicted as class b
• False Positives (FP): actual samples of class b incorrectly predicted as class a
• False Negatives (FN): actual samples of class a incorrectly predicted as class b
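The definitions above extend to m classes by treating one class as positive and all others as negative. The sketch below builds an m × m confusion matrix (rows are actual classes, columns are predicted classes) and derives the four counts for one class. The function names and the label sequences are hypothetical and are not results from this study.

```python
def confusion_matrix(actual, predicted, classes):
    """Return an m x m matrix: rows = actual class, columns = predicted."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def per_class_counts(matrix, k):
    """Return (TP, TN, FP, FN) with class k taken as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # rest of row k
    fp = sum(row[k] for row in matrix) - tp  # rest of column k
    tn = total - tp - fn - fp
    return tp, tn, fp, fn


# Made-up labels for illustration only.
classes = ["walking", "sitting", "standing"]
actual = ["walking", "walking", "sitting", "sitting", "standing", "standing"]
predicted = ["walking", "sitting", "sitting", "sitting", "standing", "sitting"]

matrix = confusion_matrix(actual, predicted, classes)
print(matrix)                       # [[1, 1, 0], [0, 2, 0], [0, 1, 1]]
print(per_class_counts(matrix, 1))  # (2, 2, 2, 0) for "sitting"
```

Overall accuracy is the sum of the diagonal divided by the total number of samples, here 4 out of 6.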
Figure 14: Confusion Matrix for Simple Tree
Figure 15: Confusion Matrix for Linear SVM
Figure 16: Confusion Matrix for Cubic SVM
Figure 17: Confusion Matrix for Quadratic SVM
Figure 18: Confusion Matrix for Weighted KNN
Figure 19: ROC curve for Simple Tree
Figure 20: ROC curve for Linear SVM
Figure 21: ROC curve for Cubic SVM
Figure 22: ROC curve for Quadratic SVM
Figure 23: ROC curve for Weighted KNN
CHAPTER 5
RESULTS AND DISCUSSION
5.1 Result and Discussion
From Table 2 and the confusion matrices of the corresponding classifiers, the highest
training accuracy (92.0%) is obtained by the Quadratic SVM classifier and the lowest
training accuracy (71.7%) by the Simple Tree classifier. The highest test accuracy
(74.10%) is obtained by the Linear SVM classifier, and the lowest test accuracy
(64.21%) by the Cubic SVM classifier, marginally below the Simple Tree classifier
(64.40%).
Although the Quadratic SVM classifier provides the highest training accuracy, it does
not provide the highest test accuracy. Considering the overall results, I conclude that
Linear SVM is the best classifier for my work: its training accuracy is slightly lower
than that of the Quadratic SVM classifier, but it provides the highest test accuracy.
The results show a difference between training accuracy and test accuracy. This
difference is most likely due to my small data set: the data were acquired over a
short time duration (2 minutes or less). The difference between training and test
accuracy would be expected to shrink if the classifiers were trained on a larger
data set, and I expect this procedure to give better test accuracy with more data.
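The comparison discussed in this section can be reproduced from the values in Table 2 with a few lines of arithmetic. The sketch below only restates the reported accuracies; the variable names are illustrative.

```python
# (training accuracy %, test accuracy %) as reported in Table 2
results = {
    "Simple tree": (71.7, 64.40),
    "Linear SVM": (89.9, 74.10),
    "Cubic SVM": (90.5, 64.21),
    "Quadratic SVM": (92.0, 66.17),
    "Weighted KNN": (88.9, 68.22),
}

# Gap between training and test accuracy, in percentage points.
gaps = {name: round(train - test, 2) for name, (train, test) in results.items()}

best_train = max(results, key=lambda name: results[name][0])
best_test = max(results, key=lambda name: results[name][1])
largest_gap = max(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.2f} points")
print(best_train, best_test, largest_gap)
```

The gaps range from 7.30 points (Simple tree) to 26.29 points (Cubic SVM); Linear SVM has the best test accuracy with a gap of 15.80 points.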
CHAPTER 6
CONCLUSIONS
6.1 Conclusions
Human activity recognition has broad applications in medical research and human
survey systems. In this work, a recognition accuracy of up to 74.10% on various
everyday activities was obtained using a single tri-axial accelerometer. In this
project, I used a smartphone-based accelerometer sensor to acquire
acceleration data for five human activities: walking, stairs-up, stairs-down,
sitting, and standing.
I extracted 36 features to obtain the best performance. The activity data were
used to train and test five classifiers: Simple Tree, Linear SVM, Cubic
SVM, Quadratic SVM, and Weighted KNN. The best training accuracy in our
experiment was 92.0%, achieved by the Quadratic SVM classifier, which is a
kind of Support Vector Machine (SVM) machine learning algorithm, and the best
test accuracy was achieved by the Linear SVM classifier. Classification performance
is robust to the orientation and the position of smartphones.
In conclusion, SVM is the best choice for our project work.
6.2 Future Work
This is my first project as a student. In the future, I will try to work with a
larger amount of data and more activities, and to implement a real-time system on a
smartphone.
There is still room for improvement in our work, which can be addressed from two
different perspectives:
I) by solving the current limitations of our proposed system, and
II) by extending our achievements through complementary and novel
applications.
In the first case, some issues have arisen, such as the limited number of activities
the system can deal with, the fixed smartphone position on the chest, and the
need for novel approaches to accommodate users with distinct differences in their
motion patterns (e.g. people with walking difficulties).
REFERENCES
[1] Dr. Md Aktaruzzaman. Parametric estimation of sample entropy for physical
activity recognition. Conference: 37th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (EMBC), 2015.
[2] Dr. Md Aktaruzzaman and Roberto Sassi. Parametric estimation of sample
entropy in heart rate variability analysis. Biomed Signal Process Control, 14:141-
147, 2014.
[3] D. Anguita, A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz. Apublic domain
dataset for human activity recognition using smart-phones. In 21th European
Symposium on Artificial Neural Networks, Computational Intelligence and Machine
Learning, ESANN 2013,Bruges, Belgium 2013.
[4] A. Bayat, M. Pomplun, and D. A. Tran. A study on human activity recognition
using accelerometer data from smartphones. Procedia Comput Sci, 34:450–457,
2014.
[5] P. Casale, O. Pujol, and P. Radeva. Human activity recognition from
accelerometer data using a wearable device. In J. Vitria, J. M. Sanches,and M.
Hernandez, editors, Pattern Recogn Image Anal, pages 289–296. Springer Berlin
Heidelberg, 2011.
[6] MATLAB Central MathWorks[www.mathworks.com/matlabcentral/]
[7] MathWorks – Support [www.mathworks.com/support/]

A Research Base Project Report on A study on physical activity recognition from accelerometer data using smartphones

  • 1. A PROJECT REPORT ON A Study on Physical Activity Recognition from Accelerometer Data using Smartphones Submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Science in Computer Science and Engineering Project Supervisor: Dr. Md Aktaruzzaman Associate Professor Department of Computer Science & Engineering Islamic University-Bangladesh Submitted By: Diponkor Bala Roll No: 1314021 Reg No: 1136 Session: 2013-2014 Department of Computer science & Engineering Islamic University-Bangladesh
  • 2. Certificate This is to certify that the project report entitled “A Study on Physical Activity Recognition from Accelerometer Data using Smartphones” which is submitted by Diponkor Bala Roll No. 1314021 are an authentic work carried out by him at Department of Computer Science & Engineering, Islamic University-Bangladesh under my guidance. It is my pleasure to declare that, according to my knowledge and query this work is original and has been done for the first time ever like this procedure. Signature: ……………………… Date: …………………………… Dr. Md Aktaruzzaman Associate Professor Department of Computer Science & Engineering Islamic University-Bangladesh
  • 3. Acknowledgement On the submission of my project report on “A Study on Physical Activity Recognition from Accelerometer Data using Smartphones”, we would like to extend our gratitude and sincere thanks to my supervisor Dr. Md Aktaruzzaman, Associate Professor, Department of Computer Science and Engineering for his constant motivation and support that helps me to complete this work. I truly appreciate and value his esteemed guidance & encouragement from the beginning to the end of this project. I indebted to his for having helped me shape the problem and providing insights towards the solution. I want to thank all our teachers for providing a solid background for my studies thereafter. They have been great sources of inspiration to me and I thank them from the bottom of my heart. Above all, I would like to thank all of my friends whose direct and indirect support helped me complete my project in time. The project would have been impossible without their perpetual moral support. The Author Diponkor Bala Depatment of Computer Science & Engineering Islamic University-Bangladesh II
  • 4. Abstract Physical-activity recognition via wearable sensors can provide valuable information regarding an individual's degree of functional ability and lifestyle. In this paper, we present an accelerometer sensor-based approach for human-activity recognition. Insufficient amount of physical activity, and hence storage of calories may lead depression, obesity, cardiovascular diseases, and diabetes. The amount of consumed calorie depends on the type of activity. The recognition of physical activity is very important to estimate the amount of calories spent by a subject every day. There are some research works already published in the literature for activity recognition through accelerometers (body worn sensors). The accuracy of any recognition system depends on the robustness of selected features and classifiers. For this work, I extracted some features such as-mean, median absolute deviation(MAD), standard deviation (STD) ,minimum(min), maximum(max), signal energy, signal magnitude area (SMA), tilt angle (TA), autoregressive coefficients (ARcoeffs). The system was trained and tested in an experiment with multiple human subjects in real-world conditions. For classification, I selected five classifiers each offering good performance for recognizing our set of activities and investigated how to combine them into an optimal set of classifiers. The best classification rate in our experiment was 92.0%. Keywords: Activity Recognition, Smartphone, Accelerometer, Classification. III
  • 5. Contents Certificates …………………………………………………...…………i Acknowledgement………………………………………………….…...ii Abstract…………………………………………………………………iii List of Figures and Tables……………………………………………...iv Contents…………………………………………………………………v INTRODUCTION…………………………………………………….1-1 1.1 Introduction……………………………………………………1 1.2 Literature Review…………………………………………...…1 BACKGROUND……………………………………………………..2-8 2.1 Background………………………………………………..….…….2 2.2 Accelerometers……………………….…………………………..…2 2.2.1 The purpose of the accelerometer………………….………2 2.2.2 How they work……………………………………….…….3 2.3 Machine Learning……………………………………...……….….4 2.3.1 Types of Machine Learning Algorithms………….….…….4 2.3.1.1 Supervised Learning…………………….….….…..4 2.3.1.2 Unsupervised Learning………………………....….5 2.3.1.3 Semi-supervised Learning…………………....…….6 2.3.1.4 Reinforcement Learning……………………………6 2.3.2 Machine Learning Approaches……………………………..6 METHODOLOGY……………………………………………..……9-17 3.1 Methodology…………………………………………….………….9 3.1.1 Sensor Data Acquisition……………………………………9 IV
  • 6. 3.1.2 Preprocessing……………………………………………..11 3.1.3 Feature Selection and Extraction………………………....13 3.1.4 Estimate Calorie Consumption…………………..……….16 CLASSIFICATION……………………………………..……….18-28 4.1 Classification………………………………………………18 4.2 Confusion Matrix…………………………………………..23 RESULTS AND DISCUSSIONS……………………………………29 5.1 Result and Discussion…………….……………..………….29 CONCLUSIONS………………………………………..……………30 6.1 Conclusions………………………………..….…………….30 6.2 Future Work…………………………..…….………………30 REFERENCES……………………………………………….………31 V
  • 7. LISTS OF FIGURES AND TABLES Figure 1: Example of a binary classification problem in a two-dimensional space …………………………………………………………………………………. 8 Figure 2: Activity Recognition process pipeline………………………..............9 Figure 3: The polarity and position of the accelerometer on the human body…10 Figure 4: Sample raw output of the tri-axial accelerometer during five different types of physical activity………………………………………………………..11 Figure 5: Some example of window of an activity (walking)… .................……12 Figure 6: The Human Activity Recognition Process Pipeline with its four main blocks……………………………………………………………………………14 Figure 7: An example of zero-crossings for counting peaks to get step counts...17 Table 1: Using five types of classifier…………………………………………..19 Table 2: Trained accuracy and Test accuracy results…………………...………20 Figure 8: Trained accuracy by classification learner app…………………….....20 Figure 9: Scatter plot for Decision Tree………………………….……………..21 Figure 10: Scatter plot for Linear SVM………………………………..……….21 Figure 11: Scatter plot for Cubic SVM…………………………………………22 Figure 12: Scatter plot for Quadratic SVM…………….……………………….22 Figure 13: Scatter plot for Weighted KNN……………….…………………….23 Figure 14: Confusion Matrix for Simple Tree………………………...………..24 Figure 15: Confusion Matrix for Linear SVM………………...……………….24 Figure 16: Confusion Matrix for Cubic SVM…………….……………………25 Figure 17: Confusion Matrix for Quadratic SVM……………………..……….25 Figure 18: Confusion Matrix for Weighted KNN……………..……………….26 Figure 19: ROC curve for Simple Tree……………………….………………..26 Figure 20: ROC curve for Linear SVM…………………………..……………27 Figure 21: ROC curve for Cubic SVM……………………………..………….27 Figure 22: ROC curve for Quadratic SVM ……………………………………28 Figure 23: ROC curve for Weighted KNN……………………………...……..28 VI
  • 8. CHAPTER 1 INTRODUCTION 1.1. Introduction Human activity recognition is an important yet challenging research area with many applications in healthcare, smart environments, and homeland security. Computer vision-based techniques have widely been used for human activity tracking, but they mostly require infrastructure support. Alternatively, a more efficient approach is to process the data from inertial measurement unit sensors worn on a user’s body or built in a user’s smartphone to track his or her motion. We aim to develop a model that is capable of recognizing multiple sets of daily activities under real-world conditions, using data collected by a single tri-axial accelerometer built into a cell phone (in our study, an Android smartphone). A tri- axial accelerometer is a sensor that returns an estimate of acceleration along the x, y and z axes from which velocity and displacement can also be estimated. Activity recognition is formulated as a supervised classification problem, whose training data is obtained via an experiment having human subjects perform each of the activities. We aim at a classification methodology that is robust regardlessly of the classifier tool in use. Our model has been tested in an experiment having three users each performing one of the following six physical activity patterns: Walking, Stairs-Up, Stairs-Down, Sitting, Standing. In the following sections, we discuss the related work, describe our data collection methodology and our approach to recognize activity from accelerometer data, and results of our experiment. 1.2 Literature Review Human activity recognition has been studied for years and researchers have proposed different solutions to attack the problem. Existing approaches typically use vision sensor, inertial sensor and the mixture of both. Machine learning and threshold-base algorithms are often applied. 
Machine learning usually produces more accurate and reliable results, while threshold-based algorithms are faster and simpler. Single accelerometers attached to different body positions are the most
  • 9. common solutions. Approaches that combine both vision and inertial sensors have also been proposed. Another essential part of all these algorithms is data processing. The quality of the input features has a great impact on performance. Some previous works focus on generating the most useful features from the time-series data. The common approach is to analyze the signal in both the time and frequency domains. Active learning techniques have been applied to many machine learning problems in which labeling samples is time-consuming and labor-intensive. Applications include speech recognition, information extraction, and handwritten character recognition.
  • 10. CHAPTER 2 BACKGROUND
2.1 Background
2.2 Accelerometers
The accelerometer in Android phones measures the acceleration of the device along the x (lateral), y (longitudinal), and z (vertical) axes. Accelerometers can be used to detect movement and the rate of change of the speed of movement. An Android application does not need any permission to use the accelerometer. Therefore, it is possible for an application to collect a user’s accelerometer data without the user’s knowledge. With accelerometer data and a server to collect it, it can be fairly simple for someone to infer a user’s personal information, their location, or what the user is doing or typing.
2.2.1 The purpose of the accelerometer
The application of accelerometers extends to multiple disciplines, both academic and consumer-driven. For example, accelerometers in laptops protect hard drives from damage. If the laptop were to drop suddenly while in use, the accelerometer would detect the free fall and immediately park the hard drive so that the read heads do not hit the platter. Without this, the two would strike each other and scratch the platter, causing extensive file and read damage. Accelerometers are likewise used in cars as the industry-standard way of detecting crashes and deploying airbags almost instantaneously. In another example, an accelerometer measures the static acceleration due to gravity to determine the angle at which a device is tilted with respect to the Earth. By sensing the amount of acceleration, users can analyze how the device is moving. Accelerometers allow the user to better understand the surroundings of an item. With this small device, you can determine whether an object is moving uphill, whether it will fall over if it tilts any more, or whether it is flying horizontally or angling downward. 
For example, smartphones rotate their display between portrait and landscape mode depending on how you tilt the phone.
  • 11. 2.2.2 How they work
An accelerometer looks like a simple circuit for some larger electronic device. Despite its humble appearance, the accelerometer consists of many different parts and can work in several ways, two of which are based on the piezoelectric effect and on capacitance sensing. A piezoelectric accelerometer, one common form, uses microscopic crystal structures that become stressed by accelerative forces. These crystals create a voltage from the stress, and the accelerometer interprets the voltage to determine velocity and orientation. A capacitance accelerometer senses changes in capacitance between adjacent microstructures in the device. If an accelerative force moves one of these structures, the capacitance changes and the accelerometer translates that capacitance to a voltage for interpretation. Accelerometers are made up of many different components and can be purchased as separate devices. Analog and digital displays are available, though in most devices these components are integrated into the main hardware and accessed through the governing software or operating system. Typical accelerometers measure along multiple axes: two to determine most two-dimensional movement, with the option of a third for 3D positioning. Most smartphones use three-axis models, whereas cars use only two axes to determine the moment of impact. The sensitivity of these devices is quite high, as they are intended to measure even very minute shifts in acceleration. The more sensitive the accelerometer, the smaller the changes in acceleration it can measure. Accelerometers, while actively used in many electronics in today’s world, are also available for use in custom projects. Whether you’re an engineer or a tech geek, the accelerometer plays a very active role in a wide range of functionalities. In many cases you may not notice the presence of this simple sensor, but odds are you are already using a device that contains one. 
2.3 Machine Learning
Machine learning is the area of study concerned with the design, development and evaluation of systems capable of learning from data. In many common situations where we need, for instance, to complete a particular task, or perhaps to make some prediction regarding a given issue, it is possible to find solutions by the inspection
  • 12. and analysis of previous observations with similar characteristics to the addressed problem. In other words, machine learning systems are capable of predicting future outcomes based on past experience. Data are the input of the learning process, and their representation is fundamental to the performance of machine learning systems: the data must describe each situation well enough to predict future data in a meaningful way. The property that allows a system to correctly predict unseen samples is known as generalization, and it is highly desirable in any learning machine as it is directly related to its performance.
2.3.1 Types of Machine Learning Algorithms
Machine learning algorithms are categorized according to the type of input used for training and the expected outcome. In this section, we describe the most relevant categories.
2.3.1.1 Supervised Learning
In this type of learning, each input datum is a pair of elements: the input vector (x) together with its target (y). This can be clarified with an example: assume a system that learns handwritten digits from 0 to 9. The input vectors would be the set of images of all the digits (usually several samples of each) and the target vector the actual labels that correspond to each sample. If the output of the desired system is categorical (only a set of discrete classes is considered), then it is a classification problem, as in the example above. Otherwise, if the outputs are continuous variables, such as in temperature forecasting or stock market prediction, then it is a regression problem. Supervised learning is the most commonly used type of ML and it is also the one used in our research. However, it cannot solve all kinds of problems. In fact, one of its disadvantages is that in some applications it is not always possible to have target information for all the available input samples. 
For such situations, other techniques such as unsupervised and semi-supervised learning, described below, can be used. When the learning is performed gradually, for instance by adding one new sample and its target at a time to the model, we refer to it as Online Machine Learning. This
  • 13. supervised approach has the advantage of making the model adaptive and flexible with respect to new inputs. This type of learning is required in applications with high output variability where a stream of new samples is available and can be progressively added to the model for learning. This is the case for online web ranking and stock market prediction applications.
2.3.1.2 Unsupervised Learning
In an unsupervised learning problem, the training data consist only of input vectors without their associated targets. The aim is to find similarities or discover distinguishable structure within the input data (e.g. clustering). It can also be used for density estimation, to describe the distribution of the data in its space. Moreover, this learning approach can be exploited for data visualization using dimensionality reduction methods, which project high-dimensional data into smaller spaces. Unsupervised learning approaches have already been applied in several areas, such as medical imaging, where 3D Positron Emission Tomography (PET) scans use cluster analysis to find dissimilarities between different organs and types of tissue in order to correctly segment the scanned area (George et al. 2011). It has also been applied to the automatic grouping of similar shopping items (e.g. books, movies, music), particularly in recommender systems for online stores that aim to predict user preferences based on product similarities and previous purchases.
2.3.1.3 Semi-supervised Learning
This learning approach combines labeled and unlabeled data for learning. Therefore, it takes aspects from both supervised and unsupervised approaches. In general, small amounts of labeled data are integrated with a large number of unlabeled samples. For example, it is useful for datasets where it is not always possible to have a label for each sample. 
Evidence has shown that semi-supervised learning can greatly improve learning performance compared with supervised learning, which does not take unlabeled data into account. This holds when assumptions such as data smoothness apply. Vast digital image collections on the internet for content retrieval are an example application where this type of learning can be exploited. Not all the images have an associated target, and it would be humanly impossible to perform this labeling manually.
  • 14. 2.3.1.4 Reinforcement Learning
This learning approach is oriented towards finding an appropriate set of actions to solve a particular problem, with the purpose of maximizing a reward. Optimal solutions are not found by learning a model from a set of input-target pairs.
2.3.2 Machine Learning Approaches
Several ML modeling approaches have been developed over the years to solve different tasks such as classification, regression and clustering. Some of them are based on deterministic models, which aim to find fixed causal relationships between events. Others are probabilistic and assume that events are generated from a probability distribution. The most popular ML algorithms are briefly described in the following list. In the next section we focus on SVMs, as they are the central ML algorithm employed in this project.
• Decision Tree (DT): a predictive model that makes choices from a set of hierarchical rules related to the input data. It is a common approach for classification, particularly because the resulting models are easily interpretable by humans (due to their intrinsic tree structure).
• Random Forest (RF): an ML meta-classifier built from an ensemble of DTs. The predicted class is the one that occurs most frequently among the outputs of the individual DTs.
• k-Nearest Neighbors (k-NN): this deterministic learning approach exploits similarity measures between data for classification and regression tasks. Given a new sample, the approach finds the k closest samples in a training set and decides the prediction outcome from them. Its main disadvantage is the size of its model, which depends on the data and makes it infeasible for large datasets. There are, however, versions that use data reduction techniques to alleviate this issue. 
• Naive Bayes (NB): is a popular probabilistic classifier based on Bayes’s theorem that predicts the class of a given sample by assuming an underlying probability model of the data and making strong independence assumptions between its features.
  • 15. Even though its formulation is quite simple, it has been shown to perform well in various applications. For example, when the data are assumed to be Gaussian-distributed, the model can be learned just by calculating the mean and variance of the input data.
• Artificial Neural Networks (ANN): an ML approach with a biological inspiration. It simulates how the brain and its nervous system, composed of interconnected neurons, are able to learn from experience and capture the underlying structure of the data. Neurons are arranged in a layered structure and have associated weights, which adapt based on the training data and the network output through a cost function. This approach has been shown to perform well in many problems, including non-linear ones. Its main disadvantage is the need for a large dataset in its training stage. The Multilayer Perceptron (MLP) is a popular ANN model that maps the input through multiple layers of neurons in a fully connected directed graph until reaching the output.
• Logistic Regression (LR): a probabilistic algorithm used for solving classification and regression problems. It estimates the probability that a given sample belongs to a particular class. This is achieved through a logistic function, which is fitted to the training data, generally by maximum likelihood estimation.
• Support Vector Machine (SVM): the support vector machine is one of the most commonly used supervised machine learning algorithms. It was initially proposed by Vladimir Vapnik and his colleagues with the aim of solving linear and non-linear binary classification problems. The algorithm was later adapted for multiclass classification and regression analysis. The SVM for classification is a deterministic approach that aims to find the hyperplanes that best separate the data into classes. 
These subspaces are the ones that provide the largest margin separation from the classes of the training data with the intention of providing a model with low generalization error for its use with unseen data samples. SVMs are the basis for the classification of activities in this work. For this reason we now introduce them, starting from the binary SVM model which is its simplest representation, to the extended case that allows the classification of more than two classes: the multiclass SVM. This algorithm will be further revised throughout the
  • 16. development of this research to tackle specific requirements for our application in aspects such as kernel type, arithmetic used and algorithm output type. Figure 1: Example of a binary classification problem in a two-dimensional space (circles and crosses). The line represents a possible solution to the problem.
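As a minimal sketch of the separating line in Figure 1, the following Python snippet classifies two-dimensional points with a fixed hyperplane w·x + b = 0 and computes the margin width 2/||w|| that an SVM tries to maximize. Python is used here only for illustration (the project itself relied on MATLAB), and the weights, bias, and sample points are assumed values, not parameters learned in this work.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Return +1 (e.g. crosses) or -1 (e.g. circles)."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane x1 + x2 - 3 = 0 (assumed values, illustration only)
w, b = [1.0, 1.0], -3.0
points = [(0.5, 1.0), (3.0, 2.0)]
labels = [predict(w, b, p) for p in points]
```

Training an SVM amounts to choosing w and b so that the training samples are classified correctly while this margin is as wide as possible.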
  • 17. CHAPTER 3 METHODOLOGY
3.1 Methodology
3.1.1 Sensor Data Acquisition
The physical activity recognition process is shown below:
Figure 2: Activity Recognition process pipeline
In order to collect data, we used the tri-axial accelerometer in an Android phone to measure acceleration. Data from this accelerometer include the acceleration along the x-axis, y-axis and z-axis. These axes capture the medio-lateral (ML) movement of the user (x-axis), vertical (V) movement (y-axis), and anterior-posterior (AP) movement (z-axis). Figure 3 shows these axes relative to a user.
  • 18. Figure 3: The polarity and position of the accelerometer on the human body (image courtesy of GENEActive, 2012)
In this study, the acceleration signals were recorded from 9 subjects (age: 25 ± 2 years). The subjects were asked to perform five activities: walking, stairs-up, stairs-down, sitting, and standing. The duration of the stairs-up and stairs-down activities is about 60 ± 5 s, and that of walking, sitting, and standing is about 120 ± 5 s. The sampling frequency used for collecting data was 100 Hz. A sample of the acceleration signals of all activities is shown in Figure 4.
  • 19. Figure 4: Sample raw output of the tri-axial accelerometer during five different types of physical activity
3.1.2 Preprocessing
The accelerometer’s real-time output may contain random noise that should be filtered out before it is used for activity recognition. A low-pass moving average filter has been applied to remove the noise from the acceleration data. Before feature extraction, the signal is divided into segments of a fixed size using the sliding window segmentation technique. In this project each window contains 300 samples. The raw signals from each dimension (x-axis, y-axis, and z-axis) are split into a number of window segments. Two approaches are commonly used: with overlapping or without overlapping. In the first approach, consecutive window segments overlap each other. In the second approach, there is no overlap between two consecutive window segments.
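The two preprocessing steps above can be sketched as follows. This is an illustrative Python outline, not the MATLAB code used in the project: the filter width of 5 samples, the 50% overlap, and the synthetic signal are assumptions, while the window size of 300 comes from the text.

```python
def moving_average(signal, width=5):
    """Low-pass filter: replace each sample by the mean of a centered window.

    The window is truncated at the signal edges. `width` is an assumed value.
    """
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

def sliding_windows(signal, size=300, overlap=0.0):
    """Split a signal into fixed-size windows; an incomplete tail is dropped.

    overlap=0.0 gives non-overlapping windows, overlap=0.5 gives 50% overlap.
    """
    step = max(1, int(size * (1.0 - overlap)))
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, step)]

# Synthetic single-axis signal of 1200 samples (placeholder for recorded data)
x_axis = [float(i % 7) for i in range(1200)]
filtered = moving_average(x_axis, width=5)
no_overlap = sliding_windows(filtered, size=300, overlap=0.0)
half_overlap = sliding_windows(filtered, size=300, overlap=0.5)
```

With overlap, the same recording yields more windows (and therefore more training samples), at the cost of correlated neighbouring windows.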
  • 20. Some examples of window data are given below.
Figure 5: Some examples of windows of an activity (walking), panels Window (1) to Window (4)
  • 21. 3.1.3 Feature Selection and Extraction
In a machine learning problem, feature selection refers to the process of selecting a set of significant features that strongly affect the discrimination ability of a learning algorithm. Feature extraction, on the other hand, reduces the dimensionality of an available set of features by performing inter-feature transformations, in order to obtain a new, lower-dimensional representation without sacrificing much relevant information from the original set. The curse of dimensionality, which describes the difficulty of understanding and dealing with high-dimensional data, is closely linked with these two reduction mechanisms, as they can alleviate the problems that arise when working in high-dimensional spaces. Feature selection and extraction also reduce training times and increase generalization performance in ML problems. They differ, however, in that models built with feature selection are much easier to interpret: the features remain distinct from each other and are not merged as in feature-extraction based approaches. Depending on the application, the features required for the extraction of relevant information may vary. In the particular case of human activity recognition (HAR), a reduced representation of the sensor data can be used as the input of a recognition algorithm. This is attained by estimating various measures from the sensor signals in different domains (e.g. time and frequency). Time-frequency representations such as wavelet transforms are also applicable. Once obtained, the features can be further reduced using feature selection (e.g. exhaustive search, or wrappers, filters and embedded methods), extraction approaches, or a combination of both.
  • 22. Figure 6: The Human Activity Recognition Process Pipeline with its four main blocks
The extracted features are described below:
Mean: Average or mean value of an array. M = mean(A) returns the mean of the elements of A along the first array dimension whose size does not equal 1.
• If A is a vector, then mean(A) returns the mean of the elements.
• If A is a matrix, then mean(A) returns a row vector containing the mean of each column.
• If A is a multidimensional array, then mean(A) operates along the first array dimension whose size does not equal 1, treating the elements as vectors. This dimension becomes 1 while the sizes of all other dimensions remain the same.
Min: Minimum elements of an array. M = min(A) returns the minimum elements of an array.
• If A is a vector, then min(A) returns the minimum of A.
• If A is a matrix, then min(A) is a row vector containing the minimum value of each column.
• If A is a multidimensional array, then min(A) operates along the first array dimension whose size does not equal 1, treating the elements as vectors. The
  • 23. size of this dimension becomes 1 while the sizes of all other dimensions remain the same. If A is an empty array with first dimension 0, then min(A) returns an empty array with the same size as A.
Max: Maximum elements of an array. M = max(A) returns the maximum elements of an array.
• If A is a vector, then max(A) returns the maximum of A.
• If A is a matrix, then max(A) is a row vector containing the maximum value of each column.
• If A is a multidimensional array, then max(A) operates along the first array dimension whose size does not equal 1, treating the elements as vectors. The size of this dimension becomes 1 while the sizes of all other dimensions remain the same. If A is an empty array whose first dimension has zero length, then max(A) returns an empty array with the same size as A.
Mad: Mean or median absolute deviation. y = mad(X) returns the mean absolute deviation of the values in X by default; mad(X,1) returns the median absolute deviation.
• If X is a vector, then mad returns the mean or median absolute deviation of the values in X.
• If X is a matrix, then mad returns a row vector containing the mean or median absolute deviation of each column of X.
• If X is a multidimensional array, then mad operates along the first nonsingleton dimension of X.
STD: Standard deviation of the selected array.
Signal Energy: In signal processing, the total energy of a signal x(t) is defined as
$$E = \int_{-\infty}^{\infty} |x(t)|^2 \, dt$$
and, for a sampled signal x[n], as $E = \sum_n |x[n]|^2$.
  • 24. Where |x(t)| denotes the magnitude of x(t). Taking the magnitude is necessary to obtain a scalar quantity for a complex signal, because the magnitude of a complex number $z = a + jb$ is defined as
$$|z| = \sqrt{a^2 + b^2}$$
The magnitude is squared by common convention, so that the same terminology applies to any signal. Therefore, the energy of a signal is defined as the sum of its squared magnitudes.
Tilt angle (θ): The tilt angle is defined by the relative tilt of the body with respect to the acceleration of gravity $\vec{g}$. θ is expressed as the angle (in radians) made by the V-axis with $\vec{g}$, and hence:
$$\theta = \cos^{-1}\left(\frac{V}{\sqrt{V^2 + ML^2 + AP^2}}\right)$$
ARcoeffs: The acceleration signal of each axis was fitted to an autoregressive (AR) model. The model order was selected using the Akaike Information Criterion (AIC). The first 3 coefficients were considered for each axis, giving 9 coefficients overall.
Sample Entropy: Sample entropy measures the regularity (or complexity) of a time series by matching patterns of length m within an error tolerance r, then extending the comparison to matching patterns of length m + 1. In this study, m = 1 and r = 0.2 × STD were used.
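To make the feature definitions concrete, here is a hedged Python sketch of the per-window statistics, the tilt angle, and sample entropy. It is not the project's MATLAB code. Assumptions: the mean (not median) absolute deviation is used, the standard deviation is the sample (N-1) version, the tilt angle is computed from per-axis values such as window means, and template matches in the sample entropy use a Chebyshev distance less than or equal to r. The AR coefficients are omitted.

```python
import math
import statistics

def window_features(window):
    """Time-domain features of one single-axis window (list of floats)."""
    mean = statistics.fmean(window)
    return {
        "mean": mean,
        "min": min(window),
        "max": max(window),
        "std": statistics.stdev(window),                       # sample (N-1) std
        "mad": statistics.fmean([abs(v - mean) for v in window]),  # mean abs. deviation
        "energy": sum(v * v for v in window),                  # sum of squared samples
    }

def tilt_angle(v, ml, ap):
    """Angle in radians between the vertical (V) axis and gravity."""
    ratio = v / math.sqrt(v * v + ml * ml + ap * ap)
    return math.acos(max(-1.0, min(1.0, ratio)))   # clamp against rounding error

def sample_entropy(series, m=1, r_factor=0.2):
    """SampEn = -ln(A / B) with tolerance r = r_factor * std of the series."""
    r = r_factor * statistics.stdev(series)
    n = len(series)

    def count_matches(length):
        # Use the same n - m template start points for both template lengths
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                dist = max(abs(series[i + k] - series[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)        # matches of length m
    a = count_matches(m + 1)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")     # undefined when no matches are found
    return -math.log(a / b)

features = window_features([1.0, 2.0, 3.0, 4.0])
regular = sample_entropy([0.0, 1.0] * 10)   # perfectly periodic toy series
```

A perfectly regular series gives a sample entropy of zero, and less predictable signals give larger values, which is why the measure can help separate static from dynamic activities.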
  • 25. 3.1.4 Estimate Calorie Consumption
Stride Length: A rough estimate of each user’s stride length is obtained from the subject’s height. For this work we used an average stride length of 75 cm.
Step Count: Step counting is performed by a zero-crossing detector, which is activated only for walking, going up-stairs, or going down-stairs. To reduce the influence of noise, a threshold of three times the standard deviation of the static activities was used. The number of steps is computed by Eq. (1), and an example of detected zero crossings is shown in Figure 7.
Number of Steps = Number of Zero Crossings / 2 (1)
Figure 7: An example of zero-crossings for counting peaks to get step counts.
Distance: The total walking distance is computed by:
Distance = Stride Length × Step Counts (2)
Speed: The walking speed is computed by:
Speed = Distance / Duration (Walking) (3)
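Equations (1) to (3) can be illustrated with a short Python sketch. How the report's detector applies its noise threshold is not specified, so the dead-band logic below (ignore samples whose mean-removed magnitude is within the threshold, then count sign changes) is an assumption, as are the synthetic 2 Hz signal and the threshold value of 0.3. The 0.75 m stride length comes from the text.

```python
import math

def count_zero_crossings(signal, threshold):
    """Count sign changes of the mean-removed signal, ignoring |values| <= threshold."""
    mean = sum(signal) / len(signal)
    crossings = 0
    last_sign = 0
    for value in signal:
        centered = value - mean
        if abs(centered) <= threshold:
            continue                      # inside the noise band: ignore sample
        sign = 1 if centered > 0 else -1
        if last_sign != 0 and sign != last_sign:
            crossings += 1
        last_sign = sign
    return crossings

def walking_summary(signal, threshold, duration_s, stride_m=0.75):
    """Return (steps, distance in m, speed in m/s) following Eqs. (1)-(3)."""
    steps = count_zero_crossings(signal, threshold) // 2   # Eq. (1)
    distance = stride_m * steps                            # Eq. (2)
    speed = distance / duration_s                          # Eq. (3)
    return steps, distance, speed

# Synthetic 2 Hz oscillation sampled at 100 Hz for 10 s (placeholder data)
signal = [math.sin(2 * math.pi * 2 * (i / 100)) for i in range(1000)]
steps, distance, speed = walking_summary(signal, threshold=0.3, duration_s=10.0)
```

In the report's setting the threshold would be three times the standard deviation measured during the static activities (sitting and standing) rather than a fixed constant.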
  • 26. Energy Expenditure: In our system, we used Metabolic Equivalents (METS) values, which are the values most frequently used for calorie counting, to compute the energy consumed during each activity. We used the expression defined by the American College of Sports Medicine (ACSM), given in Eq. (4).
Energy Expenditure (kcal) = 1.05 × METS × Duration (hour) × Weight (kg) (4)
The METS values for the five activities (sitting = 1.3, standing = 1.8, up-stairs = 4.0, down-stairs = 3.8, walking = 3.6) were taken from the literature. Note that the METS value for walking can vary considerably with speed.
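As a worked example of Eq. (4), the short Python sketch below uses the METS values listed above. The subject's weight (60 kg) and the 2-minute walking duration are assumed values chosen only for illustration.

```python
# METS values as listed in the text
METS = {
    "sitting": 1.3,
    "standing": 1.8,
    "up-stairs": 4.0,
    "down-stairs": 3.8,
    "walking": 3.6,
}

def energy_expenditure_kcal(activity, duration_hours, weight_kg):
    """Eq. (4): 1.05 x METS x duration (hours) x weight (kg)."""
    return 1.05 * METS[activity] * duration_hours * weight_kg

# Hypothetical subject: 60 kg, walking for 2 minutes (2/60 of an hour)
kcal = energy_expenditure_kcal("walking", 2 / 60, 60.0)
```

For this assumed subject the formula gives 1.05 × 3.6 × (2/60) × 60 = 7.56 kcal.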
  • 27. CHAPTER 4 CLASSIFICATION
4.1 Classification
Classification is a technique for categorizing data into a desired number of distinct classes, where a label can be assigned to each class. Five activities were studied, as listed above. Subjects were requested to stop between two activities so that the start and end times of the different activities could be noted. We used this information to label the activities for our supervised classification problem. Activity labels were chosen to reflect the content and style of the action. The classifiers used are listed below:

| Classifier Type | Prediction Speed | Memory Usage | Interpretability | Model Flexibility |
|---|---|---|---|---|
| Coarse Tree | Fast | Small | Easy | Low. Few leaves to make coarse distinctions between classes (maximum number of splits is 4). |
| Linear SVM | Binary: Fast; Multiclass: Medium | Medium | Easy | Low. Makes a simple linear separation between classes. |
| Cubic SVM | Binary: Fast; Multiclass: Slow | Binary: Medium; Multiclass: Large | Hard | Medium |
| Quadratic SVM | Binary: Fast; Multiclass: Slow | Binary: Medium; Multiclass: Large | Hard | Medium |
| Weighted KNN | Medium | Medium | Hard | Medium. Medium distinctions between classes, using a distance weight. The number of neighbors is set to 10. |

Table 1: Using five types of classifier
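To show what "using a distance weight" means for the Weighted KNN row, here is a minimal Python sketch of a distance-weighted vote. It assumes Euclidean distance and inverse-squared-distance weights. The toy two-feature training set is hypothetical, and the sketch is not the Classification Learner implementation used in the project.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_x, train_y, query, k=10):
    """Distance-weighted k-NN vote using inverse squared Euclidean distance."""
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += 1.0 / (distance ** 2 + 1e-12)   # epsilon avoids division by zero
    return max(votes, key=votes.get)

# Hypothetical feature vectors (e.g. two features per window)
train_x = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.9, 1.0)]
train_y = ["sitting", "sitting", "walking", "walking", "walking"]

near_sitting = weighted_knn_predict(train_x, train_y, (0.2, 0.1), k=5)
near_walking = weighted_knn_predict(train_x, train_y, (1.0, 0.9), k=5)
```

With k = 5 an unweighted vote would always return "walking" (3 of 5 neighbours), whereas the weighted vote lets the two much closer "sitting" samples win for the first query.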
  • 29. Data from six subjects were used for training and data from three subjects were used for testing. The training accuracy and test accuracy are given below:

| Classifier | Trained Accuracy | Test Accuracy |
|---|---|---|
| Simple Tree | 71.7% | 64.40% |
| Linear SVM | 89.9% | 74.10% |
| Cubic SVM | 90.5% | 64.21% |
| Quadratic SVM | 92.0% | 66.17% |
| Weighted KNN | 88.9% | 68.22% |

Table 2: Trained accuracy and Test accuracy results
Figure 8: Trained accuracy by classification learner app
  • 30. Figure 9: Scatter plot for Decision Tree Figure 10: Scatter plot for Linear SVM
  • 31. Figure 11: Scatter plot for Cubic SVM Figure 12: Scatter plot for Quadratic SVM
  • 32. Figure 13: Scatter plot for Weighted KNN
4.2 Confusion Matrix
A common way to visualize the performance of a machine learning algorithm is the confusion matrix C, also called a contingency table. Assuming there are m classes, a typical confusion matrix is a square matrix of size m × m in which misclassifications appear off the diagonal. For two classes a and b, with a taken as the positive class, the entries are:
• True Positives (TP): actual samples of class a correctly predicted as class a
• True Negatives (TN): actual samples of class b correctly predicted as class b
• False Positives (FP): actual samples of class b incorrectly predicted as class a
• False Negatives (FN): actual samples of class a incorrectly predicted as class b
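The definitions above extend to more than two classes by treating one class at a time as the positive class (one-vs-rest). The Python sketch below builds a confusion matrix and derives TP, FP, FN, TN and accuracy from it. The label lists are made-up toy data, not results from this project.

```python
def confusion_matrix(actual, predicted, classes):
    """C[i][j] = number of samples of class i predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def per_class_counts(matrix, i):
    """One-vs-rest (TP, FP, FN, TN) for the class with index i."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[i][i]
    fn = sum(matrix[i]) - tp                  # class i predicted as something else
    fp = sum(row[i] for row in matrix) - tp   # other classes predicted as class i
    tn = total - tp - fn - fp
    return tp, fp, fn, tn

def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)

classes = ["walking", "sitting", "standing"]
actual = ["walking", "walking", "sitting", "sitting", "standing", "standing"]
predicted = ["walking", "sitting", "sitting", "sitting", "standing", "sitting"]
cm = confusion_matrix(actual, predicted, classes)
sitting_counts = per_class_counts(cm, classes.index("sitting"))
```

Rows correspond to the true class and columns to the predicted class, the same layout as the confusion matrix figures.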
  • 33. Figure 14: Confusion Matrix for Simple Tree Figure 15: Confusion Matrix for Linear SVM
  • 34. Figure 16: Confusion Matrix for Cubic SVM Figure 17: Confusion Matrix for Quadratic SVM
  • 35. Figure 18: Confusion Matrix for Weighted KNN Figure 19: ROC curve for Simple Tree
  • 36. Figure 20: ROC curve for Linear SVM Figure 21: ROC curve for Cubic SVM
  • 37. Figure 22: ROC curve for Quadratic SVM Figure 23: ROC curve for Weighted KNN
  • 38. CHAPTER 5 RESULTS AND DISCUSSION
5.1 Result and Discussion
From Table 2 and the confusion matrices of the corresponding classifiers, the highest training accuracy is obtained from the Quadratic SVM classifier (92.0%) and the lowest from the Simple Tree classifier (71.7%). The highest test accuracy is obtained from the Linear SVM classifier (74.10%) and the lowest from the Cubic SVM classifier (64.21%), slightly below the Simple Tree (64.40%). Although the Quadratic SVM classifier provides the highest training accuracy, it does not provide the highest test accuracy. Considering the overall table, I conclude that the Linear SVM is the best classifier for this work: its training accuracy is slightly lower than that of the Quadratic SVM, but it provides the highest test accuracy. The results show a gap between training accuracy and test accuracy for every classifier. I attribute this gap to the small dataset: each recording covers only a short duration (2 minutes or less). I expect that the gap would shrink, and the test accuracy improve, if the classifiers were trained on a larger dataset.
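The comparison can be reproduced directly from the numbers in Table 2. The following Python sketch only re-reads those reported values. The "gap" (training accuracy minus test accuracy) is an illustrative measure introduced here, not one computed in the report.

```python
# (training accuracy %, test accuracy %) copied from Table 2
results = {
    "Simple Tree": (71.7, 64.40),
    "Linear SVM": (89.9, 74.10),
    "Cubic SVM": (90.5, 64.21),
    "Quadratic SVM": (92.0, 66.17),
    "Weighted KNN": (88.9, 68.22),
}

# Difference between training and test accuracy, in percentage points
gaps = {name: round(train - test, 2) for name, (train, test) in results.items()}

best_train = max(results, key=lambda name: results[name][0])
best_test = max(results, key=lambda name: results[name][1])
worst_test = min(results, key=lambda name: results[name][1])
```

A large gap means the classifier fits the six training subjects much better than the three unseen test subjects, which is the effect discussed in this section.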
  • 39. CHAPTER 6 CONCLUSIONS
6.1 Conclusions
Human activity recognition has broad applications in medical research and human survey systems. In this work, a recognition accuracy of up to 74.10% on various everyday activities was obtained using a single tri-axial accelerometer. We used a smartphone-based accelerometer sensor to acquire acceleration data for five human activities: walking, stairs-up, stairs-down, sitting, and standing. I extracted 36 features to obtain the best performance. The activity data were used to train and test five classifiers: Simple Tree, Linear SVM, Cubic SVM, Quadratic SVM, and Weighted KNN. The best training accuracy in our experiment was 92.0%, achieved by the Quadratic SVM classifier, a kind of Support Vector Machine (SVM), while the best test accuracy (74.10%) was achieved by the Linear SVM classifier. Classification performance is robust to the orientation and the position of smartphones. In conclusion, SVM is the best choice for our project work.
6.2 Future Work
This is my first research project as a student. In the future, I will try to work with a larger amount of data and more activities, and to implement a real-time system on a smartphone. There is still room for improvement in our work, which can be addressed from two different perspectives: I) by solving the current limitations of our proposed system, and II) by extending our achievements through complementary and novel applications. In the first case, some issues have arisen, such as the limited number of activities the system can deal with, the fixed smartphone position on the chest, and the need for novel approaches to handle users with distinct differences in their motion patterns (e.g. people with walking difficulties).
  • 40. REFERENCES
[1] Md Aktaruzzaman. Parametric estimation of sample entropy for physical activity recognition. In 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015.
[2] Md Aktaruzzaman and Roberto Sassi. Parametric estimation of sample entropy in heart rate variability analysis. Biomed Signal Process Control, 14:141–147, 2014.
[3] D. Anguita, A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz. A public domain dataset for human activity recognition using smartphones. In 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2013, Bruges, Belgium, 2013.
[4] A. Bayat, M. Pomplun, and D. A. Tran. A study on human activity recognition using accelerometer data from smartphones. Procedia Comput Sci, 34:450–457, 2014.
[5] P. Casale, O. Pujol, and P. Radeva. Human activity recognition from accelerometer data using a wearable device. In J. Vitria, J. M. Sanches, and M. Hernandez, editors, Pattern Recogn Image Anal, pages 289–296. Springer Berlin Heidelberg, 2011.
[6] MATLAB Central, MathWorks [www.mathworks.com/matlabcentral/]
[7] MathWorks Support [www.mathworks.com/support/]