Bulletin of Electrical Engineering and Informatics
Vol. 10, No. 6, December 2021, pp. 3137~3146
ISSN: 2302-9285, DOI: 10.11591/eei.v10i6.2802
Journal homepage: http://beei.org
Development of 3D convolutional neural network to recognize
human activities using moderate computation machine
Malik A. Alsaedi, Abdulrahman S. Mohialdeen, Baraa M. Albaker
College of Engineering, Al-Iraqia University, Sabe’ Abkar, Adhamiya, Baghdad, Iraq
Article Info ABSTRACT
Article history:
Received Jan 12, 2021
Revised May 20, 2021
Accepted Oct 9, 2021
Human activity recognition (HAR) has recently been used in numerous
applications, including smart homes, where it serves to monitor human behavior
and automate the home according to human activities, as well as entertainment,
fall detection, violence detection, and care for people. Vision-based
recognition is a powerful and widely used method for implementing HAR systems
because of its ability to recognize complex human activities. This paper
addresses the design of a 3D convolutional neural network (3D-CNN) model that
can be used in smart homes to identify several activities. The model is trained
on the KTH dataset, which contains six activities: walking, running, jogging,
handwaving, handclapping, and boxing. Despite the challenges this method faces
from illumination, background variation, and the variety of human bodies, the
proposed model reached an accuracy of 93.33%. The model was implemented,
trained, and tested on a moderate-computation machine, and the results show
that the proposal can recognize human activities with reasonable computation.
Keywords:
3D-CNN
Convolutional neural network
Deep learning
HAR
Smart home
Vision
This is an open access article under the CC BY-SA license.
Corresponding Author:
Abdulrahman S. Mohialdeen
College of Engineering
Al-Iraqia University
Sabe’ Abkar, Adhamiya, Baghdad, Iraq
Email: abd_saeed@aliraqia.edu.iq
1. INTRODUCTION
HAR is a challenging subject because of the huge number of human activities: some are easily
noticed, some are confusing, and some require interaction with other objects or humans. Besides the
diversity of the activities, the recognition methods are also diverse, and they rely on many types of data.
Some use ambient sensors such as accelerometer, gyroscope, humidity, and temperature sensors [1], [2].
Some take advantage of smartphone sensors such as the accelerometer and gyroscope [3], [4], and others
use radio frequency [5]. The most popular methods, however, use vision-based recognition [6]-[14].
Vision-based methods use images or videos to recognize the activity. Recognizing human activities
this way poses many challenges because of the effects of illumination and background variation. The
remaining question is how to process the visual data to recognize the activity. Many techniques exist, most
of them based on machine learning, and deep learning has proven highly effective for recognizing human
activities, especially CNNs, which are very useful for vision-based recognition.
In this paper, we design a neural network architecture (model) that can be used for human
activity monitoring. The proposed model is a three-dimensional CNN (3D-CNN). The purpose of using a
3D-CNN is to extract both spatial and temporal features rather than spatial features only: an activity
consists of multiple movements, which can be identified by extracting temporal features across several
frames.
Our proposed model uses a small number of 3D-CNN layers to reduce processing time, so that a
computer with low computational ability can recognize human activity online without delay.
The CNN was introduced by Fukushima [12]. At first, it was mainly used for image classification and
image feature extraction, and some of the best-known models were created to compete in image
classification challenges [13], [14]. During that period many video datasets of human activity were
published, and KTH [15] is one of the most popular small datasets. The CNN architecture encouraged
researchers to use it in HAR with image data taken from video datasets [10], [16]. Other researchers
proposed a two-stream CNN in which the first stream is image data and the second is optical flow [16], [17].
Among the most cited articles on 3D-CNN for HAR, [18] proposed a model evaluated on the TRECVID
dataset, and [19] proposed a 3D-CNN model for the UCF-101 dataset [20]; our model is inspired by the
latter.
2. RESEARCH METHOD
This paper proposes deep neural network (DNN) models that consist of many different layers,
each with its own purpose during training and testing. The dominant layer, on which this article focuses,
is the convolutional neural network (CNN), one of the most widely used neural networks. Its central idea
is to apply filters to the input data, that is, to convolve it, and to pass the convolved data to the next
layer. CNNs were primarily used for image data. There are now 1D, 2D, and 3D CNNs, which extend the
benefit of this architecture to other types of data with different numbers of dimensions. A 3D-CNN is used
for three-dimensional data, which suits our project well because we are dealing with video data. We use
video rather than single images because an activity is made of several consecutive movements of body
parts. This continuous movement can be observed across successive images, that is, in a video.
Three kinds of pooling layers are used in this paper: max-pooling, which returns the maximum
value within the kernel window as it slides over the data; average-pooling, which returns the average
value within the kernel window; and global-average-pooling, which returns the average value of each
channel (kernel) of the CNN layer, which is why it is useful in the final part of the model for reducing
the number of parameters. Dropout layers are used to help the model overcome overfitting.
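The three pooling operations described above can be illustrated on a toy volume. The following is only a minimal pure-Python sketch, not the Keras implementation used in this paper; the helper names `pool3d` and `global_average` and the 2*2*4 example volume are hypothetical.

```python
from itertools import product

def pool3d(volume, size, reduce_fn):
    """Non-overlapping 3D pooling over a nested list [depth][height][width]."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    out = []
    for z in range(0, d - size + 1, size):
        plane = []
        for y in range(0, h - size + 1, size):
            row = []
            for x in range(0, w - size + 1, size):
                # Collect every value inside the size*size*size window.
                window = [volume[z + dz][y + dy][x + dx]
                          for dz, dy, dx in product(range(size), repeat=3)]
                row.append(reduce_fn(window))
            plane.append(row)
        out.append(plane)
    return out

def mean(values):
    return sum(values) / len(values)

def global_average(volume):
    """Collapse one whole [depth][height][width] channel to a single number."""
    flat = [v for plane in volume for row in plane for v in row]
    return mean(flat)

# Toy single-channel volume of shape 2*2*4 holding the values 0..15.
volume = [[[0, 1, 2, 3], [4, 5, 6, 7]],
          [[8, 9, 10, 11], [12, 13, 14, 15]]]

max_pooled = pool3d(volume, 2, max)    # maximum of each 2*2*2 window
avg_pooled = pool3d(volume, 2, mean)   # average of each 2*2*2 window
gap = global_average(volume)           # one value for the whole channel
```

Max- and average-pooling keep one value per window, whereas global-average-pooling keeps one value per channel, which is why it shrinks the input of the following fully connected layer so strongly.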
2.1. Proposed models
The first suggested model, shown in Figure 1, is influenced by the model proposed by Tran et al.
[19]. It consists of three consecutive 3D convolutional layers, each with a 3*3*3 kernel and each
followed by a max-pooling layer with a 2*2*2 kernel. Zero-padding is used for all convolutional layers,
and the ReLU activation function is applied after each convolutional and fully connected (FC) layer
except the last, which uses SoftMax.
Figure 1. Initial proposed model
In the second attempt, another convolutional layer and max-pooling layer were added before the
flatten layer to increase the accuracy. Dropouts were then added at different places in the model over five
attempts to obtain high accuracy; all dropouts used a rate of 0.5. To reduce the number of parameters
(weights and biases), we then returned to the first model and added dropouts. The accuracy increased with
the dropouts, but the number of parameters was huge: with the fourth convolutional and pooling layers
removed, the output of the layer before the flatten layer is larger, so the first fully connected layer needs
more parameters. Equations (1)-(3) show how the number of parameters is calculated.
[Figure 1 diagram: Input (30*40*40*1) -> Conv3D (64) -> MaxPool -> Conv3D (128) -> MaxPool -> Conv3D (256) -> MaxPool -> Flatten -> FC (128) -> FC (32) -> FC (6) -> SoftMax output (6 classes)]
To solve this problem, a GlobalAveragePooling3D layer replaced the flatten layer. This replacement
reduced the number of parameters by about two million, to roughly 1.1 million, which is very helpful for
machines with low or moderate computation capabilities that must perform online activity recognition.
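$$\text{No. of parameters in CNN} = (\text{filter}_{width} \cdot \text{filter}_{height} \cdot \text{filter}_{depth} \cdot \text{no. of input channels} + 1) \cdot \text{no. of filters} \tag{1}$$
$$\text{No. of parameters in FC net} = (\text{neurons of previous layer} + 1) \cdot \text{neurons of current layer} \tag{2}$$
$$\text{No. of flatten outputs} = \text{product of all dimensions of the previous layer's output} \tag{3}$$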
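As a check on the parameter counts, the sketch below reproduces the per-layer values listed in the model tables of Figures 3 and 6. It assumes that each convolution kernel spans all input channels and that every filter and neuron has one bias, which is what is needed to match the tabulated numbers; the helper names are hypothetical.

```python
def conv3d_params(kernel, in_channels, filters):
    """Weights of one 3D kernel across all input channels, plus one bias, per filter."""
    kd, kh, kw = kernel
    return (kd * kh * kw * in_channels + 1) * filters

def dense_params(n_previous, n_current):
    """One weight per input neuron plus one bias, per output neuron."""
    return (n_previous + 1) * n_current

kernel = (3, 3, 3)
conv = [
    conv3d_params(kernel, 1, 64),     # grayscale input -> 64 filters
    conv3d_params(kernel, 64, 128),
    conv3d_params(kernel, 128, 256),
]

# Flatten turns the last pooled output (3, 5, 5, 256) into one long vector.
flatten_size = 3 * 5 * 5 * 256

# Head after Flatten versus head after GlobalAveragePooling3D (256 values).
head_flatten = [dense_params(flatten_size, 128), dense_params(128, 32), dense_params(32, 6)]
head_gap = [dense_params(256, 128), dense_params(128, 32), dense_params(32, 6)]

total_flatten = sum(conv) + sum(head_flatten)
total_gap = sum(conv) + sum(head_gap)
```

Almost all of the saving comes from the first fully connected layer, whose input drops from 19200 values to 256.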
3. RESULTS AND DISCUSSION
The neural network model is trained on the KTH dataset. Before use, the dataset is doubled by
adding a flipped copy of it. Of the data, 60% is used for training, 20% for validation during training,
and 20% for testing after training. The model was trained using TensorFlow v2.1 [21] as the backend and
Keras v2.3.1 [22] as the front end, with Python v3.7.5 and a batch size of 16. Each frame taken from a
video had a shape of 40x40 pixels with one channel (grayscale). From each video 30 frames were taken,
and four frames were discarded between every two consecutive taken frames.
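The sampling and splitting rules above can be sketched as follows. This is an illustration under stated assumptions, not the preprocessing code used in this paper: sampling is assumed to start at frame 0, the flip is assumed to be horizontal, and the sample count of 1000 is purely hypothetical.

```python
def sampled_frame_indices(n_frames=30, discarded_between=4, start=0):
    """Keep one frame, discard the next four, and repeat until 30 frames are kept."""
    step = discarded_between + 1
    return [start + i * step for i in range(n_frames)]

def flip_horizontally(frame):
    """Mirror one frame, given as a list of pixel rows (assumed flip direction)."""
    return [row[::-1] for row in frame]

def split_sizes(n_samples, train_pct=60, val_pct=20):
    """60% training, 20% validation, and the remainder for testing."""
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    return n_train, n_val, n_samples - n_train - n_val

indices = sampled_frame_indices()
frames_needed = indices[-1] + 1          # shortest clip that yields 30 frames
flipped = flip_horizontally([[1, 2, 3]])
sizes = split_sizes(1000)                # hypothetical number of clips
```

Under these assumptions a clip must be at least 146 frames long to provide the 30 sampled frames.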
The model was optimized with the Adam optimizer [23], with a learning rate of 0.001 and the
categorical cross-entropy loss function. The machine specification was: HP 15 Notebook; 12288 MB RAM;
Intel Core i5-3230M processor with four logical cores, two of them physical, and a maximum frequency of
2.6 GHz; Windows 8.1 Enterprise 64-bit OS (6.3, build 9600).
The validation data controls the training operation. After each update, the validation data is applied
to the model; if the validation loss has not improved for three epochs, the learning rate is multiplied by
one half, down to a minimum of 0.0001. If the loss does not improve for 15 consecutive epochs, training
stops before reaching the given number of epochs, which is 100.
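The training control described above can be simulated without any deep learning library. The sketch below is an interpretation rather than the training code of this paper: it assumes that 0.0001 is a lower bound on the learning rate and that the patience counter restarts after each reduction. In Keras, comparable behaviour is available through the `ReduceLROnPlateau` and `EarlyStopping` callbacks, but the text does not state which mechanism was used.

```python
def simulate_schedule(val_losses, lr=0.001, factor=0.5, lr_patience=3,
                      min_lr=0.0001, stop_patience=15):
    """Return (epoch, learning rate) pairs until the losses run out or training stops."""
    best = float("inf")
    since_lr_change = 0   # epochs without improvement since the last reduction
    since_best = 0        # epochs without improvement since the best loss
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            since_lr_change = 0
            since_best = 0
        else:
            since_lr_change += 1
            since_best += 1
            if since_lr_change >= lr_patience:
                lr = max(lr * factor, min_lr)   # halve, but never below min_lr
                since_lr_change = 0
        history.append((epoch, lr))
        if since_best >= stop_patience:
            break                               # early stop
    return history

# Hypothetical validation losses: two improving epochs, then a plateau.
history = simulate_schedule([1.0, 0.8] + [0.9] * 20)
```

With this hypothetical loss curve the learning rate is halved at epochs 5, 8, and 11, reaches its floor at epoch 14, and training stops at epoch 17.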
3.1. Calculating results
After the training operation finishes, the test samples are passed to the model to get its response,
and the accuracy, precision, recall, and F1_score are calculated using (4)-(7). These use the confusion
matrix quantities annotated in Figure 2 [24]: for class i, tp_i, tn_i, fp_i, and fn_i are the numbers of true
positives, true negatives, false positives, and false negatives, and l is the number of classes. The average
loss is calculated using the categorical cross-entropy [25], as in (8).
Figure 2. Confusion matrix annotation
$$\text{Accuracy} = \frac{\sum_{i=1}^{l} (tp_i + tn_i)}{\sum_{i=1}^{l} (tp_i + fn_i + fp_i + tn_i)} \tag{4}$$
$$\text{Precision} = \frac{\sum_{i=1}^{l} tp_i}{\sum_{i=1}^{l} (tp_i + fp_i)} \tag{5}$$
$$\text{Recall} = \frac{\sum_{i=1}^{l} tp_i}{\sum_{i=1}^{l} (tp_i + fn_i)} \tag{6}$$
$$\text{F1\_score} = \frac{(\beta^2 + 1) \cdot \text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision} + \text{Recall}} \tag{7}$$
$$\text{Loss} = \frac{1}{N} \sum_{i=1}^{N} \left( -\frac{1}{M} \sum_{j=1}^{M} d_{i,j} \log(y_{i,j}) \right) \tag{8}$$
Equation (4) gives the accuracy, which defines the overall effectiveness of the model. Equation (5)
gives the precision, which determines the agreement between the class labels of the data and the labels
calculated by the model. Equation (6) gives the recall, which demonstrates the effectiveness of the model
in identifying the class labels. Equation (7) gives the F1_score (β=1), which defines the relation between
the positive labels of the test data and those produced by the model.
Equation (8) gives the average loss, where N is the number of samples, M is the number of classes,
d is the true label or desired output, and y is the output calculated by the model. Table 1 shows the
calculated results for each model together with the number of the figure that presents it. Table 2 compares
the accuracy of our study with that of several studies on the KTH dataset for human activity recognition;
our method achieves the highest accuracy among them.
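The following sketch evaluates the summed forms of (4)-(6), the F-score, and the averaged cross-entropy loss on a small confusion matrix. It is a literal reading of the formulas, not the evaluation code of this paper; the function names and the 2*2 example matrix are hypothetical. Library implementations often offer per-class (macro) averaging as well, which can give different numbers.

```python
import math

def f_score(precision, recall, beta=1.0):
    """F-score; beta = 1 gives the F1_score."""
    b2 = beta ** 2
    return (b2 + 1) * precision * recall / (b2 * precision + recall)

def summed_metrics(confusion, beta=1.0):
    """confusion[t][p] = number of samples of true class t predicted as class p."""
    l = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = [confusion[i][i] for i in range(l)]
    fn = [sum(confusion[i]) - tp[i] for i in range(l)]
    fp = [sum(row[i] for row in confusion) - tp[i] for i in range(l)]
    tn = [total - tp[i] - fn[i] - fp[i] for i in range(l)]
    accuracy = sum(tp[i] + tn[i] for i in range(l)) / sum(
        tp[i] + fn[i] + fp[i] + tn[i] for i in range(l))
    precision = sum(tp) / sum(tp[i] + fp[i] for i in range(l))
    recall = sum(tp) / sum(tp[i] + fn[i] for i in range(l))
    return accuracy, precision, recall, f_score(precision, recall, beta)

def average_loss(d, y):
    """Cross-entropy averaged over M classes and N samples; d is one-hot, y is predicted."""
    n, m = len(d), len(d[0])
    return sum(-sum(d[i][j] * math.log(y[i][j]) for j in range(m)) / m
               for i in range(n)) / n

# Hypothetical two-class confusion matrix.
accuracy, precision, recall, f1 = summed_metrics([[8, 2], [1, 9]])
loss = average_loss([[1, 0], [0, 1]], [[0.5, 0.5], [0.5, 0.5]])
```

As a consistency check against Table 1, `f_score(86.30, 85.89)` evaluates to about 86.09, the F1_score reported for the first model.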
Table 1. Calculated results for all models
No.  Figure    Accuracy (%)  Loss  Precision (%)  Recall (%)  F1_score (%)
1    Figure 3  85.83         0.39  86.30          85.89       86.09
2    Figure 4  88.75         0.39  88.97          88.97       88.97
3    Figure 5  92.08         0.23  92.67          91.90       92.28
4    Figure 6  92.92         0.22  93.08          92.95       93.02
Table 2. Comparison of accuracies of studies on the KTH dataset
No. Method Accuracy %
1 Ahmad and Lee [26] 84.83
2 Taylor et al. [27] 88.00
3 Qian et al. [28] 88.69
4 Our method 93.33
3.2. Calculating the number of operations for layers
We want to calculate an approximate number of operations for each layer. The calculations do not
include control operations and are detailed below:
a. 3D-CNN: each kernel convolves over the entire input data. For a 3D-CNN layer with a kernel size of
$(K_d, K_h, K_w)$, input data of size (frames, height, width), and strides of
$(stride_d, stride_h, stride_w)$, we have:
$$\text{No. of operations per node} = (K_d \cdot K_h \cdot K_w + 1)^2 \cdot \text{no. of previous kernels} \tag{9}$$
$$\text{Output nodes} = \left(\frac{\text{frames} - K_d}{stride_d} + 1\right) \cdot \left(\frac{\text{height} - K_h}{stride_h} + 1\right) \cdot \left(\frac{\text{width} - K_w}{stride_w} + 1\right) \cdot \text{no. of current kernels} \tag{10}$$
$$\text{Output nodes} = \left(\frac{\text{frames}}{stride_d} + 1\right) \cdot \left(\frac{\text{height}}{stride_h} + 1\right) \cdot \left(\frac{\text{width}}{stride_w} + 1\right) \cdot \text{no. of current kernels} \tag{11}$$
Equation (9) gives the number of operations for each output node of the convolution; the power of
two appears because we have multiplications. Equation (10) [29] gives the number of output nodes of a
3D-CNN layer without zero padding. If zero padding is used, as in our proposed model, the number of
output nodes is as shown in (11) [29].
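Equations (9)-(11) can be evaluated directly for the first convolutional layer. The sketch below follows the formulas as written and assumes integer (floor) division for the stride terms, which the text does not specify; the function names are hypothetical. Note that the layer tables report a 30, 40, 40 output for the padded first convolution, so (11) is best read as an approximation.

```python
def conv_ops_per_node(kernel, previous_kernels):
    """Equation (9): approximate operations for one output node."""
    kd, kh, kw = kernel
    return (kd * kh * kw + 1) ** 2 * previous_kernels

def conv_output_nodes(shape, kernel, strides, kernels, padded):
    """Equation (11) when padded, otherwise equation (10)."""
    nodes = kernels
    for size, k, s in zip(shape, kernel, strides):
        # Floor division is an assumption for strides that do not divide evenly.
        nodes *= (size // s + 1) if padded else ((size - k) // s + 1)
    return nodes

shape, kernel, strides = (30, 40, 40), (3, 3, 3), (1, 1, 1)

ops_per_node = conv_ops_per_node(kernel, previous_kernels=1)
nodes_unpadded = conv_output_nodes(shape, kernel, strides, 64, padded=False)
nodes_padded = conv_output_nodes(shape, kernel, strides, 64, padded=True)
layer_ops = ops_per_node * nodes_padded   # approximate cost of the first Conv3D
```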
b. MaxPooling3D performs comparison operations. For a pool window size of $(P_d, P_h, P_w)$, input
data of size (frames, height, width), and strides of $(stride_d, stride_h, stride_w)$, we have:
$$\text{No. of operations per node} = P_d \cdot P_h \cdot P_w \tag{12}$$
$$\text{Output nodes} = \left(\frac{\text{frames} - P_d}{stride_d} + 1\right) \cdot \left(\frac{\text{height} - P_h}{stride_h} + 1\right) \cdot \left(\frac{\text{width} - P_w}{stride_w} + 1\right) \cdot \text{no. of current kernels} \tag{13}$$
$$\text{Output nodes} = \left(\frac{\text{frames}}{stride_d} + 1\right) \cdot \left(\frac{\text{height}}{stride_h} + 1\right) \cdot \left(\frac{\text{width}}{stride_w} + 1\right) \cdot \text{no. of current kernels} \tag{14}$$
Equation (12) gives the number of operations for each pool window. Equation (13) gives the number
of output nodes without padding, which is the case in our model, where the strides equal the pool window
size. Equation (14) gives the number of output nodes if zero padding is used.
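Applying (13) repeatedly with a 2*2*2 window and equal strides reproduces the pooled output shapes listed in the model tables. This is a minimal sketch that assumes floor division for odd sizes; the function name is hypothetical.

```python
def pool_output_shape(shape, pool, strides):
    """Per-axis term of equation (13), without padding."""
    return tuple((size - p) // s + 1 for size, p, s in zip(shape, pool, strides))

pool = strides = (2, 2, 2)
shape = (30, 40, 40)          # padded convolutions keep this shape between pools
trace = []
for _ in range(4):            # the deepest models use four pooling layers
    shape = pool_output_shape(shape, pool, strides)
    trace.append(shape)

# Equation (12) times equation (13) for the first pooling layer (64 kernels).
comparisons_first_pool = (2 * 2 * 2) * (15 * 20 * 20 * 64)
```

The trace matches the (frames, height, width) columns of the MaxPooling3D rows in the architecture tables.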
c. A fully connected layer has a vast number of parameters compared with a CNN layer. If the number of
neurons of the previous layer is $N_{previous}$ and the number of neurons of the current layer is
$N_{current}$, then:
$$\text{Number of operations} = (N_{previous} \cdot N_{current})^2 \tag{15}$$
Equation (15) gives the number of operations for a fully connected layer; the power of two appears
because we have multiplications.
d. Flatten has only one operation, which is reshaping the dimensions into one dimension.
e. Dropout works only during training. It hides a randomly chosen portion of the nodes so that they do
not participate in producing the output at a given training step, so it costs no operations during
testing.
f. GlobalAveragePooling3D produces one node for each channel, where the previous layer output is
(frames, height, width, channels). It adds all the values over frames, height, and width for a
particular channel and divides by (frames * height * width), so the total number of operations is as
shown in (16).
$$\text{Number of operations} = (\text{frames} \cdot \text{height} \cdot \text{width}) \cdot \text{channels} \tag{16}$$
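To see why replacing the flatten layer lowers the approximate operation count, (15) and (16) can be evaluated for the first fully connected layer of each variant. The sketch below follows the formulas as written and uses the layer sizes from the model tables; the function names are hypothetical.

```python
def fc_ops(n_previous, n_current):
    """Equation (15): approximate operations of a fully connected layer."""
    return (n_previous * n_current) ** 2

def gap_ops(frames, height, width, channels):
    """Equation (16): one addition per input value of GlobalAveragePooling3D."""
    return frames * height * width * channels

# First FC layer fed by Flatten of the (3, 5, 5, 256) output: 19200 inputs.
ops_flatten_head = fc_ops(3 * 5 * 5 * 256, 128)

# GlobalAveragePooling3D on the same output, then an FC layer with 256 inputs.
ops_gap_head = gap_ops(3, 5, 5, 256) + fc_ops(256, 128)
```

Under these formulas the pooling step itself is negligible, and the saving comes from the much smaller input of the first fully connected layer.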
Table 3 shows the number of operations for each proposed model, identified by its figure. The
model with the fewest operations is also the model with the fewest parameters, and it still reaches a high
accuracy of 92.92%.
Table 3. Number of operations for each model according to its figure
Figure        No. of operations
Figure 3      1.08 * 10^13
Figures 4, 5  5.77 * 10^12
Figure 6      4.77 * 10^12
4. DISCUSSION
Each result is discussed according to its figure number.
Figure 3 shows the model architecture, confusion matrix, accuracy, and loss curves for the earlier
proposed model. Because the output of the last layer before the flatten layer was not small, the first fully
connected layer has a massive number of parameters. The model's learning also saturated early, and
training completed within 30 epochs. Figure 4 shows the model architecture, confusion matrix, accuracy,
and loss curves for the proposed model after adding a Conv3D and a MaxPooling layer before the flatten
layer to reduce the number of parameters and the training time; training again finished within 30 epochs.
Figure 5 shows the model architecture, confusion matrix, accuracy, and loss curves for the model after
adding dropouts after the fourth, sixth, and eighth layers. The accuracy of this model showed a clear
improvement, and training ran to epoch 100, the maximum allowed. Figure 6 shows the model architecture,
confusion matrix, accuracy, and loss curves for the model after replacing the flatten layer with
GlobalAveragePooling3D, which reduced the number of parameters and hence the training time per epoch.
It also reduced the number of operations during testing and reached an accuracy of 92.92%; training
finished at epoch 82.
Layer (type) Output shape Parameters
Conv3D 30, 40, 40, 64 1792
MaxPooling3D 15, 20, 20, 64 0
Conv3D 15, 20, 20, 128 221312
MaxPooling3D 7, 10, 10, 128 0
Conv3D 7, 10, 10, 256 884992
MaxPooling3D 3, 5, 5, 256 0
Flatten 19200 0
Dense 128 2457728
Dense 32 4128
Dense 6 198
Total Parameters 3,570,150
(a) (b)
Figure 3. Earlier proposed model; (a) model architecture, (b) confusion matrix
(c) (d)
Figure 3. Earlier proposed model; (c) accuracy of training and validation, (d) losses of training and validation
(continued)
Layer (type) Output shape Parameters
Conv3D 30, 40, 40, 64 1792
MaxPooling3D 15, 20, 20, 64 0
Conv3D 15, 20, 20, 128 221312
MaxPooling3D 7, 10, 10, 128 0
Conv3D 7, 10, 10, 256 884992
MaxPooling3D 3, 5, 5, 256 0
Conv3D 3, 5, 5, 256 1769728
MaxPooling3D 1, 2, 2, 256 0
Flatten 1024 0
Dense 128 131200
Dense 32 4128
Dense 6 198
Total Parameters 3,013,350
(a) (b)
(c) (d)
Figure 4. Adding fourth convolutional and max-pooling layers; (a) model architecture, (b) confusion matrix,
(c) accuracy of training and validation, (d) losses of training and validation
Layer (type) Output shape Parameters
Conv3D 30, 40, 40, 64 1792
MaxPooling3D 15, 20, 20, 64 0
Conv3D 15, 20, 20, 128 221312
MaxPooling3D 7, 10, 10, 128 0
Dropout (0.5) 7, 10, 10, 128 0
Conv3D 7, 10, 10, 256 884992
MaxPooling3D 3, 5, 5, 256 0
Dropout (0.5) 3, 5, 5, 256 0
Conv3D 3, 5, 5, 256 1769728
MaxPooling3D 1, 2, 2, 256 0
Dropout (0.5) 1, 2, 2, 256 0
Flatten 1024 0
Dense 128 131200
Dense 32 4128
Dense 6 198
Total Parameters 3,013,350
(a) (b)
(c) (d)
Figure 5. Dropouts before the third and fourth convolutional layers and before the flatten layer; (a) model
architecture, (b) confusion matrix, (c) accuracy of training and validation, (d) losses of training and validation
Layer (type) Output shape Parameters
Conv3D 30, 40, 40, 64 1792
MaxPooling3D 15, 20, 20, 64 0
Conv3D 15, 20, 20, 128 221312
MaxPooling3D 7, 10, 10, 128 0
Conv3D 7, 10, 10, 256 884992
MaxPooling3D 3, 5, 5, 256 0
Dropout (0.5) 3, 5, 5, 256 0
GlobalAVG3D 256 0
Dropout (0.5) 256 0
Dense 128 32896
Dense 32 4128
Dense 6 198
Total parameters 1,145,318
(a) (b)
Figure 6. Replacing flatten with global average pooling; (a) model architecture, (b) confusion matrix
(c) (d)
Figure 6. Replacing flatten with global average pooling; (c) accuracy of training and validation, (d) losses of
training and validation (continued)
Dropout is of great benefit, but only when it is placed correctly. Several changes were made to the
position and number of dropouts: the accuracy increased when dropout was added before the third
convolutional layer and before the flatten layer, but decreased slightly when it was added before the
fourth convolutional layer. The dropouts were then moved to before and after the flatten layer, which
gave the maximum accuracy. This placement was then tested on the model with fewer layers in order to
decrease the number of parameters, and the results were helpful. To complete the parameter reduction,
the flatten layer was replaced by global-average-pooling, which reduced the number of parameters of the
model by about two million.
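The behaviour of a dropout layer with a rate of 0.5 can be sketched in a few lines. This is an illustration, not the layer used in this paper: the rescaling of kept activations by 1 / (1 - rate) follows the usual "inverted dropout" convention, which is how the Keras Dropout layer behaves, and the function name and seed are hypothetical.

```python
import random

def dropout(values, rate=0.5, training=True, rng=None):
    """Zero each activation with probability `rate` during training only."""
    if not training:
        return list(values)          # testing: dropout is a no-op
    rng = rng or random.Random()
    keep = 1.0 - rate
    # Kept activations are scaled up so the expected sum stays unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in values]

activations = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(activations, rate=0.5, training=True, rng=random.Random(0))
test_out = dropout(activations, rate=0.5, training=False)
```

Because the layer passes its input through unchanged at test time, its position affects only how the model is regularized during training, not the cost of inference.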
5. CONCLUSION
We have designed a model that can be used for online human activity recognition on a
moderate-computation machine. The accuracy of our proposed model reached 93.33%, and 92.92% for the
model with a reduced number of parameters. The last presented model suits machines with moderate
computation capabilities because of its low number of parameters and of mathematical operations. We
reached this accuracy by making use of dropouts and by decreasing the learning rate during training when
there was no improvement. The model with a low number of mathematical operations could be used for
online human activity recognition in smart houses, helping to monitor human activities there. Since only
flipping augmentation was applied to the data, we intend to apply more augmentation to increase the
overall accuracy.
REFERENCES
[1] X. Zhou, W. Liang, K. I. Wang, H. Wang, L. T. Yang and Q. Jin, "Deep-Learning-Enhanced Human Activity
Recognition for Internet of Healthcare Things," in IEEE Internet of Things Journal, vol. 7, no. 7, pp. 6429-6438,
July 2020, doi: 10.1109/JIOT.2020.2985082.
[2] V. Bianchi, M. Bassoli, G. Lombardo, P. Fornacciari, M. Mordonini and I. De Munari, "IoT Wearable Sensor and
Deep Learning: An Integrated Approach for Personalized Human Activity Recognition in a Smart Home
Environment," in IEEE Internet of Things Journal, vol. 6, no. 5, pp. 8553-8562, Oct. 2019, doi:
10.1109/JIOT.2019.2920283.
[3] A. K. M. Masum, A. Barua, E. H. Bahadur, M. R. Alam, M. A. U. Z. Chowdhury and M. S. Alam, "Human
Activity Recognition Using Multiple Smartphone Sensors," 2018 International Conference on Innovations in
Science, Engineering and Technology (ICISET), 2018, pp. 468-473, doi: 10.1109/ICISET.2018.8745628.
[4] M. M. Hassan, M. Z. Uddin, A. Mohamed and A. Almogren, “A robust human activity recognition system using
smartphone sensors and deep learning,” Future Generation Computer Systems, vol. 81, pp. 307-313, 2018, doi:
10.1016/j.future.2017.11.029.
[5] X. Wu, Z. Chu, P. Yang, C. Xiang, X. Zheng and W. Huang, "TW-See: Human Activity Recognition Through the
Wall With Commodity Wi-Fi Devices," in IEEE Transactions on Vehicular Technology, vol. 68, no. 1, pp. 306-
319, Jan. 2019, doi: 10.1109/TVT.2018.2878754.
[6] A. Diba et al., “Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification,”
Computer Science, 2017.
[7] T. Lima, B. Fernandes and P. Barros, "Human action recognition with 3D convolutional neural network," 2017
IEEE Latin American Conference on Computational Intelligence (LA-CCI), 2017, pp. 1-6, doi: 10.1109/LA-
CCI.2017.8285700.
[8] J. Carreira and A. Zisserman, “Quo Vadis, action recognition? A new model and the kinetics dataset,” Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2017-Janua, pp. 4724–4733,
2017.
[9] R. Singh, A. K. S. Kushwaha and R. Srivastava, “Multi-view recognition system for human activity based on
multiple features for video surveillance system,” Multimedia Tools and Applications, vol. 78, no. 12, pp. 17165-
17196, 2019, doi: 10.1007/s11042-018-7108-9.
[10] H. D. Mehr and H. Polat, "Human Activity Recognition in Smart Home With Deep Learning Approach," 2019 7th
International Istanbul Smart Grids and Cities Congress and Fair (ICSG), 2019, pp. 149-153, doi:
10.1109/SGCF.2019.8782290.
[11] Z. Tu et al., “Multi-stream CNN: Learning representations based on human-related regions for action recognition,”
Pattern Recognition, vol. 79, pp. 32-43, 2018, doi: 10.1016/j.patcog.2018.01.020.
[12] K. Fukushima, “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition
unaffected by shift in position,” Biological Cybernetics, vol. 36, no. 4, pp. 193-202, 1980, doi:
10.1007/BF00344251.
[13] J. Deng, W. Dong, R. Socher, L. Li, Kai Li and Li Fei-Fei, "ImageNet: A large-scale hierarchical image
database," 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248-255, doi:
10.1109/CVPR.2009.5206848.
[14] A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,”
Advances in neural information processing systems, vol. 60, no. 6, pp. 84-90, 2017, doi: 10.1145/3065386.
[15] “KTH dataset,” 2005. [Online]. Available: https://www.csc.kth.se/cvap/actions/. [Accessed: 27-May-2020].
[16] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar and L. Fei-Fei, "Large-Scale Video Classification
with Convolutional Neural Networks," 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014,
pp. 1725-1732, doi: 10.1109/CVPR.2014.223.
[17] K. Simonyan and A. Zisserman, “Two-stream convolutional networks for action recognition in videos,” in
Proceedings of the 27th International Conference on Neural Information Processing Systems, vol. 1, no. 1, pp. 568-
576, 2014, doi: 10.5555/2968826.2968890.
[18] S. Ji, W. Xu, M. Yang and K. Yu, "3D Convolutional Neural Networks for Human Action Recognition," in IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221-231, Jan. 2013, doi:
10.1109/TPAMI.2012.59.
[19] D. Tran, L. Bourdev, R. Fergus, L. Torresani and M. Paluri, “Learning spatiotemporal features with 3D
convolutional networks,” Proceedings of the IEEE International Conference on Computer Vision (ICCV), vol. 2015
Inter, pp. 4489-4497, 2015.
[20] K. Soomro, A. R. Zamir and M. Shah, “UCF101: A Dataset of 101 Human Actions Classes From Videos in The
Wild,” Computer Vision and Pattern Recognition, no. November, 2012.
[21] M. Abadi et al., “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems,” Distributed,
Parallel, and Cluster Computing, 2016.
[22] François Chollet, “Keras,” 2015. [Online]. Available: https://keras.io/. [Accessed: 08-Jun-2020].
[23] D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” Computer Science, Mathematics, pp. 1-
15, 2015.
[24] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,”
Information Processing & Management, vol. 45, no. 4, pp. 427-437, 2009, doi: 10.1016/j.ipm.2009.03.002.
[25] Z. Zhang and M. R. Sabuncu, “Generalized cross entropy loss for training deep neural networks with noisy labels,”
32nd Conference on Neural Information Processing Systems (NeurIPS), 2018, vol. 2018-Decem, no. NeurIPS, pp.
8778-8788, doi: 10.5555/3327546.3327555.
[26] M. Ahmad and S. W. Lee, “Human action recognition using shape and CLG-motion flow from multi-view image
sequences,” Pattern Recognition, vol. 41, no. 7, pp. 2237-2252, 2008, doi: 10.1016/j.patcog.2007.12.008.
[27] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, “Convolutional learning of spatio-temporal features,”
European conference on computer vision, Springer, Berlin, Heidelberg, 2010, vol. 6316 LNCS, no. PART 6, pp.
140-153, doi: 10.1007/978-3-642-15567-3_11.
[28] H. Qian, Y. Mao, W. Xiang and Z. Wang, “Recognition of human activities using SVM multi-class classifier,”
Pattern Recognition Letters, vol. 31, no. 2, pp. 100-111, 2010, doi: 10.1016/j.patrec.2009.09.019.
[29] I. Vasilev, D. Slater, G. Spacagna, P. Roelants and V. Zocca, "Python Deep Learning," 2nd ed. Birmingham:
Packt Publishing, 2019.
BIOGRAPHIES OF AUTHORS
Malik Alsaedi is an Asst. Prof. of electrical engineering. He received his B.Sc. degree from the
University of Technology, Baghdad, the M.Tech. degree from JNTU University, India, and the
Ph.D. degree from UTM University, Malaysia. He is currently deputy dean of the College of
Engineering, Al-Iraqia University, Iraq. He is interested in optical communication and IoT
technology.
Abdulrahman S. Mohialdeen holds a Bachelor's degree in Electrical Engineering from the
University of Baghdad and a Master's degree in Computer Engineering from Al-Iraqia University.
His research interests are deep learning, human activity recognition, and computer vision.
Baraa Munqith Albaker received both B.Sc. degree in electrical engineering and
M.Sc. degree in computer and control engineering from University of Baghdad, Iraq, and
Ph.D. degree in control engineering from University of Malaya, Malaysia. He had worked in
industry on data acquisition systems and radar signal processing and analysis for over three
years. He was a lecturer at University of Baghdad for four years. Next, he was a senior lecturer
of UMPEDAC research Centre, University of Malaya for two years. Currently, he works as
head of Networks Engineering department at Al-Iraqia University. His research interests focus
on contemporary development in computer and control applications.

More Related Content

What's hot

I017425763
I017425763I017425763
I017425763
IOSR Journals
 
An Efficient Analysis of Wavelet Techniques on Image Compression in MRI Images
An Efficient Analysis of Wavelet Techniques on Image Compression in MRI Images An Efficient Analysis of Wavelet Techniques on Image Compression in MRI Images
An Efficient Analysis of Wavelet Techniques on Image Compression in MRI Images
Associate Professor in VSB Coimbatore
 
IRJET- Handwritten Decimal Image Compression using Deep Stacked Autoencoder
IRJET- Handwritten Decimal Image Compression using Deep Stacked AutoencoderIRJET- Handwritten Decimal Image Compression using Deep Stacked Autoencoder
IRJET- Handwritten Decimal Image Compression using Deep Stacked Autoencoder
IRJET Journal
 
IMAGE COMPRESSION AND DECOMPRESSION SYSTEM
IMAGE COMPRESSION AND DECOMPRESSION SYSTEMIMAGE COMPRESSION AND DECOMPRESSION SYSTEM
Recently uploaded (20)

BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
 
Wearable antenna for antenna applications
Wearable antenna for antenna applicationsWearable antenna for antenna applications
Wearable antenna for antenna applications
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
 
New techniques for characterising damage in rock slopes.pdf
New techniques for characterising damage in rock slopes.pdfNew techniques for characterising damage in rock slopes.pdf
New techniques for characterising damage in rock slopes.pdf
 
sieving analysis and results interpretation
sieving analysis and results interpretationsieving analysis and results interpretation
sieving analysis and results interpretation
 
2. Operations Strategy in a Global Environment.ppt
2. Operations Strategy in a Global Environment.ppt2. Operations Strategy in a Global Environment.ppt
2. Operations Strategy in a Global Environment.ppt
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
 
Series of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.pptSeries of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.ppt
 
Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
 
Exception Handling notes in java exception
Exception Handling notes in java exceptionException Handling notes in java exception
Exception Handling notes in java exception
 
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdfIron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
 

Development of 3D convolutional neural network to recognize human activities using moderate computation machine

  • 1. Bulletin of Electrical Engineering and Informatics, Vol. 10, No. 6, December 2021, pp. 3137~3146. ISSN: 2302-9285, DOI: 10.11591/eei.v10i6.2802. Journal homepage: http://beei.org

Development of 3D convolutional neural network to recognize human activities using moderate computation machine

Malik A. Alsaedi, Abdulrahman S. Mohialdeen, Baraa M. Albaker
College of Engineering, Al-Iraqia University, Sabe’ Abkar, Adhamiya, Baghdad, Iraq

Article history: Received Jan 12, 2021; Revised May 20, 2021; Accepted Oct 9, 2021

ABSTRACT
Human activity recognition (HAR) has recently been used in numerous applications, including smart homes that monitor human behavior and automate the home according to human activities, entertainment, fall detection, violence detection, and people care. Vision-based recognition is the most powerful and most widely used method for implementing HAR systems because it can recognize complex human activities. This paper addresses the design of a 3D convolutional neural network (3D-CNN) model that can be used in smart homes to identify a number of activities. The model is trained on the KTH dataset, which contains the activities walking, running, jogging, handwaving, handclapping, and boxing. Despite the challenges this method faces from illumination effects, background variation, and the variety of human bodies, the proposed model reached an accuracy of 93.33%. The model was implemented, trained, and tested on a moderate computation machine, and the results show that it can recognize human activities with reasonable computation.

Keywords: 3D-CNN, Convolutional neural network, Deep learning, HAR, Smart home, Vision

This is an open access article under the CC BY-SA license.

Corresponding Author: Abdulrahman S. Mohialdeen, College of Engineering, Al-Iraqia University, Sabe’ Abkar, Adhamiya, Baghdad, Iraq. Email: abd_saeed@aliraqia.edu.iq

1. INTRODUCTION
HAR is a challenging subject because of the huge number of human activities: some activities are easily noticed, some are confusing, and some require interaction with other objects or humans. Besides the diversity of the activities, the recognition methods are also diverse. Many types of data can be used to recognize a human activity. Some methods use ambient sensors such as accelerometer, gyroscope, humidity, and temperature sensors [1], [2]. Some benefit from smartphone sensors such as the accelerometer and gyroscope [3], [4], and others use radio frequency [5]. The most popular methods, however, use vision-based recognition [6]-[14], in which images or videos are used to recognize the activity. Recognizing human activities from visual data is challenging because of illumination effects and background variation. The question is how to process these visual data to recognize the activity. Many techniques exist, most of them based on machine learning. Deep learning has shown excellent results for recognizing human activities, especially CNNs, which are very useful for vision-based recognition.
  • 2. In this paper, we design a neural network architecture (model) that can be used for human activity monitoring. The proposed model is a three-dimensional CNN (3D-CNN). The purpose of using a 3D-CNN is to extract spatial and temporal features rather than only spatial features: an activity consists of multiple movements that can be identified by extracting temporal features across several frames. Our proposed model uses a small number of 3D-CNN layers to reduce the processing time, so that a computer with low computational ability can recognize human activity online without delay.

The CNN was introduced by Fukushima [12]. At first it was mainly used for image classification and image feature extraction; one of the most influential models was developed for an image classification challenge [13], [14]. During that period many video datasets of human activity were published, and KTH [15] is one of the most popular small datasets. The CNN architecture encouraged researchers to use it in HAR with image data taken from video datasets [10], [16]. Other researchers proposed two-stream CNNs, where the first stream is image data and the second is optical flow [16], [17]. Among the most cited 3D-CNN articles for HAR, [18] proposed a model using the TRECVID dataset, and [19] proposed a 3D-CNN model for the UCF-101 dataset [20]; our model is inspired by the latter.

2. RESEARCH METHOD
This paper proposes deep neural network (DNN) models that consist of many different layers, each with its own purpose during training and testing. The dominant layer, on which this article focuses, is the convolutional layer. The convolutional neural network is one of the most widely used neural networks; its central idea is to apply filters to (convolve) the input data and pass the convolved data to the next layer. CNNs were primarily used for image data. There are now 1D, 2D, and 3D CNNs that apply this architecture to other types of data with different numbers of dimensions.
A 3D-CNN is used for three-dimensional data, which suits our project because we are dealing with video data. Video is used rather than single images because an activity is made of several consecutive movements of body parts; this continuous movement can only be observed in successive images, that is, in a video. The pooling layers used in this paper are max-pooling, which returns the maximum value within the kernel window as it slides over the data; average-pooling, which returns the average value within the kernel window; and global-average-pooling, which returns the average value of each channel (kernel) of the CNN layer, which is why it is useful at the final part of the model to reduce the number of parameters. Dropout layers are used to help the model overcome overfitting.

2.1. Proposed models
The first suggested model, shown in Figure 1, is influenced by the model proposed by Tran et al. [19]. The model consists of three connected 3D convolutional layers with a 3*3*3 kernel size for all convolutional layers, each followed by a max-pooling layer with a kernel size of 2*2*2. Zero-padding is used for all convolutional layers, and the ReLU activation function is applied after each convolutional and fully connected (FC) layer except the last, which uses SoftMax.

Figure 1. Initial proposed model

In the second attempt, a further convolutional layer and max-pooling layer were added before the flatten layer to increase the accuracy. Dropouts were then added at different places in the model over five attempts to obtain high accuracy; all dropouts used a rate of 0.5. We then tried to reduce the number of parameters (weights and biases) by returning to the first model, this time with dropouts. The accuracy increased with the dropouts.
However, the number of parameters was huge: with the fourth convolutional and pooling layers removed, the output of the layer before the flatten layer was larger, so the first fully connected layer needed far more parameters. Equations (1)-(3) show how the number of parameters is calculated.

[Figure 1 layer sequence: Input 30*40*40*1 -> Conv3D 64 -> MaxPool -> Conv3D 128 -> MaxPool -> Conv3D 256 -> MaxPool -> Flatten -> FC 128 -> FC 32 -> FC 6 -> SoftMax -> Output 6 classes]
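The layer sequence of Figure 1 can be traced by hand to see why the flatten layer produces such a large vector. The following is a minimal sketch, assuming zero ("same") padding with stride 1 for every convolution and a 2*2*2 max-pooling window with stride 2 and no padding, as the text describes; the function names are illustrative only and are not part of the authors' code.

```python
from math import prod

def conv3d_same(shape, filters):
    """Zero-padded 3D convolution with stride 1: spatial size is kept, channels change."""
    frames, height, width, _ = shape
    return (frames, height, width, filters)

def maxpool3d(shape, pool=2):
    """Unpadded max-pooling with stride equal to the window: sizes are floor-divided."""
    frames, height, width, channels = shape
    return (frames // pool, height // pool, width // pool, channels)

def trace_shapes(input_shape, conv_filters):
    """Return the output shape after every Conv3D and MaxPooling3D layer."""
    shapes = []
    shape = input_shape
    for filters in conv_filters:
        shape = conv3d_same(shape, filters)
        shapes.append(("Conv3D", shape))
        shape = maxpool3d(shape)
        shapes.append(("MaxPooling3D", shape))
    return shapes

# Input: 30 frames of 40x40 grayscale pixels, as in Figure 1.
three_conv = trace_shapes((30, 40, 40, 1), (64, 128, 256))
four_conv = trace_shapes((30, 40, 40, 1), (64, 128, 256, 256))

for name, shape in three_conv:
    print(f"{name:13s} {shape}")
print("Flatten size (three conv blocks):", prod(three_conv[-1][1]))
print("Flatten size (four conv blocks): ", prod(four_conv[-1][1]))
```

Under these assumptions the three-block model flattens a 3*5*5*256 volume, while a fourth block shrinks it to 1*2*2*256, which is what drives the difference in the size of the first fully connected layer.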
  • 3. To solve this problem, a GlobalAveragePooling3D layer replaced the flatten layer. This replacement reduced the number of parameters by about two million, which is very helpful for machines with low or moderate computation capabilities performing online activity recognition.

$$\text{No. of parameters in CNN} = (\text{filter}_{width} \times \text{filter}_{height} \times \text{filter}_{depth} \times \text{no. of input channels} + 1) \times \text{no. of filters} \quad (1)$$

$$\text{No. of parameters in FC net} = (\text{neurons of previous layer} + 1) \times \text{neurons of current layer} \quad (2)$$

$$\text{No. of outputs of flatten} = \text{product of all dimensions of the previous layer's output} \quad (3)$$

The flatten layer itself has no trainable parameters; (3) gives the number of values it passes to the first FC layer, which enters (2) as the number of neurons of the previous layer.

3. RESULTS AND DISCUSSION
The neural network model is trained on the KTH dataset. The dataset is doubled before use by adding a flipped copy of it. Of the data, 60% is used for training, 20% for validation during training, and 20% for testing after training. The model was trained using TensorFlow v2.1 [21] as the backend and Keras v2.3.1 [22] with Python v3.7.5 as the front-end, with a batch size of 16. Each frame taken from a video was resized to 40x40 pixels with one channel (grayscale). From each video 30 frames were taken, with four frames discarded between consecutive sampled frames. The optimizer was Adam [23] with a learning rate of 0.001, and the loss function was categorical cross-entropy. The machine specification was: HP 15 Notebook, 12288 MB RAM, Intel Core i5-3230M processor with four cores (two of them physical) and a maximum frequency of 2.6 GHz, and Windows 8.1 Enterprise 64-bit OS (6.3, build 9600).

Validation data controls the training operation. After each epoch the validation data is applied to the model; if the validation loss has not improved for three epochs, the learning rate is multiplied by one half, down to a minimum learning rate of 0.0001.
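The training control described here (halve the learning rate after three epochs without improvement in validation loss, never going below 0.0001, and stop after 15 epochs without improvement or after 100 epochs) can be illustrated with a small simulation. This is only a sketch of the rule as stated in the text, not the authors' training script; the function name and the synthetic loss sequence are hypothetical, and a framework callback may differ in details such as how counters are reset.

```python
def simulate_training_control(val_losses, lr=1e-3, factor=0.5, lr_patience=3,
                              min_lr=1e-4, stop_patience=15, max_epochs=100):
    """Return a list of (epoch, learning_rate) pairs up to the epoch where training stops."""
    best = float("inf")
    epochs_since_best = 0      # drives early stopping
    epochs_since_lr_step = 0   # drives learning-rate reduction
    history = []
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            epochs_since_best = 0
            epochs_since_lr_step = 0
        else:
            epochs_since_best += 1
            epochs_since_lr_step += 1
            if epochs_since_lr_step >= lr_patience:
                lr = max(lr * factor, min_lr)   # halve, but respect the floor
                epochs_since_lr_step = 0
        history.append((epoch, lr))
        if epochs_since_best >= stop_patience:
            break                               # early stopping
    return history

# Synthetic example: the loss improves for two epochs, then plateaus.
losses = [1.0, 0.9] + [0.95] * 20
history = simulate_training_control(losses)
for epoch, lr in history:
    print(epoch, lr)
```

In this synthetic run the learning rate is halved every three stagnant epochs until it reaches the floor, and training stops once 15 epochs have passed without a new best validation loss.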
If the validation loss does not improve for 15 consecutive epochs, training stops before reaching the given maximum of 100 epochs.

3.1. Calculating results
After training finishes, the test samples are passed to the model to obtain its responses. The accuracy, precision, recall, and F1_score are calculated using (4)-(7), which use the confusion matrix notation shown in Figure 2 [24]; the average loss is calculated using categorical cross-entropy [25].

Figure 2. Confusion matrix annotation

$$\text{Accuracy} = \frac{\sum_{i=1}^{l} (tp_i + tn_i)}{\sum_{i=1}^{l} (tp_i + fn_i + fp_i + tn_i)} \quad (4)$$

Equation (4) gives the accuracy, which describes the overall effectiveness of the model.

$$\text{Precision} = \frac{\sum_{i=1}^{l} tp_i}{\sum_{i=1}^{l} (tp_i + fp_i)} \quad (5)$$

Equation (5) gives the precision, which measures the agreement between the true class labels and the labels calculated by the model. Equation (6) gives the recall, which measures the effectiveness of the model in identifying the class labels. Equation (7) gives the F1_score, which relates the positive labels of the test data to the labels produced by the model. Equation (8) gives the average loss, where N is the number of samples, M is the number of classes, d is the true label (desired output), and y is the output calculated by the model.

Table 1 lists the calculated results for each model together with its figure number. Table 2 compares the accuracy of several studies on the KTH dataset for human activity recognition with the accuracy of our study; our method shows a clear improvement in accuracy.
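The metrics of this section can be computed directly from a multi-class confusion matrix. The sketch below sums the per-class counts over all classes as written in (4)-(6) and uses the standard F-score form from [24], with the product of precision and recall in the numerator and β = 1. The confusion matrix is made-up illustrative data, not a result from the paper, and the function names are hypothetical.

```python
def per_class_counts(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    counts = []
    for i in range(n_classes):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                                 # rest of the true-class row
        fp = sum(cm[r][i] for r in range(n_classes)) - tp    # rest of the predicted column
        tn = total - tp - fn - fp
        counts.append((tp, fp, fn, tn))
    return counts

def summed_metrics(cm, beta=1.0):
    """Accuracy, precision, recall and F-score with sums taken over all classes."""
    counts = per_class_counts(cm)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    tn = sum(c[3] for c in counts)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = (beta ** 2 + 1) * precision * recall / (beta ** 2 * precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
example_cm = [[5, 1, 0],
              [1, 4, 1],
              [0, 0, 6]]
result = summed_metrics(example_cm)
print(result)
```

Note that the accuracy in (4) counts true negatives for every class, so it is generally higher than the plain fraction of correctly classified samples (15 of 18 in this example).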
  • 4.
$$\text{Recall} = \frac{\sum_{i=1}^{l} tp_i}{\sum_{i=1}^{l} (tp_i + fn_i)} \quad (6)$$

$$\text{F1\_score} = \frac{(\beta^2 + 1) \cdot \text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision} + \text{Recall}} \quad (7)$$

$$\text{Loss} = \frac{1}{N} \sum_{i=1}^{N} \left( -\frac{1}{M} \sum_{j=1}^{M} d_{i,j} \log(y_{i,j}) \right) \quad (8)$$

Table 1. Calculated results for all models
No.   Figure     Accuracy %   Loss   Precision %   Recall %   F1_score %
1     Figure 3   85.83        0.39   86.30         85.89      86.09
2     Figure 4   88.75        0.39   88.97         88.97      88.97
3     Figure 5   92.08        0.23   92.67         91.90      92.28
4     Figure 6   92.92        0.22   93.08         92.95      93.02

Table 2. Comparison of accuracies of studies done on KTH
No.   Method                  Accuracy %
1     Ahmad and Lee [26]      84.83
2     Taylor et al. [27]      88.00
3     Qian et al. [28]        88.69
4     Our method              93.33

3.2. Calculating the number of operations for layers
We want to calculate an approximate number of operations for each layer. The calculations do not include control operations and are detailed below:

a. 3D-CNN: each kernel is convolved over the entire input data. For a 3D-CNN layer with a kernel size of (K_d, K_h, K_w), input data of size (frames, height, width), and strides (stride_d, stride_h, stride_w), we have:

$$\text{No. of operations per node} = (K_d \cdot K_h \cdot K_w + 1)^2 \cdot \text{no. of previous kernels} \quad (9)$$

$$\text{Output nodes} = \left(\frac{frames - K_d}{stride_d} + 1\right) \left(\frac{height - K_h}{stride_h} + 1\right) \left(\frac{width - K_w}{stride_w} + 1\right) \cdot \text{no. of current kernels} \quad (10)$$

$$\text{Output nodes} = \left(\frac{frames}{stride_d} + 1\right) \left(\frac{height}{stride_h} + 1\right) \left(\frac{width}{stride_w} + 1\right) \cdot \text{no. of current kernels} \quad (11)$$

Equation (9) gives the number of operations for each output node of the convolution; the power of two accounts for the multiplications. Equation (10) [29] gives the number of output nodes of a 3D-CNN layer without zero padding. With zero padding, which is used in our proposed model, the number of output nodes is given by (11) [29].

b. MaxPooling3D performs comparison operations. For a pool window size of (P_d, P_h, P_w), input data of size (frames, height, width), and strides (stride_d, stride_h, stride_w), we have:

$$\text{No. of operations per node} = P_d \cdot P_h \cdot P_w \quad (12)$$

$$\text{Output nodes} = \left(\frac{frames - P_d}{stride_d} + 1\right) \left(\frac{height - P_h}{stride_h} + 1\right) \left(\frac{width - P_w}{stride_w} + 1\right) \cdot \text{no. of current kernels} \quad (13)$$

$$\text{Output nodes} = \left(\frac{frames}{stride_d} + 1\right) \left(\frac{height}{stride_h} + 1\right) \left(\frac{width}{stride_w} + 1\right) \cdot \text{no. of current kernels} \quad (14)$$

Equation (12) gives the number of operations for each pool window. Equation (13) gives the number of output nodes without padding, which is the case in our model, where the stride equals the pool window size. Equation (14) gives the number of output nodes if zero padding is used.

c. A fully connected layer has a vast number of parameters compared with a CNN layer. Let the number of neurons of the previous layer be N_previous and the number of neurons of the current layer be N_current. Equation (15) gives the number of operations for a fully connected layer; the power of two accounts for the multiplications.
  • 5.
$$\text{Number of operations} = (N_{previous} \cdot N_{current})^2 \quad (15)$$

d. Flatten has only one operation, which is reshaping the dimensions into one dimension.

e. Dropout works only during training: it hides a randomly chosen portion of the nodes so that they do not participate in producing the output at a given training step. For testing it does not cost any operations.

f. GlobalAveragePooling3D produces one node for each channel. Where the previous layer's output is (frames, height, width, channels), it adds all the values over frames, height, and width for a particular channel and divides by (frames * height * width), so the total number of operations is given by (16).

$$\text{Number of operations} = (frames \cdot height \cdot width) \cdot channels \quad (16)$$

Table 3 shows the number of operations for each proposed model according to its figure. The model with the fewest operations is also the model with the fewest parameters, and it reaches a high accuracy of 92.92%.

Table 3. Number of operations for each model according to its figure
Figure          No. of operations
Figure 3        1.08 * 10^13
Figures 4, 5    5.77 * 10^12
Figure 6        4.77 * 10^12

4. DISCUSSION
Each result is discussed according to its figure number. Figure 3 shows the model architecture, confusion matrix, accuracy, and loss plots for the earlier proposed model. Because the output of the last layer before the flatten layer was not small, the fully connected layer had a massive number of parameters. The model's learning also saturated early: training completed within 30 epochs. Figure 4 shows the model architecture, confusion matrix, accuracy, and loss plots for the proposed model after adding a Conv3D and a MaxPooling layer before the flatten layer to reduce the number of parameters and the training time; training again finished within 30 epochs.
Figure 5 shows the model architecture, confusion matrix, accuracy, and loss plots for the model after adding dropouts after the fourth, sixth, and eighth layers. The accuracy of this model improved markedly, and training ran to epoch 100, the maximum requested. Figure 6 shows the model architecture, confusion matrix, accuracy, and loss plots for the model after replacing the flatten layer with GlobalAveragePooling3D, which reduced the number of parameters and therefore the training time per epoch. It also reduced the number of operations during testing and reached an accuracy of 92.92%; training finished at epoch 82.

Layer (type)     Output shape       Parameters
Conv3D           30, 40, 40, 64     1792
MaxPooling3D     15, 20, 20, 64     0
Conv3D           15, 20, 20, 128    221312
MaxPooling3D     7, 10, 10, 128     0
Conv3D           7, 10, 10, 256     884992
MaxPooling3D     3, 5, 5, 256       0
Flatten          19200              0
Dense            128                2457728
Dense            32                 4128
Dense            6                  198
Total parameters                    3,570,150

Figure 3. Earlier proposed model; (a) model architecture, (b) confusion matrix
  • 6. Figure 3. Earlier proposed model (continued); (c) accuracy of training and validation, (d) losses of training and validation

Layer (type)     Output shape       Parameters
Conv3D           30, 40, 40, 64     1792
MaxPooling3D     15, 20, 20, 64     0
Conv3D           15, 20, 20, 128    221312
MaxPooling3D     7, 10, 10, 128     0
Conv3D           7, 10, 10, 256     884992
MaxPooling3D     3, 5, 5, 256       0
Conv3D           3, 5, 5, 256       1769728
MaxPooling3D     1, 2, 2, 256       0
Flatten          1024               0
Dense            128                131200
Dense            32                 4128
Dense            6                  198
Total parameters                    3,013,350

Figure 4. Adding fourth convolutional and max-pooling layers; (a) model architecture, (b) confusion matrix, (c) accuracy of training and validation, (d) losses of training and validation
  • 7.
Layer (type)     Output shape       Parameters
Conv3D           30, 40, 40, 64     1792
MaxPooling3D     15, 20, 20, 64     0
Conv3D           15, 20, 20, 128    221312
MaxPooling3D     7, 10, 10, 128     0
Dropout (0.5)    7, 10, 10, 128     0
Conv3D           7, 10, 10, 256     884992
MaxPooling3D     3, 5, 5, 256       0
Dropout (0.5)    3, 5, 5, 256       0
Conv3D           3, 5, 5, 256       1769728
MaxPooling3D     1, 2, 2, 256       0
Dropout (0.5)    1, 2, 2, 256       0
Flatten          1024               0
Dense            128                131200
Dense            32                 4128
Dense            6                  198
Total parameters                    3,013,350

Figure 5. Dropouts before third and fourth convolutional and also before flatten layer; (a) model architecture, (b) confusion matrix, (c) accuracy of training and validation, (d) losses of training and validation

Layer (type)              Output shape       Parameters
Conv3D                    30, 40, 40, 64     1792
MaxPooling3D              15, 20, 20, 64     0
Conv3D                    15, 20, 20, 128    221312
MaxPooling3D              7, 10, 10, 128     0
Conv3D                    7, 10, 10, 256     884992
MaxPooling3D              3, 5, 5, 256       0
Dropout (0.5)             3, 5, 5, 256       0
GlobalAveragePooling3D    256                0
Dropout (0.5)             256                0
Dense                     128                32896
Dense                     32                 4128
Dense                     6                  198
Total parameters                             1,145,318

Figure 6. Replacing flatten with global average pooling; (a) model architecture, (b) confusion matrix
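The parameter totals in the layer tables can be reproduced with a few lines of arithmetic. The sketch below assumes the usual counting convention in which a convolutional filter has one weight per kernel position per input channel plus one bias, and a dense neuron has one weight per input plus one bias; under that assumption the per-layer and total values match the tables. The helper names are illustrative only.

```python
def conv3d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a 3D convolution."""
    kd, kh, kw = kernel
    return (kd * kh * kw * in_channels + 1) * filters

def dense_params(n_in, n_out):
    """Weights plus one bias per neuron for a fully connected layer."""
    return n_in * n_out + n_out

def total_params(conv_filters, dense_in, dense_units=(128, 32, 6),
                 kernel=(3, 3, 3), in_channels=1):
    """Sum parameters over the conv stack and the dense stack.

    dense_in is the number of values entering the first dense layer
    (flatten size, or the channel count after global average pooling).
    """
    total = 0
    channels = in_channels
    for filters in conv_filters:
        total += conv3d_params(kernel, channels, filters)
        channels = filters
    n_in = dense_in
    for units in dense_units:
        total += dense_params(n_in, units)
        n_in = units
    return total

flatten_model = total_params((64, 128, 256), dense_in=3 * 5 * 5 * 256)
four_conv_model = total_params((64, 128, 256, 256), dense_in=1 * 2 * 2 * 256)
gap_model = total_params((64, 128, 256), dense_in=256)

print("Three conv blocks + flatten:", flatten_model)
print("Four conv blocks + flatten: ", four_conv_model)
print("Three conv blocks + GAP:    ", gap_model)
print("Saved by replacing flatten with GAP:", flatten_model - gap_model)
```

Almost all of the saving comes from the first dense layer, whose input shrinks from 19200 values after flatten to 256 values after global average pooling.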
  • 8. Figure 6. Replacing flatten with global average pooling (continued); (c) accuracy of training and validation, (d) losses of training and validation

Dropout is of great benefit, but only when it is placed correctly. The position and number of dropouts were changed several times. Dropout increased the accuracy when added before the third convolutional layer and before the flatten layer, but the accuracy decreased slightly when dropout was added before the fourth convolutional layer. The dropouts were then placed before and after the flatten layer, which gave the maximum accuracy. This gain was then tested on the model with fewer layers, with the aim of decreasing the number of parameters, and the results were helpful. To complete the reduction in parameters, the flatten layer was replaced by global average pooling, which reduced the number of parameters of the model by two million.

5. CONCLUSION
We have designed a model that can be used for online human activity recognition on a moderate computation machine. The accuracy of our proposed model was raised to 93.33%, and to 92.92% for the model with a reduced number of parameters. The last presented model is useful for machines with moderate computation capabilities because of its low number of parameters and low number of mathematical operations. We reached this accuracy by using dropouts and by decreasing the learning rate during training when there was no improvement. The model with a low number of mathematical operations could be used for online human activity recognition in smart houses, helping to monitor human activities in the home. Only flipping augmentation was applied to the data; we intend to apply more augmentation to increase the overall accuracy.
REFERENCES
[1] X. Zhou, W. Liang, K. I. Wang, H. Wang, L. T. Yang and Q. Jin, "Deep-Learning-Enhanced Human Activity Recognition for Internet of Healthcare Things," in IEEE Internet of Things Journal, vol. 7, no. 7, pp. 6429-6438, July 2020, doi: 10.1109/JIOT.2020.2985082.
[2] V. Bianchi, M. Bassoli, G. Lombardo, P. Fornacciari, M. Mordonini and I. De Munari, "IoT Wearable Sensor and Deep Learning: An Integrated Approach for Personalized Human Activity Recognition in a Smart Home Environment," in IEEE Internet of Things Journal, vol. 6, no. 5, pp. 8553-8562, Oct. 2019, doi: 10.1109/JIOT.2019.2920283.
[3] A. K. M. Masum, A. Barua, E. H. Bahadur, M. R. Alam, M. A. U. Z. Chowdhury and M. S. Alam, "Human Activity Recognition Using Multiple Smartphone Sensors," 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET), 2018, pp. 468-473, doi: 10.1109/ICISET.2018.8745628.
[4] M. M. Hassan, M. Z. Uddin, A. Mohamed and A. Almogren, “A robust human activity recognition system using smartphone sensors and deep learning,” Future Generation Computer Systems, vol. 81, pp. 307-313, 2018, doi: 10.1016/j.future.2017.11.029.
  • 9.
[5] X. Wu, Z. Chu, P. Yang, C. Xiang, X. Zheng and W. Huang, "TW-See: Human Activity Recognition Through the Wall With Commodity Wi-Fi Devices," in IEEE Transactions on Vehicular Technology, vol. 68, no. 1, pp. 306-319, Jan. 2019, doi: 10.1109/TVT.2018.2878754.
[6] A. Diba et al., “Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification,” Computer Science, 2017.
[7] T. Lima, B. Fernandes and P. Barros, "Human action recognition with 3D convolutional neural network," 2017 IEEE Latin American Conference on Computational Intelligence (LA-CCI), 2017, pp. 1-6, doi: 10.1109/LA-CCI.2017.8285700.
[8] J. Carreira and A. Zisserman, “Quo Vadis, action recognition? A new model and the kinetics dataset,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2017-Janua, pp. 4724-4733, 2017.
[9] R. Singh, A. K. S. Kushwaha and R. Srivastava, “Multi-view recognition system for human activity based on multiple features for video surveillance system,” Multimedia Tools and Applications, vol. 78, no. 12, pp. 17165-17196, 2019, doi: 10.1007/s11042-018-7108-9.
[10] H. D. Mehr and H. Polat, "Human Activity Recognition in Smart Home With Deep Learning Approach," 2019 7th International Istanbul Smart Grids and Cities Congress and Fair (ICSG), 2019, pp. 149-153, doi: 10.1109/SGCF.2019.8782290.
[11] Z. Tu et al., “Multi-stream CNN: Learning representations based on human-related regions for action recognition,” Pattern Recognition, vol. 79, pp. 32-43, 2018, doi: 10.1016/j.patcog.2018.01.020.
[12] K. Fukushima, “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position,” Biological Cybernetics, vol. 36, no. 4, pp. 193-202, 1980, doi: 10.1007/BF00344251.
[13] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248-255, doi: 10.1109/CVPR.2009.5206848.
[14] A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Advances in neural information processing systems, vol. 60, no. 6, pp. 84-90, 2017, doi: 10.1145/3065386.
[15] “KTH dataset,” 2005. [Online]. Available: https://www.csc.kth.se/cvap/actions/. [Accessed: 27-May-2020].
[16] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar and L. Fei-Fei, "Large-Scale Video Classification with Convolutional Neural Networks," 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1725-1732, doi: 10.1109/CVPR.2014.223.
[17] K. Simonyan and A. Zisserman, “Two-stream convolutional networks for action recognition in videos,” in Proceedings of the 27th International Conference on Neural Information Processing Systems, vol. 1, no. 1, pp. 568-576, 2014, doi: 10.5555/2968826.2968890.
[18] S. Ji, W. Xu, M. Yang and K. Yu, "3D Convolutional Neural Networks for Human Action Recognition," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221-231, Jan. 2013, doi: 10.1109/TPAMI.2012.59.
[19] D. Tran, L. Bourdev, R. Fergus, L. Torresani and M. Paluri, “Learning spatiotemporal features with 3D convolutional networks,” Proceedings of the IEEE International Conference on Computer Vision (ICCV), vol. 2015 Inter, pp. 4489-4497, 2015.
[20] K. Soomro, A. R. Zamir and M. Shah, “UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild,” Computer Vision and Pattern Recognition, no. November, 2012.
[21] M. Abadi et al., “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems,” Distributed, Parallel, and Cluster Computing, 2016.
[22] François Chollet, “Keras,” 2015. [Online]. Available: https://keras.io/. [Accessed: 08-Jun-2020].
[23] D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” Computer Science, Mathematics, pp. 1-15, 2015.
[24] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,” Information Processing & Management, vol. 45, no. 4, pp. 427-437, 2009, doi: 10.1016/j.ipm.2009.03.002.
[25] Z. Zhang and M. R. Sabuncu, “Generalized cross entropy loss for training deep neural networks with noisy labels,” 32nd Conference on Neural Information Processing Systems (NeurIPS), 2018, vol. 2018-Decem, no. NeurIPS, pp. 8778-8788, doi: 10.5555/3327546.3327555.
[26] M. Ahmad and S. W. Lee, “Human action recognition using shape and CLG-motion flow from multi-view image sequences,” Pattern Recognition, vol. 41, no. 7, pp. 2237-2252, 2008, doi: 10.1016/j.patcog.2007.12.008.
[27] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, “Convolutional learning of spatio-temporal features,” European conference on computer vision, Springer, Berlin, Heidelberg, 2010, vol. 6316 LNCS, no. PART 6, pp. 140-153, doi: 10.1007/978-3-642-15567-3_11.
[28] H. Qian, Y. Mao, W. Xiang and Z. Wang, “Recognition of human activities using SVM multi-class classifier,” Pattern Recognition Letters, vol. 31, no. 2, pp. 100-111, 2010, doi: 10.1016/j.patrec.2009.09.019.
[29] I. Vasilev, D. Slater, G. Spacagna, P. Roelants and V. Zocca, "Python Deep Learning," 2nd Edition. Birmingham: Packt Publishing, 2019.
  • 10. BIOGRAPHIES OF AUTHORS

Malik Alsaedi is an Assistant Professor of electrical engineering. He received his B.Sc. degree from the University of Technology, Baghdad, the M.Tech. degree from JNTU University, India, and the Ph.D. degree from UTM University, Malaysia. He is currently deputy dean of the College of Engineering, Al-Iraqia University, Iraq. He is interested in optical communication and IoT technology.

Abdulrahman S. Mohialdeen has a Bachelor's degree in Electrical Engineering from the University of Baghdad and a Master's degree in Computer Engineering from Al-Iraqia University. His research interests are deep learning, human activity recognition, and computer vision.

Baraa Munqith Albaker received both the B.Sc. degree in electrical engineering and the M.Sc. degree in computer and control engineering from the University of Baghdad, Iraq, and the Ph.D. degree in control engineering from the University of Malaya, Malaysia. He worked in industry on data acquisition systems and radar signal processing and analysis for over three years. He was a lecturer at the University of Baghdad for four years and then a senior lecturer at the UMPEDAC research centre, University of Malaya, for two years. He currently works as head of the Networks Engineering department at Al-Iraqia University. His research interests focus on contemporary developments in computer and control applications.