SlideShare a Scribd company logo
1 of 11
Download to read offline
Bulletin of Electrical Engineering and Informatics
Vol. 10, No. 6, December 2021, pp. 3191~3201
ISSN: 2302-9285, DOI: 10.11591/eei.v10i6.3204 ๏ฒ3191
Journal homepage: http://beei.org
A multi-task learning based hybrid prediction algorithm for
privacy preserving human activity recognition framework
Vijaya Kumar Kambala, Harikiran Jonnadula
School of CSE, VIT-AP University, Amaravathi, Andhra Pradesh, India
Article Info ABSTRACT
Article history:
Received Aug 7, 2021
Revised Oct 3, 2021
Accepted Oct 31, 2021
There is ever increasing need to use computer vision devices to capture videos
as part of many real-world applications. However, invading privacy of people
is the cause of concern. There is need for protecting privacy of people while
videos are used purposefully based on objective functions. One such use case
is human activity recognition without disclosing human identity. In this paper,
we proposed a multi-task learning based hybrid prediction algorithm (MTL-
HPA) towards realising privacy preserving human activity recognition
framework (PPHARF). It serves the purpose by recognizing human activities
from videos while preserving identity of humans present in the multimedia
object. Face of any person in the video is anonymized to preserve privacy
while the actions of the person are exposed to get them extracted. Without
losing utility of human activity recognition, anonymization is achieved.
Humans and face detection methods file to reveal identity of the persons in
video. Action detection is done using bidirectional long short tem memory
network. We experimentally confirm with joint-annotated human motion data
base (JHMDB) and daily action localization in YouTube (DALY) datasets
that the framework recognises human activities and ensures non-disclosure of
privacy information. Our approach is better than many traditional
anonymization techniques such as noise adding, blurring, and masking.
Keywords:
Anonymization
Human activity recognition
Hybrid prediction algorithm
Multi-task learning
Privacy
This is an open access article under the CC BY-SA license.
Corresponding Author:
Harikiran Jonnadula
School of CSE, Vellore Institute of Technology
VIT-AP University, G-30, Inavolu, Beside AP Secretariat Amaravati
Andhra Pradesh 522237, India
Email: harikiran.j@vitap.ac.in
1. INTRODUCTION
In the contemporary era, there is an increased necessity for computer vision technology that supports
large-scale and automated study of visual data. It has become very crucial in many applications that are
useful to society. Usage of cameras associated with such computer vision applications became ubiquitous. In
cities across the globe there are millions of cameras that capture visual data on 24/7 basis for leveraging
different departments like policing, investigation agencies, healthcare industry, and so on. There is also need
for monitoring elderly people to recognize actions like fall. Human action recognition from live videos is a
popular research activity as studied in [1]-[4]. In the modern living, it is essential for different kinds of real
world applications. Based on human actions, it is possible to determine the kind of activity and take some
steps associated with the underlying application.
There are many advantages of using cameras in public places and select territories for monitoring.
However, with respect to human action recognition, face is an important biometric used to know the identity
of the person. Therefore, invasion of privacy in computer vision applications has become an increasing cause
of concern, particularly video recording that is not done with prior permission. In the modern society, we
๏ฒ ISSN:2302-9285
Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201
3192
need deployment cameras to capture video and recognise events but it is essential to eliminate privacy
invasion. Therefore, the need of the hour is to have mechanisms to recognise human actions from running
videos but ensure that the identity of humans is not lost. Towards this end, several researchers proposed
methods to ensure privacy-aware human activity recognition. For instance, [5]-[7] explored different methods
for privacy preserving action recognition with zero-shot approach, automatic fall detection, position based
super pixel transformation and hybrid reasoning respectively. Wu et al. in [8] there is GAN based approach
with deep learning for privacy preserving action recognition. Similarly, in [9]-[12] GAN based approaches
are discussed for computer vision applications. In different face recognition approaches are explored as
the human activity recognition needs to identify face or anonymize face that prevents disclosure of identity
[13]-[16].
From the literature, it is understood that most of the existing methods with adversarial learning do
not consider a holistic approach with multi-task learning. This gap is filled in this paper by proposing a
framework with underlying algorithm to leverage state of the art. Our approach is based on adversarial
learning that involves multi-task learning such as face anonymization, face detection and face recognition
where there is two-player game is characterized between generator G and discriminator D. The former tries
to preserve privacy while the latter tries to break it and with continuous adversarial learning setting, they
strive to perform well thus leading to better performance in privacy preserving human activity recognition.
Our contributions in this paper are being as.
โˆ’ We proposed a framework known as privacy preserving human activity recognition framework
(PPHARF) that follows a holistic approach consisting of discriminator and generator in adversarial setting
besides face anonymizer.
โˆ’ An algorithm named multi-task learning based hybrid prediction algorithm (MTL-HPA) is proposed
where multiple tasks are learned to ensure that the algorithm enhances privacy and ensures reliable action
recognition.
โˆ’ A prototype application is built using Python data science platform. It is used to explore the difference
between the proposed and existing methods. The empirical results revealed that the PPHARF outperforms
the existing approaches. The remainder of the paper is structured is being as. Section 2 reviews literature
pertaining to human action recognition from videos with privacy preserved.
2. RELATED WORK
This section reviews literature on various aspects of the research associated with human action
recognition while preserving privacy.
2.1. Generative adversarial networks
Generative adversarial network (GAN) models are found to be useful in improving performance in
computer vision applications. As explored by Wu et al. [8], generative adversary learning has benefits to
have a two-player game approach towards improving higher level of accuracy in human action recognition.
GAN based research is found in different researchers such as [9]-[12]. Peng and Schmid [9] employed it for
action detection where two-stream R-CNN is the deep learning technique employed. Ren and Lee [10]
explored multi-task feature learning based on GAN model using synthetic imagery. Ryoo et al. [11] studied
on privacy preserving action recognition that is based on GAN model and they considered extreme low-
resolution images. Liu et al. [12] used GAN approach for face recognition using deep hypersphere
embedding approach.
2.2. Privacy aware action recognition
Privacy refers to non-disclosure of identity of humans in this research. Kumar et al. [1] proposed
deep learning models like convolutional neural networks (CNN) for privacy preserving human activity
recognition. Wu et al. [8] proposed two different strategies for privacy preserving visual recognition. The
strategies are known as budget model restarting and ensemble. They also employed adversarial training for
better performance. Dai et al. [3] investigated on the trade-off between performance of action recognition
while preserving privacy and number of cameras and their resolution. They found that the spatial resolution,
the number of cameras used, and temporal resolution have their impact on performance from highest to
lowest respectively. Similar kind of work is carried out in [17]. Lyu et al. [4] proposed a collaborative deep
learning method that preserves privacy. They employed multiplicative perturbation for making their scheme
privacy aware. Ryoo et al. [17] used extreme low-resolution samples for action recognition. They introduced
a paradigm known as inverse super resolution (ISR) to learn image transformations optimally by creating
suitable low-resolution training images.
Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ
A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala)
3193
Machot et al. [5] proposed a framework for human action recognition using non-visual sensors.
Their framework leverages sensor data to discover unseen activities using a technique known as zero-shot
learning. Zhang et al. [2] focused on human action recognition with respect to falling and privacy preserving.
Their method involves feature extraction and representation and uses RGBD cameras. Rajput et al. [6] also
employed RGBD cameras and deep CNN in order to achieve privacy preserving human action recognition.
They reused cloud based learned models for better performance. Cippitelli et al. [18] used skeleton data
collected from RGBD sensors for action recognition. Yet another study by Cippitelli et al. [19] is based on
RGBD for human action recognition.
Ribono and Bettini [7] proposed an ontology-based solution based on hybrid reasoning and context-
aware approaches. Zolfaghari and Keyvanpour [20] proposed smart activity recognition framework (SARF)
which has different components such as sensing, pre-processing, feature extraction, feature selection, and
recognition of ambient assisted living (ASL). Ciliberto et al. [21] explored a privacy preserving 3D model for
human activity recognition. Gheid et al. [22] proposed a variant of KNN for privacy preserving action
recognition as part of their multi-party classification protocol. Gheid and Challl [23] does similar kind of
work that extends the work of [22] with optimization in security. Yonetani et al. [24] investigated on doubly
permuted homomorphic encryption (DPHE) approach for privacy preserving visual learning. It could
improve both privacy and accuracy over its predecessors.
2.3. Face recognition
Recognising face is essential in many computer vision applications. This is well explored problem
found in [25]. The performance of face recognition is further enhanced with recent advanced artificial
methods and available large datasets as studied in [26]-[28]. A multi-class classification problem is taken
with vanilla softmax in two loss functions such as softmax loss and center loss are combined in order to
jointly optimize the inter-class distance and intra-class feature distance. Wen et al. in [28] uses 200 million
images for training and employed triplet loss function for improving performance in face recognition. The
recent work in [26], [27] showed promising performance due to the usage of classification combined with
metric learning. From the literature, it is understood that most of the existing methods with adversarial
learning do not consider a holistic approach with multi-task learning. This gap is filled in this paper by
proposing a framework with underlying algorithm to leverage state of the art.
3. PRELIMINARIES
GAN is used in many computer vision applications such image-to-image translation. This section
provides an overview of GAN to understand the proposed approach which is based on adversarial learning.
GAN has two important components such as generator (G) and discriminator (D). The two components are
implemented using deep learning techniques. As presented in Figure 1, the G is used to generate new data
denoted as (Gz) and the underlying data distribution is denoted as pg(z). The aim of GAN is to see that
training sample denoted as pr(x) and pg(z) are same. On the other hand, the D takes (Gz) and real data x as
input. G gets trained while D has fixed parameters. The discriminator D generates output which is denoted as
D(G(z)) and between this and sample label, error rate is computed.
Figure 1. A typical GAN framework
๏ฒ ISSN:2302-9285
Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201
3194
Backpropagation algorithm is employed to update parameters of G. The D is aimed at finding whether
input is really from given sample and provides required feedback that helps in updating Gโ€™s parameters. Based
on real input data is x or not, the output of D approaches to either 1 and 0 respectively. The overall process
resembles min-max game played by two plyers and loss function of D is computed as in (1).
V(D,๐œƒ(๐ท)
)= -๐ธ๐‘ฅ~๐‘๐‘Ÿ(๐‘ฅ)[logD(x)]-๐ธ๐‘ง~๐‘๐‘”(๐‘ง)[log(1-D(g(z)))] (1)
Similarly, the G has its loss function as in (2).
V(G,๐œƒ(๐บ)
)=๐ธ๐‘ง~๐‘๐‘”(๐‘ง)[log(1-D(g(z)))] (2)
In the game, both players have their loss functions. In the process D maximizes
V(D)
,(๐œƒ(๐ท)
, ๐œƒ(๐บ)
)while G maximizes V(G)
,(๐œƒ(๐ท)
, ๐œƒ(๐บ)
)by updating ๐œƒ(๐ท)
and๐œƒ(๐ท)
respectively. The loss functions
of the players have parameter dependency. Nash equilibrium is to be achieved in order for a player to update
parameters of other player and stop training.
Min max V(D,G)= ๐ธ๐‘ฅ~๐‘๐‘Ÿ(๐‘ฅ)[logD(x)]+๐ธ๐‘ง~๐‘๐‘”(๐‘ง)[log(1-D(g(z)))] (3)
As the GAN is denoted as min-max optimization problem, the loss function is as expressed as in (3).
D makes an objective function as large as possible using read data as input. The goal of D is to ensure that
output denoted as D(G(z)) is close to 1. With training the min-max game converges to Nash equilibrium.
GAN models are found in many computer vision applications such as [9]-[12].
In this paper, the D tries to establish human identity while the G tries to make human face images as
hard as possible to identify while ensuring human action recognition. In the proposed approach there is third
component that anonymizes face region so as to see that the privacy is not lost. Reliable human action
recognition while defeating disclosure of identity is the main aim of this paper. As presented in Table 1, the
notations are provided to understand the symbols used in the proposed algorithm and its underlying
equations.
Table 1. Notations used in the algorithm
Notation Description
M Face anonymizer or modifier
f or rv input face or input face region
A Action detector
D The face classifier or discriminator
V Set of frames in a video
F Set of face images
๐‘Ÿ๐œ A face region
M (๐‘Ÿ๐œ) Modifying the face region given
๐œโ€ฒ
Updated frame
๐ฟ๐ด sum of the four losses in Faster-RCNN
M(f) The act of anonymization of given face
๐ฟ๐ท Face detection loss (by discriminator)
F Original image
๐œ† A weight
4. OUR APPROACH
Human action recognition from pre-recorded or live videos is an important requirement in many
applications such as surveillance. However, those applications generally recognize not only human action but
also identity of human. This is where the privacy of the person is lost. In this paper we aim at preventing
disclosure of identity while allowing reliable action recognition.
4.1. The framework
Inspired by the generative adversarial network (GAN) explained in section 3, we considered
adversarial learning phenomenon where two components compete. They include face anonymizer acting as
generator (G) and face classifier (D). The former strives to ensure that human face is anonymized so as to
preserve privacy while the latter tries to extract sensitive information in order to establish identity. Both G
and D competes with each other with quite opposite responsibilities. This kind of adversarial learning is
widely using in computer vision applications such as image to image translation [9]-[12]. Without
Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ
A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala)
3195
compromising on action detection performance, the face anonymizer strives to remove privacy sensitive
information from face region of humans in given video input. Figure 2 shows the proposed framework named
privacy preserving human activity recognition framework (PPHARF).
Figure 2. Overview of the proposed framework based on adversarial learning
The framework uses multi-task learning approach where the learning process is associated with
different components such as face anonymizer, action detector and face classifier. The framework takes input
video and finds the regions where face appears. Such face region is detected by the face detection module
and that region is given to face anonymizer. Face anonymizer M takes a face (f) or face region (rv) extracted
by face detection module and modifies it by removing sensitive information resulting M(f) or M(rv). The face
region is subtracted from original video frame (v) by face region subtraction module. The combiner module
takes modified face region (rv) and combined with original frame without face region to result in a combined
frame vโ€™. The action detector module (A) tries to detect human action using vโ€™ which is devoid of sensitive
information. Face classifier (the discriminator D) tries to establish identity of human (and expected to fail to
do so). Its detection loss is expressed as in (4).
๐ฟ๐‘‘๐‘’๐‘ก(๐‘€, ๐ด, ๐‘‰) = ๐ธ๐‘ฃ~๐‘‰[๐ฟ๐ด(๐‘ฃโ€ฒ
, {๐‘๐‘–(๐‘ฃ)}, {๐‘ก๐‘–(๐‘ฃ)})] (4)
The action detector module is implemented as in [9] and [11] while the loss function is taken from
[10]. The face classifier is nothing but discriminator D in adversarial learning whose aim is to identify a face.
It is implemented based on the face classifier used in [12] and the adversarial loss function is modelled after
the two-player game explored in [29]. The adversarial loss function is expressed in (5).
๐ฟ๐‘Ž๐‘‘๐‘ฃ(๐‘€, ๐ท, ๐น) = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“), ๐‘–๐‘“)] = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“, ๐‘–๐‘“)] (5)
In order to preserve the structure such as brightness and pose of modified image, we used L1 loss
which is also known as photorealistic loss. In computer vision applications this loss function is used in [30],
[31]. It could force similarity between input image and modified image (some visual similarity). This loss
function is expressed as in (6).
๐ฟ๐ผ1 = (๐‘€, ๐น)๐ธ๐‘“~๐น[๐œ†||๐‘€(๐‘“) โˆ’ ๐‘“||1] (6)
The components in the proposed framework such as A and D are iteratively trained to perform their
functionality effectively. Face detector module is the one used in [32] and another face detector used in [33]
is used in order to eliminate false positives. The face anonymizer is modelled after [34] with 9 residual
blocks. With respect to training Adam solver [35] is used with different parameters such as ฮฒ1=0.5 and
ฮฒ2=0.999 while learning rate is set at 0.0003 for face classifier and 0.001 for face anonymizer. Total number
of epochs used is 12 and the learning rate is dropped to 1/10 after 7th
epoch.
๏ฒ ISSN:2302-9285
Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201
3196
4.2. Action recognition
The action recognition for face anonymized image is done using bi-directional long short term
memory (BLSTM) network. long short term memory (LSTM) is an extension of recurrent neural networks.
Due to its special architecture, which combats the vanishing and exploding gradient problems, it is good at
handling human activity recognition up to a certain depth. The transfer process of the network can be
described as follows: the input tensor is transformed, along with the tensor of the hidden layer (at the last
stage), to the hidden layer by a matrix transformation. Then, the output of the hidden layer passes through an
activation function to the final value of the output layer.
Recurrent neural network (RNN) is a type of artificial neural network, which is used for working
with sequential data. Normal feed forward neural network has only independent data points. But for
sequential data, the current data point is dependent upon the previous data point. Inorder to maintain
dependencies between data points in the neural network, we use RNN which has the concept of memory
which stores the information of previous data points and using this it will generate next data points in
sequence. Different activation functions such as sigmoid, Tanh and Relu functions in RNN. The main
advantages of RNN are ability to handle sequential data, input data having varying lengths and can store past
information in memory. The main advantage of RNN is the vanishing gradient, it can remember data points
information only for a short period tof time. In-order to overcome this problem, we are going to LSTM
networks. When backpropagation with a deep network is used, gradients will vanish rapidly if preventative
measures that permit gradients to flow deeply are not taken. Compared with the simple input concatenation
and activation used in RNNs, LSTM has a particular structure for remembering information for a longer time
as an input gate and a forget gate control how to overwrite the information by comparing the inner memory
with the new information arriving. It enables gradients to flow through time easily. Figure 3 shows basic
structure of LSTM, where vector x denotes input, vector y denotes output and vector h denotes hidden layers.
Figure 3. Basic structure of LSTM
LSTM is a special type of RNN with memory blocks connected through layers. Each memory block
is smaller than a neuron with gates having capability to manage state and output. This gate uses the sigmoid
activation units to add information in memory bloch and to make change of state while operating on an input
Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ
A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala)
3197
sequence. Each LSTM has three types of gates, forget gate (to remove information from the block), inouut
gate (to update the memory based on input values) and output gate (values generated from the memory block
based on input). These gates have weights that are leaned during the training process. In real life, human
trajectories are continuous. Baseline LSTM cells predict the status based only on former information. Some
important information may not be captured properly by the cell if it runs in only one direction. The
improvement in bidirectional LSTM is that the current output is not only related to previous information but
also related to subsequent information. For example, it is appropriate to predict a missing word based on
context. Bidirectional LSTM is made up of two LSTM cells, and the output is determined by the two
together. BLSTM is a kind of neural network having advantage of flowing information from in both
directions means that future to past and past to future. Input flows in two directions making BLSTM different
from regular LSTM. In regular LSTM, we can move information only in one direction either forward or
backward. But in BLSTM, sequence information can move in bothe directions preserving future and past
information in memory units. It consists of two LSTMs one taking input in a forward direction and one
LSTM in a backward direction, effectively increasing the information available in the network. For providing
combination of forward and backward outputs, different functions such as summation, multiplication,
concatenation, and average are used. This BLSTM network is used in our work for action recognition of
anonymized image. Figure 4 shows the structure of standard BLSTM.
Figure 4: Standard BLSTM structure
4.3. Algorithm design
An algorithm named multi-task learning based hybrid prediction algorithm is proposed to realize the
functionality of our approach. As presented in Algorithm 1, it takes set of video frames V, a discriminator D,
set of face images F and set of identity labels are used as input. There is an iterative process that works for
each video frame. From the video frame, the face region is identified and that is anonymized. On the other
hand, the subtraction module removes face region from the video frame and the result is assigned to s. After
anonymizing face image, it is combined with the subtracted frame in order to have better anonymized face
image with visual appearance. Such anonymized face image is taken by action detector that identifies human
action associated with face image. At the same time, the discriminator D (face classifier) tries to establish
identity of the face as part of its adversarial setting. Every time, the fame modifier (M) and the action
detector (A) are updated with improved knowledge. This process continues for all the frames in the given
video. If there are multiple images in a single frame, there will be a sub process to repeat steps for each fame
image in the video frame v.
๏ฒ ISSN:2302-9285
Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201
3198
Algorithm: Multi-Task Learning based Hybrid Prediction Algorithm
Input: D, V, F, identity labels
Output: M, A
1. For each v in V
2. f๏ƒŸDetectFace(v)
3. s ๏ƒŸ Subtract(f)
4. f โ€˜ ๏ƒŸ M(f)
5. Compute L1 loss
๐ฟ๐ผ1 = (๐‘€, ๐น)๐ธ๐‘“~๐น[๐œ†||๐‘€(๐‘“) โˆ’ ๐‘“||1]
6. vโ€™๏ƒŸCombiner (fโ€™, s)
7. a ๏ƒŸ A(vโ€™)
8. Compute detection loss
๐ฟ๐‘‘๐‘’๐‘ก(๐‘€, ๐ด, ๐‘‰) = ๐ธ๐‘ฃ~๐‘‰[๐ฟ๐ด(๐‘ฃโ€ฒ
, {๐‘๐‘–(๐‘ฃ)}, {๐‘ก๐‘–(๐‘ฃ)})]
9. det=D (f โ€˜)
10. Compute adversarial loss
๐ฟ๐‘Ž๐‘‘๐‘ฃ(๐‘€, ๐ท, ๐น) = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“), ๐‘–๐‘“)] = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“, ๐‘–๐‘“)]
11. Update M
12. Update A
5. RESULTS AND DISCUSSION
5.1. Datasets
We used two datasets namely DALY and JHMDB in this dataset. Daily action localization in
YouTube (DALY) is the dataset introduced in [36] and the dataset is obtained from [37]. The dataset has
more than 30 hours of YouTube videos that are annotated in spatial and temporal domains. It consists of 10
human actions that are witnessed every day with 3600 total number of instances. Action classes are
considered with clearly set temporal boundaries. This is essential to overcome ambiguities associated with
noise. Some of the action classes include brushing teeth, phoning, taking photos, drinking, applying makeup
on lips and playing harmonica as presented in Figure 5.
Figure 5. Representative pictures with human action annotated for10 classes of DALY dataset
JHMDB is another dataset introduced in [25] and it is collected from [38]. It is the dataset which is
benchmark for human detection, human actions and pose estimation. JHMDB is derived from the HMDB51
dataset [39] where 5,100 videos consisting of 51 human actions. JHMDB is a subset of HMDB51 with 21
categories that involve a single human with certain actions as presented in Figure 6. Some of the actions
include hug, kick, jump, run, and shoot.
5.2. Results
The experiments are made with DALY and JHMDB datasets. The experimental results are observed
in terms of face verification error and mean average precision. Our anonymization approach is compared
with many states of the art or baseline approaches. They include Blur x 3, masked, noise x 3 and edge.
The qualitative results shown provide the visual difference before and after the face modification.
This is made due to the fact that our algorithm anonymizes prior to performing action recognition. The user
study conducted with famous personalities convinced that our anonymization approach has acceptable
performance.
Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ
A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala)
3199
Figure 6. Some of the images from JHMDB dataset with annotated human actions
As presented in Table 2, the results revealed that for many actions, the proposed approach showed
better performance. The aim of the research is to detect human actions reliably while preventing disclosure of
human identity. This has been fulfilled as observations are made in the experimental study. The results also
reveal that the performance of the proposed method is better over baselines in terms of anonymization as
well. It is evidenced in both empirical study and user study.
Table 2. Shows accuracy in action detection using DALY dataset
MTL-
HPA
Edge Masked
Noise(ฯƒ
2=0.6)
Noise (ฯƒ
2=0.4)
Noise (ฯƒ
2=0.2)
Blur(24x24
)
Blur(16x
16)
Un-
anonymized
Action
90.2892 80.3803 67.1271 83.7136 87.7276 87.7977 92.1721 82.1521 92.3923 Lip
35.1131 29.4895 15.2052 21.4715 25.005 31.4014 39.8798 32.4324 51.3113 Brush
79.1971 78.098 78.9389 81.4013 78.6686 78.4884 80.0099 79.6095 76.8067 Floor
34.5926 30.5405 26.6066 29.6196 32.7127 31.9019 31.8018 32.1721 27.8979 Window
24.9529 10.3203 6.59659 7.43743 8.34834 12.4224 15.2452 10.7507 31.2612 Drink
34.8939 32.6726 25.5555 29.1091 35.3653 34.8348 34.995 31.3814 32.7027 Fold
79.1471 79.2292 73.023 78.048 75.035 76.5765 74.7247 74.9849 75.3753 Iron
48.5665 35.1852 27.8178 33.9639 40.1601 42.4124 46.3763 37.007 51.5515 Phone
57.3753 54.7447 21.3413 27.3774 36.6466 50.1901 51.8318 48.4184 73.9839
Harmoni
ca
57.5955 49.6696 46.8068 46.3964 45.4654 53.5335 53.1431 52.5625 55.7957 Photo
54.37232 48.03299 38.90186 43.85381 46.51347 49.95591 52.01797 48.1471 56.90785 mAP
6. CONCLUSION
We proposed a framework named privacy preserving human activity recognition framework
(PPHARF) with an underlying algorithm known as multi-task learning based hybrid prediction algorithm
(MTL-HPA). The framework is aimed at supporting adversarial learning based multi-task formulation that
accomplishes human action recognition, anonymization of human face and face detection (to evaluate
anonymization). Face anonymizer is part of the framework that is designed to ensure preserving of privacy.
In other words, non-disclosure of human identity and at the same time recognition of action are important
considerations. The face anonymizer is designed to confuse humans and applications in recognition of human
based on face while supporting action recognition accurately. The proposed framework is evaluated with a
prototype using JHMDB and DALY datasets. The proposed approach outperformed traditional face
modification approaches and it showed significantly better performance over the state of the art. In future, we
intend to explore advanced GAN approaches such as SingleGAN with possible improvements.
REFERENCES
[1] K. V. Kumar, J. Harikiran, M.A.R. Prasad, and U. Sirisha, โ€˜โ€˜Privacy-Preserving Human Activity Recognition and
Resolution Image using Deep Learning Algorithms Spatial relationship and increasing the attribute value in
OpenCV,โ€™โ€™ International Journal of Advanced Science and Technology, vol. 29, no. 7, pp. 514-523, 2020.
๏ฒ ISSN:2302-9285
Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201
3200
[2] C. Zhang, Y. Tian, and E. Capezuti, โ€˜โ€˜Privacy Preserving Automatic Fall Detection for Elderly Using RGBD
Cameras,โ€™โ€™International Conference on Computers for Handicapped Persons, 2012, pp. 625-633, doi:
10.1007/978-3-642-31522-0_95.
[3] J. Dai, J. Wu, B. Saghafi, J. Konrad, and P. Ishwar, โ€˜โ€˜Towards Privacy-Preserving Activity Recognition Using
Extremely Low Temporal and Spatial Resolution Cameras,โ€™โ€™ IEEE Conference on Computer Vision and Pattern
Recognition Workshops, pp. 68-76, 2015.
[4] L. Lyu, X. He, Y. W. Law, and M. Palaniswami, โ€˜โ€˜Privacy-Preserving Collaborative Deep Learning with
Application to Human Activity Recognition,โ€™โ€™ ACM, pp. 1219-1228, Nov. 2017, doi: 10.1145/3132847.3132990.
[5] F. Al Machot, M. R. Elkobaisi, and K. Kyamakya, โ€˜โ€˜Zero-Shot Human Activity Recognition Using Non-Visual
Sensors,โ€™โ€™ Sensors, vol. 20, no. 3, p. 825, doi: 10.3390/s20030825.
[6] A. S. Rajput, B. Raman, J. Imran, โ€˜โ€˜Privacy-preserving human action recognition as a remote cloud service using
RGB-D sensors and deep CNN,โ€™โ€™ Expert Systems with Applications, vol. 152, p. 113349, August 2020, doi:
10.1016/j.eswa.2020.113349.
[7] D. Riboni and C. Bettini, โ€˜โ€˜COSAR: hybrid reasoning for context-aware activity recognition,โ€™โ€™ Personal and
Ubiquitous Computing, vol. 15, no. 3, pp. 271-289, 2011, doi: 10.1007/s00779-010-0331-7.
[8] Z. Wu, Z. Wang, Z. Wang, and H. Jin, โ€˜โ€˜Towards Privacy-Preserving Visual Recognition via Adversarial Training:
A Pilot Study,โ€™โ€™ Springer, pp. 1-19, 2018.
[9] X. Peng and C. Schmid, โ€˜โ€˜Multi-region two-stream r-cnn for action detection,โ€™โ€™in: European conference on
computer vision (ECCV), October 2016, pp. 744-759, doi: 10.1007/978-3-319-46493-0_45.
[10] Z. Ren and Y. J. Lee, โ€˜โ€˜Cross-domain self-supervised multi-task feature learning using synthetic imagery,โ€™โ€™in:
CVPR, 2018, pp. 762-771.
[11] M. S. Ryoo, B. Rothrock, C. Fleming, โ€˜โ€˜Privacy-preserving egocentric activity recognition from extreme low
resolution,โ€™โ€™Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), 2017, pp. 4255-
4262.
[12] W. Liu, Y. Wen, Z. Yu, M. Li, B. Raj, and L. Song, โ€˜โ€˜Sphereface: Deep hypersphere embedding for face
recognition,โ€ Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp.
212-220.
[13] Y. Sun, X. Wang, and X. Tang, โ€˜โ€˜Deep learning face representation from predicting 10,000 classes,โ€™โ€™Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 1891-1898
[14] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, โ€˜โ€˜Deepface: Closing the gap to human-level performance in face
verification,โ€™โ€™Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp.
1701-1708.
[15] Y. Mao, Shanhe Yi et.al., โ€˜โ€˜A Privacy Preserving Deep Learning Approach for Face Recognition with Edge
Computingโ€™โ€™Conference proceedings, usenix hotedge18-papers, pp.1-6, 2016.
[16] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, โ€˜โ€˜Face recognition: A literature survey,โ€™โ€™in: ACM
computing surveys (CSUR), vol. 35, no. 4, pp. 399-458, December 2003, doi: 10.1145/954339.954342.
[17] M. S. Ryoo, B. Rothrock, C. Fleming, and H. J. Yang, โ€˜โ€˜Privacy-Preserving Human Activity Recognition from
Extreme Low Resolution,โ€ Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17),
2017, pp. 4255-4262.
[18] E. Cippitelli, S. Gasparrini, E. Gambi, and S. Spinsante, โ€˜โ€˜A Human Activity Recognition System Using Skeleton
Data from RGBD Sensors,โ€™โ€™Computational Intelligence and Neuroscience, vol. 2016, pp. 1-14, 2016, doi:
10.1155/2016/4351435.
[19] E. Cippitelli, E. Gambi, and S. Spinsante, โ€˜โ€˜Human Action Recognition with RGB-D Sensors,โ€™โ€™ Motion Tracking
and Gesture Recognition, pp. 100-115, 2017, doi: 10.5772/68121.
[20] S. Zolfaghari and M. R. Keyvanpour, "SARF: Smart activity recognition framework in Ambient Assisted Living,"
2016 Federated Conference on Computer Science and Information Systems (FedCSIS), 2016, pp. 1435-1443.
[21] M. Ciliberto, D. Roggen, and F. J. O. Morales, โ€˜โ€˜Exploring human activity annotation using a privacy preserving
3D model,โ€™โ€™Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing
Adjunct-UbiComp โ€™16, September 2016, pp. 803-812, doi: 10.1145/2968219.2968290.
[22] Z. Gheid, Y. Challal, X. Yi, and A. Derhab, โ€˜โ€˜Efficient and privacy-aware multi-party classification protocol for
human activity recognition,โ€ Journal of Network and Computer Applications, vol. 98, pp. 84-96, November 2017,
doi: 10.1016/j.jnca.2017.09.005.
[23] Z. Gheid and Y. Challal, "Novel Efficient and Privacy-Preserving Protocols for Sensor-Based Human Activity
Recognition," 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted
Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and
Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), 2016, pp. 301-308, doi: 10.1109/UIC-
ATC-ScalCom-CBDCom-IoP-SmartWorld.2016.0062.
[24] R. Yonetani, V. N. Boddeti, K. M. Kitani, and Y. Sato, โ€˜โ€˜Privacy-Preserving Visual Learning Using Doubly
Permuted Homomorphic Encryption,โ€ 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp.
2040-2050.
[25] H. Jhuang, J. Gall, S. Zuffi, C. Schmid, and M. J. Black, โ€˜โ€˜Towards understanding action recognition,โ€™โ€™in:
International Conf. on Computer Vision (ICCV), December 2013, pp. 3192-3199.
[26] X. Yu, X. Cai, Z. Ying, T. Li, and G. Li, โ€˜โ€˜SingleGAN: Image-to-Image Translation by a Single-Generator
Network using Multiple Generative Adversarial Learning,โ€ Asian Conference on Computer Vision. Springer,
Cham, December 2018, pp. 341-356, doi: 10.1007/978-3-030-20873-8_22.
Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ
A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala)
3201
[27] X. Peng and C. Schmid, โ€˜โ€˜Multi-region two-stream r-cnn for action detection,โ€™โ€™in: European conference on
computer vision (ECCV), October 2016, pp. 744-759, doi: 10.1007/978-3-319-46493-0_45.
[28] Y. Wen, K. Zhang, Z. Li, and Y. Qiao, โ€˜โ€˜A discriminative feature learning approach for Deep face
recognition,โ€™โ€™European conference on computer vision, 2016.
[29] I. Goodfellow et al., โ€˜โ€˜Generative adversarial nets,โ€™โ€™Advances in neural information processing systems, vol. 27, pp.
1-9, 2014.
[30] P. Isola, J. Y. Zhu, T. Zhou, and A. A. Efros, โ€˜โ€˜Image-to-image translation with conditional adversarial networks,โ€™โ€™
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1125-1134.
[31] J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, โ€˜โ€˜Unpaired image-to-image translation using cycle-consistent
adversarial networks,โ€™โ€™Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp.
2223-2232.
[32] M. Najibi, P. Samangouei, R. Chellappa, and L. Davis, โ€˜โ€˜SSH: Single stage headless face detector,โ€™โ€™Proceedings of
the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 4875-4884.
[33] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, "Joint Face Detection and Alignment Using Multitask Cascaded
Convolutional Networks," in IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, Oct. 2016, doi:
10.1109/LSP.2016.2603342.
[34] J. Johnson, A. Alahi, and L. Fei-Fei, โ€˜โ€˜Perceptual losses for real-time style transfer and super-resolution,โ€™โ€™European
conference on computer vision. Springer, Cham, October 2016, pp. 694-711, doi: 10.1007/978-3-319-46475-6_43.
[35] D. P. Kingma and J. Ba, โ€˜โ€˜Adam: A method for stochastic optimization,โ€™โ€™arXiv preprint arXiv:1412.6980, 2014.
[36] P. Weinzaepfel, X. Martin, and C. Schmid, โ€˜โ€˜Human action localization with sparse spatial supervision,โ€™โ€™arXiv
preprint arXiv:1605.05197, 2016.
[37] Thoth Inria, โ€˜โ€˜Daily Action Localization in Youtube videos,โ€™โ€™[Online]. Available: http://thoth.inrialpes.fr/daly/.
[38] JHMDB Dataset, โ€˜โ€˜A fully annotated data set for human actions and human poses,โ€™โ€™[Online]. Available:
http://jhmdb.is.tue.mpg.de/.
[39] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre, "HMDB: A large video database for human motion
recognition," 2011 International Conference on Computer Vision, 2011, pp. 2556-2563, doi:
10.1109/ICCV.2011.6126543.
BIOGRAPHIES OF AUTHORS
Vijaya Kumar Kambala working as Part Time Research Scholar in School of CSE VIT-AP. he is
working as Assistant Professor, in PVP Siddhartha Institute of Technology, Vijayawada. His research
interest includes Image Processing, Video Analysis, human activity recognition, privacy preserving
human activity recognition.
Harikiran Jonnadula working as Assistant professor, School of CSE, Vellore Institute of
Technology, VIT-AP. His reserach interest include microarray image analysis , Hyperspectral Imaging
and Video Analytics.

More Related Content

What's hot

Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...AM Publications
ย 
An Approach for Copy-Move Attack Detection and Transformation Recovery
An Approach for Copy-Move Attack Detection and Transformation RecoveryAn Approach for Copy-Move Attack Detection and Transformation Recovery
An Approach for Copy-Move Attack Detection and Transformation RecoveryIRJET Journal
ย 
Ijetcas14 465
Ijetcas14 465Ijetcas14 465
Ijetcas14 465Iasir Journals
ย 
TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019sipij
ย 
A real time aggressive human behaviour detection system in cage environment a...
A real time aggressive human behaviour detection system in cage environment a...A real time aggressive human behaviour detection system in cage environment a...
A real time aggressive human behaviour detection system in cage environment a...Journal Papers
ย 
Deep convolutional neural network for hand sign language recognition using mo...
Deep convolutional neural network for hand sign language recognition using mo...Deep convolutional neural network for hand sign language recognition using mo...
Deep convolutional neural network for hand sign language recognition using mo...journalBEEI
ย 
A Framework for Human Action Detection via Extraction of Multimodal Features
A Framework for Human Action Detection via Extraction of Multimodal FeaturesA Framework for Human Action Detection via Extraction of Multimodal Features
A Framework for Human Action Detection via Extraction of Multimodal FeaturesCSCJournals
ย 
The upsurge of deep learning for computer vision applications
The upsurge of deep learning for computer vision applicationsThe upsurge of deep learning for computer vision applications
The upsurge of deep learning for computer vision applicationsIJECEIAES
ย 
Background Subtraction Algorithm Based Human Behavior Detection
Background Subtraction Algorithm Based Human Behavior DetectionBackground Subtraction Algorithm Based Human Behavior Detection
Background Subtraction Algorithm Based Human Behavior DetectionIJERA Editor
ย 
Modelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object RecognitionModelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object RecognitionIJERA Editor
ย 
Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...
Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...
Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...Melanie Swan
ย 
An HCI Principles based Framework to Support Deaf Community
An HCI Principles based Framework to Support Deaf CommunityAn HCI Principles based Framework to Support Deaf Community
An HCI Principles based Framework to Support Deaf CommunityIJEACS
ย 
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...CSCJournals
ย 
Cognitive ability of human brain and soft computing techniques
Cognitive ability of human brain and soft computing techniquesCognitive ability of human brain and soft computing techniques
Cognitive ability of human brain and soft computing techniquesDr G R Sinha
ย 
Protection for images transmission
Protection for images transmissionProtection for images transmission
Protection for images transmissionIAEME Publication
ย 
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...Paul Gilbreath
ย 
The power of deep learning models applications
The power of deep learning models applicationsThe power of deep learning models applications
The power of deep learning models applicationsSameera Sk
ย 
Raspberry Pi Augmentation: A Cost Effective Solution To Google Glass
Raspberry Pi Augmentation: A Cost Effective Solution To Google GlassRaspberry Pi Augmentation: A Cost Effective Solution To Google Glass
Raspberry Pi Augmentation: A Cost Effective Solution To Google GlassIRJET Journal
ย 
IRJET- Application of MCNN in Object Detection
IRJET-  	  Application of MCNN in Object DetectionIRJET-  	  Application of MCNN in Object Detection
IRJET- Application of MCNN in Object DetectionIRJET Journal
ย 
Volume 2-issue-6-1960-1964
Volume 2-issue-6-1960-1964Volume 2-issue-6-1960-1964
Volume 2-issue-6-1960-1964Editor IJARCET
ย 

What's hot (20)

Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
ย 
An Approach for Copy-Move Attack Detection and Transformation Recovery
An Approach for Copy-Move Attack Detection and Transformation RecoveryAn Approach for Copy-Move Attack Detection and Transformation Recovery
An Approach for Copy-Move Attack Detection and Transformation Recovery
ย 
Ijetcas14 465
Ijetcas14 465Ijetcas14 465
Ijetcas14 465
ย 
TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019
ย 
A real time aggressive human behaviour detection system in cage environment a...
A real time aggressive human behaviour detection system in cage environment a...A real time aggressive human behaviour detection system in cage environment a...
A real time aggressive human behaviour detection system in cage environment a...
ย 
Deep convolutional neural network for hand sign language recognition using mo...
Deep convolutional neural network for hand sign language recognition using mo...Deep convolutional neural network for hand sign language recognition using mo...
Deep convolutional neural network for hand sign language recognition using mo...
ย 
A Framework for Human Action Detection via Extraction of Multimodal Features
A Framework for Human Action Detection via Extraction of Multimodal FeaturesA Framework for Human Action Detection via Extraction of Multimodal Features
A Framework for Human Action Detection via Extraction of Multimodal Features
ย 
The upsurge of deep learning for computer vision applications
The upsurge of deep learning for computer vision applicationsThe upsurge of deep learning for computer vision applications
The upsurge of deep learning for computer vision applications
ย 
Background Subtraction Algorithm Based Human Behavior Detection
Background Subtraction Algorithm Based Human Behavior DetectionBackground Subtraction Algorithm Based Human Behavior Detection
Background Subtraction Algorithm Based Human Behavior Detection
ย 
Modelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object RecognitionModelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object Recognition
ย 
Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...
Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...
Deep Learning Explained: The future of Artificial Intelligence and Smart Netw...
ย 
An HCI Principles based Framework to Support Deaf Community
An HCI Principles based Framework to Support Deaf CommunityAn HCI Principles based Framework to Support Deaf Community
An HCI Principles based Framework to Support Deaf Community
ย 
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N...
ย 
Cognitive ability of human brain and soft computing techniques
Cognitive ability of human brain and soft computing techniquesCognitive ability of human brain and soft computing techniques
Cognitive ability of human brain and soft computing techniques
ย 
Protection for images transmission
Protection for images transmissionProtection for images transmission
Protection for images transmission
ย 
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
ย 
The power of deep learning models applications
The power of deep learning models applicationsThe power of deep learning models applications
The power of deep learning models applications
ย 
Raspberry Pi Augmentation: A Cost Effective Solution To Google Glass
Raspberry Pi Augmentation: A Cost Effective Solution To Google GlassRaspberry Pi Augmentation: A Cost Effective Solution To Google Glass
Raspberry Pi Augmentation: A Cost Effective Solution To Google Glass
ย 
IRJET- Application of MCNN in Object Detection
IRJET-  	  Application of MCNN in Object DetectionIRJET-  	  Application of MCNN in Object Detection
IRJET- Application of MCNN in Object Detection
ย 
Volume 2-issue-6-1960-1964
Volume 2-issue-6-1960-1964Volume 2-issue-6-1960-1964
Volume 2-issue-6-1960-1964
ย 

Similar to A multi-task learning based hybrid prediction algorithm for privacy preserving human activity recognition framework

Survey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNNSurvey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNNIRJET Journal
ย 
Human Activity Recognition
Human Activity RecognitionHuman Activity Recognition
Human Activity RecognitionIRJET Journal
ย 
People Monitoring and Mask Detection using Real-time video analyzing
People Monitoring and Mask Detection using Real-time video analyzingPeople Monitoring and Mask Detection using Real-time video analyzing
People Monitoring and Mask Detection using Real-time video analyzingvivatechijri
ย 
Face and liveness detection with criminal identification using machine learni...
Face and liveness detection with criminal identification using machine learni...Face and liveness detection with criminal identification using machine learni...
Face and liveness detection with criminal identification using machine learni...IAESIJAI
ย 
Human activity detection based on edge point movements and spatio temporal fe...
Human activity detection based on edge point movements and spatio temporal fe...Human activity detection based on edge point movements and spatio temporal fe...
Human activity detection based on edge point movements and spatio temporal fe...IAEME Publication
ย 
Real Time Crime Detection using Deep Learning
Real Time Crime Detection using Deep LearningReal Time Crime Detection using Deep Learning
Real Time Crime Detection using Deep LearningIRJET Journal
ย 
Application To Monitor And Manage People In Crowded Places Using Neural Networks
Application To Monitor And Manage People In Crowded Places Using Neural NetworksApplication To Monitor And Manage People In Crowded Places Using Neural Networks
Application To Monitor And Manage People In Crowded Places Using Neural NetworksIJSRED
ย 
Human Activity Recognition Using Neural Network
Human Activity Recognition Using Neural NetworkHuman Activity Recognition Using Neural Network
Human Activity Recognition Using Neural NetworkIRJET Journal
ย 
Novel framework for optimized digital forensic for mitigating complex image ...
Novel framework for optimized digital forensic  for mitigating complex image ...Novel framework for optimized digital forensic  for mitigating complex image ...
Novel framework for optimized digital forensic for mitigating complex image ...IJECEIAES
ย 
IRJET- Prediction of Anomalous Activities in a Video
IRJET-  	  Prediction of Anomalous Activities in a VideoIRJET-  	  Prediction of Anomalous Activities in a Video
IRJET- Prediction of Anomalous Activities in a VideoIRJET Journal
ย 
Automatic video censoring system using deep learning
Automatic video censoring system using deep learningAutomatic video censoring system using deep learning
Automatic video censoring system using deep learningIJECEIAES
ย 
Global-local attention with triplet loss and label smoothed cross-entropy for...
Global-local attention with triplet loss and label smoothed cross-entropy for...Global-local attention with triplet loss and label smoothed cross-entropy for...
Global-local attention with triplet loss and label smoothed cross-entropy for...IAESIJAI
ย 
Deep learning algorithms for intrusion detection systems in internet of thin...
Deep learning algorithms for intrusion detection systems in  internet of thin...Deep learning algorithms for intrusion detection systems in  internet of thin...
Deep learning algorithms for intrusion detection systems in internet of thin...IJECEIAES
ย 
Using Brain Waves as New Biometric Feature for Authenticating a Computer User...
Using Brain Waves as New Biometric Feature for Authenticating a Computer User...Using Brain Waves as New Biometric Feature for Authenticating a Computer User...
Using Brain Waves as New Biometric Feature for Authenticating a Computer User...CSCJournals
ย 
A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...
A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...
A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...aciijournal
ย 
Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)aciijournal
ย 
A Novel Biometric Approach for Authentication In Pervasive Computing Environm...
A Novel Biometric Approach for Authentication In Pervasive Computing Environm...A Novel Biometric Approach for Authentication In Pervasive Computing Environm...
A Novel Biometric Approach for Authentication In Pervasive Computing Environm...aciijournal
ย 
Advance Intelligent Video Surveillance System Using OpenCV
Advance Intelligent Video Surveillance System Using OpenCVAdvance Intelligent Video Surveillance System Using OpenCV
Advance Intelligent Video Surveillance System Using OpenCVIRJET Journal
ย 
IRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning TechniquesIRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning TechniquesIRJET Journal
ย 
Ci31560566
Ci31560566Ci31560566
Ci31560566IJERA Editor
ย 

Similar to A multi-task learning based hybrid prediction algorithm for privacy preserving human activity recognition framework (20)

Survey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNNSurvey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNN
ย 
Human Activity Recognition
Human Activity RecognitionHuman Activity Recognition
Human Activity Recognition
ย 
People Monitoring and Mask Detection using Real-time video analyzing
People Monitoring and Mask Detection using Real-time video analyzingPeople Monitoring and Mask Detection using Real-time video analyzing
People Monitoring and Mask Detection using Real-time video analyzing
ย 
Face and liveness detection with criminal identification using machine learni...
Face and liveness detection with criminal identification using machine learni...Face and liveness detection with criminal identification using machine learni...
Face and liveness detection with criminal identification using machine learni...
ย 
Human activity detection based on edge point movements and spatio temporal fe...
Human activity detection based on edge point movements and spatio temporal fe...Human activity detection based on edge point movements and spatio temporal fe...
Human activity detection based on edge point movements and spatio temporal fe...
ย 
Real Time Crime Detection using Deep Learning
Real Time Crime Detection using Deep LearningReal Time Crime Detection using Deep Learning
Real Time Crime Detection using Deep Learning
ย 
Application To Monitor And Manage People In Crowded Places Using Neural Networks
Application To Monitor And Manage People In Crowded Places Using Neural NetworksApplication To Monitor And Manage People In Crowded Places Using Neural Networks
Application To Monitor And Manage People In Crowded Places Using Neural Networks
ย 
Human Activity Recognition Using Neural Network
Human Activity Recognition Using Neural NetworkHuman Activity Recognition Using Neural Network
Human Activity Recognition Using Neural Network
ย 
Novel framework for optimized digital forensic for mitigating complex image ...
Novel framework for optimized digital forensic  for mitigating complex image ...Novel framework for optimized digital forensic  for mitigating complex image ...
Novel framework for optimized digital forensic for mitigating complex image ...
ย 
IRJET- Prediction of Anomalous Activities in a Video
IRJET-  	  Prediction of Anomalous Activities in a VideoIRJET-  	  Prediction of Anomalous Activities in a Video
IRJET- Prediction of Anomalous Activities in a Video
ย 
Automatic video censoring system using deep learning
Automatic video censoring system using deep learningAutomatic video censoring system using deep learning
Automatic video censoring system using deep learning
ย 
Global-local attention with triplet loss and label smoothed cross-entropy for...
Global-local attention with triplet loss and label smoothed cross-entropy for...Global-local attention with triplet loss and label smoothed cross-entropy for...
Global-local attention with triplet loss and label smoothed cross-entropy for...
ย 
Deep learning algorithms for intrusion detection systems in internet of thin...
Deep learning algorithms for intrusion detection systems in  internet of thin...Deep learning algorithms for intrusion detection systems in  internet of thin...
Deep learning algorithms for intrusion detection systems in internet of thin...
ย 
Using Brain Waves as New Biometric Feature for Authenticating a Computer User...
Using Brain Waves as New Biometric Feature for Authenticating a Computer User...Using Brain Waves as New Biometric Feature for Authenticating a Computer User...
Using Brain Waves as New Biometric Feature for Authenticating a Computer User...
ย 
A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...
A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...
A NOVEL BIOMETRIC APPROACH FOR AUTHENTICATION IN PERVASIVE COMPUTING ENVIRONM...
ย 
Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)
ย 
A Novel Biometric Approach for Authentication In Pervasive Computing Environm...
A Novel Biometric Approach for Authentication In Pervasive Computing Environm...A Novel Biometric Approach for Authentication In Pervasive Computing Environm...
A Novel Biometric Approach for Authentication In Pervasive Computing Environm...
ย 
Advance Intelligent Video Surveillance System Using OpenCV
Advance Intelligent Video Surveillance System Using OpenCVAdvance Intelligent Video Surveillance System Using OpenCV
Advance Intelligent Video Surveillance System Using OpenCV
ย 
IRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning TechniquesIRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning Techniques
ย 
Ci31560566
Ci31560566Ci31560566
Ci31560566
ย 

More from journalBEEI

Square transposition: an approach to the transposition process in block cipher
Square transposition: an approach to the transposition process in block cipherSquare transposition: an approach to the transposition process in block cipher
Square transposition: an approach to the transposition process in block cipherjournalBEEI
ย 
Hyper-parameter optimization of convolutional neural network based on particl...
Hyper-parameter optimization of convolutional neural network based on particl...Hyper-parameter optimization of convolutional neural network based on particl...
Hyper-parameter optimization of convolutional neural network based on particl...journalBEEI
ย 
Supervised machine learning based liver disease prediction approach with LASS...
Supervised machine learning based liver disease prediction approach with LASS...Supervised machine learning based liver disease prediction approach with LASS...
Supervised machine learning based liver disease prediction approach with LASS...journalBEEI
ย 
A secure and energy saving protocol for wireless sensor networks
A secure and energy saving protocol for wireless sensor networksA secure and energy saving protocol for wireless sensor networks
A secure and energy saving protocol for wireless sensor networksjournalBEEI
ย 
Plant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural networkPlant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural networkjournalBEEI
ย 
Customized moodle-based learning management system for socially disadvantaged...
Customized moodle-based learning management system for socially disadvantaged...Customized moodle-based learning management system for socially disadvantaged...
Customized moodle-based learning management system for socially disadvantaged...journalBEEI
ย 
Understanding the role of individual learner in adaptive and personalized e-l...
Understanding the role of individual learner in adaptive and personalized e-l...Understanding the role of individual learner in adaptive and personalized e-l...
Understanding the role of individual learner in adaptive and personalized e-l...journalBEEI
ย 
Prototype mobile contactless transaction system in traditional markets to sup...
Prototype mobile contactless transaction system in traditional markets to sup...Prototype mobile contactless transaction system in traditional markets to sup...
Prototype mobile contactless transaction system in traditional markets to sup...journalBEEI
ย 
Wireless HART stack using multiprocessor technique with laxity algorithm
Wireless HART stack using multiprocessor technique with laxity algorithmWireless HART stack using multiprocessor technique with laxity algorithm
Wireless HART stack using multiprocessor technique with laxity algorithmjournalBEEI
ย 
Implementation of double-layer loaded on octagon microstrip yagi antenna
Implementation of double-layer loaded on octagon microstrip yagi antennaImplementation of double-layer loaded on octagon microstrip yagi antenna
Implementation of double-layer loaded on octagon microstrip yagi antennajournalBEEI
ย 
The calculation of the field of an antenna located near the human head
The calculation of the field of an antenna located near the human headThe calculation of the field of an antenna located near the human head
The calculation of the field of an antenna located near the human headjournalBEEI
ย 
Exact secure outage probability performance of uplinkdownlink multiple access...
Exact secure outage probability performance of uplinkdownlink multiple access...Exact secure outage probability performance of uplinkdownlink multiple access...
Exact secure outage probability performance of uplinkdownlink multiple access...journalBEEI
ย 
Design of a dual-band antenna for energy harvesting application
Design of a dual-band antenna for energy harvesting applicationDesign of a dual-band antenna for energy harvesting application
Design of a dual-band antenna for energy harvesting applicationjournalBEEI
ย 
Transforming data-centric eXtensible markup language into relational database...
Transforming data-centric eXtensible markup language into relational database...Transforming data-centric eXtensible markup language into relational database...
Transforming data-centric eXtensible markup language into relational database...journalBEEI
ย 
Key performance requirement of future next wireless networks (6G)
Key performance requirement of future next wireless networks (6G)Key performance requirement of future next wireless networks (6G)
Key performance requirement of future next wireless networks (6G)journalBEEI
ย 
Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...journalBEEI
ย 
Modeling climate phenomenon with software grids analysis and display system i...
Modeling climate phenomenon with software grids analysis and display system i...Modeling climate phenomenon with software grids analysis and display system i...
Modeling climate phenomenon with software grids analysis and display system i...journalBEEI
ย 
An approach of re-organizing input dataset to enhance the quality of emotion ...
An approach of re-organizing input dataset to enhance the quality of emotion ...An approach of re-organizing input dataset to enhance the quality of emotion ...
An approach of re-organizing input dataset to enhance the quality of emotion ...journalBEEI
ย 
Parking detection system using background subtraction and HSV color segmentation
Parking detection system using background subtraction and HSV color segmentationParking detection system using background subtraction and HSV color segmentation
Parking detection system using background subtraction and HSV color segmentationjournalBEEI
ย 
Quality of service performances of video and voice transmission in universal ...
Quality of service performances of video and voice transmission in universal ...Quality of service performances of video and voice transmission in universal ...
Quality of service performances of video and voice transmission in universal ...journalBEEI
ย 

More from journalBEEI (20)

Square transposition: an approach to the transposition process in block cipher
Square transposition: an approach to the transposition process in block cipherSquare transposition: an approach to the transposition process in block cipher
Square transposition: an approach to the transposition process in block cipher
ย 
Hyper-parameter optimization of convolutional neural network based on particl...
Hyper-parameter optimization of convolutional neural network based on particl...Hyper-parameter optimization of convolutional neural network based on particl...
Hyper-parameter optimization of convolutional neural network based on particl...
ย 
Supervised machine learning based liver disease prediction approach with LASS...
Supervised machine learning based liver disease prediction approach with LASS...Supervised machine learning based liver disease prediction approach with LASS...
Supervised machine learning based liver disease prediction approach with LASS...
ย 
A secure and energy saving protocol for wireless sensor networks
A secure and energy saving protocol for wireless sensor networksA secure and energy saving protocol for wireless sensor networks
A secure and energy saving protocol for wireless sensor networks
ย 
Plant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural networkPlant leaf identification system using convolutional neural network
Plant leaf identification system using convolutional neural network
ย 
Customized moodle-based learning management system for socially disadvantaged...
Customized moodle-based learning management system for socially disadvantaged...Customized moodle-based learning management system for socially disadvantaged...
Customized moodle-based learning management system for socially disadvantaged...
ย 
Understanding the role of individual learner in adaptive and personalized e-l...
Understanding the role of individual learner in adaptive and personalized e-l...Understanding the role of individual learner in adaptive and personalized e-l...
Understanding the role of individual learner in adaptive and personalized e-l...
ย 
Prototype mobile contactless transaction system in traditional markets to sup...
Prototype mobile contactless transaction system in traditional markets to sup...Prototype mobile contactless transaction system in traditional markets to sup...
Prototype mobile contactless transaction system in traditional markets to sup...
ย 
Wireless HART stack using multiprocessor technique with laxity algorithm
Wireless HART stack using multiprocessor technique with laxity algorithmWireless HART stack using multiprocessor technique with laxity algorithm
Wireless HART stack using multiprocessor technique with laxity algorithm
ย 
Implementation of double-layer loaded on octagon microstrip yagi antenna
Implementation of double-layer loaded on octagon microstrip yagi antennaImplementation of double-layer loaded on octagon microstrip yagi antenna
Implementation of double-layer loaded on octagon microstrip yagi antenna
ย 
The calculation of the field of an antenna located near the human head
The calculation of the field of an antenna located near the human headThe calculation of the field of an antenna located near the human head
The calculation of the field of an antenna located near the human head
ย 
Exact secure outage probability performance of uplinkdownlink multiple access...
Exact secure outage probability performance of uplinkdownlink multiple access...Exact secure outage probability performance of uplinkdownlink multiple access...
Exact secure outage probability performance of uplinkdownlink multiple access...
ย 
Design of a dual-band antenna for energy harvesting application
Design of a dual-band antenna for energy harvesting applicationDesign of a dual-band antenna for energy harvesting application
Design of a dual-band antenna for energy harvesting application
ย 
Transforming data-centric eXtensible markup language into relational database...
Transforming data-centric eXtensible markup language into relational database...Transforming data-centric eXtensible markup language into relational database...
Transforming data-centric eXtensible markup language into relational database...
ย 
Key performance requirement of future next wireless networks (6G)
Key performance requirement of future next wireless networks (6G)Key performance requirement of future next wireless networks (6G)
Key performance requirement of future next wireless networks (6G)
ย 
Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...
ย 
Modeling climate phenomenon with software grids analysis and display system i...
Modeling climate phenomenon with software grids analysis and display system i...Modeling climate phenomenon with software grids analysis and display system i...
Modeling climate phenomenon with software grids analysis and display system i...
ย 
An approach of re-organizing input dataset to enhance the quality of emotion ...
An approach of re-organizing input dataset to enhance the quality of emotion ...An approach of re-organizing input dataset to enhance the quality of emotion ...
An approach of re-organizing input dataset to enhance the quality of emotion ...
ย 
Parking detection system using background subtraction and HSV color segmentation
Parking detection system using background subtraction and HSV color segmentationParking detection system using background subtraction and HSV color segmentation
Parking detection system using background subtraction and HSV color segmentation
ย 
Quality of service performances of video and voice transmission in universal ...
Quality of service performances of video and voice transmission in universal ...Quality of service performances of video and voice transmission in universal ...
Quality of service performances of video and voice transmission in universal ...
ย 

Recently uploaded

Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
ย 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
ย 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
ย 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
ย 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
ย 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
ย 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
ย 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
ย 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
ย 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
ย 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
ย 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
ย 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
ย 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
ย 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoรฃo Esperancinha
ย 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
ย 
Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”
Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”
Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”soniya singh
ย 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
ย 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
ย 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
ย 

Recently uploaded (20)

Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
ย 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
ย 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ย 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
ย 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
ย 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
ย 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
ย 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
ย 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
ย 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
ย 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
ย 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
ย 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
ย 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
ย 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
ย 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
ย 
Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”
Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”
Model Call Girl in Narela Delhi reach out to us at ๐Ÿ”8264348440๐Ÿ”
ย 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
ย 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
ย 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
ย 

A multi-task learning based hybrid prediction algorithm for privacy preserving human activity recognition framework

  • 1. Bulletin of Electrical Engineering and Informatics Vol. 10, No. 6, December 2021, pp. 3191~3201 ISSN: 2302-9285, DOI: 10.11591/eei.v10i6.3204 ๏ฒ3191 Journal homepage: http://beei.org A multi-task learning based hybrid prediction algorithm for privacy preserving human activity recognition framework Vijaya Kumar Kambala, Harikiran Jonnadula School of CSE, VIT-AP University, Amaravathi, Andhra Pradesh, India Article Info ABSTRACT Article history: Received Aug 7, 2021 Revised Oct 3, 2021 Accepted Oct 31, 2021 There is ever increasing need to use computer vision devices to capture videos as part of many real-world applications. However, invading privacy of people is the cause of concern. There is need for protecting privacy of people while videos are used purposefully based on objective functions. One such use case is human activity recognition without disclosing human identity. In this paper, we proposed a multi-task learning based hybrid prediction algorithm (MTL- HPA) towards realising privacy preserving human activity recognition framework (PPHARF). It serves the purpose by recognizing human activities from videos while preserving identity of humans present in the multimedia object. Face of any person in the video is anonymized to preserve privacy while the actions of the person are exposed to get them extracted. Without losing utility of human activity recognition, anonymization is achieved. Humans and face detection methods file to reveal identity of the persons in video. Action detection is done using bidirectional long short tem memory network. We experimentally confirm with joint-annotated human motion data base (JHMDB) and daily action localization in YouTube (DALY) datasets that the framework recognises human activities and ensures non-disclosure of privacy information. Our approach is better than many traditional anonymization techniques such as noise adding, blurring, and masking. Keywords: Anonymization Human activity recognition Hybrid prediction algorithm Multi-task learning Privacy This is an open access article under the CC BY-SA license. Corresponding Author: Harikiran Jonnadula School of CSE, Vellore Institute of Technology VIT-AP University, G-30, Inavolu, Beside AP Secretariat Amaravati Andhra Pradesh 522237, India Email: harikiran.j@vitap.ac.in 1. INTRODUCTION In the contemporary era, there is an increased necessity for computer vision technology that supports large-scale and automated study of visual data. It has become very crucial in many applications that are useful to society. Usage of cameras associated with such computer vision applications became ubiquitous. In cities across the globe there are millions of cameras that capture visual data on 24/7 basis for leveraging different departments like policing, investigation agencies, healthcare industry, and so on. There is also need for monitoring elderly people to recognize actions like fall. Human action recognition from live videos is a popular research activity as studied in [1]-[4]. In the modern living, it is essential for different kinds of real world applications. Based on human actions, it is possible to determine the kind of activity and take some steps associated with the underlying application. There are many advantages of using cameras in public places and select territories for monitoring. However, with respect to human action recognition, face is an important biometric used to know the identity of the person. Therefore, invasion of privacy in computer vision applications has become an increasing cause of concern, particularly video recording that is not done with prior permission. In the modern society, we
  • 2. ๏ฒ ISSN:2302-9285 Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201 3192 need deployment cameras to capture video and recognise events but it is essential to eliminate privacy invasion. Therefore, the need of the hour is to have mechanisms to recognise human actions from running videos but ensure that the identity of humans is not lost. Towards this end, several researchers proposed methods to ensure privacy-aware human activity recognition. For instance, [5]-[7] explored different methods for privacy preserving action recognition with zero-shot approach, automatic fall detection, position based super pixel transformation and hybrid reasoning respectively. Wu et al. in [8] there is GAN based approach with deep learning for privacy preserving action recognition. Similarly, in [9]-[12] GAN based approaches are discussed for computer vision applications. In different face recognition approaches are explored as the human activity recognition needs to identify face or anonymize face that prevents disclosure of identity [13]-[16]. From the literature, it is understood that most of the existing methods with adversarial learning do not consider a holistic approach with multi-task learning. This gap is filled in this paper by proposing a framework with underlying algorithm to leverage state of the art. Our approach is based on adversarial learning that involves multi-task learning such as face anonymization, face detection and face recognition where there is two-player game is characterized between generator G and discriminator D. The former tries to preserve privacy while the latter tries to break it and with continuous adversarial learning setting, they strive to perform well thus leading to better performance in privacy preserving human activity recognition. Our contributions in this paper are being as. โˆ’ We proposed a framework known as privacy preserving human activity recognition framework (PPHARF) that follows a holistic approach consisting of discriminator and generator in adversarial setting besides face anonymizer. โˆ’ An algorithm named multi-task learning based hybrid prediction algorithm (MTL-HPA) is proposed where multiple tasks are learned to ensure that the algorithm enhances privacy and ensures reliable action recognition. โˆ’ A prototype application is built using Python data science platform. It is used to explore the difference between the proposed and existing methods. The empirical results revealed that the PPHARF outperforms the existing approaches. The remainder of the paper is structured is being as. Section 2 reviews literature pertaining to human action recognition from videos with privacy preserved. 2. RELATED WORK This section reviews literature on various aspects of the research associated with human action recognition while preserving privacy. 2.1. Generative adversarial networks Generative adversarial network (GAN) models are found to be useful in improving performance in computer vision applications. As explored by Wu et al. [8], generative adversary learning has benefits to have a two-player game approach towards improving higher level of accuracy in human action recognition. GAN based research is found in different researchers such as [9]-[12]. Peng and Schmid [9] employed it for action detection where two-stream R-CNN is the deep learning technique employed. Ren and Lee [10] explored multi-task feature learning based on GAN model using synthetic imagery. Ryoo et al. [11] studied on privacy preserving action recognition that is based on GAN model and they considered extreme low- resolution images. Liu et al. [12] used GAN approach for face recognition using deep hypersphere embedding approach. 2.2. Privacy aware action recognition Privacy refers to non-disclosure of identity of humans in this research. Kumar et al. [1] proposed deep learning models like convolutional neural networks (CNN) for privacy preserving human activity recognition. Wu et al. [8] proposed two different strategies for privacy preserving visual recognition. The strategies are known as budget model restarting and ensemble. They also employed adversarial training for better performance. Dai et al. [3] investigated on the trade-off between performance of action recognition while preserving privacy and number of cameras and their resolution. They found that the spatial resolution, the number of cameras used, and temporal resolution have their impact on performance from highest to lowest respectively. Similar kind of work is carried out in [17]. Lyu et al. [4] proposed a collaborative deep learning method that preserves privacy. They employed multiplicative perturbation for making their scheme privacy aware. Ryoo et al. [17] used extreme low-resolution samples for action recognition. They introduced a paradigm known as inverse super resolution (ISR) to learn image transformations optimally by creating suitable low-resolution training images.
  • 3. Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala) 3193 Machot et al. [5] proposed a framework for human action recognition using non-visual sensors. Their framework leverages sensor data to discover unseen activities using a technique known as zero-shot learning. Zhang et al. [2] focused on human action recognition with respect to falling and privacy preserving. Their method involves feature extraction and representation and uses RGBD cameras. Rajput et al. [6] also employed RGBD cameras and deep CNN in order to achieve privacy preserving human action recognition. They reused cloud based learned models for better performance. Cippitelli et al. [18] used skeleton data collected from RGBD sensors for action recognition. Yet another study by Cippitelli et al. [19] is based on RGBD for human action recognition. Ribono and Bettini [7] proposed an ontology-based solution based on hybrid reasoning and context- aware approaches. Zolfaghari and Keyvanpour [20] proposed smart activity recognition framework (SARF) which has different components such as sensing, pre-processing, feature extraction, feature selection, and recognition of ambient assisted living (ASL). Ciliberto et al. [21] explored a privacy preserving 3D model for human activity recognition. Gheid et al. [22] proposed a variant of KNN for privacy preserving action recognition as part of their multi-party classification protocol. Gheid and Challl [23] does similar kind of work that extends the work of [22] with optimization in security. Yonetani et al. [24] investigated on doubly permuted homomorphic encryption (DPHE) approach for privacy preserving visual learning. It could improve both privacy and accuracy over its predecessors. 2.3. Face recognition Recognising face is essential in many computer vision applications. This is well explored problem found in [25]. The performance of face recognition is further enhanced with recent advanced artificial methods and available large datasets as studied in [26]-[28]. A multi-class classification problem is taken with vanilla softmax in two loss functions such as softmax loss and center loss are combined in order to jointly optimize the inter-class distance and intra-class feature distance. Wen et al. in [28] uses 200 million images for training and employed triplet loss function for improving performance in face recognition. The recent work in [26], [27] showed promising performance due to the usage of classification combined with metric learning. From the literature, it is understood that most of the existing methods with adversarial learning do not consider a holistic approach with multi-task learning. This gap is filled in this paper by proposing a framework with underlying algorithm to leverage state of the art. 3. PRELIMINARIES GAN is used in many computer vision applications such image-to-image translation. This section provides an overview of GAN to understand the proposed approach which is based on adversarial learning. GAN has two important components such as generator (G) and discriminator (D). The two components are implemented using deep learning techniques. As presented in Figure 1, the G is used to generate new data denoted as (Gz) and the underlying data distribution is denoted as pg(z). The aim of GAN is to see that training sample denoted as pr(x) and pg(z) are same. On the other hand, the D takes (Gz) and real data x as input. G gets trained while D has fixed parameters. The discriminator D generates output which is denoted as D(G(z)) and between this and sample label, error rate is computed. Figure 1. A typical GAN framework
  • 4. ๏ฒ ISSN:2302-9285 Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201 3194 Backpropagation algorithm is employed to update parameters of G. The D is aimed at finding whether input is really from given sample and provides required feedback that helps in updating Gโ€™s parameters. Based on real input data is x or not, the output of D approaches to either 1 and 0 respectively. The overall process resembles min-max game played by two plyers and loss function of D is computed as in (1). V(D,๐œƒ(๐ท) )= -๐ธ๐‘ฅ~๐‘๐‘Ÿ(๐‘ฅ)[logD(x)]-๐ธ๐‘ง~๐‘๐‘”(๐‘ง)[log(1-D(g(z)))] (1) Similarly, the G has its loss function as in (2). V(G,๐œƒ(๐บ) )=๐ธ๐‘ง~๐‘๐‘”(๐‘ง)[log(1-D(g(z)))] (2) In the game, both players have their loss functions. In the process D maximizes V(D) ,(๐œƒ(๐ท) , ๐œƒ(๐บ) )while G maximizes V(G) ,(๐œƒ(๐ท) , ๐œƒ(๐บ) )by updating ๐œƒ(๐ท) and๐œƒ(๐ท) respectively. The loss functions of the players have parameter dependency. Nash equilibrium is to be achieved in order for a player to update parameters of other player and stop training. Min max V(D,G)= ๐ธ๐‘ฅ~๐‘๐‘Ÿ(๐‘ฅ)[logD(x)]+๐ธ๐‘ง~๐‘๐‘”(๐‘ง)[log(1-D(g(z)))] (3) As the GAN is denoted as min-max optimization problem, the loss function is as expressed as in (3). D makes an objective function as large as possible using read data as input. The goal of D is to ensure that output denoted as D(G(z)) is close to 1. With training the min-max game converges to Nash equilibrium. GAN models are found in many computer vision applications such as [9]-[12]. In this paper, the D tries to establish human identity while the G tries to make human face images as hard as possible to identify while ensuring human action recognition. In the proposed approach there is third component that anonymizes face region so as to see that the privacy is not lost. Reliable human action recognition while defeating disclosure of identity is the main aim of this paper. As presented in Table 1, the notations are provided to understand the symbols used in the proposed algorithm and its underlying equations. Table 1. Notations used in the algorithm Notation Description M Face anonymizer or modifier f or rv input face or input face region A Action detector D The face classifier or discriminator V Set of frames in a video F Set of face images ๐‘Ÿ๐œ A face region M (๐‘Ÿ๐œ) Modifying the face region given ๐œโ€ฒ Updated frame ๐ฟ๐ด sum of the four losses in Faster-RCNN M(f) The act of anonymization of given face ๐ฟ๐ท Face detection loss (by discriminator) F Original image ๐œ† A weight 4. OUR APPROACH Human action recognition from pre-recorded or live videos is an important requirement in many applications such as surveillance. However, those applications generally recognize not only human action but also identity of human. This is where the privacy of the person is lost. In this paper we aim at preventing disclosure of identity while allowing reliable action recognition. 4.1. The framework Inspired by the generative adversarial network (GAN) explained in section 3, we considered adversarial learning phenomenon where two components compete. They include face anonymizer acting as generator (G) and face classifier (D). The former strives to ensure that human face is anonymized so as to preserve privacy while the latter tries to extract sensitive information in order to establish identity. Both G and D competes with each other with quite opposite responsibilities. This kind of adversarial learning is widely using in computer vision applications such as image to image translation [9]-[12]. Without
  • 5. Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala) 3195 compromising on action detection performance, the face anonymizer strives to remove privacy sensitive information from face region of humans in given video input. Figure 2 shows the proposed framework named privacy preserving human activity recognition framework (PPHARF). Figure 2. Overview of the proposed framework based on adversarial learning The framework uses multi-task learning approach where the learning process is associated with different components such as face anonymizer, action detector and face classifier. The framework takes input video and finds the regions where face appears. Such face region is detected by the face detection module and that region is given to face anonymizer. Face anonymizer M takes a face (f) or face region (rv) extracted by face detection module and modifies it by removing sensitive information resulting M(f) or M(rv). The face region is subtracted from original video frame (v) by face region subtraction module. The combiner module takes modified face region (rv) and combined with original frame without face region to result in a combined frame vโ€™. The action detector module (A) tries to detect human action using vโ€™ which is devoid of sensitive information. Face classifier (the discriminator D) tries to establish identity of human (and expected to fail to do so). Its detection loss is expressed as in (4). ๐ฟ๐‘‘๐‘’๐‘ก(๐‘€, ๐ด, ๐‘‰) = ๐ธ๐‘ฃ~๐‘‰[๐ฟ๐ด(๐‘ฃโ€ฒ , {๐‘๐‘–(๐‘ฃ)}, {๐‘ก๐‘–(๐‘ฃ)})] (4) The action detector module is implemented as in [9] and [11] while the loss function is taken from [10]. The face classifier is nothing but discriminator D in adversarial learning whose aim is to identify a face. It is implemented based on the face classifier used in [12] and the adversarial loss function is modelled after the two-player game explored in [29]. The adversarial loss function is expressed in (5). ๐ฟ๐‘Ž๐‘‘๐‘ฃ(๐‘€, ๐ท, ๐น) = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“), ๐‘–๐‘“)] = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“, ๐‘–๐‘“)] (5) In order to preserve the structure such as brightness and pose of modified image, we used L1 loss which is also known as photorealistic loss. In computer vision applications this loss function is used in [30], [31]. It could force similarity between input image and modified image (some visual similarity). This loss function is expressed as in (6). ๐ฟ๐ผ1 = (๐‘€, ๐น)๐ธ๐‘“~๐น[๐œ†||๐‘€(๐‘“) โˆ’ ๐‘“||1] (6) The components in the proposed framework such as A and D are iteratively trained to perform their functionality effectively. Face detector module is the one used in [32] and another face detector used in [33] is used in order to eliminate false positives. The face anonymizer is modelled after [34] with 9 residual blocks. With respect to training Adam solver [35] is used with different parameters such as ฮฒ1=0.5 and ฮฒ2=0.999 while learning rate is set at 0.0003 for face classifier and 0.001 for face anonymizer. Total number of epochs used is 12 and the learning rate is dropped to 1/10 after 7th epoch.
  • 6. ๏ฒ ISSN:2302-9285 Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201 3196 4.2. Action recognition The action recognition for face anonymized image is done using bi-directional long short term memory (BLSTM) network. long short term memory (LSTM) is an extension of recurrent neural networks. Due to its special architecture, which combats the vanishing and exploding gradient problems, it is good at handling human activity recognition up to a certain depth. The transfer process of the network can be described as follows: the input tensor is transformed, along with the tensor of the hidden layer (at the last stage), to the hidden layer by a matrix transformation. Then, the output of the hidden layer passes through an activation function to the final value of the output layer. Recurrent neural network (RNN) is a type of artificial neural network, which is used for working with sequential data. Normal feed forward neural network has only independent data points. But for sequential data, the current data point is dependent upon the previous data point. Inorder to maintain dependencies between data points in the neural network, we use RNN which has the concept of memory which stores the information of previous data points and using this it will generate next data points in sequence. Different activation functions such as sigmoid, Tanh and Relu functions in RNN. The main advantages of RNN are ability to handle sequential data, input data having varying lengths and can store past information in memory. The main advantage of RNN is the vanishing gradient, it can remember data points information only for a short period tof time. In-order to overcome this problem, we are going to LSTM networks. When backpropagation with a deep network is used, gradients will vanish rapidly if preventative measures that permit gradients to flow deeply are not taken. Compared with the simple input concatenation and activation used in RNNs, LSTM has a particular structure for remembering information for a longer time as an input gate and a forget gate control how to overwrite the information by comparing the inner memory with the new information arriving. It enables gradients to flow through time easily. Figure 3 shows basic structure of LSTM, where vector x denotes input, vector y denotes output and vector h denotes hidden layers. Figure 3. Basic structure of LSTM LSTM is a special type of RNN with memory blocks connected through layers. Each memory block is smaller than a neuron with gates having capability to manage state and output. This gate uses the sigmoid activation units to add information in memory bloch and to make change of state while operating on an input
  • 7. Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala) 3197 sequence. Each LSTM has three types of gates, forget gate (to remove information from the block), inouut gate (to update the memory based on input values) and output gate (values generated from the memory block based on input). These gates have weights that are leaned during the training process. In real life, human trajectories are continuous. Baseline LSTM cells predict the status based only on former information. Some important information may not be captured properly by the cell if it runs in only one direction. The improvement in bidirectional LSTM is that the current output is not only related to previous information but also related to subsequent information. For example, it is appropriate to predict a missing word based on context. Bidirectional LSTM is made up of two LSTM cells, and the output is determined by the two together. BLSTM is a kind of neural network having advantage of flowing information from in both directions means that future to past and past to future. Input flows in two directions making BLSTM different from regular LSTM. In regular LSTM, we can move information only in one direction either forward or backward. But in BLSTM, sequence information can move in bothe directions preserving future and past information in memory units. It consists of two LSTMs one taking input in a forward direction and one LSTM in a backward direction, effectively increasing the information available in the network. For providing combination of forward and backward outputs, different functions such as summation, multiplication, concatenation, and average are used. This BLSTM network is used in our work for action recognition of anonymized image. Figure 4 shows the structure of standard BLSTM. Figure 4: Standard BLSTM structure 4.3. Algorithm design An algorithm named multi-task learning based hybrid prediction algorithm is proposed to realize the functionality of our approach. As presented in Algorithm 1, it takes set of video frames V, a discriminator D, set of face images F and set of identity labels are used as input. There is an iterative process that works for each video frame. From the video frame, the face region is identified and that is anonymized. On the other hand, the subtraction module removes face region from the video frame and the result is assigned to s. After anonymizing face image, it is combined with the subtracted frame in order to have better anonymized face image with visual appearance. Such anonymized face image is taken by action detector that identifies human action associated with face image. At the same time, the discriminator D (face classifier) tries to establish identity of the face as part of its adversarial setting. Every time, the fame modifier (M) and the action detector (A) are updated with improved knowledge. This process continues for all the frames in the given video. If there are multiple images in a single frame, there will be a sub process to repeat steps for each fame image in the video frame v.
  • 8. ๏ฒ ISSN:2302-9285 Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201 3198 Algorithm: Multi-Task Learning based Hybrid Prediction Algorithm Input: D, V, F, identity labels Output: M, A 1. For each v in V 2. f๏ƒŸDetectFace(v) 3. s ๏ƒŸ Subtract(f) 4. f โ€˜ ๏ƒŸ M(f) 5. Compute L1 loss ๐ฟ๐ผ1 = (๐‘€, ๐น)๐ธ๐‘“~๐น[๐œ†||๐‘€(๐‘“) โˆ’ ๐‘“||1] 6. vโ€™๏ƒŸCombiner (fโ€™, s) 7. a ๏ƒŸ A(vโ€™) 8. Compute detection loss ๐ฟ๐‘‘๐‘’๐‘ก(๐‘€, ๐ด, ๐‘‰) = ๐ธ๐‘ฃ~๐‘‰[๐ฟ๐ด(๐‘ฃโ€ฒ , {๐‘๐‘–(๐‘ฃ)}, {๐‘ก๐‘–(๐‘ฃ)})] 9. det=D (f โ€˜) 10. Compute adversarial loss ๐ฟ๐‘Ž๐‘‘๐‘ฃ(๐‘€, ๐ท, ๐น) = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“), ๐‘–๐‘“)] = โˆ’๐ธ(๐‘“~๐น,๐‘–๐‘“~๐ผ)[๐ฟ๐ท(๐‘€(๐‘“, ๐‘–๐‘“)] 11. Update M 12. Update A 5. RESULTS AND DISCUSSION 5.1. Datasets We used two datasets namely DALY and JHMDB in this dataset. Daily action localization in YouTube (DALY) is the dataset introduced in [36] and the dataset is obtained from [37]. The dataset has more than 30 hours of YouTube videos that are annotated in spatial and temporal domains. It consists of 10 human actions that are witnessed every day with 3600 total number of instances. Action classes are considered with clearly set temporal boundaries. This is essential to overcome ambiguities associated with noise. Some of the action classes include brushing teeth, phoning, taking photos, drinking, applying makeup on lips and playing harmonica as presented in Figure 5. Figure 5. Representative pictures with human action annotated for10 classes of DALY dataset JHMDB is another dataset introduced in [25] and it is collected from [38]. It is the dataset which is benchmark for human detection, human actions and pose estimation. JHMDB is derived from the HMDB51 dataset [39] where 5,100 videos consisting of 51 human actions. JHMDB is a subset of HMDB51 with 21 categories that involve a single human with certain actions as presented in Figure 6. Some of the actions include hug, kick, jump, run, and shoot. 5.2. Results The experiments are made with DALY and JHMDB datasets. The experimental results are observed in terms of face verification error and mean average precision. Our anonymization approach is compared with many states of the art or baseline approaches. They include Blur x 3, masked, noise x 3 and edge. The qualitative results shown provide the visual difference before and after the face modification. This is made due to the fact that our algorithm anonymizes prior to performing action recognition. The user study conducted with famous personalities convinced that our anonymization approach has acceptable performance.
  • 9. Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala) 3199 Figure 6. Some of the images from JHMDB dataset with annotated human actions As presented in Table 2, the results revealed that for many actions, the proposed approach showed better performance. The aim of the research is to detect human actions reliably while preventing disclosure of human identity. This has been fulfilled as observations are made in the experimental study. The results also reveal that the performance of the proposed method is better over baselines in terms of anonymization as well. It is evidenced in both empirical study and user study. Table 2. Shows accuracy in action detection using DALY dataset MTL- HPA Edge Masked Noise(ฯƒ 2=0.6) Noise (ฯƒ 2=0.4) Noise (ฯƒ 2=0.2) Blur(24x24 ) Blur(16x 16) Un- anonymized Action 90.2892 80.3803 67.1271 83.7136 87.7276 87.7977 92.1721 82.1521 92.3923 Lip 35.1131 29.4895 15.2052 21.4715 25.005 31.4014 39.8798 32.4324 51.3113 Brush 79.1971 78.098 78.9389 81.4013 78.6686 78.4884 80.0099 79.6095 76.8067 Floor 34.5926 30.5405 26.6066 29.6196 32.7127 31.9019 31.8018 32.1721 27.8979 Window 24.9529 10.3203 6.59659 7.43743 8.34834 12.4224 15.2452 10.7507 31.2612 Drink 34.8939 32.6726 25.5555 29.1091 35.3653 34.8348 34.995 31.3814 32.7027 Fold 79.1471 79.2292 73.023 78.048 75.035 76.5765 74.7247 74.9849 75.3753 Iron 48.5665 35.1852 27.8178 33.9639 40.1601 42.4124 46.3763 37.007 51.5515 Phone 57.3753 54.7447 21.3413 27.3774 36.6466 50.1901 51.8318 48.4184 73.9839 Harmoni ca 57.5955 49.6696 46.8068 46.3964 45.4654 53.5335 53.1431 52.5625 55.7957 Photo 54.37232 48.03299 38.90186 43.85381 46.51347 49.95591 52.01797 48.1471 56.90785 mAP 6. CONCLUSION We proposed a framework named privacy preserving human activity recognition framework (PPHARF) with an underlying algorithm known as multi-task learning based hybrid prediction algorithm (MTL-HPA). The framework is aimed at supporting adversarial learning based multi-task formulation that accomplishes human action recognition, anonymization of human face and face detection (to evaluate anonymization). Face anonymizer is part of the framework that is designed to ensure preserving of privacy. In other words, non-disclosure of human identity and at the same time recognition of action are important considerations. The face anonymizer is designed to confuse humans and applications in recognition of human based on face while supporting action recognition accurately. The proposed framework is evaluated with a prototype using JHMDB and DALY datasets. The proposed approach outperformed traditional face modification approaches and it showed significantly better performance over the state of the art. In future, we intend to explore advanced GAN approaches such as SingleGAN with possible improvements. REFERENCES [1] K. V. Kumar, J. Harikiran, M.A.R. Prasad, and U. Sirisha, โ€˜โ€˜Privacy-Preserving Human Activity Recognition and Resolution Image using Deep Learning Algorithms Spatial relationship and increasing the attribute value in OpenCV,โ€™โ€™ International Journal of Advanced Science and Technology, vol. 29, no. 7, pp. 514-523, 2020.
  • 10. ๏ฒ ISSN:2302-9285 Bulletin of Electr Eng & Inf, Vol. 10, No. 6, December 2021 : 3191 โ€“ 3201 3200 [2] C. Zhang, Y. Tian, and E. Capezuti, โ€˜โ€˜Privacy Preserving Automatic Fall Detection for Elderly Using RGBD Cameras,โ€™โ€™International Conference on Computers for Handicapped Persons, 2012, pp. 625-633, doi: 10.1007/978-3-642-31522-0_95. [3] J. Dai, J. Wu, B. Saghafi, J. Konrad, and P. Ishwar, โ€˜โ€˜Towards Privacy-Preserving Activity Recognition Using Extremely Low Temporal and Spatial Resolution Cameras,โ€™โ€™ IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 68-76, 2015. [4] L. Lyu, X. He, Y. W. Law, and M. Palaniswami, โ€˜โ€˜Privacy-Preserving Collaborative Deep Learning with Application to Human Activity Recognition,โ€™โ€™ ACM, pp. 1219-1228, Nov. 2017, doi: 10.1145/3132847.3132990. [5] F. Al Machot, M. R. Elkobaisi, and K. Kyamakya, โ€˜โ€˜Zero-Shot Human Activity Recognition Using Non-Visual Sensors,โ€™โ€™ Sensors, vol. 20, no. 3, p. 825, doi: 10.3390/s20030825. [6] A. S. Rajput, B. Raman, J. Imran, โ€˜โ€˜Privacy-preserving human action recognition as a remote cloud service using RGB-D sensors and deep CNN,โ€™โ€™ Expert Systems with Applications, vol. 152, p. 113349, August 2020, doi: 10.1016/j.eswa.2020.113349. [7] D. Riboni and C. Bettini, โ€˜โ€˜COSAR: hybrid reasoning for context-aware activity recognition,โ€™โ€™ Personal and Ubiquitous Computing, vol. 15, no. 3, pp. 271-289, 2011, doi: 10.1007/s00779-010-0331-7. [8] Z. Wu, Z. Wang, Z. Wang, and H. Jin, โ€˜โ€˜Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study,โ€™โ€™ Springer, pp. 1-19, 2018. [9] X. Peng and C. Schmid, โ€˜โ€˜Multi-region two-stream r-cnn for action detection,โ€™โ€™in: European conference on computer vision (ECCV), October 2016, pp. 744-759, doi: 10.1007/978-3-319-46493-0_45. [10] Z. Ren and Y. J. Lee, โ€˜โ€˜Cross-domain self-supervised multi-task feature learning using synthetic imagery,โ€™โ€™in: CVPR, 2018, pp. 762-771. [11] M. S. Ryoo, B. Rothrock, C. Fleming, โ€˜โ€˜Privacy-preserving egocentric activity recognition from extreme low resolution,โ€™โ€™Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), 2017, pp. 4255- 4262. [12] W. Liu, Y. Wen, Z. Yu, M. Li, B. Raj, and L. Song, โ€˜โ€˜Sphereface: Deep hypersphere embedding for face recognition,โ€ Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 212-220. [13] Y. Sun, X. Wang, and X. Tang, โ€˜โ€˜Deep learning face representation from predicting 10,000 classes,โ€™โ€™Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 1891-1898 [14] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, โ€˜โ€˜Deepface: Closing the gap to human-level performance in face verification,โ€™โ€™Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 1701-1708. [15] Y. Mao, Shanhe Yi et.al., โ€˜โ€˜A Privacy Preserving Deep Learning Approach for Face Recognition with Edge Computingโ€™โ€™Conference proceedings, usenix hotedge18-papers, pp.1-6, 2016. [16] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, โ€˜โ€˜Face recognition: A literature survey,โ€™โ€™in: ACM computing surveys (CSUR), vol. 35, no. 4, pp. 399-458, December 2003, doi: 10.1145/954339.954342. [17] M. S. Ryoo, B. Rothrock, C. Fleming, and H. J. Yang, โ€˜โ€˜Privacy-Preserving Human Activity Recognition from Extreme Low Resolution,โ€ Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), 2017, pp. 4255-4262. [18] E. Cippitelli, S. Gasparrini, E. Gambi, and S. Spinsante, โ€˜โ€˜A Human Activity Recognition System Using Skeleton Data from RGBD Sensors,โ€™โ€™Computational Intelligence and Neuroscience, vol. 2016, pp. 1-14, 2016, doi: 10.1155/2016/4351435. [19] E. Cippitelli, E. Gambi, and S. Spinsante, โ€˜โ€˜Human Action Recognition with RGB-D Sensors,โ€™โ€™ Motion Tracking and Gesture Recognition, pp. 100-115, 2017, doi: 10.5772/68121. [20] S. Zolfaghari and M. R. Keyvanpour, "SARF: Smart activity recognition framework in Ambient Assisted Living," 2016 Federated Conference on Computer Science and Information Systems (FedCSIS), 2016, pp. 1435-1443. [21] M. Ciliberto, D. Roggen, and F. J. O. Morales, โ€˜โ€˜Exploring human activity annotation using a privacy preserving 3D model,โ€™โ€™Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing Adjunct-UbiComp โ€™16, September 2016, pp. 803-812, doi: 10.1145/2968219.2968290. [22] Z. Gheid, Y. Challal, X. Yi, and A. Derhab, โ€˜โ€˜Efficient and privacy-aware multi-party classification protocol for human activity recognition,โ€ Journal of Network and Computer Applications, vol. 98, pp. 84-96, November 2017, doi: 10.1016/j.jnca.2017.09.005. [23] Z. Gheid and Y. Challal, "Novel Efficient and Privacy-Preserving Protocols for Sensor-Based Human Activity Recognition," 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), 2016, pp. 301-308, doi: 10.1109/UIC- ATC-ScalCom-CBDCom-IoP-SmartWorld.2016.0062. [24] R. Yonetani, V. N. Boddeti, K. M. Kitani, and Y. Sato, โ€˜โ€˜Privacy-Preserving Visual Learning Using Doubly Permuted Homomorphic Encryption,โ€ 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2040-2050. [25] H. Jhuang, J. Gall, S. Zuffi, C. Schmid, and M. J. Black, โ€˜โ€˜Towards understanding action recognition,โ€™โ€™in: International Conf. on Computer Vision (ICCV), December 2013, pp. 3192-3199. [26] X. Yu, X. Cai, Z. Ying, T. Li, and G. Li, โ€˜โ€˜SingleGAN: Image-to-Image Translation by a Single-Generator Network using Multiple Generative Adversarial Learning,โ€ Asian Conference on Computer Vision. Springer, Cham, December 2018, pp. 341-356, doi: 10.1007/978-3-030-20873-8_22.
  • 11. Bulletin of Electr Eng & Inf ISSN: 2302-9285 ๏ฒ A multi-task learning based hybrid prediction algorithm for privacy preserving โ€ฆ (Vijaya Kumar Kambala) 3201 [27] X. Peng and C. Schmid, โ€˜โ€˜Multi-region two-stream r-cnn for action detection,โ€™โ€™in: European conference on computer vision (ECCV), October 2016, pp. 744-759, doi: 10.1007/978-3-319-46493-0_45. [28] Y. Wen, K. Zhang, Z. Li, and Y. Qiao, โ€˜โ€˜A discriminative feature learning approach for Deep face recognition,โ€™โ€™European conference on computer vision, 2016. [29] I. Goodfellow et al., โ€˜โ€˜Generative adversarial nets,โ€™โ€™Advances in neural information processing systems, vol. 27, pp. 1-9, 2014. [30] P. Isola, J. Y. Zhu, T. Zhou, and A. A. Efros, โ€˜โ€˜Image-to-image translation with conditional adversarial networks,โ€™โ€™ Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1125-1134. [31] J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, โ€˜โ€˜Unpaired image-to-image translation using cycle-consistent adversarial networks,โ€™โ€™Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2223-2232. [32] M. Najibi, P. Samangouei, R. Chellappa, and L. Davis, โ€˜โ€˜SSH: Single stage headless face detector,โ€™โ€™Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 4875-4884. [33] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks," in IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, Oct. 2016, doi: 10.1109/LSP.2016.2603342. [34] J. Johnson, A. Alahi, and L. Fei-Fei, โ€˜โ€˜Perceptual losses for real-time style transfer and super-resolution,โ€™โ€™European conference on computer vision. Springer, Cham, October 2016, pp. 694-711, doi: 10.1007/978-3-319-46475-6_43. [35] D. P. Kingma and J. Ba, โ€˜โ€˜Adam: A method for stochastic optimization,โ€™โ€™arXiv preprint arXiv:1412.6980, 2014. [36] P. Weinzaepfel, X. Martin, and C. Schmid, โ€˜โ€˜Human action localization with sparse spatial supervision,โ€™โ€™arXiv preprint arXiv:1605.05197, 2016. [37] Thoth Inria, โ€˜โ€˜Daily Action Localization in Youtube videos,โ€™โ€™[Online]. Available: http://thoth.inrialpes.fr/daly/. [38] JHMDB Dataset, โ€˜โ€˜A fully annotated data set for human actions and human poses,โ€™โ€™[Online]. Available: http://jhmdb.is.tue.mpg.de/. [39] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre, "HMDB: A large video database for human motion recognition," 2011 International Conference on Computer Vision, 2011, pp. 2556-2563, doi: 10.1109/ICCV.2011.6126543. BIOGRAPHIES OF AUTHORS Vijaya Kumar Kambala working as Part Time Research Scholar in School of CSE VIT-AP. he is working as Assistant Professor, in PVP Siddhartha Institute of Technology, Vijayawada. His research interest includes Image Processing, Video Analysis, human activity recognition, privacy preserving human activity recognition. Harikiran Jonnadula working as Assistant professor, School of CSE, Vellore Institute of Technology, VIT-AP. His reserach interest include microarray image analysis , Hyperspectral Imaging and Video Analytics.