International Journal of Computational Intelligence Systems
Vol. 13(1), 2020, pp. 66–76
DOI: https://doi.org/10.2991/ijcis.d.200120.002; ISSN: 1875-6891; eISSN: 1875-6883
https://www.atlantis-press.com/journals/ijcis/
RunPool: A Dynamic Pooling Layer for Convolution Neural
Network
Huang Jin Jie¹·*, Putra Wanda¹·²·*
¹School of Computer Science & Technology, Harbin University of Science and Technology, Harbin, China
²University of Respati Yogyakarta, St. Sol, Depok, Yogyakarta, Sleman, 55281, Indonesia
*Corresponding authors. Email: huangjinjie163@163.com, wpwawan@gmail.com
ARTICLE INFO
Article History
Received 05 Jul 2019
Accepted 15 Jan 2020
Keywords
Dynamic pooling
Deep learning
Malicious classification
Social network
ABSTRACT
Deep learning (DL) has achieved significant performance in computer vision problems, mainly in automatic feature extraction and representation. However, it is not easy to determine the best pooling method for a given case study. For instance, a pooling type that experts find best for image processing may not be optimal for other tasks. Thus, a pooling strategy is required that keeps in line with the philosophy of DL. In a dynamic neural network architecture, it is not practically possible to hand-pick a proper pooling technique for every layer, which is the primary reason fixed pooling functions cannot be applied to dynamic and multidimensional datasets. To deal with these limitations, we need an optimal pooling method as a better option than max pooling and average pooling. Therefore, we introduce a dynamic pooling layer called RunPool to train convolutional neural network (CNN) architectures. RunPool is proposed to regularize the neural network by replacing the deterministic pooling functions. In the final section, we test the proposed pooling layer on classification problems with an online social network (OSN) dataset.
© 2020 The Authors. Published by Atlantis Press SARL.
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
1. INTRODUCTION
In recent years, deep learning (DL) has achieved remarkable performance in various computer applications, especially in computer vision problems [1–3]. Convolutional neural networks (CNNs) are among the most popular DL algorithms for image recognition [4]. The architecture is a supervised learning model that consists of several convolutional layers and pooling layers. In neural network computation, object recognition requires features that are robust against changes in position or movement. CNN architectures harness pooling layers to decrease the size of the feature maps and to extract these features [4].
These pooling layers are a crucial part of a CNN for achieving an efficient training process. Subsampling plays an important role in obtaining better accuracy with a small loss. The common CNN pooling techniques are max pooling and average pooling: max pooling takes the maximum score in the pooling window, and average pooling takes the average score of the pooling window. However, current studies often pick a pooling type without much consideration.
A previous paper reveals that the two conventional pooling models are not optimal [5]. In some cases, average pooling diminishes critical information in strong activations, which can decrease CNN performance. On the other hand, max pooling has its own shortcomings because it ignores all information except the most significant value, which wastes the opportunity to use other informative features in large and dynamic input data. Moreover, with max or average pooling schemes one cannot predict which parts of the input will be selected.
One paper increases pooling layer performance by replacing conventional deterministic pooling with a stochastic model that randomly selects the activation within each pooling region according to a multinomial distribution [5]. Another study introduces Deep Generalized Max Pooling, which balances the contribution of all activations of a region by reweighting all descriptors [6]. A further paper proposes alpha-pooling, which utilizes a trainable parameter 𝛼 to decide the type of pooling; it is a generalized pooling scheme that includes max pooling and average pooling as particular cases. Experiments demonstrate that alpha-pooling can improve the accuracy of image recognition. The authors compared several pooling layers and found that max pooling was not the optimal pooling method [7].
Besides, there are many other pooling schemes, including the geometric and harmonic average. It is still challenging to find the best pooling method, and the choice should keep in line with the philosophy of DL. In a dynamic neural network architecture, it is not practically possible to hand-pick a proper pooling technique for every layer, which is the primary reason fixed pooling functions cannot be applied to dynamic data problems. To deal with these limitations, we need an optimal pooling method as an alternative to max pooling and average pooling: one that is not only able to select a value randomly but can also predict which parts of the input will be selected as the most appropriate pooling value.
In this paper, we introduce a generic pooling layer as an option for constructing CNN architectures. Based on our survey, it is the only pooling layer that utilizes network parameters in each layer to determine the pooling value in the hidden layer. We present the main contributions of our work, particularly in CNN, as follows:
1. We introduce a new pooling layer in the CNN architecture. Different from max and average pooling, this pooling layer implements a magnetic field concept in calculating the pooling value. It runs by dynamically changing the pooling layers' values for each convolution in the hidden layer.

2. We construct the pooling layer by obtaining several hyperparameters in each layer and calculating them with a range distribution. By establishing this concept, we obtain state-of-the-art results within the CNN architecture: demonstrating the architecture, we obtain higher accuracy and a smaller loss than a conventional CNN architecture with a max pooling or average pooling layer.

3. Notably, this function is different from traditional pooling, which takes the highest-valued activation out of the pooling window or the average value of the pooling window. The proposed pooling layer tries to find an appropriate pooling value and to achieve efficient computation by calculating several informative parameters in each layer. It is a novel pooling approach in CNN architecture that reduces overfitting while training a model.
The rest of the paper is structured as follows: Section II gives insight into the proposed technique; Section III presents the experimental setup, consisting of a case study on a social network, the dataset, and data preprocessing; Section IV describes the results and discussion; and Section V offers the conclusion and highlights some open research questions in CNN architecture.
2. PROPOSED TECHNIQUE
As a hot research topic, DL is becoming an increasingly popular approach for solving diverse applications. CNN algorithms imitate the structure and function of the human brain [8]. In recent years, CNN has achieved state-of-the-art results on a variety of challenging problems in computer vision and speech recognition, and research communities have applied the concept to many computing tasks. Several papers explore the use of deep belief networks (DBN) for authorship verification with a Gaussian–Bernoulli DBN [9], face verification with a Deep Neural Network (DNN) [10], and document recognition [11].
2.1. Convolutional Neural Network
Most current research in DL proposes CNN architectures to deal with various problems, including image, voice, video, and object detection [8]. In this paper, we construct a CNN architecture by adding a new pooling layer to the neural network. We refer to this graph as a dynamic CNN: a CNN graph with a RunPool pooling function that has three kinds of layers, namely an input layer, a hidden layer, and an output layer. In the input layer, the network projects the feature extraction as matrix input. To compute the feature extraction result, we adopt convolution layer processing with weight sharing; thus, it is unnecessary to optimize many parameters, and the method converges faster.
This architecture adopts convolution to address multiple input feature maps and to produce stacked feature maps over increasing layers. In this experiment, we define a feature map of the $i$-th layer as $F_i$ and denote the $n$ feature maps of layer $i-1$ as $F^1_{i-1}, \ldots, F^n_{i-1}$. Layer $i$ with filter size $l$ has feature maps $F^j_{i,l}$, where $j$ is the index of a feature map in layer $i$. We compute each output map by convolving every $F^k_{i-1}$ with a matrix $M^{j,k}_{i,l}$, which yields:

$$F^j_{i,l} = \sum_{k=1}^{n} M^{j,k}_{i,l} * F^k_{i-1} \tag{1}$$
The model defines the computation of the convolution process between filters and input matrices: a set of filters $F \in \mathbb{R}^{d \times m}$ is convolved with the input matrix $S \in \mathbb{R}^{d \times |s|}$. The goal of the convolutional layer is to compute the feature extraction.
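As an illustration of Equation (1), the sketch below (an assumption on our part, not the authors' released code) accumulates the 2D convolution of each input feature map with its corresponding kernel to form one output feature map; the function name conv_feature_map and the use of scipy.signal.convolve2d are ours.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_feature_map(prev_maps, kernels):
    """One output map F^j_{i,l} = sum_k M^{j,k}_{i,l} * F^k_{i-1} (Equation (1)).

    prev_maps: list of n 2D arrays, the feature maps of layer i-1
    kernels:   list of n 2D arrays, one kernel per input map
    """
    out = np.zeros_like(convolve2d(prev_maps[0], kernels[0], mode="valid"))
    for f_k, m_k in zip(prev_maps, kernels):
        out += convolve2d(f_k, m_k, mode="valid")  # accumulate over the n input maps
    return out
```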
After the convolution operation, we construct a pooling layer to obtain the most informative values in the hidden units. Pooling aims to convert the joint feature representation into a more valuable form that preserves important information while discarding irrelevant details. In operation, pooling reduces the resolution of the feature maps over a local neighborhood of the previous layer. Commonly, a CNN architecture employs conventional pooling schemes such as max pooling and average pooling to obtain a maximum or average value in the hidden layer. Max pooling chooses the biggest element in each pooling region:
$$y_{kij} = \max_{(p,q) \in R_{ij}} x_{kpq} \tag{2}$$

where $y_{kij}$ is the output of the pooling operator related to the $k$-th feature map, and $x_{kpq}$ is the element at $(p, q)$ within the pooling region $R_{ij}$, which represents a local neighborhood around the position $(i, j)$.
Average pooling takes the arithmetic mean of the elements in each pooling region:

$$y_{kij} = \frac{1}{|R_{ij}|} \sum_{(p,q) \in R_{ij}} x_{kpq} \tag{3}$$

where $|R_{ij}|$ denotes the size of the pooling region $R_{ij}$. Although these two pooling functions produce good results on some datasets, it is still unknown which will work better on a new task with a different dataset. Thus, choosing the pooling operator is not straightforward.
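For reference, Equations (2) and (3) correspond to the following NumPy sketch (illustrative only; the function name pool2d and the non-overlapping window assumption are ours) of max and average pooling over a single feature map.

```python
import numpy as np

def pool2d(x, p, mode="max"):
    """Non-overlapping pooling of an (H, W) feature map with a p x p window."""
    h, w = x.shape
    h2, w2 = h - h % p, w - w % p                  # crop so both dims divide by p
    blocks = x[:h2, :w2].reshape(h2 // p, p, w2 // p, p)
    if mode == "max":
        return blocks.max(axis=(1, 3))             # Equation (2): max per region
    return blocks.mean(axis=(1, 3))                # Equation (3): mean per region
```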
In some image recognition problems, both max pooling and average pooling have shortcomings. Figure 1 illustrates their drawbacks in detecting a diagonal line. In the pooling region, max pooling only considers the maximum element and ignores the others; in some cases this leads to an unacceptable result, because if most elements have high magnitudes, the distinguishing feature vanishes after max pooling. On the other hand, average pooling gets worse if there are many zero elements in the input features, because the characteristic of the feature map is then largely diluted [12].

Figure 1: Pooling process of an input feature that illustrates the drawbacks of max pooling and average pooling on CNN.
Therefore, as a new way to downsample the features, we consider replacing the deterministic pooling function with a stochastic scheme. Instead of using a standard pooling method, we introduce a new pooling function that adopts a nondeterministic pooling concept.
2.2. Main Idea
To construct a pooling function with a stochastic pooling paradigm, we take a magnetic field problem as the basic idea. The main problem for this function is how to establish a CNN architecture by optimizing a new pooling layer. We are inspired by a model that adopts dynamic k-max pooling to compute the engineered feature in each layer: by choosing the hyperparameters of each layer, the function calculates the pooling value to obtain a pooling sequence in a 1D dataset [13].
To implement the magnetic field concept, we calculate the pooling layer with the Gaussian law theory (Gaussian random variable). In the initial step, we compute the pooling function by generating random numbers from a Gaussian distribution. We employ the random technique because the randomness comes from atmospheric noise; in several computer applications, it is better than the common pseudo-random algorithms. A normal distribution has a bell-shaped density curve described by its mean $\mu$ and standard deviation $\sigma$. To calculate the Gaussian distribution over the range of numbers, we compute the function:

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{4}$$
In Equation (4), $\mu$ represents the mean, median, mode, or expectation of the distribution, $\sigma$ is the standard deviation, and $\sigma^2$ is the variance. To construct the pooling function, we multiply a Gaussian random number by $k_{map}$ in each layer. The function calculates $k_{map}$ as a dynamic value that depends on the current parameters. Finally, it compares $k_{map} \cdot f(x \mid \mu, \sigma^2)$ with the constant $k_{top}$ to obtain the final $k$ value for the current pooling layer.
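A minimal sketch of this step, assuming our own choices for the Gaussian parameters mu and sigma and for the rounding rule (the paper does not state them): sample from N(mu, sigma^2) as in Equation (4), scale k_map by the sample, and compare with the constant k_top.

```python
import numpy as np

def gaussian_scaled_k(k_map, k_top, mu=0.5, sigma=0.15, rng=None):
    """Scale k_map by a Gaussian random value and compare with the constant k_top."""
    rng = rng or np.random.default_rng()
    r = abs(rng.normal(mu, sigma))                 # draw from N(mu, sigma^2)
    k_scaled = max(1, int(round(k_map * r)))       # keep a positive integer k
    return max(k_top, k_scaled)                    # final k for this pooling layer
```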
2.3. RunPool Pooling
CNN is a famous DL architecture that uses many identical copies of the same neuron. The process runs extensive computation with a small number of parameters and has many interleaved convolutional and pooling layers across the network. A CNN layer receives a single input (the feature maps) and computes feature maps as its output by convolving filters across the input maps. CNN can skip manual feature engineering through automatic representation.

One critical layer in CNN training is the pooling layer. Pooling can act as a generalizer of the lower-level data in NN training. It is a sliding window technique that utilizes several kinds of statistical functions to summarize the contents of each window. There are two common pooling techniques, max pooling and average pooling: max pooling employs the max function to obtain informative material, while average pooling harnesses the statistical mean function to summarize the contents of the window. Principally, a pooling function simplifies the output features by performing nonlinear computation and reducing the number of parameters.
Instead of using conventional pooling layers, we construct a method that dynamically changes the pooling values based on the current network parameters. It takes the length of the input vector of the network to obtain a feature map according to the current parameters in each layer. The basic idea of the function is to calculate the network elements of the current pooling map and to find an appropriate pooling value by comparing it with a constant pooling value, recomputing the pooling value from the layer parameters and learning a CNN hierarchy of feature orders. Figure 2 depicts the CNN computation with the RunPool pooling layer in the visible and hidden layers.

Figure 2: The network topology of the proposed CNN with input $d_{input} = x_1, x_2, \ldots, x_n$ and output $d_{output} = y_1, y_2$ with $h$ hidden layers, where the output is represented by $Y = y_1, y_2$ and the input by $x = x_1, x_2, \ldots, x_n$. The matrix of weights is calculated by $W_i \in \mathbb{R}^{x_i \times h_j}$.
The hidden layer runs the convolution process using filters and the input matrix. In the model, a set of filters $F \in \mathbb{R}^{d \times m}$ is convolved with the input matrix $S \in \mathbb{R}^{d \times |s|}$. The convolutional layer aims to compute feature extraction, such as word sequences, throughout the training process. RunPool computes the matrix conversion based on the feature extraction process. Table 1 describes the mathematical notation for the RunPool function.

We construct the function to allow a smooth extraction of higher order and to adapt to longer-range elements at the hidden layer. The model employs the $k$ value at the intermediate (hidden) layers. The dynamic CNN calculates the forward pass with the following function:

$$y^n_j = \sum_{i} k^n_{ij} * x^n_i \tag{5}$$
Table 1: Mathematical notation of the RunPool pooling algorithm.

Notation | Description
f | Filter size
s | Stride
l | Index of the current layer
kl | k value of the current layer
ktop | k value of the top layer
kmap | k value after calculation of input features
L | Total number of layers
C | Length of input vector (sequence of features)
p | Padding

ktop and L are constants.
In the above formula, the network first computes the feature maps as the input layer to the CNN. In Equation (5), $x^n_i$ is the $i$-th input feature map of sample $n$ and $k^n_{ij}$ is the $ij$ input kernel of sample $n$; $j$ indexes the output feature map of sample $n$. Backward and forward operations run simultaneously in one iteration. The CNN calculates the backward pass with the formulas:

$$\frac{\partial l}{\partial x^n_i} = \sum_{j} \left(\frac{\partial l}{\partial y^n_j}\right) * k^n_{ij} \tag{6}$$

$$\frac{\partial l}{\partial k^n_{ij}} = \left(\frac{\partial l}{\partial y^n_j}\right) * x^n_i \tag{7}$$

In the backward pass with Equation (6), the layer computes the gradient of the loss function $l$ with respect to $x^n_i$. The values of the gradient are calculated by the partial derivative $\frac{\partial l}{\partial x^n_i}$ and are passed to the first layer of the network. Equation (7) is then used to calculate the gradient of the loss function with respect to $k^n_{ij}$.
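Equations (5)-(7) reduce to the standard forward and backward rules sketched below in NumPy (our simplification: k is treated as a per-sample weight matrix rather than a convolution kernel, and the function names are ours).

```python
import numpy as np

def forward(x, k):
    """Equation (5): y_j = sum_i k_ij * x_i for one sample."""
    return k.T @ x                     # x: (n_in,), k: (n_in, n_out) -> y: (n_out,)

def backward(x, k, grad_y):
    """Equations (6) and (7): gradients of the loss w.r.t. x and k."""
    grad_x = k @ grad_y                # dL/dx_i = sum_j (dL/dy_j) * k_ij
    grad_k = np.outer(x, grad_y)       # dL/dk_ij = (dL/dy_j) * x_i
    return grad_x, grad_k
```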
RunPool Function: Let $k$ be a function of the length of the input and the depth of the network in the computing model, where $l$ is the current convolution layer in the hidden part and $L$ is the total number of convolution layers. In the initial process, we determine $k_{top}$ as a constant value for the pooling layer calculation. The function needs the input matrices and some padding to calculate the extracted features, and it computes $k_l$ by calculating the network parameters in each window.
In the CNN process, each convolutional filter produces a matrix of hidden variables, and the pooling layer is utilized to reduce the size of that matrix. Max pooling is a procedure that takes an $N_{in} \times N_{in}$ input matrix and produces a smaller output matrix (feature map) of size $N_{out} \times N_{out}$. This is achieved by dividing the $N_{in} \times N_{in}$ square into $N^2_{out}$ pooling regions $P_{i,j}$:

$$P_{i,j} \subset \{1, 2, \ldots, N_{in}\}^2 \quad \text{for each } (i, j) \in \{1, \ldots, N_{out}\}^2$$
Pooling layers are downsampling layers that combine the output of a region into a single neuron. The model calculates the input data in the hidden layer, where $n_i$ represents the size of the $i$-th dimension, $f$ the filter size, and $s$ the stride. The output dimension of the pooling layer is then:

$$\left[\frac{n_1 - f}{s} + 1\right] \times \left[\frac{n_2 - f}{s} + 1\right] \times n_c \tag{8}$$
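The output size in Equation (8) can be checked with a small helper (a sketch; the function name is ours, and padding is assumed to be zero here).

```python
def pool_output_dims(n1, n2, nc, f, s):
    """Output dimensions after pooling: [(n1 - f)/s + 1] x [(n2 - f)/s + 1] x nc."""
    return (n1 - f) // s + 1, (n2 - f) // s + 1, nc

# Example: a 28x28 map with 16 channels, 2x2 window, stride 2 -> (14, 14, 16)
print(pool_output_dims(28, 28, 16, f=2, s=2))
```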
Inspired by the DCNN model in [13] with one-dimensional convolution, we construct a function to obtain an appropriate k-max pooling value by using more layer parameters with dynamic values, and we use the Gaussian distribution to calculate the final $k_{map}$. In forward propagation, each $N_{in} \times N_{in}$ part produces a single "winning unit" value. We construct a function to obtain the value at each layer instead of choosing the exact maximum value from layer $l$. Hence, we get the dynamic winning unit by calculating a simple pooling function to obtain the $k$ value of layer $l$. We calculate $k_{map}$, and from it the $k_l$ value (the $k$ value of layer $l$), as:

$$k_{map} = \frac{C\,(L - l)}{L}, \qquad k_l = \max\!\left(k_{top}, k_{map}\right) \tag{9}$$
From the function above, RunPool selects $k_{top}$ as the current pooling value if $k_{top} \ge k_{map}$ and selects $k_{map}$ otherwise. The aim of Equation (9) is to find the $k_l$ value by comparing the constant $k_{top}$ with the $k$ value produced by the parameter calculation. It is a dynamic pooling function that calculates window parameters to take the pooling value in each hidden layer. The model uses wide convolution in each window to ensure that all weights act on the entire parameter set. To obtain the final value of $k_{map}$, we multiply it by a Gaussian random value $r$ so that it becomes a positive integer, $k_{map} \in \mathbb{Z}^+$. Figure 3 depicts how a pooling value is obtained in the RunPool function.
The main idea of the function is to take some parameters of the current layer to obtain an appropriate pooling value in the current hidden layer. In the RunPool function, $l$ is the current convolution in the hidden layer, $L$ is the total number of convolution layers, and $k_{top}$ is the fixed max pooling parameter for the topmost convolutional layer. To obtain the optimal max pooling value, we calculate the RunPool function with Algorithm 1.
Figure 3: Obtaining a pooling value by using the RunPool function in the proposed CNN. The proposed CNN computes the input $d_{input} = x_1, x_2, \ldots, x_n$ and some window parameters $d_{layer}$ to get $k_{map}$. At the final step, it compares the constant $k_{top}$ with the obtained $k$ value to take the pooling value.

Algorithm 1 describes how to calculate the network parameters to take the $k_{map}$ value. RunPool tries to find an appropriate $k_{map}$ value based on the parameters of the current window and harnesses the parameter $k$ to obtain the optimal k-max pooling value. In the proposed scheme, the model calculates the window parameters to obtain a dynamic $k_{map}$ value. The $k_{map}$ function is derived from the input or other aspects of the network. The function calculates a pooling window by turning vectors of different lengths into the same length. The convolutional layers run the feature extraction to capture a better representation in the hidden layer. To calculate $k_{map}$, the function obtains the length of the input matrices $C = C_1, C_2, \ldots, C_n$ as the representation of the 1D dataset, including text, characters, and numeric data. Finally, we get the final integer value of $k_{map}$ after multiplying it by a Gaussian random variable.
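Our reading of Equation (9) and Algorithm 1, written as a hedged sketch (the Gaussian parameters, the rounding rule, and the function name are our assumptions, since the published algorithm listing is not reproduced here): compute k_map from the input length C, the layer index l, and the depth L, scale it by a Gaussian random value, and take the maximum with k_top.

```python
import numpy as np

def runpool_k(C, l, L, k_top, mu=0.5, sigma=0.15, rng=None):
    """Sketch of the RunPool k value for hidden layer l.

    C:     length of the input feature vector
    l:     index of the current convolution layer
    L:     total number of convolution layers
    k_top: fixed pooling parameter for the topmost layer
    """
    rng = rng or np.random.default_rng()
    k_map = C * (L - l) / L                        # Equation (9), first part
    r = abs(rng.normal(mu, sigma))                 # Gaussian scaling, Equation (4)
    k_map = max(1, int(round(k_map * r)))          # positive integer k_map
    return max(k_top, k_map)                       # k_l = max(k_top, k_map)
```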
We activate the pooling function between the convolutions and before the fully connected (FC) layer. The FC layer is an efficient way of learning nonlinear combinations to classify the data into the target categories $y = y_1, y_2$. It connects each neuron in one layer to every neuron in the next layer. The input to the FC layer is the flattened output from the final pooling or convolutional layer.
To calculate the error loss of the model, the experiment utilizes the cross-entropy (CE) loss function:

$$J(\theta) = -\frac{1}{T_{mb}} \sum_{t=0}^{T_{mb}-1} \sum_{f=0}^{F_N-1} \delta^{(t)}_{f y} \ln h^{(t)}_f \tag{10}$$

In the above criterion function, $T_{train}$ is the number of training examples, $T_{mb}$ is the number of training examples in a mini-batch, $h^{(t)}_f$ is the output neuron activation for example $t$, and $\delta^{(t)}_{f y}$ is the indicator that selects the neuron corresponding to the target label. In this experiment, we train the model by calculating the CE on binary classification problems because it is the most common loss function for classification. CE yields a probability-distribution output and is the preferred loss function with an FC activation such as SoftMax.
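Equation (10) is the familiar mini-batch cross-entropy; the NumPy sketch below (illustrative only, with our own variable names) computes it from softmax outputs and integer class labels.

```python
import numpy as np

def cross_entropy(probs, targets):
    """Mean cross-entropy over a mini-batch (Equation (10)).

    probs:   (T_mb, F_N) array of softmax outputs h_f^(t)
    targets: (T_mb,) array of integer class labels
    """
    t_mb = probs.shape[0]
    picked = probs[np.arange(t_mb), targets]   # probability assigned to the true class
    return -np.mean(np.log(picked + 1e-12))    # small constant avoids log(0)
```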
3. EXPERIMENTAL SETUP
In this study, we experiment with a large amount of 1D data, including numerical and text datasets. We gather a large dataset to construct a benchmark sample for training and testing the model. In this paper, we focus on 1D datasets, including text, character, and numeric data, to measure the pooling performance in training the model.
3.1. OSN Classification
In this experiment, we collect a large online social network (OSN) dataset as the primary dataset and train on it as a binary classification problem. An OSN is a large environment with millions of users and a large amount of data. OSNs have become popular applications for sharing many data items, including videos, photos, and messages. The main trait of an OSN is the social relation: it contains a large number of accounts that post, send, and share data items across the platform, so it is easy to gather much information across the sphere. An OSN is a dense interconnection among the users of the network; in reality, the OSN model is mostly multirelational and contains a large number of nodes. In the research literature, the OSN is usually depicted as a graph whose nodes represent actors who build diverse relationship-type edges that reflect actor interactions [14]. In recent years, OSNs have become an active research interest in several fields [15].
In the OSN environment, analyzing the data characteristics can produce patterns of knowledge about the underlying OSN activities and hidden regularities. Link prediction, one of the hot topics in social network research for over a decade, draws a great amount of attention from researchers in various disciplines [16]. However, the OSN environment still has security issues, and they have become a primary concern in the fast-growing OSN. Protection techniques are among the most critical aspects of an OSN: the shortcomings of security and privacy-preserving models put user data at risk. Many papers propose various techniques to address OSN security issues [17,18].
Commonly, the OSN system conducts anomaly detection by monitoring sudden discrepancies in user activity. If a user's activities change abruptly over a period, the OSN provider catches the suspicious account through the change in access pattern. If the provider fails to gain this information and the account behaves with an unusual pattern, the anomaly can infect the system with existing fraudulent content [19]. However, suspicious activity exhibits different patterns, types of data, levels of interaction, and differences in time usage. To identify the anomalies, the users' features need to be classified in detail. In this paper, we focus on building a classification model with supervised learning by computing the OSN dataset in the training process.
3.2. Dataset
In this experiment, we provide the OSN dataset as a 1D dataset and feed the features into the supervised learning algorithm. At the first stage, we collect an OSN dataset consisting of some high-level OSN features, such as user profiles Pu and relations Ru. A user profile Pu refers to a real natural member: an individual account in the OSN that consists of the user representation, including username, age, location, picture information, URL information, and other attributes. Ru represents the user's social connections in the OSN, including how many friends are in their network.

In this paper, we define the OSN as a graph G(P, R), where P = {p1, p2, … pn} is the set of user identities and R = {R1, R2, … Rn} is the collection of OSN relations. The user relationships include normal relations (blue lines), malicious relations (yellow lines), and fake relations (red lines). Figure 4 depicts the OSN graph G(P, R) with the types of relations and user properties.
Figure 4 OSN graph G(P, R), which consists of
high-profile features, including user link information
and profiles.
3.2.1. OSN link features
Current research discusses various DL methods for link prediction [20,21]. In OSN communities, anomalous links have become one of the primary concerns of OSN security research. Initially, the source of menace comes from suspicious user activities: an infected user automatically spreads fake information or deceives users in order to target other regular users. If a fragile system cannot adopt a threat model, it becomes an easy target for phishing or compromising attacks. Anomalies come in diverse types of threat, such as vertical and horizontal attacks: a vertical attack is usually a threat to the providers, while a horizontal one poses attacks on other OSN users [22].
Current studies discuss fake account detection in social networks using learning methods. DeepScan proposes malicious account detection with DL in Location-Based Social Networks (LBSN) [23]. Another scheme utilizes the similarity of the user's friends to calculate the adjacency matrix of the graph and identify the user as benign or fake [24]. Analyzing social relations is a new way of determining anomalous users by mapping social ties and geographical proximity among users; it calculates OSN factors like friendship, common interest, or alliance to determine chosen friends. Many researchers have proposed methods to construct OSN security models. Therefore, in this paper, we build a learning classifier as a new way of protection by analyzing the OSN dataset.
We gather the OSN link dataset from several public OSNs, especially undirected OSN graphs. In the security case, link prediction is a crucial part of analyzing the structure of the OSN as a security scheme. Instead of exploring the entire graph, we gather the neighborhood link information by aggregating the features. For this experiment, we collect topology information of the network, which is available from the data domain. We gather the undirected OSN dataset from Facebook and Flickr samples to build the benchmark dataset. Table 2 displays the dataset gathered for the experiment.
Table 2: Dataset of link prediction and classification.

Properties | Facebook Dataset | Flickr Dataset
Size | 63,731 vertices (users) | 1,715,255 vertices (users)
Volume | 817,035 edges (friendships) | 15,551,250 edges (links)
Average degree | 25.640 edges/vertex | 18.133 edges/vertex
Size of LCC | 63,731 vertices (users) | 1,624,991 vertices (users)
Gini coefficient | 64.3% | 88.2%
Relative edge distribution entropy | 93.1% | 82.2%
Assortativity | 0.17702 | –0.015282
Clustering coefficient | 14.8% | 11.2%

LCC, Largest Connected Component.
This corpus contains the undirected network links of OSN users. In the dataset, a node denotes a user, and an edge indicates a friendship between two users. The datasets contain a subset sample of the total friendship graph. Notably, measuring social relations with supervised learning such as a CNN is a new approach in OSN malicious link detection. The main problem of node relation analysis is to quantify links among users and classify the user categories. Finally, we feed the sample into the learning model to train on the OSN attributes (user relations and profile features) and classify the OSN accounts.
3.2.2. OSN profile features
In this part, we gather OSN user-profile information as the experiment dataset; it contains many features that can be trained on for building a classification model. Public OSNs usually expose these attributes for computing purposes. It is a way to handle user and activity anomaly problems: activity variations among accounts with contrasting behavior from different OSN groups cause horizontal anomalies [25]. Thus, we take advantage of the vast user information as the sample to construct an effective learning model for OSN protection.

Firstly, we define the OSN as a graph G(P, R), where P = {p1, p2, … pn} is the set of user profiles as attributes. We provide some essential user features that represent interaction among OSN users.

In this study, we provide sample OSN profiles with more than 3000 unique users. We build a dataset by decomposing some user-profile information. For the experiment dataset, the paper observes user data of around 3000 OSN members: 1500 users labeled with normal profiles and 1500 users labeled with fake profiles. We split the data into training and testing datasets to obtain realistic accuracy and loss.
3.3. Data Preprocessing
The aim of data preprocessing is to convert the raw dataset at hand into the input vectors for NN training. In preprocessing, we separate the several features into parts to preserve the context of each token; a feature gets a different value when it is extracted from the identity, hostname, location, time, or other attributes of the OSN. The stage includes vectorization, normalization, handling missing values, and feature extraction.

In the data preprocessing, we split the dataset into training and test parts. For training the model, we employ the training data and evaluate the model on the validation data. After finishing model validation, we test the final model on the test data. For testing the model, we need a wholly different and unseen dataset, so the model is unable to gain access to any information about the test dataset.
This study suffers from an imbalanced dataset, because about 90%–95% of the data belong to the same majority class (benign accounts). Training on such data can cause the minority cluster (fake accounts) to be ignored, which can influence the overall classification accuracy. To deal with this issue, we utilize the Synthetic Minority Oversampling Technique (SMOTE) to balance the dataset [26]. It is a relevant approach when we want to preserve the similarity features of the nodes and avoid discarding information.
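A minimal example of the oversampling step using the imbalanced-learn implementation of SMOTE [26]; the toy arrays X and y below stand in for the real OSN features and labels.

```python
import numpy as np
from imblearn.over_sampling import SMOTE

# Toy stand-in for the OSN features: 95% benign (0), 5% fake (1)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = np.array([0] * 950 + [1] * 50)

X_balanced, y_balanced = SMOTE(random_state=42).fit_resample(X, y)
print(np.bincount(y_balanced))   # both classes now contain 950 samples
```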
In the initial preprocessing, we utilize a function to convert the integer and string dataset into the input matrix. This technique is needed to convert the features into vectors, because the raw input may be an array of data such as integers or strings that cannot be processed directly in neural network training. In the final phase of preprocessing, we separate the dataset into training and testing datasets to test the model on unseen data. We have two labels for the evaluation dataset: benign and malicious.
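A hedged sketch of this vectorization and splitting step with scikit-learn; the toy feature columns and the specific encoders are our assumptions, since the paper does not name its tooling.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy profile features: one string column (location) plus two numeric columns
locations = np.array(["ID", "CN", "ID", "US", "CN", "ID"])
numeric = np.array([[120, 0.4], [5, 0.9], [300, 0.2], [17, 0.7], [80, 0.5], [210, 0.3]])
labels = np.array([0, 1, 0, 1, 1, 0])                      # 0 = benign, 1 = malicious

loc_encoded = LabelEncoder().fit_transform(locations).reshape(-1, 1)
X = StandardScaler().fit_transform(np.hstack([loc_encoded, numeric]))

# Hold out an unseen test split, stratified so both labels appear in each part
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=2, stratify=labels, random_state=0)
```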
We also feed in a small amount of noisy data so that the model is not bias-free and the task reflects a real-world classification problem. The model must then cope with the noisy data patterns, which push it toward overfitting in the training and testing process. After the preprocessing stage, the model feeds the features into the graph for classification. To train the model, we feed the dataset into the proposed CNN and make predictions on new unseen data in binary classification tasks.
4. EXPERIMENT RESULT
Based on the experimental setup, we conducted training and testing on two groups of OSN datasets. In this section, we show the generic pooling layer's performance in link classification and fake profile detection on the benchmark OSN dataset, and we report the accuracy and loss of the pooling scheme. The experiment also provides an evaluation of the model's performance using cross-validation, calculating Recall, Precision, F1 score, and the Receiver Operating Characteristic (ROC) [27].
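These metrics can be computed directly with scikit-learn; the sketch below uses toy predictions (y_true, y_pred, y_score are placeholders, not results from the paper).

```python
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0]                      # ground-truth labels (toy)
y_pred = [0, 0, 1, 1, 0, 0, 1, 0]                      # hard predictions (toy)
y_score = [0.1, 0.2, 0.9, 0.8, 0.4, 0.3, 0.7, 0.2]     # predicted probabilities (toy)

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1 score: ", f1_score(y_true, y_pred))
print("ROC AUC:  ", roc_auc_score(y_true, y_score))
```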
4.1. OSN Link Classification
In the link prediction problem, there is a trade-off between accuracy and running time, which we explored by tuning diverse hyperparameters of the neural network. In the training and testing process, we tuned epoch = 500, batch size = 50, and learning rate lr = 0.01. Table 3 describes the proposed CNN's performance in the training and testing process.

We tested gradient descent and tuned the optimization algorithms with different hyperparameters on this sample. We employed gradient descent to minimize the objective function $j(\theta)$ with the model's parameters $\theta \in \mathbb{R}^d$, updating the parameters in the opposite direction of the gradient of the objective function $\nabla_\theta j(\theta)$.
Table 3: CNN with RunPool layer performance in link classification with different parameter settings.

Input | Hyperparameters | Optimizer | Training Loss | Testing Loss
FB link #50000 | Dropout p = 0.6, H1 = 64, H2 = 32, ktop = 5 | Adam | 0.5033 | 0.5069
 | | SGD, m = 0.5 | 0.5068 | 0.5070
 | | Adagrad | 0.5057 | 0.5059
 | | RMSProp | 0.5176 | 0.5223
Flickr #50000 | Dropout p = 0.6, H1 = 64, H2 = 32, ktop = 5 | Adam | 0.5052 | 0.5060
 | | SGD, m = 0.5 | 0.5178 | 0.5160
 | | Adagrad | 0.5075 | 0.5074
 | | RMSProp | 0.5092 | 0.5087

FB, Facebook.
The study set the learning rate η (lr = 0.01) to determine the stride size to reach a (local) minimum. In the training process, we obtained the best optimization results when training the model with the Adam optimizer. The experiment fed the Adam optimizer into the graph with different numbers of hidden layers; the results show that a dynamic optimizer can achieve a small loss with a large number of hidden layers.
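The reported settings (epoch = 500, batch size = 50, lr = 0.01, SGD momentum = 0.5) map onto standard PyTorch optimizers as in the sketch below; the paper does not state its framework, so the model stub and optimizer construction are assumptions.

```python
import torch

model = torch.nn.Sequential(                    # stand-in for the proposed CNN graph
    torch.nn.Linear(64, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))

optimizers = {
    "Adam":    torch.optim.Adam(model.parameters(), lr=0.01),
    "SGD":     torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5),
    "Adagrad": torch.optim.Adagrad(model.parameters(), lr=0.01),
    "RMSProp": torch.optim.RMSprop(model.parameters(), lr=0.01),
}
```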
Moreover, we tested the RunPool CNN with other gradient descent variants, such as stochastic gradient descent (SGD) with momentum, which also produces a competitive result. SGD with momentum (m = 0.5) can provide a better result when tuned with a larger number of epochs and a larger regularizer. However, the experiment found that adding more hidden layers does not reduce the loss for SGD. Using a large number of hidden layers causes overhead computation on a particular device, especially in CPU processing, and adding more layers and neurons to the CNN did not improve the predictive capability; it only enlarges the computing resources required. Thus, the RunPool pooling function can increase the accuracy without adding more hidden layers to the neural network.
In this experiment, we also calculate Recall, which is the percentage of truly malicious samples that are identified as malicious, and Precision, which is the rate of samples identified as malicious that actually are malicious. Finally, we want to strike a balance between Precision and Recall: we need to identify the OSN accounts that have malicious links, accepting a low precision only if the testing result is not significant. Hence, we use the F1 score as an optimal blend of Precision and Recall. Tables 4 and 5 show the Recall, Precision, and F1 score for both OSN datasets.
Based on the tables, supervised machine learning algorithms produce different accuracies: Naïve Bayes (78.77%), Logistic Regression (LR) (92.06%), Support Vector Machine (SVM) (92.47%, 92.52%), and a shallow Neural Network (NN) (94.10%). The proposed CNN architecture with the RunPool layer produces about 98.34% over several runs of the testing process with the benchmark dataset.
Table 4: Accuracy and F1 score with several machine learning algorithms for the Facebook dataset.

Classification Model | Accuracy | Average Precision and Recall | F1 Score
Bernoulli | 78.77 | 0.2123 | -
Logistic regression | 91.49 | 0.6803 | 0.7563
SVM (rs = 40) | 91.95 | 0.6974 | 0.7724
SVM (rs = 50) | 91.99 | 0.6990 | 0.7745
S-neural network | 94.10 | 0.9120 | 0.9210
Proposed approach | 98.04 | 0.9775 | 0.9595
Table 5: Accuracy and F1 score with several machine learning algorithms for the Flickr dataset.

Classification Model | Accuracy | Average Precision and Recall | F1 Score
Bernoulli | 78.77 | 0.2123 | -
Logistic regression | 92.06 | 0.7048 | 0.7787
SVM (rs = 40) | 92.47 | 0.7217 | 0.7897
SVM (rs = 50) | 92.52 | 0.7257 | 0.7938
S-Neural network | 94.10 | 0.9280 | 0.9321
Proposed approach | 98.34 | 0.9732 | 0.9846
To verify the model's performance, we added a small amount of noisy data to the dataset, which makes classification more difficult for common machine learning (ML) algorithms. Fortunately, the proposed CNN still achieves better accuracy than the other ML algorithms. We also calculate the Area Under the Curve (AUC), which yields the highest score of AUC = 0.9970, reflecting that this model can achieve a good level of prediction in the classification problem.
4.2. OSN Account Classification
In this section, we test the proposed RunPool with several learning parameters to classify user categories based on OSN profile attributes. In the training process, the study was tested with some popular optimizers, including SGD with momentum, Adam, Adagrad, and RMSprop. The study calculates the gradient of the loss when training the network, which is the critical part for comparing training and test results. We use the local variable concept to compute the loss function and to compute the gradient of the loss with respect to the weights under various learning rates.
In this experiment, we tune the gradient descent and optimization algorithms with different hyperparameters. In NN training, choosing the values of the hyperparameters is one of the most important parts of increasing the quality of network training; by tuning appropriate hyperparameters in the CNN, we obtain the highest performance of the neural network. In the training and testing process, we tune epoch = 15 and batch size = 50, producing a trade-off between accuracy and running time. Table 6 describes the proposed CNN's performance in terms of validation accuracy and loss.
This experiment also tests the Precision, Recall, and F1 score to compare how significant the impact of the proposed pooling layer is in the classification assignment. Table 7 shows the Recall, Precision, and F1 score for different algorithms.

Based on the table, the supervised machine learning algorithms produce diverse accuracies: Bernoulli Naïve Bayes (86.90%), Gradient Boosting (GB) (90.66%), LR (90.49%), SVM (90.04%), and NN (92.10%). The proposed CNN architecture with the RunPool pooling layer produces 94.00% over several runs of the testing process with the benchmark dataset. We find that the proposed pooling layer achieves higher accuracy than the other common machine learning methods on the benchmark OSN user-profiles dataset. At the final stage, we also calculate the ROC curve for the binary classification; the ROC curve shows that the proposed model produces an AUC of more than 0.9, reflecting that the model can be a good classifier for training on a dynamic dataset such as an OSN.
4.3. Pooling Comparison
In this part, we compare RunPool pooling, max pooling, and average pooling in terms of model accuracy. We calculate the accuracy of the model combined with each pooling function, under the same hyperparameter tuning and environment, to measure the performance of each pooling type in the classification task. Table 8 displays the comparison among the different pooling functions using 2254 training samples and 545 validation samples.
In the first experiment, the study trains the classifier without a pooling layer, meaning the CNN replaces the pooling layer with strided convolution.
Table 6: Classification results of the proposed CNN with RunPool layer with different parameter settings.

Parameters | Optimizer | Val-Loss | Val-Accuracy
Dropout p = 0.5, H1 = 32, H2 = 64, ktop = 7 | Adam | 0.3633 | 0.9131
 | SGD, m = 0.9 | 0.2087 | 0.9291
 | Adagrad | 0.3620 | 0.9025
 | RMSProp | 0.4049 | 0.8794
Dropout p = 0.3, H1 = 16, H2 = 32, ktop = 5 | Adam | 0.4018 | 0.8954
 | SGD, m = 0.9 | 0.2143 | 0.9344
 | Adagrad | 0.3559 | 0.8989
 | RMSProp | 0.4017 | 0.8865
Table 7: Comparison between some common classification algorithms and the proposed approach on the OSN dataset.

Classification Model | Precision | Recall | F1 Score
Bernoulli (Naïve Bayes) | 86.90 | 86.94 | 87.00
Gradient Boosting (n estim = 50) | 90.66 | 90.69 | 91.00
Logistic Regression | 90.47 | 90.59 | 90.61
SVM (rs = 31.6) | 90.04 | 87.34 | 87.24
S-Neural Network | 92.10 | 92.80 | 92.91
Proposed Approach | 94.00 | 93.21 | 93.42
Table 8: Comparison of validation accuracy among pooling functions when training with the same hyperparameter tuning and dataset.

Pooling Type | Epoch = 7 | Epoch = 10 | Epoch = 15
Average Pooling | 91.13% | 91.49% | 87.94%
Max Pooling | 91.21% | 91.31% | 89.89%
Max. Pool + RunPool | 91.67% | 91.67% | 89.36%
Avg. Pool + RunPool | 91.40% | 91.89% | 88.48%
RunPool Pooling | 92.38% | 92.54% | 91.08%
However, a CNN without pooling cannot produce a good result over diverse training periods. In this case, the CNN requires pooling layers to reduce the dimensions of the output features and to provide some amount of invariance. Based on the training results, a CNN with pooling layers is better than one without pooling.
To confirm the relative performance of RunPool pooling, max pooling, and average pooling, we add the pooling layers to the CNN. Max pooling and average pooling are deterministic pooling types that take the pooling value as the maximum or average score of a pooling window. Instead of using ordinary deterministic pooling, we test RunPool as a nondeterministic pooling layer to train the CNN model in the classification task. The experiment shows that the proposed CNN with RunPool pooling achieves promising accuracy for the account classification task. However, the experiment suffers from decreasing accuracy when trained with a large number of epochs (many iterations); this can be addressed by providing more data for training the model.
At the final stage, we evaluate the classifier using the ROC curve and AUC scores. This technique is used to assess the false alerts of malicious detection. Based on the experiment, the graph depicts a comparison among the pooling layers in terms of AUC score and False Positive Rate (FPR). Figure 5 depicts the AUC comparison among the three pooling functions in malicious classification using the OSN features as the dataset.
Figure 5 shows that the AUC score of the proposed approach is higher than those of the conventional pooling layers. In this part, we plot the AUC for binary classification by adopting the problem of distributed observations (X1; Y1) and (X2; Y2). The AUC represents the percentage of the area under the ROC curve, ranging between 0 and 1. The proposed model achieves a promising result with AUC = 0.9889. Thus, the AUC reflects that the classifier can discriminate between the positive and negative classes (malicious and benign labels) effectively.
Figure 5 Comparison result of the AUC score among Max
Pooling, Average Pooling, and RunPool Pooling layer in the
malicious classification task.
To sum up the benefit of the proposed pooling layer in a CNN architecture with a 1D dataset, we have presented the training and validation results calculated on a huge sample (OSN features). We find that the proposed CNN with RunPool achieves efficient results and increases the learning performance of the neural network. In DL research, the proposed pooling technique can be an alternative way to train a DL model, not only for 1D datasets (text) but also for 2D datasets (images) and 3D datasets (video and voice).
5. CONCLUSION
In this paper, we introduce a new pooling layer called RunPool to find optimal pooling values within the pooling window. We propose the pooling function to deal with deterministic pooling issues by harnessing some network parameters to obtain an informative feature map. It calculates some layer parameters and Gaussian-distributed numbers based on the ktop initialization. The function then compares kmap and ktop to achieve the best pooling value in each hidden layer.
In the test process, we find that a dynamic optimizer is better for achieving high accuracy and small loss. On the other hand, SGD with momentum also produces a competitive result: it can provide accurate results with minimal loss when tuned with a larger number of epochs and a larger regularizer, making it an excellent option for training the CNN RunPool model. In fake OSN account classification, we compare the proposed CNN's accuracy to supervised ML, including Bernoulli Naïve Bayes (86.90%), GB (90.66%), LR (90.49%), and SVM (90.04%). The proposed CNN achieves the highest score at 94.00%.
Demonstrated by the experiment with the 1D OSN dataset, it is confirmed that RunPool can improve the performance of a CNN and address the drawbacks of common max pooling and average pooling. This type of stochastic pooling can also be combined with the common pooling layers to produce better accuracy with a tiny loss. By implementing a CNN with RunPool, we gain an increase in the graph's performance and better accuracy in the validation process for the classification case. Beyond accuracy and loss, we also compute evaluation metrics to measure the performance of the classifier; calculating the Precision, Recall, and F1 score, we obtain a promising result, and the proposed model achieves the highest AUC score.
As future directions for malicious classification studies with DL, an exploration is needed of handling accounts and malware hierarchy links that reflect the source of the threat. Then, neural network computation needs to be explored with new techniques such as an adaptive loss function rather than a regular loss function. We also suggest that the next study test this model on 2D (image) and 3D (video or audio) datasets to measure training and testing accuracy.
CONFLICT OF INTEREST
The authors declare they have no conflicts of interest.
AUTHORS’ CONTRIBUTIONS
All authors contributed to this research.
ACKNOWLEDGMENTS
This work was conducted in the Institute of Research in Information Processing Laboratory, Harbin University of Science and Technology, under CSC Scholarship (No. 2016BSZ263). This project is available on GitHub.
REFERENCES
[1] K. Simonyan, A. Zisserman, Very deep convolutional networks for
large-scale image recognition, arXiv/1409.1556, 2014.
[2] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification
with deep convolutional neural networks, in Advances in Neu-
ral Information Processing Systems (NIPS), Lake Tahoe, Nevada,
2012, pp. 1106–1114.
[3] C. Szegedy, S. Ioffe, V. Vanhoucke, A.A. Alemi, Inception-v4,
inceptionresnet and the impact of residual connections on learn-
ing, in Association for the Advancement of Artificial Intelligence
(AAAI), San Francisco, 2017, pp. 4278–4284.
[4] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, Cam-
bridge, MIT Press, 2016.
[5] M.D. Zeiler, R. Fergus, Stochastic pooling for regularization of
deep convolutional neural networks, arXiv/1301.3557, 2013.
[6] C. Vincent, L. Spranger, M. Seuret, A. Nicolaou, P. Král, A. Maier,
Deep generalized max pooling, arXiv: 1908.05040, 2019.
[7] E. Hayoung, H. Choi, Alpha-pooling for convolutional neural net-
works, arXiv: 1811.03436v1 [cs.LG], 2018.
[8] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature. 521
(2015), 436–444.
[9] L. Marcelo, T. Issa, W. Isaac, S. Mohammad, Authorship verifica-
tion using deep belief network system, Int. J. Commun. Syst. 30
(2017), e3259.
[10] Y. Taigman, M. Yang, M. Ranzato, L. Wolf, Deepface: closing the
gap to human-level performance in face verification, in Com-
puter Vision and Pattern Recognition (CVPR), IEEE Conference
on Pattern Recognition, Columbus, 2014, pp. 1701–1708.
[11] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learn-
ing applied to document recognition, Proc. IEEE. 86 (1998),
2278–2324.
[12] D. Yu, H. Wang, P. Chen, Z. Wei, Mixed pooling for convolutional
neural networks, in International Conference on Rough Sets and
Knowledge Technology, Shanghai, China, 2014.
[13] N. Kalchbrenner, E. Grefenstette, P. Blunsom, A convolutional
neural network for modelling sentences, in Proceedings of the
52nd Annual Meeting of the Association for Computational Lin-
guistics, Association for Computational Linguistics, Baltimore,
2014.
[14] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, D.U. Hwang,
Complex networks: structure and dynamics, Phy. Rep. 424 (2006),
175–308.
[15] P. Wanda, H.J. Jie, URLDeep: continuous prediction of malicious
URL with dynamic deep learning in social networks, Int. J. Netw.
Secur. 21 (2019), 971–978.
[16] P. Wang, B. Xu, Y. Wu, X. Zhou, Link prediction in social networks:
the state-of-the-art, Sci. China Inf. Sci. 58 (2015), 1–38.
[17] W. Putra, S. Selo, B.S. Hantono, Model of secure P2P mobile
instant messaging based on virtual network, in 2014 International
Conference on Information Technology Systems and Innovation
(ICITSI), Bandung, 2014, pp. 81–85.
[18] P. Wanda, S. Selo, B.S. Hantono, Efficient message security
based HyperElliptic Curve Cryptosystem (HECC) for mobile
instant messenger, in 2014 The 1st International Conference on
Information Technology, Computer, and Electrical Engineering,
Semarang, 2014, pp. 245–249.
[19] M.G. Vigliotti, C. Hankin, Discovery of anomalous behaviour in
temporal networks, Soc. Netw. 41 (2015), 18–25.
[20] T. Li, B. Wang, Y. Jiang, Y. Zhang, Y. Yan, Restricted Boltzmann
machine-based approaches for link prediction in dynamic net-
works, IEEE Access. 6 (2018), 29940–29951.
[21] T. Li, J. Zhang, P.S. Yu, Y. Zhang, Y. Yan, Deep dynamic net-
work embedding for link prediction, IEEE Access. 6 (2018),
29219–29230.
[22] P. Bindu, S. Thilagam, Mining social networks for anomalies:
methods and challenges, J. Netw. Comput. Appl. 68 (2016),
213–229.
[23] Q. Gong, Y. Chen, X. He, Z. Zhuang, T. Wang, H. Huang, X. Wang,
X. Fu, DeepScan: exploiting deep learning for malicious account
detection in location-based social networks, IEEE Commun. Mag.
56 (2018), 21–27.
[24] M. Mohammadrezaei, M.E. Shiri, A.M. Rahmani, Identifying fake
accounts on social networks based on graph analysis and classifi-
cation algorithms, Hindawi Secur. Commun. Netw. 2018 (2018),
1–8.
[25] D. Savage, X. Zhang, X. Yu, P. Chou, Q. Wang, Anomaly detection
in online social networks, Social Netw. 39 (2014), 62–70.
[26] N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, SMOTE:
synthetic minority over-sampling technique, J. Artif. Intell. Res.
16 (2002), 321–357.
[27] J.A. Swets, Signal Detection Theory and ROC Analysis in Psychol-
ogy and Diagnostics: Collected Papers, Lawrence Erlbaum Asso-
ciates, Mahwah, 1996.

RunPool: A Dynamic Pooling Layer for Convolution Neural Network

Because max pooling retains only the single most significant activation, it can discard the opportunity to exploit other informative features in large and dynamic input data. Moreover, with max or average pooling schemes one cannot predict which parts of the input will be selected.

One line of work replaces conventional deterministic pooling with a stochastic model to increase pooling-layer performance: it randomly selects an activation within each pooling region according to a multinomial distribution [5]. A recent study introduces Deep Generalized Max Pooling, which balances the contribution of all activations of a region by reweighting all descriptors [6]. Another paper proposes alpha-pooling, which utilizes a trainable parameter α to decide the type of pooling; it defines a family of pooling schemes that includes max pooling and average pooling as special cases. Experiments show that alpha-pooling can improve the accuracy of image recognition, and a comparison of several pooling layers found that max pooling was not the optimal method [7].

Besides these, there are many other pooling schemes, including the geometric and harmonic averages, yet hand-picking the best pooling method for each task is still challenging and is not in line with the philosophy of DL. In a dynamic neural network architecture, it is not practically possible to select a proper pooling technique for every layer by hand; this is the primary reason why a single fixed pooling operator cannot serve dynamic data problems. To deal with these limitations, we need an optimal pooling method as an alternative to max pooling and average pooling, one that does not merely select a value randomly but can also
predict which parts of the input will be selected as the most appropriate pooling value.

In this paper, we introduce a generic pooling layer as an option for constructing CNN architectures. Based on our survey, it is the one pooling layer that uses the network parameters of each layer to determine the pooling value in the hidden layer. The main contributions of our work are as follows:

1. We introduce a new pooling layer for the CNN architecture. Different from max and average pooling, this layer applies a magnetic-field concept to calculate the pooling value, dynamically changing the pooling values for each convolution in the hidden layer.

2. We construct the pooling layer by collecting several hyperparameters in each layer and calculating them against a range distribution. With this scheme, the architecture achieves higher accuracy and a smaller loss than conventional CNN architectures built on max pooling or average pooling layers.

3. Unlike traditional pooling, which takes the highest activation in the pooling window or the average value of the window, the proposed layer searches for an appropriate pooling value by calculating several informative parameters in each layer. It is a novel pooling approach for CNN architectures that helps reduce overfitting while training a model.

The rest of the paper is structured as follows: Section 2 gives insight into the proposed technique; Section 3 presents the experimental setup, consisting of a social network case study, the dataset, and data preprocessing; Section 4 describes the results and discussion; and Section 5 offers the conclusion and highlights some open research questions in CNN architecture.

2. PROPOSED TECHNIQUE

As a hot research topic, DL is becoming an increasingly popular approach for solving diverse applications. CNN algorithms imitate the structure and function of the human brain [8]. In recent years, CNN has achieved state-of-the-art results on a variety of challenging problems in computer vision and speech recognition, and research communities have applied the concept to many computing tasks. Several papers explore the use of deep belief networks (DBN) for authorship verification with a Gaussian-Bernoulli DBN [9], face verification with a deep neural network (DNN) [10], and document recognition [11].

2.1. Convolutional Neural Network

Most current DL research proposes CNN architectures to deal with various problems, including image, voice, video, and object detection [8]. In this paper, we construct a CNN architecture by adding a new pooling layer to the neural network. We refer to the graph as a dynamic CNN: a CNN graph with the RunPool pooling function that has three parts, namely an input layer, hidden layers, and an output layer. In the input layer, the network projects the extracted features as input matrices. To compute the feature extraction result, we adopt convolution processing with weight sharing; thus, it is unnecessary to optimize many parameters, and the method can converge faster.
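To make the weight-sharing idea concrete, the short NumPy sketch below slides a single shared filter across a 1D input vector; the array sizes and filter values are illustrative assumptions, not values taken from the paper.

    import numpy as np

    def conv1d_valid(x, w, b=0.0):
        """Slide one shared filter w across input x ('valid' convolution)."""
        n, f = len(x), len(w)
        out = np.empty(n - f + 1)
        for i in range(n - f + 1):
            # The same weights w are reused at every position (weight sharing).
            out[i] = np.dot(x[i:i + f], w) + b
        return out

    x = np.array([0.2, 0.5, 0.1, 0.9, 0.4, 0.7])  # toy 1D feature vector
    w = np.array([0.3, -0.1, 0.6])                # one shared filter of size 3
    print(conv1d_valid(x, w))                     # 4 output positions from 6 inputs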
This architecture adopts convolution to address multiple input feature maps and to produce stacked feature maps in the deeper layers. We denote a feature map of the ith layer by F_i and the n feature maps of layer i-1 by F_{i-1}^1, ..., F_{i-1}^n. Layer i applies a filter of size l to produce the feature map F_{i,l}^j, where the index j identifies a feature map within layer i. Each feature map F_{i-1}^k is convolved with a matrix M_{i,l}^{j,k}, which yields

F_{i,l}^{j} = \sum_{k=1}^{n} M_{i,l}^{j,k} * F_{i-1}^{k}    (1)

The model defines the convolution between filters and input matrices: a set of filters F ∈ R^{d×m} is convolved with the input matrix S ∈ R^{d×|s|}. The goal of the convolutional layer is to compute the feature extraction.

After the convolution operation, we construct a pooling layer to obtain the most informative values in the hidden units. Pooling aims to convert the joint feature representation into a more valuable one that preserves important information while discarding irrelevant details. In operation, pooling reduces the resolution of the feature maps over a local neighborhood of the previous layer. Commonly, a CNN architecture employs conventional pooling schemes such as max pooling and average pooling to obtain the maximum or average value in the hidden layer. Max pooling chooses the largest element in each pooling region:

y_{kij} = \max_{(p,q) \in R_{ij}} x_{kpq}    (2)

where y_{kij} is the output of the pooling operator for the kth feature map, and x_{kpq} is the element at (p, q) within the pooling region R_{ij}, a local neighborhood around the position (i, j). Average pooling takes the arithmetic mean of the elements in each pooling region:

y_{kij} = \frac{1}{|R_{ij}|} \sum_{(p,q) \in R_{ij}} x_{kpq}    (3)

where |R_{ij}| denotes the size of the pooling region R_{ij}. Although these two pooling functions produce good results on some datasets, it is still unknown which will work better on a new task with a different dataset, so choosing the pooling operator is not straightforward. In some image recognition problems, both max pooling and average pooling have shortcomings; Figure 1 illustrates their drawbacks when detecting a diagonal line. Max pooling considers only the maximum element in the pooling region and ignores the others. In some cases, this leads to an unacceptable result: if most elements have high magnitudes, the distinguishing feature vanishes after max pooling.
Figure 1: Pooling process of an input feature, illustrating the drawbacks of max pooling and average pooling in a CNN.

On the other hand, average pooling gets worse if there are many zero elements among the input features, because the characteristic of the feature map is largely washed out [12]. Therefore, as a new way of downsampling the features, we consider replacing the deterministic pooling function with a stochastic scheme. Instead of using a standard pooling method, we introduce a new pooling function built on a nondeterministic pooling concept.

2.2. Main Idea

To construct a pooling function with a stochastic pooling paradigm, we take the magnetic field problem as the basic idea. The central question is how to establish a CNN architecture by optimizing a new pooling layer. We are inspired by a model that adopts dynamic k-max pooling to compute the engineered features in each layer: by choosing the hyperparameters of each layer, the function calculates the pooling value to obtain a pooling sequence over a 1D dataset [13]. To implement the magnetic field concept, we calculate the pooling layer with Gaussian law theory (a Gaussian random variable). At the initial step, we compute the pooling function by generating random numbers from a Gaussian distribution; we employ this random technique because randomness drawn from atmospheric noise performs better than common pseudo-random algorithms in several computer applications. A normal distribution has a bell-shaped density curve described by its mean μ and standard deviation σ. To calculate the Gaussian density over the range of numbers, we compute

f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}    (4)

In formula (4), μ represents the mean, median, mode, or expectation of the distribution, σ is the standard deviation, and σ² is the variance. To construct the pooling function, we multiply a Gaussian random number with k_map in each layer; k_map is a dynamic value that depends on the current parameters. Finally, the function compares k_map · f(x | μ, σ²) with the constant k_top to obtain the final k value for the current pooling layer.

2.3. RunPool Pooling

CNN is a well-known DL architecture that uses many identical copies of the same neuron, which allows extensive computation with a small number of parameters, and it interleaves many convolutional and pooling layers across the network. A CNN layer receives a single input (the feature maps) and computes output feature maps by convolving filters across them, so CNN can forgo manual feature engineering through automatic representation. One critical layer in CNN training is the pooling layer. Pooling acts as a generalizer of the lower-level data during training: it is a sliding-window technique that applies a statistical function to the contents of each window. There are two common pooling techniques, max pooling and average pooling. Max pooling employs the max function to pick the informative material, while average pooling harnesses the statistical mean of the window contents. Principally, a pooling function simplifies the output features by performing a nonlinear computation and reducing the number of parameters.
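As a concrete illustration of the max and average operators in Equations (2) and (3), the following NumPy sketch applies 2x2 max pooling and average pooling to a small feature map; the input values are made-up examples, not data from the paper.

    import numpy as np

    def pool2d(fmap, size=2, mode="max"):
        """Non-overlapping pooling over an H x W feature map (stride = size)."""
        h, w = fmap.shape
        out = np.empty((h // size, w // size))
        for i in range(h // size):
            for j in range(w // size):
                region = fmap[i*size:(i+1)*size, j*size:(j+1)*size]  # pooling region R_ij
                out[i, j] = region.max() if mode == "max" else region.mean()
        return out

    fmap = np.array([[0.1, 0.9, 0.0, 0.0],
                     [0.8, 0.2, 0.0, 0.7],
                     [0.0, 0.0, 0.5, 0.4],
                     [0.3, 0.0, 0.6, 0.1]])
    print(pool2d(fmap, mode="max"))   # Eq. (2): keeps only the largest element per region
    print(pool2d(fmap, mode="avg"))   # Eq. (3): zero elements dilute the region average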
Instead of using the conventional pooling layers, we construct a method that dynamically changes the pooling values based on the current network parameters. It takes the length of the input vector of the network and derives a feature-map pooling value according to the current parameters of each layer. The basic idea is to calculate the network elements of the current pooling map and to find an appropriate pooling value by comparing it with a constant pooling value; in other words, the pooling value is recomputed from the layer parameters so that the CNN learns a hierarchy of feature orders. Figure 2 depicts the CNN computation with the RunPool pooling layer in the visible and hidden layers.
Figure 2: The network topology of the proposed CNN with input d_input = x_1, x_2, ..., x_n and output d_output = y_1, y_2, with h hidden layers; the output is represented by Y = y_1, y_2, the input by x = x_1, x_2, ..., x_n, and the weight matrices by W_i ∈ R^{x_i × h_j}.

The hidden layer runs the convolution process using filters and the input matrix. In the model, a set of filters F ∈ R^{d×m} is convolved with the input matrix S ∈ R^{d×|s|}. The convolutional layer computes the feature extraction, such as word sequences, throughout the training process, and RunPool computes the matrix conversion based on the extracted features. Table 1 describes the mathematical notation of the RunPool function. We construct the function to allow a smooth extraction of higher-order features and to adapt to longer-range elements in the hidden layer. The model employs the k value at the intermediate (hidden) layers. The dynamic CNN calculates the forward pass with

y_{j}^{n} = \sum_{i} k_{ij}^{n} * x_{i}^{n}    (5)

Table 1: Mathematical notation of the RunPool pooling algorithm.

Notation   Description
f          Filter size
s          Stride
l          Index of the current layer
k_l        k value of the current layer
k_top      k value of the top layer
k_map      k value after calculation of the input features
L          Total number of layers
C          Length of the input vector (sequence of features)
p          Padding

k_top and L are constants.

In the above formula, the first network computes the feature maps as the input layer to the CNN. In Equation (5), x_i^n is the ith input feature map of sample n and k_{ij}^n is the ij input kernel of sample n; the jth map becomes the output feature map of the sample-n network. Backward and forward operations run simultaneously in one iteration. The CNN calculates the backward pass with

\frac{\partial l}{\partial x_{i}^{n}} = \sum_{j} \frac{\partial l}{\partial y_{j}^{n}} * k_{ij}^{n}    (6)

\frac{\partial l}{\partial k_{ij}^{n}} = \frac{\partial l}{\partial y_{j}^{n}} * x_{i}^{n}    (7)

In the backward pass, Equation (6) computes the gradient of the loss function l with respect to x_i^n; the gradient values are calculated by the partial derivative ∂l/∂x_i^n and are passed back to the first layer of the network. Equation (7) is then used to calculate the gradient of the loss function with respect to k_{ij}^n.

RunPool function: let k be a function of the length of the input and the depth of the network. Here l is the index of the current convolution in the hidden layer and L is the total number of convolutional layers. At the initial step, we set k_top as a constant value for the pooling-layer calculation. The function needs the input matrices and some padding to calculate the extracted features, and it computes k_l from the network parameters in each window. In the CNN process, each convolutional filter produces a matrix of hidden variables, and the pooling layer is used to reduce the size of that matrix. Max pooling is a procedure that takes an N_in × N_in input matrix and produces a smaller output matrix (feature map) of size N_out × N_out. This is achieved by dividing the N_in × N_in square into N_out² pooling regions P_{i,j}:

P_{i,j} \subset \{1, 2, \ldots, N_{in}\}^2 \quad \text{for each } (i, j) \in \{1, \ldots, N_{out}\}^2

Pooling layers are downsampling layers that combine the output of a region into a single neuron. The model calculates the input data in the hidden layer with n_i denoting the ith layer, f the filter size, and s the stride.
Then, the output dimension of the pooling layer, for an input of size n_1 × n_2 × n_c, is calculated as

\left[\frac{n_1 - f}{s} + 1\right] \times \left[\frac{n_2 - f}{s} + 1\right] \times n_c    (8)
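A minimal sketch of Equation (8): given the input dimensions, filter size, and stride (the values below are assumptions chosen only for demonstration), the pooled output size follows directly.

    def pool_output_dim(n1, n2, nc, f, s):
        """Output dimension of a pooling layer per Eq. (8)."""
        return ((n1 - f) // s + 1, (n2 - f) // s + 1, nc)

    # e.g., a 28 x 28 x 16 feature map pooled with a 2 x 2 window and stride 2
    print(pool_output_dim(28, 28, 16, f=2, s=2))  # -> (14, 14, 16)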
Inspired by the DCNN model in [13] with one-dimensional convolution, we construct a function that obtains an appropriate k-max pooling value by using more layer parameters with dynamic values, and we then use the Gaussian distribution to calculate the final k_map. In forward propagation, each N_in × N_in region produces a single "winning unit" value. Instead of choosing the exact maximum value of layer l, we construct a function that obtains a value at each layer; hence, we get the dynamic winning output by calculating a simple pooling function for the k value of layer l. We calculate k_map and the k_l value (the k value of layer l) as

k_{map} = \frac{C\,(L - l)}{L}, \qquad k_l = \max(k_{top}, k_{map})    (9)

From the function above, RunPool selects k_top as the current pooling value if k_top ≥ k_map, and selects k_map otherwise. The aim of Equation (9) is to find the k_l value by comparing the constant k_top with the k value produced by the parameter calculation. It is a dynamic pooling function that evaluates window parameters to take the pooling value in each hidden layer. The model uses wide convolution in each window to ensure that all weights act on the entire parameter range. To obtain the final value of k_map, we multiply it by a Gaussian random value r so that it becomes a positive integer, k_map = r | r ∈ ℤ⁺, r ≥ 0.

Figure 3 depicts how a pooling value is obtained in the RunPool function. The main idea is to take some parameters of the current layer to obtain an appropriate pooling value in the current hidden layer. In the RunPool function, l is the index of the current convolution in the hidden layer, L is the total number of convolutional layers, and k_top is the fixed max pooling parameter for the topmost convolutional layer. To obtain the optimal max pooling value, we calculate the RunPool function with Algorithm 1.

Figure 3: Obtaining a pooling value with the RunPool function in the proposed CNN. The proposed CNN computes the input d_input = x_1, x_2, ..., x_n and some window parameters d_layer to get k_map; at the final step, it compares the k_top constant with the obtained k value to take the pooling value.

Algorithm 1 describes how the network parameters are calculated to take the k_map value. RunPool tries to find an appropriate k_map value based on the parameters of the current window and harnesses the parameter k to obtain the optimal k-max pooling value. In the proposed scheme, the model calculates the window parameters to obtain a dynamic k_map value. The k_map function is derived from the input or from other aspects of the network, and it calculates a pooling window that turns vectors of different lengths into the same length. The convolutional layers run the feature extraction to capture a better representation in the hidden layer. To calculate k_map, the function takes the length of the input matrices C = C_1, C_2, ..., C_n as the representation of the 1D dataset, including text, characters, and numeric data. Finally, we get the final integer value of k_map after multiplying it by a Gaussian random variable.
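The following Python sketch shows one way to read Equation (9) together with the Gaussian multiplier described above. It is an interpretation under stated assumptions (how the random multiplier is parameterized and rounded is not fully specified in the text), not the authors' reference implementation.

    import numpy as np

    def runpool_k(C, l, L, k_top, mu=1.0, sigma=0.1, rng=None):
        """Dynamic pooling value per Eq. (9), scaled by a Gaussian random draw.

        C     : length of the input vector (sequence of features)
        l     : index of the current convolutional layer
        L     : total number of convolutional layers
        k_top : fixed pooling parameter for the topmost layer
        """
        rng = rng or np.random.default_rng()
        k_map = C * (L - l) / L                 # Eq. (9), layer-dependent part
        r = abs(rng.normal(mu, sigma))          # Gaussian multiplier (assumed |N(mu, sigma)|)
        k_map = max(1, int(round(k_map * r)))   # force a positive integer
        return max(k_top, k_map)                # keep at least k_top values

    # Example: a 50-element input, layer 1 of 3, k_top = 5 (illustrative numbers)
    print(runpool_k(C=50, l=1, L=3, k_top=5))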
We activate the pooling function between the convolutions and before the fully connected (FC) layer. The FC layer is an efficient way of learning nonlinear combinations to classify the data into the target categories y = y_1, y_2; it connects each neuron of one layer to every neuron of the next. The input to the FC layer is the flattened output of the final pooling or convolutional layer.

To calculate the error loss of the model, the experiment uses the cross-entropy (CE) loss function:

J(\theta) = -\frac{1}{T_{mb}} \sum_{t=0}^{T_{mb}-1} \sum_{f=0}^{F_N - 1} \delta_{f\,y^{(t)}} \ln h_f^{(t)}    (10)

In this criterion function, T_train is the number of training examples, T_mb is the number of training examples in a mini-batch, h_f^{(t)} is the output neuron f for example t, and δ_{f y^{(t)}} is the indicator that selects the neuron corresponding to the true label y^{(t)}. In this experiment, we train the model by calculating the CE on a binary classification problem, since it is the most commonly used loss function for classification; CE is applied to obtain a probability-distribution output and is a preferred loss for FC activations such as SoftMax.
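To make Equation (10) concrete, the sketch below computes the mini-batch cross-entropy from softmax outputs; the batch values are illustrative assumptions rather than data from the experiments.

    import numpy as np

    def cross_entropy(probs, labels):
        """Eq. (10): mean negative log-likelihood of the true class over a mini-batch.

        probs  : (T_mb, F_N) softmax outputs h_f^(t)
        labels : (T_mb,) integer class indices y^(t)
        """
        t = np.arange(len(labels))
        return -np.mean(np.log(probs[t, labels]))

    probs = np.array([[0.9, 0.1],     # toy mini-batch of softmax outputs
                      [0.3, 0.7],
                      [0.6, 0.4]])
    labels = np.array([0, 1, 0])
    print(cross_entropy(probs, labels))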
3. EXPERIMENTAL SETUP

In this study, we experiment with a large amount of 1D data, including numerical and text datasets. We gather a large corpus to construct a benchmark sample for training and testing the model. In this paper, we focus on 1D data, including texts, characters, and numeric values, to measure the pooling performance while training the model.

3.1. OSN Classification

In this experiment, we collect a large online social network (OSN) corpus as the primary dataset and train on it as a binary classification problem. An OSN is a large environment with a huge amount of data and millions of users; it has become a popular application for sharing data items such as videos, photos, and messages. The main trait of an OSN is the social relation: it contains a large number of accounts that post, send, and share data items across the platform, so it is easy to gather much information across the sphere. An OSN is a dense interconnection among the users of the network, and in reality its model is mostly multirelational with a large number of nodes. In the research literature, an OSN is depicted as a graph whose nodes represent actors and whose edges reflect the diverse types of relationships built by actor interactions [14]. In recent years, OSNs have become an active research interest in several fields [15].

In the OSN environment, analyzing the data characteristics can produce patterns of knowledge about the underlying activities and hidden regularities. Link prediction, one of the hot topics in social network research and an analysis problem for over a decade, draws a great amount of attention from researchers in various disciplines [16]. However, security issues remain the primary concern in the fast-growing OSN. Protection techniques are among the most critical aspects of an OSN, since shortcomings of the security and privacy-preserving models put user data at risk, and many papers propose techniques to address OSN security [17,18]. Commonly, an OSN system conducts anomaly detection by monitoring sudden discrepancies in user activity: if a user's activities change abruptly over a period, the OSN provider catches the suspicious account by the change of access pattern; if it fails to gain this information and the account keeps behaving with an unusual pattern, the anomaly can infect the system with existing fraud [19]. However, suspicious activity exhibits different patterns, data types, levels of interaction, and differences in time usage, so identifying anomalies requires classifying user features in detail. In this paper, we focus on building a classification model with supervised learning by computing the OSN dataset in the training process.

3.2. Dataset

In this experiment, we provide the OSN dataset as a 1D dataset and feed its features into the supervised learning algorithm. At the first stage, we collect an OSN dataset that consists of high-level OSN features, such as user profiles P_u and relations R_u. A user profile P_u refers to a real natural member: an individual account in the OSN that consists of a user representation including username, age, location, picture information, URL information, and other attributes. R_u represents the user's social connections in the OSN, including how many friends are in their network. We define the OSN as a graph G(P, R), where P = {p_1, p_2, ..., p_n} is the set of user identities and R = {R_1, R_2, ..., R_n} is the collection of OSN relations. A user's relationships comprise normal relations (blue lines), malicious relations (yellow lines), and fake relations (red lines). Figure 4 depicts the OSN graph G(P, R) with the types of relations and the user properties.
Figure 4: The OSN graph G(P, R), which consists of high-profile features, including user link information and profiles.

3.2.1. OSN link features

Current research discusses various DL methods for link prediction [20,21]. In OSN communities, anomalous links have become one of the primary concerns of OSN security research. Initially, the source of the menace is suspicious user activity: an infected user automatically spreads fake information or deceives other regular users, and if a fragile system cannot adapt to the threat model, it becomes an easy target for phishing or compromising attacks. Anomalies pose diverse types of threats, such as vertical and horizontal attacks; a vertical attack usually threatens the providers, while a horizontal one targets other OSN users [22]. Current studies also discuss fake-account detection in social networks using learning methods: DeepScan proposes malicious account detection with DL in location-based social networks (LBSN) [23], and another scheme uses the similarity of a user's friends to calculate the adjacency matrix of the graph and identify the user as benign or fake [24]. Analyzing social relations is a new way of determining anomalous users by mapping social ties and geography among users; it calculates OSN factors such as friendship, common interests, or alliances to determine chosen friends. Many researchers have proposed methods to construct OSN security models; in this paper, we build a learning classifier as a new form of protection by analyzing the OSN dataset.

We gather the OSN link dataset from several public OSNs, especially undirected OSN graphs. In the security case, link prediction is a crucial part of analyzing the OSN structure as a security scheme. Instead of exploring the entire graph, we gather the neighborhood link information by aggregating features, as sketched below. For this experiment, we collect the topology information of the network that is available from the data domain, and we gather the undirected OSN samples from Facebook and Flickr to build the benchmark dataset.
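As an illustration of aggregating neighborhood information from an undirected friendship graph, the sketch below uses NetworkX to compute simple per-pair features (degrees and common neighbors) for candidate links. The toy edges and the feature choice are assumptions for demonstration, not the paper's exact feature set.

    import networkx as nx

    # Toy undirected friendship graph: nodes are users, edges are friendships.
    G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])

    def link_features(G, u, v):
        """Neighborhood features for a candidate link (u, v)."""
        common = len(list(nx.common_neighbors(G, u, v)))
        return [G.degree(u), G.degree(v), common]

    # Feature vectors for a few candidate links (1D rows fed to the classifier).
    for pair in [(1, 4), (2, 5)]:
        print(pair, link_features(G, *pair))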
Table 2 displays the dataset gathered for the experiment.

Table 2: Dataset for link prediction and classification.

Properties                           Facebook Dataset              Flickr Dataset
Size                                 63,731 vertices (users)       1,715,255 vertices (users)
Volume                               817,035 edges (friendships)   15,551,250 edges (links)
Average degree                       25.640 edges/vertex           18.133 edges/vertex
Size of LCC                          63,731 vertices (users)       1,624,991 vertices (users)
Gini coefficient                     64.3%                         88.2%
Relative edge distribution entropy   93.1%                         82.2%
Assortativity                        0.17702                       -0.015282
Clustering coefficient               14.8%                         11.2%

LCC, Largest Connected Component.

This corpus contains the undirected network links of OSN users. In the dataset, a node denotes a user and an edge indicates a friendship between two users; the datasets contain a subset sample of the total friendship graph. Notably, measuring social relations with supervised learning such as a CNN is a new approach to OSN malicious-link detection. The main problem of node-relation analysis is to quantify the links among users and classify the user categories. Finally, we feed the sample into the learning model to train on the OSN attributes (user relations and profile features) and classify the OSN accounts.

3.2.2. OSN profile features

In this part, we gather OSN user-profile information as the experiment dataset; it contains many features that can be used to build a classification model. Public OSNs usually collect these attributes for computing purposes as a way to handle user and activity anomalies. Activity variations among accounts with contrasting behavior from different OSN groups cause horizontal anomalies [25]. Thus, we take advantage of the vast user information as the sample to construct an effective learning model for OSN protection. As before, we define the OSN as a graph G(P, R), where P = {p_1, p_2, ..., p_n} is the set of user profiles used as attributes, and we provide some essential user features that represent interaction among OSN users. In this study, we provide sample OSN profiles of more than 3000 unique users, built by decomposing user-profile information: about 3000 OSN members are observed, with 1500 labeled as normal profiles and 1500 labeled as fake profiles. We split the data into training and testing sets to obtain realistic accuracy and loss.

3.3. Data Preprocessing

The aim of data preprocessing is to convert the raw dataset at hand into the input vectors for NN training. During preprocessing, we separate the features into parts to preserve the context of each token; a feature gets a different value depending on whether it is extracted from the identity, hostname, location, time, or other OSN attributes. The stage includes vectorization, normalization, handling missing values, and feature extraction. We split the dataset into a training part and a test part: we train on the training data, evaluate on the validation data, and, after validation, test the final model on the test data. For testing, we use a wholly different and unseen dataset, and the model is prevented from gaining access to any information about the test set. This study also suffers from an unbalanced dataset, because about 90%–95% of the data belongs to the same majority class (benign accounts).
In the initial preprocessing, we use a function to convert the integer and string data into the input matrix. This conversion is necessary because the raw input may be an array of integers or strings that cannot be processed directly in neural network training. In the final phase of preprocessing, we separate the dataset into training and testing sets so the model can be tested on unseen data. The evaluation dataset has two labels: benign and malicious. We also add a small amount of noisy data so that the model cannot be trivially bias-free and the task reflects the real conditions of a classification problem; the model must cope with these noisy patterns, which pushes it toward overfitting during training and testing. After the preprocessing stage, the features are fed into the graph for classification. To train the model, we feed the dataset into the proposed CNN and make predictions on new, unseen data as a binary classification task.

4. EXPERIMENT RESULT

Based on the experimental setup, we conducted training and testing on two groups of OSN datasets. This section shows the proposed pooling layer's performance in link classification and fake profile detection on the benchmark OSN dataset, reporting the accuracy and loss of the pooling scheme. The experiment also evaluates the model with cross-validation, computing Recall, Precision, F1 score, and the Receiver Operating Characteristic (ROC) [27].

4.1. OSN Link Classification

In the link prediction problem, there is a trade-off between accuracy and running time. We examined the neural network's performance by tuning diverse hyperparameters. In the training and testing process, we set epochs = 500, batch size = 50, and learning rate lr = 0.01. Table 3 describes the proposed CNN performance in the training and testing process.

We tested gradient descent and tuned the optimization algorithms with different hyperparameters on this sample. We employ gradient descent to minimize the objective function J(θ) with respect to the model's parameters θ ∈ R^d; the model updates the parameters in the direction opposite to the gradient of the objective function ∇θ J(θ).
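For clarity, one standard formulation of the update rules implied by this description (plain gradient descent with learning rate η, and the SGD-with-momentum variant reported below with m = 0.5) is the textbook form rather than anything specific to RunPool:

```latex
% Gradient descent update with learning rate \eta
\theta \leftarrow \theta - \eta \, \nabla_{\theta} J(\theta)

% SGD with momentum m (velocity v), one common convention
v \leftarrow m \, v + \eta \, \nabla_{\theta} J(\theta), \qquad
\theta \leftarrow \theta - v
```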
Table 3 CNN with RunPool layer performance in link classification with different parameter settings.

Input           | Hyperparameters                             | Optimizer    | Training Loss | Testing Loss
FB link #50000  | Dropout p = 0.6, H1 = 64, H2 = 32, ktop = 5 | Adam         | 0.5033        | 0.5069
                |                                             | SGD, m = 0.5 | 0.5068        | 0.5070
                |                                             | Adagrad      | 0.5057        | 0.5059
                |                                             | RMSProp      | 0.5176        | 0.5223
Flickr #50000   | Dropout p = 0.6, H1 = 64, H2 = 32, ktop = 5 | Adam         | 0.5052        | 0.5060
                |                                             | SGD, m = 0.5 | 0.5178        | 0.5160
                |                                             | Adagrad      | 0.5075        | 0.5074
                |                                             | RMSProp      | 0.5092        | 0.5087

FB, Facebook.

The study sets the learning rate η = 0.01 to determine the step size toward a (local) minimum. In the training process, we obtained the best optimization result when training the model with the Adam optimizer. The experiment fed the Adam optimizer into the graph with different numbers of hidden layers, and the results show that this dynamic optimizer can achieve a small loss even with a large number of hidden layers.

Moreover, we tested the RunPool CNN with other gradient descent variants such as stochastic gradient descent (SGD) with momentum, which also produces a competitive result. SGD with momentum (m = 0.5) can provide a better result if it is tuned with a larger number of epochs and a stronger regularizer. However, the experiment found that adding more hidden layers does not reduce the loss for SGD. Using a large number of hidden layers causes computational overhead on a particular device, especially in CPU processing. Testing the NN by adding more layers and neurons to the CNN did not improve its predictive capability; because of the limitation on the number of neurons, it only enlarges the computing resources. Thus, the RunPool pooling function can increase accuracy without adding more hidden layers to the neural network.

In this experiment, we also calculate the Recall, which is the percentage of actually malicious samples that the model identifies as malicious. The Precision is the fraction of samples identified as malicious that are actually malicious. Finally, we want to find a balance between Precision and Recall: we need to classify OSN accounts that have malicious links, accepting a lower precision when the testing result is not significant. Hence, we use the F1 score as an optimal blend of Precision and Recall (the standard definitions are recalled below). Tables 4 and 5 show the Recall, Precision, and F1 score for both OSN datasets.
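For reference, these metrics follow their standard definitions in terms of true positives (TP), false positives (FP), and false negatives (FN), with the malicious class taken as positive:

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```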
Table 4 Accuracy and F1 score of several machine learning algorithms on the Facebook dataset.

Classification Model | Accuracy | Average Precision and Recall | F1 Score
Bernoulli            | 78.77    | 0.2123                       | -
Logistic regression  | 91.49    | 0.6803                       | 0.7563
SVM (rs = 40)        | 91.95    | 0.6974                       | 0.7724
SVM (rs = 50)        | 91.99    | 0.6990                       | 0.7745
S-neural network     | 94.10    | 0.9120                       | 0.9210
Proposed approach    | 98.04    | 0.9775                       | 0.9595

Table 5 Accuracy and F1 score of several machine learning algorithms on the Flickr dataset.

Classification Model | Accuracy | Average Precision and Recall | F1 Score
Bernoulli            | 78.77    | 0.2123                       | -
Logistic regression  | 92.06    | 0.7048                       | 0.7787
SVM (rs = 40)        | 92.47    | 0.7217                       | 0.7897
SVM (rs = 50)        | 92.52    | 0.7257                       | 0.7938
S-Neural network     | 94.10    | 0.9280                       | 0.9321
Proposed approach    | 98.34    | 0.9732                       | 0.9846

Based on these tables, the supervised machine learning baselines produce different accuracies: Naïve Bayes (78.77%), Logistic Regression (LR) (92.06%), Support Vector Machine (SVM) (92.47% and 92.52%), and a shallow Neural Network (NN) (94.10%). The proposed CNN architecture with the RunPool layer reaches about 98.34% over several runs of the testing process on the benchmark dataset. To verify the model's performance, we add a small amount of noisy data to the dataset, which makes classification more difficult for the common Machine Learning (ML) algorithms. Even so, the proposed CNN achieves better accuracy than the other ML algorithms. We also calculate the Area Under the Curve (AUC); the model produces the highest score of AUC = 0.9970, reflecting that it can achieve a good level of prediction in this classification problem.

4.2. OSN Account Classification

In this section, we test the proposed RunPool with several learning parameters to classify user categories based on OSN profile attributes. In the training process, the model was tested with several popular optimizers, including SGD with momentum, Adam, Adagrad, and RMSprop. Computing the gradient of the loss during training is the critical part of comparing the training and test results. We apply the local-variable concept to compute the loss function and the gradient of the loss with respect to the weights under various learning rates.

In this experiment, we tune the gradient descent and optimization algorithms with different hyperparameters. In NN training, choosing the value of each hyperparameter is one of the most important factors in improving training quality; by tuning appropriate hyperparameters in the CNN, we obtain the highest performance of the neural network. In the training and testing process, we set epochs = 15 and batch size = 50, which yields a trade-off between accuracy and running time. Table 6 describes the proposed CNN performance in terms of validation accuracy and loss; a sketch of how such an optimizer comparison can be set up is given below.
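The following is a minimal Keras sketch of this kind of optimizer comparison under the stated epoch and batch-size settings. A standard MaxPooling1D layer stands in for RunPool (which is not reproduced here), and the layer sizes, feature shapes, and synthetic data are assumptions for illustration only; the numbers it prints will not reproduce Table 6.

```python
# Hypothetical sketch of the optimizer comparison in Section 4.2.
# A standard MaxPooling1D layer stands in for RunPool, which is not shown here.
import numpy as np
import tensorflow as tf

def build_model(optimizer):
    model = tf.keras.Sequential([
        tf.keras.layers.Conv1D(32, 3, activation="relu", input_shape=(20, 1)),
        tf.keras.layers.MaxPooling1D(2),            # placeholder for RunPool
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=optimizer, loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

X = np.random.rand(3000, 20, 1)                      # assumed feature shape
y = np.random.randint(0, 2, size=3000)               # benign = 0, fake = 1

optimizers = {
    "Adam": tf.keras.optimizers.Adam(learning_rate=0.01),
    "SGD+momentum": tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    "Adagrad": tf.keras.optimizers.Adagrad(learning_rate=0.01),
    "RMSprop": tf.keras.optimizers.RMSprop(learning_rate=0.01),
}
for name, opt in optimizers.items():
    hist = build_model(opt).fit(X, y, epochs=15, batch_size=50,
                                validation_split=0.2, verbose=0)
    print(name, "val_accuracy:", hist.history["val_accuracy"][-1])
```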
This experiment also measures the Precision, Recall, and F1 score to assess how significant the impact of the proposed pooling layer is on the classification task; Table 7 reports these metrics for the different algorithms. Based on Table 7, supervised machine learning again produces diverse results: Bernoulli Naïve Bayes (86.90%), Gradient Boosting (GB) (90.66%), LR (90.49%), SVM (90.04%), and NN (92.10%). The proposed CNN architecture with the RunPool pooling layer produces 94.00% over several runs of the testing process on the benchmark dataset. We find that the proposed pooling layer yields the highest accuracy among the common machine learning methods on the benchmark OSN user-profile dataset. At the final stage, we also compute the ROC curve for the binary classification; it shows that the proposed model produces an AUC above 0.9, reflecting that the model can be a good classifier for training a dynamic dataset such as an OSN.

Table 6 Classification results of the proposed CNN with the RunPool layer under different parameter settings.

Parameters                 | Optimizer    | Val-Loss | Val-Accuracy
Dropout p = 0.5, H1 = 32,  | Adam         | 0.3633   | 0.9131
H2 = 64, ktop = 7          | SGD, m = 0.9 | 0.2087   | 0.9291
                           | Adagrad      | 0.3620   | 0.9025
                           | RMSProp      | 0.4049   | 0.8794
Dropout p = 0.3, H1 = 16,  | Adam         | 0.4018   | 0.8954
H2 = 32, ktop = 5          | SGD, m = 0.9 | 0.2143   | 0.9344
                           | Adagrad      | 0.3559   | 0.8989
                           | RMSProp      | 0.4017   | 0.8865

Table 7 Comparison between some common classification algorithms and the proposed approach on the OSN dataset.

Classification Model             | Precision | Recall | F1 Score
Bernoulli (Naïve Bayes)          | 86.90     | 86.94  | 87.00
Gradient Boosting (n estim = 50) | 90.66     | 90.69  | 91.00
Logistic Regression              | 90.47     | 90.59  | 90.61
SVM (rs = 31.6)                  | 90.04     | 87.34  | 87.24
S-Neural Network                 | 92.10     | 92.80  | 92.91
Proposed Approach                | 94.00     | 93.21  | 93.42

4.3. Pooling Comparison

In this part, we compare RunPool Pooling, Max Pooling, and Average Pooling in terms of model accuracy. We measure the accuracy of the model combined with each pooling function under the same hyperparameter tuning and environment, in order to assess the performance of each pooling layer on the classification task. Table 8 displays the comparison among the different pooling functions, using 2254 training samples and 545 validation samples.

Table 8 Comparison of validation accuracy among pooling functions when training with the same hyperparameter tuning and dataset.

Pooling Type        | Epoch = 7 | Epoch = 10 | Epoch = 15
Average Pooling     | 91.13%    | 91.49%     | 87.94%
Max Pooling         | 91.21%    | 91.31%     | 89.89%
Max. Pool + RunPool | 91.67%    | 91.67%     | 89.36%
Avg. Pool + RunPool | 91.40%    | 91.89%     | 88.48%
RunPool Pooling     | 92.38%    | 92.54%     | 91.08%

In the first experiment, the study trains the classifier without a pooling layer, meaning the CNN replaces the pooling layer with strided convolution. However, the CNN without pooling cannot produce a good result over diverse training periods. In this case, the CNN requires pooling layers to reduce the dimensions of the output features and to provide some amount of invariance. Based on the training results, the CNN with pooling layers is better than the CNN without pooling.

To compare RunPool Pooling, Max Pooling, and Average Pooling, we then add the pooling layers to the CNN, as sketched below. Max Pooling and Average Pooling are deterministic pooling types that take the maximum and the average score of a pooling window, respectively. Instead of using ordinary deterministic pooling, we test RunPool as a nondeterministic pooling layer to train the CNN model on the classification task.
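As a rough illustration of the deterministic baselines in this comparison, the sketch below swaps standard Keras pooling layers (MaxPooling1D and AveragePooling1D) into the same small CNN and compares validation accuracy; the RunPool layer itself is not reproduced, and the feature shapes and synthetic data are assumptions, so the printed numbers will not match Table 8.

```python
# Hypothetical sketch comparing deterministic pooling layers under identical
# hyperparameters; RunPool itself is not included here.
import numpy as np
import tensorflow as tf

def make_cnn(pooling_layer):
    model = tf.keras.Sequential([
        tf.keras.layers.Conv1D(32, 3, activation="relu", input_shape=(20, 1)),
        pooling_layer,                               # max or average pooling
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

X = np.random.rand(2799, 20, 1)                      # 2254 train + 545 validation
y = np.random.randint(0, 2, size=2799)

for name, layer in [("max", tf.keras.layers.MaxPooling1D(2)),
                    ("average", tf.keras.layers.AveragePooling1D(2))]:
    hist = make_cnn(layer).fit(X, y, epochs=10, batch_size=50,
                               validation_split=545 / 2799, verbose=0)
    print(name, "pooling val_accuracy:", hist.history["val_accuracy"][-1])
```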
The experiment shows that the proposed CNN with RunPool Pooling achieves promising accuracy on the account classification task. However, the accuracy decreases when the model is trained for a large number of epochs (many iterations); this can be mitigated by providing more data for training the model.

At the final stage, we evaluate the classifier using ROC and AUC scores, a technique that assesses the false alerts of malicious detection. Based on the experiment, the graph compares several pooling layers in terms of the AUC score and the False Positive Rate (FPR). Figure 5 depicts the AUC comparison among the three pooling functions in malicious classification using the OSN features as the dataset, and it shows that the AUC score of the proposed approach is higher than those of the conventional pooling layers. In this part, we plot the AUC for binary classification by adopting the distributed observations (X1, Y1) and (X2, Y2). The AUC corresponds to the area under the ROC curve and ranges between 0 and 1. The proposed model achieves a promising result with AUC = 0.9889, which reflects that the classifier can effectively discriminate between the positive and negative classes (malicious and benign labels); a minimal evaluation sketch is given below.

Figure 5 Comparison result of the AUC score among Max Pooling, Average Pooling, and RunPool Pooling layers in the malicious classification task.
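As a sketch of this evaluation step, assuming predicted probabilities from any of the trained classifiers, the ROC curve and AUC can be computed with scikit-learn as follows (the labels and scores shown are placeholder values):

```python
# Hypothetical evaluation sketch: ROC curve and AUC for binary
# malicious/benign classification, given predicted probabilities.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])       # assumed ground-truth labels
y_score = np.array([0.1, 0.35, 0.8, 0.65, 0.2, 0.9, 0.55, 0.4])  # model scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # FPR and TPR per threshold
auc = roc_auc_score(y_true, y_score)
print("AUC:", auc)
```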
To sum up the benefit of the proposed pooling layer in a CNN architecture with a 1D dataset, we have presented the training and validation results computed on a large sample (OSN features). We find that the proposed CNN with RunPool achieves an efficient result and increases the learning performance of the neural network. In DL research, the proposed pooling technique can be an alternative way to train a DL model not only on 1D datasets (text) but also on 2D datasets (images) and 3D datasets (video and voice).

5. CONCLUSION

In this paper, we introduce a new pooling layer called RunPool to find different optimal pooling values within the pooling window. We propose the pooling function to deal with the issues of deterministic pooling by harnessing some network parameters to obtain an informative feature map. It calculates several layer parameters and Gaussian-distributed numbers based on the ktop initialization; the function then compares kmap and ktop to achieve the best pooling value in each hidden layer.

In the test process, we find that a dynamic optimizer is better for achieving high accuracy and small loss. On the other hand, SGD with momentum also produces a competitive result: it can provide accurate predictions and minimal loss if tuned with a larger number of epochs and a stronger regularizer, and it becomes an excellent option for training the CNN RunPool model. In fake OSN account classification, we compare the accuracy of the proposed CNN to supervised ML baselines, including Bernoulli Naïve Bayes (86.90%), GB (90.66%), LR (90.49%), and SVM (90.04%). The proposed CNN achieves the highest score, at 94.00%.

As demonstrated by the experiments with the 1D OSN dataset, RunPool improves the performance of the CNN and addresses the drawbacks of common max pooling and average pooling. This type of stochastic pooling can also be combined with the common pooling layers to produce better accuracy with a small loss. By implementing the CNN with RunPool, we obtain an increased level of performance and better accuracy in the validation process for the classification case. Beyond accuracy and loss, we also apply evaluation metrics to measure the performance of the classifier: by calculating the Precision, Recall, and F1 score, we obtain a promising result, and the proposed model achieves the highest AUC score.

As future directions for malicious classification with DL, exploration is needed for handling accounts and malware hierarchy links that reflect the source of the threat. Neural network computation should also be examined with new techniques such as an adaptive loss function rather than a regular loss function. We further suggest that the next study test this model on 2D (image) and 3D (video or audio) datasets to measure training and testing accuracy.

CONFLICT OF INTEREST

The authors declare they have no conflicts of interest.

AUTHORS' CONTRIBUTIONS

All authors have contributed to this research.

ACKNOWLEDGMENTS

This work was conducted in the Institute of Research in Information Processing Laboratory, Harbin University of Science and Technology, under CSC Scholarship (No. 2016BSZ263). This project is available on GitHub.

REFERENCES

[1] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv:1409.1556, 2014.
[2] A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, Nevada, 2012, pp. 1106–1114.
[3] C. Szegedy, S. Ioffe, V. Vanhoucke, A.A. Alemi, Inception-v4, Inception-ResNet and the impact of residual connections on learning, in Association for the Advancement of Artificial Intelligence (AAAI), San Francisco, 2017, pp. 4278–4284.
[4] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, Cambridge, 2016.
[5] M.D. Zeiler, R. Fergus, Stochastic pooling for regularization of deep convolutional neural networks, arXiv:1301.3557, 2013.
[6] C. Vincent, L. Spranger, M. Seuret, A. Nicolaou, P. Král, A. Maier, Deep generalized max pooling, arXiv:1908.05040, 2019.
[7] E. Hayoung, H. Choi, Alpha-pooling for convolutional neural networks, arXiv:1811.03436v1 [cs.LG], 2018.
[8] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature. 521 (2015), 436–444.
[9] L. Marcelo, T. Issa, W. Isaac, S. Mohammad, Authorship verification using deep belief network system, Int. J. Commun. Syst. 30 (2017), e3259.
[10] Y. Taigman, M. Yang, M. Ranzato, L. Wolf, DeepFace: closing the gap to human-level performance in face verification, in Computer Vision and Pattern Recognition (CVPR), IEEE Conference on Pattern Recognition, Columbus, 2014, pp. 1701–1708.
[11] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE. 86 (1998), 2278–2324.
[12] D. Yu, H. Wang, P. Chen, Z. Wei, Mixed pooling for convolutional neural networks, in International Conference on Rough Sets and Knowledge Technology, Shanghai, China, 2014.
[13] N. Kalchbrenner, E. Grefenstette, P. Blunsom, A convolutional neural network for modelling sentences, in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Baltimore, 2014.
[14] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, D.U. Hwang, Complex networks: structure and dynamics, Phys. Rep. 424 (2006), 175–308.
[15] P. Wanda, H.J. Jie, URLDeep: continuous prediction of malicious URL with dynamic deep learning in social networks, Int. J. Netw. Secur. 21 (2019), 971–978.
[16] P. Wang, B. Xu, Y. Wu, X. Zhou, Link prediction in social networks: the state-of-the-art, Sci. China Inf. Sci. 58 (2015), 1–38.
[17] W. Putra, S. Selo, B.S. Hantono, Model of secure P2P mobile instant messaging based on virtual network, in 2014 International Conference on Information Technology Systems and Innovation (ICITSI), Bandung, 2014, pp. 81–85.
[18] P. Wanda, S. Selo, B.S. Hantono, Efficient message security based HyperElliptic Curve Cryptosystem (HECC) for mobile instant messenger, in 2014 The 1st International Conference on Information Technology, Computer, and Electrical Engineering, Semarang, 2014, pp. 245–249.
[19] M.G. Vigliotti, C. Hankin, Discovery of anomalous behaviour in temporal networks, Soc. Netw. 41 (2015), 18–25.
[20] T. Li, B. Wang, Y. Jiang, Y. Zhang, Y. Yan, Restricted Boltzmann machine-based approaches for link prediction in dynamic networks, IEEE Access. 6 (2018), 29940–29951.
[21] T. Li, J. Zhang, P.S. Yu, Y. Zhang, Y. Yan, Deep dynamic network embedding for link prediction, IEEE Access. 6 (2018), 29219–29230.
[22] P. Bindu, S. Thilagam, Mining social networks for anomalies: methods and challenges, J. Netw. Comput. Appl. 68 (2016), 213–229.
[23] Q. Gong, Y. Chen, X. He, Z. Zhuang, T. Wang, H. Huang, X. Wang, X. Fu, DeepScan: exploiting deep learning for malicious account detection in location-based social networks, IEEE Commun. Mag. 56 (2018), 21–27.
[24] M. Mohammadrezaei, M.E. Shiri, A.M. Rahmani, Identifying fake accounts on social networks based on graph analysis and classification algorithms, Hindawi Secur. Commun. Netw. 2018 (2018), 1–8.
[25] D. Savage, X. Zhang, X. Yu, P. Chou, Q. Wang, Anomaly detection in online social networks, Soc. Netw. 39 (2014), 62–70.
[26] N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, SMOTE: synthetic minority over-sampling technique, J. Artif. Intell. Res. 16 (2002), 321–357.
[27] J.A. Swets, Signal Detection Theory and ROC Analysis in Psychology and Diagnostics: Collected Papers, Lawrence Erlbaum Associates, Mahwah, 1996.