The document summarizes a presentation on detecting adversary nodes in machine-to-machine communication networks using machine learning-based trust models. It introduces machine-to-machine communication and the need to identify adversary nodes that provide false information. The presentation evaluates several machine learning models—including extreme gradient boosting, random forest, and a proposed binary particle swarm optimization extreme gradient boosting model—and compares their performance on a simulated network with varying percentages of adversary nodes. The proposed model achieved promising results in accurately detecting adversary nodes based on features extracted from node transmission data.
1. Detection of Adversary Nodes in Machine-to-Machine Communication Using Machine Learning Based Trust
Speaker: Elvin,Eziama
University of Windsor
eziama@uwindsor.ca
November 28, 2019
Speaker: Elvin,Eziama (UWINDSOR) Detection of Adersary Nodes in Machine-ToM achineCommunicationUsingMachineLearningBNovember 28, 2019 1 / 14
2. PRESENTATION SUMMARY
1 INTRODUCTION
2 METHODOLOGY
3 COMPARISON OF SIMULATED RESULTS
4 RECOMMENDATION AND CONCLUSION
5 QUESTIONS AND ANSWERS
3. INTRODUCTION
Machine-to-Machine Communication (M2M-C) is an emerging
technology in which a large number of smart devices collect, process,
and communicate information for collaborative decisions without direct
human intervention.
Effective M2M-C requires interoperability, confidentiality, and
privacy without restricting the potential benefits of its applications.
Reliably incorporating M2M-C into mission-critical, safety, and
emergency applications, such as Vehicle-to-Everything (V2X) networks,
can substantially improve transportation safety and efficiency.
However, any disruption to such a network can potentially have
deadly consequences.
We use machine learning trust models to counter the effects of
adversary nodes in the network.
The performance metrics used in this paper are False Positive
Rate (FPR), True Positive Rate (TPR), and Accuracy.
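The three metrics can be computed from the confusion counts of a detector. The following is a minimal sketch, assuming binary labels where 1 marks an adversary node and 0 an honest node; the function name and the sample labels are illustrative and not taken from the presentation.

```python
def detection_metrics(y_true, y_pred):
    """Return (TPR, FPR, accuracy) for binary labels: 1 = adversary, 0 = honest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # adversaries caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # honest nodes cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # honest nodes flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # adversaries missed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return tpr, fpr, accuracy


# Illustrative labels only (not results from the presentation).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
tpr, fpr, accuracy = detection_metrics(y_true, y_pred)
```

A good detector pushes TPR toward 1 while keeping FPR near 0; accuracy alone can be misleading when adversaries are a small share of the nodes.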
4. Attack Formulation
Our work formulates a robust algorithm against on-and-off attacks and
false feedback among peers. The model integrates consistency and
Jaccard similarity.
Connected vehicles/nodes are considered and generated in a MATLAB
simulation.
Nodes transmit their opinions in a scheduled broadcast, in the form of
recommendations, to q of their neighboring nodes. Nodes may decide
to report malicious information; we model this by increasing the
variance of the transmission sequence of a given percentage of the
nodes in the network with probability p, together with a misjudgment
error probability of 0.04.
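The attack formulation above can be sketched as a small simulation. This is a hedged illustration in Python rather than the MATLAB setup used in the work: the Gaussian transmission model, the standard deviations, the sequence length, and the way the 0.04 misjudgment error flips an observed label are all assumptions made for the example.

```python
import random
import statistics


def simulate_transmissions(n_nodes, adversary_fraction, p_attack,
                           seq_len=50, honest_sd=1.0, attack_sd=4.0,
                           misjudgment=0.04, seed=0):
    """Return one (variance, observed_label, is_adversary) tuple per node."""
    rng = random.Random(seed)
    n_adv = round(n_nodes * adversary_fraction)
    adversaries = set(rng.sample(range(n_nodes), n_adv))
    records = []
    for node in range(n_nodes):
        is_adv = node in adversaries
        seq = []
        for _ in range(seq_len):
            # On-off behaviour: an adversary misbehaves with probability p_attack,
            # which shows up as a higher-variance transmission (assumed Gaussian).
            attacking = is_adv and rng.random() < p_attack
            seq.append(rng.gauss(0.0, attack_sd if attacking else honest_sd))
        label = 1 if is_adv else 0
        # Assumed reading of the misjudgment error: the observed label is
        # flipped with probability `misjudgment`.
        if rng.random() < misjudgment:
            label = 1 - label
        records.append((statistics.pvariance(seq), label, is_adv))
    return records


records = simulate_transmissions(n_nodes=100, adversary_fraction=0.2, p_attack=0.5)
adv_var = statistics.mean(v for v, _, a in records if a)
honest_var = statistics.mean(v for v, _, a in records if not a)
```

Under these assumptions the per-node variance separates the two groups clearly, which is why variance-derived features are a natural input to the detection models.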
5. MACHINE LEARNING DETECTION MODELS (MLDM)
An entropy score is used as the feature selection criterion, to reduce
the number of redundant features.
Suppose the underlying data has f variables, {Xi | i ∈ F}, where F is
the index set with |F| = f. The class variable Y is binary, encoded
by the vectors (0, 1) and (1, 0).
The aim of feature selection is to select a subset of features, S ⊂ F,
that accurately predicts the target Y, on the condition that the
cardinality of S is m (m < f).
Let X_S denote {Xi | i ∈ S} for any set S. The predictive
capability of X_S for Y is quantified by the conditional
entropy of Y given X_S, defined as follows:
$$H(X) = -\sum_{x \in X_S} P(x) \log_2 P(x) \tag{1}$$
$$H(Y \mid X_S) = -\mathbb{E}_{(Y, X_S)}\left[\ln p(Y \mid X_S)\right] \tag{2}$$
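The entropy criterion can be illustrated on discrete data. This is a minimal sketch, assuming discrete feature values and scoring each feature individually by its conditional entropy with Y (a simplification of scoring the whole subset jointly); the function names and toy columns are hypothetical. Base-2 logarithms are used throughout, which only rescales the values relative to the natural logarithm.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(y, x):
    """H(Y | X) = sum over x of P(x) * H(Y | X = x), estimated from samples."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    return sum(len(g) / len(y) * entropy(g) for g in groups.values())


def select_features(columns, y, m):
    """Keep the m features with the lowest conditional entropy H(Y | X_i)."""
    scores = {name: conditional_entropy(y, col) for name, col in columns.items()}
    return sorted(scores, key=scores.get)[:m]


# Toy data: "informative" determines y exactly, "noise" is independent of y.
y = [0, 0, 1, 1]
columns = {"informative": [0, 0, 1, 1], "noise": [0, 1, 0, 1]}
selected = select_features(columns, y, m=1)
```

A lower conditional entropy means the feature leaves less uncertainty about the class, so the informative column is retained and the noise column is dropped.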
Given that S is a selected feature set, the goal of set objective is to
6. Binary Swarm Optimization
Particle Swarm Optimization is a widely used swarm intelligence
algorithm. It uses real-valued randomness with global
communication among a swarm of particles.
Each particle's flight direction and distance are determined by
its own position and velocity.
A particle's position determines the fitness value of the optimized
objective function, and hence the performance of the particle
and the group's optimal solution in each iteration.
Mathematically, the particle's position and velocity are expressed as
follows:
$$X_i = (X_{i1}, X_{i2}, \ldots, X_{iK}), \quad X_{ij} \in \{0, 1\}, \quad j = 1, 2, \ldots, K$$
$$V_i = (V_{i1}, V_{i2}, \ldots, V_{iK}), \quad V_{ij} \in [-V_{\max}, V_{\max}], \quad j = 1, 2, \ldots, K$$
7. Binary Swarm Optimization Cont’D
The velocity update at iteration t + 1 is:
$$V_i^{t+1} = w V_i^{t} + C_1 r_1 \left(P_i - X_i^{t}\right) + C_2 r_2 \left(P_g - X_i^{t}\right) \tag{4}$$
where P_i and P_g denote the best position visited by particle i and the
best position found by the swarm, w is an inertia factor that varies
over time, C_1 and C_2 are acceleration coefficients, and r_1, r_2 are
random values drawn uniformly from [0, 1].
The transfer function T(V_ij) converts velocities to probabilities:
$$T(V_{ij}) = \frac{1}{1 + e^{-V_{ij}}} \tag{5}$$
It is then used to update each bit position, where U(0, 1) is a
uniform random number:
$$X_{ij} = \begin{cases} 1 & \text{if } U(0, 1) < T(V_{ij}) \\ 0 & \text{otherwise} \end{cases} \tag{6}$$
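One iteration of the binary update can be sketched as follows. This is an illustration rather than the presenter's implementation: the inertia and acceleration values, the velocity clamp, and the function names are assumptions. The transfer function is the standard sigmoid 1 / (1 + e^(-v)), so that a larger velocity makes a bit more likely to be set to 1.

```python
import math
import random


def transfer(v):
    """Sigmoid transfer function mapping a velocity to a probability."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_step(x, v, p_best, g_best, w=0.7, c1=2.0, c2=2.0, v_max=4.0, rng=random):
    """Apply one binary PSO update to a single particle.

    x, p_best, g_best are lists of bits; v is a list of real velocities.
    Returns the new (position, velocity).
    """
    new_x, new_v = [], []
    for xj, vj, pj, gj in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        # Velocity update: inertia + pull toward personal best + pull toward swarm best.
        vj = w * vj + c1 * r1 * (pj - xj) + c2 * r2 * (gj - xj)
        vj = max(-v_max, min(v_max, vj))  # keep velocity within [-v_max, v_max]
        new_v.append(vj)
        # Bit update: set the bit with probability T(v).
        new_x.append(1 if rng.random() < transfer(vj) else 0)
    return new_x, new_v


rng = random.Random(1)
position, velocity = bpso_step(
    x=[0, 1, 0, 1], v=[0.0, 0.0, 0.0, 0.0],
    p_best=[1, 1, 0, 0], g_best=[1, 0, 0, 0], rng=rng,
)
```

Clamping the velocity keeps the bit-flip probability away from exactly 0 or 1, so the swarm retains some exploration.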
8. Extreme Gradient Boosting (XGBoost)
XGBoost is an ensemble meta-algorithm that aims to reduce
bias and variance in supervised learning.
The algorithm is designed for speed and performance, using
gradient-boosted decision trees.
Given the dataset D = {(xi, yi) : i = 1, ..., n, xi ∈ R^m, yi ∈ R},
where n and m are the numbers of samples and features respectively,
the prediction is the sum of K trees:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F} \tag{7}$$
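The objective function at iteration t is approximated using a
second-order Taylor expansion:
$$\mathrm{obj}^{(t)} = \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t(x_i)^2 \right] + \Omega(f_t) \tag{8}$$
where g_i and h_i are the first and second derivatives of the loss with
respect to the previous prediction, and Ω(f_t) is the regularization term.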
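To make the second-order objective concrete, the sketch below computes g_i and h_i and evaluates the approximated objective for a candidate tree output f_t. The logistic loss is an assumption chosen because the task is binary; the presentation does not state which loss was used, and the function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_grad_hess(y_true, raw_pred):
    """First and second derivatives of the logistic loss w.r.t. the raw prediction."""
    g, h = [], []
    for y, z in zip(y_true, raw_pred):
        p = sigmoid(z)
        g.append(p - y)          # g_i
        h.append(p * (1.0 - p))  # h_i
    return g, h


def second_order_objective(g, h, f_t, omega=0.0):
    """Sum of g_i * f_t(x_i) + 0.5 * h_i * f_t(x_i)^2, plus the regularization term."""
    return sum(gi * fi + 0.5 * hi * fi * fi for gi, hi, fi in zip(g, h, f_t)) + omega


# Two samples, both currently predicted at raw score 0 (probability 0.5).
g, h = logistic_grad_hess(y_true=[1, 0], raw_pred=[0.0, 0.0])
# Candidate tree output: push the first sample up and the second one down.
objective = second_order_objective(g, h, f_t=[1.0, -1.0])
```

A negative value indicates that adding this tree is expected to lower the loss, which is the quantity the boosting step minimizes when choosing f_t.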
9. RANDOM FOREST
Random Forest (RF) aggregates weak models for trust computation.
Mathematically, RF can be expressed as R = {t1, t2, ..., tN}. As each
tree t is constructed, it learns a function
$$F : X \to C_{(0,1)}$$
where 0 and 1 denote honest and malicious information, respectively.
The estimated probability of the output variable is the average over
the N trees:
$$P\left(C_{(0,1)} \mid X\right) = \frac{1}{N} \sum_{i=1}^{N} P_i\left(C_{(0,1)} \mid X\right)$$
The model has an adaptive threshold that automatically gives the best
estimate for accurate classification of the output labels.
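The averaging of per-tree probabilities can be sketched in a few lines. This is a simplified illustration: the per-tree probabilities are supplied directly rather than produced by trained trees, and a fixed, caller-supplied threshold stands in for the adaptive threshold, whose rule is not given in the slides.

```python
def forest_probability(tree_probs):
    """Average the per-tree probabilities that a node is malicious (class 1)."""
    return sum(tree_probs) / len(tree_probs)


def classify(tree_probs, threshold=0.5):
    """Label a node 1 (malicious) if the averaged probability reaches the threshold."""
    return 1 if forest_probability(tree_probs) >= threshold else 0


# Hypothetical outputs of N = 4 trees for one node.
tree_probs = [0.9, 0.6, 0.2, 0.7]
probability = forest_probability(tree_probs)
label_default = classify(tree_probs)
label_strict = classify(tree_probs, threshold=0.7)
```

Raising the threshold trades a lower false positive rate for a lower true positive rate, which is the balance an adaptive threshold would tune.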
10. PROPOSED MODEL
Our approach provides an effective detection framework by
improving on the drawbacks of the XGBoost algorithm as
follows:
Step 1: Entropy Based Feature Engineering (EBFE) to obtain a
satisfactory feature vector for the proposed method.
Step 2: The data set with optimal features is passed to the proposed
BPSO-XGBoost model. The optimized model robustly searches for
the parameter values that have the best fitness value. The mechanism
of the optimization process is as follows:
Definition 1: BPSO takes one of the parameter sets of XGBoost
as a particle in the group, and each particle in the group
represents a candidate solution. A Task parameter set taken as a
particle by BPSO is expressed as follows:
pi = (gi0, gi2, ..., gik) , i = 1, 2, ...N
Definition 2: Each particle in the group is evaluated by a fitness
function, which can be realized in line with (9).
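The two definitions can be illustrated with a sketch of how a binary particle might encode a parameter set and be scored. Everything specific here is hypothetical: the 7-bit layout, the choice of `max_depth` and `learning_rate` as the encoded parameters, their ranges, and the toy fitness function, which merely stands in for the fitness in (9). In practice the fitness would come from training and validating the classifier with the decoded parameters.

```python
def bits_to_int(bits):
    """Interpret a list of bits as an unsigned integer, most significant bit first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def decode_particle(bits):
    """Decode a 7-bit particle into a hypothetical parameter set."""
    max_depth = 1 + bits_to_int(bits[:3])                  # 1 .. 8
    learning_rate = 0.02 * (1 + bits_to_int(bits[3:7]))    # 0.02 .. 0.32
    return {"max_depth": max_depth, "learning_rate": learning_rate}


def toy_fitness(params):
    """Placeholder fitness: peaks at max_depth = 4 and learning_rate = 0.1."""
    return -(abs(params["max_depth"] - 4) + abs(params["learning_rate"] - 0.1))


def best_particle(particles, fitness):
    """Return the particle whose decoded parameters have the highest fitness."""
    return max(particles, key=lambda bits: fitness(decode_particle(bits)))


particles = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
best = best_particle(particles, toy_fitness)
best_params = decode_particle(best)
```

The swarm's best position P_g in the velocity update corresponds to `best` here, recomputed after every iteration.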
11. COMPARISON OF SIMULATION RESULTS
Figure: Accuracy vs. attacker percentage
Figure: True positive rate vs. attacker density
13. CONCLUSION AND RECOMMENDATION
M2M-C maintains interoperability, scalable connectivity, and the
provision of reliable information among an enormous number of
heterogeneous nodes.
However, for personal interests, some nodes maliciously share false
reports with other nodes in the network.
Trust evaluation of nodes remains a challenge in communication fields.
A MATLAB scenario was created in which the attacker nodes were
modeled by increasing the variance of a given percentage of the total
nodes.
This paper gives insight into the applicability of ML models in
VBM2M-C with respect to security.
To detect the behavior of the nodes based on the transmitted
messages, the paper explored feature engineering techniques to
extract meaningful features from the data fed into the model,
enabling the proposed method to obtain promising results.
14. Questions and Answers