Master's Thesis Presentation

Proposal of a Terrorist Detection Model in Social Networks
Master’s thesis defense
Presented by: Wajdi Khattel on 07.12.2019
2018 / 2019
Before a jury composed of:
● President: Najet AROUS
● Evaluator: Olfa EL MOURALI
● Academic supervisor: Ramzi GUETARI
● Laboratory supervisor: Nour El Houda BEN CHAABENE

Outline
▰Introduction
▰Existing works
▰Proposed Model
▰Implementation & Results
▰Conclusion & Perspective

Context
▰Social networks have made communication easier
▰Social networks are used in different ways: friendly vs. harmful
▰Terrorists are among the most dangerous categories of users
▰Detecting these users is important

Problem Statement
▰Terrorists tend to hide their abnormal behavior
▰A normal user could adopt terrorist behavior
▰The socio-cultural definition of a terrorist could change over time
⇒ Time is important

Objective
▰Propose a terrorist detection model
▻Account for changes in a user’s behavior over time
▻Account for changes in the definition of terrorist behavior over time
▰Address the limitations of existing models

Anomaly Detection
Paper | Input Format | Description | Multiple Social Networks | Multiple Input Data Types | User’s Behavior Change | Behavior Definition Change
Lashakry et al., 2019 | Activity | Proposal of a model for user profile creation to monitor users | ✓ | ✓ | ✗ | ✗
Zamanian et al., 2019 | Activity | Proposal of a model for user activity pattern recognition | ✗ | ✓ | ✓ | ✗
Bhattacharjee et al., 2017 | Graph | Proposal of a probabilistic anomaly classifier model | ✗ | ✗ | ✓ | ✓
Chen et al., 2018 | Graph | Proposal of a user profiling framework that can be used to detect anomalous users | ✗ | ✓ | ✗ | ✗

Proposed Input Format
Hybrid Input Format:
▰Graph Input
▰Activity-based score for each node
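
A minimal sketch of what such a hybrid input could look like, assuming an undirected graph stored as an adjacency dictionary and one activity-based score in [0, 1] per node. The user names, edges, score values and the helper `neighbor_mean_score` are hypothetical and only illustrate how the two views can be read together.

```python
# Hypothetical hybrid input: a graph plus one activity-based score per node.
graph = {
    "user_a": {"user_b", "user_c"},
    "user_b": {"user_a"},
    "user_c": {"user_a"},
}
activity_score = {"user_a": 0.82, "user_b": 0.10, "user_c": 0.35}


def neighbor_mean_score(node):
    """Average activity score of a node's neighbors (0.0 if it has none)."""
    neighbors = graph[node]
    if not neighbors:
        return 0.0
    return sum(activity_score[n] for n in neighbors) / len(neighbors)
```

Calling `neighbor_mean_score("user_a")` averages the scores of `user_b` and `user_c`, which shows one way the graph structure and the per-node scores could be combined.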

Terrorism Detection
▰Alvari et al. (2019)
▻Different data collection methods
▻Textual-content data features
▰Chitrakar et al. (2016)
▻Advantages of using Convolutional Neural Network (CNN)
▻Transfer Learning Technique
▰Kalpakis et al. (2019)
▻Multidimensional networks
▻Social Network Analysis methodologies

Proposed Model
▰Model Input: Multidimensional Network
▰Three sub-models:
▻Text classification model
▻Image classification model
▻General Information classification model
▰Decision Making

Model Input
Multidimensional Network
▰Nodes: Users
▰Dimensions: Users’ social media content
▰Edges: Connection between users on a certain dimension
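
The multidimensional network can be sketched as a small data structure, assuming undirected edges and one edge set per dimension. The class name, method names and sample users are hypothetical; the slides do not specify a storage format.

```python
from collections import defaultdict


class MultidimensionalNetwork:
    """Users as nodes; each edge links two users on one dimension."""

    def __init__(self):
        self.nodes = set()
        # dimension name -> set of undirected edges (frozenset of two users)
        self.edges = defaultdict(set)

    def add_edge(self, user_a, user_b, dimension):
        self.nodes.update((user_a, user_b))
        self.edges[dimension].add(frozenset((user_a, user_b)))

    def neighbors(self, user, dimension):
        """Users connected to `user` on the given dimension."""
        return {
            other
            for edge in self.edges.get(dimension, ())
            if user in edge
            for other in edge
            if other != user
        }


# Hypothetical users and dimensions.
net = MultidimensionalNetwork()
net.add_edge("alice", "bob", "twitter")
net.add_edge("alice", "carol", "facebook")
net.add_edge("bob", "carol", "twitter")
```

Keeping one edge set per dimension makes it cheap to query a single social network while still sharing one set of nodes across all of them.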

Text Classification Model (TCM)
▰Input: Textual data
▰Process:
▻Natural Language Processing
▻Word Embedding
▻Machine Learning classification
▰Output: Score

TCM: Natural Language Processing
▰Objective: Enable the machine to understand human language
▰Process:
▻Morphological Analysis
▻Syntactic Analysis
▻Semantic Analysis

TCM: Word Embedding
▰Objective: Represent text in a numerical way while preserving its semantics
▰Process:
▻Term Frequency-Inverse Document Frequency (TF-IDF)
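
As an illustration of the TF-IDF step, here is a minimal sketch that assumes already tokenized documents and the textbook weighting tf(t, d) * log(N / df(t)). Libraries often use smoothed variants, and the slides do not state which variant was used; the function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized documents.
docs = [
    ["the", "match", "was", "great"],
    ["the", "weather", "was", "bad"],
]
vectors = tf_idf(docs)
```

Terms that occur in every document, such as "the" here, get a weight of zero, while terms specific to one document keep a positive weight.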

Image Classification Model (ICM)
▰Input: Image data
▰Process:
▻Use a pre-trained convolutional neural network model
▻Add new convolutional layers
▰Output: Score

ICM: CNN Architecture
[Figure: CNN architecture with two output classes, Terrorist and Not Terrorist]

General Information Classification Model (GICM)
▰Input: General Information data
▰Process:
▻If data is non-numerical ⇒ Encode it
▻Machine Learning classification
▰Output: Score
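
The encoding step for non-numerical data can be sketched with one-hot encoding, which is an assumption here since the slides do not name the encoding scheme. The function name and the sample categorical field are hypothetical.

```python
def one_hot_encode(values):
    """Map each categorical value to a one-hot vector over the sorted categories."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        vector = [0] * len(categories)
        vector[index[value]] = 1
        encoded.append(vector)
    return categories, encoded


# Hypothetical categorical field, used only for illustration.
categories, encoded = one_hot_encode(["single", "married", "single", "divorced"])
```

Each value becomes a vector with a single 1 at the position of its category, so the result can be fed to a numerical classifier.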

Decision Making
▰Input: Scores of the 3 sub-models
▰Process:
▻Calculate user score
▻Classify it based on threshold
▰Output: User category (Terrorist or not)
[Figure: TCM, ICM and GICM send their scores S1, S2 and S3 to the Decision Making step, with S1 = Score1 * Weight1]
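
The decision step can be sketched as follows. The slides give S1 = Score1 * Weight1; this sketch assumes S2 and S3 are weighted the same way and that the user score is the sum of the three weighted scores. The function name, weights, scores and threshold are hypothetical.

```python
def decide(scores, weights, threshold):
    """Combine sub-model scores into a user score and apply a threshold."""
    # Weighted score of each sub-model, as in S1 = Score1 * Weight1.
    weighted = [score * weight for score, weight in zip(scores, weights)]
    # Assumption: the user score is the sum of the weighted scores.
    user_score = sum(weighted)
    label = "Terrorist" if user_score >= threshold else "Not Terrorist"
    return user_score, label


# Hypothetical TCM, ICM and GICM scores, weights summing to 1, threshold 0.5.
user_score, label = decide([0.9, 0.6, 0.3], [0.5, 0.3, 0.2], threshold=0.5)
```

With weights that sum to 1, the user score stays in the same range as the sub-model scores, which keeps the threshold easy to interpret.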

Data Collection
▰Offline Data: Data used for the model training
▻Textual Data: Tweets from banned Twitter accounts
▻Image Data: Images from Google Images
▻General Information Data: PIRUS dataset
▰Online Data: Data used for testing and live usage
▻Facebook Graph API
▻Instagram REST API
▻Twitter REST API

TCM: Training Data
Label | Number of samples
Positive labels | 122619
Negative labels | 181691
Total Data | 304310

TCM: Classification Model
Model Name | Accuracy | F1-Score | Training Time
Logistic Regression | 0.9726 | 0.9674 | 39.9 s
SVM | 0.9626 | 0.9548 | 6 h 48 min 33 s
Neural Network | 0.9774 | 0.9719 | 1 min 11 s
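
For reference, accuracy and F1-score for a binary classifier can be computed as in the sketch below. It assumes labels encoded as 1 (positive) and 0 (negative); the slides do not state how the reported metrics were computed, and the function name and sample labels are hypothetical.

```python
def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1-score for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

Reporting F1 alongside accuracy is useful when the positive and negative classes are not balanced, since accuracy alone can hide poor recall on the smaller class.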

ICM: Training Data
Label | Number of samples
Positive labels | 219
Negative labels | 314
Total Data | 533

ICM: Classification Model
Model Name | Accuracy | F1-Score | Training Time
CNN | 0.7631 | 0.7219 | 3 min 50 s
CNN + DA | 0.7781 | 0.7463 | 4 min 12 s
CNN + TL | 0.8291 | 0.8103 | 8 min 48 s
CNN + DA + TL | 0.8571 | 0.8454 | 9 min 23 s

GICM: Training Data
Label | Number of samples
Positive labels | 114
Negative labels | 126
Total Data | 240

GICM: Classification Model
Model Name | Accuracy | F1-Score | Training Time
Logistic Regression | 0.7650 | 0.7873 | 5 s
SVM | 0.8300 | 0.8495 | 7 s
Neural Network | 0.8173 | 0.8325 | 48.6 s

Proposed Model
▰Text Classification Model: Neural Network
▰Image Classification Model: CNN + DA + TL
▰General Information Model: SVM

Conclusion
▰Proof of concept of a terrorist detection model
▰Works with multiple social networks and multiple data types
▰Supports behavior change over time

Perspective
▰Graph Analysis
▰Support more data types: Video
▰Train on more data

Thank you for your attention!
