Updated version here:
https://www.slideshare.net/xavigiro/hate-speech-in-pixels-detection-of-offensive-memes-towards-automatic-moderation-205809641
This work addresses the challenge of hate speech detection in Internet memes and, to the best of our knowledge, is the first attempt to use visual information for this purpose. Memes are pixel-based multimedia documents that contain photos or illustrations together with phrases which, when combined, usually convey a humorous meaning. However, hate memes are also used to spread hate through social networks, so their automatic detection would help reduce their harmful societal impact. In our experiments, we built a dataset of 5,020 memes to train and evaluate a multi-layer perceptron over the visual and language representations, both independently and fused. Our results indicate that the model can learn to detect some of the memes, but that the task is far from solved with this simple architecture. While previous work focuses on linguistic hate speech, our experiments indicate that the visual modality can be much more informative for hate speech detection in memes than the linguistic one.
9. OCR Extraction (II)
Example OCR output for the meme: "When you act up in class and your teacher starts calling your parents but you gave her the number to Pizza Hut"
● Tesseract 4.0, which uses neural networks
● ~0.5 s / image → text is extracted beforehand, prior to training
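As a sketch, this offline extraction step might look like the following, using the pytesseract wrapper named later in the implementation slides; `clean_ocr_text` is a hypothetical helper (not from the deck) that collapses Tesseract's line breaks into a single caption string.

```python
import re

def clean_ocr_text(raw: str) -> str:
    """Collapse Tesseract's line breaks and extra spaces into one caption string."""
    return re.sub(r"\s+", " ", raw).strip()

def extract_text(image_path: str) -> str:
    """Run Tesseract over one meme image (~0.5 s) and return its caption.
    Imported lazily so the helper above works without the Tesseract binary."""
    import pytesseract
    from PIL import Image
    raw = pytesseract.image_to_string(Image.open(image_path))
    return clean_ocr_text(raw)
```

Running the OCR once over the whole dataset and caching the captions is what makes the later training loop fast.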
11. Text Feature Extraction (II)
The example caption, as a token sequence (t1, t2, …, tM), is fed to the OCR Text Embedder, which outputs a feature vector [0.32, -0.79, …, 1.04, 0.02].
12. Text Feature Extraction (III). BERT
The same token sequence (t1, t2, …, tM) is fed to BERT, which outputs a feature vector [0.32, -0.79, …, 1.04, 0.02].
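A sketch of embedding a caption with the pytorch-pretrained-BERT package listed in the implementation slides; mean-pooling the last layer's token states into a single vector is our assumption, since the slides only show tokens (t1, …, tM) going in and one feature vector coming out.

```python
import torch

def pool_token_states(hidden_states: torch.Tensor) -> torch.Tensor:
    """Average per-token hidden states (M x H) into one feature vector (H,).
    Mean pooling is an assumption; the deck does not specify the pooling."""
    return hidden_states.mean(dim=0)

def bert_features(text: str) -> torch.Tensor:
    """Embed an OCR caption with a frozen BERT (pytorch-pretrained-BERT API)."""
    from pytorch_pretrained_bert import BertTokenizer, BertModel
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased").eval()
    ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(text))
    with torch.no_grad():
        layers, _ = model(torch.tensor([ids]))  # returns (encoded_layers, pooled)
    return pool_token_states(layers[-1][0])     # last layer, first (only) sentence
```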
15. Image Feature Extraction (III)
We make the assumption that hidden layers hold relevant information for tasks other than the ImageNet classification for which the network was trained [ref].
Figure: scheme of the VGG-16.
20. Dataset (I)
● No labelled data available for our task
● Downloaded neutral (non-hate) memes from the Reddit Memes Dataset (3,325 memes)
● Downloaded memes from Google Images with the following keywords (1,695 memes):
○ racist meme: 643 memes
○ jew meme: 551 memes
○ muslim meme: 501 memes
● Total: 5,020 memes
● Dubious quality of annotations
● Train: 85% / Validation: 15%
21. Implementation - Setup
● Main language: Python
● Neural network framework: PyTorch
● VGG16 implementation and pretrained weights: Torchvision
● BERT implementation and pretrained weights: https://github.com/huggingface/pytorch-pretrained-BERT
● OCR: Tesseract 4.0, via the pytesseract wrapper for Python
22. Preprocessing
● OCR text extracted beforehand → much faster training process
● Character sequences converted to BERT token sequences (BERT input)
● BERT token sequences cropped/padded to 50 tokens
● Images resized to 224x224 (VGG input size)
23. Experiments and Results (I). Baseline
● No baseline exists for our task.
● Starting point:
○ Frozen VGG16 and BERT
○ Classifier: a Multi-Layer Perceptron (MLP) with two hidden layers, hidden size = 100
○ Optimizer: SGD with momentum. Learning rate = 0.01, momentum = 0.9
○ Batch size = 30
○ Loss function: Mean Squared Error (MSE)
Result: 82.6% validation accuracy
Figure: (a) validation accuracy, (b) training loss.
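The baseline classifier could be sketched as below; the input dimensionality (768 for BERT plus 4096 for a VGG-16 hidden layer) and the single sigmoid output trained against 0/1 labels are our assumptions, since the slides only fix the hidden size, optimizer, and MSE loss.

```python
import torch
import torch.nn as nn

class HateMemeMLP(nn.Module):
    """Two-hidden-layer MLP over fused (concatenated) text + image features."""
    def __init__(self, in_dim=768 + 4096, hidden=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # hate probability
        )

    def forward(self, x):
        return self.net(x)

model = HateMemeMLP()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.MSELoss()  # MSE against 0/1 labels, as on the slide
```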
24. Experiments and Results (II). Data Augmentation
● Resize images to 255x255 (instead of 224x224)
● Randomly crop a 224x224 patch
● Result: 82.0% accuracy
25. Experiments and Results (III). Capacity Reduction
● No data augmentation
● Hidden size = 50 (instead of 100)
● Result: 82% accuracy
26. Experiments and Results (IV). Dropout
● No data augmentation
● Hidden size = 100
● Dropout on all MLP layers (p=0.5)
● Result: 81% accuracy
27. Experiments and Results (V). Dropout
● No data augmentation
● Hidden size = 50
● Dropout on the first MLP layer (p=0.2)
● Result: 81.7% accuracy
28. Experiments and Results (VI).
Regularization summary:
● Baseline: 82.6%. Overfitting
● Data augmentation (random cropping): 81%. Overfitting
● Capacity reduction: 82%. Overfitting
● Dropout:
○ All MLP layers, p=0.5: 81%. Random forgetting
○ First MLP hidden layer (size 50), p=0.2: 81.7%. No overfitting
31. Fine-tuning the descriptors (II). BERT & VGG
After unfreezing BERT and VGG's classifier (top layers), we obtained an accuracy of 83.0%.
32. Fine-tuning the descriptors (III). BERT & VGG
Progressive fine-tuning: we unfreeze the descriptor weights at epoch X.
Figure: (a) validation accuracy, (b) validation loss. Blue: no fine-tuning. Light blue: fine-tuning from epoch 10 (accuracy 83.7%). Pink: fine-tuning from epoch 50 (accuracy 84.3%).
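A minimal sketch of this progressive unfreezing schedule; the helper names are ours, and in PyTorch the switch amounts to toggling `requires_grad` on the descriptor parameters.

```python
import torch.nn as nn

def set_requires_grad(module: nn.Module, flag: bool) -> None:
    """Freeze (flag=False) or unfreeze (flag=True) all parameters of a module."""
    for p in module.parameters():
        p.requires_grad = flag

def maybe_unfreeze(epoch: int, unfreeze_epoch: int,
                   bert: nn.Module, vgg_top: nn.Module) -> None:
    """Keep BERT and VGG's top classifier layers frozen until `unfreeze_epoch`
    (10 or 50 in the experiments), then let gradients flow into them."""
    if epoch == unfreeze_epoch:
        set_requires_grad(bert, True)
        set_requires_grad(vgg_top, True)
```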
34. Failed experiments (I). Unsupervised Pretraining
Figure: the hate speech detection architecture pretrained on an unsupervised image-text matching task.
We downloaded 1,500 unlabelled images and kept them separate from the labelled data. We were not able to learn anything from this task (50% accuracy, i.e. chance level).
35. Failed experiments (II). Introducing expert knowledge
We compiled a list of 12 words that can potentially signal hate speech. We one-hot encode the presence of these words in the OCR-extracted text and concatenate this vector with the image and text features.
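The presence vector can be sketched as below; the actual 12-word list is not given in the deck, so `lexicon` is a placeholder argument.

```python
def expert_vector(ocr_text: str, lexicon) -> list:
    """Presence vector: 1.0 for each lexicon word that occurs in the caption.
    The real 12-word list is not published in the deck; `lexicon` stands in."""
    words = set(ocr_text.lower().split())
    return [1.0 if w in words else 0.0 for w in lexicon]

# the resulting vector is concatenated with the image and text features, e.g.:
# fused = torch.cat([image_feats, text_feats, torch.tensor(expert_vector(text, LEXICON))])
```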
38. Further work
● Dataset
○ Poor annotation quality
○ Probably visually biased
○ Small
● Descriptors
○ XLNet models
○ Expert knowledge
● Better ways of fusing multimodal embeddings
● Improved OCR extraction
39. Conclusions
● Accuracy of up to 84.4%
● Explored several regularization techniques
● Unsupervised pre-training did not help
● The dataset is a weak point
● We still need an effective way to introduce expert knowledge