Hate Speech in Pixels:
Detection of Offensive Memes
towards Automatic Moderation
Benet Oriol Sàbat
Co-Directed by:
Xavier Giró
Cristian Canton
Contents
● Motivation
● System Description
● Experiments - Results
● Qualitative Results
● Further Work
● Conclusion
Motivation (I): Memes
What are memes?
Motivation (II): Hate Memes
What are hate memes?
Motivation (III): Hate Memes Detection
Hate Speech Detection
Overall System
Hate Speech Detection
OCR Extraction (I)
Hate Speech Detection
OCR Extraction (II)
OCR
When you act up in class
and your teacher starts
calling your parents but
you gave her the number to
Pizza Hut
Tesseract 4.0
Uses neural networks
0.5 s per image → text extracted beforehand
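Since OCR costs about 0.5 s per image, the text can be extracted once and reused in every training run. A minimal sketch of that idea, assuming a JSON cache file and an injectable `ocr_fn`; the names `extract_all` and `fake_ocr` and the cache layout are illustrative and not taken from the slides. With Pytesseract, `ocr_fn` could wrap `pytesseract.image_to_string`.

```python
import json
import os
import tempfile


def extract_all(image_paths, ocr_fn, cache_path):
    """Run OCR once per image and cache the text in a JSON file.

    ocr_fn is any callable mapping an image path to a string
    (assumption: the slides do not show the actual OCR call).
    """
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            cache = json.load(f)
    for path in image_paths:
        if path not in cache:  # only pay the OCR cost for unseen images
            cache[path] = ocr_fn(path)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(cache, f)
    return cache


# Usage with a stand-in OCR function that records how often it is called.
calls = []


def fake_ocr(path):
    calls.append(path)
    return "text of " + path


cache_file = os.path.join(tempfile.mkdtemp(), "ocr_cache.json")
first = extract_all(["a.png", "b.png"], fake_ocr, cache_file)
second = extract_all(["a.png", "b.png"], fake_ocr, cache_file)
```

The second call reads everything from the cache, so the OCR function is not invoked again.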
Text Feature Extraction (I)
Hate Speech Detection
Text Feature Extraction (II)
When you act up in class
and your teacher starts
calling your parents but
you gave her the number to
Pizza Hut
OCR → Text Embedder → Feature Vector
(t1, t2, …, tM) = [0.32, -0.79, ..., 1.04, 0.02]
Text Feature Extraction (III). BERT
When you act up in class
and your teacher starts
calling your parents but
you gave her the number to
Pizza Hut
BERT → Feature Vector
(t1, t2, …, tM) = [0.32, -0.79, ..., 1.04, 0.02]
Image Feature Extraction (I)
Hate Speech Detection
Image Feature Extraction (II)
Image embedder → Feature Vector
(i1, i2, …, iN) = [0.01, -1.2, …, 0.5, 0.52]
Image Feature Extraction (III)
We assume that the hidden layers of VGG-16 carry information relevant to tasks other
than ImageNet classification, for which the network was trained [ref].
Figure: scheme of the VGG-16.
Feature Fusion (I)
Hate Speech Detection
Feature Fusion (II). Concatenation
Feature fusion by concatenation:
Image Embedding: (i1, i2, …, iN)
Text Embedding: (t1, t2, …, tM)
Image + Text Embedding: (i1, i2, …, iN, t1, t2, …, tM)
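Concatenation simply places the M text features after the N image features. A minimal sketch with plain Python lists; the function name and the example values are illustrative. On PyTorch tensors the same operation would typically be done with `torch.cat` along the feature dimension.

```python
def fuse_by_concatenation(image_embedding, text_embedding):
    """Return the image features followed by the text features."""
    return list(image_embedding) + list(text_embedding)


image_embedding = [0.01, -1.2, 0.5, 0.52]  # N = 4 here, illustrative values
text_embedding = [0.32, -0.79, 1.04]       # M = 3 here, illustrative values
fused = fuse_by_concatenation(image_embedding, text_embedding)
```

The fused vector has N + M entries, which fixes the input size of the classifier that follows.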
Hate Predictor (I)
Hate Speech Detection
Hate Predictor (II)
Feature fusion (i1, i2, …, iN, t1, t2, …, tM) → Hate score ∈ ℝ
Dataset (I)
● No labelled data for our task
● Downloaded neutral (non-hate) memes from the Reddit Memes
Dataset (3325 memes)
● Downloaded memes from Google Images with the following keywords
(1695 memes):
○ racist meme: 643 memes
○ jew meme: 551 memes
○ muslim meme: 501 memes
● Total of 5020 memes.
● Dubious quality of annotations
● Train: 85%
● Validation: 15%
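The counts above can be checked, and the 85% / 15% split sketched, as follows. The random shuffle and the fixed seed are assumptions; the slides do not say how the split was drawn.

```python
import random

# Counts as stated in the slide.
NEUTRAL = 3325
HATE_BY_KEYWORD = {"racist meme": 643, "jew meme": 551, "muslim meme": 501}


def split_indices(n_items, train_fraction=0.85, seed=0):
    """Shuffle indices 0..n_items-1 and split them into train / validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


total = NEUTRAL + sum(HATE_BY_KEYWORD.values())
train_idx, val_idx = split_indices(total)
```

With 5020 memes this gives 4267 training and 753 validation samples.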
Implementation - Setup
● Main framework: Python
● Neural Nets Framework: PyTorch
● VGG16 Implementation and Pretrained weights: Torchvision
● BERT Implementation and Pretrained weights:
https://github.com/huggingface/pytorch-pretrained-BERT
● OCR: Tesseract 4.0, through the Pytesseract wrapper for Python
Preprocessing
● OCR text extracted beforehand → much faster training process
● Character sequence converted to BERT token sequence (BERT input)
● BERT token sequence cropped / padded to 50 tokens
● Images resized to 224x224 (VGG input size)
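The crop / pad step can be sketched as below. The padding id of 0, the example token ids and the returned 0/1 mask are assumptions for illustration; the slides only state the target length of 50 tokens.

```python
MAX_TOKENS = 50
PAD_ID = 0  # assumption: id of the padding token in the vocabulary used


def crop_or_pad(token_ids, length=MAX_TOKENS, pad_id=PAD_ID):
    """Truncate to `length` tokens, or right-pad with `pad_id`.

    Returns the fixed-length ids and a 0/1 mask marking the real tokens.
    """
    ids = list(token_ids)[:length]
    mask = [1] * len(ids)
    missing = length - len(ids)
    return ids + [pad_id] * missing, mask + [0] * missing


short_ids, short_mask = crop_or_pad([101, 2043, 2017, 102])  # illustrative ids
long_ids, long_mask = crop_or_pad(range(1, 81))              # 80 tokens, cropped
```

A fixed length lets every batch be stacked into a single tensor, at the cost of discarding tokens beyond position 50.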
Experiments and Results (I). Baseline
● No baseline for our task.
● Starting point:
○ Frozen VGG16 and BERT
○ Classifier: a Multi-Layer Perceptron (MLP) with two hidden layers, hidden size = 100.
○ Optimizer: SGD with momentum. Learning rate = 0.01, momentum = 0.9.
○ Batch size = 30
○ Loss function: Mean Squared Error (MSE).
Result: 82.6% validation accuracy
Figure: (a) validation accuracy; (b) training loss.
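The optimizer and loss of the baseline can be illustrated on a toy problem. This sketch uses the momentum update in the form v ← momentum·v + g, p ← p − lr·v (the form used by PyTorch's SGD with zero dampening) and a one-feature linear model standing in for the MLP; the data and the model are assumptions for illustration only.

```python
def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def sgd_momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One update: v <- momentum * v + g, then p <- p - lr * v."""
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Toy problem: fit score = w * x + b to 0/1 labels.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.0, 1.0, 1.0]
params, velocity = [0.0, 0.0], [0.0, 0.0]
initial_loss = mse([params[0] * x + params[1] for x in xs], ys)
for _ in range(200):
    preds = [params[0] * x + params[1] for x in xs]
    # Gradients of the MSE with respect to w and b.
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / len(xs)
    params, velocity = sgd_momentum_step(params, [grad_w, grad_b], velocity)
final_loss = mse([params[0] * x + params[1] for x in xs], ys)
```

Because the loss is MSE, the model outputs a real-valued score rather than a probability; a threshold on that score is needed to compute accuracy.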
Experiments and Results (II). Data Augmentation
● Resize image to 255x255 (Instead of 224x224)
● Randomly crop a 224x224 patch
● Result: Accuracy 82.0%
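The random crop amounts to choosing a top-left corner so that the 224x224 patch stays inside the resized image. A minimal sketch of that choice, with a hypothetical helper name; in torchvision the same augmentation is usually expressed with the `Resize` and `RandomCrop` transforms.

```python
import random


def random_crop_box(height, width, crop=224, rng=random):
    """Pick the (top, left, bottom, right) box of a random crop x crop patch."""
    if height < crop or width < crop:
        raise ValueError("image is smaller than the crop size")
    top = rng.randint(0, height - crop)   # randint is inclusive on both ends
    left = rng.randint(0, width - crop)
    return top, left, top + crop, left + crop


rng = random.Random(0)
boxes = [random_crop_box(255, 255, rng=rng) for _ in range(1000)]
```

With a 255x255 source the offsets range from 0 to 31 pixels, so each epoch sees slightly shifted versions of the same meme.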
Experiments and Results (III). Capacity Reduction
● No data Augmentation
● Hidden size = 50 (not 100)
● Result: Accuracy 82%
Experiments and Results (IV). Dropout
● No data augmentation
● Hidden size = 100
● Dropout:
○ All the MLP layers (p=0.5)
● Result: Accuracy 81%
Experiments and Results (V). Dropout
● No data augmentation
● Hidden size = 50
● Dropout:
○ First MLP layer (p=0.2)
● Result: Accuracy 81.7%
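Dropout with probability p zeroes each hidden unit independently during training and leaves the layer untouched at evaluation time. A minimal sketch of the "inverted" variant, which rescales the surviving units by 1 / (1 − p) as PyTorch's `nn.Dropout` does; the helper below is illustrative and not the code used in the experiments.

```python
import random


def dropout(activations, p, training=True, rng=random):
    """Inverted dropout: zero each unit with probability p, rescale the rest."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]


rng = random.Random(0)
hidden = [1.0] * 10000
dropped = dropout(hidden, p=0.2, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)
eval_out = dropout(hidden, p=0.2, training=False)
```

With p=0.2 roughly one unit in five is silenced per step, a milder regularizer than p=0.5 on every layer.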
Experiments and Results (VI). Regularization Summary
● Baseline: 82.6%. Overfitting
● Data augmentation (random cropping): 81%. Overfitting
● Capacity reduction: 82%. Overfitting
● Dropout:
○ All MLP layers, p=0.5: 81%. Random forgetting
○ First MLP layer, hidden size 50, p=0.2: 81.7%. No overfitting
Multimodal Fusion. Mono-mode systems
Dataset lower bound!
Fine-tuning the descriptors (I). BERT
Text-only classifier, with and without BERT fine-tuning
Fine-tuning the descriptors (II). BERT & VGG
After unfreezing BERT and VGG’s classifier (top layers), we obtained an accuracy of 83.0%.
Fine-tuning the descriptors (III). BERT & VGG
Progressive fine-tuning: we unfreeze the weights at epoch X.
Figure: (a) validation accuracy; (b) validation loss.
Blue: no fine-tuning. Light blue: fine-tuning from epoch 10 (accuracy 83.7%). Pink: fine-tuning from
epoch 50 (accuracy 84.3%).
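The unfreezing schedule can be written as a small helper that decides which parameter groups are updated at a given epoch. The group names (`mlp`, `bert`, `vgg_top`) are hypothetical, and the sketch assumes the unfrozen parts are the same as in the previous experiment (BERT and the top layers of VGG). In PyTorch, freezing is typically done by setting `requires_grad = False` on the parameters.

```python
def trainable_groups(epoch, unfreeze_epoch):
    """Return which parameter groups receive gradient updates at `epoch`.

    The MLP classifier always trains; the pretrained descriptors join once
    `epoch` reaches `unfreeze_epoch`. Pass unfreeze_epoch=None to keep the
    descriptors frozen for the whole run.
    """
    groups = ["mlp"]
    if unfreeze_epoch is not None and epoch >= unfreeze_epoch:
        groups += ["bert", "vgg_top"]
    return groups


schedule_10 = {e: trainable_groups(e, unfreeze_epoch=10) for e in (0, 9, 10, 60)}
schedule_50 = {e: trainable_groups(e, unfreeze_epoch=50) for e in (0, 49, 50, 60)}
frozen = trainable_groups(60, unfreeze_epoch=None)
```

Delaying the unfreezing lets the randomly initialized MLP settle before its gradients reach the pretrained weights.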
Fine-tuning the descriptors (IV). Summary
Failed experiments (I). Unsupervised Pretraining
Hate Speech Detection architecture
Unsupervised task (image + text matching)
We downloaded 1500 unlabelled images and kept them separate from the labelled data.
The model was not able to learn anything from this task (50% accuracy, i.e. chance level).
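One common way to set up an image-text matching task is to pair each image with its own text (label 1) and with the text of another image (label 0). The sketch below follows that recipe; it is an assumption about the setup, since the slides do not describe how the pairs were built, and the sample names are placeholders.

```python
import random


def make_matching_pairs(samples, rng=random):
    """Build (image, text, label) triples for an image-text matching task.

    Each sample yields one matched pair (label 1) and one mismatched pair
    (label 0) whose text comes from a different sample.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to build mismatches")
    offset = rng.randint(1, len(samples) - 1)  # never 0: text comes from another sample
    pairs = []
    for i, (image, text) in enumerate(samples):
        other_text = samples[(i + offset) % len(samples)][1]
        pairs.append((image, text, 1))
        pairs.append((image, other_text, 0))
    return pairs


samples = [("img%d.png" % i, "caption %d" % i) for i in range(5)]
pairs = make_matching_pairs(samples, rng=random.Random(0))
```

With balanced matched and mismatched pairs, a model that has learned nothing scores 50%.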
Failed experiments (II). Introducing expert knowledge
We compiled a list of 12 words that can potentially signal hate speech. We one-hot encode the
presence of these words in the OCR-extracted text and concatenate this vector with the
image and text features.
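A sketch of the keyword feature: a 12-dimensional binary vector with a 1 wherever a listed word occurs in the OCR text, appended to the other features. The vocabulary below consists of placeholders, because the actual 12 words are not given in the slides; the whole-word matching rule is also an assumption.

```python
import re

# Placeholder vocabulary: the actual 12 words are not listed in the slides.
KEYWORDS = ["term%02d" % i for i in range(1, 13)]


def keyword_vector(ocr_text, keywords=KEYWORDS):
    """Binary vector: 1.0 where the keyword occurs as a whole word in the text."""
    words = set(re.findall(r"[a-z0-9]+", ocr_text.lower()))
    return [1.0 if k in words else 0.0 for k in keywords]


def fuse(image_features, text_features, ocr_text):
    """Concatenate image features, text features and the keyword vector."""
    return list(image_features) + list(text_features) + keyword_vector(ocr_text)


vec = keyword_vector("Some TERM03 text, then term12!")
fused = fuse([0.1, 0.2], [0.3], "no keywords here")
```

The keyword vector adds only 12 dimensions to the fused embedding, so the classifier input grows from N + M to N + M + 12.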
Qualitative analysis (I). Best predictions
Qualitative analysis (II). Worst predictions
Further work
● Dataset
○ Poor annotation
○ Probably visually biased
○ Small
● Descriptors
○ XLNet Models
○ Expert knowledge
● Better ways of fusing multimodal embeddings.
● OCR extraction
Conclusions
● Accuracy up to 84.4%
● Explored regularization techniques
● The unsupervised pre-training task (image + text matching) did not help
● Poor dataset
● Need to find a way to introduce expert knowledge.
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBUsing PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDB
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 

Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation

  • 1. Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation. Benet Oriol Sàbat. Co-directed by: Xavier Giró, Cristian Canton
  • 2. Contents ● Motivation ● System Description ● Experiments - Results ● Qualitative Results ● Further Work ● Conclusion
  • 4. Motivation (II): Hate Memes. What are hate memes?
  • 5. Motivation (II): Hate Memes. What are hate memes?
  • 6. Motivation (III): Hate Memes Detection. Hate Speech Detection
  • 8. OCR Extraction (I). Hate Speech Detection
  • 9. OCR Extraction (II). OCR output: "When you act up in class and your teacher starts calling your parents but you gave her the number to Pizza Hut" ● Tesseract 4.0 ● Uses neural networks ● 0.5 s / image → OCR text extracted beforehand
  • 10. Text Feature Extraction (I). Hate Speech Detection
  • 11. Text Feature Extraction (II). OCR text ("When you act up in class and your teacher starts calling your parents but you gave her the number to Pizza Hut") → Text Embedder → Feature Vector [0.32, -0.79, ..., 1.04, 0.02] = (t1, t2, …, tM)
  • 12. Text Feature Extraction (III). BERT. OCR text ("When you act up in class and your teacher starts calling your parents but you gave her the number to Pizza Hut") → BERT → Feature Vector [0.32, -0.79, ..., 1.04, 0.02] = (t1, t2, …, tM)
  • 13. Image Feature Extraction (I). Hate Speech Detection
  • 14. Image Feature Extraction (II). Image → Image embedder → [0.01, -1.2, …, 0.5, 0.52] = (i1, i2, …, iN)
  • 15. Image Feature Extraction (III). We make the assumption that hidden layers have relevant information for tasks other than ImageNet classification (for which the network was trained) [ref]. Figure: scheme of the VGG-16
  • 16. Feature Fusion (I). Hate Speech Detection
  • 17. Feature Fusion (II). Concatenation. Image Embedding (i1, i2, …, iN) + Text Embedding (t1, t2, …, tM) → Concatenation → Image + Text Embedding (i1, i2, …, iN, t1, t2, …, tM)
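A minimal sketch of the concatenation on slide 17, in plain Python. The function name `fuse_by_concatenation` and the four-element toy vectors are illustrative assumptions; the real N and M depend on the image and text embedders.

```python
from typing import List, Sequence


def fuse_by_concatenation(image_embedding: Sequence[float],
                          text_embedding: Sequence[float]) -> List[float]:
    """Return (i1, ..., iN, t1, ..., tM): image features first, then text features."""
    return list(image_embedding) + list(text_embedding)


# Toy embeddings (N = 4, M = 4), only for illustration.
image_embedding = [0.01, -1.2, 0.5, 0.52]
text_embedding = [0.32, -0.79, 1.04, 0.02]

fused = fuse_by_concatenation(image_embedding, text_embedding)
print(len(fused))  # N + M
```

The fused vector has length N + M, so the predictor that follows must accept that many inputs.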
  • 18. Hate Predictor (I). Hate Speech Detection
  • 19. Hate Predictor (II). Feature fusion (i1, i2, …, iN, t1, t2, …, tM) → Hate score ∈ ℝ
  • 20. Dataset (I) ● No labelled data for our task ● Downloaded neutral (non-hate) memes from the Reddit Memes Dataset (3325 memes) ● Downloaded memes from Google Images with the following keywords (1695 memes): ○ racist meme: 643 memes ○ jew meme: 551 memes ○ muslim meme: 501 memes ● Total of 5020 memes ● Dubious quality of annotations ● Train: 85% ● Validation: 15%
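The counts on slide 20 can be checked, and an 85% / 15% split sketched, as follows. The slides do not say how memes were assigned to train and validation; the shuffled split, the seed, and the name `split_indices` are assumptions.

```python
import random

# Counts as reported on the slide.
counts = {
    "reddit (non-hate)": 3325,
    "racist meme": 643,
    "jew meme": 551,
    "muslim meme": 501,
}
total = sum(counts.values())
hate_total = total - counts["reddit (non-hate)"]


def split_indices(n_items, train_percent=85, seed=0):
    """Shuffle item indices and split them into train / validation lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = n_items * train_percent // 100  # integer arithmetic avoids float rounding
    return indices[:n_train], indices[n_train:]


train_idx, val_idx = split_indices(total)
print(total, hate_total, len(train_idx), len(val_idx))
```

With 5020 memes this gives 4267 training and 753 validation examples.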
  • 21. Implementation - Setup ● Main language: Python ● Neural Nets Framework: PyTorch ● VGG16 Implementation and Pretrained weights: Torchvision ● BERT Implementation and Pretrained weights: https://github.com/huggingface/pytorch-pretrained-BERT ● OCR: Tesseract 4.0, through the Pytesseract wrapper for Python
  • 22. Preprocessing ● OCR extraction done beforehand → much faster training process ● Character sequence converted to a BERT token sequence (BERT input) ● BERT token sequence cropped / padded to 50 tokens ● Images resized to 224x224 (VGG input size)
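The crop / pad step on slide 22 can be sketched in plain Python. This is not the author's code: whitespace splitting stands in for the BERT tokenizer, and keeping the first 50 tokens and padding on the right with `[PAD]` are assumptions about details the slides do not state.

```python
MAX_TOKENS = 50      # sequence length given on the slide
PAD_TOKEN = "[PAD]"  # BERT's padding token


def crop_or_pad(tokens, max_len=MAX_TOKENS, pad_token=PAD_TOKEN):
    """Keep at most max_len tokens, then right-pad up to exactly max_len."""
    cropped = list(tokens[:max_len])
    return cropped + [pad_token] * (max_len - len(cropped))


# Stand-in tokenization: a real pipeline would use the BERT tokenizer here.
short_tokens = "when you act up in class".split()
long_tokens = ["tok"] * 80

short_fixed = crop_or_pad(short_tokens)
long_fixed = crop_or_pad(long_tokens)
print(len(short_fixed), len(long_fixed))
```

Fixing the length lets the token sequences be stacked into batches of equal shape.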
  • 23. Experiments and Results (I). Baseline ● No baseline for our task. ● Starting point: ○ Frozen VGG16 and BERT ○ Classifier: a Multi-Layer Perceptron (MLP) with two hidden layers, hidden size = 100. ○ Optimizer: SGD with momentum. Learning rate = 0.01, momentum = 0.9. ○ Batch size = 30 ○ Loss function: Mean Squared Error (MSE). Result: 82.6% validation accuracy. Figure: (a) validation accuracy; (b) training loss.
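A hedged PyTorch sketch of the baseline classifier on slide 23, fed with random stand-ins for the frozen VGG16 and BERT features. The feature sizes (4096 and 768), the ReLU activations and the linear output unit are assumptions; the slides only give the number of hidden layers, the hidden size, the optimizer settings, the batch size and the loss.

```python
import torch
from torch import nn

IMAGE_DIM = 4096   # assumed: size of a VGG16 fully connected hidden layer
TEXT_DIM = 768     # assumed: hidden size of BERT base
HIDDEN_SIZE = 100  # from the slide
BATCH_SIZE = 30    # from the slide

torch.manual_seed(0)

# MLP with two hidden layers mapping the fused features to a single hate score.
classifier = nn.Sequential(
    nn.Linear(IMAGE_DIM + TEXT_DIM, HIDDEN_SIZE),
    nn.ReLU(),  # activation choice is an assumption
    nn.Linear(HIDDEN_SIZE, HIDDEN_SIZE),
    nn.ReLU(),
    nn.Linear(HIDDEN_SIZE, 1),
)
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.MSELoss()

# Random stand-ins for precomputed (frozen) image + text features and 0/1 labels.
fused = torch.randn(BATCH_SIZE, IMAGE_DIM + TEXT_DIM)
labels = torch.randint(0, 2, (BATCH_SIZE, 1)).float()

# One training step.
optimizer.zero_grad()
scores = classifier(fused)
loss = loss_fn(scores, labels)
loss.backward()
optimizer.step()
print(scores.shape, loss.item())
```

Because VGG16 and BERT are frozen in this setup, only the MLP parameters are passed to the optimizer.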
  • 24. Experiments and Results (II). Data Augmentation ● Resize image to 255x255 (instead of 224x224) ● Randomly crop a 224x224 patch ● Result: Accuracy 82.0%
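The random-crop augmentation can be illustrated by computing the crop box alone, assuming a 224x224 patch (the VGG input size from the preprocessing slide) cut from the 255x255 resized image. The helper name `random_crop_box` is hypothetical; the returned `(left, top, right, bottom)` layout matches what Pillow's `Image.crop` accepts.

```python
import random


def random_crop_box(resized=255, crop=224, rng=random):
    """Pick a random crop x crop window inside a resized x resized image."""
    max_offset = resized - crop  # 31 possible pixels of shift per axis
    left = rng.randint(0, max_offset)  # randint is inclusive on both ends
    top = rng.randint(0, max_offset)
    return left, top, left + crop, top + crop


left, top, right, bottom = random_crop_box(rng=random.Random(0))
print(right - left, bottom - top)
```

Each epoch then sees a slightly shifted view of the same meme, which is the intended regularization effect.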
  • 25. Experiments and Results (III). Capacity Reduction ● No data augmentation ● Hidden size = 50 (not 100) ● Result: Accuracy 82%
  • 26. Experiments and Results (IV). Dropout ● No data augmentation ● Hidden size = 100 ● Dropout: ○ All the MLP layers (p=0.5) ● Result: Accuracy 81%
  • 27. Experiments and Results (V). Dropout ● No data augmentation ● Hidden size = 50 ● Dropout: ○ First MLP layer (p=0.2) ● Result: Accuracy 81.7%
  • 28. Experiments and Results (VI). Regularization Summary: ● Baseline: 82.6%. Overfitting ● Data augmentation (random cropping): 81%. Overfitting ● Capacity reduction: 82%. Overfitting ● Dropout: ○ All the MLP layers, p=0.5: 81%, random forgetting ○ First MLP layer, hidden size 50, p=0.2: 81.7%, no overfitting
  • 29. Multimodal Fusion. Mono-mode systems. Dataset lower bound!
  • 30. Fine-tuning the descriptors (I). BERT. Text-only classifier, with and without BERT fine-tuning
  • 31. Fine-tuning the descriptors (II). BERT & VGG. After unfreezing BERT and VGG's classifier (top layers) we got an accuracy of 83.0%
  • 32. Fine-tuning the descriptors (III). BERT & VGG. Progressive fine-tuning: we unfreeze the weights at epoch X. Figure: (a) validation accuracy; (b) validation loss. Blue: no fine-tuning. Light blue: fine-tuning from epoch 10, accuracy 83.7%. Pink: fine-tuning from epoch 50, accuracy 84.3%.
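One way to sketch progressive fine-tuning in PyTorch is to toggle `requires_grad` on the descriptor at a chosen epoch. Everything here is a toy stand-in: a small `nn.Linear` plays the role of the BERT / VGG layers, the data is random, and `UNFREEZE_EPOCH = 3` replaces the epochs 10 and 50 reported on the slide.

```python
import torch
from torch import nn

torch.manual_seed(0)


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Freeze or unfreeze every parameter of a module."""
    for param in module.parameters():
        param.requires_grad = trainable


descriptor = nn.Linear(8, 4)  # toy stand-in for the pretrained descriptors
classifier = nn.Linear(4, 1)  # toy stand-in for the MLP
UNFREEZE_EPOCH = 3            # toy value, not from the slides

set_trainable(descriptor, False)  # start with frozen descriptors
optimizer = torch.optim.SGD(
    list(descriptor.parameters()) + list(classifier.parameters()),
    lr=0.01, momentum=0.9,
)

initial_weights = descriptor.weight.detach().clone()
unchanged_after_epoch = []
for epoch in range(5):
    if epoch == UNFREEZE_EPOCH:
        set_trainable(descriptor, True)  # descriptors start training from here
    inputs, targets = torch.randn(30, 8), torch.randn(30, 1)
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(classifier(descriptor(inputs)), targets)
    loss.backward()
    optimizer.step()
    unchanged_after_epoch.append(torch.equal(descriptor.weight, initial_weights))

print(unchanged_after_epoch)
```

The descriptor weights stay at their initial values while frozen and only start moving once the unfreeze epoch is reached.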
  • 33. Fine-tuning the descriptors (IV). Summary
  • 34. Failed experiments (I). Unsupervised Pretraining. The hate speech detection architecture was pretrained on an unsupervised task (image + text matching). We downloaded 1500 unlabelled images and kept them separate from the labelled data. We were not able to learn anything from this task (50% accuracy).
  • 35. Failed experiments (II). Introducing expert knowledge. We made a list of 12 words that can potentially be hate speech. We one-hot encode the presence of these words in the OCR-extracted text and concatenate this vector with the image and text features.
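The keyword feature from slide 35 could look like the sketch below. The slides do not list the 12 words, so `KEYWORDS` holds hypothetical placeholders, and the regex-based word matching and the helper names are likewise assumptions.

```python
import re

# Hypothetical placeholders: the actual 12 words are not given in the slides.
KEYWORDS = [f"keyword{i}" for i in range(1, 13)]


def keyword_presence(text, keywords=KEYWORDS):
    """Return one 0/1 entry per keyword, marking whether it occurs in the text."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return [1.0 if keyword in words else 0.0 for keyword in keywords]


def fuse_with_keywords(image_features, text_features, keyword_vector):
    """Concatenate image features, text features and the keyword vector."""
    return list(image_features) + list(text_features) + list(keyword_vector)


ocr_text = "Some keyword3 text with KEYWORD7 in it"
presence = keyword_presence(ocr_text)
fused = fuse_with_keywords([0.01, -1.2], [0.32, -0.79], presence)
print(presence, len(fused))
```

The classifier input grows by 12 dimensions, one per listed word.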
  • 36. Qualitative analysis (I). Best predictions
  • 37. Qualitative analysis (II). Worst predictions
  • 38. Further work ● Dataset ○ Poor annotation ○ Probably visually biased ○ Small ● Descriptors ○ XLNet models ○ Expert knowledge ● Better ways of fusing multimodal embeddings ● OCR extraction
  • 39. Conclusions ● Accuracy up to 84.4% ● Explored regularization techniques ● This unsupervised pre-training is useless ● Poor dataset ● Need to find a way to introduce expert knowledge