This paper describes our participating system for the multimodal fact verification (Factify) challenge at AAAI 2022. Despite recent advances in text-based verification techniques and large pre-trained multimodal models across vision and language, very limited work has been done on applying multimodal techniques to automate the fact-checking process, particularly considering the increasing prevalence of claims and fake news about images and videos on social media. In our work, the challenge is treated as a multimodal entailment task and framed as multi-class classification. Two baseline approaches are proposed and explored: an ensemble model (combining two unimodal models) and a multimodal attention network (modeling the interaction between the image and text pairs from the claim and evidence document). We conduct several experiments investigating and benchmarking different SoTA pre-trained transformers and vision models in this work. Our best model is ranked first on the leaderboard, obtaining a weighted average F-measure of 0.77 on both the validation and test sets. Exploratory analysis of the Factify dataset is also carried out and uncovers salient patterns and issues (e.g., word overlap, visual entailment correlation, source bias) that motivate our hypotheses. Finally, we highlight challenges of the task and the multimodal dataset for future research.
1. Logically at Factify 2022:
Multimodal Fact Verification
Jie Gao, Hella-Franziska Hoffmann, Stylianos Oikonomou,
David Kiskovski, Anil Bandhakavi
Feb 28, 2022
3. Factify Challenge: Introduction - Task
Claim
Text: China’s famed wandering elephants are on the move again, heading southwest while a male who broke from the herd is still keeping his distance. https://t.co/o5j7PDDveJ
Document
Text: By Julia Hollingsworth and Zixu Wang, CNN. Updated 1:03 AM ET, Fri June 11, 2021. (CNN) At least a dozen buzzing drones monitor them around the clock. Wherever they go, they're escorted by police. And when they eat or sleep, they're watched by millions online. CNN's Jessie Yeung contributed to this report.
Label: SUPPORT_MULTIMODAL
● images similar / about the same situation
● doc text supports claim text
Data challenge as part of De-Factify at AAAI ‘22
Train pairs: 35000
Validation pairs: 7500
Test pairs: 7500
4 weeks to train/eval, 1 week to apply to test
4. Factify Challenge: Introduction - Usage
● Entailment prediction is a technique for claim verification, i.e., predicting whether the evidence supports or refutes the claim
● Typically, given a tweet with a text message and an image, and a potential evidence article, can we automatically predict its veracity?
Overview (fact-checking pipeline): Claim Detection (worthiness, prioritising, claim matching) → Claim Verification (evidence retrieval → veracity prediction → produce justification). Factify (multimodal entailment) sits within claim verification.
6. Solution: Ensemble Model
● Train two unimodal models:
○ 3-way Textual Entailment:
“What is the relationship between
document and claim?”
support / refute / neutral
○ Image Relatedness:
“Is the doc. image contextually
related to the claim text + image?”
Y / N
● Combine the two unimodal models
with data-specific features into a
multimodal 5-way classifier.
Approach
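The combination step above can be illustrated with a fixed rule that maps the two unimodal outputs onto the 5-way Factify label space (the actual system learns this combination with a classifier, and the function and label strings here are only illustrative):

```python
# Illustrative rule combining the two unimodal predictions into the
# 5-way Factify label space. The submitted system uses a learned
# classifier with additional features, not this fixed rule.

def combine(text_label: str, images_related: bool) -> str:
    """text_label: 'support' | 'refute' | 'neutral' (3-way textual entailment);
    images_related: output of the image-relatedness model (Y/N)."""
    if text_label == "refute":
        return "Refute"  # Refute has no text/multimodal split
    if text_label == "support":
        return "Support_Multimodal" if images_related else "Support_Text"
    # neutral text entailment -> insufficient evidence
    return "Insufficient_Multimodal" if images_related else "Insufficient_Text"
```

This makes explicit why the 3-way text model and the binary image model together span the five Factify classes.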
7. Experiments: 5-way Multimodal Entailment
● Ensemble Model:
sklearn's DecisionTreeClassifier with ‘best’ split and ‘gini’ impurity as the training criterion and a maximum tree depth of 8.
● Feature Creation:
○ Text Entailment: pre-trained BigBird model fine-tuned on the Factify dataset
○ pre-trained ResNet-50 for image cosine similarity
○ sklearn one-hot encoders for image domains
Experiment setup
[Validation and test score tables]
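The ensemble classifier setup can be sketched as follows; the feature values here are synthetic stand-ins for the real BigBird entailment scores, ResNet-50 cosine similarities, and one-hot image-domain encodings, so only the classifier configuration reflects the slide:

```python
# Minimal sketch of the ensemble classifier configuration, assuming a
# toy feature matrix in place of the real BigBird / ResNet-50 / domain
# features described above.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# toy features: e.g. [text-entailment scores, image cosine sim, domain one-hots]
X = rng.random((200, 6))
y = rng.integers(0, 5, size=200)  # 5-way Factify labels encoded as 0..4

# 'best' split, 'gini' impurity, depth capped at 8, as in the slide
clf = DecisionTreeClassifier(splitter="best", criterion="gini",
                             max_depth=8, random_state=0)
clf.fit(X, y)
```

A shallow tree over a handful of interpretable features keeps the ensemble easy to inspect, which matters given the data biases discussed later.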
8. Experiments: 5-way Multimodal Entailment
[Leaderboard results; same experiment setup as the previous slide]
9. Experiments: 3-way Textual Entailment
● As part of the design we chose to train a separate model to address the textual entailment part of the multi-modal task:
“Given a claim and an evidence document, determine if the text evidence supports, refutes, or is neutral towards the claim.”
● Best model setup:
○ pre-trained Hugging Face BigBird
○ fine-tuned for pairwise classification of claim / doc text pairs over 2 epochs with the AdamW optimizer, learning rate 2e-5, epsilon 1e-8, batch size 4, and a max. sequence length of 1396 tokens.
Experiment setup
Validation Scores
Label Mapping (Factify label → text entailment label):
Support_Multimodal → Support
Support_Text → Support
Insufficient_Multimodal → Insufficient_Evidence
Insufficient_Text → Insufficient_Evidence
Refute → Refute
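The label mapping on this slide, written out as a plain lookup table for use when preparing the 3-way training data:

```python
# The slide's mapping from 5-way Factify labels to the 3-way labels
# used to train the textual-entailment model.
FACTIFY_TO_TEXT_LABEL = {
    "Support_Multimodal": "Support",
    "Support_Text": "Support",
    "Insufficient_Multimodal": "Insufficient_Evidence",
    "Insufficient_Text": "Insufficient_Evidence",
    "Refute": "Refute",
}
```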
10. Data Bias
Text Length Distribution by Label (Train)
OCR Text Length Distribution by Label (Train)
Many of our model choices were inspired by
inherent biases observed in the data.
Generating large annotated gold data sets that
appropriately represent the real-world fact
checking domain remains an ongoing challenge.
Text Word Overlap by Label (Train/Val)
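An analysis like "Text Word Overlap by Label" can be computed with a simple set-overlap measure between claim and document tokens; Jaccard similarity over lowercased whitespace tokens is an assumption here, not necessarily the exact metric used for the plot:

```python
# Illustrative word-overlap measure between claim and document text,
# of the kind used for the bias analysis above. The exact metric
# (Jaccard over lowercased whitespace tokens) is an assumption.
def word_overlap(claim: str, doc: str) -> float:
    a, b = set(claim.lower().split()), set(doc.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0
```

If such a cheap surface statistic separates the labels well, models can exploit it as a shortcut instead of learning genuine entailment, which is exactly the bias concern raised here.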
11. Data Bias
Img Similarity by Label (Val)
Claim Image Source Distribution by Label (Train)
12. Incorrect and Ambiguous Labels
Insufficient_Multimodal
claim: Special counsel Robert Mueller did not have sufficient evidence to prosecute obstruction, but does not exonerate President Trump. https://t.co/nfbBsVjDBG https://t.co/83P7RDQadK
doc: “Attorney General William Barr will now review the report. Robert Mueller ends Russia investigation without more indictments: Source. Special counsel Robert Mueller's much-anticipated report -- the product of nearly two years of investigation -- [..]
Support_Text
claim: President Trump and first lady Melania Trump paid their respects to Supreme Court Justice Ruth Bader Ginsburg as a crowd booed and chanted "Vote him out." https://t.co/M7m7kEIBg7 https://t.co/tWYfyKIdIF
doc: In an unprecedented move, her casket has been placed outside on the court steps. Remembering Supreme Court Justice Ruth Bader Ginsburg. Three days of public mourning for Justice Ruth Bader Ginsburg, a champion of equality and pioneer of women's rights, began Wednesday when her casket arrived at the Supreme Court [..]
13. Conclusion and Discussion
Learnings:
● DecisionTree classifier as the best-performing model
● 3-way text entailment as a separate task with its own value
● DNN-based multimodal model suffers from overfitting (refer to the paper for details)
● Clear data bias and ambiguous labels (e.g. “support_multimodal” vs “support_text”)
Recommendations:
● Improve data creation process to reduce bias
● More practical labels and annotation scheme for real-world applications/challenges
● Further experimentation with multimodal architectures