Talk on 'Tracking False Information Online' at W-NUT workshop at EMNLP 2019.
=========
Digital media enables fast sharing of information and discussion among users. While this brings many benefits to today’s society, such as broader access to information, the manner in which information is disseminated also has clear downsides. Because many users expect fast access to information and news outlets are often under financial pressure, speed often comes at the expense of accuracy, which leads to misinformation. Moreover, digital media can be misused by campaigns to intentionally spread false information, i.e. disinformation, about events, individuals or governments. In this talk, I will present the different ways in which false information is spread online, including misinformation and disinformation. I will then report findings from our recent and ongoing work on automatic fact checking, stance detection and framing attitudes.
Tracking False Information Online
1. W-NUT Workshop
4 November 2019
Tracking False Information Online
Isabelle Augenstein*
augenstein@di.ku.dk
@IAugenstein
http://isabelleaugenstein.github.io/
*Credit for some of the slides: Mareike Hartmann
8. Types of False Information
• Disinformation:
  • Intentionally false, spread deliberately
• Misinformation:
  • Unintentionally false information
• Clickbait:
  • Exaggerating information and under-delivering on it
• Satire:
  • Intentionally false for humorous purposes
• Biased Reporting:
  • Reporting only some of the facts to serve an agenda
10. Tracking False Information Online: NLP Tasks
Example claim: “Immigrants are a drain on the economy”
• Disinformation (Network) Detection
• Stance Detection: Target: Immigration; Stance: negative
• Veracity Prediction: Veracity: false
• Frame Detection: Target: Immigration; Frame: Economy
[Figures on slide: page excerpt from "Stance Detection with Bidirectional Conditional Encoding" (Augenstein et al., EMNLP 2016); poster "Issue Framing in Online Discussion Fora" (Hartmann, Jansen, Augenstein, Søgaard)]
12. Stance Detection with Bidirectional Conditional Encoding
Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva
EMNLP 2016
13. Stance Detection with Conditional Encoding
No more #NastyWomen or #BadHombres
Task: Is the tweet positive, negative or neutral towards a given target (Donald Trump)?
Problems:
- Interpretation depends on target
- Target not always mentioned in tweet
- No training data for test target
SemEval 2016, EMNLP 2016
14. Stance Detection Model:
Bidirectional Conditional Encoding
Target: Legalization of Abortion
Tweet: A foetus has rights too !
[Figure: inputs x1 to x9, each with forward and backward LSTM memory cells c and outputs h; x1 to x3 are the target tokens, x4 to x9 the tweet tokens]
Figure 1: Bidirectional encoding of tweet conditioned on bidirectional encoding of target ($[\overrightarrow{c}_3\ \overleftarrow{c}_1]$). The stance is predicted using the last forward and reversed output representations ($[\overrightarrow{h}_9\ \overleftarrow{h}_4]$).
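For reference, the unidirectional conditional encoding that this model builds on can be written out as below. The notation follows Section 3.2 of the EMNLP 2016 paper: $(x_1, \dots, x_T)$ are the target word vectors, $(x_{T+1}, \dots, x_N)$ the tweet word vectors, and $[h_0\ c_0]$ is a start state of zeros. The key step is the third line, where the tweet LSTM starts from the final memory $c_T$ of the target LSTM.

```latex
\begin{aligned}
[h_1\ c_1] &= \mathrm{LSTM}^{\text{target}}(x_1, h_0, c_0) \\
           &\ \ \vdots \\
[h_T\ c_T] &= \mathrm{LSTM}^{\text{target}}(x_T, h_{T-1}, c_{T-1}) \\
[h_{T+1}\ c_{T+1}] &= \mathrm{LSTM}^{\text{tweet}}(x_{T+1}, h_0, c_T) \\
           &\ \ \vdots \\
[h_N\ c_N] &= \mathrm{LSTM}^{\text{tweet}}(x_N, h_{N-1}, c_{N-1}) \\
c &= \tanh(W h_N), \qquad W \in \mathbb{R}^{3 \times k}
\end{aligned}
```

The bidirectional variant applies the same idea in both reading directions: in the figure, the forward tweet encoder is initialised from $\overrightarrow{c}_3$ and the reversed one from $\overleftarrow{c}_1$, and the stance is predicted from $[\overrightarrow{h}_9\ \overleftarrow{h}_4]$.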
15. Stance Detection with Conditional Encoding
• Weakly Supervised Setting
• Weakly label Donald Trump tweets using hashtags / expressions,
evaluate on Donald Trump tweets
positive:
make( ?)america( ?)great( ?)again
trump( ?)(for|4)( ?)president
negative:
#dumptrump
#notrump
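The weak labelling step can be sketched with Python's standard `re` module. The four patterns are copied from the slide. The function name `weak_label`, the case-insensitive matching and the choice to discard tweets with no cue or conflicting cues are assumptions made for illustration, not details confirmed by the talk.

```python
import re

# Patterns as listed on the slide.
POSITIVE = [r"make( ?)america( ?)great( ?)again", r"trump( ?)(for|4)( ?)president"]
NEGATIVE = [r"#dumptrump", r"#notrump"]


def weak_label(tweet):
    """Return 'positive', 'negative', or None when cues are absent or conflicting.

    Lower-casing the tweet and dropping ambiguous tweets are assumptions.
    """
    text = tweet.lower()
    pos = any(re.search(p, text) for p in POSITIVE)
    neg = any(re.search(p, text) for p in NEGATIVE)
    if pos == neg:  # neither side matched, or both did
        return None
    return "positive" if pos else "negative"


if __name__ == "__main__":
    for t in ["#MakeAmericaGreatAgain", "Time to #DumpTrump", "Nice weather today"]:
        print(t, "->", weak_label(t))
```

Because the optional-space groups `( ?)` match both the hashtag and the spaced-out phrase, one pattern covers "#MakeAmericaGreatAgain" and "make America great again" alike.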
16. Stance Detection with Conditional Encoding
• Weakly Supervised Setting
• Weakly label Donald Trump tweets using hashtags / expressions,
evaluate on Donald Trump tweets
Model   Stance    P       R       F1
Concat  FAVOR     0.5506  0.5878  0.5686
Concat  AGAINST   0.5794  0.4883  0.5299
Concat  Macro                     0.5493
BiCond  FAVOR     0.6268  0.6014  0.6138
BiCond  AGAINST   0.6057  0.4983  0.5468
BiCond  Macro                     0.5803 *
* state of the art on dataset
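As a quick sanity check, the Macro rows can be reproduced from the per-class precision and recall: each F1 is the harmonic mean of P and R, and the macro score is the unweighted mean of the FAVOR and AGAINST F1 values. The sketch below uses the rounded table values, so it should only be expected to agree with the table to about three decimals. The function names are illustrative.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class):
    """Unweighted mean of per-class F1 over (precision, recall) pairs."""
    scores = [f1(p, r) for p, r in per_class]
    return sum(scores) / len(scores)


# (P, R) for FAVOR and AGAINST, copied from the table.
concat = [(0.5506, 0.5878), (0.5794, 0.4883)]
bicond = [(0.6268, 0.6014), (0.6057, 0.4983)]

if __name__ == "__main__":
    print("Concat macro F1:", round(macro_f1(concat), 3))
    print("BiCond macro F1:", round(macro_f1(bicond), 3))
```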
17. Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces
Isabelle Augenstein*, Sebastian Ruder*, Anders Søgaard
NAACL HLT 2018 (long)
*equal contributions
18. Problem
- Different NLU tasks (e.g. stance detection, aspect-based sentiment analysis, natural language inference)
- Limited training data for most individual tasks
- However:
  - they can be modelled with the same base neural model
  - they are semantically related
  - they have similar labels
- How to exploit synergies between those tasks?
19. Datasets and Tasks
Topic-based sentiment analysis:
Tweet: No power at home, sat in the dark listening to AC/DC in the hope it’ll make the electricity come back again
Topic: AC/DC
Label: positive
Target-dependent sentiment analysis:
Text: how do you like settlers of catan for the wii?
Target: wii
Label: neutral
Aspect-based sentiment analysis:
Text: For the price, you cannot eat this well in Manhattan
Aspects: restaurant prices, food quality
Label: positive
Stance detection:
Tweet: Be prepared - if we continue the policies of the liberal left, we will be #Greece
Target: Donald Trump
Label: favor
Fake news detection:
Document: Dino Ferrari hooked the whopper wels catfish, (...), which could be the biggest in the world.
Headline: Fisherman lands 19 STONE catfish which could be the biggest in the world to be hooked
Label: agree
Natural language inference:
Premise: Fun for only children
Hypothesis: Fun for adults and children
Label: contradiction
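All of the examples above share one shape: a pair of sequences plus a task-specific label. That shared shape is what allows a single base model to be reused across tasks while the label spaces stay disparate. The sketch below shows one way to hold such data. The `PairInstance` class, its field names and the task identifiers are hypothetical and not taken from the paper's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairInstance:
    """One pairwise sequence classification example (hypothetical layout)."""
    task: str     # task identifier, e.g. "stance"
    text: str     # first sequence: tweet, review text, document or premise
    context: str  # second sequence: topic, target, headline or hypothesis
    label: str    # label drawn from the task's own label space


# Three of the slide's examples in the shared format.
EXAMPLES = [
    PairInstance("stance",
                 "Be prepared - if we continue the policies of the liberal left, "
                 "we will be #Greece",
                 "Donald Trump", "favor"),
    PairInstance("target_sentiment",
                 "how do you like settlers of catan for the wii?",
                 "wii", "neutral"),
    PairInstance("nli", "Fun for only children",
                 "Fun for adults and children", "contradiction"),
]


def label_spaces(instances):
    """Collect the set of labels observed for each task."""
    spaces = {}
    for inst in instances:
        spaces.setdefault(inst.task, set()).add(inst.label)
    return spaces


if __name__ == "__main__":
    print(label_spaces(EXAMPLES))
```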
25. Goal: Exploiting Synergies between Tasks
- Modelling tasks in a joint label space
- Label Transfer Network that learns to transfer labels between tasks
- Use semi-supervised learning, trained end-to-end with the multi-task learning model
- Extensive evaluation on a set of pairwise sequence classification tasks
32. Tracking False Information Online: NLP Tasks
Example claim: “Immigrants are a drain on the economy”
• Disinformation (Network) Detection
• Stance Detection: Target: Immigration; Stance: negative
• Veracity Prediction: Veracity: false
• Frame Detection: Target: Immigration; Frame: Economy
33. Issue Framing in Online Discussion Fora
Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein, Anders Søgaard
NAACL 2019
34. Motivation
- Framing: what aspect of a topic is referred to
- Previous work:
  - News articles, Twitter
  - Small datasets
- Here:
  - Online fora
  - Transfer learning, no data from target domain needed
35. Framing in Online Discussion Fora
We annotate a subset of an online discussion corpus (Argument Extraction Corpus, Swanson et al. 2015) with the 5 most frequent frames of the Policy Frames Codebook (Boydstun et al., 2014).
Economic frame: “But as we have seen, supporting same-sex marriage saves money.”
Legality & constitutionality frame: “So you admit that it is a right and it is being denied?”
Number of sequences per frame in our dataset:
78 Economic
234 Legality, constitutionality & jurisprudence
166 Policy prescription and evaluation
186 Crime & punishment
96 Political
760 Total
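The class balance implied by these counts is easy to work out and helps when reading per-class results on a dataset this small. The snippet below is a small illustrative calculation over the counts listed above. The variable and function names are made up for this example.

```python
# Sequence counts per frame, copied from the slide.
FRAME_COUNTS = {
    "Economic": 78,
    "Legality, constitutionality & jurisprudence": 234,
    "Policy prescription and evaluation": 166,
    "Crime & punishment": 186,
    "Political": 96,
}


def frame_shares(counts):
    """Return each frame's share of all annotated sequences."""
    total = sum(counts.values())
    return {frame: n / total for frame, n in counts.items()}


if __name__ == "__main__":
    shares = frame_shares(FRAME_COUNTS)
    for frame, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{frame}: {share:.1%}")
```

The counts add up to the stated total of 760, and the largest frame (Legality) is about three times the size of the smallest (Economic).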
36. Approach
(Multi-label) sequence classification without training data in the target domain
37. Results
[Figure: improvement over a random baseline, overall and per class: (1) Economic, (5) Legality, (6) Policy prescription & evaluation, (7) Crime & punishment, (13) Political; y-axis ticks from -0.2 to 0.4]
38. Example Predictions & Conclusion
Model predictions:
Gold  LSTM  MTL  Adv.  Sequence
5     7     5    5     But, star gazer, we had guns then when the Constitution was written and enshrined in the BOR and now incorporated into th 14th Civil Rights Amendment.
6     1     5    6     Gun control is about preventing such security risks.
7     1     5    7     First, you warn me of the dangers of using violent means to stop a crime.
5     6     6    6     So I don't see restrictions on handguns in D.C. as being a clear violation of the Second Amendment.
Labels: Economic (1); Political (13); Legality (5); Policy (6); Crime (7)
With no labeled training data in the target domain, training on additional data from other domains and additional annotations in the target domain is useful for predicting the target domain.
Conclusion:
• Training on other domains useful in lieu of target annotations
• Adversarial training more fruitful than multi-task learning
39. Tracking False Information Online: NLP Tasks
Example claim: “Immigrants are a drain on the economy”
• Disinformation (Network) Detection
• Stance Detection: Target: Immigration; Stance: negative
• Veracity Prediction: Veracity: false
• Frame Detection: Target: Immigration; Frame: Economy
40. MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims
Isabelle Augenstein, Christina Lioma, Dongsheng Wang, Lucas Chaves Lima, Casper Hansen, Christian Hansen and Jakob Grue Simonsen
EMNLP-IJCNLP 2019
41. Problem
- Misinformation and disinformation online
- Existing fact checking datasets
  - Small and/or
  - Artificial
- How to create a large real-world fact checking dataset?
  - Crawl English fact checking websites
  - Obtain:
    - Claims
    - Metadata
    - Evidence pages
42. Example
Feature | Value
ClaimID | farg-00004
Claim | Mexico and Canada assemble cars with foreign parts and send them to the U.S. with no tax.
Label | distorts
Claim URL | https://www.factcheck.org/2018/10/factchecking-trump-on-trade/
Reason | None
Category | the-factcheck-wire
Speaker | Donald Trump
Checker | Eugene Kiely
Tags | North American Free Trade Agreement
Claim Entities | United States, Canada, Mexico
Article Title | FactChecking Trump on Trade
Publish Date | October 3, 2018
Claim Date | Monday, October 1, 2018
Table 1: An example of a claim instance. Entities are obtained via entity linking. Article and outlink texts, evidence search snippets and pages are not shown.
43. Entities in Claims
Entity | Frequency
United States | 2810
Barack Obama | 1598
Republican Party (United States) | 783
Texas | 665
Democratic Party (United States) | 560
Donald Trump | 556
Wisconsin | 471
United States Congress | 354
Hillary Rodham Clinton | 306
Bill Clinton | 292
California | 285
Russia | 275
Ohio | 239
China | 229
George W. Bush | 208
Medicare (United States) | 206
Australia | 186
Iran | 183
Brad Pitt | 180
Islam | 178
Table 3: Top 30 most frequent entities listed by their Wikipedia URL with prefix omitted
45. More Problems
- How to model fact checking over disparate label spaces?
  - Augenstein et al. 2018
- How to incorporate evidence?
  - Google Search snippets
  - Train Evidence Ranking Model
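One simple way to let a learned evidence ranking influence the veracity prediction is to turn the per-snippet ranking scores into weights and average the per-snippet label scores with them. The sketch below illustrates only that general idea, under stated assumptions: the scores are made-up numbers, the function names are hypothetical, and the softmax weighting is an illustrative choice rather than a description of the model presented in the talk.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_evidence(label_scores, rank_scores, labels):
    """Combine per-snippet label scores, weighted by evidence ranking.

    label_scores: one list of label scores per evidence snippet.
    rank_scores:  one relevance score per evidence snippet.
    Both are assumed to come from some upstream model.
    """
    weights = softmax(rank_scores)
    combined = [
        sum(w * scores[j] for w, scores in zip(weights, label_scores))
        for j in range(len(labels))
    ]
    best = max(range(len(labels)), key=lambda j: combined[j])
    return labels[best], combined


if __name__ == "__main__":
    labels = ["true", "false"]
    # Snippet 1 supports "true" but is ranked low; snippet 2 supports "false".
    label_scores = [[2.0, 0.0], [0.0, 1.0]]
    rank_scores = [0.0, 3.0]
    print(predict_with_evidence(label_scores, rank_scores, labels))
```

In this toy setting the highly ranked snippet dominates the decision, which is the behaviour one would want from evidence that overlaps strongly with the claim.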
46. Evidence-Based Fact Checking Model
[Paper excerpt shown on the slide]
Evidence-Based Fact Checking of Claims
Anonymous ACL submission
Abstract
We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. It is collected from 38 English fact checking websites, paired with textual sources and rich metadata, and labelled for veracity by human expert journalists. We present an in-depth analysis of the dataset, highlighting characteristics and challenges. Further, we present results for automatic veracity prediction, both with established baselines and with a novel method for joint ranking of evidence pages and predicting veracity that outperforms all baselines. Significant performance increases are achieved by encoding evidence, and by modelling metadata. Our best-performing model achieves a Macro F1 of 45.9%, showing that this is a challenging testbed for claim veracity prediction.
1 Introduction
Misinformation and disinformation are two of the most pertinent and difficult challenges of the in-
Table 1 (see slide 42): An example of a claim instance. Entities are obtained via entity linking. Article and outlink texts, evidence search snippets and pages are not shown.
51. Result Trends
• Meta-data: topic tags most important, entities least
important
• Correctly predicting ‘true’ claims is much easier than
‘false’ ones
• Most confusions happen over close labels
• General topic tags frequently co-occur with incorrect
predictions; more specific tags often co-occur with
correct predictions
52. Error Analysis
• Difficult Instances
• Long claims
• General tags (e.g. ‘politics’)
• Easy Instances
• Short claims
• Strong lexical cues in certain domains, e.g. death hoaxes
• High Learned Evidence Ranking
• High overlap with claim
53. Conclusions
• To what degree are fact checking portals useful for
veracity prediction?
• Depends on portal: some genuinely challenging, others
easy to overfit to (e.g. those debunking celebrity death
hoaxes)
• What does this mean for automatic fact checking?
• Portals are a good resource as such
• More challenging evaluation setups should be investigated
for more realistic evaluation, e.g. better negative sampling
54. Tracking False Information Online: NLP Tasks
“Immigrants are
a drain on the
economy”
Disinformation (Network)
Detection
Target: Immigration
Stance: negative
[Figure: two chained bidirectional LSTMs. Inputs x1 to x3 read the target "Legalization of Abortion" and inputs x4 to x9 read the tweet "A foetus has rights too !", with forward and backward memory cells c and outputs h at each step.]
Figure 1: Bidirectional encoding of tweet conditioned on bidirectional encoding of target ($[c_3^{\rightarrow}\ c_1^{\leftarrow}]$). The stance is predicted using the last forward and reversed output representations ($[h_9^{\rightarrow}\ h_4^{\leftarrow}]$).
Here, $x_t$ is an input vector at time step $t$, $c_t$ denotes the LSTM memory, $h_t \in \mathbb{R}^k$ is an output vector and the remaining weight matrices and biases are trainable parameters. We concatenate the two output vector representations and classify the stance using the softmax over a non-linear projection
$$\mathrm{softmax}(\tanh(W_{ta} h_{target} + W_{tw} h_{tweet} + b))$$
into the space of the three classes for stance detection, where $W_{ta}, W_{tw} \in \mathbb{R}^{3 \times k}$ are trainable weight matrices and $b \in \mathbb{R}^3$ is a trainable class bias. This model learns target-independent distributed representations for the tweets and relies on the non-linear projection layer to incorporate the target in the stance prediction.
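The projection described above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the helper names, the toy dimension k = 2 and all numeric values are assumptions, and in practice the weights would be learned rather than hand-set.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def classify_stance(W_ta, W_tw, b, h_target, h_tweet):
    """softmax(tanh(W_ta h_target + W_tw h_tweet + b)) over three stance classes."""
    a = matvec(W_ta, h_target)
    c = matvec(W_tw, h_tweet)
    return softmax([math.tanh(x + y + z) for x, y, z in zip(a, c, b)])

# Toy values with k = 2 (hypothetical, not trained parameters)
W_ta = [[0.1, -0.2], [0.3, 0.0], [-0.1, 0.4]]
W_tw = [[0.5, 0.1], [-0.3, 0.2], [0.0, -0.4]]
b = [0.0, 0.1, -0.1]
probs = classify_stance(W_ta, W_tw, b, h_target=[0.2, -0.1], h_tweet=[0.7, 0.3])
print(probs)
```

Note that the target only enters through the final layer here; the tweet vector itself is computed without knowing the target.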
3.2 Conditional Encoding
In order to learn target-dependent tweet representations, we use conditional encoding as previously applied to the task of recognising textual entailment (Rocktäschel et al., 2016). We use one LSTM to encode the target as a fixed-length vector. Then, we encode the tweet with another LSTM, whose state is initialised with the representation of the target. Finally, we use the last output vector of the tweet LSTM to predict the stance of the target-tweet pair.
Formally, let $(x_1, \ldots, x_T)$ be a sequence of target word vectors, $(x_{T+1}, \ldots, x_N)$ be a sequence of tweet word vectors and $[h_0\ c_0]$ be a start state of zeros. The two LSTMs map input vectors and a previous state to a next state as follows:
$$[h_1\ c_1] = \mathrm{LSTM}^{target}(x_1, h_0, c_0)$$
$$\ldots$$
$$[h_T\ c_T] = \mathrm{LSTM}^{target}(x_T, h_{T-1}, c_{T-1})$$
$$[h_{T+1}\ c_{T+1}] = \mathrm{LSTM}^{tweet}(x_{T+1}, h_0, c_T)$$
$$\ldots$$
$$[h_N\ c_N] = \mathrm{LSTM}^{tweet}(x_N, h_{N-1}, c_{N-1})$$
Finally, the stance of the tweet w.r.t. the target is classified using a non-linear projection
$$c = \tanh(W h_N)$$
where $W \in \mathbb{R}^{3 \times k}$ is a trainable weight matrix. This effectively allows the second LSTM to read the tweet in a target-specific manner, which is crucial since the stance of the tweet depends on the target (recall the Donald Trump example above).
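The conditional encoding recurrence can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: the standard LSTM gate layout, the helper names, the random untrained weights and the toy word vectors. The only part taken from the equations is the hand-over step, where the tweet LSTM starts from the zero output $h_0$ and the target LSTM's final memory $c_T$.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_lstm(input_dim, k, rng):
    """Random parameters for one LSTM: four gates, each acting on [x; h] plus a bias."""
    def gate():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + k)] for _ in range(k)]
        return W, [0.0] * k
    return {name: gate() for name in ("i", "f", "o", "g")}

def lstm_step(params, x, h, c):
    """One standard LSTM step mapping (x, h, c) to the next (h, c)."""
    xh = x + h  # concatenate input and previous output
    def affine(name):
        W, b = params[name]
        return [sum(w * v for w, v in zip(row, xh)) + bj for row, bj in zip(W, b)]
    i = [sigmoid(z) for z in affine("i")]
    f = [sigmoid(z) for z in affine("f")]
    o = [sigmoid(z) for z in affine("o")]
    g = [math.tanh(z) for z in affine("g")]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new

def conditional_encode(target_lstm, tweet_lstm, target_vecs, tweet_vecs, k):
    """Encode the target, then read the tweet starting from (h_0, c_T)."""
    h, c = [0.0] * k, [0.0] * k
    for x in target_vecs:
        h, c = lstm_step(target_lstm, x, h, c)
    h = [0.0] * k  # tweet LSTM restarts from h_0 but keeps the target memory c_T
    for x in tweet_vecs:
        h, c = lstm_step(tweet_lstm, x, h, c)
    return h  # h_N

rng = random.Random(0)
k, input_dim = 3, 2
target_lstm = make_lstm(input_dim, k, rng)
tweet_lstm = make_lstm(input_dim, k, rng)
W = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(3)]

tweet = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]  # toy word vectors
h_a = conditional_encode(target_lstm, tweet_lstm, [[1.0, 0.0], [0.0, 1.0]], tweet, k)
h_b = conditional_encode(target_lstm, tweet_lstm, [[0.0, 1.0], [1.0, 1.0]], tweet, k)
scores_a = [math.tanh(sum(w * v for w, v in zip(row, h_a))) for row in W]  # c = tanh(W h_N)
print(h_a != h_b, scores_a)
```

Running the same tweet with two different targets yields different final vectors, which is the point of conditioning: the tweet representation is no longer target-independent.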
3.3 Bidirectional Conditional Encoding
Bidirectional LSTMs (Graves and Schmidhuber, 2005) have been shown to learn improved representations of sequences by encoding a sequence from left to right and from right to left. Therefore, we adapt the conditional encoding model from Section 3.2 to use bidirectional LSTMs, which represent the target and the tweet using two vectors for each of them, one obtained by reading the target
Stance Detection
Veracity Prediction
Veracity: false
Target: Immigration
Frame: Economy
Frame Detection
78 Economic
234 Legality, constitutionality & jurisprudence
166 Policy prescription and evaluation
186 Crime & punishment
96 Political
760 Total
(Multi-label) sequence classification without training data in the target domain
Issue Framing in Online Discussion Fora
Mareike Hartmann1
Tallulah Jansen2
Isabelle Augenstein1
Anders Søgaard1
1 Department of Computer Science, University of Copenhagen, Denmark
2 Institute of Cognitive Science, Osnabrück University, Germany
Framing in Online Discussion Fora
The framing of an issue refers to a choice of perspective when talking about it:
Economic frame: “But as we have seen, supporting same-sex
marriage saves money.”
Legality & constitutionality frame:
“So you admit that it is a right and it is
being denied?”
We annotate a subset of an online discussion corpus (Argument
Extraction Corpus, Swanson et al. 2015) with the 5 most frequent
frames of the Policy Frames Codebook
Number of sequences per frame in our dataset:
Results & Examples
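[Figure: bar chart of improvement over a random baseline, shown overall and per class: (1) Economic, (5) Legality, (6) Policy prescription & evaluation, (7) Crime & punishment, (13) Political. Vertical axis from -0.2 to 0.4.]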
Gold | LSTM | MTL | Adv. | Sequence
5 | 7 | 5 | 5 | But, star gazer, we had guns then when the Constitution was written and enshrined in the BOR and now incorporated into th 14th Civil Rights Amendment.
6 | 1 | 5 | 6 | Gun control is about preventing such security risks.
7 | 1 | 5 | 7 | First, you warn me of the dangers of using violent means to stop a crime.
5 | 6 | 6 | 6 | So I don't see restrictions on handguns in D.C. as being a clear violation of the Second Amendment.
Boydstun et al. (2014) develop the Policy Frames Codebook,
with generic frames applicable across topics and domains
Improvement over a random baseline
overall and per class
With no labeled training data in the target domain, training on additional data from other
domains and additional annotations in the target domain is useful for predicting the
target domain
Model predictions
Approach
55. Mapping (Dis-)Information Flow
about the MH17 Plane Crash
Mareike Hartmann, Yevgeny
Golovchenko, Isabelle Augenstein
NLP4IF @ EMNLP-IJCNLP 2019
57. Information Flow on Twitter
• Goal: Produce retweet network
• Red: pro-Russian edges; Blue: pro-Ukrainian; grey: neutral edges
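As a rough illustration of what producing such a network involves, the sketch below builds a directed retweet graph from labelled retweets and assigns each edge the colour used on the slide. The input format, function name, user names and the choice to point edges from the original author to the retweeter are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Colour scheme from the slide
EDGE_COLOURS = {"pro-Russian": "red", "pro-Ukrainian": "blue", "neutral": "grey"}

def build_retweet_network(retweets):
    """Build author -> [(retweeter, colour)] from (retweeter, author, label) triples."""
    graph = defaultdict(list)
    for retweeter, author, label in retweets:
        # Edge direction follows information flow: from author to retweeter
        graph[author].append((retweeter, EDGE_COLOURS[label]))
    return dict(graph)

# Hypothetical users and labels, for illustration only
retweets = [
    ("user_b", "user_a", "pro-Russian"),
    ("user_c", "user_a", "pro-Russian"),
    ("user_c", "user_d", "pro-Ukrainian"),
    ("user_e", "user_d", "neutral"),
]
network = build_retweet_network(retweets)
colour_counts = Counter(colour for edges in network.values() for _, colour in edges)
print(network)
print(colour_counts)
```

In this setup the edge labels would come from a tweet-level classifier, so classification errors translate directly into miscoloured edges.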
58. Data (Golovchenko et al. 2018)
Challenges:
• Small dataset size
• Skewed class distribution
• Specific definition of polarization
• Background knowledge is required
61. Error Analysis
Columns: True Class, Prediction, Tweet, Potential Reason for Error
Potential reason for error: Event-specific background knowledge needed
• True class Pro-Ukr, predicted Pro-Ru: "@Werteverwalter @Ian56789 @ClarkeMicah no SU-25 re #MH17 believer has ever been able to explain it,facts always get in their way"
• True class Pro-Ru, predicted Pro-Ukr: "RT @NinaByzantina: #MH17 redux: 1) #Kolomoisky admits involvement URL 2) gets $1.8B of #Ukraine’s bailout funds"
Potential reason for error: Irony/humour
• True class Pro-Ukr, predicted Pro-Ru: "#Russia again claiming that #MH17 was shot down by air-to-air missile, which of course wasn’t russian-made. #LOL"
• True class Pro-Ru, predicted Pro-Ukr: "RT @merahza: If you believe the pro Russia rebels shot #MH17 then you’ll believe Justine Bieber is the next US President and that Coke is a ..."
Potential reason for error: Overfitting
• True class Pro-Ukr, predicted Pro-Ru: "RT @ChadPergram: Hill intel sources say Russia has the capability to potentially shoot down a #MH17 but not Ukraine."
• True class Pro-Ru, predicted Pro-Ukr: "RT @truthhonour: Yes Washington was behind Eukraine jets that shot down MH17 as pretext to conflict with Russia. No secrets there"
62. Conclusion
• Tracking false information online often involves
dealing with noisy and limited labelled data
• Possible solutions presented here:
• Multi-task learning and adversarial training for learning with
limited data
• Label embeddings for dealing with disparate labels
automatically
• Crawling real-world noisy data to obtain more training data
63. Presented Papers
Stance Detection with Bidirectional Conditional Encoding.
Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina
Bontcheva.
EMNLP 2016
Multi-task Learning of Pairwise Sequence Classification Tasks Over
Disparate Label Spaces.
Isabelle Augenstein, Sebastian Ruder, Anders Søgaard.
NAACL HLT 2018
Issue Framing in Online Discussion Fora.
Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein, Anders
Søgaard.
NAACL HLT 2019
64. Presented Papers
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact
Checking of Claims.
Isabelle Augenstein, Christina Lioma, Dongsheng Wang, Lucas Chaves
Lima, Casper Hansen, Christian Hansen and Jakob Grue Simonsen
EMNLP 2019
Mapping (Dis-)Information Flow about the MH17 Plane Crash.
Mareike Hartmann, Yevgeny Golovchenko, Isabelle Augenstein
NLP4IF @ EMNLP-IJCNLP 2019
65. Future Work
Relationship between stance detection and gender bias
Why is that interesting / relevant?
• 3 times as many negative body-related verbs modifying female nouns
as opposed to male nouns (Hoyle et al., 2019)
• Female MPs receive significantly more abuse than male ones (Gorrell
et al., 2018)
• Significantly more negative tweets directed towards Hillary Clinton
than Bernie Sanders during 2016 US Election (Tromble & Hovy, 2016)
Goal: identifying gender-biased language in attitudes
towards entities on social media
66. Hiring
PhD student (3 years) – gender bias & stance detection on
social media
funded by Danish Research Council (DFF) grant
application deadline: 1 December 2019
starting date: Spring 2020
https://tinyurl.com/y4nkp8kh
PhD students (3 years) – open topic
funded by H2020 Marie-Curie COFUND
application deadline: March 2020
starting date: 1 August 2020
https://talent.ku.dk/
67. Research Group
CopeNLU
https://copenlu.github.io/
Clockwise from the left:
Postdocs: Johannes Bjerva
PhD Students: Pepa Atanasova, Andreas Nugaard Holm,
Dustin Wright
PhD Interns: Wei Zhao
PhD Students (affiliated): Yova Kementchedjhieva,
Ana Valeria González, Nils Rethmeier,
Mareike Hartmann, Andrea Lekkas