SlideShare a Scribd company logo
1 of 29
Download to read offline
Jing Ma (CUHK) 2016/7/1
1
Detect Rumors in Microblog Posts Using
Propagation Structure via Kernel Learning
Jing Ma1, Wei Gao2*, Kam-Fai Wong1,3
1The Chinese University of Hong Kong
2Victoria University of Wellington, New Zealand
3MoE Key Laboratory of High Confidence Software Technologies, China
July 31- August 2, 2017 – ACL 2017 @ Vancouver, Canada
*Work done when Wei Gao was in QCRI
Jing Ma (CUHK)
■Introduction
■Related Work
■Tweets Propagation
■Kernel Modeling
■Evaluation
■Conclusion and Future Work
Outline
2016/7/1
2
Jing Ma (CUHK)
Introduction
2016/7/1
3
A story or statement whose truth value is
unverified or deliberately false
Jing Ma (CUHK)
The fake news went viral
Introduction
2016/7/1
4
‘*’ indicates the level of influence.
Start from a grass-roots users, promoted by some influential accounts, widely spread
Jing Ma (CUHK)
Motivation
■We generally are not good at distinguishing rumors
■It is crucial to track and debunk rumors early to minimize
their harmful effects.
■Online fact-checking services have limited topical
coverage and long delay.
■Existing models use feature engineering – over simplistic;
or recently deep neural networks – ignore propagation
structures.
2016/7/1
5
Jing Ma (CUHK)
Contributions
■Represent information spread on Twitter with propagation
tree, formed by harvesting user’s interactions, to capture
high-order propagation patterns of rumors.
■Propose a kernel-based data-driven method to generate
relevant features automatically for estimating the similarity
between two propagation tees.
■Enhance the proposed model by considering propagation
paths from source tweet to subtrees to capture the context
of transmission.
■Release two real-world twitter datasets with finer-grained
ground truth labels.
2016/7/1
6
Jing Ma(CUHK)
Outline
2016/7/1
7
■Introduction
■Related Work
■Tweets Propagation
■Kernel Modeling
■Evaluation
■Conclusion and Future Work
Jing Ma (CUHK)
Related Work
■Systems based on common sense and investigative
journalism, e.g.,
■ snopes.com
■ factcheck.org
■Learning-based models for rumor detection
■ Information credibility: Castillo et al. (2011), Yang et al. (2012)
■ Using handcrafted and temporal features: Liu et al. (2015), Ma
et al. (2015), Kwon et al. (2013, 2017)
■ Using cue terms: Zhao et al. (2015)
■ Using recurrent neural networks: Ma et al. (2016)
■Kernel-based works
■ Tree kernel: syntactic parsing (Collins and Duffy, 2001)
■ Question-answering (Moschitti, 2006)
■ Semantic analysis (Moschitti, 2004)
■ Relation extraction (Zhang et al., 2008)
■ Machine translation (Sun et al., 2010)
2016/7/1
8
Jing Ma(CUHK)
Outline
2016/7/1
9
■Introduction
■Related Work
■Tweets Propagation
■Kernel Modeling
■Evaluation
■Conclusion and Future Work
Jing Ma (CUHK)
Problem Statement
2016/7/1
10
■Given a set of microblog posts R = {𝑟}, model each source
tweet as a tree structure T 𝑟 = < 𝑉, 𝐸 >, where each
node 𝑣 = (𝑢 𝑣, 𝑐 𝑣, 𝑡 𝑣) provide the creator of the post, the
text content and post time. And 𝐸 is directed edges
corresponding to response relation.
■Task 1 – finer-grained classification for each source post
false rumor, true rumor, non-rumor, unverified rumor
■Task 2 – detect rumor as early as possible
Jing Ma (CUHK)
Propagation Structure
2016/7/111
Text Features
Propagation & Temporal
Features
User Features
Propagatio
n tree
O: Walmart donates $10,000 to support Darren
Wilson and the on going racist police murders
U1: You don't honestly
believe that, do you?
U2: i honestly do
U5: where is the
credible link?
U3: …Sam Walton gave
300k to Obama's campaign?
THINK.
U4: Sam Walton was dead
before #Obama was born.
He have wired campaign
donation from heavens.
U6: Need proof of
this-can't find any...
U7: not sure....sorry I
see a meme trending
but no proof...perhaps
if we had real journalists?
U8: I'm pretty good at
research-I think this is not
true-plenty of other
reasons to boycott WalMart. :)
……
……
Jing Ma (CUHK)
Observation & Hypothesis
2016/7/1
12
■Content-based signals (e.g., stance) (Zhao et al., 2015)
■Network-based signals (e.g., relative influence) and
temporal traits (Kwon et al., 2017)
■Our hypothesis: high-order patterns needs to/could be
captured using kernel method
(a) In rumors (b) In non-rumors
Doubt
Support
Neutral
Influence/Popularity
Jing Ma(CUHK)
Outline
2016/7/1
13
■Introduction
■Related Work
■Tweets Propagation
■Kernel Modeling
■Evaluation
■Conclusion and Future Work
Jing Ma (CUHK)
Traditional Tree Kernel (TK)
2016/7/1
14
■TK compute the syntactic similarity between two sentences
by counting the common subtrees
■Kernel Function:σ 𝑣 𝑖∈𝑉1
σ 𝑣 𝑗∈𝑉2
∆(𝑣𝑖, 𝑣𝑗)
■ ∆ 𝑣𝑖, 𝑣𝑗 : common subtrees rooted at 𝑣𝑖 and 𝑣𝑗
Jing Ma (CUHK)
Propagation Tree Kernel (PTK)
2016/7/1
15
■Why PTK?
■ Existing tree kernel cannot apply here, since in our case (1) node
is a vector of continuous numerical values; (2) similarity needs to
be softly defined between two trees instead of hardly counting on
identical nodes
■Similarity Definition
■ User Similarity: ℰ(𝑢𝑖, 𝑢𝑗)=||𝑢𝑖 − 𝑢𝑗||
■ Content Similarity: 𝒥(𝑐𝑖, 𝑐𝑗)=
|𝑁𝑔𝑟𝑎𝑚(𝑐 𝑖)∩𝑁𝑔𝑟𝑎𝑚(𝑐 𝑗)|
|𝑁𝑔𝑟𝑎𝑚(𝑐 𝑖)∪𝑁𝑔𝑟𝑎𝑚(𝑐 𝑗)|
■ Node Similarity:
𝑓 𝑣𝑖, 𝑣𝑗 = 𝑒−𝑡(𝛼ℰ(𝑢𝑖, 𝑢𝑗)+(1−𝛼)𝒥(𝑐𝑖, 𝑐𝑗) )
Jing Ma (CUHK)
Propagation Tree Kernel
2016/7/1
16
■ Given two trees 𝑇1 =< 𝑉1, 𝐸1 > and 𝑇2 =< 𝑉2, 𝐸2 > ,PTK compute
similarity between them by enumerating all similar subtrees.
■ Kernel Function: σ 𝑣 𝑖∈𝑉1
∆ 𝑣𝑖, 𝑣𝑖
′
+ σ 𝑣 𝑗∈𝑉2
∆ 𝑣𝑗
′
, 𝑣𝑗
■ 𝑣𝑖 and 𝑣𝑖
′
are similar node pairs from 𝑉1 and 𝑉2 respectively
𝑣𝑖
′
= arg max
𝑣 𝑗∈𝑉2
𝑓(𝑣𝑖, 𝑣𝑗)
■ ∆ 𝑣, 𝑣′ : similarity of two subtrees rooted at 𝑣 and 𝑣′
■ Kernel algorithm
1) if 𝒗 or 𝒗′
are leaf nodes, then
∆ 𝒗, 𝒗′ = 𝒇(𝒗, 𝒗′);
2) else
∆ 𝒗, 𝒗′ = 𝒇(𝒗, 𝒗′) ς 𝒌=𝟏
𝒎𝒊𝒏(𝒏𝒄 𝒗 ,𝒏𝒄(𝒗′))
(𝟏 + ∆(𝒄𝒉 𝒗, 𝒌 , 𝒄𝒉(𝒗′, 𝒌)));
𝑣1
𝑣2 .
.
𝒗 𝟑
𝑣4 …… 𝑣5
Sub-tree
Jing Ma (CUHK)
Context-Sensitive Extension of PTK
2016/7/1
17
■ Consider propagation paths from root node to the subtree
■ Why cPTK?
■ PTK ignores the clues outside the subtrees and the route embed how the
propagation happens.
■ Similar intuition to context-sensitive tree kernel (Zhou et al., 2007)
■ Kernel Function: σ 𝑣 𝑖∈𝑉1
σ 𝑥=0
𝐿 𝑣 𝑖
𝑟1−1
∆ 𝑥(𝑣𝑖, 𝑣𝑖
′
) + σ 𝑣 𝑗∈𝑉2
σ 𝑥=0
𝐿 𝑣 𝑗
𝑟2−1
∆ 𝑥(𝑣𝑗
′
, 𝑣𝑗)
■ 𝑳 𝒗
𝒓
: the length of propagation path from root 𝒓 to 𝒗.
■ 𝒗 𝒙 : the x-th ancestor of 𝒗.
■ ∆ 𝑥(𝑣, 𝑣′
): similarity of subtrees rooted at 𝑣[𝑥] and 𝑣′
𝑥 .
■ Kernel Algorithm
1) if 𝒗[𝒙] and 𝒗′
[𝒙] are the x-th ancestor nodes of 𝒗 and 𝒗′
,then
∆ 𝒙 𝒗, 𝒗′
= 𝒇(𝒗[𝒙], 𝒗′
[𝒙]);
2) else
∆ 𝒙 𝒗, 𝒗′ = ∆ 𝒗, 𝒗′ ;) i.e., PTK)
𝒗 𝟏
𝒗 𝟐 .
.
𝒗 𝟑
𝑣4 …… 𝑣5
Context path
Subtree root
Jing Ma (CUHK)
Rumor Detection via Kernel Learning
2016/7/1
18
• Incorporate the proposed tree kernel functions (i.e., PTK or
cPTK) into a supervised learning framework, for which we
utilize a kernel-based SVM classifier.
• Avoid feature engineering – the kernel function can explore
an implicit feature space when calculating the similarity
between two objects.
• For multi-class task, perform One vs. all, i.e., building K (#
of classes) basic binary classifiers so as to separate one class
from all the others.
Jing Ma(CUHK)
Outline
2016/7/1
19
■Introduction
■Related Work
■Tweets Propagation
■Kernel Modeling
■Evaluation
■Conclusion and Future Work
Jing Ma (CUHK)
Data Collection
■Construct our propagation tree datasets based on two
reference Twitter datasets:
■ Twitter15 (Liu et al, 2015)
■ Twitter16 (Ma et al, 2016)
2016/7/1
20
Extract popular
source tweets
Collect
propagation
threads
revised labels
( retweets: Twrench.com )
( replies: Web crawler )
Convert event label:
binary -> quarternary
(Source tweet: highly retweeted or replied)
Jing Ma (CUHK)
Statistics of Data Collection
2016/7/1
21
URL of the datasets:
https://www.dropbox.com/s/0jhsfwep3ywvpca/rumdetect2017.zip?dl=0
Jing Ma (CUHK)
Approaches to compare with
■ DTR: Decision tree-based ranking model using enquiry phrases
to identify trending rumors (Zhao et al., 2015)
■ DTC and SVM-RBF: Twitter information credibility model
using Decision Tree Classifier (Castillo et al., 2011); SVM-based
model with RBF kernel (Yang et al., 2012)
■ RFC: Random Forest Classifier using three parameters to fit the
temporal tweets volume curve (Kwon et al., 2013)
■ SVM-TS: Linear SVM classifier using time-series structures to
model the variation of social context features. (Ma et al., 2015)
■ GRU: The RNN-based rumor detection model. (Ma et al., 2016)
■ BOW: linear SVM classifier using bag-of-words.
■ Ours (PTK and cPTK): Our kernel based model
■ PTK- and cPTK-: Our kernel based model with subset node
features.
2016/7/1
22
Jing Ma (CUHK)
Results on Twitter15
2016/7/1
23
Method Accu.
NR FR TR UR
F1 F1 F1 F1
DTR 0.409 0.501 0.311 0.364 0.473
SVM-RBF 0.318 0.455 0.037 0.218 0.225
DTC 0.454 0.733 0.355 0.317 0.415
SVM-TS 0.544 0.796 0.472 0.404 0.483
RFC 0.565 0.810 0.422 0.401 0.543
GRU 0.646 0.792 0.574 0.608 0.592
BOW 0.548 0.564 0.524 0.582 0.512
PTK- 0.657 0.734 0.624 0.673 0.612
cPTK- 0.697 0.760 0.645 0.696 0.689
PTK 0.710 0.825 0.685 0.688 0.647
cPTK 0.750 0.804 0.698 0.765 0.733
NR: Non-Rumor; FR: False Rumor;
TR: True Rumor; UR: Unverified Rumor;
Jing Ma (CUHK)
Results on Twitter16
2016/7/1
24
Method Accu.
NR FR TR UR
F1 F1 F1 F1
DTR 0.414 0.394 0.273 0.630 0.344
SVM-RBF 0.321 0.423 0.085 0.419 0.037
DTC 0.465 0.643 0.393 0.419 0.403
SVM-TS 0.574 0.755 0.420 0.571 0.526
RFC 0.585 0.752 0.415 0.547 0.563
GRU 0.633 0.772 0.489 0.686 0.593
BOW 0.585 0.553 0.556 0.655 0.578
PTK- 0.653 0.673 0.640 0.722 0.567
cPTK- 0.702 0.711 0.664 0.816 0.608
PTK 0.722 0.784 0.690 0.786 0.644
cPTK 0.732 0.740 0.709 0.836 0.686
NR: Non-Rumor; FR: False Rumor;
TR: True Rumor; UR: Unverified Rumor;
Jing Ma (CUHK)
Results on Early Detection
❑ In the first few hours, the accuracy of the kernel-based methods climbs more
rapidly and stabilize more quickly
❑ cPTK can detect rumors with 72% accuracy for Twitter15 and 69.0% for
Twitter16 within 12 hours, which is much earlier than the baselines and the
mean official report times
2016/7/1
25
(a) Twitter15 DATASET (b) Twitter16 DATASET
Jing Ma (CUHK)
Early Detection Example
2016/7/1
26
Example subtree of a rumor captured by the algorithm at early stage of propagation
Influential users boost its propagation, unpopular-to-popular information flow,
Textual signals (underlined)
Jing Ma(CUHK)
Outline
2016/7/1
27
■Introduction
■Related Work
■Tweets Propagation
■Kernel Modeling
■Evaluation
■Conclusion and Future Work
Jing Ma (CUHK)
Conclusion and future work
2016/7/1
28
■ Apply kernel learning method for rumor debunking by utilizing the
propagation tree structures.
■ Propagation tree encodes the spread of a source tweet with complex
structured patterns and flat information regarding content, user and
time associated with the tree nodes.
■ Our kernel are combined under supervised framework for identifying
rumors of finer-grained levels by directly measuring the similarity
among propagation trees.
■ Future work:
❑ Explore network representation method to improve the rumor
detection task.
❑ Develop unsupervised models due to massive unlabeled data
from social media.
Jing Ma (CUHK) 2016/7/1
29
Q & A

More Related Content

What's hot

Hybrid neural networks for time series learning by Tian Guo, EPFL, Switzerland
Hybrid neural networks for time series learning by Tian Guo,  EPFL, SwitzerlandHybrid neural networks for time series learning by Tian Guo,  EPFL, Switzerland
Hybrid neural networks for time series learning by Tian Guo, EPFL, Switzerland
EuroIoTa
 
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams   Esteban DonatoEvaluating Classification Algorithms Applied To Data Streams   Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
Esteban Donato
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Simplilearn
 
The impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classificationThe impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classification
Universitat Politècnica de Catalunya
 

What's hot (20)

Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
 
Hybrid neural networks for time series learning by Tian Guo, EPFL, Switzerland
Hybrid neural networks for time series learning by Tian Guo,  EPFL, SwitzerlandHybrid neural networks for time series learning by Tian Guo,  EPFL, Switzerland
Hybrid neural networks for time series learning by Tian Guo, EPFL, Switzerland
 
Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network  Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network
 
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams   Esteban DonatoEvaluating Classification Algorithms Applied To Data Streams   Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
 
Deep Learning: Recurrent Neural Network (Chapter 10)
Deep Learning: Recurrent Neural Network (Chapter 10) Deep Learning: Recurrent Neural Network (Chapter 10)
Deep Learning: Recurrent Neural Network (Chapter 10)
 
LSTM Tutorial
LSTM TutorialLSTM Tutorial
LSTM Tutorial
 
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
 
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
 
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
 
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
 
My invited talk at the 23rd International Symposium of Mathematical Programmi...
My invited talk at the 23rd International Symposium of Mathematical Programmi...My invited talk at the 23rd International Symposium of Mathematical Programmi...
My invited talk at the 23rd International Symposium of Mathematical Programmi...
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
 
RNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataRNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential Data
 
Mahoney mlconf-nov13
Mahoney mlconf-nov13Mahoney mlconf-nov13
Mahoney mlconf-nov13
 
[251] implementing deep learning using cu dnn
[251] implementing deep learning using cu dnn[251] implementing deep learning using cu dnn
[251] implementing deep learning using cu dnn
 
Reducing the dimensionality of data with neural networks
Reducing the dimensionality of data with neural networksReducing the dimensionality of data with neural networks
Reducing the dimensionality of data with neural networks
 
PFP:材料探索のための汎用Neural Network Potential - 2021/10/4 QCMSR + DLAP共催
PFP:材料探索のための汎用Neural Network Potential - 2021/10/4 QCMSR + DLAP共催PFP:材料探索のための汎用Neural Network Potential - 2021/10/4 QCMSR + DLAP共催
PFP:材料探索のための汎用Neural Network Potential - 2021/10/4 QCMSR + DLAP共催
 
Long Short Term Memory
Long Short Term MemoryLong Short Term Memory
Long Short Term Memory
 
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
 
The impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classificationThe impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classification
 

Similar to Jing Ma - 2017 - Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning

Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencl
eSAT Publishing House
 
Kernal based speaker specific feature extraction and its applications in iTau...
Kernal based speaker specific feature extraction and its applications in iTau...Kernal based speaker specific feature extraction and its applications in iTau...
Kernal based speaker specific feature extraction and its applications in iTau...
TELKOMNIKA JOURNAL
 
Predictive Metabonomics
Predictive MetabonomicsPredictive Metabonomics
Predictive Metabonomics
Marilyn Arceo
 

Similar to Jing Ma - 2017 - Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning (20)

Parallel knn on gpu architecture using opencl
Parallel knn on gpu architecture using openclParallel knn on gpu architecture using opencl
Parallel knn on gpu architecture using opencl
 
Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencl
 
Deep learning with Keras
Deep learning with KerasDeep learning with Keras
Deep learning with Keras
 
Kernal based speaker specific feature extraction and its applications in iTau...
Kernal based speaker specific feature extraction and its applications in iTau...Kernal based speaker specific feature extraction and its applications in iTau...
Kernal based speaker specific feature extraction and its applications in iTau...
 
Deep learning Tutorial - Part II
Deep learning Tutorial - Part IIDeep learning Tutorial - Part II
Deep learning Tutorial - Part II
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applications
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classification
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
 
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
 
[IRTalks@The University of Glasgow] A Topology-aware Analysis of Graph Collab...
[IRTalks@The University of Glasgow] A Topology-aware Analysis of Graph Collab...[IRTalks@The University of Glasgow] A Topology-aware Analysis of Graph Collab...
[IRTalks@The University of Glasgow] A Topology-aware Analysis of Graph Collab...
 
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
 
A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...
A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...
A Connectionist Approach to Dynamic Resource Management for Virtualised Netwo...
 
Predictive Metabonomics
Predictive MetabonomicsPredictive Metabonomics
Predictive Metabonomics
 
Artificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IArtificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part I
 
Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016
 
Kdd'20 presentation 223
Kdd'20 presentation 223Kdd'20 presentation 223
Kdd'20 presentation 223
 
Deep learning with Keras
Deep learning with KerasDeep learning with Keras
Deep learning with Keras
 
DL for molecules
DL for moleculesDL for molecules
DL for molecules
 
Conformer review
Conformer reviewConformer review
Conformer review
 
Cassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage SystemCassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage System
 

More from Association for Computational Linguistics

More from Association for Computational Linguistics (20)

Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text
Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal TextMuis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text
Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text
 
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
 
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour AnalysisCastro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
 
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text SimplificationElior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 

Recently uploaded

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
MateoGardella
 

Recently uploaded (20)

Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 

Jing Ma - 2017 - Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning

  • 1. Jing Ma (CUHK) 2016/7/1 1 Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning Jing Ma1, Wei Gao2*, Kam-Fai Wong1,3 1The Chinese University of Hong Kong 2Victoria University of Wellington, New Zealand 3MoE Key Laboratory of High Confidence Software Technologies, China July 31- August 2, 2017 – ACL 2017 @ Vancouver, Canada *Work done when Wei Gao was in QCRI
  • 2. Jing Ma (CUHK) ■Introduction ■Related Work ■Tweets Propagation ■Kernel Modeling ■Evaluation ■Conclusion and Future Work Outline 2016/7/1 2
  • 3. Jing Ma (CUHK) Introduction 2016/7/1 3 A story or statement whose truth value is unverified or deliberately false
  • 4. Jing Ma (CUHK) The fake news went viral Introduction 2016/7/1 4 ‘*’ indicates the level of influence. Start from a grass-roots users, promoted by some influential accounts, widely spread
  • 5. Jing Ma (CUHK) Motivation ■We generally are not good at distinguishing rumors ■It is crucial to track and debunk rumors early to minimize their harmful effects. ■Online fact-checking services have limited topical coverage and long delay. ■Existing models use feature engineering – over simplistic; or recently deep neural networks – ignore propagation structures. 2016/7/1 5
  • 6. Jing Ma (CUHK) Contributions ■Represent information spread on Twitter with propagation tree, formed by harvesting user’s interactions, to capture high-order propagation patterns of rumors. ■Propose a kernel-based data-driven method to generate relevant features automatically for estimating the similarity between two propagation tees. ■Enhance the proposed model by considering propagation paths from source tweet to subtrees to capture the context of transmission. ■Release two real-world twitter datasets with finer-grained ground truth labels. 2016/7/1 6
  • 7. Jing Ma(CUHK) Outline 2016/7/1 7 ■Introduction ■Related Work ■Tweets Propagation ■Kernel Modeling ■Evaluation ■Conclusion and Future Work
  • 8. Jing Ma (CUHK) Related Work ■Systems based on common sense and investigative journalism, e.g., ■ snopes.com ■ factcheck.org ■Learning-based models for rumor detection ■ Information credibility: Castillo et al. (2011), Yang et al. (2012) ■ Using handcrafted and temporal features: Liu et al. (2015), Ma et al. (2015), Kwon et al. (2013, 2017) ■ Using cue terms: Zhao et al. (2015) ■ Using recurrent neural networks: Ma et al. (2016) ■Kernel-based works ■ Tree kernel: syntactic parsing (Collins and Duffy, 2001) ■ Question-answering (Moschitti, 2006) ■ Semantic analysis (Moschitti, 2004) ■ Relation extraction (Zhang et al., 2008) ■ Machine translation (Sun et al., 2010) 2016/7/1 8
  • 9. Jing Ma(CUHK) Outline 2016/7/1 9 ■Introduction ■Related Work ■Tweets Propagation ■Kernel Modeling ■Evaluation ■Conclusion and Future Work
  • 10. Jing Ma (CUHK) Problem Statement 2016/7/1 10 ■Given a set of microblog posts R = {𝑟}, model each source tweet as a tree structure T 𝑟 = < 𝑉, 𝐸 >, where each node 𝑣 = (𝑢 𝑣, 𝑐 𝑣, 𝑡 𝑣) provide the creator of the post, the text content and post time. And 𝐸 is directed edges corresponding to response relation. ■Task 1 – finer-grained classification for each source post false rumor, true rumor, non-rumor, unverified rumor ■Task 2 – detect rumor as early as possible
  • 11. Jing Ma (CUHK) Propagation Structure 2016/7/111 Text Features Propagation & Temporal Features User Features Propagatio n tree O: Walmart donates $10,000 to support Darren Wilson and the on going racist police murders U1: You don't honestly believe that, do you? U2: i honestly do U5: where is the credible link? U3: …Sam Walton gave 300k to Obama's campaign? THINK. U4: Sam Walton was dead before #Obama was born. He have wired campaign donation from heavens. U6: Need proof of this-can't find any... U7: not sure....sorry I see a meme trending but no proof...perhaps if we had real journalists? U8: I'm pretty good at research-I think this is not true-plenty of other reasons to boycott WalMart. :) …… ……
  • 12. Jing Ma (CUHK) Observation & Hypothesis 2016/7/1 12 ■Content-based signals (e.g., stance) (Zhao et al., 2015) ■Network-based signals (e.g., relative influence) and temporal traits (Kwon et al., 2017) ■Our hypothesis: high-order patterns needs to/could be captured using kernel method (a) In rumors (b) In non-rumors Doubt Support Neutral Influence/Popularity
  • 13. Jing Ma(CUHK) Outline 2016/7/1 13 ■Introduction ■Related Work ■Tweets Propagation ■Kernel Modeling ■Evaluation ■Conclusion and Future Work
  • 14. Jing Ma (CUHK) Traditional Tree Kernel (TK) 2016/7/1 14 ■TK compute the syntactic similarity between two sentences by counting the common subtrees ■Kernel Function:σ 𝑣 𝑖∈𝑉1 σ 𝑣 𝑗∈𝑉2 ∆(𝑣𝑖, 𝑣𝑗) ■ ∆ 𝑣𝑖, 𝑣𝑗 : common subtrees rooted at 𝑣𝑖 and 𝑣𝑗
  • 15. Jing Ma (CUHK) Propagation Tree Kernel (PTK) 2016/7/1 15 ■Why PTK? ■ Existing tree kernel cannot apply here, since in our case (1) node is a vector of continuous numerical values; (2) similarity needs to be softly defined between two trees instead of hardly counting on identical nodes ■Similarity Definition ■ User Similarity: ℰ(𝑢𝑖, 𝑢𝑗)=||𝑢𝑖 − 𝑢𝑗|| ■ Content Similarity: 𝒥(𝑐𝑖, 𝑐𝑗)= |𝑁𝑔𝑟𝑎𝑚(𝑐 𝑖)∩𝑁𝑔𝑟𝑎𝑚(𝑐 𝑗)| |𝑁𝑔𝑟𝑎𝑚(𝑐 𝑖)∪𝑁𝑔𝑟𝑎𝑚(𝑐 𝑗)| ■ Node Similarity: 𝑓 𝑣𝑖, 𝑣𝑗 = 𝑒−𝑡(𝛼ℰ(𝑢𝑖, 𝑢𝑗)+(1−𝛼)𝒥(𝑐𝑖, 𝑐𝑗) )
  • 16. Jing Ma (CUHK) Propagation Tree Kernel 2016/7/1 16 ■ Given two trees 𝑇1 =< 𝑉1, 𝐸1 > and 𝑇2 =< 𝑉2, 𝐸2 > ,PTK compute similarity between them by enumerating all similar subtrees. ■ Kernel Function: σ 𝑣 𝑖∈𝑉1 ∆ 𝑣𝑖, 𝑣𝑖 ′ + σ 𝑣 𝑗∈𝑉2 ∆ 𝑣𝑗 ′ , 𝑣𝑗 ■ 𝑣𝑖 and 𝑣𝑖 ′ are similar node pairs from 𝑉1 and 𝑉2 respectively 𝑣𝑖 ′ = arg max 𝑣 𝑗∈𝑉2 𝑓(𝑣𝑖, 𝑣𝑗) ■ ∆ 𝑣, 𝑣′ : similarity of two subtrees rooted at 𝑣 and 𝑣′ ■ Kernel algorithm 1) if 𝒗 or 𝒗′ are leaf nodes, then ∆ 𝒗, 𝒗′ = 𝒇(𝒗, 𝒗′); 2) else ∆ 𝒗, 𝒗′ = 𝒇(𝒗, 𝒗′) ς 𝒌=𝟏 𝒎𝒊𝒏(𝒏𝒄 𝒗 ,𝒏𝒄(𝒗′)) (𝟏 + ∆(𝒄𝒉 𝒗, 𝒌 , 𝒄𝒉(𝒗′, 𝒌))); 𝑣1 𝑣2 . . 𝒗 𝟑 𝑣4 …… 𝑣5 Sub-tree
  • 17. Jing Ma (CUHK) Context-Sensitive Extension of PTK 2016/7/1 17 ■ Consider propagation paths from root node to the subtree ■ Why cPTK? ■ PTK ignores the clues outside the subtrees and the route embed how the propagation happens. ■ Similar intuition to context-sensitive tree kernel (Zhou et al., 2007) ■ Kernel Function: σ 𝑣 𝑖∈𝑉1 σ 𝑥=0 𝐿 𝑣 𝑖 𝑟1−1 ∆ 𝑥(𝑣𝑖, 𝑣𝑖 ′ ) + σ 𝑣 𝑗∈𝑉2 σ 𝑥=0 𝐿 𝑣 𝑗 𝑟2−1 ∆ 𝑥(𝑣𝑗 ′ , 𝑣𝑗) ■ 𝑳 𝒗 𝒓 : the length of propagation path from root 𝒓 to 𝒗. ■ 𝒗 𝒙 : the x-th ancestor of 𝒗. ■ ∆ 𝑥(𝑣, 𝑣′ ): similarity of subtrees rooted at 𝑣[𝑥] and 𝑣′ 𝑥 . ■ Kernel Algorithm 1) if 𝒗[𝒙] and 𝒗′ [𝒙] are the x-th ancestor nodes of 𝒗 and 𝒗′ ,then ∆ 𝒙 𝒗, 𝒗′ = 𝒇(𝒗[𝒙], 𝒗′ [𝒙]); 2) else ∆ 𝒙 𝒗, 𝒗′ = ∆ 𝒗, 𝒗′ ;) i.e., PTK) 𝒗 𝟏 𝒗 𝟐 . . 𝒗 𝟑 𝑣4 …… 𝑣5 Context path Subtree root
  • 18. Jing Ma (CUHK) Rumor Detection via Kernel Learning 2016/7/1 18 • Incorporate the proposed tree kernel functions (i.e., PTK or cPTK) into a supervised learning framework, for which we utilize a kernel-based SVM classifier. • Avoid feature engineering – the kernel function can explore an implicit feature space when calculating the similarity between two objects. • For multi-class task, perform One vs. all, i.e., building K (# of classes) basic binary classifiers so as to separate one class from all the others.
  • 19. Jing Ma(CUHK) Outline 2016/7/1 19 ■Introduction ■Related Work ■Tweets Propagation ■Kernel Modeling ■Evaluation ■Conclusion and Future Work
  • 20. Jing Ma (CUHK) Data Collection ■Construct our propagation tree datasets based on two reference Twitter datasets: ■ Twitter15 (Liu et al, 2015) ■ Twitter16 (Ma et al, 2016) 2016/7/1 20 Extract popular source tweets Collect propagation threads revised labels ( retweets: Twrench.com ) ( replies: Web crawler ) Convert event label: binary -> quarternary (Source tweet: highly retweeted or replied)
  • 21. Jing Ma (CUHK) Statistics of Data Collection 2016/7/1 21 URL of the datasets: https://www.dropbox.com/s/0jhsfwep3ywvpca/rumdetect2017.zip?dl=0
  • 22. Jing Ma (CUHK) Approaches to compare with ■ DTR: Decision tree-based ranking model using enquiry phrases to identify trending rumors (Zhao et al., 2015) ■ DTC and SVM-RBF: Twitter information credibility model using Decision Tree Classifier (Castillo et al., 2011); SVM-based model with RBF kernel (Yang et al., 2012) ■ RFC: Random Forest Classifier using three parameters to fit the temporal tweets volume curve (Kwon et al., 2013) ■ SVM-TS: Linear SVM classifier using time-series structures to model the variation of social context features. (Ma et al., 2015) ■ GRU: The RNN-based rumor detection model. (Ma et al., 2016) ■ BOW: linear SVM classifier using bag-of-words. ■ Ours (PTK and cPTK): Our kernel based model ■ PTK- and cPTK-: Our kernel based model with subset node features. 2016/7/1 22
  • 23. Jing Ma (CUHK) Results on Twitter15 2016/7/1 23 Method Accu. NR FR TR UR F1 F1 F1 F1 DTR 0.409 0.501 0.311 0.364 0.473 SVM-RBF 0.318 0.455 0.037 0.218 0.225 DTC 0.454 0.733 0.355 0.317 0.415 SVM-TS 0.544 0.796 0.472 0.404 0.483 RFC 0.565 0.810 0.422 0.401 0.543 GRU 0.646 0.792 0.574 0.608 0.592 BOW 0.548 0.564 0.524 0.582 0.512 PTK- 0.657 0.734 0.624 0.673 0.612 cPTK- 0.697 0.760 0.645 0.696 0.689 PTK 0.710 0.825 0.685 0.688 0.647 cPTK 0.750 0.804 0.698 0.765 0.733 NR: Non-Rumor; FR: False Rumor; TR: True Rumor; UR: Unverified Rumor;
  • 24. Jing Ma (CUHK) Results on Twitter16 2016/7/1 24 Method Accu. NR FR TR UR F1 F1 F1 F1 DTR 0.414 0.394 0.273 0.630 0.344 SVM-RBF 0.321 0.423 0.085 0.419 0.037 DTC 0.465 0.643 0.393 0.419 0.403 SVM-TS 0.574 0.755 0.420 0.571 0.526 RFC 0.585 0.752 0.415 0.547 0.563 GRU 0.633 0.772 0.489 0.686 0.593 BOW 0.585 0.553 0.556 0.655 0.578 PTK- 0.653 0.673 0.640 0.722 0.567 cPTK- 0.702 0.711 0.664 0.816 0.608 PTK 0.722 0.784 0.690 0.786 0.644 cPTK 0.732 0.740 0.709 0.836 0.686 NR: Non-Rumor; FR: False Rumor; TR: True Rumor; UR: Unverified Rumor;
  • 25. Jing Ma (CUHK) Results on Early Detection ❑ In the first few hours, the accuracy of the kernel-based methods climbs more rapidly and stabilize more quickly ❑ cPTK can detect rumors with 72% accuracy for Twitter15 and 69.0% for Twitter16 within 12 hours, which is much earlier than the baselines and the mean official report times 2016/7/1 25 (a) Twitter15 DATASET (b) Twitter16 DATASET
  • 26. Jing Ma (CUHK) Early Detection Example 2016/7/1 26 Example subtree of a rumor captured by the algorithm at early stage of propagation Influential users boost its propagation, unpopular-to-popular information flow, Textual signals (underlined)
  • 27. Jing Ma(CUHK) Outline 2016/7/1 27 ■Introduction ■Related Work ■Tweets Propagation ■Kernel Modeling ■Evaluation ■Conclusion and Future Work
  • 28. Jing Ma (CUHK) Conclusion and future work 2016/7/1 28 ■ Apply kernel learning method for rumor debunking by utilizing the propagation tree structures. ■ Propagation tree encodes the spread of a source tweet with complex structured patterns and flat information regarding content, user and time associated with the tree nodes. ■ Our kernel are combined under supervised framework for identifying rumors of finer-grained levels by directly measuring the similarity among propagation trees. ■ Future work: ❑ Explore network representation method to improve the rumor detection task. ❑ Develop unsupervised models due to massive unlabeled data from social media.
  • 29. Jing Ma (CUHK) 2016/7/1 29 Q & A