20 February 2020
Detecting and tracking misinformation in the Internet age
FACT CHECKING APPROACH: UNCOVERING THE TRUTH
Detecting and tracking fake news and misinformation at scale
• An excellent approach and something to be deployed with vigour in any
situation where it can usefully be applied
but ...
• Problem #1 It rarely happens and when it does, it’s often an accident
• Problem #2 It takes a lot of effort for humans to do it
• Problem #3 It’s impossible (more or less) for computers
Solution #1 Do it directly with machines anyway!
Solution #2 Try to spot it indirectly with correlating proxies
Solution #2 Other correlating proxies are available ... ‘canary accounts’
Solution #3 Use machines to increase human productivity
Case Study: Fact checking your article
Read article → Identify claims → Collect evidence → Rank evidence → Output
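The five stages named on this slide can be sketched in a few lines of Python. This is only an illustrative toy, not the system shown in the talk: the function names, the "sentence contains a digit" claim heuristic and the word-overlap ranking are all assumptions chosen to keep the example self-contained.

```python
import re
from dataclasses import dataclass


@dataclass
class RankedEvidence:
    claim: str
    evidence: str
    score: float


def identify_claims(article: str) -> list[str]:
    """Toy heuristic (assumption): a sentence containing a digit is a checkable claim."""
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return [s for s in sentences if re.search(r"\d", s)]


def tokens(text: str) -> set[str]:
    """Lower-cased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_evidence(claim: str, corpus: list[str]) -> list[RankedEvidence]:
    """Collect documents sharing words with the claim and rank them by Jaccard overlap."""
    claim_tokens = tokens(claim)
    ranked = []
    for doc in corpus:
        doc_tokens = tokens(doc)
        union = claim_tokens | doc_tokens
        score = len(claim_tokens & doc_tokens) / len(union) if union else 0.0
        if score > 0:
            ranked.append(RankedEvidence(claim, doc, score))
    return sorted(ranked, key=lambda r: r.score, reverse=True)


def fact_check(article: str, corpus: list[str]) -> dict[str, list[RankedEvidence]]:
    """Read article -> identify claims -> collect and rank evidence -> output."""
    return {claim: rank_evidence(claim, corpus) for claim in identify_claims(article)}
```

The output is a ranked evidence list per claim rather than a true/false verdict, in keeping with Solution #3: the machine narrows down what a human checker has to read.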
• Problem #4 It’s always something new
• Problem #5 True believers don’t care
• Problem #6 It’s just not good politics
CONFLICT APPROACH: UNCOVERING THE CONTEST
• Characterise the conflict
• Identify the activities
Case Study: Disrupting Daesh – Golden Age
• 2014-2015 Golden Age on Twitter for Islamic State
• Thriving online community (50,000 – 70,000 active accounts)
• Very easy access to contact and content
• Obvious markers of support (avatars, screen names, hashtags)
• Strong and supportive ideological community and sub-communities (e.g. Chechens, ‘Sisters’)
Case Study: Disrupting Daesh – late 2015 disruption begins
• From mid 2015 - community disruption begins
o Account suspensions and takedowns
o Disruption of hashtags
• Reactions:
o Flight to Telegram
o May have strengthened community cohesion
• Late 2016: what was left?
o Impact on online Twitter community?
o Activities on Twitter?
Method52: A platform for agile modular investigation
Method52 allows users to 'fail fast' and iterate to find patterns of use
• Grounded theory (Glaser et al., 1968)
• "Unbiased examination of the available data"
• Iterative exploration of what fits
[Figure labels: Scheme 1, Scheme 2, Scheme 3]
Case Study: Disrupting Daesh. Build bespoke pipelines that are
adapted to the specific scenario
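[Diagram: system architecture. Labelled components: Data Store (social media data); Construction, maintenance & analysis tools; Disruption Monitoring System; Visualisation & Evaluation; Daesh propaganda analysis system; Visualisation Engine; Pipeline Construction Engine; seed accounts. Numbered markers 1 to 5 and lettered markers A and B appear on the diagram.]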
Case Study: Disrupting Daesh
[Diagram: account discovery loop with steps (i) to (vi). Labelled components: seed accounts; seed search terms; tweets; Assess relevancy; Score & analyse confirmed accounts; Analyse links in flagged tweets; Identify new terms; Data Store (account details, tweet details, link details).]
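One way to read the loop on this slide is as a snowball search: start from seed terms, flag matching tweets, harvest new terms from the flagged tweets, and repeat. The sketch below makes that reading concrete. It is not Method52 code: the ordering of the steps, the `snowball` function, the use of hashtags as the only source of new terms and the `min_count` threshold are all assumptions, and the relevancy check here is a plain term match where the real workflow would involve trained classifiers and analyst review.

```python
from collections import Counter


def snowball(tweets, seed_terms, rounds=2, min_count=2):
    """Iteratively expand search terms from flagged tweets.

    tweets: list of (account, text) pairs.
    Returns (terms, flagged_accounts).
    """
    terms = {t.lower() for t in seed_terms}
    flagged = set()
    for _ in range(rounds):
        # Assess relevancy (toy version): a tweet is flagged if it contains any known term.
        relevant = []
        for account, text in tweets:
            words = set(text.lower().split())
            if terms & words:
                relevant.append((account, words))
        flagged = {account for account, _ in relevant}

        # Identify new terms: hashtags that recur across flagged tweets.
        counts = Counter(
            w for _, words in relevant for w in words
            if w.startswith("#") and w not in terms
        )
        new_terms = {w for w, c in counts.items() if c >= min_count}
        if not new_terms:
            break  # nothing new to search for, so stop early
        terms |= new_terms
    return terms, flagged
```

Requiring a new hashtag to appear in at least `min_count` flagged tweets is a crude guard against drift; without some such check a snowball search quickly wanders into unrelated content.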
Case Study: Disrupting Daesh - strategies for identifying candidate accounts
• Content of tweets
o Generic words (qa'idin, bay'ah, nifaq, mushrik)
o Current topics (tabqa, Suwaydiya, Abu Ali al-Turki)
o Presence of generic comms links (Telegram, YouTube etc.)
o Specific known links (images, YouTube, other videos)
o Specific known hashtags (#tabqa)
o Mentions of specific 'canary' accounts (@39_nas)
• Network analysis
o Build out and understand the network. Possible typology: 'source', 'canary', 'news gathering', 'signpost' and 'protected chat' accounts
o Followers of known 'source' accounts (p_vanostaeyen)
o Followers of known 'canary' accounts (whoamidude)
o Followed by or followers of network members (protected chat network, 'news gathering' accounts, 'signpost' accounts)
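The content and network signals listed above could be combined into a single candidate score along the following lines. This is a hedged sketch rather than the scoring used in the study: the `Account` class, the `WEIGHTS` values and the `example_canary` handle are hypothetical, and the terms and hashtag are simply the examples quoted on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    handle: str
    tweets: list[str] = field(default_factory=list)
    follows: set[str] = field(default_factory=set)


# Hypothetical weights; in practice these would be tuned against labelled accounts.
WEIGHTS = {"term": 1.0, "hashtag": 2.0, "canary_mention": 3.0, "follows_canary": 2.0}


def score_account(account, terms, hashtags, canaries):
    """Sum weighted content and network signals for one account.

    terms, hashtags and canaries are sets of lower-case strings;
    canaries are handles without the leading '@'.
    """
    words = set(" ".join(account.tweets).lower().split())
    score = 0.0
    score += WEIGHTS["term"] * len(words & terms)          # generic words / current topics
    score += WEIGHTS["hashtag"] * len(words & hashtags)    # specific known hashtags
    score += WEIGHTS["canary_mention"] * sum(1 for c in canaries if "@" + c in words)
    score += WEIGHTS["follows_canary"] * len(account.follows & canaries)
    return score


candidate = Account(
    handle="u1",
    tweets=["pledged bay'ah #tabqa", "hello @example_canary"],
    follows={"example_canary", "other"},
)
score = score_account(candidate, {"bay'ah", "nifaq"}, {"#tabqa"}, {"example_canary"})
```

A score like this only ranks candidates for review; it does not confirm anything about an account on its own.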
Case Study: Disrupting Daesh - account suspension rate
               Tweets   Followers   Friends
IS                 51          14        33
Other Jihadi      320         189       122
Case Study: Disrupting Daesh – seeding propaganda
[Timeline figure: hourly axis from 11:00 to 07:00 the following day, with one running count per hour: 0, 3, 3, 4, 31, 41, 61, 81, 121, 137, 525, 632, 697, 766, 776, 777, 790, 794, 798, 835, 842.
Account labels: DocPakistan, KronaThe, omar_367, lhfg08, nycijm, 693mstafa, el_cvk, AllyOfTruth, JulUllil, el_uhj, el_bhv (x2), el_gyf. Truncated labels: y8m..., qfg..., 7wv..., wmy..., onk..., skdj..., njnj...
Destination labels: justpaste.it, sendvid.com, archive.org, youtube.com, youtu.be, Google Photos, Google Drive, yadi.sk.
Language key: English, Somali, Arabic.]
Case Study: Disrupting Daesh - account suspension rate
Case Study: Disrupting Daesh – URLs used as destinations
[Chart: percentage axis (0% to 100%) and URLs per day (mean) for three periods: 4 Feb – 8 Feb, 4 Mar – 8 Mar*, 4 Apr – 8 Apr.
Legend: justpaste.it, IS’s own server, archive.org, sendvid.com, YouTube, Google Drive, thevid.net, 4shared.com, Others (26 domains). Further labels: vimple.co, store6.up, pc.cd, cloud.mail.ru, addpost.it, vid.me.]
* Excludes 7 Mar, which had 240 URLs (Rumiyah release)
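A breakdown of destination URLs by domain, of the kind this slide charts, can be computed with the Python standard library. The sketch below is an assumption about how such a tally might be made, not the study's code; `domain_shares` is a hypothetical helper and the example URL paths are made up.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_shares(urls):
    """Return each destination domain's share of the URLs, largest first."""
    domains = []
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.org and example.org as one domain
        if host:
            domains.append(host)
    counts = Counter(domains)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: n / total for domain, n in counts.most_common()}


shares = domain_shares([
    "https://justpaste.it/abc",
    "https://www.justpaste.it/def",
    "https://archive.org/x",
    "https://youtu.be/y",
])
```

Shortened links (for example `t.co`) would need to be resolved to their final destination before counting, otherwise the shortener itself dominates the tally.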
Case Study: Disrupting Daesh – intercepting the propaganda
Note: All accounts tracked were created before 0600Z on Tuesday 4 April. Data set created at 0600Z.
Emerging general methodology: the first iteration
[Diagram: processing flow. Labelled stages: Inbound Data*; Assess relevancy; Sites and accounts; Analyse message; Search terms; Identify accounts; Identify narrative; Cluster narratives; Identify attributes; Identify networks.]
* Print media, websites, forums, social media
CONFLICT APPROACH: UNCOVERING THE CONTEST
Characterising conflict: The concept of ‘Information Operations’
• Information operations are vast in scale and numerous in strategies and tactics
• A focus on ‘fake news’ or ‘misinformation’ is myopic
• Most of the information is not ‘fake’; it is the selective amplification of reputable stories
• Information operations are characterised by erratic bursts of activity
• Information operations exploit cultural and social division
• Although information operations are coordinated, they are inconsistent, presenting a
challenge to third-party identification of inauthentic accounts.
Case Study: Internet Research Agency operations in the UK
Phase 1: Spam and the process of building credible accounts
I'm ready to eat healthy and workout.
@xhibellamy @William_Stokes @guru_paul
@ThomasAmor1 @jennyc08318 @richtweten
http://t.co/TAZ9Co1QF9
.@pedrareyes148 pedra @Chloe0354
ASDFGchloeHJKLL? @pulmonxry Yeezus
@Nick281051 Nick @puffylore163 lore
http://t.co/ZLpIlrsV33
Case Study: Internet Research Agency operations in the UK
Phase 2: Brexit Vote
Those who are still EU members can enjoy
their political correctness and tolerance
#BrexitVote https://t.co/VeMW7bagDQ
This is the simplest explanation. Just like UK we
too want to stop globalist liberals from ruining
us! #BrexitVote https://t.co/XkNFpNof1c
Case Study: Internet Research Agency operations in the UK
Phase 3: London Terror Attacks
Welcome To The New Europe! Muslim
migrants shouting in London “This is our
country now, GET OUT!” #Rapefugees
https://t.co/GCiFT96h76
Sharia NO-GO areas in BRITAIN. Citizens
blocked from their own suburbs. Only #Trump
can stop this here! https://t.co/IuQDe8rvPA
Case Study: Internet Research Agency operations in Europe
DISINFO: GEARING UP FOR THE US PRESIDENTIAL ELECTIONS
Colleagues who participated in the work and/or developed Method52
www.taglaboratory.org
Jeremy Reffin
