The document describes a project to detect fake news using machine learning models. It discusses how the project classified news websites as real or fake using a combination of bag-of-words, word embeddings and feature descriptions with 87.39% accuracy. Some ways to improve the model are also provided, such as using more features in the word embeddings. Real-world applications of fake news detection include verifying news on social media during elections and detecting fake job postings.
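The summary above describes a bag-of-words classifier for real-vs-fake news. A minimal sketch of that general approach, using multinomial Naive Bayes with Laplace smoothing, might look like the following. The toy training texts and labels are invented for illustration; the original project's features, data, and 87.39% accuracy figure come from its own, larger setup.

```python
# Minimal bag-of-words "real vs fake" news classifier sketch.
# Multinomial Naive Bayes with Laplace smoothing; toy data is invented.
import math
from collections import Counter

train = [
    ("scientists publish peer reviewed study on vaccine safety", "real"),
    ("government report details quarterly economic growth", "real"),
    ("miracle cure doctors do not want you to know about", "fake"),
    ("shocking secret proves the election was a hoax", "fake"),
]

vocab = {w for text, _ in train for w in text.split()}
counts = {"real": Counter(), "fake": Counter()}
docs = Counter(label for _, label in train)
for text, label in train:
    counts[label].update(text.split())

def score(text, label):
    # log P(label) + sum of log P(word | label), Laplace-smoothed
    total = sum(counts[label].values())
    s = math.log(docs[label] / len(train))
    for w in text.split():
        if w in vocab:
            s += math.log((counts[label][w] + 1) / (total + len(vocab)))
    return s

def predict(text):
    return max(("real", "fake"), key=lambda lab: score(text, lab))

print(predict("shocking miracle cure the government hides"))  # fake
```

A fuller system would add word embeddings and hand-crafted features alongside the raw word counts, as the project summary suggests.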
This presentation outlines five ways to find data on your reporting beat that can be developed into unique stories. It also outlines several data-driven story ideas on three beats: cops and courts, health, and government. And it includes exercises on how to sort in Excel and search for stories in government databases. It was created by Manuel Torres, enterprise editor for The Times-Picayune | Nola.com, for APME's NewsTrain in Monroe, La., on Oct. 15-16, 2015. It is accompanied by two handouts: "Data-Driven Enterprise off Your Beat" and "Help Getting Public Records." NewsTrain is a training initiative of Associated Press Media Editors: http://bit.ly/NewsTrain
Evaluating Real World Information (NJLA 2018) by Megan Dempsey
Presented at the 2018 New Jersey Library Association Annual Conference. Discusses examples of misinformation and distorted information found online and a method for thinking critically about the information we encounter.
This is an invited talk I presented at the University of Zurich, speakers' series 2.10.2017. The presentation is based on the following paper: Brandtzaeg, P. B., & Følstad, A. (2017). Trust and distrust in online fact-checking services. Communications of the ACM. 60(9): 65-71
Lightning Talk: Using Data without Compromising Privacy by Gordon Haff
Deep learning and machine learning more broadly depend on large quantities of data to develop accurate predictive models. In areas such as medical research, sharing data among institutions can lead to even greater value. However, data often includes personally identifiable information that we may not want to (or even be legally allowed to) share with others. Traditional anonymization techniques only help to some degree.
In this talk, Red Hat's Gordon Haff will share with you the active research activity taking place in academia and elsewhere into techniques such as multi-party computation and homomorphic encryption. The goal of this research is to enable broad information sharing leading to better models while preserving the anonymity of individual data points.
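The talk summary mentions multi-party computation. One of its simplest building blocks, additive secret sharing, can be sketched as follows; this is purely illustrative (real MPC protocols add secure channels and protection against malicious parties), and the hospital scenario is an invented example.

```python
# Sketch of additive secret sharing: each private value is split into
# random shares mod a prime, so only the aggregate can be recovered.
import secrets

P = 2**61 - 1  # arithmetic modulo a prime keeps shares uniform

def share(value, n_parties):
    """Split `value` into n additive shares mod P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

# Three hospitals each hold a private patient count.
private_values = [120, 75, 230]
n = len(private_values)

# Each hospital shares its value; party i keeps one share of each value.
all_shares = [share(v, n) for v in private_values]

# Each party locally sums the shares it holds; only these partial sums
# are published, never any individual value.
partial_sums = [sum(col) % P for col in zip(*all_shares)]

total = sum(partial_sums) % P
print(total)  # 425: the aggregate, with no individual count revealed
```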
SANSFIRE - Elections, Deceptions and Political Breaches by John Bambenek
It's been the year of political breaches. While campaigns are odd entities, there are lessons enterprises can draw from what happened in 2016 to protect their organizations from attacks.
Using language to save the world: interactions between society, behaviour and... by Diana Maynard
The document discusses social media analysis and natural language processing as applied to Twitter data. It provides statistics on Twitter usage and the most followed accounts. It then discusses challenges in analyzing social media text due to informal language usage and outlines common NLP preprocessing steps. Applications discussed include identifying named entities, geotagging tweets, user and topic classification, and analyzing hate speech directed at politicians on Twitter around UK elections in 2015 and 2017.
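The NLP preprocessing steps mentioned for tweets typically include lowercasing and separating URLs, mentions, and hashtags from ordinary words. A regex-based toy tokeniser sketch is below; production systems (e.g. GATE's TwitIE) are considerably more robust, and the example tweet is invented.

```python
# Toy tweet preprocessor: lowercase, then pull out URLs, @mentions,
# #hashtags, and plain words with a single alternation regex.
import re

TOKEN = re.compile(r"https?://\S+|[@#]\w+|\w+|[^\w\s]")

def preprocess(tweet):
    tokens = TOKEN.findall(tweet.lower())
    return {
        "urls": [t for t in tokens if t.startswith("http")],
        "mentions": [t for t in tokens if t.startswith("@")],
        "hashtags": [t for t in tokens if t.startswith("#")],
        "words": [t for t in tokens if re.fullmatch(r"\w+", t)],
    }

out = preprocess("Voting day! See @BBCNews http://bbc.co.uk #GE2017")
print(out["hashtags"])  # ['#ge2017']
```

Downstream tasks such as named entity recognition and geotagging build on tokenised output like this.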
The past decade or so has seen such rapid advances in supervised deep learning and neural networks that those areas, and machine learning more generally, have become almost synonymous with AI especially in popular media. However, there are other broad areas of research that have fed into AI historically and continue to be important today.
In this talk, Red Hat’s Gordon Haff will place machine learning within this set of broader science and engineering specialties that include cognitive psychology, control theory, linguistics, and human factors. The goal is to provide attendees with a broader context for both learning and applying cross-disciplinary fields of study to their AI-related work.
Internet subcultures like trolls, gamergaters, hate groups, conspiracy theorists, hyper-partisan news outlets, and politicians take advantage of vulnerabilities in the current media ecosystem to manipulate news frames and propagate their ideas. They use techniques like memes, bots, and strategic amplification on social media to increase the visibility of their messages. Factors like lack of trust in the media, decline of local news, and the attention economy make the media vulnerable to such manipulation. The outcomes can include increased misinformation, distrust of the media, and further radicalization.
#ThinkPH Social Media Sentiment Analysis by Robin Leonard
My presentation at #ThinkPH 'The Internet, Big Data and You' Conference, on August 23, 2013 at New World Hotel, Makati.
Click here to see the #ThinkPH conference details and agenda: http://www.rappler.com/bulletin-board/36539-agenda-rappler-google-thinkph-internet-big-data-conference
Event hosted by Rappler, Google and SocialGood.
My slides cover:
1. Why analyze sentiment?
2. How does sentiment analysis work?
3. Practical applications
4. Sentiment of #ThinkPH Conference
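The simplest answer to "how does sentiment analysis work?" is counting matches against a polarity lexicon; a brief sketch follows. The tiny lexicon and example sentence are invented, and real systems use large lexicons or trained models.

```python
# Lexicon-based sentiment sketch: positive hits minus negative hits.
POSITIVE = {"great", "love", "excellent", "good", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def sentiment(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("Great talk, love the big data examples"))  # positive
```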
Lecture 10 Inferential Data Analysis, Personality Quizzes and Fake News... by Marcus Leaning
Social media platforms collect vast amounts of personal data through user activities and interactions, which is then analyzed and integrated with other data sources to build detailed profiles of individuals. These profiles can accurately predict personal attributes and behaviors. Marketers and political groups utilize these insights to micro-target advertising and fake news stories meant to influence opinions. The 2016 US election saw the effective use of social media data and fake news to sway voters through methods developed by firms like Cambridge Analytica.
This document summarizes the GATE toolkit and its tools for social media analysis including analyzing sentiment, topics, and hate speech. It discusses how the tools can be used to study misinformation and disinformation online, characterize abuse against women journalists, and understand the escalation of online violence. Challenges discussed include fairness in models, balancing free speech with safety, and ensuring ethical use of personal data.
The document summarizes the work of the Centre for the Analysis of Social Media (CASM). It notes that social media use has rapidly increased, capturing more political, social, and intellectual activity. CASM aims to use this "social media intelligence" or "socmint" to inform understanding, predict events, and provide situational awareness. It discusses using natural language processing to create classifiers and analyze social media data around the 2012 Olympics to detect events in real-time. However, it also outlines challenges with using social media data, including issues of representativeness, veracity, reality, validation, use, and legitimacy. It argues that social media intelligence is still an emerging discipline that must address these challenges to become legitimate and effective.
This document provides an overview of fake media and its evolution. It discusses how cheap devices and software have enabled the widespread production and distribution of manipulated content. The document outlines the main drivers behind the rise of seamless fake content, including cheap devices, editing software, storage and distribution methods. It also discusses how picture manipulation techniques have evolved over time for purposes like propaganda, election influence and rewriting history. The document proposes that fake media is a multidimensional challenge requiring educational, legal and technical solutions and outlines JPEG's activities to develop standards in this area.
1. Cyber Ethics and Cyber Crime
2. Security in Social Media & Risk of Child Internet
3. Social media in Schools and photo privacy
4. Risk of OSNs and Security, Privacy of Facebook
5. Risk and Security of Social Networking site Facebook and Twitter
6. Risk analysis of Government and Online Transaction
Data commons and their role in fighting misinformation.pdf by Elena Simperl
The document discusses the role of data commons and open data in fighting misinformation. It notes that algorithms used to detect misinformation are only as good as the data they are trained on, and that data work faces challenges around transparency, accountability, and mitigating biases. However, open data initiatives involving contributions from many users can help by making more trusted data available to algorithms. Overall, participatory and transparent approaches to data are needed to build critical infrastructure for combating the spread of misinformation.
This document summarizes a presentation on privacy, security and ethics related to big data analytics. It discusses several key points:
1. Big data promises new opportunities but also new privacy and surveillance risks due to the vast amount of personal data being collected and analyzed.
2. Privacy risks are best managed proactively through techniques like Privacy by Design which embeds privacy protections from the start of a project.
3. Innovation and privacy are not mutually exclusive; it is possible to gain insights from big data analytics while also protecting privacy through approaches like Privacy by Design.
Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019 (Weverify)
Presentation "WEVERIFY: ASSISTIVE AI TOOLS FOR ANALYSING FALSE CONTENT, DISINFORMATION FLOWS, AND ONLINE INFLUENCE CAMPAIGNS". By Kalina Bontcheva. Oct 2019.
This document summarizes a presentation on big data given by Sir Mark Walport, the UK's Chief Scientific Adviser. It discusses the opportunities and risks of big data, including how it can improve health and infrastructure but also enable privacy violations. While data can be anonymized, it is difficult to fully protect privacy due to the ability to match anonymous data with other public datasets. Both utopian and dystopian futures are possible depending on how data is governed and balanced with individual privacy. Moving forward will require advances in technology, open communication, and governance measures to control data access.
Effective Cybersecurity Communication Skills by Jack Whitsitt
Presentation describes the problems associated with communication with others - as an information receiver or provider - about cybersecurity and provides insights into how those problems may be overcome through structured communication, the use of positive and negative space, and the setting of perspective and context through lensing.
Pew Internet Director Lee Rainie discussed the new media ecosystem with leaders of community foundations from Western states and several other locales. He described how three technology revolutions have made the media world personal, portable, participatory, and pervasive in people’s lives and how those changes have affected communities.
DefCamp #5, Bucharest, November 29th
Just as a chain is as weak as its weakest link, computer systems are as vulnerable as their weakest component – and that's rarely the technology itself, it's more often the people using it. This is precisely why it's usually easier to exploit people's natural inclination to trust than it is to discover ways to hack into computer systems. As the art of manipulating people into giving up confidential information, Social Engineering has been a hot topic for many years. This session will discuss some of the most common Social Engineering techniques and countermeasures.
The document discusses the role of CIOs in combating terrorism through cybersecurity. It outlines how terrorists now use the internet and social media to recruit, fundraise, and plan attacks. CIOs must secure corporate networks and share threat information to prevent their networks from being used by terrorists. The document proposes establishing a regional cybersecurity cooperation center to facilitate collaboration between companies, governments, and law enforcement in addressing cyber threats.
Social Media Training at AED by Eric Schwartzman. This is Day 2 of a 2-Day Seminar delivered on Nov. 10, 2010 in Washington, D.C. Feel free to use this deck but please credit www.ericschwartzman.com
Cyber Resilience presented at the Malta Association of Risk Management (MARM) Cybercrime Seminar of 24 June 2013 by Mr Donald Tabone. Mr Tabone, Associate Director and Head of Information Protection and Business Resilience Services at KPMG Malta, presented a six-point action plan corporate entities can follow in order to reach a sustainable level of cyber resilience.
Introduction to Cybersecurity - Secondary School_0.pptx by ShubhamGupta833557
This document provides an introduction to cybersecurity and discusses various cybersecurity topics such as why people hack, phishing and social engineering, securing public networks and cellular data, what to do if hacked, and tips for increasing password security. Specifically, it explains that hackers may target users for financial gain, revenge, or fun; outlines common phishing techniques on personal accounts and social media; recommends using a VPN on public Wi-Fi and avoiding giving personal info on cellular networks; and advises changing passwords and running antivirus scans if hacked.
Denver Event - 2013 - New Media Ecosystem: Personal. Portable. Participatory.... by KDMC
The document summarizes key findings from a Pew Research Center report on digital technology trends in the United States. It finds that broadband internet access at home has increased dramatically, with 66% of Americans now having broadband at home. Mobile internet access through smartphones and tablets is also widespread, with 56% owning smartphones. Social media usage has also increased significantly, with 61% of American adults now using some form of social media. The document concludes by discussing how digital technologies have networked both people and information, changing civic engagement and the flow of information.
This document provides resources for teaching students how to identify and avoid fake news. It includes links to websites run by organizations like the Tampa Bay Times and Stanford University that provide fact-checking tools and strategies. It also discusses psychological factors that can cause the spread of fake news, like confirmation bias, and strategies for overcoming things like emotional or fast thinking. Overall, the document aims to equip students and teachers with the skills and knowledge to more carefully evaluate the credibility of news and information they encounter online.
Big Data Privacy - Society Issues + Big Data by Sylvia Ogweng
A review of the six societal issues related to big data and privacy, including:
- Perception
- The necessity of data sharing
- Cost reduction
- Public mistrust
- Hubris & Hyperbole
The document provides information on Mark Zuckerberg and the founding of Facebook. It details how Zuckerberg created "Facemash" in 2003 which objectified Harvard students and got him in disciplinary trouble. It then summarizes the founding of Facebook in 2004, its mission/vision, key people, and a timeline of events including data breaches and actions taken in response.
The COVID-19 global pandemic taught the world multiple lessons. In this talk, I set the stage for the discussion, highlight the issues we faced (and still face), describe an effort that helped address one of those issues, and then turn to future challenges and our responsibilities going forward.
Similar to Reffin meetup talk slides 20 02-20c
We are pleased to share with you the latest VCOSA statistical report on the cotton and yarn industry for the month of March 2024.
Starting from January 2024, the full weekly and monthly reports will only be available for free to VCOSA members. To access the complete weekly report with figures, charts, and detailed analysis of the cotton fiber market in the past week, interested parties are kindly requested to contact VCOSA to subscribe to the newsletter.
Generative Classifiers: Classifying with Bayesian decision theory, Bayes’ rule, Naïve Bayes classifier.
Discriminative Classifiers: Logistic Regression, Decision Trees: Training and Visualizing a Decision Tree, Making Predictions, Estimating Class Probabilities, The CART Training Algorithm, Attribute selection measures- Gini impurity; Entropy, Regularization Hyperparameters, Regression Trees, Linear Support vector machines.
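The attribute-selection measures named above, Gini impurity and entropy, can be computed directly from class proportions. A brief sketch with invented toy label sets:

```python
# Gini impurity and entropy for a list of class labels, the two
# attribute-selection measures used when growing decision trees (CART
# uses Gini by default; ID3/C4.5-style trees use entropy).
import math
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    n = len(labels)
    return 0.0 - sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

pure = ["yes"] * 6
mixed = ["yes"] * 3 + ["no"] * 3

print(gini(pure), gini(mixed))       # 0.0 0.5
print(entropy(pure), entropy(mixed)) # 0.0 1.0
```

A pure node scores 0 under both measures; a 50/50 binary split maximises them (0.5 for Gini, 1 bit for entropy), which is why splits are chosen to reduce these values.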
We are pleased to share with you the latest VCOSA statistical report on the cotton and yarn industry for the month of May 2024.
Starting from January 2024, the full weekly and monthly reports will only be available for free to VCOSA members. To access the complete weekly report with figures, charts, and detailed analysis of the cotton fiber market in the past week, interested parties are kindly requested to contact VCOSA to subscribe to the newsletter.
3. • An excellent approach and something to be deployed with vigour in any situation where it can usefully be applied
but ...
• Problem #1: It rarely happens, and when it does, it's often an accident
• Problem #2: It takes a lot of effort for humans to do it
• Problem #3: It's impossible (more or less) for computers
Detecting and tracking fake news and misinformation at scale
9. Case Study: Fact checking your article
Read article → Identify claims → Collect evidence → Rank evidence → Output
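The fact-checking workflow on this slide can be sketched as a pipeline of small functions. All helpers below are illustrative stubs with invented behaviour, not the deck's actual system; a real implementation would query search engines or fact-check databases for evidence.

```python
# Stub pipeline mirroring the slide's stages:
# read article -> identify claims -> collect evidence -> rank -> output.
def identify_claims(article):
    # Stub: treat each sentence as a candidate claim.
    return [s.strip() for s in article.split(".") if s.strip()]

def collect_evidence(claim):
    # Stub: a real system would retrieve documents about the claim.
    return [{"text": f"source discussing: {claim}", "relevance": 0.5}]

def rank_evidence(evidence):
    return sorted(evidence, key=lambda e: e["relevance"], reverse=True)

def fact_check(article):
    return {
        claim: rank_evidence(collect_evidence(claim))
        for claim in identify_claims(article)
    }

report = fact_check("The election was stolen. Turnout reached 90 percent")
print(len(report))  # 2 claims extracted
```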
10. Detecting and tracking fake news and misinformation at scale
• Problem #4: It's always something new
• Problem #5: True believers don't care
• Problem #6: It's just not good politics
12. • Characterise the conflict
• Identify the activities
13. Case Study: Disrupting Daesh – Golden Age
• 2014-2015: Golden Age on Twitter for Islamic State
• Thriving online community (50,000-70,000 active accounts)
• Very easy access to contact and content
• Obvious markers of support (avatars, screen names, hashtags)
• Strong and supportive ideological community and sub-communities (e.g. Chechens, 'Sisters')
14. Case Study: Disrupting Daesh – late 2015 disruption begins
• From mid-2015, community disruption begins
o Account suspensions and takedowns
o Disruption of hashtags
• Reactions:
o Flight to Telegram
o May have strengthened community cohesion
• Late 2016: what was left?
o Impact on online Twitter community?
o Activities on Twitter?
16. Method52 allows users to 'fail fast' and iterate to find patterns of use
• Grounded theory (Glaser et al., 1968)
• "Unbiased examination of the available data"
• Iterative exploration of what fits (annotation Scheme 1 → Scheme 2 → Scheme 3)
17. Case Study: Disrupting Daesh. Build bespoke pipelines that are adapted to
the specific scenario
[Architecture diagram: seed accounts feed a Data Store of social media data; a
Pipeline Construction Engine and construction, maintenance & analysis tools
build the bespoke systems (a Disruption Monitoring System and a Daesh
propaganda analysis system); a Visualisation Engine supports visualisation &
evaluation]
18. Case Study: Disrupting Daesh
[Pipeline diagram: seed accounts and seed search terms pull in tweets; tweets
are assessed for relevancy; confirmed accounts are scored & analysed; links in
flagged tweets are analysed; new terms are identified and fed back into the
search; a Data Store holds account, tweet and link details]
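The bootstrapping loop on this slide — seed accounts and terms pull in tweets, confirmed accounts are analysed, and newly discovered terms feed back into the search — can be sketched as follows. Names, callbacks and the round limit are illustrative assumptions, not the authors' Method52 pipeline.

```python
# Sketch of the seed-and-feedback discovery loop. The caller supplies
# fetch_tweets, is_relevant and extract_terms; all are hypothetical hooks.
def discovery_loop(seed_accounts, seed_terms, fetch_tweets, is_relevant,
                   extract_terms, rounds=3):
    accounts, terms = set(seed_accounts), set(seed_terms)
    store = []  # stands in for the data store of account/tweet/link details
    for _ in range(rounds):
        tweets = fetch_tweets(accounts, terms)
        relevant = [t for t in tweets if is_relevant(t)]
        store.extend(relevant)
        accounts |= {t["author"] for t in relevant}
        new_terms = extract_terms(relevant) - terms
        if not new_terms:  # converged: nothing new to feed back
            break
        terms |= new_terms
    return accounts, terms, store
```

The round limit bounds the crawl; in practice relevancy assessment and term extraction would be trained classifiers rather than simple callbacks.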
19. Case Study: Disrupting Daesh – strategies for identifying candidate accounts
• Content of tweets
• Generic words (qa'idin, bay'ah, nifaq, mushrik)
• Current topics (tabqa, Suwaydiya, Abu Ali al-Turki)
• Presence of generic comms links (Telegram, YouTube etc.)
• Specific known links (images, YouTube, other videos)
• Specific known hashtags (#tabqa)
• Mentions of specific 'canary' accounts (@39_nas)
• Network analysis
• Build out and understand network. Possible typology: 'source',
'canary', 'news gathering', 'signpost' and 'protected chat'
accounts
• Followers of known 'source' accounts (p_vanostaeyen)
• Followers of known 'canary' accounts (whoamidude)
• Followed by or followers of network members (protected chat
network, 'news gathering' accounts, 'signpost' accounts)
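The content-based strategies above can be approximated with a simple marker scorer: flag accounts whose tweets contain known terms, hashtags or ‘canary’ mentions. The term lists, weights and threshold below are illustrative placeholders, not the project's actual vocabulary or scoring.

```python
# Hedged sketch of content-based candidate identification. The marker sets
# reuse examples from the slide; weights and threshold are assumptions.
MARKER_TERMS = {"qa'idin", "bay'ah", "nifaq", "mushrik"}  # generic words
MARKER_TAGS = {"#tabqa"}                                   # known hashtags
CANARY_ACCOUNTS = {"@39_nas"}                              # 'canary' mentions

def score_account(tweets):
    """Count marker hits across an account's tweets; higher = more suspect."""
    score = 0
    for text in tweets:
        words = text.lower().split()
        score += sum(w in MARKER_TERMS for w in words)
        score += 2 * sum(w in MARKER_TAGS for w in words)      # tags weigh more
        score += 3 * sum(w in CANARY_ACCOUNTS for w in words)  # canaries most
    return score

def candidates(accounts, threshold=2):
    """accounts: dict mapping account name -> list of tweet texts."""
    return {a for a, tweets in accounts.items()
            if score_account(tweets) >= threshold}
```

Network-based strategies (followers of ‘source’ accounts, protected-chat membership) would then be layered on top of this content filter.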
20. Case Study: Disrupting Daesh - account suspension rate
              Tweets  Followers  Friends
IS                51         14       33
Other Jihadi     320        189      122
23. Case Study: Disrupting Daesh – URLs used as destinations
[Chart: mean share of URLs per day in the periods 4 Feb–8 Feb, 4 Mar–8 Mar*
and 4 Apr–8 Apr, by destination: justpaste.it, IS’s own server, archive.org,
sendvid.com, YouTube, Google Drive, thevid.net, vimple.co, store6.up, pc.cd,
4shared.com, cloud.mail.ru, addpost.it, vid.me and others (26 domains)]
* Excludes 7 Mar, which had 240 URLs (Rumiyah release)
24. Case Study: Disrupting Daesh – intercepting the propaganda
Note: all accounts tracked were created before 0600Z on Tuesday 4 April; the
data set was created at 0600Z.
25. Emerging general methodology: the first iteration
[Pipeline diagram: inbound data* is assessed for relevancy to find sites and
accounts; the message is analysed and search terms refined; accounts are
identified; narratives are identified and clustered; attributes and networks
are identified]
*Print media, websites, forums, social media
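The ‘cluster narratives’ stage of this methodology can be illustrated with a toy greedy clusterer that groups messages whose word overlap (Jaccard similarity) exceeds a threshold. A production system would use proper text clustering; everything here is an assumption for illustration.

```python
# Hedged sketch of narrative clustering by word-overlap similarity.
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two word sets (0 when both are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_narratives(messages, threshold=0.3):
    clusters = []  # each cluster: (representative word set, member messages)
    for msg in messages:
        words = set(msg.lower().split())
        for rep, members in clusters:
            if jaccard(words, rep) >= threshold:
                members.append(msg)  # join the first sufficiently similar cluster
                break
        else:
            clusters.append((words, [msg]))  # start a new narrative cluster
    return [members for _, members in clusters]
```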
27. Characterising conflict: the concept of ‘Information Operations’
• Information operations are vast in scale and numerous in strategies and tactics
• A focus on ‘fake news’ or ‘misinformation’ is myopic
• Most information is not ‘fake’, but the selective amplification of reputable stories
• Information operations are characterised by erratic bursts of activity
• Information operations exploit cultural and social division
• Although information operations are coordinated, they are inconsistent, presenting a
challenge to third-party identification of inauthentic accounts.
28. Case Study: Internet Research Agency operations in the UK
Phase 1: Spam and the process of building credible accounts
I'm ready to eat healthy and workout.
@xhibellamy @William_Stokes @guru_paul
@ThomasAmor1 @jennyc08318 @richtweten
http://t.co/TAZ9Co1QF9
.@pedrareyes148 pedra @Chloe0354
ASDFGchloeHJKLL? @pulmonxry Yeezus
@Nick281051 Nick @puffylore163 lore
http://t.co/ZLpIlrsV33
29. Case Study: Internet Research Agency operations in the UK
Phase 2: Brexit Vote
Those who are still EU members can enjoy
their political correctness and tolerance
#BrexitVote https://t.co/VeMW7bagDQ
This is the simplest explanation. Just like UK we
too want to stop globalist liberals from ruining
us! #BrexitVote https://t.co/XkNFpNof1c
30. Case Study: Internet Research Agency operations in the UK
Phase 3: London Terror Attacks
Welcome To The New Europe! Muslim
migrants shouting in London “This is our
country now, GET OUT!” #Rapefugees
https://t.co/GCiFT96h76
Sharia NO-GO areas in BRITAIN. Citizens
blocked from their own suburbs. Only #Trump
can stop this here! https://t.co/IuQDe8rvPA