ISIS Twitter Analysis
Problem & Hypothesis 
Problem
In late August 2014, the terrorist group the Islamic State of Iraq and Syria
(ISIS) increased its use of Twitter to further its organizational
agenda and degrade U.S. competencies and reputation
Hypothesis
Collecting and analyzing ISIS Twitter data can enhance 
understanding of their networks and tactics, techniques, and 
procedures (TTPs) 
• ISIS Twitter activity correlates to U.S. military efforts to combat the 
organization 
• Network analysis can identify the most influential users and their 
communities
Data Challenges 
Twitter removed known ISIS accounts in early September,
prompting a shift in collection strategy
We instead selected accounts associated with self-identified jihadists,
using the following criteria:
• English language 
• “Active” – posted consistently in last 30 days 
• “Popular” – 200+ followers
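A minimal sketch of how the three criteria above could be applied as a filter. The profile keys `lang` and `followers_count` are assumed to follow the Twitter REST API user object; `MIN_RECENT_TWEETS` is a hypothetical threshold, since the slide does not define "consistently".

```python
from datetime import datetime, timedelta, timezone

MIN_FOLLOWERS = 200        # "Popular" threshold from the slide
ACTIVE_WINDOW_DAYS = 30    # "Active" window from the slide
MIN_RECENT_TWEETS = 5      # hypothetical: the slide does not define "consistently"


def meets_criteria(profile, tweet_times, now=None):
    """Return True if an account passes all three selection criteria.

    profile     -- dict with 'lang' and 'followers_count' keys (assumed shape)
    tweet_times -- list of timezone-aware datetimes, one per tweet
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=ACTIVE_WINDOW_DAYS)
    recent = [t for t in tweet_times if t >= cutoff]
    return (
        profile.get("lang") == "en"
        and profile.get("followers_count", 0) >= MIN_FOLLOWERS
        and len(recent) >= MIN_RECENT_TWEETS
    )
```

Keeping the thresholds as named constants makes it easy to loosen the filter when iterating collection over a wider set of accounts.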
Data Collection
@AbDujanah
The coward dies a thousand deaths
*Collection error, unable to obtain data
@julaybeeeeb
A very poor servant of Allah longing for his mercy
@4bu_Muhaj1r
Islamic State
@AbooJihad2013 @AbuTalha001 @Dawlat_Islam2
Amongst the Islamists in Sham
Account Suspended
@AbuHussain104
Random British Mujahid Somewhere In The Islamic State
Account Suspended
@FarisBritani
OUR REVOLUTION IS LIKE THE SALAH
@jab2victory
Al-Nusra Fi Bilaad Shaam
@onthatpath3
Tweets  Followers  Friends  Location
118     202        37       Baqiyaa
459     424        605      dunya
30      749        80       Dowlatul Islam
832     2166       484      Syria
872     984        340      NA
1794    359        84       NA
4470    1559       252      NA
2160    1503       58       Sham
458     1221       51       IS
Tweets Per User
Data Architecture 
Identify, collect and prepare data: 
1. Identified 10 jihadist Twitter accounts 
2. Queried the Twitter API for each account's friend,
follower, and status data
3. Extracted key data fields 
4. Stored in MongoDB 
[Pipeline diagram: Twitter (10 user accounts) -> Data Ingestion (Twython, Tweepy; capture profile data) -> Data Wrangling (extract metadata) -> MongoDB (NoSQL database)]
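Steps 3 and 4 above might look roughly like the following sketch. The key names (`id_str`, `created_at`, `text`, `user`, `retweeted_status`) are assumed to follow the Twitter REST API status object; the output field names and both function names are hypothetical, and the deck does not say which fields were actually kept.

```python
def extract_fields(status):
    """Reduce one raw status (tweet) dict to the fields used downstream."""
    user = status.get("user") or {}
    retweeted_user = (status.get("retweeted_status") or {}).get("user") or {}
    return {
        "tweet_id": status.get("id_str"),
        "created_at": status.get("created_at"),
        "text": status.get("text", ""),
        "screen_name": user.get("screen_name"),
        "followers_count": user.get("followers_count"),
        "friends_count": user.get("friends_count"),
        "location": user.get("location"),
        # Original author when the status is a retweet, otherwise None
        "retweet_of": retweeted_user.get("screen_name"),
    }


def store_statuses(collection, raw_statuses):
    """Extract key fields from each status and insert them in one batch.

    `collection` is any object with a pymongo-style insert_many(list_of_dicts).
    Returns the number of documents inserted.
    """
    docs = [extract_fields(s) for s in raw_statuses]
    if docs:  # pymongo rejects an empty document list
        collection.insert_many(docs)
    return len(docs)
```

With PyMongo installed, `collection` could be something like `MongoClient()["twitter"]["statuses"]` (database and collection names are placeholders). Passing the collection in, instead of opening a connection inside the function, keeps the extraction logic testable without a running database.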
Analytic Approach 
Network Analysis 
Isolate Jihadist communities 
and influencers 
Tools 
• NetworkX 
• Gephi 
Approaches 
• Louvain Method 
• Eigenvector Centrality 
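To illustrate the eigenvector centrality idea, here is a standard-library sketch using power iteration on an undirected graph. The deck lists NetworkX, which provides `networkx.eigenvector_centrality`; this hand-rolled version is only an assumption-laden illustration of the technique (ties treated as undirected, node names hypothetical), not the project's code. Louvain community detection is not shown.

```python
from math import sqrt


def eigenvector_centrality(edges, max_iter=200, tol=1e-9):
    """Power iteration on an undirected graph given as (u, v) pairs.

    Returns a dict mapping node -> centrality score (unit Euclidean norm).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    score = {n: 1.0 for n in neighbors}
    for _ in range(max_iter):
        # Multiply by (A + I): keep own score, add the neighbours' scores.
        # The +I shift avoids oscillation on bipartite graphs.
        new = {n: score[n] + sum(score[m] for m in neighbors[n]) for n in neighbors}
        norm = sqrt(sum(x * x for x in new.values())) or 1.0
        new = {n: x / norm for n, x in new.items()}
        if sum(abs(new[n] - score[n]) for n in new) < tol:
            return new
        score = new
    return score
```

A node scores highly when it is connected to other high-scoring nodes, so the measure rewards being followed by well-connected accounts and not just having many ties.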
Text Analytics
Identify common and similar
terms; correlate to events
Tools
• NLTK
• NumPy
• Pandas
• SciPy
• scikit-learn (sklearn)
• Gensim
Approaches
• Frequency Analysis
• TF-IDF Cosine Similarity Matrix
• Trend Analysis (Mann-Kendall
test)
• Pearson Correlation
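As an illustration of the TF-IDF cosine similarity matrix, the sketch below uses only the standard library. The tokenizer and function names are hypothetical; the smoothed IDF form, ln((1 + N) / (1 + df)) + 1, is one common choice and is an assumption here, since the deck does not state which weighting was used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)
    # Document frequency: number of documents containing each term
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(docs):
    """Pairwise cosine similarity between all documents."""
    vecs = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vecs] for a in vecs]
```

Here each "document" could be the concatenated statuses of one user, so that entry (i, j) of the matrix compares the vocabulary of user i with that of user j.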
NETWORK ANALYSIS 
RESULTS
Users with Highest Centrality
Top Retweeted Accounts of the Nine
[Bar chart: x-axis "Retweeted User", y-axis "Retweets" (0 to 800); series: the nine accounts 4bu_Muhaj1r, AbooJihad2013, AbuHussain104, AbuTalha001, Dawlat_Islam2, FarisBritani, jab2victory, julaybeeeeb, onthatpath3]
*Chart reflects 2,700 tweets of 13,080 collected
TIME SERIES
Tweet and Airstrikes Time Series
Tweets to Airstrikes Regression
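The two statistics named in the analytic approach, the Mann-Kendall trend test and Pearson correlation, can be sketched with the standard library as below. This is an illustrative version under stated assumptions: the Mann-Kendall variance omits the tie correction, and the inputs are assumed to be equally spaced daily counts (for example, tweets per day and airstrikes per day). It is not the project's actual code.

```python
import math


def mann_kendall_z(series):
    """Mann-Kendall Z statistic for a monotonic trend (no tie correction).

    Positive Z suggests an upward trend, negative Z a downward one;
    |Z| > 1.96 is significant at the 5% level (two-sided).
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

Mann-Kendall is rank-based, so it detects a monotonic trend without assuming linearity, while Pearson measures linear association between the two series.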
TEXT ANALYTICS 
RESULTS
LIVE DEMO
CONCLUSIONS
Conclusions 
There are two distinct communities of users connected by 
@AbuTalha001: 
• Users following self-proclaimed British foreign fighters in Syria 
@FarisBritani and @AbuHussain104 
• Smaller overlapping clusters following information about Ahrar
ash-Sham and al-Nusrah Front
The Syrian Islamist news and translation account @IbnNabih was
identified as the most influential user in the collective network
Conclusions 
No statistical correlation between U.S. airstrikes and the volume
of Twitter data
• One user, @Dawlat_Islam2, shows a positive correlation
between an upward trend in “key words” and airstrikes
Very few tactical or threatening words/statements 
Two primary categories for user statuses: 
• Religious statements, top 10 words for each user included 
“muslim(s)” and “allah” 
• General news and updates about Islamist fighting in Syria
Lessons Learned & Next Steps 
We need more data; iterating collection from users in our 
network would enhance results 
Traditional text analytics methods are of limited effectiveness on
Twitter data
Several other opportunities for analysis: 
• Studying the way users interact 
• The posting of original content vs. retweets (removal of 
media statements) 
• Collecting and analyzing text data of each community

Data Analytics Capstone
