ABSTRACT: Most experts agree that the 2016 US election was targeted by the Internet Research Agency (IRA), a Russian pro-government company that used social networks to interfere with the election. Twitter, Facebook, and YouTube became useful tools for Russian propaganda aimed at swaying voters. In October 2018, Twitter released a large dataset containing all the tweets, over 9 million, shared by the Russian trolls. The analysed data show how carefully the Internet Research Agency developed its propaganda. This talk illustrates how Russian propaganda on Twitter worked.
BIO: My name is Luigi Gubello, but on Twitter I am better known as @evaristegal0is. My name became public in connection with the #Hack5Stelle case. In my free time I like writing Python code, especially to analyse Twitter data, and I look for bugs and vulnerabilities through bug bounty programs: it's fun to look for mistakes in big companies' products. Writing about myself is really not my thing; I have issues with "About me" pages.
2. :~$ whoami
● Luigi Gubello by day
● @evaristegal0is by night
● Wannabe mathematician
● Linux user
● Python enthusiast
● Twitter addict
● Cat lover
Speck & Tech - 4th October 2019 luigi.gubello [at] protonmail.com
3. 📅 Some important dates
● 9th Nov 2016: Trump wins 2016 election
● 10th Apr 2018: Zuckerberg testifies before Congress on
data scandal (Cambridge Analytica)
● 31st Jul 2018: FiveThirtyEight shares three million
Russian troll tweets
4. ● 17th Oct 2018: Twitter releases the complete dataset of
Internet Research Agency tweets
5. What information has Twitter
shared?
● Information about the users
● Information about the tweets
● Huge archive of tweet media (videos and pics)
6. What information has Twitter not
shared?
● A lot of accounts are hashed
● Nothing about followers
● Nothing about changes in the accounts
7. 🔧 Python modules I have used
● pandas
● csv
● langid
● re
● emoji
● matplotlib
● wordcloud
● seaborn
8. ⚽ Goal
1. Was there Russian propaganda in Europe 🇪🇺 ?
2. Did the Internet Research Agency try to manipulate
information in Italy ?
9. Some numbers about the dataset
● There are 3,613 accounts
● The accounts were created mainly in 2013 and 2014, two years before the US election*
* Numbers based on version 1.0 of the dataset
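A minimal sketch of how the creation-date count could be reproduced with the csv module from the list of modules. It assumes the dataset has one row per tweet and that the columns are named `userid` and `account_creation_date`; both names and the sample rows are assumptions for illustration, not taken from the slides.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample; real data would be read from the released CSV file.
# Column names are assumed, not confirmed by the slides.
SAMPLE_CSV = """userid,account_creation_date
a1,2013-05-02
a1,2013-05-02
b2,2014-09-17
c3,2014-01-30
d4,2016-03-11
"""

def accounts_by_creation_year(csv_file):
    """Count distinct accounts per creation year (one row per tweet assumed)."""
    seen = set()
    per_year = Counter()
    for row in csv.DictReader(csv_file):
        if row["userid"] in seen:
            continue  # the same account appears once for every tweet it posted
        seen.add(row["userid"])
        per_year[row["account_creation_date"][:4]] += 1  # keep "YYYY"
    return per_year

counts = accounts_by_creation_year(io.StringIO(SAMPLE_CSV))
for year in sorted(counts):
    print(year, counts[year])
```

Deduplicating on the account identifier matters here: without it, very active accounts would be counted once per tweet and distort the yearly totals.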
10. The most used languages are:
English, Russian, German, Ukrainian and Bulgarian.
There are:
● 2,384 English accounts (62.1%)
● 1,039 Russian accounts (27.1%)
● 111 German accounts (1.6%)
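The per-language counts and percentages can be computed along these lines. This is only a sketch: it assumes each account has a single default language code, and the `accounts` mapping below is hypothetical sample data.

```python
from collections import Counter

def language_shares(account_languages):
    """Return {language: (count, percent)} ordered by descending count."""
    counts = Counter(account_languages.values())
    total = sum(counts.values())
    return {
        lang: (n, round(100 * n / total, 1))
        for lang, n in counts.most_common()
    }

# Hypothetical mapping: account id -> default account language
accounts = {"a1": "en", "b2": "en", "c3": "ru", "d4": "de", "e5": "en"}

shares = language_shares(accounts)
for lang, (n, pct) in shares.items():
    print(f"{lang}: {n} accounts ({pct}%)")
```

The percentages depend on which total is used as the denominator, so the dataset version should be fixed before comparing shares.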
11. The most common words are:
● follow (203)
● love (128)
● conservative (121)
● life (119)
● Trump (103)
● MAGA (96)
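A word count like this one can be sketched with `re` and `collections.Counter`. The slide does not say which text field was counted, so the input here is simply a list of short made-up strings, and the stop-word list is a tiny illustrative one rather than the one actually used.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the author's list)
STOPWORDS = {"the", "a", "and", "i", "to", "of", "my", "me", "for"}

def most_common_words(texts, top_n=3):
    """Lowercase, tokenize on letters, drop stop words and very short tokens."""
    words = Counter()
    for text in texts:
        for token in re.findall(r"[a-z']+", text.lower()):
            if token not in STOPWORDS and len(token) > 2:
                words[token] += 1
    return words.most_common(top_n)

# Made-up sample texts
texts = [
    "Follow me! I love my country",
    "Conservative. Love life. Follow back",
    "Follow for news",
]

top = most_common_words(texts, top_n=2)
print(top)
```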
12. 9 million tweets from 2012 to 2018.
Interactions in 2016 and 2017 are impressive:
● 30 million
● 25 million
● almost 2 million
14. Russian propaganda in the US
Graph of English accounts by creation date [red lines]* **
Most accounts were created between May 2013 and September 2014, two years before the US election.
* Red lines represent the accounts with English as default language
** Numbers based on version 1.1 of the dataset
15. Volume of English tweets by month
Internet Research Agency
was very active in the US
between July 2014 and April
2017.
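A monthly volume series of this kind can be built by bucketing tweets on the "YYYY-MM" prefix of their timestamp. The sketch below assumes rows with `tweet_language` and `tweet_time` fields and a "YYYY-MM-DD HH:MM" timestamp format; the field names, the format, and the sample rows are assumptions.

```python
from collections import Counter

def tweets_per_month(rows, language):
    """Count tweets in the given language per calendar month."""
    per_month = Counter()
    for row in rows:
        if row["tweet_language"] == language:
            per_month[row["tweet_time"][:7]] += 1  # keep "YYYY-MM"
    return dict(sorted(per_month.items()))

# Made-up sample rows
rows = [
    {"tweet_language": "en", "tweet_time": "2016-10-03 09:15"},
    {"tweet_language": "en", "tweet_time": "2016-10-21 18:40"},
    {"tweet_language": "ru", "tweet_time": "2016-10-22 07:05"},
    {"tweet_language": "en", "tweet_time": "2016-11-08 22:30"},
]

monthly = tweets_per_month(rows, "en")
print(monthly)
```

The resulting dictionary is already sorted by month, so it can be passed directly to a plotting library such as matplotlib to draw the volume curve.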
16. Russian strategy in the US
● Fake local news accounts
● Sharing polarizing news
● Sharing daily and common
news (e.g. sport)
● Memes
● Interactions with trusted or
popular users (e.g. Trump,
journalists)
● Different languages in
different places
17. Russian propaganda in Europe 🇪🇺 ?
Numbers based on version 1.0 of the dataset
18. Russian propaganda in Germany
Many German accounts were created in October 2015 and July 2016, in the two years before the German federal election.
Graph of German accounts by creation date [red lines] *
* Numbers based on version 1.1 of the dataset
19. Volume of German tweets by month
● 2016: Brexit
● 2017: federal election in
Germany
20. The most used hashtags are:
● Merkel
● Erdogan
● political parties
● Syria and refugees
● Brexit
● Trump
* Numbers based on version 1.0 of the dataset
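Hashtag rankings can be sketched with `re` and a counter. This example assumes the hashtags of each tweet are stored as one bracketed, comma-separated string such as `[Merkel, Brexit]`; that storage format and the sample values are assumptions. Tags are lowercased so that `Merkel` and `merkel` count as the same hashtag.

```python
import re
from collections import Counter

def count_hashtags(hashtag_fields):
    """Count hashtags case-insensitively across bracketed list strings."""
    counts = Counter()
    for field in hashtag_fields:
        # \w+ is Unicode-aware in Python 3, so umlauts are kept intact
        for tag in re.findall(r"\w+", field):
            counts[tag.lower()] += 1
    return counts

# Made-up sample values, one string per tweet
fields = ["[Merkel, Brexit]", "[merkel]", "[]", "[Trump, Merkel]"]

tags = count_hashtags(fields)
print(tags.most_common(3))
```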
21. The most retweeted accounts are trusted mainstream media and journalists.
22. Did the Internet Research Agency try to manipulate information in Italy?
23. Volume of Italian tweets by month *
There are 2,494 tweets and 17,882 retweets in Italian.
* Numbers based on version 1.1 of the dataset
24. Graph of Italian accounts by creation date [red lines]
The spike in March and April 2017 was generated by only nine coordinated accounts, all created on 6th March 2017.
25. These nine accounts posted only retweets, so their direct impact was negligible.
* Numbers based on version 1.0 of the dataset
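One way to find accounts that posted only retweets is to compare each account's retweet count with its total tweet count. The sketch assumes rows with a `userid` and a boolean `is_retweet` field; the field names and the sample rows are assumptions, and a real CSV would need the flag converted from text first.

```python
from collections import defaultdict

def retweet_only_accounts(rows):
    """Return the sorted ids of accounts whose every post is a retweet."""
    totals = defaultdict(int)
    retweets = defaultdict(int)
    for row in rows:
        totals[row["userid"]] += 1
        if row["is_retweet"]:  # assumed to be a real bool, not the string "True"
            retweets[row["userid"]] += 1
    return sorted(u for u in totals if totals[u] == retweets[u])

# Made-up sample rows
rows = [
    {"userid": "a1", "is_retweet": True},
    {"userid": "a1", "is_retweet": True},
    {"userid": "b2", "is_retweet": True},
    {"userid": "b2", "is_retweet": False},
    {"userid": "c3", "is_retweet": False},
]

only_rt = retweet_only_accounts(rows)
print(only_rt)
```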
26. 📊 The pattern
Volume of Italian tweets by month for each account *
Two groups:
● Only active in March
● Also active in other
months
* Numbers based on version 1.1 of the dataset
27. Daily rhythm of each account
Eight accounts were more active on Tuesday between 3 p.m. and 5 p.m. and on Wednesday between 8 a.m. and 10 a.m.
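A weekly rhythm like this can be sketched by bucketing timestamps into (weekday, hour) slots. The example assumes timestamps in "YYYY-MM-DD HH:MM" format and uses them as given, with no timezone conversion; the format and the sample timestamps are assumptions.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def weekly_rhythm(timestamps):
    """Count posts per (weekday name, hour of day) slot."""
    slots = Counter()
    for ts in timestamps:
        moment = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        slots[(WEEKDAYS[moment.weekday()], moment.hour)] += 1
    return slots

# Made-up sample timestamps for one account
timestamps = [
    "2017-03-14 15:10",
    "2017-03-14 15:45",
    "2017-03-14 16:20",
    "2017-03-15 08:30",
]

rhythm = weekly_rhythm(timestamps)
for (day, hour), n in rhythm.most_common():
    print(day, f"{hour:02d}:00", n)
```

Running the same function per account and comparing the busiest slots is a simple way to spot accounts that share a schedule; the counts also map naturally onto a weekday-by-hour heatmap.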
28. Conclusion
● A sophisticated propaganda campaign took place in the
United States before the 2016 election
● There was a disinformation campaign in Germany,
especially before the federal election
● The dataset shows no Russian propaganda campaign in Italy, but there was a coordinated sub-network that shared Italian news for unknown reasons