Sune Lehmann: #socialbots and how we created the Boston Banksy Hoax

In his course, Prof. Sune Lehmann used machine learning to create socialbots, which in turn created a Boston Banksy hoax.

Transcript of "Sune Lehmann: #socialbots and how we created the Boston Banksy Hoax"

  1. the thrilling story of the #socialbots. Sune Lehmann, Associate Professor, Department of Applied Mathematics and Computer Science, Technical University of Denmark, @suneman
  2. Art by Allan Criss
  3. network structure [Artwork: C. Hidalgo]
  4. an ambitious project to map all protein–protein interactions in yeast is currently estimated to detect approximately 20% of connections. [Figure: first page of Yong-Yeol Ahn, James P. Bagrow & Sune Lehmann, “Link communities reveal multiscale complexity in networks”, doi:10.1038/nature09182, showing link communities and a link dendrogram cut at thresholds t]
  5. relapse
  6. 02805 Social graphs and interactions
  7. twitterbots!
  9. anecdote #1: you must be careful what you say
  10. [FAIL]
  11. competition #1: who can get the most Justin Bieber followers? [v2.0 bots inspired by Tim Hwang]
  12. strategy: 1. manually create a “realistic” profile (including a few tweets); 2. pick users with 50-200 followers; 3. follow 10-20 new users per day; 4. un-follow whoever doesn’t follow you back within 24 hours; 5. repeat steps 2-4
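
The follow/un-follow loop on this slide can be sketched as a toy simulation. Nothing here touches real accounts or any Twitter API; the function name, the follow-back probability, and the batch size of 15 (picked from the slide's 10-20 range) are assumptions made for illustration only.

```python
import random


def simulate_follow_strategy(days, follows_per_day=15, follow_back_prob=0.2, seed=0):
    """Toy model of steps 2-5: returns (followers, accounts_still_followed).

    follow_back_prob is a made-up parameter, not a measured rate.
    """
    rng = random.Random(seed)
    followers = 0  # users who followed back (and whom we keep following)
    pending = 0    # users followed yesterday, still inside the 24-hour window
    for _ in range(days):
        # Step 4: keep reciprocated follows, un-follow everyone else.
        followers += sum(rng.random() < follow_back_prob for _ in range(pending))
        # Steps 2-3: follow a fresh batch of users (assumed to have 50-200 followers).
        pending = follows_per_day
    return followers, followers + pending


if __name__ == "__main__":
    followers, following = simulate_follow_strategy(days=30)
    print(f"followers after 30 days: {followers}, still following: {following}")
```

Because un-reciprocated follows are dropped every day, the number of accounts followed stays close to the number of followers, which is part of what makes the profile look plausible.
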
  13. some early insights
  14. strategy: 1. manually create a “realistic” profile (including a few tweets); 2. pick users with 50-200 followers; 3. follow 10-20 new users per day; 4. un-follow whoever doesn’t follow you back within 24 hours; 5. repeat steps 2-4. 100 [initial goal was approx 50 followers per bot]
  15. bieber has a lot of (dark matter) spam followers!
  16. anecdote #2: an accidental turing test
  17. Noise. [Paper excerpt] Twitter mood predicts the stock market. Johan Bollen*, Huina Mao*, Xiao-Jun Zeng. 14 Oct 2010. (*: authors made equal contributions.) Abstract: Behavioral economics tells us that emotions can profoundly affect individual behavior and decision-making. Does this also apply to societies at large, i.e. can societies experience mood states that affect their collective decision making? By extension is the public mood correlated or even predictive of economic indicators? Here we investigate whether measurements of collective mood states derived from large-scale Twitter feeds are correlated to the value of the Dow Jones Industrial Average (DJIA) over time. We analyze the text content of daily Twitter feeds by two mood tracking tools, namely OpinionFinder that measures positive vs. negative mood and Google-Profile of Mood States (GPOMS) that measures mood in terms of 6 dimensions (Calm, Alert, Sure, Vital, Kind, and Happy). We cross-validate the resulting mood time series by comparing their ability to detect the public’s response to the presidential election and Thanksgiving day in 2008. A Granger causality analysis and a Self-Organizing Fuzzy Neural Network are then used to investigate the hypothesis that public mood states, as measured by the OpinionFinder and GPOMS mood time series, are predictive of changes in DJIA closing values. Our results indicate that the accuracy of DJIA predictions can be significantly improved by [...] sentiment from blogs. In addition, Google search queries have been shown to provide early indicators of disease infection rates and consumer spending [14]. [9] investigates the relations between breaking financial news and stock price changes. Most recently [13] provide a ground-breaking demonstration of how public sentiment related to movies, as expressed on Twitter, can actually predict box office receipts.
Although news most certainly influences stock market prices, public mood states or sentiment may play an equally important role. We know from psychological research that emotions, in addition to information, play a significant role in human decision-making [16], [18], [39]. Behavioral finance has provided further proof that financial decisions are significantly driven by emotion and mood [19]. It is therefore reasonable to assume that the public mood and sentiment can drive stock market values as much as news. This is supported by recent research by [10] who extract an indicator of public anxiety from LiveJournal posts and investigate whether its [...]
  18. a robot education • recognize bots • use ML to recognize “good” content to tweet and retweet (using features, sentiment, etc.) • detect topics in your tweet stream • analyze the twitter network to find communities, influentials, etc.
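
As one possible illustration of the “recognize bots” bullet, here is a minimal nearest-centroid classifier. The slides do not say which model or features the course used; the two features (follower-to-following ratio, fraction of tweets containing links) and every data point below are invented for this sketch.

```python
from math import dist


def train_centroids(samples):
    """samples: list of (feature_tuple, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def classify(centroids, features):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical toy data: (followers/following ratio, fraction of tweets with links).
TOY_ACCOUNTS = [
    ((1.2, 0.1), "human"), ((0.8, 0.2), "human"), ((1.0, 0.3), "human"),
    ((0.1, 0.9), "bot"), ((0.2, 0.8), "bot"), ((0.3, 0.7), "bot"),
]

if __name__ == "__main__":
    centroids = train_centroids(TOY_ACCOUNTS)
    print(classify(centroids, (0.9, 0.25)))
    print(classify(centroids, (0.15, 0.85)))
```

A real classifier would need labelled accounts and richer features; the point is only that once bots are flagged, the remaining effort can focus on humans.
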
  19. social influence?
  20. trends are local, so bots must be Bostonian; little is known about the exact form of the trending algorithm (twitter secret sauce); we know that there’s a focus on novelty, and that sustained growth + retweets + faves might matter
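
Since the real trending algorithm is unknown, the following is only a guess at how “novelty” and “sustained growth” could be scored from hourly hashtag counts. Both functions, the window size, and the smoothing constant are assumptions; none of this is Twitter's method.

```python
def novelty_score(hourly_counts, window=3):
    """Mean activity in the last `window` hours relative to the earlier baseline.

    The +1 in the denominator is an arbitrary smoothing term so that a
    brand-new hashtag (baseline of zero) does not divide by zero.
    """
    if len(hourly_counts) <= window:
        raise ValueError("need more history than the recent window")
    recent = hourly_counts[-window:]
    baseline = hourly_counts[:-window]
    recent_rate = sum(recent) / len(recent)
    baseline_rate = sum(baseline) / len(baseline)
    return recent_rate / (baseline_rate + 1.0)


def has_sustained_growth(hourly_counts, window=3):
    """True if counts strictly increase across the last `window` hours."""
    recent = hourly_counts[-window:]
    return all(earlier < later for earlier, later in zip(recent, recent[1:]))


if __name__ == "__main__":
    steady_tag = [10] * 12             # popular but not new
    new_tag = [0] * 9 + [5, 10, 20]    # small but novel and growing
    print(novelty_score(steady_tag), has_sustained_growth(steady_tag))
    print(novelty_score(new_tag), has_sustained_growth(new_tag))
```

Under this toy scoring, a small hashtag that appears suddenly and keeps growing outranks a larger one with flat volume, which matches the slide's emphasis on novelty over raw counts.
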
  21. how?
  22. how? build convincing avatars and use high follower counts as part of your disguise
  25. how? build convincing avatars and use high follower counts as part of your disguise; use machine learning to separate bots from humans (so we can focus on humans)
  26. how? build convincing avatars and use high follower counts as part of your disguise; use machine learning to separate bots from humans (so we can focus on humans); use natural language processing and ML to find quality content & topics to tweet and re-tweet
  27. how? build convincing avatars and use high follower counts as part of your disguise; use machine learning to separate bots from humans (so we can focus on humans); use natural language processing and ML to find quality content & topics to tweet and re-tweet; use network analysis to explore communities surrounding existing followers, making sure bot actions reach entire communities
  28. use network analysis to explore communities surrounding existing followers, making sure bot actions reach entire communities. [Figure from Lilian Weng, Filippo Menczer & Yong-Yeol Ahn, “Virality Prediction and Community Structure in Social Networks”, Scientific Reports, 3:2522, www.nature.com/scientificreports] Figure 1 | The importance of community structure in the spreading of social contagions. (A) Structural trapping: dense communities with few outgoing links naturally trap information flow. (B) Social reinforcement: people who have adopted a meme (black nodes) trigger multiple exposures to others (red nodes). In the presence of high clustering, any additional adoption is likely to produce more multiple exposures than in the case of low clustering, inducing cascades of additional adoptions. (C) Homophily: people in the same community (same color nodes) are more likely to be similar and to adopt [...]
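
The “multiple exposures” idea in panel (B) can be made concrete with a small count over a graph. This sketch uses plain adjacency dictionaries; the function name and both example graphs are hypothetical and are not taken from the paper.

```python
def count_exposures(neighbors, adopters):
    """Map each non-adopter to the number of its neighbors that have adopted."""
    exposures = {}
    for node, nbrs in neighbors.items():
        if node in adopters:
            continue
        exposures[node] = sum(1 for n in nbrs if n in adopters)
    return exposures


# Dense community: every node is connected to every other node.
CLIQUE = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
}

# Sparse chain a-b-c-d: the same number of nodes, but no clustering.
CHAIN = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}

if __name__ == "__main__":
    adopters = {"a", "b"}
    print(count_exposures(CLIQUE, adopters))
    print(count_exposures(CHAIN, adopters))
```

With the same two adopters, every remaining node in the clique is exposed twice, while in the chain one node is exposed once and the other not at all. That difference is the reason for aiming bot actions at whole communities rather than scattered individuals.
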
  29. interventions: collaboration is key; two manual tweets per bot; geotag & time with Boston; favorite everything with hashtag; retweet everything with hashtag (but don’t go crazy); coordinate timing; share list of followers (yy’s result/multiple exposures)
  30. interventions: collaboration is key; two manual tweets per bot; geotag & time with Boston; favorite everything with hashtag; retweet everything with hashtag (but don’t go crazy); share list of followers (yy’s result/multiple exposures)
  32. #bostonthanks
  33. #MeInThree: red, white, and blue; coffee, coffee, coffee; math, insight, irreverence; iron & wine + white buffalo + donavon frankenreiter [intended to work via engagement, but failed]
  34. #banksyinboston
  35. [crude photoshop work]
  36. real world action; “discovery of existing art”; great investigative journalism by @abtran; work to verify rumors (similar to London riots)
  37. work to verify rumors (similar to London riots)
  38. lessons: surprisingly easy to get followers, impersonate humans using crude means; in some areas Twitter is very different from what we experience
  39. perspectives: probably no real influence (“stickiness” still central); bots can make a difference
  40. perspectives: probably no real influence (“stickiness” still central); bots can make a difference; frightening/dystopian perspectives
  41. [thanks!]