
Mining Social Media Data For Policing


Keynote slides for the Alberto Mendelzon Workshop, Cali, Colombia, May 2018

Published in: Science

Mining Social Media Data For Policing

  1. Alberto Mendelzon Workshop (AMW), 23rd May 2018. Mining Social Media Data for Policing. Presenting: Miriam Fernandez, Knowledge Media Institute. Work done in collaboration with some fantastic colleagues! @miriam_fs fernandezmiriam @miriamfs
  2. Who are we?
  3. Three lines of work presented in this talk • Detecting Grooming Behaviour on Social Media • Radicalisation Detection on Social Media • Policing Engagement via Social Media
  6. Detecting Grooming Behaviour on Social Media. Cano, E.; Fernandez, M.; and Alani, H. (2014). Detecting child grooming behaviour patterns on social media. The 6th International Conference on Social Informatics (SocInfo), Barcelona, Spain. Some of the next slides are from: https://www.slideshare.net/halani
  7. Child Grooming: premeditated behaviour intending to secure the trust of a minor as a first step towards future engagement in sexual conduct. Choo, K-K R. Responding to online child sexual grooming: an industry perspective, Trends & issues in crime and criminal justice, no. 379. July 2009
  8. Claire Lilley, Ruth Ball, Heather Vernon, The experiences of 11-16 year olds on social networking sites, NSPCC 2014: “findings show that approximately 190,000 UK children (1 in 58) will suffer contact sexual abuse by a non-related adult before turning 18, with approximately 10,000 new child victims of contact sexual abuse being reported in the UK each year.”
  9. “50% of all 11 and 12 year-olds in the UK use a social networking site, according to our research. This is because it's easy for children to access sites intended for older users.” https://www.nspcc.org.uk/preventing-abuse/keeping-children-safe/share-aware/
  10. https://www.statista.com/statistics/271348/facebook-users-in-the-united-kingdom-uk-by-age/
  11. Children’s use of mobile phones - A special report 2014. http://www.gsma.com/publicpolicy/wp-content/uploads/2012/03/GSMA_Childrens_use_of_mobile_phones_2014.pdf
  12. Online Grooming. https://www.thinkuknow.co.uk/parents/articles/Online-grooming/
  13. Signs of Online Grooming. https://www.thinkuknow.co.uk/14_plus/Need-advice/Online-grooming/
  14. Grooming in Action (~700 messages over a 5-month period). Predator: hey whats up?… Predator: I like your pic, very cute Predator: so you're in san diego? 13-yr-old-girl: not far Predator: ok, you like older guys? 13-yr-old-girl: thers nice or bad ppl all ages Predator: have some pics if you want to see Predator: do your parents look on your computer? Predator: so are you by yourself or is someone else there with you? Predator: so it should just be us, our little secret Predator: so have you ever snuck out? 13-yr-old-girl: not rlly lol Predator: yeah, what about tonight? Predator: think you could sneak out tonight? Predator: well if the wrong person found out then I'd be screwed 13-yr-old-girl: im not a teller lol Predator: I know, just wouldn't want your dad to find out Predator: if you are still up why not sneak out for a few minutes Predator: but that's the fun of it 13-yr-old-girl: fun to sneak? Predator: yes Predator: so your dad doesn't know Predator: would take a nap but I leave for bible study around 6:30 Predator: I know I'm bad, going to bible study and talking about sex with you Predator: yeah, there's nothing wrong with us being friends, we have the same lord remember ;) Predator: would take me like an hour and a half to get there Predator: see you in a little while
  15. Olson’s Luring Communication Theory (LCT). Olson, L. N., Daggs, J. L., Ellevold, B. L. and Rogers, T. K. K. (2007), Entrapping the Innocent: Toward a Theory of Child Sexual Predators’ Luring Communication. Communication Theory, 17: 231–251
  16. Grooming in Action: the same transcript as slide 14, annotated with the stage labels Approach, Grooming, Trust Development, Isolation and Physical Approach.
  17. Grooming in Action: the same annotated transcript as slide 16. Can we automatically identify these stages?
  18. Identifying Grooming Stages. Input: “think you could sneak out tonight?” Automatic Classifiers: Grooming, Trust Development, Physical Approach, Other. Outputs: Yes, No, No, No
  19. Dataset • 50 transcripts of conversations between convicted predators and volunteers who posed as minors • Conversations range from 83 to 12K lines • Each predator line manually labelled by two annotators • Annotation labels: 1) Trust development, 2) Grooming, 3) Seek physical approach, 4) Other • Labelled sentences: Trust Dev. 1225; Grooming 3304; Phys. Approach 2700; Other 3304
  20. Processing Chat Text • Challenges in processing chat-room conversations –  Use of irregular and ill-formed words –  Use of chat slang and teen-lingo –  Use of emoticons • Generated a list of over 1K terms and definitions. Chat terms: ASLP = Age, sex, location, picture; AWGTHTHTTA = Are we going to have to go through this again?; BRB = Be right back; CWOT = Complete waste of time. Emoticons: :’-( = I’m crying; o/o = High five; @_@ = I’m tired, trying to stay awake; ( ‘}{‘ ) = kiss
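
A minimal sketch of how a term list like this might be applied before feature extraction. It uses only a few of the chat terms and emoticons shown on the slide; the `normalise` helper, the lower-casing, and the whitespace tokenisation are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical lookup tables built from the examples on the slide.
CHAT_TERMS = {
    "aslp": "age, sex, location, picture",
    "brb": "be right back",
    "cwot": "complete waste of time",
}
EMOTICONS = {
    "@_@": "I'm tired, trying to stay awake",
    "o/o": "high five",
}


def normalise(message: str) -> str:
    """Expand known chat terms and emoticons; leave other tokens unchanged."""
    out = []
    for token in message.split():
        if token in EMOTICONS:
            out.append(EMOTICONS[token])
            continue
        # Separate trailing punctuation so "brb," still matches "brb".
        core = token.rstrip("?!.,")
        tail = token[len(core):]
        out.append(CHAT_TERMS.get(core.lower(), core) + tail)
    return " ".join(out)


print(normalise("brb, mom is here @_@"))
```

A real list of over 1K terms would more likely be loaded from a file, and emoticons containing spaces would need a different tokeniser.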
  21. Analysis Features and Results. Features: • N-gram: word combinations extracted from text (N=1,2,3) • Part-of-speech tagging: noun, verb, adjective, plural, etc. • Sentiment: average sentiment of terms in sentence • Length: number of words in sentence • Psycho-linguistic patterns: 62 psycho-linguistic patterns in English (swearing, sexual, agreement, etc.) (LIWC) • Semantic frames: type of event, relation, or entity in text, e.g., secrecy, desirability, emotion, kinship (SEMAPHORE). Results with all features (Precision / Recall / F1): Trust Development 79.2% / 82.3% / 80.7%; Grooming 87.6% / 88.8% / 88.2%; Phys. Approach 87.2% / 88.7% / 87.9%; average 84.7% / 86.6% / 85.6%
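
As a quick consistency check on the results for slide 21, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it from the reported precision and recall; the `f1` helper and the `reported` dictionary are names introduced here for illustration.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# label: (precision %, recall %, F1 % as reported on the slide)
reported = {
    "Trust Development": (79.2, 82.3, 80.7),
    "Grooming": (87.6, 88.8, 88.2),
    "Phys. Approach": (87.2, 88.7, 87.9),
}

for label, (p, r, slide_f1) in reported.items():
    print(f"{label}: recomputed F1 = {f1(p, r):.1f}%, slide F1 = {slide_f1}%")
```

The recomputed values agree with the slide to one decimal place.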
  22. Next Steps • Explore the development into apps • Understand how and when the alerts should be provided • What action they should enforce or suggest • How to assess vulnerability and how to inform the child • Explore the use of more features for higher accuracy
  23. Radicalisation Detection on Social Media. Fernandez, M., Asif, M. & Alani, H. Understanding the roots of radicalisation on Twitter. WebScience 2018. Saif, H., Fernandez, M., Dickinson, T., Kastler, L. & Alani, H. A Semantic Graph-based Approach for Radicalisation Detection on Social Media. ESWC 2017. Saif, H., Fernandez, M., Rowe, M. & Alani, H. On the Role of Semantics for Detecting pro-ISIS stances on social media. ISWC 2016. Rowe, M. & Saif, H. Mining Pro-ISIS Radicalisation Signals from Social Media Users. ICWSM 2016. Some of the next slides are from: https://www.slideshare.net/Staano/
  24. Online Radicalisation • The process by which individuals are introduced to ideological messages and belief systems that encourage movement from mainstream beliefs toward extreme views, primarily through the use of online media [International Assoc of Chiefs of Police and United States of America]
  25. Islamic State in Iraq and Syria (ISIS): Social Media Propaganda & Recruiting
  26. ISIS on Social Media
  28. Mining Pro-ISIS Radicalisation Signals from Social Media Users
  29. Research Questions and Objectives • RQ1: How can we detect when a user has adopted a pro-ISIS stance? • RQ2: What happens to Twitter users before and after they exhibit radicalised behaviour? • RQ3: What influences users to adopt pro-ISIS language?
  30. Data Collection and Analysis. Figure from O’Callaghan et al. 2014: Syrian account network (652 nodes, 3,260 edges). Four major categories: Jihadist (gold, right), Kurdish (red, top), Pro-Assad (purple, left), and Secular/Moderate opposition (blue, center). Black nodes are members of multiple communities. Visualization was performed with the OpenOrd layout in Gephi. Data: 625 Users; 2.4M Users; 154K EU Users; 104M Tweets (English 43%, Arabic 41%, Others 16%)
  31. Identifying Signals of Radicalisation: Lexicon- and Network-based Approach • H1: Sharing Incitement Material (known suspended ISIS accounts: 727) • H2: Using Extremist Language (Radicalization Lexicon, e.g. دولة الخلافة, ISIS, Shirk, Caliphate, Islamic State, ارهاب)
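
A minimal sketch of how the two hypotheses might be turned into per-user activation dates: H1 fires on the first tweet that shares content from a known pro-ISIS account, H2 on the first tweet that uses a lexicon term. The tweet dictionary layout, the `retweet_of` field (treating a retweet as "sharing"), and the placeholder lexicon and account handle are assumptions made for illustration, not the authors' implementation.

```python
from datetime import date

LEXICON = {"caliphate", "shirk"}          # placeholder terms, not the real lexicon
KNOWN_ACCOUNTS = {"suspended_account_1"}  # placeholder handle, not a real account


def activation_dates(tweets):
    """Return the earliest (H1, H2) activation dates for one user's tweets.

    Each tweet is a dict with 'date', 'text' and 'retweet_of' (a handle or None).
    A value of None means the user was never activated under that hypothesis.
    """
    h1 = h2 = None
    for tweet in sorted(tweets, key=lambda t: t["date"]):
        if h1 is None and tweet["retweet_of"] in KNOWN_ACCOUNTS:
            h1 = tweet["date"]
        words = {w.strip(".,!?#").lower() for w in tweet["text"].split()}
        if h2 is None and words & LEXICON:
            h2 = tweet["date"]
    return h1, h2


timeline = [
    {"date": date(2014, 6, 1), "text": "news about the caliphate", "retweet_of": None},
    {"date": date(2014, 9, 3), "text": "shared post", "retweet_of": "suspended_account_1"},
]
print(activation_dates(timeline))
```

In this toy timeline the H2 date precedes the H1 date, which mirrors the pattern the talk reports for the majority of users.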
  32. Activation Points (RQ1) • The increase in users activated between May 2014 and November 2014 coincides with the execution of 6 hostages by ISIS and the videos of these executions posted via social media • The majority of users post pro-ISIS terms before sharing content from pro-ISIS accounts
      Table 2: Significant events involving ISIS/ISIL and the West.
      Date        Description
      08-04-2013  ISIS expand into Syria
      04-01-2014  Fallujah captured by ISIS
      15-01-2014  ISIL retake Ar-Raqqah
      01-05-2014  ISIS carry out public executions in Ar-Raqqah
      09-06-2014  Mosul falls under ISIS control
      02-09-2014  Hostage Steven Sotloff executed
      13-09-2014  Hostage David Haines executed
      22-09-2014  Hostage Samira Salih al-Nuaimi executed
      03-10-2014  Hostage Alan Henning executed
      07-10-2014  Abu Bakr al-Baghdadi injured in US air strike
      16-10-2014  Hostage Peter Kassig executed
      14-01-2015  Christopher Lee Cornell arrested for bomb plot
      25-01-2015  Hostage Haruna Yukawa executed
      31-01-2015  Hostage Kenji Goto executed
      06-02-2015  Hostage Kayla Mueller killed in air strike
      26-02-2015  Jihadi John is identified as Mohammed Emwazi
      18-03-2015  ISIS responsible for Tunisia museum attack
      15-05-2015  Abu Sayyaf killed by US special forces
      30-06-2015  Alaa Saadeh arrested for attempts to aid ISIS
      11-07-2015  Maher Meshaal killed in coalition air strike
      Excerpt shown on the slide: Figure 2(a) and figure 2(b) show the number of users who are activated on each day according to each hypothesis. We note that the span of activations of H1 users is shorter than H2 users, as the former requires sharing content from banned or pro-ISIS accounts, while the latter looks at the use of pro-ISIS terms. One thing that is immediately apparent from the plots is that there is a large surge in activity from May 2014 onwards, for both H1 and H2 activations. To investigate why this surge occurs, we identified a series of key events related to ISIS/ISIL from 2013 onwards; these are shown in Table 2. As noted, the increase in activations between May 2014 and November 2014 coincides with execution of 6 hostages by ISIS and the videos of these executions posted via social media. Although we cannot discern causation (of activation) from correlation here, there does […]
  33. Behaviour Before/After Activation (RQ2) • Users exhibit a large divergence in their language once activated –  Before activation the majority of topics users discuss focus on politics, where words like Syria, Israel and Egypt are mentioned in a negative context and with high frequency –  After activation religious words (e.g. Allah, muslims, quran) become more popular. Figure panels: Pre-Activation, Activation, Post-Activation
  34. Influencing Pro-ISIS Term Adoption (RQ3) • We study the effect of –  Lexical Homophily: similarity in language –  Sharing Homophily: diffusion of information from the same accounts –  Interaction Homophily: common communications • Social dynamics play a strong role in term uptake. Subcommunities act as bridges between radicalised users and future adopters. Figure labels: pro-ISIS User, Potential Adopter
  35. Automatic detection of pro-ISIS stances on social media
  36. Radicalisation Detection Background • Lexicon-based Approaches (Radicalization Lexicon, e.g. دولة الخلافة, ISIS, Shirk, Caliphate, Islamic State, ارهاب) • Machine Learning Approaches (stance label per tweet). Example tweets: “Gonna kidnap journalists and cut their heads off”; “ISIS isn’t evil, it’s made up of people doing what they think is best for their community”; “The brothers from Charlie Hebdo attack did their part. It’s time for brothers in the UK to do their part”
  37. Semantic Graph-based Approach for Pro-ISIS Stance Detection • Extract and use the semantic interdependencies and relations between words to learn patterns of radicalisation. Pipeline of detecting pro-ISIS stances using semantic sub-graph mining-based feature extraction: Tweets, Conceptual Semantics Extraction (DBpedia), Semantic Graph Representation, Frequent Semantic Subgraph Mining, Classifier Training. Example: Entities: ISIS, Syria; Concepts: Jihadist Group, Country; Semantic Relations: (Military Intervention Against ISIL, place, Syria)
  38. Semantic Graph-based Approach for Pro-ISIS Stance Detection
      Step 1. Conceptual Semantic Extraction. Training data: 566 pro-ISIS users / 566 anti-ISIS users (extracted using lexicons). Entity extraction and semantics mapping: Syria -> Country, ISIL -> Jihadist Group.
      Table 1 (cropped on the slide; pro-ISIS column only): total number and top 10 frequent entities and their associated semantic concepts. No. of unique entities: 32,406. No. of unique concepts: 35. Top 10 frequent entities and their concepts: MSNBC (Company), Iraq (Country), Allah (Person), America (Continent), Muslim (Person), Officer (JobTitle), Wounds (HealthCondition), Syria (Country), WAPO (PrintMedia), Israel (Country).
      Step 2. Semantic Graph Representation (DBpedia). Excerpt shown on the slide (right edge cropped): the approach takes as input a source entity e_s, a target entity e_t and an integer value K that bounds the length of the relations between the two named entities, and outputs SPARQL queries that retrieve paths of length at most K. To extract all the paths, all the combinations of edge directions must be considered. For example, to find paths connecting e_s = Syria and e_t = ISIL, the approach considers the following SPARQL queries:
      SELECT * WHERE {:Syria ?p1 :ISIL}
      SELECT * WHERE {:ISIL ?p1 :Syria}
      SELECT * WHERE {:Syria ?p1 ?n1. ?n1 ?p2 :ISIL}
      SELECT * WHERE {:Syria ?p1 ?n1. :ISIL ?p2 ?n1}
      SELECT * WHERE {?n1 ?p1 :Syria. :ISIL ?p2 ?n1}
      SELECT * WHERE {?n1 ?p1 :Syria. ?n1 ?p2 :ISIL}
      The first two queries consider paths of length 1; since a path may exist in two directions, two queries are required. Retrieving paths of length 2 requires 4 queries. In general, retrieving paths of length k requires 2^k queries. Figure 2 shows an example of the semantic relations between the entities Syria and ISIL: a direct relation (e.g., ISIL <headquarters> Syria) and a longer path (e.g., ISIL <ideology> Pan Islam <ideology> Musl[…]).
      Step 3. Sub-graph Mining: CloseGraph Method (Yan and Han 2003)
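
The query enumeration described in Step 2 can be sketched as follows: for a path of length k there are k edges, each of which may point in either direction, giving 2^k graph patterns. The function names `path_queries` and `all_path_queries` are introduced here for illustration, and the sketch only builds query strings in the shape shown on the slide; it does not contact DBpedia and is not the authors' code.

```python
from itertools import product


def path_queries(source: str, target: str, length: int) -> list[str]:
    """Build the 2**length SPARQL queries for paths of exactly `length` edges."""
    # Intermediate nodes are variables ?n1 .. ?n(length-1).
    nodes = [source] + [f"?n{i}" for i in range(1, length)] + [target]
    queries = []
    for directions in product((True, False), repeat=length):
        triples = []
        for i, forward in enumerate(directions):
            # A backward edge swaps subject and object of the triple pattern.
            subj, obj = (nodes[i], nodes[i + 1]) if forward else (nodes[i + 1], nodes[i])
            triples.append(f"{subj} ?p{i + 1} {obj}")
        queries.append("SELECT * WHERE {" + ". ".join(triples) + "}")
    return queries


def all_path_queries(source: str, target: str, max_length: int) -> list[str]:
    """All queries for paths of length 1 up to max_length."""
    return [q for k in range(1, max_length + 1) for q in path_queries(source, target, k)]


for query in all_path_queries(":Syria", ":ISIL", 2):
    print(query)
```

With a maximum length of 2 this yields the same six patterns listed on the slide, although not in the same order.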
  39. Evaluation & Results • Baseline for comparison: SVM classifiers trained from Unigrams, Topic, Sentiment and Network feature sets • 10-fold cross validation over 30 runs
      Excerpt shown on the slide: Results in all experiments are computed using 10-fold cross validation over 10 runs of different random splits of the data to test their significance. Statistical significance is done using Wilcoxon signed-rank test [16]. Note that all the results in average Precision, Recall and F1-measure reported in this section are statistically significant with ρ < 0.001. Table 3 shows the results of our binary stance classification (pro-ISIS vs. anti-ISIS) using Unigrams, Sentiment, Topic, and Semantic features after feature selection, applied over the 1,132 users in our dataset. The table reports three sets of precision (P), recall (R), and F1-measure (F1), one for anti-ISIS stance identification, one for pro-ISIS stance identification, and the third shows the averages of the two. The table also reports the total number of features used for classification under each feature set.
      Table 3: Classification performance of the five feature sets with IG feature selection. Results in average P, R and F1 are statistically significant with ρ < 0.001.
      Feature set  No. of features  anti-ISIS P / R / F1   pro-ISIS P / R / F1    Average P / R / F1
      UNIGRAMS     41,200           0.814 / 0.919 / 0.863  0.907 / 0.79 / 0.844   0.86 / 0.854 / 0.854
      SENTIMENT    41,362           0.814 / 0.919 / 0.863  0.907 / 0.79 / 0.844   0.86 / 0.854 / 0.854
      TOPICS       992              0.771 / 0.943 / 0.848  0.927 / 0.719 / 0.81   0.849 / 0.831 / 0.829
      NETWORK      25,532           0.897 / 0.827 / 0.86   0.839 / 0.905 / 0.871  0.868 / 0.866 / 0.866
      SEMANTICS    8,798            0.994 / 0.852 / 0.917  0.87 / 0.995 / 0.928   0.932 / 0.923 / 0.923
      According to the results presented in Table 3, the proposed Semantic features outperform the 4 baseline feature sets in all average measures by a large margin. In particular, classifiers trained from Semantic features produce 7.8% higher Recall, 7.7% higher precision, and 7.82% higher F1 than all baselines on average. Network features come next, followed by Unigrams features, with approximately 87% and 85% in average F1 respectively. On the other hand, Topic features produce the lowest classification […]
      Bar chart: F1 (%) per feature set (Unigrams, Sentiment, Topics, Network, Semantics) for anti-ISIS and pro-ISIS, matching the F1 columns of Table 3.
      Exploration of semantic sub-graphs • pro-ISIS users tend to discuss religion, historical events and ethnicity • anti-ISIS users focus more on politics, geographical locations and interventions against ISIS
  40. Automatic Detection of pro-ISIS stances when pro-ISIS and “general” users both use radical terminology
  41. Research Questions • Problem –  Existing methods to automatically identify radical content online mainly rely on the use of glossaries (i.e., lists of terms and expressions associated with religion, war, offensive language, etc.) –  These methods are not always effective, and we continue to observe that many who use radicalisation terminology in their tweets are simply reporting current events or sharing harmless religious rhetoric • Research question –  Are there significant variances between the semantic contexts of radicalisation terminology when this terminology is used to convey “radicalised” meaning vs. when it is not?
  42. Contextual Divergence in the use of Radical Terminology • 17K tweets from pro-ISIS users • 97K tweets from “general” users using the same terminology • Radicalisation Lexicon: 556 terms
  43. Results • Contextual divergence exists • The most discriminative contextual dimension among categories, topics, entities and types is entities
  44. Understanding the Roots of Radicalisation on Twitter
  45. Social Science vs. Computer Science. Social Science: • What are the factors that drive people to get radicalised? (e.g., failed integration, poverty, discrimination) • What are the roots of radicalisation? (micro-level, meso-level, macro-level) • How does the radicalisation process happen and evolve, i.e., what are its different stages? (e.g., pre-radicalisation, self-identification, indoctrination, Jihadisation). Computer Science: • Analysis (how do ISIS members use Twitter to radicalise and recruit other users?) • Detection (can we create methods for the automatic detection of radical content and radicalised users?) • Prediction (can we predict whether someone will interact with radical content or users? Can we predict whether someone will become radicalised?)
  46. Roots of Radicalisation • Micro or individual roots • Meso or group roots • Macro or global roots • RADICALISATION INFLUENCE
  47. Calculating Radicalisation Influence: 112 pro-ISIS Twitter users vs. 112 “general” users
  48. Challenges
  49. Challenges of researching online radicalisation • Lack of gold-standard datasets –  Existing datasets are rarely verified by experts –  Annotating this data requires religious, cultural and political knowledge • Data Collection –  Once accounts are closed it is not possible to access the data –  Data is not commonly shared among researchers (sensitive data) • Data Analysis –  Need to be *extremely careful* with false positives –  Dynamics (changes in terminology, procedures, etc.) • ETHICS!
  50. https://www.youtube.com/watch?v=lG7DGMgfOb8 (2002 movie)
  51. Policing Engagement via Social Media. Miriam Fernandez, Tom Dickinson, and Harith Alani. "An analysis of UK policing engagement via social media." International Conference on Social Informatics. Springer International Publishing, 2017. Miriam Fernandez, A. Elizabeth Cano, and Harith Alani. "Policing engagement via social media." International Conference on Social Informatics. Springer International Publishing, 2014.
  52. Policing Engagement via Social Media • Policing organisations use social media to spread the word on crime, severe weather, missing people, … • Many forces have staff dedicated to this purpose and to improving the spread of key messages to wider social media communities • Research shows that exchanges between police and citizens are infrequent
  53. Goal • Understand what attracts citizens to social media policing content –  What are the characteristics of the content that generates higher attention levels? • Writing style • Time of posting • Topics –  Help police forces to identify actions and recommendations to increase public engagement
  54. Context: UK Policing • Corporate accounts • Non-corporate accounts
  55. Understanding Engagement • Social media engagement has been studied –  Through multiple lenses (marketing, social sciences, computer science) –  In multiple scenarios (product selling, elections, campaigns, etc.) • Study the literature of social media engagement –  [Ariely] Very clear message with a very concrete action • Patrol, missing persons, incidents, emergencies, local authorities? What can/should I do? –  [Vaynerchuk] Need to differentiate each social medium (context) • What happens in the world? To whom is the message targeted? • Study the literature of social media police engagement –  Works mainly focus on studying the different social media strategies that police forces use to interact with the public • [Denef] UK Riots 2011. Instrumental vs. expressive approach
  56. Barriers of Social Media Police Engagement (I) • Legitimacy: the police need the trust and confidence of the communities they serve
  57. Barriers of Social Media Police Engagement (II) • Reputation • Official communication channels (911) • Surveillance • Variety of topics • Budget
  58. Approach (I) • Data Collection –  154,679 posts from 48 corporate Twitter accounts –  1,300,070 posts from 2,450 non-corporate Twitter accounts –  January 2017 • Engagement Indicators –  Retweets • % of tweets retweeted • Average number of retweets per tweet –  Favourites (likes) • % of tweets favourited (liked) • Average number of likes per tweet –  Replies • At the time of analysis the Twitter API did not allow collecting replies per tweet
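
The four engagement indicators listed on slide 58 can be computed directly from per-tweet counts. This is a minimal sketch: the `engagement_indicators` function, the dictionary keys, and the sample data are illustrative assumptions, not the authors' code or data.

```python
def engagement_indicators(tweets):
    """Compute retweet and like indicators.

    `tweets` is a non-empty list of dicts with integer 'retweets' and 'likes'.
    """
    n = len(tweets)
    if n == 0:
        raise ValueError("at least one tweet is required")
    return {
        "pct_retweeted": 100 * sum(t["retweets"] > 0 for t in tweets) / n,
        "avg_retweets": sum(t["retweets"] for t in tweets) / n,
        "pct_liked": 100 * sum(t["likes"] > 0 for t in tweets) / n,
        "avg_likes": sum(t["likes"] for t in tweets) / n,
    }


# Made-up sample counts, for illustration only.
sample = [
    {"retweets": 0, "likes": 1},
    {"retweets": 4, "likes": 0},
    {"retweets": 2, "likes": 3},
    {"retweets": 0, "likes": 0},
]
print(engagement_indicators(sample))
```

Reporting both the percentage and the average separates "how often is anything engaged with" from "how much engagement a typical tweet receives", which a single mean would blur.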
  59. Approach (II) • Feature Extractors –  Describe tweets in terms of their characteristics –  Content Features • Length / Readability / Informativeness / Complexity / Sentiment • Media / mentions / hashtags / URLs • Time of day –  User Features • Network: In-degree / out-degree • Activity: Post count / post rate / age in the system –  Semantic Features • Use knowledge bases to extract entities and concepts –  Persons / Organisations / Locations • Using feature selection / regression to determine the characteristic “patterns” of those tweets receiving higher engagement levels
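
A minimal sketch of a content feature extractor covering a few of the surface features named on slide 59 (length, mentions, hashtags, URLs). The `content_features` function and its regular expressions are simplifying assumptions introduced here; readability, informativeness, complexity and sentiment would need dedicated measures that the slides do not specify.

```python
import re


def content_features(text: str) -> dict:
    """Count simple surface features of a tweet."""
    return {
        "length_words": len(text.split()),
        "mentions": len(re.findall(r"@\w+", text)),
        "hashtags": len(re.findall(r"#\w+", text)),
        "urls": len(re.findall(r"https?://\S+", text)),
    }


# Made-up example tweet with a placeholder handle and URL.
example = "Missing person in #Leeds, please RT @example_police https://example.org/x"
print(content_features(example))
```

Feature dictionaries like this one could then be fed to a feature selection or regression step against the engagement indicators.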
  60. Results (I) • Tweets receiving higher engagement are: –  Longer, easier to read, more informative, of lower complexity (avoid complex terms), and include media items (images, videos) –  In terms of user features, they tend to be posted by accounts with a high number of followers (corporate) or with a high post rate and a high in-out degree ratio (non-corporate). Plots: neg vs. pos for length, readability, informativeness and polarity
  61. Results (II) • Tweets receiving higher engagement talk about –  Weather / roads and infrastructure / events / missing persons –  Raising awareness (domestic abuse, hate crime, modern slavery) –  Tend to mention locations • Tweets receiving lower engagement talk about –  Crime updates, such as burglary, assault or driving under the influence of alcohol –  Following requests (#ff) –  Advice on staying safe
  62. Results (III) • Non-corporate accounts generate higher engagement on average –  Offer help, ask for help, advise on local issues, reassure safety, etc. (#wearehereforyou) • Additional ingredients –  They engage more closely with the communities (direct messages and mentions to citizens) –  They are fun!
  63. Engagement Guidelines • Focus –  Consider the key goal to achieve / the audience to engage (general public, local communities, teenagers) & provide a clear message with a concrete set of actions associated with it • Be clear –  Complex messages with police jargon are difficult to understand. Messages should be simple, informative and useful. Use images/videos and humour to enhance dissemination • Interact –  Engage with the communities rather than only broadcast. Identify highly engaging police staff members and community leaders and involve them • Stay active –  Engagement is a long-term commitment. Accounts active for a longer time receive higher engagement • Be respectful –  Reputation and legitimacy are extremely important. Post polite, safe and respectful content
  64. Questions?
