2012 KAIST-Southampton Workshop Group2
Presented in KAIST-Southampton Workshop 2012.

  1. KAIST and Soton Research Project: Malicious Usage of Twitter
     30th April – 4th May 2012
     Group 2 Members:
     Korea Advanced Institute of Science and Technology: Hamyung Park, Hwon Ihm, Jungin Lee, Kiun Choi and Kyoungwon Lee.
     University of Southampton: Huw Fryer, Jaymie Caplen and Laura German.
  2. Research Objectives
     • Examine malicious Twitter usage;
     • Review background literature;
     • Analyse a sample of Twitter data collected during one week in November 2011;
     • Classify these malicious uses and extract examples;
     • Compare types of malicious usage in South Korea and the UK;
     • Conduct a comparative legal case study on an area of malicious use – defamation law.
  3. Overview: Group 2's Outcomes
     • Identified malicious use metrics on Twitter;
     • Used a sample of Twitter data – all the tweets produced during one week in November 2011;
     • Tested metrics using regular expressions;
     • Outlined the legal framework of malicious use on Twitter;
     • Conducted a legal case study to show the limitations of malicious use identification;
     • Highlighted the differences in the legal framework and malicious use in South Korea and the UK;
     • Recognised the legal, technical and social opportunities for future work.
  4. What is defined as malicious use on Twitter?
  5. Twitter Rules I
     The 'Twitter Rules' classify the content boundaries and use of Twitter within these categories:
     1) Impersonation;
     2) Trade Mark;
     3) Privacy;
     4) Violence and Threats;
     5) Copyright;
     6) Unlawful Use;
     7) Misuse of Twitter Badges.
     Five of these areas have been examined this week.
     The Twitter Rules, 2012. http://support.twitter.com/articles/18311-the-twitter-rules#
  6. The Twitter Rules II
     "Technical abuse and user abuse is not tolerated on Twitter.com, and will result in permanent suspension."
     These rules categorise 'spam and abuse':
     1) Serial Accounts;
     2) Username Squatting;
     3) Invitation Spam;
     4) Selling User Names;
     5) Malware/Phishing;
     6) Spam;
     7) Pornography.
     Three of these areas have been examined this week.
  7. Measured Metrics

     Metric(s) from Twitter Rules          | Potential Area of UK Law                        | Measurable Metrics
     Impersonation                         | Identity Theft                                  | Same name as a verified account.
     Trade Mark and Copyright Infringement | Intellectual Property Law                       | Same brand as a verified account.
     Violence and Threats                  | Criminal Law                                    | Use of violent or threatening terms in tweets.
     Unlawful Use                          | Defamation Law/Other                            | Created a regular expression for privacy and defamation.
     Spam                                  | Communications Act 2003                         | -
     Malware and Phishing                  | Fraud                                           | Used Google Safe Browsing API; found a malicious URL.
     Privacy                               | Breach of Confidence (Art 8 – Right to Privacy) | Created a regular expression for privacy and defamation.
     Pornography                           | Criminal Law                                    | Pornographic image detection software.
  8. How can malicious usage be classified on Twitter?
     What are the differences between malicious usage in South Korea and the UK?
  9. Example of Previous Research
     Study: Benevenuto and others, 'Detecting Spammers on Twitter'.
     Purpose of study: "Conducted a study about the characteristics of tweet content and user behaviour on Twitter, aiming at understanding their relative discriminative power to distinguish spammers and non-spammers."
     Dataset: Crawled a near-complete dataset from Twitter: 54 million users; 1.9 billion links; almost 1.8 billion tweets. (8% of accounts are private – these were ignored.)
     Methodology: Manually labelled Twitter users as spammers and non-spammers. Focused on three trending topics at the time (to target aggressive spammers who utilise trending topics): 1) #michaeljackson and #mj, 2) #susanboyle and 3) #musicmonday. Obtained all tweets and unique users that had tweeted these trending topics. Used 39 content and 23 behaviour attributes, including the number of tweets containing 'spam words'. Applied a supervised machine learning method to identify spammers.
     Findings: This approach can directly identify the majority of spammers – 70%. It misclassifies 3.6% of non-spammers.
     Fabricio Benevenuto, Gabriel Magno, Tiago Rodrigues, and Virgilio Almeida. Detecting Spammers on Twitter. In CEAS, 2010. p. 1. http://ceas.cc/2010/papers/Paper%2021.pdf
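The labelled-data approach described above can be sketched in miniature. This is an illustrative toy, not the paper's method: the single attribute (fraction of an account's tweets containing a spam word), the word list, and the midpoint threshold rule are all stand-ins for the 39 content and 23 behaviour attributes and the supervised classifier used by Benevenuto and others.

```python
# Toy sketch of supervised spammer detection: label accounts, extract a
# content attribute, learn a threshold separating the two classes.
SPAM_WORDS = {"free", "win", "click"}  # illustrative stand-in word list


def spam_word_fraction(tweets):
    """Fraction of an account's tweets containing at least one spam word."""
    hits = sum(any(w in t.lower() for w in SPAM_WORDS) for t in tweets)
    return hits / len(tweets)


def fit_threshold(labelled_accounts):
    """labelled_accounts: list of (tweets, is_spammer) pairs.
    Learn the midpoint between the mean attribute value of each class."""
    spam = [spam_word_fraction(t) for t, s in labelled_accounts if s]
    ham = [spam_word_fraction(t) for t, s in labelled_accounts if not s]
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2


def predict(tweets, threshold):
    """Classify an unseen account as spammer (True) or not (False)."""
    return spam_word_fraction(tweets) > threshold
```

A real system would replace the midpoint rule with the kind of supervised learner the paper evaluates, trained on all attributes at once.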
  10. Pattern Matching: Regex Examples
     UK phone numbers:
     \(*(0|[+]*?44)[12789][0-9]{1,3}\)*[ -]*([0-9]{4} [0-9]{4}|[0-9]{6,7})
     Korean (non-commercial) phone numbers:
     \(?(02|[0-9]{3})\)?(-|\s*)[0-9]{3,4}(\s|-|\.)[0-9]{4}
     Credit/debit card followed by expiry or CVV:
     ([0-9]{4} *?){4}( [0-9]{2}/[0-9]{2})*( [0-9]{3})*
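These patterns can be applied to tweet text with a standard regex engine. A minimal sketch in Python, assuming the backslash-escaped forms of the patterns above (backslashes are commonly lost when slides are transcribed, so the exact escaping is a reconstruction); the function name and category labels are illustrative:

```python
import re

# Reconstructed patterns from the slide (escaping is an assumption).
UK_PHONE = re.compile(r"\(*(0|[+]*?44)[12789][0-9]{1,3}\)*[ -]*([0-9]{4} [0-9]{4}|[0-9]{6,7})")
KR_PHONE = re.compile(r"\(?(02|[0-9]{3})\)?(-|\s*)[0-9]{3,4}(\s|-|\.)[0-9]{4}")
CARD = re.compile(r"([0-9]{4} *?){4}( [0-9]{2}/[0-9]{2})*( [0-9]{3})*")


def find_sensitive(tweet):
    """Return which pattern categories match a tweet's text."""
    hits = []
    if UK_PHONE.search(tweet):
        hits.append("uk_phone")
    if KR_PHONE.search(tweet):
        hits.append("kr_phone")
    if CARD.search(tweet):
        hits.append("card")
    return hits
```

Note that loose patterns like these trade precision for recall: the Korean pattern, for instance, can also match digit runs inside UK numbers, which is one source of the false positives discussed later in the deck.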
  11. Malicious URL Detection
     This user tweeted a malicious URL, which was identified by the Google Safe Browsing API (Group 2 result):
     #PRjobs - ***STOP PRESS*** PR Junior http://t.co/PQAJNUgu
     http://www.vox-pop.co.uk/2011/10/18/pr-junior-for-inter-relations-agency-closing-date-11-nov-11/
     Grier and others also use three spam blacklists – Google Safe Browsing API, URIBL and jwSpamSpy – to detect malicious URLs on Twitter.
     Chris Grier, Kurt Thomas, Vern Paxson, and Michael Zhang. @spam: The Underground on 140 Characters or Less. In CCS, 2010. p. 29. http://delivery.acm.org/10.1145/1870000/1866311/p27-grier.pdf
     https://developers.google.com/safe-browsing/
     http://uribl.com/
     http://www.joewein.net/
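The blacklist-lookup step can be illustrated without a live service. This is a simplified local sketch, not the project's code: the group queried the Google Safe Browsing API over the network, whereas here a plain set of known-bad hosts (a hypothetical `BLACKLIST`) stands in for the remote lookup.

```python
import re

# Hypothetical stand-in for a remote URL blacklist such as Safe Browsing.
BLACKLIST = {"malicious.example.com"}

URL_RE = re.compile(r"https?://[^\s]+")


def host_of(url):
    """Extract the host part of a URL: strip the scheme, then take
    everything before the first '/' or ':'."""
    rest = url.split("://", 1)[1]
    return re.split(r"[/:]", rest, maxsplit=1)[0].lower()


def malicious_urls(tweet):
    """Return the URLs in a tweet whose host appears on the blacklist."""
    return [u for u in URL_RE.findall(tweet) if host_of(u) in BLACKLIST]
```

A production version would also need to resolve shortened links (e.g. t.co redirects) before the lookup, since the blacklisted host is usually the final landing page rather than the shortener.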
  12. Pornography Detection
     • With computer vision
     • Skin detection
       ▫ Colour, texture
     • Classification
       ▫ LDA-SVM
     • Real-time blur/mosaic
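The colour-based skin detection step mentioned above is often bootstrapped with a simple per-pixel rule. The slide does not specify which rule the group used; the following is one widely cited RGB heuristic (attributed to Peer and others), shown here purely as an illustration of the idea:

```python
def is_skin_rgb(r, g, b):
    """Classify an RGB pixel as skin using a classic rule-based heuristic:
    skin is reddish, moderately saturated, and red-dominant."""
    return (r > 95 and g > 40 and b > 20 and           # bright enough
            max(r, g, b) - min(r, g, b) > 15 and       # not greyscale
            abs(r - g) > 15 and r > g and r > b)       # red-dominant
```

In a full pipeline, the fraction and texture of skin-classified regions would feed the LDA-SVM classifier named on the slide; per-pixel colour alone produces many false positives (faces, sand, wood).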
  13. Legal Aspect: Defamatory Liability on Twitter
     Comparative: Korean and UK Defamation Law
  14. The Need for an Interdisciplinary Approach
     • A purely technical attempt to address the issue of malicious use of Twitter will be difficult to implement.
     • This is for a number of reasons:
       ▫ Differing legal positions in countries;
       ▫ Differing cultural concepts of what is deemed acceptable;
       ▫ Absence of any 'human' context risks high numbers of false-positive results.
     • Future work will need to take account of this.
  15. Example: Defamation
     • Defamation is the communication of an assertion about a party that may cast them in a negative light.

     Country        | Law Type                | Defences            | Punishment
     United Kingdom | Civil Law               | Truth; Fair Comment | Damages; Apology
     South Korea    | Criminal Law; Civil Law | No defence          | Imprisonment; Fine; Damages; Apology
  16. Example: Defamation (cont'd)
     • This one example shows that:
       ▫ The significant differences in what constitutes defamation in the UK and South Korea mean that any technical tool identifying defamatory tweets must be tailored to take account of the local context;
       ▫ The differing penalties the offence attracts in each country may indicate the potential appetite for detection tools in these areas.
  17. Future Work
     • Examine the comparative legal issues that impact on malicious use detection;
     • Consider the sociological reasons behind malicious use;
     • Create a 'real-time malicious use Twitter detector' based on interdisciplinary considerations;
     • Perform additional analysis on the tweets identified as malicious.
  18. Thanks for Listening. Any Questions?
