• Save
Detecting Automation of Twitter Accounts: Are You a Human, Bot, or Cyborg?
Upcoming SlideShare
Loading in...5
×
 

Like this? Share it with your network

Share

Detecting Automation of Twitter Accounts: Are You a Human, Bot, or Cyborg?

on

  • 627 views

For more projects visit@www.nanocdac.com

For more projects visit@www.nanocdac.com

Statistics

Views

Total Views
627
Views on SlideShare
627
Embed Views
0

Actions

Likes
0
Downloads
0
Comments
0

0 Embeds 0

No embeds

Accessibility

Upload Details

Uploaded via as Microsoft Word

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Detecting Automation of Twitter Accounts: Are You a Human, Bot, or Cyborg? Document Transcript

  • 1. Detecting Automation of Twitter Accounts: Are You a Human, Bot,or Cyborg?AbstractTwitter is a new web application playing dual roles of online social networking andmicroblogging. Users communicate with each other by publishing text-based posts. Thepopularity and open structure of Twitter have attracted a large number of automated programs,known as bots, which appear to be a double-edged sword to Twitter. Legitimate bots generate alarge amount of benign tweets delivering news and updating feeds, while malicious bots spreadspam or malicious contents. More interestingly, in the middle between human and bot, there hasemerged cyborg referred to either bot-assisted human or human-assisted bot. To assist humanusers in identifying who they are interacting with, this paper focuses on the classification ofhuman, bot, and cyborg accounts on Twitter. We first conduct a set of large-scale measurementswith a collection of over 500,000 accounts. We observe the difference among human, bot, andcyborg in terms of tweeting behavior, tweet content, and account properties. Based on themeasurement results, we propose a classification system that includes the following four parts: 1)an entropy-based component, 2) a spam detection component, 3) an account propertiescomponent, and 4) a decision maker. It uses the combination of features extracted from anunknown user to determine the likelihood of being a human, bot, or cyborg. Our experimentalevaluation demonstrates the efficacy of the proposed classification system.Existing SystemSeveral previous works focus on socio-technological aspects of Twitter such as its use inthe workplace or during major disaster events. Twitter has attracted spammers to post spamcontent, due to its popularity and openness. Fighting against spam on Twitter has beeninvestigated in recent works Yardi et al etected spam on Twitter. According to their observations,spammers send more messages than legitimate users, and are more likely to follow otherspammers than legitimate users. Thus, a high follower-to following ratio is a sign of spammingbehavior.www.nanocdac.com www.nsrcnano.com branches: hyderabad nagpur
  • 2. Proposed SystemIn the paper, we first conduct a series of measurements to characterize the differencesamong human, bot, and cyborg in terms of tweeting behavior, tweet content, and accountproperties. By crawling Twitter, we collect over 500,000 users and more than 40 million tweetsposted by them. Then, we perform a detailed data analysis, and find a set of useful features toclassify users into the three classes. Based on the measurement results, we propose an automatedclassification system that consists of four major components:1. The entropy component uses tweeting interval as a measure of behavior complexity, anddetects the periodic and regular timing that is an indicator of automation;2. The spam detection component uses tweet content to check whether text patterns contain spamor not3;3. The account properties component employs useful account properties, such as tweeting devicemakeup, URL ration, to detect deviations from normal; and4. The decision maker is based on Random Forest, and it uses the combination of the featuresgenerated by the above three components to categorize an unknown user as human, bot, orcyborg.Hardware Requirements• Hard disk : 80 GB• RAM : 1 GB• Processor : Pentium IVSoftware Requirements• Coding Language : Java• Database : MySQL• Operating System : Windows XPModules:• Get Access Tokenwww.nanocdac.com www.nsrcnano.com branches: hyderabad nagpur
  • 3. • Get Tweets• Calculate Entropy Measure• Find Source Typewww.nanocdac.com www.nsrcnano.com branches: hyderabad nagpur