[WWW2012] analyzing spammers' social networks for fun and profit
 


20121219 Lab Paper Presentation.

    Presentation Transcript

    • Analyzing Spammers' Social Networks for Fun and Profit: A Case Study of Cyber Criminal Ecosystem on Twitter. Chao Yang, Texas A&M University; Robert Harkreader, Texas A&M University; Jialong Zhang, Texas A&M University. WMMKS Lab, 郭至軒. WWW '12.
    • Criminal Account: malicious behavior
    • Twitter Rule: A Twitter account can be considered to be spamming, and thus be suspended by Twitter.
    • Twitter Rule: If it has a small number of followers compared to the number of accounts that it follows. Example: 10 accounts follow the account, while it follows 1,000.
    • Who follows criminal accounts, allowing them to keep existing?
    • Cyber Criminal Ecosystem: criminal account, criminal supporter, legitimate account, victim; inner and outer relationships
    • Inner Social Relationship (the inner part of the ecosystem)
    • Inner Social Relationship: 2,060 accounts, 9,868 follow relationships. G = (V, E); V: all criminal accounts; E: all follow relationships (directed edges)
    • Inner Social Relationship, Relationship Graph, Connected Components: 8 weakly connected components (at least 3 nodes each); 521 isolated nodes
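
As a rough illustration of how weakly connected components and isolated nodes could be counted on a directed follow graph, here is a minimal sketch. The function name and the toy graph are hypothetical and are not taken from the slides; edge direction is simply ignored when grouping nodes.

```python
from collections import defaultdict

def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components

# Toy data (hypothetical): a, b, c are linked by follows; d is isolated.
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b"), ("c", "b")]
components = weakly_connected_components(toy_nodes, toy_edges)
large = [c for c in components if len(c) >= 3]      # "at least 3 nodes"
isolated = [c for c in components if len(c) == 1]   # isolated nodes
```
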
    • Inner Social Relationship, Finding 1: Criminal accounts tend to be socially connected, forming a small-world network.
    • Inner Social Relationship, Graph Density:
        Space                     | Accounts    | Follow relationships | Density
        Criminal space in sample  | 2,060       | 9,868                | 2.33 × 10^-3
        Entire Twitter space      | 41.7 × 10^6 | 1.47 × 10^9          | 8.45 × 10^-7
    • Inner Social Relationship, Graph Density: the criminal space in the sample is almost 3,000 times denser than the entire Twitter space.
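
The density figures on the slide can be checked with a short calculation. This sketch assumes the usual directed-graph definition, edges divided by n × (n − 1); the slides do not state the formula, but this definition reproduces the reported values.

```python
def directed_density(num_accounts, num_follow_edges):
    """Fraction of possible directed edges (no self-loops) that exist."""
    return num_follow_edges / (num_accounts * (num_accounts - 1))

# Figures taken from the slide.
criminal = directed_density(2_060, 9_868)
twitter = directed_density(41_700_000, 1_470_000_000)
ratio = criminal / twitter

print(f"criminal: {criminal:.2e}, twitter: {twitter:.2e}, ratio: {ratio:.0f}")
```

The ratio comes out a little under 2,800, which matches the "almost 3,000 times" remark.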
    • Inner Social Relationship, Reciprocity (number of bidirectional links): Reciprocity of 95% of criminal accounts is higher than 0.2. Reciprocity of 55% of normal accounts is higher than 0.2. Reciprocity of around 20% of criminal accounts is nearly 1.0.
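
The slide only labels reciprocity as being based on the number of bidirectional links. The sketch below assumes one plausible per-account definition, the share of an account's followings that follow it back; that definition, the function name, and the toy data are assumptions.

```python
def reciprocity(account, following):
    """Share of an account's followings that follow it back (assumed definition).

    `following` maps each account to the set of accounts it follows.
    """
    followed = following.get(account, set())
    if not followed:
        return 0.0
    mutual = sum(1 for other in followed if account in following.get(other, set()))
    return mutual / len(followed)

# Toy data (hypothetical): only b follows a back.
toy_following = {"a": {"b", "c"}, "b": {"a"}, "c": set()}
```
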
    • Inner Social Relationship, Average Shortest Path Length (ASPL): the average number of steps along the shortest paths for all possible pairs of graph nodes. ASPL of criminal accounts: 2.60. ASPL of legitimate accounts: 4.12.
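
A minimal sketch of an ASPL computation using breadth-first search from every node. It assumes follow edges are traversed in their own direction and that unreachable pairs are skipped; the slides do not say how such pairs were handled. Only accounts that appear as keys in the hypothetical `following` map are used as BFS sources.

```python
from collections import deque

def average_shortest_path_length(following):
    """Mean BFS distance over ordered pairs (u, v) where v is reachable from u."""
    total, pairs = 0, 0
    for source in following:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in following.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0

# Toy data (hypothetical): a directed 3-cycle a -> b -> c -> a.
toy_cycle = {"a": {"b"}, "b": {"c"}, "c": {"a"}}
```
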
    • Inner Social Relationship: Criminal accounts have strong social connections with each other.
    • Inner Social Relationship: What are the main factors leading to that structure?
    • Inner Social Relationship: Criminal accounts tend to follow many accounts without much regard for those accounts' quality. Following Quality (FQ): the average follower number of all of an account's followings.
    • Inner Social Relationship: Criminal accounts tend to follow many accounts without much regard for those accounts' quality. FQ of 85% of criminal accounts is lower than 20,000. FQ of 45% of normal accounts is lower than 20,000.
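
Following Quality as defined on the slide can be written as a short function. The data layout (a `following` map plus a `follower_count` map) and the toy numbers are assumptions made for illustration.

```python
def following_quality(account, following, follower_count):
    """Average follower number of all accounts that `account` follows."""
    followed = following.get(account, set())
    if not followed:
        return 0.0
    return sum(follower_count[a] for a in followed) / len(followed)

# Toy data (hypothetical).
toy_following = {"spammer": {"x", "y"}, "fan": {"celebrity"}}
toy_follower_count = {"x": 100, "y": 300, "celebrity": 50_000}
```

In this toy example the spammer's FQ is 200, far below the 20,000 mark used on the slide, while the fan's FQ is 50,000.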
    • Inner Social Relationship: Criminal accounts belong to the same criminal organizations.
    • Inner Social Relationship: Criminal accounts belong to the same criminal organizations. Group criminal accounts into different criminal campaigns by malicious URL. 2,060 accounts, 9,868 edges: 17 campaigns account for 8,667 of the edges (87.8%).
    • Inner Social Relationship: Provide followers to criminal accounts. 1. Break the Following Limits Policy. 2. Evade spam detection.
    • Inner Social Relationship, Finding 2: Compared with criminal leaves, criminal hubs are more inclined to follow criminal accounts.
    • Inner Social Relationship, Relationship Graph: use the HITS algorithm to calculate hub scores, then the k-means algorithm to cluster them. Criminal hubs: 90. Criminal leaves: 1,970.
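
The hub/leaf split can be sketched as a plain HITS power iteration followed by a tiny one-dimensional k-means. Using k = 2, the iteration counts, and the toy graph are all assumptions; the slides give only the algorithm names and the resulting counts.

```python
import math

def hits_hub_scores(following, iterations=50):
    """Hub scores: an account is a good hub if it follows good authorities."""
    nodes = set(following) | {v for vs in following.values() for v in vs}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the accounts following this one.
        auth = {n: 0.0 for n in nodes}
        for src, targets in following.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub score: sum of authority scores of the accounts this one follows.
        hub = {n: sum(auth[d] for d in following.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {n: h / norm for n, h in hub.items()}
    return hub

def two_means_threshold(values, iterations=20):
    """1-D k-means with k=2; returns the midpoint between the two centers."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        mid = (low + high) / 2
        lows = [v for v in values if v <= mid]
        highs = [v for v in values if v > mid]
        if not highs:
            break
        low, high = sum(lows) / len(lows), sum(highs) / len(highs)
    return (low + high) / 2

# Toy data (hypothetical): "hub" follows three accounts, "leaf1" follows one.
toy = {"hub": {"a", "b", "c"}, "leaf1": {"a"}, "leaf2": set()}
scores = hits_hub_scores(toy)
threshold = two_means_threshold(list(scores.values()))
hubs = sorted(n for n, s in scores.items() if s > threshold)
```
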
    • Inner Social Relationship, Criminal Following Ratio (CFR): the ratio of the number of an account’s criminal-followings to its total following number
    • Inner Social Relationship: CFR of 80% of criminal hubs is higher than 0.1. CFR of 20% of criminal leaves is higher than 0.1. CFR of 60% of criminal leaves is lower than 0.05.
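
The CFR definition translates directly into code. The data layout and the toy accounts below are hypothetical.

```python
def criminal_following_ratio(account, following, criminals):
    """Number of criminal-followings divided by total following number."""
    followed = following.get(account, set())
    if not followed:
        return 0.0
    return len(followed & criminals) / len(followed)

# Toy data (hypothetical): c1 and c2 are known criminal accounts.
toy_criminals = {"c1", "c2"}
toy_following = {
    "hub": {"c1", "c2", "u1", "u2"},
    "leaf": {"c1", "u1", "u2", "u3"},
}
```
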
    • Inner Social Relationship: Why?
    • Inner Social Relationship: Criminal hubs tend to obtain followers more effectively by following other criminal accounts. Shared Following Ratio (SFR): the percentage of an account’s followers who also follow at least one of this account’s criminal-followings
    • Inner Social Relationship: Criminal hubs tend to obtain followers more effectively by following other criminal accounts. SFR of 80% of criminal hubs is higher than 0.4. SFR of 5% of criminal leaves is higher than 0.4.
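
A minimal sketch of the SFR definition, assuming the whole follow graph fits in a hypothetical `following` map so that followers can be found by scanning it.

```python
def shared_following_ratio(account, following, criminals):
    """Share of an account's followers who also follow at least one of
    the account's criminal-followings."""
    followers = {a for a, fs in following.items() if account in fs}
    if not followers:
        return 0.0
    criminal_followings = following.get(account, set()) & criminals
    shared = sum(1 for f in followers if following[f] & criminal_followings)
    return shared / len(followers)

# Toy data (hypothetical): hub follows criminal c1; f1 follows both hub and c1.
toy_criminals = {"hub", "c1"}
toy_following = {
    "hub": {"c1"},
    "f1": {"hub", "c1"},
    "f2": {"hub"},
    "c1": set(),
}
```
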
    • Inner Social Relationship: Criminal hubs follow leaves and acquire their followers’ information. Criminal leaves randomly follow other accounts, expecting them to follow back.
    • Outer Social Relationship
    • Outer Social Relationship, criminal supporters: accounts outside the criminal community who have close "follow relationships" with criminal accounts
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): MR score: measures how closely an account follows criminal accounts
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): 1. the more criminal accounts followed, the higher the score; 2. the further away from a criminal account, the lower the score; 3. the closer the support relationship between a Twitter account and a criminal account, the higher the score
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): Malicious Relevance Graph, G = (V, E, W). V: all accounts. E: all follow relationships (directed edges). W: weight for each edge, the closeness of the relationship
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): MR Score Initialization: M_i = 1 if V_i is a criminal account; M_i = 0 if V_i is not a criminal account
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): MR Score Aggregation: an account’s score should sum up all the scores inherited from the accounts it follows. Example: A follows C1 and C2; MR(C1) = M1, MR(C2) = M2, MR(A) = M1 + M2
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): MR Score Dampening: the amount of MR score that an account inherits from other accounts should be multiplied by a dampening factor α according to their social distance, where 0 < α < 1. Example: A1 follows C and A2 follows A1; MR(C) = M, MR(A1) = α × M, MR(A2) = α^2 × M
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): MR Score Splitting: the amount of MR score that an account inherits from the accounts it follows should be multiplied by a relationship-closeness factor. Example: A1 and A2 both follow C; MR(C) = M, MR(A1) = 0.5 × M, MR(A2) = 0.5 × M
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): n: number of total nodes. I_ij ∈ {0, 1}: I_ij = 1 if (i, j) ∈ E; otherwise I_ij = 0
    • Outer Social Relationship, Malicious Relevance Score Propagation Algorithm (Mr.SPA): I: the column-vector normalized adjacency matrix of nodes
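
The update formula itself does not survive in the transcript, so the following is only one possible realization of the three rules (aggregation, dampening, splitting), not the authors' exact equation. It assumes scores flow from a followed account to its followers, that every follower gets an equal share (standing in for the relationship-closeness weight), and that the update is M ← M0 + α · W · M with W the column-normalized adjacency matrix. The function and parameter names are hypothetical.

```python
def propagate_mr_scores(following, criminals, alpha=0.5, iterations=50):
    """Iterate M <- M0 + alpha * W * M over a follow graph.

    `following` maps each account to the set of accounts it follows.
    W splits each account's score equally among its followers.
    """
    nodes = set(following) | {v for vs in following.values() for v in vs} | set(criminals)
    follower_count = {n: 0 for n in nodes}
    for targets in following.values():
        for dst in targets:
            follower_count[dst] += 1
    # Initialization: 1 for criminal accounts, 0 otherwise.
    initial = {n: 1.0 if n in criminals else 0.0 for n in nodes}
    scores = dict(initial)
    for _ in range(iterations):
        updated = dict(initial)
        for src, targets in following.items():
            for dst in targets:
                # src inherits a dampened, evenly split share of dst's score.
                updated[src] += alpha * scores[dst] / follower_count[dst]
        scores = updated
    return scores

# Dampening example (hypothetical): A1 follows C, A2 follows A1.
chain = propagate_mr_scores({"A1": {"C"}, "A2": {"A1"}}, {"C"}, alpha=0.5)
# Splitting example (hypothetical): A1 and A2 both follow C.
split = propagate_mr_scores({"A1": {"C"}, "A2": {"C"}}, {"C"}, alpha=0.5)
# Aggregation example (hypothetical): A follows C1 and C2.
agg = propagate_mr_scores({"A": {"C1", "C2"}}, {"C1", "C2"}, alpha=0.5)
```

Because dampening and splitting are applied together here, each follower in the splitting example receives α × 0.5 × M rather than the undampened 0.5 × M that the slide uses to illustrate splitting alone.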
    • Outer Social Relationship, After Mr.SPA: use the x-means algorithm to cluster accounts based on their MR scores. Most accounts have relatively small scores and are grouped into one single cluster, because most accounts do not have very close follow relationships with criminal accounts. Result: 5,924 criminal supporters.
    • Outer Social Relationship, Social Butterflies: those accounts that have extraordinarily large numbers of followers and followings. Using 2,000 followings as a threshold gives 3,818 social butterflies.
    • Outer Social Relationship, Social Butterflies: The reason why social butterflies tend to have close friendships with criminals is mainly that most of them usually follow back the users who follow them without careful examination.
    • Outer Social Relationship, Social Promoters: those accounts that have large following-follower ratios, larger following numbers and relatively high URL ratios. Selected are accounts whose URL ratios are higher than 0.1 and whose following numbers and following-follower ratios are both in the top 10 percent: 508 social promoters.
    • Outer Social Relationship, Social Promoters: The reason why social promoters tend to have close friendships with criminal accounts is probably that most of them usually promote themselves or their business by actively following other accounts without considering those accounts’ quality.
    • Outer Social Relationship, Dummies: those accounts that post few tweets but have many followers. Selected are accounts that post fewer than 5 tweets and whose follower numbers are in the top 10 percent: 81 dummies.
    • Outer Social Relationship, Dummies: The reason why dummies tend to have close friendships with criminals is mainly that most of them are controlled or utilized by cyber criminals.
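
The three supporter categories can be sketched as threshold rules. The numeric thresholds (2,000 followings, URL ratio above 0.1, fewer than 5 tweets, top 10 percent) come from the slides; the field names, the way the top-10-percent cutoff is computed, and the toy accounts are assumptions.

```python
def top_decile_cutoff(values):
    """Smallest value that is still within the top 10% of `values`."""
    ordered = sorted(values, reverse=True)
    keep = max(1, len(ordered) // 10)
    return ordered[keep - 1]

def label_supporters(accounts):
    """accounts: name -> dict with followings, followers, tweets, url_ratio."""
    def ratio(a):
        return a["followings"] / max(a["followers"], 1)

    following_cut = top_decile_cutoff([a["followings"] for a in accounts.values()])
    follower_cut = top_decile_cutoff([a["followers"] for a in accounts.values()])
    ratio_cut = top_decile_cutoff([ratio(a) for a in accounts.values()])
    labels = {}
    for name, a in accounts.items():
        tags = set()
        if a["followings"] > 2000:
            tags.add("social butterfly")
        if (a["url_ratio"] > 0.1 and a["followings"] >= following_cut
                and ratio(a) >= ratio_cut):
            tags.add("social promoter")
        if a["tweets"] < 5 and a["followers"] >= follower_cut:
            tags.add("dummy")
        labels[name] = tags
    return labels

# Toy data (hypothetical): three notable accounts plus 17 ordinary ones.
toy_accounts = {
    "butterfly": {"followings": 5000, "followers": 4800, "tweets": 900, "url_ratio": 0.02},
    "promoter": {"followings": 1900, "followers": 50, "tweets": 300, "url_ratio": 0.4},
    "dummy": {"followings": 10, "followers": 9000, "tweets": 2, "url_ratio": 0.0},
}
for i in range(17):
    toy_accounts[f"user{i}"] = {
        "followings": 100 + i, "followers": 200 + i, "tweets": 50, "url_ratio": 0.0,
    }
toy_labels = label_supporters(toy_accounts)
```
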
    • Inferring Criminal Accounts: The number of Twitter accounts is HUGE!
    • Inferring Criminal Accounts, Criminal account Inference Algorithm (CIA): start from a seed set
    • Inferring Criminal Accounts, Criminal account Inference Algorithm (CIA): 1. criminal accounts tend to be socially connected; 2. criminal accounts usually share similar topics, and thus have strong semantic coordination among them
    • Inferring Criminal Accounts, Criminal account Inference Algorithm (CIA): Malicious Relevance Graph, G = (V, E, W). V: all accounts. E: all follow relationships (directed edges). W: weight for each edge, WS(i, j). P.S. SS: Semantic Similarity Score
    • Inferring Criminal Accounts, Criminal account Inference Algorithm (CIA): n: number of total nodes. I_ij ∈ {0, 1}: I_ij = 1 if (i, j) ∈ E; otherwise I_ij = 0
    • Inferring Criminal Accounts, Evaluation of CIA: Dataset I: around half a million accounts from our previous study [35]. Dataset II: another 30,000 newly crawled accounts, collected by starting from 10 newly identified criminal accounts and using a BFS strategy
    • Inferring Criminal Accounts, Evaluation of CIA, Dataset I (CA: Criminal Account; MA: Malicious Affected Account): Selection Strategies. 100 seeds, select 4,000 accounts.
    • Inferring Criminal Accounts, Evaluation of CIA, Dataset I: Selection Sizes. 100 seeds.
    • Inferring Criminal Accounts, Evaluation of CIA, Dataset I: Seed Sizes. Select 4,000 accounts.
    • Inferring Criminal Accounts, Evaluation of CIA, Dataset I: Seed Type. 100 seeds, select 4,000 accounts.
    • Inferring Criminal Accounts, Evaluation of CIA, Dataset I: Recursive Inference. 50 seeds, select 4,000 accounts.
    • Inferring Criminal Accounts, Evaluation of CIA, Dataset II: Seed Type. 10 seeds, select 4,000 accounts.
    • Conclusion. S: Provides a macro-scale view of the criminal accounts. W: Focuses on analysis and uses heuristic methods, and the details of the semantic similarity score are omitted. O: Many malicious accounts are found, and analyzing them may improve the accuracy of CIA in detecting criminal accounts. T: The training dataset is biased, so CIA improves little when detecting criminal accounts.
    • Thank you for listening!