Sybil Attacks - MobiSys Seminar

  1. detecting temporal sybil attacks. n. lathia, s. hailes & l. capra. mobisys seminar, sept. 29, 2009
  2. the web is based on cooperation...
  3. the web is crowd-sourced... ratings: recommender and retrieval systems; captchas: digitising text; wikis: knowledge repositories
  4. crowd-sourcing is cooperation... my ratings compute your recommendations; your reviews inform my decisions; your links help search engines respond to my queries.
  5. cooperation is policed by reputation and trust. ebay: online trade and markets; #followfriday on twitter? trust ratings, ratings, ratings...
  6. ...we cooperate without knowing each other. people are (nearly) anonymous. why could this be a problem?
  7. for example, recommender systems: recommendations → people → rate items → classification algorithms → recommendations → people...
  8. problem with anonymity: recommendations → people → rate items → classification algorithms → recommendations → people... can you trust them? are they real people? are they rating honestly?
  9. sybil attacks: ...when an attacker tries to subvert the system by creating a large number of sybils (pseudonymous identities) in order to gain a disproportionate amount of influence...
  10. sybil attacks: why? how? random: inject noise, ruin the party for everyone. targeted: promote/demote items, make money? APIs: rate content automatically.
  11. recommender system sybil attack: shilling, profile injection, ... [figure: "honest" ratings vs. the attacker's injected ratings]
  12. each sybil rates target, selected, and filler items. target: the item the attacker wants promoted/demoted; selected: similar items, to deceive the algorithm; filler: other items, to deceive humans.
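As a concrete illustration of that profile structure, here is a minimal sketch of an injected-profile generator, assuming a 1-5 rating scale; the function name, item ids, and rating choices are hypothetical, not from the slides.

    import random

    def make_sybil_profile(target, selected, filler, promote=True, scale=(1, 5)):
        # illustrative generator: item ids and the rating scale are made up
        lo, hi = scale
        push = hi if promote else lo
        profile = {target: push}                    # the item being promoted/demoted
        for item in selected:
            profile[item] = push                    # similar items, to deceive the algorithm
        for item in filler:
            profile[item] = random.randint(lo, hi)  # noise, to look like a normal user
        return profile

    # e.g. one sybil promoting (hypothetical) item 42:
    print(make_sybil_profile(42, selected=[7, 13], filler=[1, 2, 3]))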
  13. how to defend a recommender system?
  14. a) treat it as a classification problem: where are the sybils? [figure: "honest" ratings vs. the attacker's ratings]
  15. problems with the classification approach: when is your system under attack? when should the classifier run?
  16. problems with the classification approach: when are sybils damaging your recommendations? wait until they have all rated?
  17. proposal: b) monitor the recommender system over time
  18. contributions: 1. force sybils to draw out their attack; 2. learn normal temporal behaviour; 3. monitor for a wide range of attacks; 4. force sybils to attack more intelligently
  19. 1. force sybils to draw out their attack, rather than appear, rate, disappear. how? distrust newcomers
  20.–22. distrust newcomers [figure: prediction shift over time]
  23. 1. force sybils to draw out their attack. how? distrust newcomers: sybils are forced to appear more than once
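One simple way to read "distrust newcomers" is to damp a user's influence until they have built up rating history; the ramp function below is a hypothetical illustration, not the weighting scheme from the paper.

    def newcomer_weight(num_ratings, ramp=20):
        # weight in [0, 1) that grows with history; `ramp` is hypothetical
        # brand-new accounts (num_ratings = 0) contribute almost nothing
        return num_ratings / (num_ratings + ramp)

    for n in (0, 5, 20, 100):
        print(n, round(newcomer_weight(n), 2))   # 0.0, 0.2, 0.5, 0.83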
  24. examining temporal attack behaviour. a single sybil rates target, filler, and selected items; a group of sybils rates target, filler, and selected items too, but also:
  25. group size and dynamics: how many sybils? how many ratings per sybil?
  26. how can they behave? four regimes on the (how many sybils? × how many ratings per sybil?) plane: (few, few), (few, many), (many, few), (many, many)
  27. how does this affect the data? impact = how much malicious data there is, across the same (sybils × ratings-per-sybil) plane
  28. how to measure attacks? precision, recall, impact: pr = tp / (tp + fp); re = tp / (tp + fn); imp = #sybil ratings / #ratings
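Spelled out in code, with made-up counts for illustration:

    def precision(tp, fp):
        return tp / (tp + fp)

    def recall(tp, fn):
        return tp / (tp + fn)

    def impact(sybil_ratings, total_ratings):
        return sybil_ratings / total_ratings

    # e.g. a detector that catches 80 of 100 sybil ratings but also flags 20 honest ones:
    print(precision(tp=80, fp=20))   # 0.8
    print(recall(tp=80, fn=20))      # 0.8
    print(impact(100, 10_000))       # 0.01 (1% of all ratings are malicious)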
  29. how to detect these attacks? monitor! [figure: item-level, system-level, and user-level monitors placed on the (sybils × ratings-per-sybil) plane]
  30. how to detect these attacks? monitor! [figure: the system-level region of the same plane]
  31. overview of the methodology. 1. monitor: look at how the data changes over time; 2. flag: look at how the data changes under attack
  32. 1. system level
  33. 1. system level - attack
  34. (system) avg ratings per user. 1. monitor: exponentially weighted moving average, μ_t = β·μ_{t−|w|} + (1−β)·(R_t/U_t); 2. flag: incoming ratings above a moving threshold, R_t/U_t > μ_t + α·σ_t (parameters α, β updated automatically)
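A minimal sketch of that monitor/flag loop, assuming per-window totals R_t (ratings) and U_t (active users); the α, β, and warm-up values are illustrative, and the slide's automatic parameter updates are omitted.

    class SystemMonitor:
        """EWMA monitor over ratings-per-user; flags a window whose value
        exceeds mu + alpha * sigma (cf. slide 34)."""

        def __init__(self, alpha=3.0, beta=0.9, warmup=3):
            # alpha, beta, warmup are illustrative, not the learned values
            self.alpha, self.beta, self.warmup = alpha, beta, warmup
            self.mu = None    # moving mean of R_t / U_t
            self.var = 0.0    # moving variance
            self.n = 0        # windows seen so far

        def observe(self, ratings, users):
            x = ratings / users
            self.n += 1
            if self.mu is None:          # first window: initialise only
                self.mu = x
                return False
            flagged = self.n > self.warmup and x > self.mu + self.alpha * self.var ** 0.5
            # fold the new window into the moving statistics after testing it
            self.var = self.beta * self.var + (1 - self.beta) * (x - self.mu) ** 2
            self.mu = self.beta * self.mu + (1 - self.beta) * x
            return flagged

    m = SystemMonitor()
    for r, u in [(1000, 100), (1050, 102), (1020, 99), (5000, 120)]:
        print(m.observe(r, u))   # False, False, False, True (the burst window flags)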
  35.–36. 1. system level - evaluation. simulated data: play with data variance and attack amplitude [result figures]
  37.–38. 1. system level - evaluation. real data: netflix ratings (+ timestamps) [result figures]
  39. [figure: item-level, system-level, and user-level monitors on the (sybils × ratings-per-sybil) plane]
  40. (user-level) similar monitor/flag solution. 1. monitor: a. how many high-volume raters are there? b. how much do high-volume raters rate? 2. flag: group size and ratings above threshold [figures: highVolume.jpg, highRatings.jpg]
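A sketch of how those two user-level signals might be tracked per window; the "high-volume" cutoff and both flag thresholds are hypothetical stand-ins for the learned ones.

    def user_level_signals(ratings_per_user, high_volume=50):
        """ratings_per_user maps user id -> #ratings in this window.
        Returns (number of high-volume raters, their mean rating volume)."""
        heavy = [n for n in ratings_per_user.values() if n >= high_volume]
        return len(heavy), (sum(heavy) / len(heavy) if heavy else 0.0)

    def flagged(count, volume, count_thresh=10, volume_thresh=200):
        # hypothetical thresholds: flag only when group size AND
        # per-user volume jump together
        return count > count_thresh and volume > volume_thresh

    window = {f"u{i}": 5 for i in range(500)}           # ordinary raters
    window.update({f"s{i}": 300 for i in range(25)})    # a burst of heavy raters
    count, volume = user_level_signals(window)
    print(count, volume, flagged(count, volume))        # 25 300.0 True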
  41.–42. (user-level) evaluation: real data [figures over the (sybils × ratings-per-sybil) plane]
  43. (user-level) evaluation: real data [figure: item-level, system-level, and user-level regions of the same plane]
  44. (item-level) slightly different context. 1. the item is rated by many users (define "many"? using how other items were rated); 2. the item is rated with extreme ratings (define "extreme"? what is the average item mean?); 3. (from 1 + 2) the item's mean rating shifts (nuke or promote?). flag only if all three conditions are broken. why? 1 alone → a popular item; 2 alone → a few extreme ratings; 3 alone → a cold-start item; 1 + 2 but not 3 → the attack doesn't change anything
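A sketch of that three-condition test for one item over a window; all thresholds here are hypothetical, whereas the slides derive them from how other items are rated.

    from statistics import mean

    def item_flagged(window_ratings, hist_mean,
                     many=30, extreme_frac=0.6, shift=0.5, scale=(1, 5)):
        """Flag an item only when all three slide-44 conditions hold;
        any single condition alone is benign. Thresholds are illustrative."""
        if not window_ratings:
            return False
        lo, hi = scale
        popular = len(window_ratings) >= many                        # 1. many raters
        extreme = sum(r in (lo, hi) for r in window_ratings) \
                  >= extreme_frac * len(window_ratings)              # 2. extreme ratings
        shifted = abs(mean(window_ratings) - hist_mean) >= shift     # 3. mean shifts
        return popular and extreme and shifted

    print(item_flagged([5] * 40, hist_mean=2.8))           # True: promote attack
    print(item_flagged([3, 4, 3, 4] * 10, hist_mean=3.5))  # False: popular, not extreme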
  45. 3. evaluate: simulated attacks on real data
  46. what next? attackers can defeat these defenses: the ramp-up attack
  47. but...
  48. conclusions: 1. force sybils to draw out their attack; 2. learn normal temporal behaviour; 3. monitor system, users, items; 4. force sybils to attack more intelligently
