Abuse prevention in the globally distributed economy presentation


Shyam Mittur (Yahoo)

Published in: Technology, News & Politics

  1. 1. Abuse in the Globally Distributed Economy – Shyam Mittur, June 26, 2012
  2. 2. Welcome to the Global Economy – how to create new jobs
  3. 3. Welcome to the Global Economy – let’s go crack Y! accounts
  4. 4. Outline
      • History – What is abuse and how did we deal with it?
      • Evolution of abuse
      • Keeping up with abuse – our strategy and tools
      • Continuing challenges
  5. 5. What is Abuse?
  6. 6. Abuse is – “Something you’re allowed to do, but in a way that is not allowed”
      • Service abuse: primarily overuse
        › Mass registration
        › Account and credentials compromise attempts
      • Content abuse: undesirable user-generated content
        › Spam: “go to stockmarketvideo.com it 5o bucks a month i subscribe there the guy is good ., stop doin wat ur doin”
        › Offensive posts: “****WHY IS YOUR SXXX WXXX CXXX MOTHER CXXXXXX OVER MY HOUSE TONIGHT?****”
        › Solicitations: “!!!!!!`"[Seek¯ing¯R¯ich .C¯0M]],(remove¯),,,,,,,,where to find educated men! where to find women with inner and outer beauty....”
        › Offensive images
  7. 7. The view from the inside
      • High-rate abuse is still present
      • Content abuse is everywhere
        › Commercial spam: solicitations, stock scams, etc.
        › Off-topic postings: politics, bigotry, baiting, harassment
        › Image abuse: porn sites, webcams, URLs
      • Account compromise is up
        › Every merchant wants you to register
        › Many have poor back-end infrastructure; user databases are compromised and sold
        › Users use the same id/pw/questions in many locations
        › Baffled family and friends: “I got this e-mail from you … ”
        › Leads to: “Help, my account has been hacked!”
  8. 8. Example – registration attempts
      • 5-25% of attempts in one colo were deemed abusive and denied
  9. 9. Junk Account Registrations
      • Over 50% of successful registrations are suspected to be abusive
        › Black: Total registrations
        › Yellow: Suspected abusive registrations
        › Blue: Likely good registrations
  10. 10. Login attempts
      • 20-40% of the attempts in one colo were deemed abusive and denied
  11. 11. Service Requests
      • 12-20% of all service requests were denied
  12. 12. CAPTCHA Challenges
      • 50% of CAPTCHAs are not attempted
      • 40% of those attempted are successful
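Taken together, the two CAPTCHA figures imply an overall solve rate. The short calculation below works it out, assuming both percentages describe the same population of issued challenges (the slide does not say so explicitly) and treating the rounded figures as exact.

```python
from fractions import Fraction

# Figures quoted on the slide; treating them as exact is an assumption.
not_attempted = Fraction(50, 100)          # share of issued CAPTCHAs never attempted
success_given_attempt = Fraction(40, 100)  # share of attempted CAPTCHAs that are solved

attempted = 1 - not_attempted
solved_overall = attempted * success_given_attempt  # share of all issued CAPTCHAs solved
unsolved_overall = 1 - solved_overall

print(f"solved: {float(solved_overall):.0%}, not solved: {float(unsolved_overall):.0%}")
# prints "solved: 20%, not solved: 80%"
```

Under those assumptions, only about one in five issued challenges ends in a correct answer.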
  13. 13. How we deal with Abuse
  14. 14. Prevention and Mitigation
      • Overuse-detection and service-denial at the edge
        › Common base rules and conservative limits everywhere
        › Additional custom rules and aggressive limits in select locations (high activity and/or high risk)
      • Liberal registration (sign-up)
        › Biased in favor of quick and easy sign-up for new users
      • Widespread use of CAPTCHA
      • Aggressive action on detected abusive activity
        › Wide range of sophistication in detection techniques and strategies
        › From blacklists and regular expressions to machine learning approaches
  15. 15. Platform Tools and Solutions
      • Rate limiting and filtering
        › YDoD
      • Challenge/response validation
        › CAPTCHA service
      • Content classification
        › Anti-spam (Mail, Messenger), Standard Moderation Platform (other contexts)
        › URL database and services
      • Account action
        › Warn, Rehab, Suspend, Trap, Delete
  16. 16. YDoD – A self-aggregating blacklist manager and rate limiter
  17. 17. YDoD works with “filters”
      • A filter describes the criteria for identifying abuse
        › Preconditions and descriptions of the information to be used for tracking abuse (what kind of activity am I interested in watching and/or blocking?)
        › Limits and descriptions of the table used to track abuse (how much of that am I willing to take?)
        › Response (what do I do when I’ve had enough?)
      • Like a set of configuration files in a custom language
      • Filters are installed on client hosts and central “clusterhosts”
      • The clusterhost cares about the limits
      • The client cares about the preconditions and responses
        › On an “overlimit” condition, a configurable set of responses (actions) is invoked
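The three parts of a filter (precondition, limit, response) can be illustrated with a minimal in-process sketch. This is not YDoD's configuration language or implementation, which the deck does not show: the `Filter` class, its field names, and the sliding-window counting are all assumptions, and the sketch collapses the client and clusterhost roles into a single object.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict


@dataclass
class Filter:
    """Hypothetical filter: precondition + tracking key + limit + response."""
    precondition: Callable[[dict], bool]   # what kind of activity to watch
    key: Callable[[dict], str]             # what to count by (here: client IP)
    limit: int                             # how many events are tolerated...
    window_seconds: float                  # ...within this sliding window
    response: str                          # action name returned on "overlimit"
    _table: Dict[str, Deque[float]] = field(
        default_factory=lambda: defaultdict(deque))

    def check(self, request: dict, now: float) -> str:
        """Return 'allow', or the configured response once over the limit."""
        if not self.precondition(request):
            return "allow"
        events = self._table[self.key(request)]
        # Drop events that have aged out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        events.append(now)
        return self.response if len(events) > self.limit else "allow"


login_filter = Filter(
    precondition=lambda r: r["path"] == "/login",
    key=lambda r: r["ip"],
    limit=3,
    window_seconds=60.0,
    response="deny",
)

results = [
    login_filter.check({"path": "/login", "ip": "203.0.113.7"}, now=float(t))
    for t in range(5)
]
print(results)  # ['allow', 'allow', 'allow', 'deny', 'deny']
```

In the deployment the slide describes, the counting table would live on the clusterhost, while the precondition check and the response would run on the client host.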
  18. 18. What a YDoD table looks like
  19. 19. CAPTCHA over the years: 2001, February 2004, February 2008, April 2008, September 2010
  20. 20. Content Abuse
      • Standard Moderation Platform
        › A framework for classification and moderation of user-generated content
      • Web service interface, provides a synchronous judgment
        › Uses a configured stack of classifiers
          • Blacklists
          • Regular expressions
          • Obscenity word lists (with variants)
          • Image analysis
          • Signature/hash matching
          • Machine learning algorithm implementations
      • Abusive or “suspect” content can be forwarded to human moderation (generally asynchronous)
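A configured stack of classifiers that returns a synchronous judgment might look like the sketch below. It is an illustration only, not the Standard Moderation Platform's interface: the classifier functions, the sample blacklist and hash entries, the three verdict labels, and the first-verdict-wins rule are all assumptions.

```python
import hashlib
import re
from typing import Callable, List, Optional

# A classifier returns a verdict, or None when it has no opinion.
Classifier = Callable[[str], Optional[str]]

BLACKLIST = {"spam.example"}  # illustrative entry, not real data
KNOWN_BAD_HASHES = {hashlib.sha256(b"buy cheap accounts now").hexdigest()}
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def blacklist_classifier(text: str) -> Optional[str]:
    lowered = text.lower()
    return "abusive" if any(term in lowered for term in BLACKLIST) else None


def signature_classifier(text: str) -> Optional[str]:
    # Hash of the normalized text, compared against known abusive content.
    digest = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
    return "abusive" if digest in KNOWN_BAD_HASHES else None


def regex_classifier(text: str) -> Optional[str]:
    # Assumed heuristic: three or more links is suspicious, not conclusive.
    return "suspect" if len(URL_PATTERN.findall(text)) >= 3 else None


STACK: List[Classifier] = [blacklist_classifier, signature_classifier, regex_classifier]


def moderate(text: str, stack: List[Classifier]) -> str:
    """Run the stack in order; the first classifier with a verdict wins."""
    for classifier in stack:
        verdict = classifier(text)
        if verdict is not None:
            return verdict
    return "ok"


print(moderate("visit spam.example today", STACK))  # abusive
print(moderate("hello world", STACK))               # ok
```

A caller would act on "abusive" immediately and could queue "suspect" items for asynchronous human moderation, as the slide describes.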
  21. 21. The Evolution of Abuse
  22. 22. Data Entry Job?
  23. 23. Another “Data Entry Job” recruiter
  24. 24. A few “record holders” here
  25. 25. When $0.75/day solving CAPTCHAs is the alternative
  26. 26. Need a few Yahoo! accounts?
      • This one seems to be out of business, but there are many such providers
  27. 27. Rent-a-botnet
      • http://www.zdnet.com/blog/security/study-finds-the-average-price-for-renting-a-botnet/6528
  28. 28. From hacking/fun/malice to business/profit
      • There is money to be made
        › Jan 30, 2012: “It is estimated that financial institutions have lost $15 billion in the past five years” – NPR All Things Considered [1]
        › Sept 14, 2011: “The FBI is currently investigating over 400 reported cases of corporate account takeovers in which cyber criminals have initiated unauthorized ACH and wire transfers from the bank accounts of U.S. businesses. These cases involve the attempted theft of over $255 million and have resulted in the actual loss of approximately $85 million.” [2]
      • Globalization
        › Specialized services that source knowledge and manpower from low-cost locations
        › Examples: Registration, CAPTCHA solving, Spam pushing
      • Botnets, malware and data breaches
        › Botnets are available for rental by-the-hour or for entire campaigns
        › Malware propagation, key logging, identity theft, account compromise/takeover
      • “Multi-level marketing” at its best!
      [1] Original source unknown
      [2] http://www.fbi.gov/news/testimony/cyber-security-threats-to-the-financial-sector
  29. 29. A global market and ecosystem
  30. 30. Kolotibablo.com: A “full-service” offering
      • Registration, CAPTCHA-solving, spam campaigns
  31. 31. Funny – they use CAPTCHA, too!
      • Not very good either
  32. 32. Xrumer – another full-service solution
      • ‘The system of “Antispam” – correct spam’
  33. 33. decaptcher.net – a CAPTCHA solving service (busted?)
      • Hi. I need to crack captcha. Do you provide a captcha decoders?
        › DeCaptcher CAPTCHA solving is processed by humans. So the accuracy is much better than an automated captcha solver ones
      • Hi guys. Can you make an advert program for me for *****.com?
        › Contact us and well discuss it.
      • Can I solve captchas in many threads?
        › Yes, you can. CAPTCHA solving can be parallelized. Just make sure in every thread you do like follows: login solve as many captchas as you need logout.
  34. 34. More on this at …
      • “The Commercial Malware Industry” by Peter Gutmann, University of Auckland
      • “Krebs on Security” blog by Brian Krebs
      • Stefan Savage and his team’s work at UC San Diego
  35. 35. Evolution of our strategy and tools
  36. 36. Going forward: a two-pronged strategy
      • General approach: more detection and mitigation at the edge
      • Classification of every request
        › Good – service, abusive – deny, not sure – service or challenge
        › Algorithmic approaches, beyond just counting
      • Presentation of graded challenges
        › Simple CAPTCHAs still work well in many situations
        › In-line and out-of-band
        › All kinds of other ideas, too
      • Special handling of account compromise
        › More notification (mostly opt-in, some not)
        › The account is placed in a trap state
        › Challenge/verify at next opportunity
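The good / abusive / not sure split can be sketched as a decision over a per-request abuse score. The 0-to-1 score scale, the two thresholds, and the names below are assumptions for illustration; the deck does not say how requests are scored or where the cut-offs sit.

```python
from enum import Enum


class Action(Enum):
    SERVE = "serve"
    CHALLENGE = "challenge"
    DENY = "deny"


def decide(abuse_score: float,
           good_below: float = 0.2,
           abusive_above: float = 0.8,
           challenge_when_unsure: bool = True) -> Action:
    """Map a hypothetical abuse score in [0, 1] to an action.

    Good requests are served, abusive ones denied, and the uncertain
    middle band is either challenged or served, depending on policy.
    """
    if not 0.0 <= abuse_score <= 1.0:
        raise ValueError("abuse_score must be between 0 and 1")
    if abuse_score < good_below:
        return Action.SERVE
    if abuse_score > abusive_above:
        return Action.DENY
    return Action.CHALLENGE if challenge_when_unsure else Action.SERVE


for score in (0.05, 0.5, 0.95):
    print(score, decide(score).value)
```

Keeping the uncertain band explicit is what makes graded challenges possible: a deployment could pick a harder or an out-of-band challenge as the score approaches the upper threshold.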
  37. 37. Project Blackbird: a new framework
      • Why we need this
        › Operating at a much higher scale (of requests, deployments, services)
        › Up against highly capable adversaries
        › Who they are and where they are coming from are not meaningful or relevant
        › What they do is what matters
        › Tight performance budget for synchronous detection
        › Quick reaction time for deployment and customization
      • Approach
        › Plug-in deployment of blacklists, exemptions, classifiers
        › Encapsulation of detection techniques as classifiers
        › Abstraction of classifiers as algorithm (code) + model (data)
        › Support for automatic data sampling, retraining, model building and updates
        › Central control of the framework (development and deployment)
        › Distributed ownership of classifiers (development, deployment and customization)
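The "algorithm (code) + model (data)" abstraction can be made concrete with a small sketch. None of this is Blackbird's actual design, which the deck shows only as diagrams: the `Classifier` and `Framework` classes, the linear scoring function, and the feature names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping

Model = Mapping[str, float]
Features = Mapping[str, float]


def weighted_sum(model: Model, features: Features) -> float:
    """Algorithm (code): a linear score; the weights live in the model (data)."""
    return sum(model.get(name, 0.0) * value for name, value in features.items())


@dataclass
class Classifier:
    algorithm: Callable[[Model, Features], float]
    model: Model

    def score(self, features: Features) -> float:
        return self.algorithm(self.model, features)

    def update_model(self, new_model: Model) -> None:
        """Swap in retrained data without touching the algorithm code."""
        self.model = dict(new_model)


class Framework:
    """Centrally controlled registry; owners plug in their own classifiers."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Classifier] = {}

    def plug_in(self, name: str, classifier: Classifier) -> None:
        self._plugins[name] = classifier

    def classify(self, features: Features) -> Dict[str, float]:
        return {name: c.score(features) for name, c in self._plugins.items()}


framework = Framework()
login_abuse = Classifier(weighted_sum, {"requests_per_minute": 0.25, "failed_logins": 0.5})
framework.plug_in("login_abuse", login_abuse)

features = {"requests_per_minute": 2.0, "failed_logins": 1.0}
before = framework.classify(features)
login_abuse.update_model({"failed_logins": 2.0})  # e.g. after retraining
after = framework.classify(features)
print(before, after)
```

Separating the two lets a retrained model ship as a data update, which fits the slide's requirement for quick reaction time without redeploying code.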
  38. 38. Blackbird design: front-end
  39. 39. Blackbird design: support infrastructure
  40. 40. CAPTCHA: not just those squiggly characters
      • We generalized and abstracted the CAPTCHA framework
      • Changed integration and delivery to a service model
        › Create challenge (the “test”)
        › Present challenge
        › Validate response
      • Made the challenge techniques configurable and selectable
        › Several graphical presentations
        › Non-graphical challenges
        › Out-of-band challenges: Voice, SMS, E-mail, Postcard (yes)
        › Difficulty levels
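The create / present / validate service model can be sketched as follows. This is a minimal illustration under stated assumptions, not Yahoo's CAPTCHA service: the `ChallengeService` class, its method names, the answer alphabet, and the difficulty-to-length rule are all hypothetical, and the plain-text `prompt` stands in for whatever distorted image, audio clip, or out-of-band message a real presenter would render.

```python
import hmac
import secrets
from dataclasses import dataclass
from typing import Dict

ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"  # avoids look-alike characters


@dataclass
class Challenge:
    challenge_id: str
    kind: str        # e.g. "text", "voice", "sms" (selectable technique)
    difficulty: int
    prompt: str      # what the presenter must render for the user


class ChallengeService:
    def __init__(self) -> None:
        self._answers: Dict[str, str] = {}

    def create(self, kind: str = "text", difficulty: int = 1) -> Challenge:
        """Step 1: create the test. Harder levels use longer answers (assumed rule)."""
        answer = "".join(secrets.choice(ALPHABET) for _ in range(4 + 2 * difficulty))
        challenge_id = secrets.token_urlsafe(16)
        self._answers[challenge_id] = answer
        return Challenge(challenge_id, kind, difficulty, prompt=answer)

    def validate(self, challenge_id: str, response: str) -> bool:
        """Step 3: validate the response. Each challenge can be answered once."""
        expected = self._answers.pop(challenge_id, None)
        if expected is None:
            return False
        given = response.strip().upper()
        # Constant-time comparison on bytes.
        return hmac.compare_digest(expected.encode("utf-8"), given.encode("utf-8"))


service = ChallengeService()
challenge = service.create(kind="text", difficulty=2)
# Step 2 (present) belongs to the caller: render challenge.prompt for the user.
print(service.validate(challenge.challenge_id, challenge.prompt))  # True
print(service.validate(challenge.challenge_id, challenge.prompt))  # False (already used)
```

Because callers only see the three steps, the challenge technique behind them can be changed or graded without touching the integrating property.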
  41. 41. New visual variants
      • Overlap Text
      • Background Clutter
      • Floating Screen: Demo
  42. 42. New CAPTCHA Challenges
      • 3D-Wave: Demo
      • OverlapTextWave: Demo
      • DelayedAnimation: Demo
  43. 43. Telephone Voice/SMS Challenge
      • Generate a phone call or text message
        › With a one-time numeric code
      • Why this is effective:
        › We check on phone numbers and exclude those available in bulk for abuse
        › We can watch for overuse
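A one-time numeric code with the two safeguards named on the slide (excluding known bulk numbers and watching for overuse) can be sketched as below. The class name, the six-digit code length, the expiry time, and the per-number send limit are assumptions; actual delivery by voice call or SMS is left to the caller.

```python
import hmac
import secrets
import time
from typing import Dict, FrozenSet, Optional, Tuple


class OneTimeCodeChallenge:
    def __init__(self, ttl_seconds: float = 300.0, max_sends_per_number: int = 3,
                 blocked_numbers: FrozenSet[str] = frozenset()) -> None:
        self.ttl_seconds = ttl_seconds
        self.max_sends = max_sends_per_number
        self.blocked = blocked_numbers           # numbers known to be sold in bulk
        self._sends: Dict[str, int] = {}
        self._pending: Dict[str, Tuple[str, float]] = {}

    def issue(self, phone: str, now: Optional[float] = None) -> Optional[str]:
        """Return a code to deliver, or None if the number is refused."""
        now = time.time() if now is None else now
        if phone in self.blocked:
            return None
        if self._sends.get(phone, 0) >= self.max_sends:  # overuse check
            return None
        self._sends[phone] = self._sends.get(phone, 0) + 1
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[phone] = (code, now + self.ttl_seconds)
        return code

    def verify(self, phone: str, code: str, now: Optional[float] = None) -> bool:
        """Each issued code allows one verification attempt before it is discarded."""
        now = time.time() if now is None else now
        entry = self._pending.pop(phone, None)
        if entry is None:
            return False
        expected, expires_at = entry
        if now > expires_at:
            return False
        return hmac.compare_digest(expected.encode("utf-8"), code.encode("utf-8"))


challenge = OneTimeCodeChallenge(blocked_numbers=frozenset({"+15550100"}))
code = challenge.issue("+15550123", now=0.0)
print(challenge.verify("+15550123", code, now=10.0))  # True
print(challenge.issue("+15550100", now=0.0))          # None (blocked number)
```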
  44. 44. Continuing challenges
      • New user acquisition
        › Ease of sign-up vs. challenge/validation friction
      • Anonymity vs. verifiable personal data
        › Users have “learned” to not provide real information
      • Use of activity data, building and using reputation
        › “I can’t believe you track this!”
      • Abuse/compromise mitigation in “free” vs. “at-risk” environments (e.g., banks)
      • Account/credentials compromise
        › Id/password overloading
        › Mobile devices and apps
        › Reverting to risky behavior
  45. 45. Shyam Mittur, Yahoo! Abuse Engineering