Data Driven Security, from Gartner Security Summit 2012

  1. 1. Copyright © 2012 Splunk, Inc. Data-Driven Security: Managing Risk at Etsy Nick Galbreath @ngalbreath Director of Engineering - Etsy Gartner Security Summit National Harbor, MD June 12, 2012
  2. 2. Agenda Who am I? Who is Etsy? Splunk at Etsy? Web Application Security Account Takeover Payments and PCI Credits, Data, Further Reading 2
  3. 3. @ngalbreath 3
  4. 4. Whois Nick Galbreath Director of Engineering at Etsy covering: – Fraud – Security – Support Engineering – (and other stuff outside of this talk) 2012-06-12 is my two year anniversary at Etsy. Software Development background in E-Commerce and Social Media. Books, Patents, Oh My… 4
  5. 5. $525,000,000 in community sales 875,000 active sellers 41MM unique visitors 15MM registered members 150 countries 5
  6. 6. What Could Possibly Go Wrong? • Marketplace Risk like Big Auction Site • Payment Risk like Payments Company • Social Risk like that Big Social Network With a member base frequently: • New to Etsy • New to Running a Business • New to the Internet Photo Credit: Rod Ramsey 6
  7. 7. To Make It More Interesting: Continuous Deployment On average, there are 50+ production code changes per day. So when we have a problem: Is it an operations problem? Is it a development problem? Is it a product problem causing complaints to come in? Or is it an attack? 7
  8. 8. Old Workflow: #notwinning • Logging into production network (!) • Finding the right file • Unzipping the right file • Grepping • Writing very clever scripts to extract data • Writing more clever scripts to merge data • Making a report – in plain text • Alerting (Callout: 34 minutes for one day’s log for nothing!) 8
  9. 9. Splunk installed at Etsy mid-2010 "Hey. .. let's go try this NEW thing!" (door slamming shut) "Sorry.... we're closed.” Steve Martin. Comedy is not pretty. 1979. Track 8 ~2:45 Serious New Technology Fatigue Why don’t we use a Real Database with SQL? Grep technology works &^#*&@^*#^%^ YAQL – Yet Another Query Language 9
  10. 10. L’Outrage Then a colleague: • Didn’t know Etsy’s stack (new) • Remote and out of office • Didn’t have production access • Didn’t know any of my very clever scripts • Not experienced with Splunk • In about 30 minutes whips up a real-time email alert for a velocity check on a particular URL I only have one thing to say about this….. 10
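A velocity check of the kind mentioned on this slide can be pictured as a sliding-window counter. The following is a minimal Python sketch of that idea, not the Splunk alert the colleague built; the class name, the limit, and the window length are illustrative assumptions.

```python
from collections import deque


class VelocityCheck:
    """Flag when more than `limit` hits arrive within `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.hits = deque()  # timestamps of recent hits, oldest first

    def record(self, timestamp):
        """Record one hit; return True if the limit is now exceeded."""
        self.hits.append(timestamp)
        # Drop hits that have fallen out of the sliding window.
        while self.hits and self.hits[0] <= timestamp - self.window_seconds:
            self.hits.popleft()
        return len(self.hits) > self.limit
```

Feeding it the request timestamps for one URL, in order, would raise the flag as soon as the request rate for that URL crosses the chosen limit.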
  11. 11. OH, YEAAHH! 400+GB indexed per day 30+ TB total storage 60+ data sources from “hundreds of servers” (via central syslog aggregation) 11
  12. 12. Data-Driven Security Three examples of how we use data and Splunk to help make Etsy a safer place to conduct business. • Web Application Security • Account Takeover • Payments and PCI That said, we are barely scratching the surface of Splunk! Data-Driven By Mat Edelson. Johns Hopkins Engineering Magazine, Fall 2011 Illustration by Mark McGinnis No association, just a great article & illustration 12
  13. 13. WebApp Security
  14. 14. Make Security Visible Your peers actually are interested in security. But are you letting them? Turn security from a binary event into a continuous event. 14
  15. 15. Detect the Steps A journey of a thousand miles begins with a single step. Lao-tzu, China 600BC A single breach begins with a journey of a thousand steps. Nick Galbreath, USA 2012AD 15
  16. 16. SQLi, XSS, CSRF source="info.log" log_name_space="SECURITY" attacktype="XSS" That was easy 16
  17. 17. SQLi, XSS, CSRF source="info.log" log_name_space="SECURITY" attacktype="SQL" | geoip ip Paints a different picture 17
  18. 18. The Dumbest Check Possible for SQLi We have some snazzy technology for detecting SQLi in Splunk, but you don’t need it to get started: source=access.log (uri="*UNION+ALL*" OR uri="*UNION%20ALL*") Will wildly undercount, but also has a low false-positive rate. Will detect scans from various tools. Will get you started in making security visible. 18
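The same "dumbest check" also works without Splunk. This is a minimal Python sketch, assuming the request URIs have already been pulled out of the access log; the function name is hypothetical, and matching case-insensitively is a choice of this sketch.

```python
import re

# The two spellings the search on this slide looks for:
# "UNION+ALL" and "UNION%20ALL".
UNION_ALL = re.compile(r"UNION(\+|%20)ALL", re.IGNORECASE)


def suspicious_uris(uris):
    """Return the URIs that contain an encoded "UNION ALL"."""
    return [uri for uri in uris if UNION_ALL.search(uri)]
```

Like the Splunk version, this undercounts on purpose: it only catches one very common probe pattern, which keeps false positives low.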
  19. 19. SQLi and Database Errors source="error.log" ( "syntax error" NOT "smarty" NOT "ClientLogger" ) | eval event=_raw | table event • We use Splunk to alert on any database syntax errors too. • SQLi attacks and probes will likely trigger a burst of syntax errors if code doesn’t properly sanitize data. (Callout: That was close.) Do the same with server 500 errors, core dumps 19
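The filter in this search (keep "syntax error", drop known-noisy sources) can be sketched in Python as follows. The function name is hypothetical, and the excluded terms are simply the two that appear in the search on this slide.

```python
# Terms excluded by the search on this slide; compared in lower case.
EXCLUDED_TERMS = ("smarty", "clientlogger")


def database_syntax_errors(lines):
    """Keep error-log lines that mention a syntax error and none of the excluded terms."""
    kept = []
    for line in lines:
        lowered = line.lower()
        if "syntax error" in lowered and not any(term in lowered for term in EXCLUDED_TERMS):
            kept.append(line)
    return kept
```

Any non-empty result is worth an alert, since application code that sanitizes its input should rarely produce database syntax errors.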
  20. 20. Investigating Rent-A-CPU Traffic source="access.log" | lookup datacenter-cidrs provider_cidr AS true_client_ip OUTPUTNEW provider_name | where isnotnull(provider_name) | top provider_name Public Data: See Appendix 20
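The CIDR lookup in this search maps each client IP to a hosting provider and then counts providers. Below is a rough Python equivalent using the standard `ipaddress` module. The CIDR table is made-up sample data (RFC 5737 documentation ranges with invented provider names), not the public datacenter list the slide refers to.

```python
import ipaddress
from collections import Counter

# Hypothetical sample ranges, standing in for a real datacenter CIDR list.
DATACENTER_CIDRS = {
    "192.0.2.0/24": "ExampleCloud",
    "198.51.100.0/24": "ExampleHosting",
}
NETWORKS = [(ipaddress.ip_network(cidr), name) for cidr, name in DATACENTER_CIDRS.items()]


def provider_for(ip):
    """Return the provider whose range contains `ip`, or None."""
    address = ipaddress.ip_address(ip)
    for network, name in NETWORKS:
        if address in network:
            return name
    return None


def top_providers(client_ips):
    """Count requests per provider, most frequent first, skipping unmatched IPs."""
    names = (provider_for(ip) for ip in client_ips)
    return Counter(name for name in names if name is not None).most_common()
```

A linear scan over the ranges is fine for a sketch; a large CIDR list would call for a prefix tree or sorted-interval lookup.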
  21. 21. SANS ISC 10K Sources source="access.log" | where isnotnull(true_client_ip) | lookup isc-bad-ips src_ip AS true_client_ip | where isnotnull(rank) | table true_client_ip, rank, reports, attacks, last_seen | stats count by true_client_ip,rank | sort rank Public Data: See Appendix 21
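This search is an exact-match join against a list of known-bad IPs, followed by a count and a sort by rank. A small Python sketch of that join is shown here; the function name and the shape of the `bad_ip_ranks` mapping (IP string to rank) are assumptions of the example.

```python
from collections import Counter


def bad_ip_hits(client_ips, bad_ip_ranks):
    """Return (ip, rank, request_count) for listed IPs, ordered by rank."""
    counts = Counter(ip for ip in client_ips if ip in bad_ip_ranks)
    rows = [(ip, bad_ip_ranks[ip], count) for ip, count in counts.items()]
    return sorted(rows, key=lambda row: row[1])
```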
  22. 22. Attacker-Driven Testing “I thought I found something but then it stopped working…” Email from an ethical hacker. Attacker-driven testing augments Etsy’s proactive security measures. Splunk alerts us on potential attacks using a number of parameters. What URLs are being targeted? Maybe they found something? Can it be reproduced? (sometimes completely automated validation) Fixes can be pushed out that day, if not within minutes. 22
  23. 23. Security Post-Mortems For any security vulnerability, found either externally or internally, exploited or not, we hold “blameless post-mortems”. Use them to teach about security issues, e.g. review the OWASP Top 10. Can we make it so this mistake doesn’t happen again or can be automatically detected? A key to a post-mortem is knowing when something started and when it ended. Logs “at your fingertips” via Splunk help greatly (and are absolutely essential for actual incidents) 23
  24. 24. Account Takeover
  25. 25. Account Takeover • Stolen credentials • Brute forcing of credentials • Using account takeover of email to further take over other accounts Horrible for the victim and really slow to clean up 25
  26. 26. Many Users Failing to Sign-in from One IP source="info.log" log_namespace="login" reason="wrong password" true_client_ip!=38.117.156.XXX | dedup etsy_username,true_client_ip | transaction true_client_ip | where eventcount > XXXX | table true_client_ip,etsy_username | geoip true_client_ip | table true_client_ip,true_client_ip_countryname,etsy_username 26
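The logic of this search is: deduplicate (username, IP) pairs, group by IP, and keep IPs where the number of distinct failing usernames exceeds a threshold. A minimal Python sketch of that grouping follows. The function name and the `(ip, username)` event shape are assumptions, and the threshold is left to the caller because the slide redacts it.

```python
from collections import defaultdict


def ips_with_many_failed_users(events, threshold, ignored_ips=frozenset()):
    """Map each IP to its failing usernames when their number exceeds `threshold`.

    `events` is an iterable of (ip, username) pairs from failed logins.
    """
    users_by_ip = defaultdict(set)
    for ip, username in events:
        if ip not in ignored_ips:
            users_by_ip[ip].add(username)  # the set does the dedup step
    return {
        ip: sorted(users)
        for ip, users in users_by_ip.items()
        if len(users) > threshold
    }
```

The `ignored_ips` argument plays the role of the `true_client_ip!=...` exclusion in the search, for example to skip a known office address.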
  27. 27. Brute Forcing Passwords? source="info.log" log_namespace="login" reason="wrong password" true_client_ip!=38.117.156.XXX | transaction etsy_username | where eventcount > XXXX | table etsy_username,true_client_ip,eventcount | sort -eventcount (Callout: People will try 100 passwords manually.) Frequency Buckets set in Splunk Dashboard 27
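Counting failures per username and sorting them into frequency buckets could look like the Python sketch below. The bucket boundaries and labels are invented for illustration; the slide does not say which buckets the Splunk dashboard uses.

```python
import bisect
from collections import Counter

BUCKET_EDGES = (10, 100, 1000)  # hypothetical boundaries
BUCKET_LABELS = ("1-9", "10-99", "100-999", "1000+")


def failure_buckets(failed_usernames):
    """Count how many accounts fall into each failed-login frequency bucket.

    `failed_usernames` holds one username per failed login event.
    """
    per_user = Counter(failed_usernames)
    buckets = Counter()
    for count in per_user.values():
        label = BUCKET_LABELS[bisect.bisect_right(BUCKET_EDGES, count)]
        buckets[label] += 1
    return dict(buckets)
```

Bucketing matters because, as the slide notes, real people will retry a surprising number of times by hand, so a single low threshold would be noisy.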
  28. 28. I Forgot My Password x1000 source="/web/access.log" request_uri=/forgot_password.php http_method=POST | transaction true_client_ip | where eventcount > XXX | table true_client_ip,eventcount | sort -eventcount (Callout: Hello from Serbia!) Not just fraud… has disclosed problems in email transport and product problems with our reset flow 28
  29. 29. Apply the same analysis to other things that should not change much – Payment cards – Email addresses – Passwords (successful change) – Regular physical addresses 29
  30. 30. CAPTCHA Splunk 2x2 dashboard keeps us in-the-know on how often CAPTCHAs are being shown, to whom, and how often they pass. reCAPTCHA 30
  31. 31. Integrated into Support Tools Splunk is glued into our internal tools used by General Support and MITS (Marketplace Integrity / Trust & Safety) teams. 31
  32. 32. Payments and PCI
  33. 33. Payments @ Etsy Ramping up on our own payments platform Full PCI Environment With separate Splunk installation This space intentionally left blank. 33
  34. 34. Alerting on Unusual Payment Activity All the WebApp security and account takeover rules apply, along with special checks for payment activity: • Abnormally large payments • Payment velocity • Very small payments (skimming?) • The usual IP address checks. Part of a larger payment risk solution. 34
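The payment checks listed on this slide are simple threshold rules. A Python sketch of such rules is given below; every threshold is a made-up placeholder, amounts are in integer cents to avoid floating-point money, and none of this reflects Etsy's actual rules.

```python
def payment_flags(amount_cents, recent_payment_count, *,
                  large_cents=100_000, small_cents=100, max_recent=5):
    """Return the names of the checks a payment trips (thresholds are placeholders)."""
    flags = []
    if amount_cents >= large_cents:
        flags.append("abnormally_large")
    if amount_cents <= small_cents:
        flags.append("very_small")  # possible card testing or skimming
    if recent_payment_count > max_recent:
        flags.append("high_velocity")
    return flags
```

Here `recent_payment_count` stands for the number of payments the same account or card made in some recent window, which would be computed separately.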
  35. 35. Compliance and Reporting Instead of building custom applications with fuzzy requirements “Log it, let Splunk figure it out later” Even the business guys can use it for ad-hoc queries. Unexpected side effect: removing and/or changing data is really hard. This is good. Compare to SQL. (Splunk also has a secure log system) Easy to make reports PCI QSA so far says this meets PCI requirements. 35
  36. 36. Internal Risk Again, instead of building out a new application (with fuzzy requirements): Log it, Splunk it later. Who is making what changes? Who is looking at potentially sensitive data? And alert on it. Used in payments and main support applications. (Images: Etsy Support and MITS 2012; 100% Good Eggs Team Etsy 2012) 36
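"Log it, Splunk it later" works best when internal tools emit audit events as key="value" pairs, which Splunk can extract as fields at search time. A minimal Python sketch of such a log line is shown here; the function and field names are hypothetical, and values containing double quotes are not escaped in this sketch.

```python
def audit_line(actor, action, target, **fields):
    """Format an audit event as space-separated key="value" pairs."""
    pairs = {"actor": actor, "action": action, "target": target, **fields}
    return " ".join(f'{key}="{value}"' for key, value in pairs.items())
```

For example, `audit_line("admin42", "view", "payment_card", member_id=1001)` records who looked at which kind of sensitive data, which is the raw material for the alerts described on this slide.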
  37. 37. Credits
  38. 38. Acknowledgements This presentation would not be possible without the hard work by: Marcus Barczak Jerry Soung Zane Lackey Operations Fraud and Risk Security Engineering Engineering Big thanks to everyone at Etsy in Engineering, Payments, Operations, Support and MITS And of course, the fine folks at Splunk! 38
  39. 39. Data and References Datacenter IP List: ISC Top Troublemaker IPs: On Security and Continuous Deployment: Other presentations on Etsy and Security/Fraud/DevOps: 39
  40. 40. Security Engineering and “Just Culture” Treating security mistakes as “accidents” (whether exploited or not) Based originally on health care initiatives Patient Safety and “Just Culture”, David Marx JD – – (presentation) John Allspaw on Blameless Post-Mortems: 40
  41. 41. It’s time for questions! Nick Galbreath @ngalbreath http://slidesha.re/KPvHYu