
Leveling the playing field


A talk about scaling web security

Published in: Technology

  1. Leveling the Playing Field — Aaron Bedra, Chief Security Officer, Eligible — @abedra — keybase.io/abedra
  2. Right now, your web applications are being attacked
  3. And it will happen again, and again, and again
  4. As you grow, so will the target on you
  5. Keeping up with security is difficult
  6. Actually, it’s unfair
  7. Things you have to get right vs. things the attacker has to get right
  8. Time the attacker has to focus on you vs. time you have to focus on the attacker
  9. It’s asymmetric warfare
  10. There’s no way to manually keep up
  11. Manual → Automated → Intelligent
  12. Scaling your defenses means strategic automation
  13. STOP!
  14. Let’s talk about the problem we are solving for a minute
  15. Problems
      • We don’t know what people are doing
      • We don’t know how often they are doing it
      • We don’t know how effective we are
      • We don’t have enough resources to keep up
  16. Goals
      • Reduce noise
      • Generate better signal
      • Reduce operational overhead
      • Build better business cases
      • Spend energy on the really important stuff
  17. Reducing Noise
  18. It starts with really simple stuff
  19. Tie up the loose ends with static configuration
  20. Static configuration checklist
      • At least a B+ rating on SSL Labs
      • Reject extensions that you don’t want to accept
      • Reject known bad user agents
      • Reject specific known bad actors
      • Custom error pages that fit your application
      • Basic secure headers
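The reject rules in the checklist can also live in code; a minimal sketch, assuming a Go application where a middleware drops requests matching static rules before they hit the handler (the agent and extension lists here are made-up examples, not from the talk):

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// Illustrative lists only; in practice these come from your own intelligence.
var badAgents = []string{"sqlmap", "nikto", "masscan"}
var badExtensions = []string{".php", ".asp", ".cgi"} // e.g. on an app that serves none of these

// shouldReject applies the static rules: known bad user agents and
// extensions the application never serves.
func shouldReject(path, userAgent string) bool {
	ua := strings.ToLower(userAgent)
	for _, agent := range badAgents {
		if strings.Contains(ua, agent) {
			return true
		}
	}
	for _, ext := range badExtensions {
		if strings.HasSuffix(path, ext) {
			return true
		}
	}
	return false
}

// rejectKnownBad wraps any handler with the static checks.
func rejectKnownBad(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if shouldReject(r.URL.Path, r.UserAgent()) {
			http.Error(w, "Forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	fmt.Println(shouldReject("/wp-admin/index.php", "Mozilla/5.0")) // true
	fmt.Println(shouldReject("/login", "sqlmap/1.7"))               // true
	fmt.Println(shouldReject("/login", "Mozilla/5.0"))              // false
}
```

The same rules are often cheaper to express in web-server configuration, which is the point of this section; the code form is useful when you don’t control the server config.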
  21. You’ll be surprised how well this works
  22. It has a fringe benefit of creating better awareness
  23. You can feed this back to your intelligence
  24. Reducing Operational Overhead
  25. Dealing with malicious actors has to be easy
  26. It shouldn’t require deploys, reloads, or any potential forward impact
  27. Let’s talk about how to create something that will help
  28. Step 1: Put everything in one place!
  29. Centralization of events is critical
  30. If you can’t see it, it didn’t happen
  31. There are options
  32. Log aggregation and a query engine
  33. The query engine can serve as your discovery agent
  34. A nice first step
  35. But it will eventually fall over
  36. That’s when you reach for a messaging system
  37. Log to topics in a queue
  38. Create processors to understand events
  39. Step 2: Process Events
  40. For every event type, you will need to understand how to process it
  41. Structured logging can help, but it doesn’t fit everywhere
  42. The goal is to accept an event and return consumable details
  43. type logEntry struct {
          Address      string
          Method       string
          Uri          string
          ResponseCode string
      }

      func processEntry(entry string) logEntry {
          parts := strings.Split(entry, " ")
          return logEntry{
              Address:      parts[0],
              Method:       strings.Replace(parts[5], "\"", "", 1),
              Uri:          parts[6],
              ResponseCode: parts[8],
          }
      }
  44. You will likely have multiple processors
  45. Split topics by event type or application
  46. Once you have the data accessible, figure out what happened
  47. Track everything!
      • HTTP method
      • Time since last request/average requests per second
      • Failed responses
      • Failure of intended action (e.g. login, add credit card, edit, etc.)
      • Anything noteworthy
  48. type Actor struct {
          Methods         map[string]int
          FailedLogins    int
          FailedResponses map[string]int
      }

      func updateEvents(event logEntry, counts map[string]*Actor) {
          actor, ok := counts[event.Address]
          if !ok {
              actor = &Actor{Methods: map[string]int{}, FailedResponses: map[string]int{}}
              counts[event.Address] = actor
          }
          actor.Methods[event.Method]++
          if event.ResponseCode != "200" && event.ResponseCode != "302" {
              actor.FailedResponses[event.ResponseCode]++
          }
          // A login POST that returns 200 re-rendered the form instead of
          // redirecting (302), which means the login failed
          if event.Method == "POST" && event.ResponseCode == "200" {
              actor.FailedLogins++
          }
      }
  49. Once you have things in one place, it’s all about counting
  50. Simple counts with thresholds go a long way
  51. Step 3: Thresholds, Patterns, and Deviations
  52. Exceeding a count is a signal that something needs to be done
  53. There are a lot of signals that could be malicious
  54. You can start with simple thresholds
      • Too many failed logins
      • Too many bad response codes (4xx, 5xx)
      • Request volume too high
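Given per-actor counters like the Actor struct above, the threshold checks reduce to a few comparisons; a minimal sketch, where the limits (10 failed logins, 50 bad responses, 1000 requests) are placeholder numbers to tune against your own traffic:

```go
package main

import "fmt"

type Actor struct {
	Methods         map[string]int
	FailedLogins    int
	FailedResponses map[string]int
}

// exceedsThresholds flags an actor once any simple counter passes
// its limit. The limits here are illustrative, not recommendations.
func exceedsThresholds(a Actor) bool {
	totalRequests := 0
	for _, n := range a.Methods {
		totalRequests += n
	}
	failedResponses := 0
	for _, n := range a.FailedResponses {
		failedResponses += n
	}
	return a.FailedLogins > 10 ||
		failedResponses > 50 ||
		totalRequests > 1000
}

func main() {
	bot := Actor{
		Methods:         map[string]int{"POST": 1500},
		FailedLogins:    42,
		FailedResponses: map[string]int{"403": 90},
	}
	human := Actor{
		Methods:         map[string]int{"GET": 30, "POST": 2},
		FailedLogins:    1,
		FailedResponses: map[string]int{"404": 1},
	}
	fmt.Println(exceedsThresholds(bot))   // true
	fmt.Println(exceedsThresholds(human)) // false
}
```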
  55. These provide a lot of signal
  56. But they don’t get you all the way there
  57. There are patterns of behavior that signal malicious intent
  58. Example
  59. 10.20.253.8 - - [23/Apr/2013:14:20:21 +0000] "POST /login HTTP/1.1" 200 267 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
  60. 10.20.253.8 - - [23/Apr/2013:14:20:22 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
  61. 10.20.253.8 - - [23/Apr/2013:14:20:23 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2083 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
  62. 10.20.253.8 - - [23/Apr/2013:14:20:24 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
  63. That was a carding attack
  64. As you dig in, you will find many patterns like these
  65. But again, it doesn’t cover everything
  66. There will also be interesting deviations
  67. [Pie chart: share of requests by HTTP method (GET, POST, HEAD, PUT, DELETE; segments of 59%, 27%, 5%, 5%, 4%)]
  68. Deviations in normal flow are interesting, but not necessarily malicious
  69. You will have to build more intelligent processing to understand them
  70. Example
  71. A password reset request comes from a new location
  72. Is it a harmless request or an account takeover?
  73. Your processors will have to make complicated choices based on lots of information
  74. Nailing deviation requires the largest amount of effort
  75. Step 4: Act
  76. Once you have enough information to make a decision, you must act
  77. There are multiple ways to act
      • Blacklist
      • Whitelist
      • Mark
      • Do nothing
  78. Blacklist and whitelist are pretty straightforward
  79. Blacklist when thresholds are exceeded or patterns/deviations fit
  80. Whitelist things you never want to be blacklisted
  81. Marking is more interesting
  82. Marking allows you to tag actors as potentially malicious
  83. This allows you to dynamically modify your responses
  84. And choose how you react
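One way to sketch marking: a status table consulted on the request path, where marked actors get a degraded or decoy response rather than an outright block. The statuses and response strings below are illustrative, not from the talk:

```go
package main

import "fmt"

type Status int

const (
	OK Status = iota
	Marked
	Blacklisted
)

// respondFor chooses a response strategy from an actor's status.
// Marked actors receive a harmless decoy success, so a bot keeps
// behaving like a bot and confirms itself, while a false positive
// costs a real user nothing.
func respondFor(statuses map[string]Status, address string) string {
	switch statuses[address] {
	case Blacklisted:
		return "403 Forbidden"
	case Marked:
		return "200 OK (decoy response)"
	default:
		return "200 OK"
	}
}

func main() {
	statuses := map[string]Status{
		"77.77.165.233": Marked,
		"10.0.0.66":     Blacklisted,
	}
	fmt.Println(respondFor(statuses, "77.77.165.233")) // 200 OK (decoy response)
	fmt.Println(respondFor(statuses, "10.0.0.66"))     // 403 Forbidden
	fmt.Println(respondFor(statuses, "1.2.3.4"))       // 200 OK
}
```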
  85. “Of course machines can’t think as people do. A machine is different from a person. Hence, they think differently.” -- Alan Turing, The Imitation Game
  86. You can often render bots useless with small changes
  87. Which exposes them as bots
  88. And gives you the confidence you need to blacklist them
  89. Marking also helps you lower the rate of false positives
  90. Step 5: Visualize
  91. Visualization is incredibly helpful
  92. You need a window into your automation
  93. Spending a few minutes a day looking at what happened is vital
  94. You can pretty easily catch bugs this way
  95. Architecture & Performance
  96. There are three main ideas
      • The thing that acts on actors
      • The shared cache
      • The event processors
  97. Acting on actors should be fast
  98. Fast in a web request is single-digit milliseconds
  99. You can choose to embed this in your applications or your web servers
  100. Data locality is important
  101. It usually involves replicating the global cache to each decision point
  102. The cache should hold everything needed to act on actors
  103. The web server asks the cache what to do
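A minimal sketch of that decision point, assuming the global cache has already been replicated into a local in-memory structure so the lookup stays well inside the single-digit-millisecond budget. The whitelist-wins precedence is a design choice that keeps a bad automated decision from ever locking out a known-good actor:

```go
package main

import "fmt"

type Decision string

const (
	Allow Decision = "allow"
	Block Decision = "block"
)

// localCache is a per-node replica of the shared actor cache.
type localCache struct {
	whitelist map[string]bool
	blacklist map[string]bool
}

// decide answers the only question the web server asks: let this
// address through or not. Whitelist entries take precedence.
func (c *localCache) decide(address string) Decision {
	if c.whitelist[address] {
		return Allow
	}
	if c.blacklist[address] {
		return Block
	}
	return Allow
}

func main() {
	cache := &localCache{
		whitelist: map[string]bool{"10.1.1.1": true},
		blacklist: map[string]bool{"10.1.1.1": true, "77.77.165.233": true},
	}
	fmt.Println(cache.decide("10.1.1.1"))      // allow (whitelist wins)
	fmt.Println(cache.decide("77.77.165.233")) // block
	fmt.Println(cache.decide("1.2.3.4"))       // allow
}
```

The out-of-band processors write into the shared cache; the replicas only read, which is what keeps the hot path fast.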
  104. The event processors work out of band
  105. Their sole purpose is to populate the cache
  106. Processors tend to be more custom
  107. But the cache and the acting logic are common
  108. github.com/repsheet
  109. Pitfalls
  110. Things to consider
       • False positives
       • Decision latency
       • Incorrect modeling
       • Bad data
       • Monitoring
  111. There’s a good chance you will block incorrectly
  112. Make use of whitelisting
  113. Mobile carriers will be a problem
  114. So will NATed IP addresses
  115. Time to decision should be monitored
  116. Create a solid regression suite
  117. Run all your models through it when you make even a single change
  118. Understand where bad data can impact you
  119. Build tolerance of bad data so you don’t make incorrect decisions
  120. Monitor everything!
  121. This type of automation deserves every monitor and metric you can get
  122. Questions?
