Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Security precognition chaos engineering in incident response


Published on

Source : RSA Conference

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Security precognition chaos engineering in incident response

  1. 1. SESSION ID: #RSAC Aaron Rinehart Security Precognition: Chaos Engineering in Incident Response ASD-W03 Chief Technology Officer @aaronrinehart Kyle Erickson Director of IoT Security Medtronic
  2. 2. #RSAC @aaronrinehart “Resilience is the story of the outage that never happened.” - John Allspaw
  3. 3. #RSAC Security PreCognition
  4. 4. #RSAC 4 About A.A.Ron ● CTO of Stealthy Startup in Chaos Engineering ● Former Chief Security Architect @UnitedHealth responsible for strategy ● Led the DevOps and Open Source Transformation at UnitedHealth Group ● Extensive enterprise experience (DOD, NASA, DHS, CollegeBoard ) ● Frequent speaker and author on Chaos Engineering & Security ● Pioneer behind Security Chaos Engineering ● Led ChaoSlingr team at UnitedHealth
  5. 5. #RSAC 5 About Kyle E> ● Cybersecurity Director of IOT Security @ Largest Medical Device Company in the World ● Former Director of Security Incident Response @UnitedHealth ● Former Director of Cloud and Application Security Engineering @Optum ● Startup and enterprise experience as an engineer and leader ● Build and break all the things! ● Wannabe financial analyst ● Baseball stadium aficionado
  6. 6. #RSAC In this Session we will cover
  7. 7. #RSAC Takeaways ★Problems with Complex Adaptive Systems
  8. 8. #RSAC Takeaways ★Problems with Complex Adaptive Systems ★Security Chaos Engineering
  9. 9. #RSAC Takeaways ★Problems with Complex Adaptive Systems ★Security Chaos Engineering ★How it works: Security IR Use Case
  10. 10. #RSAC Our systems have evolved beyond human ability to mentally model their behavior. 10
  11. 11. #RSAC A Complex Dynamic Problem
  12. 12. #RSAC Evolution of Modern Architecture Monolith Microservices Functions
  13. 13. #RSAC Circuit Breaker Patterns 13 Continuous Delivery Distributed Systems Blue/Green Deployments Cloud Computing Service Mesh Containers Immutable InfrastructureInfracode Continuous Integration Microservice Architectures API Auto Canaries CI/CD DevOps Automation Pipelines Complex?
  14. 14. #RSAC 14 Mostly Monolithic Requires Domain Knowledge Prevention focused Poorly Aligned Defense in Depth Stateful in nature DevSecOps not widely adopted Security? Expert Systems Adversary Focused
  15. 15. #RSAC Simplicity?
  16. 16. #RSAC Complexity
  17. 17. #RSAC How well do you really understand how your system works?
  18. 18. #RSAC Stella Report
  19. 19. #RSAC System Mental Model
  20. 20. #RSAC So what does all of this have to do with Security?
  21. 21. #RSAC Failure Happens.
  22. 22. #RSAC Security Incidents & System Outages are Costly
  23. 23. #RSAC Security Incidents are Subjective in Nature
  24. 24. #RSAC We really don't know Where? Why? Who? What?How? very much
  25. 25. #RSAC Let’s face it, when outages happen…..
  26. 26. #RSAC Teams spend too much time reacting to outages instead of building more resilient systems.
  27. 27. #RSAC Unexpected application behavior often causes people to intervene and make the situation worse I wonder why it did that? Let’s reboot it. Whoops! Now it’s really hosed
  28. 28. #RSAC “Response” is the problem with Incident Response
  29. 29. #RSAC Ring Ring!
  30. 30. #RSAC Conditioned?
  31. 31. #RSAC War Rooms
  32. 32. #RSAC True Cost
  33. 33. #RSAC IR Success
  34. 34. #RSAC “Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s ability to withstand turbulent conditions”
  35. 35. #RSAC Who is doing Chaos?
  36. 36. #RSAC Security Engineering
  37. 37. #RSAC People Operate Differently when they expect things to fail
  38. 38. #RSAC
  39. 39. #RSAC
  40. 40. #RSAC The Normal Condition of a Human & Systems they Build is to Fail
  41. 41. #RSAC We need failure to learn & Grow 46
  42. 42. #RSAC Let’s Flip the Model Post Mortem = Preparation
  43. 43. #RSAC Create Order through Chaos
  44. 44. #RSAC Use Chaos Engineering to initiate Objective Feedback Loops about Security Effectiveness
  45. 45. #RSAC Proactively Manage & Measure Validate Runbooks Measure Team Skills Determine Control Effectiveness Learn new insights into system behavior Transfer knowledge Build a learning culture
  46. 46. #RSAC Testing vs. Experimentation
  47. 47. #RSAC Security Crayon Differences Noisy distributed system behavior Not geared for Cascading Events Point-in-time even if Automated Performed by Security Teams with Specialized skill sets
  48. 48. #RSAC Security Chaos Differences Distributed Systems Focus Goal: Experimentation Human Factors focused Small Isolated Scope Focus on Cascading Events Performed by Mixed Engineering Teams in Gameday During business hours
  49. 49. #RSAC 2018 Causes of Data Breaches
  50. 50. #RSAC Proactively Manage & Measure
  51. 51. #RSAC Incidents vs. Chaos Uncontrolled Controlled Unpredictable Scheduled Time to Detect: Minutes 0 Time to Detect Time to Resolve: ???? Time to Resolve: seconds Analysis Time: ???? Root Cause Analysis: Intentional
  52. 52. #RSAC Continuous SECURITY Validation
  53. 53. #RSAC Build Confidence in What Actually Works
  54. 54. #RSAC So how does it work?
  55. 55. #RSAC How it Works Plan & Organize GameDay Exercise
  56. 56. #RSAC How it Works Plan & Organize GameDay Exercise Chaos Experiment Develop & Evaluate
  57. 57. #RSAC Infrastructure Switching Application People Tool s Tool sChaos Engineering Team Security
  58. 58. #RSAC What could go wrong?
  59. 59. #RSAC What has gone wrong before?
  60. 60. #RSAC How does My Security Really Work?
  61. 61. #RSAC What evidence do I have to prove it?
  62. 62. #RSAC How it Works Plan & Organize GameDay Exercise Execute Live GameDay Operations Chaos Experiment Develop & Evaluate
  63. 63. #RSAC How it Works Plan & Organize GameDay Exercise Execute Live GameDay Operations Chaos Experiment Develop & Evaluate Conduct Pre- Incident Review
  64. 64. #RSAC How it Works Plan & Organize GameDay Exercise Execute Live GameDay Operations Automate & Evangelize Results & Take Action Chaos Experiment Develop & Evaluate Conduct Pre- Incident Review
  65. 65. #RSAC
  66. 66. Hypothesis: If someone accidentally or maliciously introduced a misconfigured port then we would immediately detect, block, and alert on the event. Alert SOC? Config Mgmt? Misconfigured Port Injection IR Triage Log data? Wait... Firewall?
  67. 67. Result: Hypothesis disproved. Firewall did not detect or block the change on all instances. Standard Port AAA security policy out of sync on the Portal Team instances. Port change did not trigger an alert and log data indicated successful change audit. However we unexpectedly learned the configuration mgmt tool caught change and alerted the SoC. Alert SOC? Config Mgmt? Misconfigured Port Injection IR Triage Log data? Wait... Firewall?
  68. 68. #RSAC More Experiment Examples ● Internet exposed Kubernetes API ● Unauthorized Bad Container Repo ● Unencrypted S3 Bucket ● Disable MFA ● Bad AWS Automated Block Rule ● Software Secret Clear Text Disclosure ● Permission collision in Shared IAM Role Policy ● Disabled Service Event Logging ● Introduce Latency on Security Controls ● API Gateway Shutdown
  69. 69. #RSAC 74 An Open Source Tool
  70. 70. #RSAC • ChatOps Integration • Configuration-as-Code • Example Code & Open Framework ChaoSlingr Product Features • Serverless App in AWS • 100% Native AWS • Configurable Operational Mode & Frequency • Opt-In | Opt-Out Model
  71. 71. #RSAC
  72. 72. #RSAC Apply What You Have Learned Today 77 • Next week you should: – Identify critical database(s) within your organization • In the first three months following this presentation you should: – Understand who is accessing the database(s), from where and why – Define appropriate controls for the database • Within six months you should: – Select a security system which allows proactive policy to be set according to your organization’s needs – Drive an implementation project to protect all critical databases
  73. 73. Q&A @aaronrinehart @ericksonky
  74. 74. Thank you RSA!!!! @aaronrinehart @ericksonky