This document provides an overview of a presentation on chaos engineering and security chaos engineering. The presentation covers UnitedHealth Group's journey to rugged DevOps, combating complexity in software, and approaches to chaos engineering and security chaos engineering. Specific topics include automated security configuration and validation using Chef and InSpec, automated vulnerability scanning with Gauntlt, lessons learned from DevOps transformations, and examples of chaos engineering experiments and game days.
OWASP AppSec Global 2019 Security & Chaos Engineering, by Aaron Rinehart
Security today is customarily a reactive and chaotic exercise.
In this session, we will introduce a new concept known as Security Chaos Engineering and how it can be applied to create highly secure, performant, and resilient distributed systems.
Blameless Retrospectives in DevSecOps (at Global Healthcare Giants), by DJ Schleen
Join us at Agile+DevOps East's DevSecOps Summit on November 18th to check out our new presentation: https://agiledevopseast.techwell.com/program/devsecops-summit-sessions/blameless-retrospectives-devsecops-global-healthcare-giants-agile-devops-virtual-2020
HealthConDX Virtual Summit 2021 - How Security Chaos Engineering is Changing ..., by Aaron Rinehart
The complex ordeal of delivering secure and reliable software in Healthcare will continue to become exponentially more difficult unless we begin approaching the craft differently.
Enter Chaos Engineering, but now also for security. Instead of focusing on resilience against service disruptions, the focus is to identify the truth behind the current state of our security and determine what “normal” operations actually look like when put to the test.
The speed, scale, and complex operations within modern systems make it tremendously difficult for humans to mentally model their behavior. Security Chaos Engineering is an emerging practice that is helping engineers and security professionals realign the actual state of operational security and build confidence that it works the way it was intended to.
Join Aaron Rinehart to learn how he implemented Security Chaos Engineering as a practice at the world’s largest healthcare company to proactively discover system weaknesses before they were taken advantage of by malicious adversaries. In this session Aaron will share his experience of applying Security Chaos Engineering to create highly secure, performant, and resilient distributed systems.
RSA Conference APJ 2019 DevSecOps Days Security Chaos Engineering, by Aaron Rinehart
Distributed systems at scale have unpredictable and complex outcomes that are costly when security incidents occur. The speed, scale, and complex operations within microservice architectures make it tremendously difficult for humans to mentally model their behavior. If that is even remotely true, how is it possible to adequately secure services that are not even fully comprehended by the engineering teams that built them? How do we realign the actual state of operational security measures to maintain an acceptable level of confidence that our security actually works? Security Chaos Engineering allows teams to proactively and safely discover system weaknesses before they disrupt business outcomes.
In this session Aaron will uncover the importance of using Chaos Engineering in developing a learning culture in a DevSecOps world. Aaron will walk us through how to get started with Chaos Engineering for security and how it can be practically applied to enhance system performance, resilience and security.
Security-focused Chaos Engineering allows engineering teams to derive new information about the state of security within their distributed systems that was previously unknown. This new technique of instrumentation proactively injects turbulent security conditions or faults into our systems to determine the conditions under which our security will fail, so that we can fix it before it causes customer pain.
During this session we will cover some key concepts in Safety & Resilience Engineering and how new techniques such as Chaos Engineering are making a difference in improving our ability to learn from incidents proactively before they become destructive.
ChaoSlingr: Introducing Security based Chaos Testing, by Aaron Rinehart
ChaoSlingr is a Security Chaos Engineering tool focused primarily on experimentation on AWS infrastructure to bring system security weaknesses to the forefront.
The industry has traditionally emphasized preventative security controls and defense-in-depth, whereas our mission is to drive new knowledge and perspective into the attack surface proactively through detective experimentation. With so much focus on preventative mechanisms, we never attempt, beyond one-time or annual pen testing requirements, to validate whether those controls are actually performing as designed.
Our mission is to address security weaknesses proactively, going beyond the reactive processes that currently dominate traditional security models.
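The detective experimentation described above can be illustrated with a minimal sketch. It is not ChaoSlingr's code and makes no AWS calls: `SecurityGroup`, `detect_drift`, and `run_experiment` are hypothetical names, and the firewall is an in-memory stand-in. The sketch assumes a simple experiment shape: verify a steady state, inject one misconfiguration (an unapproved open port), check whether a detective control notices it, and always roll back.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    """Hypothetical in-memory stand-in for a cloud firewall rule set."""
    name: str
    open_ports: set = field(default_factory=set)


def detect_drift(group, approved_ports):
    """Toy detective control: list ports that are open but not approved."""
    return sorted(group.open_ports - approved_ports)


def run_experiment(group, approved_ports, fault_port):
    """Inject one misconfiguration, observe the control, always roll back."""
    baseline = set(group.open_ports)
    # Steady-state hypothesis: nothing unapproved is open before we start.
    assert detect_drift(group, approved_ports) == []
    try:
        group.open_ports.add(fault_port)  # inject the security fault
        findings = detect_drift(group, approved_ports)
    finally:
        group.open_ports = set(baseline)  # restore the original state
    return {
        "injected": fault_port,
        "detected": fault_port in findings,
        "restored": group.open_ports == baseline,
    }


web = SecurityGroup("web", {443})
result = run_experiment(web, approved_ports={443}, fault_port=2222)
print(result)
```

In a real environment the interesting outcome is a `detected` value of `False`: that is the gap between how a control is assumed to behave and how it actually behaves.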
Modern systems pose a number of thorny challenges, and securing the transformation from legacy monolithic systems to distributed systems demands a change in mindset and engineering toolkit. The security engineering toolkit is unfortunately outdated relative to today's approach to building, securing, and operating distributed systems.
Distributed systems at scale have unpredictable and complex outcomes that are costly when security incidents occur. The speed, scale, and complex operations within microservice architectures make it tremendously difficult for humans to mentally model their behavior. If that is even remotely true, how is it possible to adequately secure services that are not even fully comprehended by the engineering teams that built them? How do we realign the actual state of operational security measures to maintain an acceptable level of confidence that our security actually works?
Navigating the Unknowable: Resilience through Security Chaos Engineering
When applied to Cyber Security, Chaos Engineering is advancing our ability to reveal objective information about the effectiveness of operational security measures proactively through empirical experimentation. In this session we will introduce the core concepts behind this new technique and how you can get started in building and applying it.
Chaos Engineering - The Art of Breaking Things in Production, by Keet Sugathadasa
This is an introduction to Chaos Engineering, the art of breaking things in production. It is presented by two Site Reliability Engineers, who explain the concepts, history, and principles, along with a demonstration of Chaos Engineering.
The technical talk is given in this video: https://youtu.be/GMwtQYFlojU
Security incident response is a reactive and chaotic exercise. What if it were possible to flip the scenario on its head? Security focused chaos engineering takes the approach of advancing the security incident response apparatus by reversing the postmortem and preparation phases. Contrary to Purple Team or Red Team game days, Security Chaos Engineering does not use threat actor tactics, techniques and procedures. It develops teams through unique configuration, cyber threat and user error scenarios that challenge responders to react to events outside their playbooks and comfort zones.
Security Chaos Engineering allows incident response and product teams to derive new information about the state of security within their distributed systems that was previously unknown. Within this new paradigm of instrumentation, where we proactively conduct “Pre-Incident” rather than “Post-Incident” reviews, we are now able to more accurately measure how effective our security incident response teams, tools, skills, and procedures are amid the mania of the incident response function.
In this session Aaron Rinehart, the mind behind the first Open Source Security Chaos Engineering tool ChaoSlingr, will introduce how Security Chaos Engineering can be applied to create highly secure, performant, and resilient distributed systems.
Chaos engineering for cloud native security, by Kennedy
Human errors and misconfiguration-based vulnerabilities have become a major cause of data breaches and other forms of security attacks in cloud-native infrastructure (CNI). The dynamic and complex nature of CNI and the underlying distributed systems further complicate these challenges. Hence, novel security mechanisms are imperative to overcome these challenges. Such mechanisms must be customer-centric and continuous, and not focused on traditional security paradigms like intrusion detection. We tackle these security challenges via Risk-driven Fault Injection (RDFI), a novel application of cyber security to chaos engineering. Chaos engineering concepts (e.g. Netflix’s Chaos Monkey) have become popular since they increase confidence in distributed systems by injecting non-malicious faults (essentially addressing availability concerns) via experimentation techniques. RDFI goes further by adopting security-focused approaches, injecting security faults that trigger security failures which impact integrity, confidentiality, and availability. Safety measures are also employed such that impacted environments can be reversed to secure states. Therefore, RDFI improves security and resilience drastically, in a continuous and efficient manner, and extends the benefits of chaos engineering to cyber security. We have researched and implemented a proof-of-concept for RDFI that targets multi-cloud enterprise environments deployed on AWS and Google Cloud Platform.
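As a rough illustration of the two ideas in this abstract, risk-driven fault selection and reversal to a secure state, here is a minimal sketch. It is not the RDFI proof-of-concept: the fault catalog, the risk scores, and the names `FAULTS`, `pick_fault`, and `inject_and_restore` are all assumptions made for the example, and the "cloud" is a plain dictionary.

```python
import copy
import random

# Hypothetical fault catalog: name -> (assumed risk score, state mutation).
FAULTS = {
    "make_bucket_public": (0.9, lambda s: s["bucket"].update(public=True)),
    "disable_logging": (0.6, lambda s: s["logging"].update(enabled=False)),
    "widen_iam_policy": (0.8, lambda s: s["iam"].update(actions=["*"])),
}


def pick_fault(rng):
    """Choose a fault with probability proportional to its risk score."""
    names = list(FAULTS)
    weights = [FAULTS[name][0] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def inject_and_restore(state, fault_name):
    """Apply a fault, report whether state changed, then restore the snapshot."""
    snapshot = copy.deepcopy(state)  # safety measure: remember the secure state
    FAULTS[fault_name][1](state)     # inject the security fault
    changed = state != snapshot
    state.clear()
    state.update(snapshot)           # reverse the environment to the secure state
    return changed


secure_state = {
    "bucket": {"public": False},
    "logging": {"enabled": True},
    "iam": {"actions": ["s3:GetObject"]},
}
expected = copy.deepcopy(secure_state)

chosen = pick_fault(random.Random(7))
print(chosen, inject_and_restore(secure_state, chosen))
```

Weighting by risk biases experiments toward the faults assumed to matter most, while the snapshot guarantees that each run leaves the environment as it found it.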
Maturing DevSecOps: From Easy to High Impact, by SBWebinars
Digital Transformation and DevSecOps are the buzzwords du jour. Increasingly, organizations embrace the notion that if you implement DevOps, you must transform security as well. Failing to do so would either leave you insecure or make your security controls negate the speed you aimed to achieve in the first place.
So doing DevSecOps is good... but what does it actually mean? This talk unravels what it looks like with practical, good (and bad) examples of companies who are:
Securing DevOps technologies - by either adapting or building new solutions that address the new security concerns
Securing DevOps methodologies - changing when and how security controls interact with the application and the development process
Adapting to a DevOps philosophy of shared ownership for security
In the end, you'll have the tools you need to plan your interpretation of DevSecOps, choose the practices and tooling you need to support it, and ensure that Security leadership is playing an important role in making it a real thing in your organization.
The reactionary state of the industry means that we quickly identify the ‘root cause’ in terms of ‘human-error’ as an object to attribute and shift blame. Hindsight bias often confuses our personal narrative with truth, which is an objective fact that we as investigators can never fully know. The poor state of self-reflection, human factors knowledge, and the nature of resource constraints further incentivize this vicious pattern. This approach results in unnecessary and unhelpful assignment of blame, isolation of the engineers involved, and ultimately a culture of fear throughout the organization. Mistakes will always happen.
Rather than failing fast and encouraging experimentation, the traditional process often discourages creativity and kills innovation. By simply reacting to failures, the security industry has been overlooking valuable chances to further understand and nurture ‘accidents’ or ‘mistakes’ as opportunities to proactively strengthen system resilience. Expose the failures, build resilient systems, and develop an "Applied security" model to minimize the impact of failures. In this session we will discuss the role of ‘human-error’, root cause, and resilience engineering in our industry and how we can use new techniques such as Chaos Engineering to make a difference.
Security-focused Chaos Engineering proposes that the only way to understand this uncertainty is to confront it objectively by introducing controlled signals. During this session we will cover some key concepts in Safety & Resilience Engineering work based on Sidney Dekker’s 30 years of research into airline accident investigations and how new techniques such as Chaos Engineering are making a difference in improving our ability to learn from incidents proactively before they become destructive.
40 DevSecOps Reference Architectures for you. See what tools your peers are using to scale DevSecOps and how enterprises are automating security into their DevOps pipeline. Learn what DevSecOps tools and integrations others are deploying in 2019 and where your choices stack up as you consider shifting security left.
If you thought it was difficult bringing the Ops and Dev teams to the same table, let’s talk about security! Often housed in a separate team, security experts have no incentive to ship software, with a mission solely to minimise risk.
This talk is a detailed case study of bringing security into DevOps. We’ll look at the challenges and tactics, from the suboptimal starting point of a highly regulated system with a history of negative media attention. It follows an Agile-aspiring Government IT team from the time when a deployable product was "finished" to when the application was first deployed many months later.
This talk is about humans and systems - in particular how groups often need to flex beyond the bounds of what either side considers reasonable, in order to get a job done. We’ll talk about structural challenges, human challenges, and ultimately how we managed to break through them.
There are no villains - everybody in this story is a hero, working relentlessly through obstacles of structure, time, law, and history. Come hear what finally made the difference, filling in the missing middle of DevSecOps.
Security in a Site Reliability Engineering (SRE) context with a focus on being pragmatic just makes sense. In this talk, we will look at 4 key areas where SRE and Security tribes can join forces and influence the overall business. This is a lab/discussion session.
Discussion of how security is in crisis but DevSecOps offers a new playbook and gives security a path to influence. Taking a look at the WAF space, we look at how Signal Sciences has created feedback between Dev and Ops and Security to create new value.
Finding Security a Home in a DevOps World, by Shannon Lietz
Presented this talk at DevOps Summit in 2015 to a DevOps community. Discovered that security is new to most DevOps teams and this was a very good discussion.
The Emergent Cloud Security Toolchain for CI/CD, by James Wickett
Security is in crisis and it needs a new way to move forward. This talk, from the November 2018 Houston ISSA meeting, discusses the tooling needed to rise to the demands of DevOps and DevSecOps.
This talk by Stefan Streichsbier, Co-Founder of GuardRails.io, provides a brief history of how development, operations and security testing have become highly complex. It continues to outline the key problems with traditional security solutions and why in 2020 companies around the world are still figuring out a good way to manage security as part of rapid development cycles. Specifically, it addresses the big challenge of introducing and fixing new security issues versus tackling the security debt of existing applications.
To quote Bishop Desmond Tutu, “There comes a point where we need to stop just pulling people out of the river. We need to go upstream and find out why they’re falling in.”
After setting the stage, the remainder of the talk will focus on the paradigm shift that security solutions have to incorporate in order to solve the problem of sustainably secure applications on all layers. This will explore how the elements of Speed, Just in time training, and Data science have to be leveraged to empower development teams around the globe to get ahead for once and finally become able to move fast and be safe at the same time.
The 3 core takeaways for the audience are:
1.) Where security practices have gone wrong so far.
2.) What new technologies will cause a paradigm shift in how security is applied at scale.
3.) What security will look like in 5-10 years.
This is the latest version of the State of the DevSecOps presentation, which was given by Stefan Streichsbier, founder of guardrails.io, as the keynote for the Singapore Computer Society - DevSecOps Seminar in Singapore on the 13th January 2020.
This talk provides a brief history of how DevOps has enabled tech companies to become unicorns. It then asks whether security in DevOps is important, who is responsible for it, and what teams can do to make security a competitive advantage.
Modern systems pose a number of thorny challenges and securing the transformation from legacy monolithic systems to distributed systems demands a change in mindset and engineering toolkit. The security engineering toolkit is unfortunately out-of-style and outdated with today's approach to building, security and operating distributed systems.
Distributed systems at scale have unpredictable and complex outcomes that are costly when security incidents occur. The speed, scale, and complex operations within microservice architectures make them tremendously difficult for humans to mentally model their behavior. If the latter is even remotely true how is it possible to adequately secure services that are not even fully comprehended by the engineering teams that built them. How do we realign the actual state of operational security measures to maintain an acceptable level of confidence that our security actually works.
Navigating the Unknowable: Resilience through Security Chaos Engineering
When applied to Cyber Security, Chaos Engineering is advancing our ability to reveal objective information about the effectiveness of operational security measures proactively through empirical experimentation. In this session we will introduce the core concepts behind this new technique and how you can get started in building and applying it.
Chaos Engineering - The Art of Breaking Things in ProductionKeet Sugathadasa
This is an introduction to Chaos Engineering - the Art of Breaking things in Production. This is conducted by two Site Reliability Engineers which explains the concepts, history, principles along with a demonstration of Chaos Engineering
The technical talk is given in this video: https://youtu.be/GMwtQYFlojU
Security incident response is a reactive and chaotic exercise. What if it were possible to flip the scenario on its head? Security focused chaos engineering takes the approach of advancing the security incident response apparatus by reversing the postmortem and preparation phases. Contrary to Purple Team or Red Team game days, Security Chaos Engineering does not use threat actor tactics, techniques and procedures. It develops teams through unique configuration, cyber threat and user error scenarios that challenge responders to react to events outside their playbooks and comfort zones.
Security Chaos Engineering allows incident response and product teams to derive new information about the state of security within their distributed systems that was previously unknown. Within this new paradigm of instrumentation where we proactively conduct “Pre-Incident” vs. “Post-Incident” reviews we are now able to more accurately measure how effective our security incident response teams, tools, skills, and procedures are during the manic of the Incident Response function.
In this session Aaron Rinehart, the mind behind the first Open Source Security Chaos Engineering tool ChaoSlingr, will introduce how Security Chaos Engineering can be applied to create highly secure, performant, and resilient distributed systems.
Chaos engineering for cloud native securityKennedy
Human errors and misconfiguration-based vulnerabilities have become a major cause of data breaches and other forms of security attacks in cloud-native infrastructure (CNI). The dynamic and complex nature of CNI and the underlying distributed systems further complicate these challenges. Hence, novel security mechanisms are imperative to overcome these challenges. Such mechanisms must be customer-centric, continuous, not focused on traditional security paradigms like intrusion detection. We tackle these security challenges via Risk-driven Fault Injection (RDFI), a novel application of cyber security to chaos engineering. Chaos engineering concepts (e.g. Netflix’s Chaos Monkey) have become popular since they increase confidence in distributed systems by injecting non-malicious faults (essentially addressing availability concerns) via experimentation techniques. RDFI goes further by adopting security-focused approaches by injecting security faults that trigger security failures which impact on integrity, confidentiality, and availability. Safety measures are also employed such that impacted environments can be reversed to secure states. Therefore, RDFI improves security and resilience drastically, in a continuous and efficient manner and extends the benefts of chaos engineering to cyber security. We have researched and implemented a proof-of-concept for RDFI that targets multi-cloud enterprise environments deployed on AWS and Google cloud platform.
Maturing DevSecOps: From Easy to High ImpactSBWebinars
Digital Transformation and DevSecOps are the buzzwords du jour. Increasingly, organizations embrace the notion that if you implement DevOps, you must transform security as well. Failing to do so would either leave you insecure or make your security controls negate the speed you aimed to achieve in the first place.
So doing DevSecOps is good... but what does it actually mean? This talk unravels what it looks like with practical, good (and bad) examples of companies who are:
Securing DevOps technologies - by either adapting or building new solutions that address the new security concerns
Securing DevOps methodologies - changing when and how security controls interact with the application and the development process
Adapting to a DevOps philosophy of shared ownership for security
In the end, you'll have the tools you need to plan your interpretation of DevSecOps, choose the practices and tooling you need to support it, and ensure that Security leadership is playing an important role in making it a real thing in your organization.
The reactionary state of the industry means that we quickly identify the ‘root cause’ in terms of ‘human-error’ as an object to attribute and shift blame. Hindsight bias often confuses our personal narrative with truth, which is an objective fact that we as investigators can never fully know. The poor state of self-reflection, human factors knowledge, and the nature of resource constraints further incentivize this vicious pattern. This approach results in unnecessary and unhelpful assignment of blame, isolation of the engineers involved, and ultimately a culture of fear throughout the organization. Mistakes will always happen.
Rather than failing fast and encouraging experimentation, the traditional process often discourages creativity and kills innovation. As an alternative to simply reacting to failures, the security industry has been overlooking valuable chances to further understand and nurture ‘accidents’ or ‘mistakes’ as opportunities to proactively strengthen system resilience. Expose the failures, build resilient systems, and develop an "Applied security" model to minimize the impact of failures. In this session we will cover discuss the role of ‘human-error’, root cause, and resilience engineering in our industry and how we can use new techniques such as Chaos Engineering to make a difference.
Security focused Chaos Engineering proposes that the only way to understand this uncertainty is to confront it objectively by introducing controlled signals. During this session we will cover some key concepts in Safety & Resilience Engineering work based on Sydney Dekker’s 30 years of research into airline accident investigations and how new techniques such as Chaos Engineering are making a difference in improving our ability to learn from incidents proactively before they become destructive
40 DevSecOps Reference Architectures for you. See what tools your peers are using to scale DevSecOps and how enterprises are automating security into their DevOps pipeline. Learn what DevSecOps tools and integrations others are deploying in 2019 and where your choices stack up as you consider shifting security left.
If you thought it was difficult bringing the Ops and Dev teams to the same table, let’s talk about security! Often housed in a separate team, security experts have no incentive to ship software, with a mission solely to minimise risk.
This talk is a detailed case study of bringing security into DevOps. We’ll look at the challenges and tactics, from the suboptimal starting point of a highly regulated system with a history of negative media attention. It follows an Agile-aspiring Government IT team from the time when a deployable product was "finished" to when the application was first deployed many months later.
This talk is about humans and systems - in particular how groups often need to flex beyond the bounds of what either side considers reasonable, in order to get a job done. We’ll talk about structural challenges, human challenges, and ultimately how we managed to break through them.
There are no villains - everybody in this story is a hero, working relentlessly through obstacles of structure, time, law, and history. Come hear what finally made the difference, filling in the missing middle of DevSecOps.
Security in a Site Reliability Engineering (SRE) context with a focus on being pragmatic just makes sense. In this talk, we will look at 4 key areas where SRE and Security tribes can join forces and influence the overall business. This is a lab/discussion session.
Discussion of how security is in crisis but DevSecOps offers a new playbook and gives security a path to influence. Taking a look at the WAF space, we look at how Signal Sciences has created feedback between Dev and Ops and Security to create new value.
Finding Security a Home in a DevOps WorldShannon Lietz
Presented this talk at DevOps Summit in 2015 to a DevOps community. Discovered that security is new to most DevOps teams and this was a very good discussion.
The Emergent Cloud Security Toolchain for CI/CDJames Wickett
Security is in crisis and it needs a new way to move forward. This talk from Nov 2018, Houston ISSA meeting discusses the tooling needed to rise to the demands of devops and devsecops.
This talk by Stefan Streichsbier, Co-Founder of GuardRails.io, provides a brief history of how development, operations and security testing have become highly complex. It continues to outline the key problems with traditional security solutions and why in 2020 companies around the world are still figuring out a good way to manage security as part of rapid development cycles. Specifically, the big challenge of introducing and fixing new security issues versus tackling the existing security dept of existing applications.
To quote Bishop Desmond Tutu, “There comes a point where we need to stop just pulling people out of the river. We need to go upstream and find out why they’re falling in.”
After setting the stage, the remainder of the talk will focus on the paradigm shift that security solutions have to incorporate in order to solve the problem of sustainably secure applications on all layers. This will explore how the elements of Speed, Just in time training, and Data science have to be leveraged to empower development teams around the globe to get ahead for once and finally become able to move fast and be safe at the same time.
The 3 core takeaways for the audience are:
1.) Where security practices have gone wrong so far.
2.) What new technologies will cause a paradigm shift in how security is applied at scale.
3.) What security will look like in 5-10 years.
This is the latest version of the State of the DevSecOps presentation, which was given by Stefan Streichsbier, founder of guardrails.io, as the keynote for the Singapore Computer Society - DevSecOps Seminar in Singapore on the 13th January 2020.
This talk provides a brief history of how DevOps has enabled tech companies to become unicorns. It also asks whether security in DevOps is important, who is responsible, and what teams can do to make security a competitive advantage.
Running enterprise workloads with sensitive data in AWS is hard and requires an in-depth understanding about software-defined security risks. At re:Invent 2014, Intuit and AWS presented "Enterprise Cloud Security via DevSecOps" to help the community understand how to embrace AWS features and a software-defined security model. Since then, we've learned quite a bit more about running sensitive workloads in AWS.
We've evaluated new security features, worked with vendors, and generally explored how to develop security-as-code skills. Come join Intuit and AWS to learn about second-year lessons and see how DevSecOps is evolving. We've built skills in security engineering, compliance operations, security science, and security operations to secure AWS-hosted applications. We will share stories and insights about DevSecOps experiments, and show you how to crawl, walk, and then run into the world of DevSecOps.
Some of the most famous information breaches over the past few years have been a result of entry through embedded and IoT system environments. Often these breaches are a result of unexpected system architecture and service connectivity on the network that allows the hacker to enter through an embedded device and make their way to the financial or corporate servers. Experts in embedded security discuss key security issues for embedded systems and how to address them.
Secure Your DevOps Pipeline Best Practices Meetup 08022024.pptx - lior mazor
Our technology, work processes, and activities all depend on whether we can trust that our software is developed in a safe and secure manner. Join us virtually for our upcoming "Secure Your DevOps Pipeline: Best Practices" Meetup to learn how to integrate security into the development process, apply advanced DevSecOps methods, implement and manage secure coding analysis, and manage software security risks.
2016 - Safely Removing the Last Roadblock to Continuous Delivery - devopsdaysaustin
Presentation by Shannon Lietz
Software needs to be awesome, resilient, available and “secure”, but Security has long been a big roadblock to fast deployments and software improvement. What if it wasn’t?
Continuous delivery requires operational functions to shift left and for an iterative approach to be taken. Security has not been easy to shift left, and taking an iterative approach requires everyone to take responsibility. With a continuous security approach and everyone in the Software Supply Chain taking on the tasks of including security, it's possible to achieve Rugged Software. This talk aims to provide a journey towards this approach and provide the path.
ADDO - Navigating the DevSecOps App-ocalypse 2020 Aaron Rinehart
The speed and scale of complex system operations within cloud-driven architectures make them extremely difficult for humans to mentally model their behavior. This often results in unpredictable and catastrophic outcomes that become costly when unexpected security incidents occur. There is a need to realign the actual state of operational security measures in order to maintain an acceptable level of confidence that our security actually works when we need it to.
By simply reacting to failures, the security industry has been overlooking valuable chances to further understand and nurture ‘accidents’ or ‘mistakes’ as opportunities to proactively strengthen system resilience. Chaos Engineering allows us to proactively expose the failures, build resilient systems, and develop an "Applied Security" model to minimize the impact of failures.
Chaos Engineering allows security teams to proactively experiment and derive new information about underlying factors that were previously unknown. This is done by developing live fire exercises that can be measured, managed, and automated. Contrary to Red/Purple Team exercises, chaos engineering does not use threat actor or adversarial tactics, techniques and procedures. As far as we know, Chaos Engineering is the only proactive mechanism for detecting availability and security incidents before they happen. We proactively introduce turbulent conditions, faults, and failures into our systems to determine the conditions under which our security will fail before it actually does.
In this session we will introduce a new concept known as Security Chaos Engineering and how it can be applied to create highly secure, performant, and resilient distributed systems.
Security teams are often seen as roadblocks to rapid development or operations implementations, slowing down production code pushes. As a result, security organizations will likely have to change so they can fully support and facilitate cloud operations.
This presentation will explain how DevOps and information security can co-exist through the application of a new approach referred to as DevSecOps.
(SEC312) Taking a DevOps Approach to Security | AWS re:Invent 2014 - Amazon Web Services
More organizations are embracing DevOps to realize compelling business benefits, such as more frequent feature releases, increased application stability, and more productive resource utilization. However, security and compliance monitoring tools have not kept up. In fact, they often represent the largest single remaining barrier to continuous delivery. Learn how to integrate security controls in your DevOps program from experts at Alert Logic and George Miranda, engineer and evangelist at Chef. Sponsored by Alert Logic.
Programming languages and techniques for today’s embedded and IoT world - Rogue Wave Software
This presentation looks at the problem of selecting the best programming language and tools to ensure IoT software is secure, robust, and safe. By taking a look at industry best practices and decades of knowledge from other industries (such as automotive and aerospace), you will learn the criteria necessary to choose the right language, how to overcome gaps in developers’ skills, and techniques to ensure your team delivers bulletproof IoT applications.
Outpost24 webinar: Turning DevOps and security into DevSecOps - Outpost24
DevOps is a revolution starting to deliver. The “shift left” security approach is trying to catch up, but challenges remain. We will go over concrete security approaches and real data that overcome these challenges.
It takes more than adding “hard to find” security talent to your DevOps team to reach DevSecOps benefits. Our discussion focuses on the practical side and lessons-learned from helping organizations gear up for this paradigm shift.
Similar to VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering" (20)
Applied Security: Crafting Secure and Resilient Distributed Systems using Chaos Engineering
CO-TALK BY
AARON RINEHART, CTO @ VERICA
& JAMIE DICKEN, MANAGER OF SECURITY ENGINEERING @ CARDINAL HEALTH
Modern systems pose a number of thorny challenges, and securing the transformation from legacy monolithic systems to distributed systems demands a change in mindset and engineering toolkit. The security engineering toolkit is unfortunately out of style and out of date with today's approach to building, securing, and operating distributed systems. The speed, scale, and complex operations within microservice architectures make them tremendously difficult for humans to mentally model their behavior. Security Chaos Engineering helps teams realign the actual state of operational security as well as build confidence that their security actually works the way they think it does.
Join Jamie Dicken and Aaron Rinehart to learn about how they implemented Security Chaos Engineering as a practice at their organizations to proactively discover system weaknesses before they were taken advantage of by malicious adversaries. In this session Jamie and Aaron will introduce a new concept known as Security Chaos Engineering and share their experiences in applying Security Chaos Engineering to create highly secure, performant, and resilient distributed systems.
Nexus User Conference DevOps "Table Stakes": The minimum required to play the... - Aaron Rinehart
In this session we will cover the ‘table stakes’ or the minimum foundational components in what it means to deliver high quality secure software in today’s software driven world. From gaining visibility into the software supply chain to building empathy with engineering teams through DevSecOps practices we will dive through what it takes to play the bare minimum hand and how that contributes to improving value-velocity and faster adoption of more advanced techniques such as Chaos Engineering.
Velocity 2019 - Security Precognition 2019 Slides - San Jose 2019 - Aaron Rinehart
Large scale distributed systems have unpredictable and complex outcomes that are costly when security incidents occur. Security incident response today is mostly a reactive and chaotic exercise. Chaos engineering allows security incident response teams to proactively experiment on recurring incident patterns to derive new information about underlying factors that were previously unknown.
What if you could flip that scenario on its head? Chaos engineering advances the security incident response framework by reversing the postmortem and preparation phase. This is done by developing live fire exercises that can be measured and managed. Contrary to red team game days, chaos engineering doesn’t use threat actor tactics, techniques, and procedures. Instead it develops teams through unique configuration, cyberthreat, and user error scenarios that challenge responders to react to events outside their playbooks and comfort zones.
Join Aaron Rinehart to explore the hidden costs of security incidents, learn a new technique for uncovering system weaknesses in systems security, and more. You’ll also get a glimpse of ChaoSlingr, an open source security chaos engineering tool built and deployed within a Fortune 5 company. Aaron explains how the tool helped his team discover that many of their security controls didn’t function as intended and how, as a result, they were able to proactively improve them before they caused any real problems.
DevSecOps & Security Chaos Engineering - "Knowing the Unknown" -
"Resilience is the story of the outage that didn’t happen". - John Allspaw
Our systems are becoming more and more distributed, ephemeral, and immutable in how they function in today’s ever-evolving landscape of contemporary engineering practices. Not only are our systems becoming more complex, but the velocity at which they now interact and evolve is making the work more challenging for us humans. In this shifted paradigm, it is becoming problematic to comprehend the operational state, health and safety of our systems.
In this session Aaron will uncover what Chaos Engineering is, why we need it, and how it can be used as a tool for building more performant, safe and secure systems. We will uncover the importance of using Chaos Engineering in developing a learning culture through system experimentation. Lastly, we will walk through how to get started using Chaos Engineering as well as dive into how it can be applied to cyber security and other important engineering domains.
3. Areas Covered
● Rugged DevOps Journey at United Health Group
● Combating Complexity in Software
● Chaos Engineering
● Resilience Engineering & Security
● Security Chaos Engineering
@aaronrinehart @verica_io #chaosengineering
4. Aaron Rinehart, CTO, Founder
● Former Chief Security Architect @UnitedHealth responsible for security engineering strategy
● Led the DevOps and Open Source Transformation at UnitedHealth Group
● Former (DOD, NASA, DHS, CollegeBoard)
● Frequent speaker and author on Chaos Engineering & Security
● Pioneer behind Security Chaos Engineering
● Led ChaoSlingr team at UnitedHealth
Verica
6. What is DevOps?
“DevOps, a movement of people who care about developing
and operating reliable, secure, high performance systems at
scale, has always — intentionally — lacked a definition or
manifesto.”
– Jez Humble, author “The DevOps Handbook”
7. The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win
• by Gene Kim, Kevin Behr and George Spafford
Our path begins…
The DevOps Handbook: How to create world-class agility, reliability, and security in technology organizations
• by Gene Kim, Patrick Debois, John Willis, Jez Humble and John Allspaw
9. A New Paradigm: Bold Steps
● Drive Security as a Function of Quality
● Building a Better Model: Continuous Delivery is Better Security
○ Focus on Delivering Value
○ Continuous Security Model
○ Enable DevOps Strategy and Automation
10. A Grass Roots Beginning
● Teams across Silos & Disciplines
○ 60 Developers, Operations Engineers, and Security Leaders from across the entire company.
● Began with Six Core DevOps Security Problem Sets
○ Security Baseline + Configuration Validation w/ Chef & InSpec
○ Gauntlt Rugged Attack Framework
○ Static Code Analysis (SAST): Automating Fortify with Jenkins via API
○ Application Vulnerability Scans (DAST): Automating WebInspect with Jenkins via API
○ DevOps Self-Governance & Operationalization Framework: How does this world look from an operational support perspective?
○ Clair Container Image Scanning: Building Image Scanning into Jenkins
11. Chef + InSpec: Automated Security Configuration & Validation at Speed
Case Study: State Health Exchange
12. Shift Left
● Enable Deployment & Compliance at Speed and Scale
● Allow developers to leverage “Security”-Approved Chef server compliance cookbooks
● Compliance is built into the initial server standup process and immediately confirmed prior to release for use
○ No longer a late “add on”
○ It is “just another cookbook” that can be automatically applied
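The idea of confirming compliance at server standup, before release for use, can be illustrated with a minimal sketch. This is a Python analogue only, not Chef or InSpec code: the `Control` class, the two example controls, and the `server_state` keys are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

ServerState = Dict[str, Any]  # hypothetical: facts gathered from a freshly built server


@dataclass
class Control:
    control_id: str
    description: str
    check: Callable[[ServerState], bool]  # returns True when the server complies


# Hypothetical "security-approved" baseline, applied like any other build step.
BASELINE: List[Control] = [
    Control("ssh-01", "Root login over SSH is disabled",
            lambda s: s.get("ssh_permit_root_login") == "no"),
    Control("pw-01", "Minimum password length is at least 14",
            lambda s: int(s.get("min_password_length", 0)) >= 14),
]


def validate(state: ServerState, baseline: List[Control]) -> List[str]:
    """Return the ids of every control the server fails."""
    return [c.control_id for c in baseline if not c.check(state)]


def release_for_use(state: ServerState, baseline: List[Control]) -> bool:
    """A server is released only when no control fails."""
    return not validate(state, baseline)


if __name__ == "__main__":
    fresh_server = {"ssh_permit_root_login": "yes", "min_password_length": 8}
    print(validate(fresh_server, BASELINE))
```

Because the check runs as part of standup, a failing server never reaches the teams that would otherwise configure on top of it, which is the point of treating compliance as "just another cookbook".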
13. DevOps & State Health Exchange Migration
Initial Approach: 15 weeks+
● Stood up 300+ servers from service catalog over 1 weekend
● Waited weeks for extra build services beyond the catalog
● Allowed app and middleware teams to configure in parallel
● After 2+ months were able to apply compliance rules using Security Blanket
○ Required 2+ weeks just to run; resulted in compliance tickets, remediation and rework
Alternatively: “Orders of Magnitude Differential”
○ Run Time: 300 servers × 6 mins = 30 hours
○ Setup time: 40-100 hours ***
[Timeline figure: March, April, May, June, July]
14. Gauntlt: “Be Mean to Your Code”
Case Study: Driving Security Testing into the Pipeline: Automated Vulnerability Scanning
15. Security as a Function of Quality: Gauntlt
○ An open source application vulnerability scanner engine that enables a self-service vulnerability resolution solution
○ Automates use of multiple vulnerability security scanning tools
○ Provides packages allowing developers to easily run self-service security checks against their applications
○ Scans begin immediately and take only minutes to complete
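The pattern of wrapping several scanning tools behind one self-service check can be sketched as follows. This is a Python illustration of the idea, not Gauntlt itself (Gauntlt attacks are written in its own attack-file format): the `Attack` class, the two `fake_*` scanners with canned output, and the target hostname are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Attack:
    name: str
    scan: Callable[[str], str]    # runs a tool against a target, returns its text output
    must_not_contain: List[str]   # strings whose presence in the output means failure


def fake_port_scan(target: str) -> str:
    # Hypothetical stand-in for a real port scanner; the output is canned.
    return f"{target}: 22/tcp closed, 443/tcp open, 23/tcp open"


def fake_tls_scan(target: str) -> str:
    # Hypothetical stand-in for a real TLS scanner; the output is canned.
    return f"{target}: TLSv1.2 accepted, SSLv3 rejected"


def run_attacks(target: str, attacks: List[Attack]) -> Dict[str, List[str]]:
    """Run every attack and map its name to the offending strings found."""
    findings = {}
    for attack in attacks:
        output = attack.scan(target)
        findings[attack.name] = [s for s in attack.must_not_contain if s in output]
    return findings


ATTACKS = [
    Attack("no-telnet", fake_port_scan, ["23/tcp open"]),
    Attack("no-legacy-ssl", fake_tls_scan, ["SSLv3 accepted"]),
]

if __name__ == "__main__":
    for name, hits in run_attacks("app.example.internal", ATTACKS).items():
        print(name, "FAIL" if hits else "PASS", hits)
```

A developer only supplies the target; the tool choice and the pass/fail criteria live in the shared attack definitions, which is what makes the checks self-service.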
16. Lessons Learned in DevOps Transformation
Takeaways that will fundamentally change the entire strategy.
17. Automation & Tools are Important but “Don’t be Distracted by it”
Emphasize … Simplification & Standardization … over Automation
18. Start Small & Focus
Shift Left… One capability at a time…
19. Embrace Failure as a Friend
Plan and expect failure as a positive outcome. Encourage teams to fail quickly and learn from their failures.
20. Seek the Input & Passion of Others
In the end, it has been the folks most passionate about each problem that achieved success.
21. Voice of the Customer
Define, understand, and listen to your customer as part of your journey. You will be surprised how eager they are to help you.
22. DevSecOps over the next 5 Years: Written 3 years ago..
1.) The Next Generation of Security Professionals will be Chosen from DevOps Teams
2.) A Big Data Problem: The challenge becomes more about the data outputs than the toolsets.
3.) Shared Responsibility becomes more of a reality.
4.) Security is seen as an integral part of the value stream
5.) There will be a new breed of security capabilities created by Inner Source efforts, e.g. Netflix Security Monkey
23. Key Takeaways
• Fail small, fail fast
• It's a culture shift, not just about automation
• Drive out complexity: Complex things don’t scale
• Avoid Analysis Paralysis: DevOps is a culture and a living organism
• DevOps is not a fad, it is the future
• Automation: Focus on where the human adds value. Automate everything else.
43. After a few months….
Hard Coded Passwords
Identity Conflicts
Lead Software Engineering finds a new job at Google
New Security Tool
Refactor Pricing
300 Microservices Δ-> 850 Microservices
Cloud Provider API Outage
WAF Outage -> Disabled
Scalability Issues
Network is Unreliable
Autoscaling Keeps Breaking
Large Customer Outage
Delayed Features
DNS Resolution Errors
Expired Certificate
Regulatory Audit
Rolling Sev1 Outage on Portal
Code Freeze
44. Years?….
Hard Coded Passwords
Identity Conflicts
Lead Software Engineering
finds a new job at Google
New Security Tool
Refactor Pricing
300 Microservices Δ-> 4000 Microservices
Cloud Provider API Outage
Firewall Outage -> Disabled
Scalability Issues
Network is Unreliable
Autoscaling Keeps
Breaking
Large Customer
Outage
Delayed Features
DNS Resolution
Errors
Expired Certificate
Regulatory
Audit
Rolling Sev1 Outages on
Portal
Code Freeze
Hard Coded Passwords
Identity Conflicts
Lead Software Engineering
finds a new job at Google
New Security Tool
Refactor Pricing
300 Microservices Δ-> 850 Microservices
Cloud Provider API Outage
WAF Outage -> DisabledScalability Issues
Network is Unreliable
Autoscaling Keeps
Breaking
Large CustomerDelayed Features
DNS Resolution
ErrorsExpired Certificate
Regulatory
Audit
Rolling Sev1 Outage on
Portal
Merger with
competitor
Misconfigured FW Rule Outage
Database Outage
Portal Retry Storm
Outage
Orphaned Documentation
Corporate Reorg
Budget Freeze
Outsource overseas
development
Exposed Secrets on
GitHub
Code Freeze
Migration to New
CSP
Upgrade to Java
SE 12
69. “Chaos Engineering is the discipline of
experimenting on a distributed system
in order to build confidence in the
system’s ability to withstand turbulent
conditions”
Chaos
Engineering
73. “[Chaos Engineering is] empirical
rather than formal. We don’t use
models to understand what the
system should do. We run
experiments to learn what it does.”
- Michael T. Nygard
76. Chaos Engineering Maturity
Despite what has been popularized on online
tech blogs, you do not start off performing Chaos
Engineering on live production systems. There is
a maturity ramp to getting there.
● Validate Chaos Tools in
Lower Environment
● Develop Competency &
Confidence in Tooling
● Dry-run experiments
Warning: Be careful in Non-Prod environments too. You will be surprised what
hazards lie there. (Kafka Story)
77. Chaos Monkey Story
● During Business Hours
● Born out of Netflix Cloud
Transformation
● Put well defined problems
in front of engineers.
● Terminate VMs on
Random VPC Instances
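The two mechanics on this slide, random termination and a business-hours window, can be sketched in a few lines. This is only an illustration, not Netflix's implementation: the instance list, the 09:00 to 17:00 weekday window, and the function names are all assumptions, and nothing is actually terminated.

```python
import random
from datetime import datetime

# Hypothetical in-memory inventory; a real tool would query a cloud API instead.
INSTANCES = ["vm-a", "vm-b", "vm-c", "vm-d"]


def is_business_hours(now: datetime) -> bool:
    """True on weekdays between 09:00 and 17:00 (assumed window)."""
    return now.weekday() < 5 and 9 <= now.hour < 17


def pick_victim(instances, now, rng=random):
    """Return one instance to terminate, or None outside business hours."""
    if not instances or not is_business_hours(now):
        return None
    # Uniform random choice: no instance is safer than any other.
    return rng.choice(instances)
```

Restricting runs to business hours means engineers are at their desks when an instance disappears, which is what puts a well-defined problem in front of them.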
78. Chaos Pitfalls: Auto-Remediation
“…an operator will only be able to generate successful new
strategies for unusual situations if he has an adequate
knowledge of the process.”
“Long term knowledge develops only through use and
feedback about its effectiveness.”
— Lisanne Bainbridge, The Ironies of Automation (1983)
Bring context to the service owner, or chase down
vulnerabilities on their behalf, instead of automating
fixes. Automated fixes lead to a Fiery Hell!
Reference: Nora Jones 8 Traps of Chaos Engineering
79. Chaos Pitfalls: Breaking Things on Purpose
“I'm pretty sure
I won’t have a job
very long if I
break things on
purpose all day.”
-Casey Rosenthal
The purpose of Chaos Engineering is NOT
to “Break Things on Purpose”.
If anything we are trying to “Fix them on
Purpose”!
Reference: Nora Jones 8 Traps of Chaos Engineering
80. GameDay Exercises
● 2-4 hrs in Length
● Diverse Cross Functional Group of
Engineers
● Focused on Increasing Resilience
● Used for Manual Chaos
Engineering
● Great Introduction to Chaos
Engineering
Recommendations
● Use GameDays for New Chaos
Experiments
● Use GameDays for Initial
Experiment Deployment on New
Targets
● Use GameDays for Proving New
Chaos Engineering Tools
● Get Everyone in the Same Location
81. Experiment Lifecycle
● Define steady state
● Formulate hypothesis
● Outline methodology
● Identify blast radius
● Observability is key
● Readily abortable
● Step 1: Perform a GameDay Exercise. Plan, schedule, and run a GameDay exercise for new experiments.
● Step 2: Validate Experiment Hypothesis. Goal: validate that the experiment ran successfully and that the results are credible.
● Step 3: Remediate Findings & Repeat Experiment. If the hypothesis failed, develop a list of findings and remediate them. Once remediated, repeat the experiment.
● Step 4: Once Successful, Automate Experiment. Once the experiment has been proven to run successfully and validates your hypothesis, automate it to run periodically.
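The lifecycle elements on this slide (steady state, hypothesis, blast radius, abortability) can be sketched as a small harness. This is a minimal sketch under stated assumptions: the `Experiment` class, its field names, and the `run` function are hypothetical and do not come from any specific chaos tool. The hypothesis modeled here is simply "the steady state still holds while the condition is injected."

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Experiment:
    name: str
    steady_state: Callable[[], bool]  # returns True while the system looks normal
    inject: Callable[[], None]        # introduces the turbulent condition
    rollback: Callable[[], None]      # abort path: undoes the injection
    blast_radius: List[str] = field(default_factory=list)  # targets in scope


def run(exp: Experiment) -> str:
    """Run one experiment and report what happened to the hypothesis."""
    # Never inject into a system that is already unhealthy.
    if not exp.steady_state():
        return "aborted: system not in steady state before injection"
    try:
        exp.inject()
        held = exp.steady_state()
    finally:
        exp.rollback()  # always restore, whether or not the hypothesis held
    return "hypothesis held" if held else "hypothesis disproved"
```

A disproved hypothesis feeds the remediate-and-repeat step; only an experiment that repeatedly reports "hypothesis held" is a candidate for automation.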
82. GameDays: The Basics
● Plan & Organize GameDay Exercise
● Execute Live GameDay Operations
● Automate & Evangelize Results & Take Action
● Chaos Experiment: Develop & Evaluate
● Conduct Pre-Incident Review
84. “The discipline of instrumentation, identification,
and remediation of failure within security controls
through proactive experimentation to build
confidence in the system's ability to defend
against malicious conditions in production.”
Security Chaos Engineering is...
100. ChaoSlingr Product Features
• ChatOps Integration
• Configuration-as-Code
• Example Code & Open Framework
• Serverless App in AWS
• 100% Native AWS
• Configurable Operational Mode & Frequency
• Opt-In | Opt-Out Model
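One way to picture an opt-in / opt-out model with a configurable mode and frequency is a tag-based target filter. The sketch below is purely illustrative and is not ChaoSlingr's actual code or configuration format: the `ExperimentConfig` class, the `chaos` tag key, and the tag values are all assumed names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ExperimentConfig:
    mode: str = "opt-in"         # "opt-in" or "opt-out" (assumed values)
    tag_key: str = "chaos"       # hypothetical tag name
    frequency_minutes: int = 60  # how often a scheduler would trigger a run


def eligible_targets(resources: Dict[str, Dict[str, str]],
                     cfg: ExperimentConfig) -> List[str]:
    """Select resource ids according to the configured mode."""
    selected = []
    for rid, tags in resources.items():
        value = tags.get(cfg.tag_key)
        if cfg.mode == "opt-in" and value == "opt-in":
            # Opt-in: only explicitly tagged resources are in scope.
            selected.append(rid)
        elif cfg.mode == "opt-out" and value != "opt-out":
            # Opt-out: everything is in scope unless explicitly excluded.
            selected.append(rid)
    return sorted(selected)
```

Opt-in suits early adoption, when teams volunteer targets; opt-out suits a mature program where experimentation is the default.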
101. Hypothesis: If someone accidentally or
maliciously introduced a misconfigured
port then we would immediately detect,
block, and alert on the event.
● Alert SOC?
● Config Mgmt?
● Misconfigured Port Injection
● IR Triage
● Log data?
● Wait... Firewall?
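The hypothesis above can be turned into a testable check. The following is a toy simulation, not the tooling used in the talk: the `SimulatedEnvironment` class, its flags, and port 2222 are assumptions chosen for illustration. Each flag stands in for one security control working as expected, and the experiment records whether the injected port was blocked and whether anything alerted the SOC.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SimulatedEnvironment:
    """Toy stand-in for an environment; each flag models one control working."""
    firewall_blocks: bool
    firewall_alerts: bool
    config_mgmt_alerts: bool
    open_ports: Set[int] = field(default_factory=set)
    soc_alerts: List[str] = field(default_factory=list)

    def open_port(self, port: int) -> None:
        if self.firewall_alerts:
            self.soc_alerts.append(f"firewall: unauthorized port {port}")
        if self.firewall_blocks:
            return  # change rejected, port never opens
        self.open_ports.add(port)
        if self.config_mgmt_alerts:
            self.soc_alerts.append(f"config-mgmt: drift on port {port}")


def run_port_experiment(env: SimulatedEnvironment, port: int = 2222) -> Dict[str, bool]:
    """Inject a misconfigured port and record which expectations held."""
    env.open_port(port)
    observed = {
        "blocked": port not in env.open_ports,
        "alerted": bool(env.soc_alerts),
    }
    env.open_ports.discard(port)  # clean up the injected change
    observed["hypothesis_held"] = observed["blocked"] and observed["alerted"]
    return observed
```

Recording each expectation separately matters: an environment where the firewall misses the change but configuration management raises an alert still disproves the hypothesis, yet it reveals a control that worked unexpectedly.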
102. Result: Hypothesis disproved. The firewall did not detect
or block the change on all instances. The standard Port
AAA security policy was out of sync on the Portal Team
instances. The port change did not trigger an alert, and
log data indicated a successful change audit.
However, we unexpectedly learned that the configuration
management tool caught the change and alerted the SOC.
103. Stop looking for better
answers and start asking
better questions.
- John Allspaw