AWS FIS experiment templates

© 2021, Amazon Web Services, Inc. or its affiliates.
Let's write your AWS FIS experiment templates!!
Masao Kanamori
Solutions Architect, DevAx

About me
Masao Kanamori
• Title/Role: Solutions Architect, DevAx (Developer Acceleration) Team
• Favorite Avenger: Hawkeye (Clint Barton)

Agenda
• Chaos engineering and AWS FIS
• What are experiment templates
• How to write experiment templates with JSON
• Why we need to write experiment templates
• Conclusion

Chaos engineering and AWS FIS

Distributed systems are complex
[Diagram: Client, Network, Server; a Message travels from the client to the server and a Reply travels back]
https://aws.amazon.com/builders-library/challenges-with-distributed-systems/

Traditional testing is not enough
TESTING = VERIFYING A KNOWN CONDITION
• Unit testing of components: tested in isolation to ensure function meets expectations
• Functional testing of integrations: each execution path tested to assure expected results

What Chaos Engineering is:
• Experimenting on a system
• Identify failures
• Fix failures before they become outages
Chaos Engineering is meant to do:
• Improve resilience and performance
• Uncover hidden issues
• Expose blind spots (monitoring, observability, and alarms)
Chaos Engineering: Testing the Unknowns
STRESS, OBSERVE, IMPROVE

Phases of chaos engineering
Steady state → Hypothesis → Run experiment → Verify → Improve

Challenges in Chaos Engineering
• Stitch together different tools and homemade scripts
• Agents or libraries required to get started
• Difficult to ensure safety
• Difficult to reproduce “real-world” events (multiple failures at once)

Fully managed chaos engineering service
• Safeguards
• Real-world conditions
• Easy to get started

AWS Fault Injection Simulator: Overview
[Diagram: An experiment template is defined in AWS Fault Injection Simulator through the AWS Management Console or the AWS Command Line Interface, with access controlled by AWS Identity and Access Management. Starting an experiment makes the FIS engine act on AWS resources (compute, databases, networking, storage). Monitoring through Amazon CloudWatch alarms, Amazon EventBridge, and third-party tools feeds the FIS safeguards, which can stop the experiment.]

What are experiment templates

Components
[Diagram: experiment templates, made up of actions and targets, and the experiments run from them]

Actions
Actions are the fault injection actions executed during an experiment
aws:<service-name>:<action-type>
Actions include:
• Fault type
• Duration
• Targeted resources
• Timing relative to any other actions
• Fault-specific parameters, such as rollback behavior or the portion of requests to throttle

Targets
Targets define one or more AWS resources on
which to carry out an action
Targets include:
• Resource type
• Resource IDs, tags, and filters
• Selection mode (e.g., ALL, RANDOM)

Experiment templates
Experiment templates define an experiment and are used in the start-experiment request
Experiment templates include:
• Actions
• Targets
• Stop condition alarms
• IAM role
• Description
• Tags

Experiment template A
• Actions: Action 1, Action 2
• Targets: specific EC2 instances (i-aaaa, i-bbbb, i-cccc)
• Stop conditions: Amazon CloudWatch alarm
Experiment template B
• Actions: Action 1, Action 2, Action 3
• Targets: all EC2 instances with the “chaos-ready” tag
• Stop conditions: Amazon CloudWatch alarms

Related resources (Japanese)
Video
• Chaos Engineering starting guide (AWS Summit Online Japan 2021)
https://www.youtube.com/watch?v=9M13W0sYgks
builders.flash
• Graphic recording: How to start Chaos Engineering without chaos
https://aws.amazon.com/jp/builders-flash/202110/awsgeek-fault-injection-simulator/
• Hands-on: Let’s start your first experiment with AWS FIS
https://aws.amazon.com/jp/builders-flash/202111/try-chaos-engineering/

Now you can create and run experiments from the console. But you need automation.
[Diagram: Create experiment templates → Run experiments]
❶ We need to iterate on this process.
❷ How do we track changes?
❸ How do we map an experiment to the template version it used?

How to write experiment templates with JSON

Experiment template as JSON
{
  "tags": {
    "Name": "StopAndRestartRandomInstance"
  },
  "description": "FIS Stop and Restart One Random Instance",
  "roleArn": "arn:aws:iam::0123456789:role/MyFISExperimentRole",
  "stopConditions": [
    {
      "source": "aws:cloudwatch:alarm",
      "value": "arn:aws:cloudwatch:0123456789:alarm:No_Traffic"
    }
  ],
  "targets": {
    "myInstance": {
      "resourceTags": {
        "Purpose": "chaos-ready"
      },
      "resourceType": "aws:ec2:instance",
      "selectionMode": "COUNT(1)"
    }
  },
  "actions": {
    "StopInstances": {
      "actionId": "aws:ec2:stop-instances",
      "description": "stop the instances",
      "parameters": {
        "startInstancesAfterDuration": "PT5M"
      },
      "targets": {
        "Instances": "myInstance"
      }
    }
  }
}
Sections, top to bottom: Name (tags), Description, IAM role, Stop conditions, Targets, Actions
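A template like the one above can be sanity-checked before it is handed to AWS FIS. The following is a minimal sketch using only the Python standard library. The function name check_template and the set of keys it treats as expected are assumptions made for illustration, based on the sections this template uses; it is not the validation that AWS FIS itself performs.

```python
import json

# Sections used by the template in this deck (an assumption for this sketch,
# not an authoritative list of what the AWS FIS API requires).
EXPECTED_KEYS = {"description", "roleArn", "stopConditions", "targets", "actions"}


def check_template(template: dict) -> list:
    """Return a list of human-readable problems found in a template dict."""
    problems = []
    for key in sorted(EXPECTED_KEYS - template.keys()):
        problems.append(f"missing top-level key: {key}")
    defined_targets = set(template.get("targets", {}))
    for name, action in template.get("actions", {}).items():
        if "actionId" not in action:
            problems.append(f"action {name} has no actionId")
        # Each value in an action's "targets" map must name a defined target.
        for target_name in action.get("targets", {}).values():
            if target_name not in defined_targets:
                problems.append(
                    f"action {name} references unknown target {target_name}"
                )
    return problems


TEMPLATE_JSON = """
{
  "tags": {"Name": "StopAndRestartRandomInstance"},
  "description": "FIS Stop and Restart One Random Instance",
  "roleArn": "arn:aws:iam::0123456789:role/MyFISExperimentRole",
  "stopConditions": [
    {"source": "aws:cloudwatch:alarm",
     "value": "arn:aws:cloudwatch:0123456789:alarm:No_Traffic"}
  ],
  "targets": {
    "myInstance": {
      "resourceTags": {"Purpose": "chaos-ready"},
      "resourceType": "aws:ec2:instance",
      "selectionMode": "COUNT(1)"
    }
  },
  "actions": {
    "StopInstances": {
      "actionId": "aws:ec2:stop-instances",
      "parameters": {"startInstancesAfterDuration": "PT5M"},
      "targets": {"Instances": "myInstance"}
    }
  }
}
"""

template = json.loads(TEMPLATE_JSON)
print(check_template(template))  # an empty list means no problems were found
```

A check like this catches the most common copy-and-paste mistake, an action that points at a target name that was renamed or removed, before the template reaches the service.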

Name, Description, IAM role
"tags": {
  "Name": "StopAndRestartRandomInstance"
},
"description": "FIS Stop and Restart One Random Instance",
"roleArn": "arn:aws:iam::0123456789:role/MyFISExperimentRole"

Name: the “Name” tag holds the name of the experiment template, in the same way as for EC2 instances and other resources.

Description: a description of this experiment template (required).

IAM role: the ARN of the IAM role that grants the AWS FIS service permission to perform service actions.

Actions
"actions": {
  "StopInstances": {
    "actionId": "aws:ec2:stop-instances",
    "parameters": {},
    "targets": {
      "Instances": "AllTaggedInstances"
    }
  },
  "TerminateInstances": {
    "actionId": "aws:ec2:terminate-instances",
    "parameters": {},
    "targets": {
      "Instances": "RandomInstancesInAZ"
    },
    "startAfter": [
      "StopInstances"
    ]
  }
}

Actions: names
There are two actions in this example: StopInstances and TerminateInstances.

Actions: actionId
Specify the action identifier.
Each AWS FIS action has an identifier with the following format:
aws:<service-name>:<action-type>
See the documentation for details:
https://docs.aws.amazon.com/fis/latest/userguide/fis-actions-reference.html

Actions: parameters
Some actions have parameters.
You can look them up in the documentation:
https://docs.aws.amazon.com/fis/latest/userguide/fis-actions-reference.html

Actions: targets
You need to specify targets for each action.
Targets are described later.

Actions: startAfter
You can specify the order of actions with the startAfter attribute.
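The ordering that startAfter expresses can be pictured as a dependency graph. Below is a minimal sketch, assuming Python 3.9 or later for the standard graphlib module, that derives one valid start order from an "actions" map. The function name start_order is hypothetical, and this only illustrates the ordering rule; it is not how the FIS engine schedules actions.

```python
from graphlib import TopologicalSorter

# The two actions from the example, reduced to the fields that matter here.
actions = {
    "StopInstances": {"actionId": "aws:ec2:stop-instances"},
    "TerminateInstances": {
        "actionId": "aws:ec2:terminate-instances",
        "startAfter": ["StopInstances"],
    },
}


def start_order(actions: dict) -> list:
    """Return action names so that each one follows its startAfter entries."""
    # Map every action to the set of actions that must finish before it.
    graph = {
        name: set(spec.get("startAfter", []))
        for name, spec in actions.items()
    }
    # static_order() raises graphlib.CycleError if the actions depend on
    # each other in a loop, which is a useful check on its own.
    return list(TopologicalSorter(graph).static_order())


print(start_order(actions))  # ['StopInstances', 'TerminateInstances']
```

Running the same function over a larger template is a quick way to confirm that a newly added action sits where you expect in the sequence.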

Targets
"targets": {
  "AllTaggedInstances": {
    "resourceType": "aws:ec2:instance",
    "resourceTags": {
    },
    "selectionMode": "ALL"
  },
  "RandomInstancesInAZ": {
    "resourceType": "aws:ec2:instance",
    "resourceTags": {
    },
    "filters": [
      {
        "path": "Placement.AvailabilityZone",
        "values": ["us-east-1a"]
      },
      {
        "path": "State.Name",
        "values": ["running"]
      }
    ],
    "selectionMode": "PERCENT(50)"
  }
}

Targets: names
There are two targets in this example: AllTaggedInstances and RandomInstancesInAZ.

Targets: resourceType
You must specify exactly one resource type per target.
When you specify a target for an action, the target must be of a resource type that the action supports.
Resource types supported by AWS FIS:
• aws:ec2:instance
• aws:ec2:spot-instance
• aws:ecs:cluster
• aws:eks:nodegroup
• aws:iam:role
• aws:rds:cluster
• aws:rds:db

Targets: resourceTags
You can use tags to specify the AWS resources for a target.
You can also specify resources by ARN with the resourceArns attribute instead of tags.

Targets: filters
You can use resource filters to select resources with specific attributes.
The path describes how to reach an attribute in the output of the Describe action for the resource.
(Example: for aws:ec2:instance, the DescribeInstances API action is used.)
For more details, see the following document:
https://docs.aws.amazon.com/fis/latest/userguide/targets.html#target-filters
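To make the idea of a filter path concrete, here is a minimal sketch that walks a dotted path through a dictionary shaped like one instance entry from DescribeInstances output. The instance IDs and the helper names resolve_path and matches are hypothetical. The sketch assumes that a resource must satisfy every filter and that any one of a filter's values is enough to match; treat that as an assumption to confirm against the documentation linked above, not as the FIS implementation.

```python
def resolve_path(resource: dict, path: str):
    """Follow a dotted path such as 'State.Name' through nested dicts."""
    value = resource
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # the attribute does not exist on this resource
        value = value[key]
    return value


def matches(resource: dict, filters: list) -> bool:
    """True when the resource satisfies every filter in the list."""
    return all(
        resolve_path(resource, f["path"]) in f["values"] for f in filters
    )


# The two filters from the RandomInstancesInAZ target.
filters = [
    {"path": "Placement.AvailabilityZone", "values": ["us-east-1a"]},
    {"path": "State.Name", "values": ["running"]},
]

# Hypothetical instances, trimmed to the attributes the filters read.
instances = [
    {"InstanceId": "i-aaaa",
     "Placement": {"AvailabilityZone": "us-east-1a"},
     "State": {"Name": "running"}},
    {"InstanceId": "i-bbbb",
     "Placement": {"AvailabilityZone": "us-east-1b"},
     "State": {"Name": "running"}},
    {"InstanceId": "i-cccc",
     "Placement": {"AvailabilityZone": "us-east-1a"},
     "State": {"Name": "stopped"}},
]

selected = [i["InstanceId"] for i in instances if matches(i, filters)]
print(selected)  # ['i-aaaa']
```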

Targets: selectionMode
You can scope the identified resources using selectionMode.
The default is "ALL" (all identified resources become targets).
There are two other ways to scope:
• COUNT(n)
• PERCENT(n)
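The three selection modes can be sketched as a small function over a list of resource IDs. This is an illustration only: the function name select_targets is hypothetical, the random choice stands in for whatever AWS FIS does internally, and two behaviors are assumptions of this sketch (PERCENT(n) rounds down, and COUNT(n) is capped at the number of identified resources).

```python
import random
import re


def select_targets(resources, selection_mode, rng=random):
    """Scope a list of identified resources by a selectionMode string."""
    resources = list(resources)
    if selection_mode == "ALL":
        return resources
    match = re.fullmatch(r"(COUNT|PERCENT)\((\d+)\)", selection_mode)
    if match is None:
        raise ValueError(f"unsupported selectionMode: {selection_mode}")
    kind, n = match.group(1), int(match.group(2))
    if kind == "COUNT":
        # Assumption: never ask for more resources than were identified.
        size = min(n, len(resources))
    else:
        # Assumption: a percentage that is not a whole number rounds down.
        size = len(resources) * n // 100
    return rng.sample(resources, size)


# Hypothetical instance IDs.
identified = ["i-aaaa", "i-bbbb", "i-cccc", "i-dddd", "i-eeee"]

print(len(select_targets(identified, "ALL")))          # 5
print(len(select_targets(identified, "COUNT(1)")))     # 1
print(len(select_targets(identified, "PERCENT(50)")))  # 2
```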

Stop conditions
"stopConditions": [
  {
    "source": "aws:cloudwatch:alarm",
    "value": "arn:aws:cloudwatch:0123456789:alarm:No_Traffic"
  }
],
You can specify a CloudWatch alarm that stops your experiment when the alarm reaches its threshold.
• source: "none" or "aws:cloudwatch:alarm"
• value: ARN of the CloudWatch alarm (required if the source is a CloudWatch alarm)
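The rule for stop conditions (a source of "none" or "aws:cloudwatch:alarm", with an alarm ARN required in the second case) is easy to check mechanically. A minimal sketch follows; the function name check_stop_conditions and the message wording are hypothetical.

```python
def check_stop_conditions(stop_conditions: list) -> list:
    """Return a list of problems found in a stopConditions array."""
    problems = []
    for index, condition in enumerate(stop_conditions):
        source = condition.get("source")
        if source == "none":
            continue  # no alarm, so no value is needed
        if source == "aws:cloudwatch:alarm":
            if not condition.get("value"):
                problems.append(f"stopConditions[{index}]: alarm ARN missing")
        else:
            problems.append(f"stopConditions[{index}]: unknown source {source!r}")
    return problems


with_alarm = [
    {"source": "aws:cloudwatch:alarm",
     "value": "arn:aws:cloudwatch:0123456789:alarm:No_Traffic"}
]
without_arn = [{"source": "aws:cloudwatch:alarm"}]

print(check_stop_conditions(with_alarm))   # []
print(check_stop_conditions(without_arn))  # one problem reported
```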

Why we need to write experiment templates

Now you can create and run experiments from the console. But you need automation. (repeat)
[Diagram: Create experiment templates → Run experiments]
❶ We need to iterate on this process.
❷ How do we track changes?
❸ How do we map an experiment to the template version it used?

Using a VCS to track changes to an experiment template
Version 1:
{
  "tags": { "Name": "StopAndRestartRandomInstance" },
  "stopConditions": [{
    "source": "none"
  }],
  ...
}
Version 2: Add stop condition
{
  "tags": { "Name": "StopAndRestartRandomInstance" },
  "stopConditions": [{
    "source": "aws:cloudwatch:alarm",
    "value": "arn:aws:cloudwatch:0123456789:alarm:No_Traffic"
  }],
  ...
}
Git repository: GitHub, Bitbucket, AWS CodeCommit, etc.
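What a version control system shows for such a change can be previewed with the standard difflib module. This is a minimal sketch: it assumes that version 2 adds the CloudWatch alarm stop condition from the earlier full template, and it serializes both versions with sorted keys so that only real changes appear in the diff.

```python
import difflib
import json

version_1 = {
    "tags": {"Name": "StopAndRestartRandomInstance"},
    "stopConditions": [{"source": "none"}],
}
# Assumption: version 2 adds the alarm used in the earlier full template.
version_2 = {
    "tags": {"Name": "StopAndRestartRandomInstance"},
    "stopConditions": [{
        "source": "aws:cloudwatch:alarm",
        "value": "arn:aws:cloudwatch:0123456789:alarm:No_Traffic",
    }],
}


def to_lines(template: dict) -> list:
    """Serialize with a stable key order so diffs stay small."""
    return json.dumps(template, indent=2, sort_keys=True).splitlines()


diff_lines = list(difflib.unified_diff(
    to_lines(version_1), to_lines(version_2),
    fromfile="template.json (version 1)",
    tofile="template.json (version 2)",
    lineterm="",
))
print("\n".join(diff_lines))
```

Keeping the template file in a stable, pretty-printed form is what makes the repository history readable: each commit then shows exactly which stop condition, target, or action changed.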

Automate updating and running experiment templates
[Diagram: CI/CD pipeline for experiment templates]
• A user pushes a change to AWS CodeCommit, which triggers AWS CodePipeline.
• Template Update Stage: AWS CloudFormation, or AWS CodeBuild with the AWS Command Line Interface (AWS CLI), creates or updates the experiment templates.
• Experiment Stage: AWS CodeBuild with the AWS CLI runs the experiment from the template.
• Target environment: instances in an Auto Scaling group inside a VPC; an alarm on the environment is used as the stop condition.
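The two pipeline stages can be sketched as the AWS CLI commands a build step might run. The sketch below only builds and prints the command lines; it does not call AWS. The file name template.json, the template ID EXT123, the commit ID abc1234, and the tag key Commit are all hypothetical placeholders, and tagging each experiment with the commit that produced its template is one possible answer to question ❸, not something the deck prescribes.

```python
import shlex


def create_template_command(template_path: str) -> list:
    """Template Update Stage: create a template from a JSON file."""
    return [
        "aws", "fis", "create-experiment-template",
        "--cli-input-json", f"file://{template_path}",
    ]


def start_experiment_command(template_id: str, commit_id: str) -> list:
    """Experiment Stage: run an experiment and record the source commit."""
    return [
        "aws", "fis", "start-experiment",
        "--experiment-template-id", template_id,
        # Hypothetical tag key linking the experiment to a template version.
        "--tags", f"Commit={commit_id}",
    ]


create_cmd = create_template_command("template.json")
start_cmd = start_experiment_command("EXT123", "abc1234")
print(shlex.join(create_cmd))
print(shlex.join(start_cmd))
```

In a real build step, each list could be passed to subprocess.run(cmd, check=True) so that a failed CLI call fails the pipeline stage.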

Conclusion

• You can define your experiments as JSON/YAML.
• It is a good starting point for automating your experiments.
• Don’t forget to define a steady state and a hypothesis.
You can try this idea in the Chaos Engineering on AWS workshop:
https://chaos-engineering.workshop.aws
Let’s automate your experiments!

You can see a good example in AWS Resilience Hub.

Thank you!
