chaos-engineering-Knolx

Presented By: Dipayan Pramanik
Chaos Engineering on Kubernetes

KnolX Etiquettes
Lack of etiquette and manners is a huge turn off.
Punctuality
Respect KnolX session timings. Please
do not join a session more than
5 minutes after its start time.
Feedback
Make sure to submit constructive
feedback for all sessions, as it is
very helpful for the presenter.
Silent Mode
Keep your mobile devices in silent
mode. Feel free to step out of the
session if you need to attend
an urgent call.
Avoid Disturbance
Avoid unnecessary chit-chat during
the session.

Agenda
01 A real life scenario
02 What is Chaos Engineering
03 Why Chaos Engineering is needed
04 Chaos Mesh - a Chaos Engineering tool for Kubernetes
05 Chaos Mesh architecture and features
06 Demo

Real Life Scenario
● Containers are now the main mode of application deployment,
and Kubernetes is the container orchestrator most often used
to run them.
● Kubernetes solves the problems of container recreation, high
availability, and load balancing, but many unforeseen failures
can still occur.

Chaos Engineering
● As the name suggests, chaos engineering is about deliberately
creating havoc in a running environment.
● Engineers replicate failure events through chaos experiments
and tests, then examine the results to find out what the
application lacks and fix the issue.
● In simpler words, chaos engineering is the practice of
injecting chaos into the production or staging environment so
that engineers can build a fault-tolerant application.

Why Chaos Engineering
● Many things can go wrong with a deployed application: network
failure, network corruption, unresponsive pods, extra traffic,
etc.
● Any one of these scenarios is enough to cause downtime in the
application. How do we avoid that downtime? How can we stay
prepared for problems that have not occurred yet but might?
● The answer is chaos engineering. We recreate the failure
scenarios that can affect the application, and then build an
application that tolerates the induced faults and stays fully
functional.

Chaos Mesh
● Chaos Mesh is a chaos engineering platform for Kubernetes.
● It has no external dependencies. It uses Kubernetes Custom Resource Definitions (CRDs)
to define the chaos experiments.
● Chaos Mesh gives us control over the blast radius of experiments by allowing us to
whitelist and blacklist namespaces.
● Chaos Mesh provides a wide variety of experiments that can be used to replicate real-life
scenarios.
● Experiments can run on a schedule, or be combined in serial or in parallel as a workflow.
● Experiments can be configured through YAML or through the dashboard Chaos Mesh provides.
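
To make the CRD-based configuration concrete, here is a minimal Python sketch that builds a PodChaos manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts just like YAML. The field names follow the `chaos-mesh.org/v1alpha1` API as we understand it. The experiment name `demo-pod-kill`, the namespace `demo`, and the label `app: web` are hypothetical placeholders and do not come from the session.

```python
import json


def pod_kill_experiment(name, target_namespace, labels, mode="one"):
    """Return a PodChaos manifest (as a dict) that kills matching pods.

    The selector limits the experiment to one namespace and one set of
    labels, which is one way to keep the blast radius small.
    """
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": target_namespace},
        "spec": {
            "action": "pod-kill",  # the fault to inject
            "mode": mode,          # "one" picks a single matching pod
            "selector": {
                "namespaces": [target_namespace],
                "labelSelectors": dict(labels),
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical target: pods labelled app=web in the "demo" namespace.
    manifest = pod_kill_experiment("demo-pod-kill", "demo", {"app": "web"})
    print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` should create the experiment, assuming Chaos Mesh is already installed in the cluster.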

Architecture
● Chaos Dashboard: The visualization component of Chaos Mesh. Chaos Dashboard
offers a set of user-friendly web interfaces through which users can manipulate
and observe chaos experiments. Chaos Dashboard also provides an RBAC permission
management mechanism.
● Chaos Controller Manager: The core logical component of Chaos Mesh. Chaos
Controller Manager is primarily responsible for the scheduling and management of
chaos experiments. This component contains several CRD controllers, such as
Workflow Controller, Scheduler Controller, and controllers for the various fault types.
● Chaos Daemon: The main executive component. Chaos Daemon runs as a
DaemonSet and has privileged permission by default (which can be
disabled). This component mainly interferes with specific network devices, file
systems, and kernels by entering the namespaces of the target Pod.

Features
● Provides different types of simulated faults, such as container failure, network
corruption, network delay, and kernel errors.
● Also provides faults specific to cloud platforms such as AWS and GCP.
● The Chaos daemon can be run on remote physical hosts and can be used to inject
simulated faults into those nodes.
● We can run single experiments, or combine experiments to form a
workflow chain.
● We can also run experiments on a recurring schedule.
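
As a rough illustration of the recurring schedule feature, the Python sketch below wraps a network delay fault in a Chaos Mesh `Schedule` resource and prints it as JSON. The field names (`schedule`, `type`, `historyLimit`, `concurrencyPolicy`, `networkChaos`) reflect our understanding of the `chaos-mesh.org/v1alpha1` API. The names `hourly-delay` and `demo`, the label `app: web`, and the latency and duration values are hypothetical.

```python
import json

# Maps the experiment kind to the spec field that carries its settings.
# Only the two kinds used in this sketch are listed.
SPEC_FIELD = {"PodChaos": "podChaos", "NetworkChaos": "networkChaos"}


def network_delay_spec(target_namespace, labels, latency="100ms", duration="30s"):
    """Return the spec of a NetworkChaos fault that delays pod traffic."""
    return {
        "action": "delay",
        "mode": "all",  # apply to every pod the selector matches
        "selector": {
            "namespaces": [target_namespace],
            "labelSelectors": dict(labels),
        },
        "delay": {"latency": latency},
        "duration": duration,  # how long each run keeps the fault active
    }


def recurring_experiment(name, namespace, cron, chaos_kind, chaos_spec):
    """Wrap an experiment spec in a Schedule that fires on a cron expression."""
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": cron,
            "type": chaos_kind,
            "historyLimit": 5,              # how many past runs to keep
            "concurrencyPolicy": "Forbid",  # skip a run if one is still active
            SPEC_FIELD[chaos_kind]: chaos_spec,
        },
    }


if __name__ == "__main__":
    spec = network_delay_spec("demo", {"app": "web"})
    schedule = recurring_experiment(
        "hourly-delay", "demo", "0 * * * *", "NetworkChaos", spec
    )
    print(json.dumps(schedule, indent=2))
```

Keeping the fault spec separate from the schedule wrapper means the same spec can be run once on its own or reused on a recurring schedule.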
