Chaos Mesh is an open-source, cloud-native Chaos Engineering platform that lets you simulate various faults and orchestrate fault scenarios in your Kubernetes cluster. This client is written in Python and provides a single point of entry for creating and managing experiments in Chaos Mesh.
To start using Chaos Mesh, please follow the installation steps in the documentation.
To create a Chaos Mesh client, you can use the following code:
from chaosmesh.client import Client, Experiment
from chaosmesh.k8s.selector import Selector
# creating the ChaosMesh client
client = Client(version="v1alpha1")
# target pods selector; by labelSelectors or by pods in specified namespaces
selector = Selector(labelSelectors={"app": "filebeat"}, pods=None, namespaces=None)
The client above targets the chaos-mesh.org/v1alpha1 API version.
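To make the selector concrete, here is a minimal sketch of the kind of PodChaos manifest that a label selector ends up in. The field names (`labelSelectors`, `namespaces`, `mode`, `action`) follow the Chaos Mesh `v1alpha1` resource format; the helper `build_pod_chaos_manifest` is hypothetical and is not part of this client, and whether the client builds its request exactly this way is an assumption.

```python
from typing import Dict, List, Optional


def build_pod_chaos_manifest(
    name: str,
    namespace: str,
    action: str,
    label_selectors: Dict[str, str],
    target_namespaces: Optional[List[str]] = None,
    mode: str = "all",
) -> dict:
    """Return a PodChaos manifest as a plain dict (hypothetical helper)."""
    # The selector narrows the experiment to pods carrying these labels.
    selector: dict = {"labelSelectors": dict(label_selectors)}
    # Optionally restrict the match to specific namespaces.
    if target_namespaces:
        selector["namespaces"] = list(target_namespaces)
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"action": action, "mode": mode, "selector": selector},
    }


manifest = build_pod_chaos_manifest(
    name="pod-failure-demo",
    namespace="default",
    action="pod-failure",
    label_selectors={"app": "filebeat"},
)
```

A dict like this can be dumped to YAML and applied with kubectl, which is a handy way to compare what the client creates against a hand-written manifest.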
Chaos Mesh supports various types of experiments, including Pod faults, stress tests, JVM faults, Host faults, and Network faults:
- Pod faults
  - Pod failure
  - Pod kill
  - Container kill
- Stress tests
  - CPU
  - Memory
- JVM faults
  - GC
  - Exception
- Host faults
  - CPU
  - Memory
  - Read payload
  - Write payload
  - Fill
- Network faults
  - Partition
  - Bandwidth
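The experiment types listed above correspond to Chaos Mesh custom resource kinds. The lookup table below is a sketch of that correspondence, using kind and action names from the Chaos Mesh `v1alpha1` resources. The group keys and the `kind_for` helper are hypothetical, and how this client maps its `Experiment` members to these kinds internally is an assumption.

```python
from typing import Dict, List, Tuple

# fault group -> (Chaos Mesh resource kind, actions or stressors in that group)
FAULT_GROUPS: Dict[str, Tuple[str, List[str]]] = {
    "pod": ("PodChaos", ["pod-failure", "pod-kill", "container-kill"]),
    "stress": ("StressChaos", ["cpu", "memory"]),
    "jvm": ("JVMChaos", ["gc", "exception"]),
    "host": (
        "PhysicalMachineChaos",
        ["stress-cpu", "stress-mem", "disk-read-payload",
         "disk-write-payload", "disk-fill"],
    ),
    "network": ("NetworkChaos", ["partition", "bandwidth"]),
}


def kind_for(group: str) -> str:
    """Return the resource kind for a fault group (hypothetical helper)."""
    try:
        return FAULT_GROUPS[group][0]
    except KeyError:
        raise ValueError(f"unknown fault group: {group!r}") from None


for group, (kind, actions) in FAULT_GROUPS.items():
    print(f"{group:8s} {kind:22s} {', '.join(actions)}")
```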
Here are some examples of how you can create experiments in Chaos Mesh:
import uuid

# name of the experiment
exp_name = str(uuid.uuid4())
# starting up the pod failure experiment
client.start_experiment(Experiment.POD_FAILURE, namespace="default", name=exp_name, selector=selector)
exp_name = str(uuid.uuid4())
# starting up the pod kill experiment
client.start_experiment(Experiment.POD_KILL, namespace="default", name=exp_name, selector=selector)
exp_name = str(uuid.uuid4())
# starting up the container kill experiment
client.start_experiment(Experiment.CONTAINER_KILL, namespace="default", name=exp_name, selector=selector, container_names=['main'])
exp_name = str(uuid.uuid4())
# starting up the pod CPU stress experiment
client.start_experiment(Experiment.POD_STRESS_CPU, namespace="default", name=exp_name, selector=selector, container_names=['main'])
exp_name = str(uuid.uuid4())
# starting up the pod memory stress experiment
client.start_experiment(Experiment.POD_STRESS_MEMORY, namespace="default", name=exp_name, selector=selector, container_names=['main'])
# name of the experiment
exp_name = str(uuid.uuid4())
client.start_experiment(Experiment.GC, namespace="default", name=exp_name, selector=selector, port=8080)
exp_name = str(uuid.uuid4())
client.start_experiment(Experiment.RAISE_EXCEPTION, namespace="default",
name=exp_name, selector=select
exp_name = str(uuid.uuid4())
# starting up the host cpu stress experiment
client.start_experiment(Experiment.HOST_STRESS_CPU, namespace="default", name=exp_name,
address=["10.225.66.224", "10.225.67.213", "10.225.66.231", "10.225.66.138", "10.225.66.192", "10.225.67.52", "10.225.67.103"],
load=1000)
exp_name = str(uuid.uuid4())
# starting up the host memory stress experiment
client.start_experiment(Experiment.HOST_STRESS_MEMORY, namespace="default", name=exp_name,
address=["10.225.66.224", "10.225.67.213", "10.225.66.231", "10.225.66.138", "10.225.66.192", "10.225.67.52", "10.225.67.103"],
size="30GB")
import random

exp_name = "disk-fault-read-payload-" + str(random.randint(0, 1000000))
# starting up the read payload experiment
client.start_experiment(Experiment.HOST_READ_PAYLOAD, namespace="default", name=exp_name, selector=selector, address=["address"], size="1024K", path="/", payload_process_num=1)
exp_name = "disk-fault-write-payload-" + str(random.randint(0, 1000000))
# starting up the write payload experiment
client.start_experiment(Experiment.HOST_WRITE_PAYLOAD, namespace="default", name=exp_name, selector=selector, address=["address"], size="1024K", path="/",
payload_process_num=1)
exp_name = "disk-fault-fill-" + str(random.randint(0, 1000000))
# starting up the disk fill experiment
client.start_experiment(Experiment.HOST_DISK_FILL, namespace="default", name=exp_name, selector=selector, address=["address"], size="1024K", path="/", fill_by_fallocate=True)
exp_name = "network-partition-" + str(random.randint(0, 1000000))
# starting up the network partition experiment
client.start_experiment(Experiment.NETWORK_PARTITION, namespace="default", name=exp_name, selector=selector, external_targets=["target"], direction="both")
exp_name = "network-bandwidth-" + str(random.randint(0, 1000000))
# starting up the network bandwidth experiment
client.start_experiment(Experiment.NETWORK_BANDWIDTH, namespace="default", name=exp_name, selector=selector, rate="1bps", buffer=1, limit=1, direction="to",
external_targets=["target"])
To pause an experiment, use the following call:
# pausing the experiment
client.pause_experiment(Experiment.POD_STRESS_MEMORY, namespace="default", name=exp_name)
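For context, Chaos Mesh itself pauses an experiment when the resource carries the annotation `experiment.chaos-mesh.org/pause: "true"` and resumes it when the annotation is removed. The sketch below builds the JSON merge patch body for both cases. Whether `pause_experiment` applies exactly this patch is an assumption, and `pause_patch` is a hypothetical helper rather than part of the client.

```python
import json
from typing import Optional

PAUSE_ANNOTATION = "experiment.chaos-mesh.org/pause"


def pause_patch(paused: bool) -> dict:
    """Build a JSON merge patch that sets or clears the pause annotation."""
    # In a JSON merge patch, a null value removes the key, which resumes
    # the experiment.
    value: Optional[str] = "true" if paused else None
    return {"metadata": {"annotations": {PAUSE_ANNOTATION: value}}}


print(json.dumps(pause_patch(True)))
print(json.dumps(pause_patch(False)))
```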
To remove the experiment from the Kubernetes cluster, use the following call:
client.delete_experiment(Experiment.POD_STRESS_MEMORY, namespace="default", name=exp_name)
To run an experiment on a cron schedule, use the following call:
client.schedule_experiment(Experiment.POD_STRESS_CPU, namespace="default", name=exp_name, cron_schedule="*/2 * * * *", selector=selector, container_names=['main'])
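The cron expression `*/2 * * * *` fires every two minutes. In Chaos Mesh, scheduled experiments are represented by a `Schedule` resource that embeds the experiment spec under a key named after its type. The following sketch shows roughly what such a manifest looks like for a CPU stress experiment. The helper `build_schedule_manifest`, the default values for `historyLimit` and `concurrencyPolicy`, and the stressor settings are illustrative assumptions, not values taken from this client.

```python
# Key under which the Schedule spec embeds each experiment type.
_EMBEDDED_KEY = {
    "PodChaos": "podChaos",
    "StressChaos": "stressChaos",
    "NetworkChaos": "networkChaos",
}


def build_schedule_manifest(
    name: str,
    namespace: str,
    cron_schedule: str,
    chaos_kind: str,
    chaos_spec: dict,
    history_limit: int = 1,
    concurrency_policy: str = "Forbid",
) -> dict:
    """Return a Schedule manifest as a plain dict (hypothetical helper)."""
    if chaos_kind not in _EMBEDDED_KEY:
        raise ValueError(f"unsupported kind in this sketch: {chaos_kind!r}")
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": cron_schedule,
            "type": chaos_kind,
            "historyLimit": history_limit,
            "concurrencyPolicy": concurrency_policy,
            _EMBEDDED_KEY[chaos_kind]: chaos_spec,
        },
    }


stress_spec = {
    "mode": "all",
    "selector": {"labelSelectors": {"app": "filebeat"}},
    "containerNames": ["main"],
    # illustrative stressor settings
    "stressors": {"cpu": {"workers": 1, "load": 50}},
}
schedule = build_schedule_manifest(
    "cpu-stress-every-2m", "default", "*/2 * * * *", "StressChaos", stress_spec
)
```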
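To see the client's log output, initialize the chaosmesh logger:
import logging
import sys

# send log records at DEBUG level and above to stdout
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logger = logging.getLogger("chaosmesh")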
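Because `logging.basicConfig` configures the root logger, it also turns on DEBUG output for every other library in the process. If that is too noisy, one alternative is to configure only the `chaosmesh` logger. The sketch below assumes the client logs under the name `chaosmesh`, as the snippet above suggests; `configure_chaosmesh_logging` is a hypothetical helper.

```python
import logging
import sys


def configure_chaosmesh_logging(level: int = logging.DEBUG) -> logging.Logger:
    """Send only 'chaosmesh' log records to stdout at the given level."""
    logger = logging.getLogger("chaosmesh")
    logger.setLevel(level)
    # Attach the handler once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    # Keep records from also reaching the root logger's handlers.
    logger.propagate = False
    return logger


logger = configure_chaosmesh_logging()
logger.debug("chaosmesh logging configured")
```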