Extending BenchPilot #
This section describes the steps required to extend BenchPilot so that it supports additional workloads.
Dockerize Workload #
The first step in extending BenchPilot is to create the necessary Docker images. BenchPilot assumes a controller node, which hosts BenchPilot’s client and other core services, and a set of workers, which form the system under test. With this scheme in mind, dockerize your workload and split it into images that will reside on the controller node and an image that will be deployed on the workers.
Adding New Services #
After creating these images, add the new service under /BenchPilotSDK/services. The service class should derive from BenchPilot’s abstract service object. Keep in mind that every Docker image you created for your workload should be declared as a separate service.
For each service it is important to declare the following:
- docker image: either an existing image or one you build yourself
- hostname: usually the same as the service name
- image tag: if ARM infrastructures need a different image, define it with the “image_arm_tag” attribute
- ports: the ports the service needs
- environment: the environment variables / configuration the service needs
- service log: the log line that the service prints once it is up and running
- depends on: the name of any service that must start before the one you just created
- command: a specific command to execute when the service starts, if one is needed
- proxy: “True” or “False”, depending on whether the service will reside on a device that passes through a proxy
- needs_placement: “True” if the service should reside on a worker, “False” if it is a core service that will reside on the manager node
To configure the environment, ports, volumes, and images of the service, you should call the appropriate methods rather than directly assigning them to parameters. This approach simplifies the process by eliminating the need to understand the exact initialization details of these parameters.
Below you can see a service example:
from BenchPilotSDK.services.service import Service


class Redis(Service):
    """
    This class represents the redis docker service
    """

    def __init__(self):
        Service.__init__(self)
        self.hostname = "redis"
        self.assign_image(image_name="bitnami/redis", image_tag="6.0.10")
        self.add_environment("ALLOW_EMPTY_PASSWORD", "yes")
        self.service_started_log = "Ready to accept connections"
Note: before adding a new service, check whether it already exists.
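The attributes listed above can be made concrete with a self-contained sketch. The `SketchService` base class below is a hypothetical stand-in for BenchPilot’s abstract service, written only so that the example runs on its own. Apart from `hostname`, `assign_image`, `add_environment`, and `service_started_log`, which appear in the Redis example, every attribute and method name here (`add_port`, `depends_on`, `command`, `proxy`, `needs_placement` as a plain attribute) is an assumption and should be checked against the real `Service` class. The image name and log line are also made up for illustration.

```python
class SketchService:
    """Hypothetical stand-in for BenchPilot's abstract Service (illustration only)."""

    def __init__(self):
        self.hostname = ""
        self.image = ""
        self.ports = []
        self.environment = []
        self.depends_on = []
        self.command = None
        self.proxy = False
        self.needs_placement = False
        self.service_started_log = ""

    def assign_image(self, image_name, image_tag="latest"):
        # The caller passes name and tag; how they are stored stays internal.
        self.image = f"{image_name}:{image_tag}"

    def add_environment(self, key, value):
        # Stored as "KEY=value" strings in this sketch; the real layout may differ.
        self.environment.append(f"{key}={value}")

    def add_port(self, host_port, container_port):
        # Assumed helper: records a host-to-container port mapping.
        self.ports.append(f"{host_port}:{container_port}")


class RedisExporter(SketchService):
    """Hypothetical worker-side service that must start after redis."""

    def __init__(self):
        super().__init__()
        self.hostname = "redis-exporter"
        self.assign_image(image_name="example/redis-exporter", image_tag="1.0")
        self.add_port(9121, 9121)
        self.add_environment("REDIS_ADDR", "redis:6379")
        self.depends_on.append("redis")      # redis has to be up first
        self.needs_placement = True          # runs on a worker, not the manager
        self.service_started_log = "exporter ready"


exporter = RedisExporter()
print(exporter.image, exporter.ports, exporter.environment)
```

Routing configuration through methods such as `assign_image` keeps the storage format in one place, so a subclass does not need to know how the base class represents images, ports, or environment variables.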
Adding New Workload #
After adding all of your workload’s services, create a new workload class under /BenchPilotSDK/workloads. This class inherits its behavior from the “workload” class, and its “services list” should contain all the services you need.
In the following block you can find a Workload example:
from abc import ABC

import BenchPilotSDK.utils.benchpilotProcessor as bp
from BenchPilotSDK.utils.exceptions import BenchExperimentInvalidException
from BenchPilotSDK.workloads.workload import Workload
from BenchPilotSDK.services.materializedServices.stress import Stress


class Simple(Workload, ABC):
    """
    This class represents the Simple Workload, it just creates a specific simple workload.
    """

    def __init__(self, **workload_definition):
        super().__init__(**workload_definition)
        # Fail early if the workload definition does not name a service.
        bp.check_required_parameters('workload > parameters', ["service"], workload_definition["parameters"])
        service = self.parameters["service"]
        # "options" is optional; fall back to an empty dict when it is absent.
        options = self.parameters.get("options", {})
        service = Stress(options)
        ...
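The required-parameter check used in the workload above can be pictured with a small stand-alone sketch. The `require_parameters` function below is hypothetical and is not BenchPilot’s `check_required_parameters`; it only illustrates the idea of validating a workload definition before any service is built. It raises a plain `ValueError`, whereas the real helper may raise a BenchPilot-specific exception. The `cpu` option is likewise made up.

```python
def require_parameters(path, required, definition):
    """Hypothetical check: raise if any required key is missing from definition."""
    missing = [name for name in required if name not in definition]
    if missing:
        raise ValueError(f"{path}: missing required parameter(s): {', '.join(missing)}")


# A made-up workload definition, shaped like the one the Simple workload reads.
workload_definition = {"parameters": {"service": "stress", "options": {"cpu": 2}}}
parameters = workload_definition["parameters"]

require_parameters("workload > parameters", ["service"], parameters)

# Optional keys fall back to a default instead of failing.
options = parameters.get("options", {})
print(parameters["service"], options)
```

Validating in the constructor means a malformed experiment description is rejected before any container is started.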
Adding a new SDPE Workload #
When adding a new workload based on a Streaming Distributed Processing Engine (SDPE), you don’t need to add the engines yourself. You only need to inherit from the SDPEWorkload class and add the rest of the services, as in the example below:
import inspect
from abc import ABC
from dataclasses import dataclass, asdict

from BenchPilotSDK.workloads.materializedWorkloads.sdpeWorkload import SDPEWorkload
from BenchPilotSDK.services.materializedServices.kafka import Kafka
from BenchPilotSDK.services.materializedServices.redis import Redis
from BenchPilotSDK.services.materializedServices.zookeeper import Zookeeper


class MarketingCampaign(SDPEWorkload, ABC):
    """
    This class represents Yahoo Streaming Benchmark, it holds all the extra needed services.
    - by extra we mean the services that are not DSPEs
    """

    @dataclass
    class Parameters:
        num_of_campaigns: int = 1000
        ...

        @classmethod
        def from_dict(cls, env):
            return cls(**{
                k: v for k, v in env.items()
                if k in inspect.signature(cls).parameters
            })

        def __post_init__(self):
            ...

    def __init__(self, **workload_definition):
        super().__init__(**workload_definition)
        self.parameters.update(asdict(self.Parameters.from_dict(workload_definition)))
        self.add_service(Zookeeper())
        self.add_service(Kafka(len(self.cluster), self.manager_ip))
        self.add_service(Redis())
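The `from_dict` helper in the example above filters a dictionary down to the fields the dataclass accepts, so unrelated keys in the workload definition do not cause a `TypeError`. The stand-alone sketch below shows that pattern using only the standard library. The `capacity_per_window` field and the contents of `definition` are hypothetical and serve only to show a default surviving next to an overridden value.

```python
import inspect
from dataclasses import dataclass, asdict


@dataclass
class Parameters:
    num_of_campaigns: int = 1000
    capacity_per_window: int = 10  # hypothetical field, for illustration only

    @classmethod
    def from_dict(cls, env):
        # The generated __init__ signature lists exactly the dataclass fields.
        accepted = inspect.signature(cls).parameters
        return cls(**{k: v for k, v in env.items() if k in accepted})


# "name" is not a field of Parameters, so from_dict silently drops it.
definition = {"name": "marketing-campaign", "num_of_campaigns": 50}
params = Parameters.from_dict(definition)
print(asdict(params))  # {'num_of_campaigns': 50, 'capacity_per_window': 10}
```

Passing `definition` straight to `Parameters(**definition)` would fail on the unexpected `name` key, which is why the filtering step is useful when the same dictionary also carries settings meant for other parts of the workload.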