Workloads

For now, BenchPilot supports only the following containerized workloads:

Name: marketing-campaign

Description: A distributed streaming workload whose application is a data processing pipeline with multiple, diverse steps that emulate insight extraction from marketing campaigns. The workload uses technologies such as Kafka and Redis.

Specific Configuration Parameters:
  • campaigns: the number of campaigns; the default is 1000
  • tuples_per_second: the number of tuples emitted per second; the default is 10000
  • kafka_event_count: the number of events generated and published to Kafka; the default is 1000000
  • maximize_data: automatically maximizes the data that critically affect the workload's performance; the user supplies a value in the format x10, x100, etc.
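As a rough illustration of the parameters listed above, the following minimal sketch collects them with their documented defaults and checks the `maximize_data` format. The class `MarketingCampaignParams` and the helper `parse_multiplier` are hypothetical names introduced here and are not part of BenchPilot. Accepting any positive integer after the `x` is also an assumption, since the source only gives x10 and x100 as examples.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarketingCampaignParams:
    """Hypothetical container; field names and defaults come from the list above."""
    campaigns: int = 1000
    tuples_per_second: int = 10000
    kafka_event_count: int = 1000000
    # No default is documented for maximize_data, so None means "not set" here.
    maximize_data: Optional[str] = None


def parse_multiplier(value: str) -> int:
    """Turn a string such as 'x10' into the integer 10.

    Assumption: any positive integer after 'x' is accepted.
    """
    match = re.fullmatch(r"x([1-9]\d*)", value)
    if match is None:
        raise ValueError(f"expected a value such as 'x10' or 'x100', got {value!r}")
    return int(match.group(1))


# Override only maximize_data; the other fields keep their documented defaults.
params = MarketingCampaignParams(maximize_data="x10")
factor = parse_multiplier(params.maximize_data)
print(params.campaigns, factor)
```

How BenchPilot applies the multiplier internally is not described here, so the sketch stops at validating the value.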

Distributed Processing Engine Parameters

For streaming distributed workloads, the user also needs to define engine-specific parameters in their experiments.

For each Streaming Distributed Processing Engine, the following attributes can be specified:

Engine: Storm
Parameters:
  1. partitions
  2. ackers

Engine: Flink
Parameters:
  1. partitions
  2. buffer_timeout
  3. checkpoint_interval

Engine: Spark
Parameters:
  1. partitions
  2. batchtime
  3. executor_cores
  4. executor_memory
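The per-engine parameter lists above can be expressed as a small lookup table, which makes it easy to catch a parameter that does not belong to the chosen engine. This is only a sketch: `ENGINE_PARAMETERS` and `unknown_parameters` are hypothetical names, the lowercase engine keys are an assumption, and the example values are illustrative rather than BenchPilot defaults.

```python
# Parameter names per engine, copied from the lists above.
# Lowercase engine keys are an assumption made for this sketch.
ENGINE_PARAMETERS = {
    "storm": {"partitions", "ackers"},
    "flink": {"partitions", "buffer_timeout", "checkpoint_interval"},
    "spark": {"partitions", "batchtime", "executor_cores", "executor_memory"},
}


def unknown_parameters(engine: str, parameters: dict) -> list:
    """Return the sorted parameter names that the given engine does not list."""
    allowed = ENGINE_PARAMETERS[engine.lower()]
    return sorted(set(parameters) - allowed)


# 'ackers' is a Storm parameter, so it is reported for Flink.
flink_issues = unknown_parameters("Flink", {"partitions": 4, "ackers": 2})
print(flink_issues)  # ['ackers']

# All four Spark parameters are listed, so nothing is reported.
spark_issues = unknown_parameters(
    "Spark",
    {"partitions": 4, "batchtime": 2000, "executor_cores": 2, "executor_memory": "4g"},
)
print(spark_issues)  # []
```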

BenchPilot can be extended with new workloads.

To extend BenchPilot, check out this section.