These benchmarks are intended to provide stable and reproducible measurements of the performance characteristics of the ddtrace library. A scenario is defined using a simple Python framework. Docker is used to build images for the execution of scenarios against different versions of ddtrace.


A scenario requires:

  • scenario.py: implements a class for running a benchmark

  • config.yaml: specifies one or more sets of configuration variables for the benchmark

  • requirements_scenario.txt: any additional dependencies

The scenario class inherits from bm.Scenario and declares its configurable variables using bm.var. To execute the benchmark, the run() generator function yields a function that takes a loop count and runs the measured code that many times.


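import bm


class MyScenario(bm.Scenario):
    # Configurable variable; its value comes from the scenario configuration.
    size = bm.var(type=int)

    def run(self):
        # Read the variable once, outside the measured loop.
        size = self.size

        # Named run_loops so the local function does not shadow the bm module.
        def run_loops(loops):
            for _ in range(loops):
                2 ** size

        yield run_loops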
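To make the run() contract concrete, here is a minimal sketch of how a harness could consume the yielded function. It does not use the real bm framework: SketchScenario and time_scenario are hypothetical stand-ins, and the actual framework may time and repeat loops differently.

```python
import time


class SketchScenario:
    """Hypothetical stand-in for a bm.Scenario subclass (not the real base class)."""

    def __init__(self, size):
        self.size = size

    def run(self):
        size = self.size

        def run_loops(loops):
            for _ in range(loops):
                2 ** size

        yield run_loops


def time_scenario(scenario, loops):
    """Take the loop function from run() and time a fixed number of loops."""
    loop_func = next(scenario.run())
    start = time.perf_counter()
    loop_func(loops)
    return time.perf_counter() - start


elapsed = time_scenario(SketchScenario(size=10), loops=1000)
print("1000 loops took {:.6f}s".format(elapsed))
```

Yielding a function rather than running the loop directly lets any setup in run() happen before the timed section starts.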


# config.yaml
small-size:
  size: 10
large-size:
  size: 1000
huge-size:
  size: 1000000
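As a rough sketch of how sets of configuration variables could map onto scenario instances, assume each set is a named mapping of variable names to values. The set names, the SizeScenario class, and the build_scenarios helper below are hypothetical and only illustrate the idea; the real framework loads config.yaml itself.

```python
from dataclasses import dataclass


@dataclass
class SizeScenario:
    """Hypothetical stand-in for a scenario with a single 'size' variable."""

    size: int


# Assumed in-memory form of a config file with three named sets.
configs = {
    "small-size": {"size": 10},
    "large-size": {"size": 1000},
    "huge-size": {"size": 1000000},
}


def build_scenarios(scenario_cls, config_sets):
    """Create one scenario instance per named configuration set."""
    return {
        name: scenario_cls(**variables)
        for name, variables in config_sets.items()
    }


scenarios = build_scenarios(SizeScenario, configs)
for name, scenario in scenarios.items():
    print(name, scenario.size)
```

Each configuration set then yields its own measurement, so one scenario class can cover several input sizes.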

Run scenario

The scenario can be run using the built image to compare two versions of the library and save the results in a local artifacts folder:

scripts/perf-run-scenario <scenario> <version> <version> <artifacts>

The version specifiers can reference published versions on PyPI or git repositories.


scripts/perf-run-scenario span ddtrace==0.50.0 ddtrace==0.51.0 ./artifacts/
scripts/perf-run-scenario span Datadog/dd-trace-py@1.x Datadog/dd-trace-py@my-feature ./artifacts/
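The two specifier forms can be told apart mechanically. The sketch below is an assumption about how such a specifier might be turned into something pip can install; it is not taken from scripts/perf-run-scenario, and the to_pip_target helper and the GitHub host are hypothetical.

```python
import re

# Assumed shape of a git specifier: <owner>/<repo>@<ref>, as in the examples above.
GIT_SPEC = re.compile(r"^(?P<repo>[\w.-]+/[\w.-]+)@(?P<ref>.+)$")


def to_pip_target(spec):
    """Return a string that pip could install for the given version specifier."""
    match = GIT_SPEC.match(spec)
    if match:
        # pip's VCS syntax is git+<url>@<ref>; the GitHub host is an assumption.
        return "git+https://github.com/{repo}@{ref}".format(**match.groupdict())
    # Anything else is passed through as a regular requirement specifier.
    return spec


print(to_pip_target("ddtrace==0.50.0"))
print(to_pip_target("Datadog/dd-trace-py@my-feature"))
```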



This benchmark test is used to simulate the creation, encoding, and flushing of traces in threaded environments.

It uses a concurrent.futures.ThreadPoolExecutor to manage the pool of worker threads.

The only modification to the tracing workflow is the use of a NoopWriter, which does not start a background thread and drops traces on writer.write. This means encoding, queuing, and flushing payloads to the agent are skipped, while the span processors still run.
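The shape of this setup can be sketched without ddtrace. In the example below, NoopWriter is a hypothetical stand-in (not the class used by the benchmark), traces are plain lists of dicts rather than real spans, and the parameter names nthreads, ntraces, and nspans are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class NoopWriter:
    """Stand-in writer: starts no background thread and discards every trace."""

    def write(self, spans=None):
        # Drop the trace instead of encoding and queuing it.
        pass


def create_trace(writer, nspans):
    """Build a fake trace (plain dicts, not ddtrace spans) and hand it to the writer."""
    trace = [{"name": "span", "id": i} for i in range(nspans)]
    writer.write(trace)
    return len(trace)


def run_threaded(nthreads, ntraces, nspans):
    """Create ntraces traces across a pool of nthreads worker threads."""
    writer = NoopWriter()
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        futures = [
            pool.submit(create_trace, writer, nspans) for _ in range(ntraces)
        ]
        # Wait for every task and count the spans that were created.
        return sum(future.result() for future in futures)


total_spans = run_threaded(nthreads=4, ntraces=10, nspans=5)
print(total_spans)
```

Because the writer does nothing, the measured time in a setup like this is dominated by span creation and thread coordination rather than by I/O.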