Case Study — Python Runtime

Name: poly-bench
Author: poly-bench

Step-by-step walkthrough of how the Python runtime was implemented

Docs in progress. Some content or features may be missing or not covered in detail.

This case study walks through the Python runtime implementation, section by section. Python is a good reference because it has no compilation step — the executor writes a script and runs it directly. The same patterns apply to other runtimes: you add the language to the grammar, wire up manifest and build, implement the Runtime trait, and optionally add LSP support for editor diagnostics and hover.

Use this alongside the Adding a Runtime guide. The case study shows concrete code; the guide provides the full checklist.

1. Lang Enum

The first step is to add the language to the shared Lang enum so the DSL parser, manifest, and runtime registry all recognize it. This lives in poly-bench-dsl/src/ast.rs:

pub enum Lang {
    // ...
    Python,
}

impl Lang {
    pub const ALL: [Lang; 7] = [..., Lang::Python];

    pub fn from_str(s: &str) -> Option<Self> {
        match s.to_lowercase().as_str() {
            // ...
            "python" | "py" => Some(Lang::Python),
            _ => None,
        }
    }

    pub fn as_str(&self) -> &'static str {
        match self {
            Lang::Python => "python",
            // ...
        }
    }

    pub fn aliases(&self) -> &'static [&'static str] {
        match self {
            Lang::Python => &["python", "py"],
            // ...
        }
    }
}

Every place that matches on Lang must handle the new variant. Search for Lang:: to find touchpoints.
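
One way to keep from_str and aliases from drifting apart is to derive parsing from the alias table and assert the round trip in a test. The sketch below is a self-contained stand-in, not the real enum: it has only two variants, the Go aliases are invented for illustration, and aliases_round_trip is a hypothetical helper.

```rust
// Illustrative stand-in for the enum in poly-bench-dsl/src/ast.rs.
// Only two variants are shown; the `Go` aliases are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    Go,
    Python,
}

impl Lang {
    pub const ALL: [Lang; 2] = [Lang::Go, Lang::Python];

    pub fn aliases(&self) -> &'static [&'static str] {
        match self {
            Lang::Go => &["go", "golang"],
            Lang::Python => &["python", "py"],
        }
    }

    /// Parses a name by scanning the alias table, so a new alias only
    /// has to be added in one place.
    pub fn from_str(s: &str) -> Option<Self> {
        let needle = s.to_lowercase();
        Lang::ALL
            .iter()
            .copied()
            .find(|lang| lang.aliases().iter().any(|a| *a == needle.as_str()))
    }
}

/// Hypothetical check: every alias of every language parses back to it.
pub fn aliases_round_trip() -> bool {
    Lang::ALL.iter().all(|lang| {
        lang.aliases()
            .iter()
            .all(|alias| Lang::from_str(alias) == Some(*lang))
    })
}
```

A check like this fails as soon as a variant is added to ALL without a parsable alias, which complements the manual search for Lang:: touchpoints.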

2. Manifest Config

The manifest (polybench.toml) needs a config block for the new runtime. In poly-bench-project/src/manifest.rs, add a config struct and wire it into Manifest so has_runtime(Lang::Python) returns true when Python is enabled:

use std::collections::HashMap;

use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PythonConfig {
    #[serde(default)]
    pub version: Option<String>,
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub dependencies: HashMap<String, String>,
}

pub struct Manifest {
    // ...
    pub python: Option<PythonConfig>,
}

impl Manifest {
    pub fn has_runtime(&self, lang: Lang) -> bool {
        match lang {
            Lang::Python => self.python.is_some(),
            // ...
        }
    }
}

Manifest::new() creates python: Some(PythonConfig { ... }) when "python" is in the enabled languages list. The dependencies map is used to populate requirements.txt during build.
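
To make the link between the dependencies map and requirements.txt concrete, here is a rough sketch of how such a map could be rendered. It is not the project's templates::requirements_txt_for_runtime_env. The function name render_requirements, the use of a BTreeMap for stable ordering, and the rule that a bare version becomes an == pin are all assumptions.

```rust
use std::collections::BTreeMap;

/// Hypothetical renderer: one requirement per line, sorted by package name.
/// A BTreeMap is used here (instead of the manifest's HashMap) so the
/// output order is deterministic.
pub fn render_requirements(deps: &BTreeMap<String, String>) -> String {
    let mut out = String::new();
    for (name, spec) in deps {
        let spec = spec.trim();
        if spec.is_empty() || spec == "*" {
            // No constraint: emit the bare package name.
            out.push_str(name);
        } else if spec.starts_with(|c: char| c.is_ascii_digit()) {
            // Assumption: a bare version such as "1.26.4" means an exact pin.
            out.push_str(&format!("{name}=={spec}"));
        } else {
            // Already a pip specifier such as ">=2.31".
            out.push_str(&format!("{name}{spec}"));
        }
        out.push('\n');
    }
    out
}
```

Sorting the output keeps requirements.txt stable across builds, which avoids spurious diffs when the file is regenerated.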

3. Init

When a user runs poly-bench init or poly-bench add-runtime python, the init logic creates the runtime environment directory. In poly-bench-project/src/init.rs, add a branch in init_runtime_env_for_lang:

Lang::Python => {
    let python_env = runtime_env(project_dir, Lang::Python);
    std::fs::create_dir_all(&python_env)?;
    let requirements_content = templates::requirements_txt_for_runtime_env(&[]);
    std::fs::write(python_env.join("requirements.txt"), requirements_content)?;
}

The init step does not create the venv. The venv is created later, during poly-bench build, just before pip install runs.

4. Build

The build step installs dependencies and prepares the environment. In poly-bench-project/src/build.rs, add build_python_env and call it from build_runtime_env_for_lang:

fn build_python_env(
    project_root: &Path,
    python_config: &PythonConfig,
    options: &BuildOptions,
) -> Result<()> {
    let python_env = runtime_env(project_root, Lang::Python);
    std::fs::create_dir_all(&python_env)?;

    // Write requirements.txt (includes pyright for LSP)
    let deps: Vec<_> = python_config
        .dependencies
        .iter()
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();
    let requirements_content = templates::requirements_txt_for_runtime_env(&deps);
    std::fs::write(python_env.join("requirements.txt"), requirements_content)?;

    if !options.skip_install {
        // Create .venv if missing
        let venv_python = python_env.join(".venv").join("bin").join("python");
        if !venv_python.exists() {
            Command::new("python3")
                .args(["-m", "venv", ".venv"])
                .current_dir(&python_env)
                .output()?;
        }
        // pip install -r requirements.txt
        let venv_pip = python_env.join(".venv").join("bin").join("pip");
        Command::new(&venv_pip)
            .args(["install", "-r", "requirements.txt"])
            .current_dir(&python_env)
            .output()?;
    }
    Ok(())
}
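
Two details in a build step like this are easy to miss: Command::output only returns an error when the process cannot be spawned, not when it exits with a non-zero status, and a venv on Windows places executables under Scripts rather than bin. The sketch below shows one way to handle both. The names venv_executable and run_checked are hypothetical and are not part of poly-bench.

```rust
use std::path::{Path, PathBuf};
use std::process::{Command, Output};

/// Hypothetical helper: path of an executable inside `<env_dir>/.venv`,
/// using the platform's venv layout.
pub fn venv_executable(env_dir: &Path, name: &str) -> PathBuf {
    let venv = env_dir.join(".venv");
    if cfg!(windows) {
        venv.join("Scripts").join(format!("{name}.exe"))
    } else {
        venv.join("bin").join(name)
    }
}

/// Hypothetical helper: runs a command and turns a non-zero exit status
/// into an error that carries the captured stderr.
pub fn run_checked(cmd: &mut Command) -> Result<Output, String> {
    let program = cmd.get_program().to_string_lossy().into_owned();
    let output = cmd
        .output()
        .map_err(|e| format!("failed to spawn {program}: {e}"))?;
    if output.status.success() {
        Ok(output)
    } else {
        Err(format!(
            "{program} exited with {}: {}",
            output.status,
            String::from_utf8_lossy(&output.stderr).trim()
        ))
    }
}
```

With helpers along these lines, a failed pip install would surface as a build error instead of passing silently.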

5. Runtime + Factory

The core runtime logic lives in runtimes-python/src/executor.rs. The Runtime trait defines the contract for running benchmarks; the factory creates instances with the correct project root:

pub struct PythonRuntime {
    python_binary: PathBuf,
    project_root: Option<PathBuf>,
    cached_script: Option<(PathBuf, PathBuf, u64)>,
}

impl RuntimeFactory for PythonRuntimeFactory {
    fn lang(&self) -> Lang { Lang::Python }
    fn name(&self) -> &'static str { "Python Runtime" }
    fn create(&self, config: &RuntimeConfig) -> Result<Box<dyn Runtime>> {
        let mut rt = PythonRuntime::new()?;
        rt.set_project_root(config.get_root(Lang::Python));
        Ok(Box::new(rt))
    }
}

#[async_trait]
impl Runtime for PythonRuntime {
    fn lang(&self) -> Lang { Lang::Python }
    async fn initialize(&mut self, _suite: &SuiteIR) -> Result<()> {
        which::which("python3").or_else(|_| which::which("python"))
            .map_err(|_| miette!("Python not found in PATH"))?;
        Ok(())
    }
    fn generate_check_source(&self, spec: &BenchmarkSpec, suite: &SuiteIR) -> Result<String> {
        // Generate Python script from IR
    }
    async fn compile_check(&self, spec: &BenchmarkSpec, suite: &SuiteIR) -> Result<()> {
        // Run pyright or similar
    }
    async fn run_benchmark(&mut self, spec: &BenchmarkSpec, suite: &SuiteIR) -> Result<Measurement> {
        // Write bench.py, spawn python bench.py, parse JSON from stdout
    }
    async fn shutdown(&mut self) -> Result<()> { Ok(()) }
}

The runtime prefers .venv/bin/python when project_root contains runtime-env, so that benchmarks run under the same interpreter that has the installed dependencies.
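
The interpreter preference can be pictured as a small lookup with a fallback. This is a minimal sketch under assumptions: resolve_python is a hypothetical name, the root passed in is taken to be the directory that holds .venv, the Unix bin layout is assumed, and the fallback is the bare python3 command name rather than a resolved PATH entry.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical lookup: use the venv interpreter when it exists under
/// `project_root`, otherwise fall back to `python3` from PATH.
pub fn resolve_python(project_root: Option<&Path>) -> PathBuf {
    if let Some(root) = project_root {
        // Unix venv layout; a Windows venv would use Scripts\python.exe.
        let candidate = root.join(".venv").join("bin").join("python");
        if candidate.is_file() {
            return candidate;
        }
    }
    PathBuf::from("python3")
}
```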

6. ErrorMapper

Python uses a passthrough mapper because there is no separate compilation step — the script runs directly, so errors are reported in terms of the generated Python file. In runtimes-python/src/error_mapping.rs:

impl ErrorMapper for PythonErrorMapper {
    fn lang(&self) -> Lang { Lang::Python }
    fn build_mappings(&self, _suite: &SuiteIR, _generated_code: &str) -> LineMappings {
        LineMappings::default()
    }
    fn remap_error(&self, error: &str, _mappings: &LineMappings) -> String {
        error.to_string()
    }
}

Compiled languages (Go, Rust, C) need real line mapping so compiler errors point to .bench lines. Interpreted languages can use passthrough if the runtime reports errors with useful context.
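
For contrast with the passthrough mapper, the following sketch shows roughly what real line mapping could look like for a compiled language. Everything here is illustrative: this LineMappings struct is a simplified stand-in for the real type, the file:line:column error format is an assumption, and remap_error takes extra parameters that the trait method does not have.

```rust
/// Simplified stand-in for the real LineMappings type (assumption).
/// Each entry is (generated_start, generated_end_inclusive, bench_start).
#[derive(Debug, Default)]
pub struct LineMappings {
    pub ranges: Vec<(usize, usize, usize)>,
}

impl LineMappings {
    /// Translates a line in the generated file to a line in the .bench file.
    pub fn to_bench_line(&self, generated_line: usize) -> Option<usize> {
        self.ranges
            .iter()
            .find(|(start, end, _)| (*start..=*end).contains(&generated_line))
            .map(|(start, _, bench_start)| bench_start + (generated_line - start))
    }
}

/// Rewrites lines shaped like `<generated_file>:<line>:...` so they point
/// at the .bench file. Lines that do not match are left unchanged.
pub fn remap_error(
    error: &str,
    generated_file: &str,
    bench_file: &str,
    mappings: &LineMappings,
) -> String {
    let prefix = format!("{generated_file}:");
    error
        .lines()
        .map(|line| {
            remap_line(line, &prefix, bench_file, mappings).unwrap_or_else(|| line.to_string())
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn remap_line(line: &str, prefix: &str, bench_file: &str, m: &LineMappings) -> Option<String> {
    let rest = line.strip_prefix(prefix)?;
    let digits = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
    let generated_line: usize = rest[..digits].parse().ok()?;
    let bench_line = m.to_bench_line(generated_line)?;
    Some(format!("{bench_file}:{bench_line}{}", &rest[digits..]))
}
```

The passthrough mapper is the degenerate case of this: an empty mapping table and an identity rewrite.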

7. RuntimePlugin

The plugin bundles all runtime components and registers them via the #[distributed_slice] macro. In runtimes-python/src/plugin.rs:

pub struct PythonPlugin;

impl RuntimePlugin for PythonPlugin {
    fn lang(&self) -> Lang { Lang::Python }
    fn runtime_factory(&self) -> &'static dyn RuntimeFactory { &PYTHON_RUNTIME_FACTORY }
    fn error_mapper(&self) -> &'static dyn ErrorMapper { &PYTHON_ERROR_MAPPER }
    fn lang_display(&self) -> LangDisplayInfo { python_lang_display() }
    // Optional: project_root_detector, virtual_file_builder, embedded_diagnostic_provider, etc.
}

#[distributed_slice(poly_bench_traits::PLUGINS)]
static _PYTHON: &dyn RuntimePlugin = &PYTHON_PLUGIN;

8. Execution Flow

When run_benchmark is called, the runtime generates a standalone Python script, writes it to disk, runs it, and parses the JSON output. Here is the high-level flow:

Generate — Codegen produces a Python script with fixture decoding, init, helpers, and a benchmark loop that prints JSON.
Write — The script is written to bench.py in .polybench/runtime-env/python/ (or a temp dir).
Run — poly-bench spawns python bench.py (or .venv/bin/python bench.py).
Parse — The last non-empty line of stdout is JSON with iterations, total_nanos, nanos_per_op, and ops_per_sec.
Return — poly-bench builds a Measurement via Measurement::from_aggregate(iterations, total_nanos).

The benchmark script must print valid JSON on the last line. Use time.perf_counter_ns() for high-resolution timing.
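
The Parse step can be sketched as follows. To keep the example free of external crates it uses a deliberately naive field scanner instead of a real JSON parser, which is an assumption made for illustration only; a production implementation would more likely use a JSON library. The function names are hypothetical.

```rust
/// Returns the last non-empty line of stdout, which is expected to hold
/// the JSON result even if the benchmark printed other output earlier.
pub fn last_json_line(stdout: &str) -> Option<&str> {
    stdout.lines().rev().map(str::trim).find(|l| !l.is_empty())
}

/// Naive extraction of an unsigned integer field from a flat JSON object.
/// Stand-in for a real JSON parser; it does not handle nesting or escapes.
pub fn u64_field(json: &str, key: &str) -> Option<u64> {
    let needle = format!("\"{key}\"");
    let after_key = &json[json.find(&needle)? + needle.len()..];
    let value = after_key.trim_start().strip_prefix(':')?.trim_start();
    let end = value.find(|c: char| !c.is_ascii_digit()).unwrap_or(value.len());
    value[..end].parse().ok()
}

/// Returns (iterations, total_nanos), the two values the Return step
/// passes to Measurement::from_aggregate.
pub fn parse_result(stdout: &str) -> Option<(u64, u64)> {
    let line = last_json_line(stdout)?;
    Some((u64_field(line, "iterations")?, u64_field(line, "total_nanos")?))
}
```

Reading only the last non-empty line is what lets user code print freely during the benchmark without corrupting the result.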

CodeGen: How the Python Script Is Built

The Python runtime uses two codegen functions in runtimes-python/src/executor.rs:

generate_python_check_source — Used when poly-bench check runs. Produces a minimal script with imports, declarations, helpers, and the benchmark implementation — no timing harness or benchmark loop.
generate_standalone_script — Used for run_benchmark and precompile. Produces the full script that runs the benchmark and prints JSON.

Both use the same assembly order. The script is built by concatenating the following sections in order:

Header — # Code generated by poly-bench. DO NOT EDIT. and core imports (time, json, gc, tracemalloc if memory tracking).
User imports — From suite.imports.get(&Lang::Python) (e.g. import hashlib).
Stdlib imports — From suite.stdlib_imports via stdlib::get_stdlib_code, which injects standard library helpers used by the DSL.
Declarations — From suite.declarations.get(&Lang::Python), e.g. module-level constants or helper stubs.
Init code — From suite.init_code.get(&Lang::Python), run once before the benchmark loop.
Helpers — From suite.helpers.get(&Lang::Python), e.g. def keccak256_py(data: bytes) -> bytes.
Fixture decoding — For each fixture_refs in the spec, either fixture_name = <implementation> (Python literal) or fixture_name = bytes([0x68, 0x65, ...]) for raw hex data.
Benchmark function — The implementation is wrapped in def __polybench_bench(): with return on the last line when use_sink is true (to prevent dead-code elimination).
Benchmark loop — def __polybench_run(): runs warmup (time-based or iteration-based), then before_hook, then one of two modes:
  - Auto mode — Batched execution until target_ns is reached; batch size grows as needed.
  - Fixed mode — Fixed number of iterations.
  Each iteration times the benchmark call. each_hook runs before/after each call. after_hook runs at the end. If use_memory is set, tracemalloc tracks peak memory per batch.
JSON output — The script prints the final JSON on the last line: {"iterations": N, "total_nanos": T, "nanos_per_op": X, "ops_per_sec": Y}.
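
As an illustration of the fixture decoding step for raw hex data, the sketch below turns a hex string into the bytes([...]) assignment described above. The function name fixture_bytes_literal and the handling of an optional 0x prefix are assumptions, not the project's actual codegen.

```rust
/// Hypothetical codegen helper: emits `name = bytes([0x.., ...])` for a
/// hex-encoded fixture. Returns None for odd-length or non-hex input.
pub fn fixture_bytes_literal(name: &str, hex: &str) -> Option<String> {
    // Assumption: an optional "0x" prefix is tolerated.
    let hex = hex.strip_prefix("0x").unwrap_or(hex);
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let bytes: Option<Vec<String>> = (0..hex.len())
        .step_by(2)
        .map(|i| {
            u8::from_str_radix(&hex[i..i + 2], 16)
                .ok()
                .map(|b| format!("0x{b:02x}"))
        })
        .collect();
    Some(format!("{name} = bytes([{}])", bytes?.join(", ")))
}
```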

The normalize_python_indent helper strips common leading whitespace from embedded blocks so the generated script has consistent indentation.
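
A dedent helper of this kind can be sketched in a few lines. This is not the project's normalize_python_indent. It is a minimal version that assumes indentation consists of ASCII spaces and tabs, counts each as one unit, and drops trailing whitespace on blank lines.

```rust
/// Minimal dedent sketch: removes the smallest leading indentation shared
/// by all non-blank lines. Spaces and tabs each count as one unit.
pub fn normalize_indent(block: &str) -> String {
    let indent_of = |line: &str| {
        line.bytes()
            .take_while(|b| *b == b' ' || *b == b'\t')
            .count()
    };
    let min_indent = block
        .lines()
        .filter(|l| !l.trim().is_empty())
        .map(indent_of)
        .min()
        .unwrap_or(0);
    block
        .lines()
        .map(|l| if l.trim().is_empty() { "" } else { &l[min_indent..] })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Relative indentation inside the block is preserved, which matters for Python because nesting is expressed through whitespace.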

9. LSP (Optional)

Python provides full LSP support for editor diagnostics and hover. You can omit LSP initially and add it later; the runtime will work for poly-bench run without it.

LSP Overview

The LSP integration for embedded Python works as follows:

Building a virtual file — The .bench file contains embedded Python blocks (imports, setup, helpers, benchmark implementations). The VirtualFileBuilder assembles these into a single .py file that the LSP can analyze.
Running diagnostics — The EmbeddedDiagnosticProvider runs pyright (or pylsp) on that virtual file and maps diagnostics back to positions in the .bench file.
Hover and completions — The EmbeddedLspClient wraps pyright-langserver so the editor can request hover, completions, and other LSP features for embedded Python.

VirtualFileBuilder

The Python implementation of VirtualFileBuilder lives in runtimes-python/src/virtual_file.rs. Its build method receives VirtualFileParams containing:

bench_uri — URI of the .bench file
blocks — Parsed embedded blocks (imports, declarations, helpers, init, benchmark/validate/skip blocks)
fixture_names — Names of fixtures referenced by the benchmark

The builder uses VirtualFileBuilderCore to:

Categorize blocks — categorize_blocks groups blocks by type (imports, declares, helpers, inits, other).
Assemble content — Imports, declarations, helpers, and init are written in order. For benchmark/validate/skip blocks, it emits def _bench_N(): (or similar) with placeholder fixture assignments and the block content indented.
Compute paths — The virtual file path is {module_root}/.lsp_virtual/_lsp_virtual_{hash}.py where a hash is derived from the bench path. This keeps virtual files unique per .bench file.
Build section mappings — Each block maps a range in the virtual file to a range in the .bench file. These are stored in section_mappings so diagnostics can be translated from virtual line/character back to bench line/character.
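
The translation that section mappings enable can be sketched as below. The SectionMapping fields and the to_bench_position function are hypothetical and much simpler than whatever the real types contain. In particular, the single indent field assumes the block content was shifted right by a fixed number of columns inside the generated def.

```rust
/// Hypothetical, simplified section mapping: a run of lines in the virtual
/// file that corresponds to a run of lines in the .bench file.
#[derive(Debug, Clone, Copy)]
pub struct SectionMapping {
    pub virtual_start_line: u32,
    pub line_count: u32,
    pub bench_start_line: u32,
    /// Columns the content was indented by in the virtual file
    /// (for example 4 when wrapped in `def _bench_N():`).
    pub indent: u32,
}

/// Maps a (line, character) position in the virtual file back to the
/// .bench file. Returns None for lines that belong to generated glue.
pub fn to_bench_position(
    mappings: &[SectionMapping],
    line: u32,
    character: u32,
) -> Option<(u32, u32)> {
    mappings.iter().find_map(|m| {
        let offset = line.checked_sub(m.virtual_start_line)?;
        if offset >= m.line_count {
            return None;
        }
        Some((m.bench_start_line + offset, character.saturating_sub(m.indent)))
    })
}
```

Positions that fall on generated lines, such as the def header or placeholder fixture assignments, have no counterpart in the .bench file, which is why the lookup returns an Option.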

EmbeddedDiagnosticProvider and EmbeddedDiagnosticSetup

The EmbeddedDiagnosticProvider in runtimes-python/src/embedded_diagnostics.rs implements check_blocks(virtual_file, ctx). It:

Tries pyright first — Runs pyright --outputjson on the virtual file path (after writing the content to disk). Parses the JSON output for generalDiagnostics and converts each to EmbeddedDiagnostic with virtual_line, virtual_character, length, message, severity, and optional code.
Falls back to py_compile — If pyright is not available, runs python -m py_compile and parses stderr for File "...", line N and ErrorType: message. Produces EmbeddedDiagnostic with virtual_line set from the error line.

The LSP layer then uses virtual_file.section_mappings() to map each diagnostic's virtual_line and virtual_character back to the corresponding position in the .bench file.
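
The py_compile fallback amounts to scraping a traceback-style message. The sketch below parses the File "...", line N header and takes the last non-empty line as the message. The Diagnostic struct is a reduced stand-in for EmbeddedDiagnostic, and the conversion from Python's 1-based line numbers to a 0-based virtual_line is an assumption about the convention in use.

```rust
/// Reduced stand-in for EmbeddedDiagnostic (assumption).
#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub virtual_line: u32,
    pub message: String,
}

/// Parses stderr shaped like:
///   File "/path/file.py", line 3
///     def f(
///          ^
///   SyntaxError: '(' was never closed
pub fn parse_py_compile_stderr(stderr: &str) -> Option<Diagnostic> {
    let marker = "\", line ";
    let line_no = stderr.lines().find_map(|l| {
        let l = l.trim_start();
        if !l.starts_with("File \"") {
            return None;
        }
        let rest = &l[l.rfind(marker)? + marker.len()..];
        let end = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
        rest[..end].parse::<u32>().ok()
    })?;
    // The final non-empty line carries "ErrorType: message".
    let message = stderr
        .lines()
        .rev()
        .map(str::trim)
        .find(|l| !l.is_empty())?
        .to_string();
    Some(Diagnostic {
        // Assumption: diagnostics use 0-based lines, Python reports 1-based.
        virtual_line: line_no.saturating_sub(1),
        message,
    })
}
```

Unlike the pyright path, this fallback yields at most one diagnostic per run and no column information, since compilation stops at the first syntax error.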

The EmbeddedDiagnosticSetup implements prepare(module_root, ctx). It calls ctx.ensure_ready(Lang::Python, module_root), which initializes the LSP client for Python if needed (for example, starting pyright-langserver).

Where to Find pyright-langserver

The Python LSP uses PyrightConfig in runtimes-python/src/pyright_client.rs:

find_executable_in_workspace — Looks for .venv/bin/pyright-langserver or .venv/bin/pyright under the workspace root. This ensures the venv’s pyright is used when available.
find_executable — Falls back to which for pyright (with --langserver) or pylsp in PATH.
server_args_for_path — For pyright-langserver, passes --stdio so it communicates over stdin/stdout.

The plugin registers embedded_lsp_client_init and embedded_lsp_client_get so the LSP server can start and reuse the pyright client for diagnostics.

Summary of LSP Components

The table below summarizes the LSP components and their roles:

| Component | Purpose |
| --- | --- |
| VirtualFileBuilder | Builds a virtual .py file from embedded blocks; maintains section mappings between virtual-file and .bench positions |
| EmbeddedDiagnosticProvider | Runs pyright or py_compile on the virtual file; returns EmbeddedDiagnostic with virtual positions |
| EmbeddedDiagnosticSetup | Implements prepare, which makes sure the LSP client is ready before diagnostics run |
| EmbeddedLspClient (pyright_client) | Wraps pyright-langserver; find_executable_in_workspace checks .venv/bin/pyright-langserver |
| ProjectRootDetector | Detects Python root via requirements.txt or pyproject.toml |

Summary

| Section | What it does |
| --- | --- |
| Lang | Adds `python`/`py` to the grammar and config |
| Manifest | `PythonConfig` with deps; `has_runtime` |
| Init | Creates `requirements.txt` |
| Build | Creates venv, runs `pip install` |
| Runtime | `initialize`, `generate_check_source`, `compile_check`, `run_benchmark`, `shutdown` |
| ErrorMapper | Passthrough (or real mapping for compiled langs) |
| Plugin | Bundles all; `#[distributed_slice(PLUGINS)]` |

For the full checklist and LSP details, see Adding a Runtime.