Case Study — Python Runtime
Step-by-step walkthrough of how the Python runtime was implemented
This case study walks through the Python runtime implementation, section by section. Python is a good reference because it has no compilation step — the executor writes a script and runs it directly. The same patterns apply to other runtimes: you add the language to the grammar, wire up manifest and build, implement the Runtime trait, and optionally add LSP support for editor diagnostics and hover.
The first step is to add the language to the shared Lang enum so the DSL parser, manifest, and runtime registry all recognize it. This lives in poly-bench-dsl/src/ast.rs:
```rust
pub enum Lang {
    // ...
    Python,
}

impl Lang {
    pub const ALL: [Lang; 7] = [/* ... */ Lang::Python];

    pub fn from_str(s: &str) -> Option<Self> {
        match s.to_lowercase().as_str() {
            // ...
            "python" | "py" => Some(Lang::Python),
            _ => None,
        }
    }

    pub fn as_str(&self) -> &'static str {
        match self {
            Lang::Python => "python",
            // ...
        }
    }

    pub fn aliases(&self) -> &'static [&'static str] {
        match self {
            Lang::Python => &["python", "py"],
            // ...
        }
    }
}
```

Every match on Lang must handle the new variant. Search for Lang:: to find the touchpoints.

The manifest (polybench.toml) needs a config block for the new runtime. In poly-bench-project/src/manifest.rs, add a config struct and wire it into Manifest so has_runtime(Lang::Python) returns true when Python is enabled:
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PythonConfig {
    #[serde(default)]
    pub version: Option<String>,
    #[serde(default, skip_serializing_if = "HashMap::is_empty")]
    pub dependencies: HashMap<String, String>,
}

pub struct Manifest {
    // ...
    pub python: Option<PythonConfig>,
}

impl Manifest {
    pub fn has_runtime(&self, lang: Lang) -> bool {
        match lang {
            Lang::Python => self.python.is_some(),
            // ...
        }
    }
}
```

Manifest::new() creates python: Some(PythonConfig { ... }) when "python" is in the enabled languages list. The dependencies map is used to populate requirements.txt during build.
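As a rough illustration of how a dependencies map could become requirements.txt content, here is a minimal sketch. The function render_requirements is hypothetical and is not the project's templates::requirements_txt_for_runtime_env; the name==version pin format and the rule that an empty version means "unpinned" are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of poly-bench): renders requirements.txt
/// content from a package-name -> version map.
/// Assumption: a non-empty version is emitted as an exact `name==version`
/// pin, and an empty version leaves the package unpinned.
fn render_requirements(deps: &BTreeMap<String, String>) -> String {
    let mut out = String::new();
    for (name, version) in deps {
        if version.is_empty() {
            out.push_str(name);
        } else {
            out.push_str(&format!("{}=={}", name, version));
        }
        out.push('\n');
    }
    out
}
```

The sketch takes a BTreeMap rather than the HashMap used by PythonConfig so that the output order is deterministic, which keeps the generated file stable across builds.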
When a user runs poly-bench init or poly-bench add-runtime python, the init logic creates the runtime environment directory. In poly-bench-project/src/init.rs, add a branch in init_runtime_env_for_lang:
```rust
Lang::Python => {
    let python_env = runtime_env(project_dir, Lang::Python);
    std::fs::create_dir_all(&python_env)?;
    let requirements_content = templates::requirements_txt_for_runtime_env(&[]);
    std::fs::write(python_env.join("requirements.txt"), requirements_content)?;
}
```

The build step installs dependencies and prepares the environment. In poly-bench-project/src/build.rs, add build_python_env and call it from build_runtime_env_for_lang:
```rust
fn build_python_env(
    project_root: &Path,
    python_config: &PythonConfig,
    options: &BuildOptions,
) -> Result<()> {
    let python_env = runtime_env(project_root, Lang::Python);
    std::fs::create_dir_all(&python_env)?;

    // Write requirements.txt (includes pyright for LSP)
    let deps: Vec<_> = python_config
        .dependencies
        .iter()
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();
    let requirements_content = templates::requirements_txt_for_runtime_env(&deps);
    std::fs::write(python_env.join("requirements.txt"), requirements_content)?;

    if !options.skip_install {
        // Create .venv if missing (Unix layout: .venv/bin)
        let venv_python = python_env.join(".venv").join("bin").join("python");
        if !venv_python.exists() {
            Command::new("python3")
                .args(["-m", "venv", ".venv"])
                .current_dir(&python_env)
                .output()?;
        }
        // pip install -r requirements.txt
        // Note: output() only returns Err if the process cannot be spawned;
        // inspect the returned status if a failed install should abort the build.
        let venv_pip = python_env.join(".venv").join("bin").join("pip");
        Command::new(&venv_pip)
            .args(["install", "-r", "requirements.txt"])
            .current_dir(&python_env)
            .output()?;
    }
    Ok(())
}
```

The core runtime logic lives in runtimes-python/src/executor.rs. The Runtime trait defines the contract for running benchmarks; the factory creates instances with the correct project root:
```rust
pub struct PythonRuntime {
    python_binary: PathBuf,
    project_root: Option<PathBuf>,
    cached_script: Option<(PathBuf, PathBuf, u64)>,
}

impl RuntimeFactory for PythonRuntimeFactory {
    fn lang(&self) -> Lang { Lang::Python }
    fn name(&self) -> &'static str { "Python Runtime" }
    fn create(&self, config: &RuntimeConfig) -> Result<Box<dyn Runtime>> {
        let mut rt = PythonRuntime::new()?;
        rt.set_project_root(config.get_root(Lang::Python));
        Ok(Box::new(rt))
    }
}

#[async_trait]
impl Runtime for PythonRuntime {
    fn lang(&self) -> Lang { Lang::Python }
    async fn initialize(&mut self, _suite: &SuiteIR) -> Result<()> {
        which::which("python3").or_else(|_| which::which("python"))
            .map_err(|_| miette!("Python not found in PATH"))?;
        Ok(())
    }
    fn generate_check_source(&self, spec: &BenchmarkSpec, suite: &SuiteIR) -> Result<String> {
        // Generate Python script from IR
    }
    async fn compile_check(&self, spec: &BenchmarkSpec, suite: &SuiteIR) -> Result<()> {
        // Run pyright or similar
    }
    async fn run_benchmark(&mut self, spec: &BenchmarkSpec, suite: &SuiteIR) -> Result<Measurement> {
        // Write bench.py, spawn python bench.py, parse JSON from stdout
    }
    async fn shutdown(&mut self) -> Result<()> { Ok(()) }
}
```

Python uses a passthrough mapper because there is no separate compilation step: the script runs directly, so errors are reported in terms of the generated Python file. In runtimes-python/src/error_mapping.rs:
```rust
impl ErrorMapper for PythonErrorMapper {
    fn lang(&self) -> Lang { Lang::Python }
    fn build_mappings(&self, _suite: &SuiteIR, _generated_code: &str) -> LineMappings {
        LineMappings::default()
    }
    fn remap_error(&self, error: &str, _mappings: &LineMappings) -> String {
        error.to_string()
    }
}
```

The plugin bundles all runtime components and registers them via the #[distributed_slice] macro. In runtimes-python/src/plugin.rs:
```rust
pub struct PythonPlugin;

impl RuntimePlugin for PythonPlugin {
    fn lang(&self) -> Lang { Lang::Python }
    fn runtime_factory(&self) -> &'static dyn RuntimeFactory { &PYTHON_RUNTIME_FACTORY }
    fn error_mapper(&self) -> &'static dyn ErrorMapper { &PYTHON_ERROR_MAPPER }
    fn lang_display(&self) -> LangDisplayInfo { python_lang_display() }
    // Optional: project_root_detector, virtual_file_builder, embedded_diagnostic_provider, etc.
}

#[distributed_slice(poly_bench_traits::PLUGINS)]
static _PYTHON: &dyn RuntimePlugin = &PYTHON_PLUGIN;
```

When run_benchmark is called, the runtime generates a standalone Python script, writes it to disk, runs it, and parses the JSON output. The generated script uses time.perf_counter_ns() for high-resolution timing.

The Python runtime uses two codegen functions in runtimes-python/src/executor.rs.
Both use the same assembly order. The script is built by concatenating the following sections in order:
Header — # Code generated by poly-bench. DO NOT EDIT. and core imports (time, json, gc, tracemalloc if memory tracking).
User imports — From suite.imports.get(&Lang::Python) (e.g. import hashlib).
Stdlib imports — From suite.stdlib_imports via stdlib::get_stdlib_code, which injects standard library helpers used by the DSL.
Declarations — From suite.declarations.get(&Lang::Python), e.g. module-level constants or helper stubs.
Init code — From suite.init_code.get(&Lang::Python), run once before the benchmark loop.
Helpers — From suite.helpers.get(&Lang::Python), e.g. def keccak256_py(data: bytes) -> bytes.
Fixture decoding — For each entry in the spec's fixture_refs, either fixture_name = <implementation> (a Python literal) or fixture_name = bytes([0x68, 0x65, ...]) for raw hex data.
Benchmark function — The implementation is wrapped in def __polybench_bench(): with return on the last line when use_sink is true (to prevent dead-code elimination).
Benchmark loop — def __polybench_run(): runs warmup (time-based or iteration-based) and before_hook, then the measurement loop. In time-based mode, the loop runs until target_ns is reached, and the batch size grows as needed. Each iteration times the benchmark call. each_hook runs before/after each call. after_hook runs at the end. If use_memory is set, tracemalloc tracks peak memory per batch.
JSON output — The script prints the final JSON on the last line: {"iterations": N, "total_nanos": T, "nanos_per_op": X, "ops_per_sec": Y}.
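To make the JSON contract concrete, the sketch below shows one way the Rust side might pick the result line out of stdout, and how the derived fields could relate to the raw counters. Both functions are hypothetical. The formulas nanos_per_op = total_nanos / iterations and ops_per_sec = 1e9 / nanos_per_op are assumptions about what the fields mean, not something the source states.

```rust
/// Hypothetical helper: returns the last non-empty line of the script's
/// stdout, which is where the result JSON is expected to be. Earlier lines
/// (for example, prints from user code) are ignored.
fn last_json_line(stdout: &str) -> Option<&str> {
    stdout.lines().rev().map(str::trim).find(|l| !l.is_empty())
}

/// Hypothetical helper: formats a result line from raw counters.
/// Assumption: nanos_per_op is the mean time per iteration and
/// ops_per_sec is its reciprocal scaled to seconds.
fn result_json(iterations: u64, total_nanos: u64) -> String {
    let nanos_per_op = if iterations == 0 {
        0.0
    } else {
        total_nanos as f64 / iterations as f64
    };
    let ops_per_sec = if nanos_per_op > 0.0 { 1e9 / nanos_per_op } else { 0.0 };
    format!(
        "{{\"iterations\": {}, \"total_nanos\": {}, \"nanos_per_op\": {:.3}, \"ops_per_sec\": {:.3}}}",
        iterations, total_nanos, nanos_per_op, ops_per_sec
    )
}
```

For example, 1000 iterations taking 2,000,000 ns in total would give 2000 ns per operation and 500,000 operations per second under these assumptions.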
The normalize_python_indent helper strips common leading whitespace from embedded blocks so the generated script has consistent indentation.
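The idea of stripping common leading whitespace can be sketched as follows. This is not the project's normalize_python_indent; the function names are hypothetical, and the sketch assumes that only spaces and tabs count as indentation, with each counted as one column.

```rust
/// Number of leading space/tab characters on a line.
fn indent_width(line: &str) -> usize {
    line.len() - line.trim_start_matches(|c: char| c == ' ' || c == '\t').len()
}

/// Hypothetical dedent sketch: removes the smallest indentation shared by
/// all non-blank lines. Blank lines become empty and do not affect the
/// computed minimum.
fn normalize_indent(block: &str) -> String {
    let min_indent = block
        .lines()
        .filter(|l| !l.trim().is_empty())
        .map(indent_width)
        .min()
        .unwrap_or(0);

    block
        .lines()
        .map(|l| if l.trim().is_empty() { "" } else { &l[min_indent..] })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Relative indentation inside the block is preserved, which matters for Python because nesting is expressed through indentation.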
Python provides full LSP support for editor diagnostics and hover. You can omit LSP initially and add it later; the runtime will work for poly-bench run without it.
The LSP integration for embedded Python works as follows:
Building a virtual file — The .bench file contains embedded Python blocks (imports, setup, helpers, benchmark implementations). The VirtualFileBuilder assembles these into a single .py file that the LSP can analyze.
Running diagnostics — The EmbeddedDiagnosticProvider runs pyright (or pylsp) on that virtual file and maps diagnostics back to positions in the .bench file.
Hover and completions — The EmbeddedLspClient wraps pyright-langserver so the editor can request hover, completions, and other LSP features for embedded Python.
The VirtualFileBuilder lives in runtimes-python/src/virtual_file.rs. It implements VirtualFileBuilder::build, which receives a VirtualFileParams value as input.
The builder uses VirtualFileBuilderCore to:
Categorize blocks — categorize_blocks groups blocks by type (imports, declares, helpers, inits, other).
Assemble content — Imports, declarations, helpers, and init are written in order. For benchmark/validate/skip blocks, it emits def _bench_N(): (or similar) with placeholder fixture assignments and the block content indented.
Compute paths — The virtual file path is {module_root}/.lsp_virtual/_lsp_virtual_{hash}.py, where the hash is derived from the bench path. This keeps virtual files unique per .bench file.
Build section mappings — Each block maps a range in the virtual file to a range in the .bench file. These are stored in section_mappings so diagnostics can be translated from virtual line/character back to bench line/character.
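The translation that section mappings enable can be sketched with a small lookup. The struct layout and the function to_bench_position are hypothetical; the source does not show the real shape of section_mappings. The sketch assumes 0-based lines and that each block is shifted right by a fixed indent in the virtual file.

```rust
/// Hypothetical mapping of one embedded block (not the project's type).
#[derive(Debug, Clone, PartialEq)]
struct SectionMapping {
    virtual_start: u32, // first line of the block in the virtual .py file
    bench_start: u32,   // first line of the block in the .bench file
    line_count: u32,    // number of lines the block spans
    indent: u32,        // columns of indentation added in the virtual file
}

/// Translates a (line, character) position in the virtual file back to the
/// .bench file. Returns None when the position falls outside every mapped
/// block, for example inside generated scaffolding.
fn to_bench_position(
    mappings: &[SectionMapping],
    virtual_line: u32,
    virtual_character: u32,
) -> Option<(u32, u32)> {
    mappings
        .iter()
        .find(|m| {
            virtual_line >= m.virtual_start && virtual_line < m.virtual_start + m.line_count
        })
        .map(|m| {
            (
                m.bench_start + (virtual_line - m.virtual_start),
                virtual_character.saturating_sub(m.indent),
            )
        })
}
```

Returning None for unmapped positions lets the caller drop diagnostics that point at generated code rather than at anything the user wrote.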
The EmbeddedDiagnosticProvider in runtimes-python/src/embedded_diagnostics.rs implements check_blocks(virtual_file, ctx). It:
Tries pyright first — Runs pyright --outputjson on the virtual file path (after writing the content to disk). Parses the JSON output for generalDiagnostics and converts each to EmbeddedDiagnostic with virtual_line, virtual_character, length, message, severity, and optional code.
Falls back to py_compile — If pyright is not available, runs python -m py_compile and parses stderr for File "...", line N and ErrorType: message. Produces EmbeddedDiagnostic with virtual_line set from the error line.
The LSP layer then uses virtual_file.section_mappings() to map each diagnostic's virtual_line and virtual_character back to the corresponding position in the .bench file.
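The py_compile fallback described above relies on scraping stderr. A minimal sketch of that parsing is shown here; parse_py_compile_error is a hypothetical name, and the assumed stderr layout is a File "...", line N header followed later by an ErrorType: message line. The line number is returned as reported by Python (1-based), so converting it to whatever base virtual_line uses is left to the caller.

```rust
/// Hypothetical parser for py_compile stderr (not the project's code).
/// Returns the reported line number and the "ErrorType: message" line.
fn parse_py_compile_error(stderr: &str) -> Option<(u32, String)> {
    let mut line_no: Option<u32> = None;
    let mut message: Option<String> = None;

    for raw in stderr.lines() {
        let line = raw.trim();
        if line.starts_with("File \"") {
            // Take the digits that follow the last `, line ` marker.
            if let Some(idx) = line.rfind(", line ") {
                let digits: String = line[idx + ", line ".len()..]
                    .chars()
                    .take_while(|c| c.is_ascii_digit())
                    .collect();
                if let Ok(n) = digits.parse::<u32>() {
                    line_no = Some(n);
                }
            }
        } else if let Some((kind, _rest)) = line.split_once(": ") {
            // Heuristic: treat a single word ending in "Error" as the error type.
            if kind.ends_with("Error") && !kind.contains(' ') {
                message = Some(line.to_string());
            }
        }
    }

    Some((line_no?, message?))
}
```

Because this is text scraping, it is inherently more fragile than consuming pyright's JSON output, which is one reason to prefer pyright when it is available.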
The EmbeddedDiagnosticSetup implements prepare(module_root, ctx). It calls ctx.ensure_ready(Lang::Python, module_root), which initializes the LSP client for Python if needed (for example, starting pyright-langserver).
The Python LSP uses PyrightConfig in runtimes-python/src/pyright_client.rs:
find_executable_in_workspace — Looks for .venv/bin/pyright-langserver or .venv/bin/pyright under the workspace root. This ensures the venv’s pyright is used when available.
find_executable — Falls back to which for pyright (with --langserver) or pylsp in PATH.
server_args_for_path — For pyright-langserver, passes --stdio so it communicates over stdin/stdout.
The plugin registers embedded_lsp_client_init and embedded_lsp_client_get so the LSP server can start and reuse the pyright client for diagnostics.
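The workspace lookup order can be illustrated with a short sketch. The function find_pyright_in_venv is hypothetical and only mirrors the described behavior of find_executable_in_workspace: prefer pyright-langserver, then pyright, both under .venv/bin. It assumes the Unix venv layout; Windows virtual environments use a Scripts directory instead, which this sketch does not handle.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical lookup (not the project's implementation): returns the first
/// pyright executable found in the workspace's virtual environment.
fn find_pyright_in_venv(workspace_root: &Path) -> Option<PathBuf> {
    let bin = workspace_root.join(".venv").join("bin");
    // Candidates in priority order: the language server binary first.
    ["pyright-langserver", "pyright"]
        .iter()
        .map(|name| bin.join(name))
        .find(|candidate| candidate.is_file())
}
```

Checking the venv before PATH means the pyright version installed from requirements.txt is the one the editor talks to, so diagnostics stay consistent with the build environment.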
The table below summarizes the LSP components and their roles:
| Component | Purpose |
|---|---|
| VirtualFileBuilder | Builds a virtual .py file from embedded blocks; maintains section mappings between the .bench file and the virtual file |
| EmbeddedDiagnosticProvider | Runs pyright or py_compile on the virtual file; returns EmbeddedDiagnostic with virtual positions |
| EmbeddedDiagnosticSetup | Calls prepare so the LSP client is ready before diagnostics |
| EmbeddedLspClient (pyright_client) | Wraps pyright-langserver; find_executable_in_workspace checks .venv/bin/pyright-langserver |
| ProjectRootDetector | Detects Python root via requirements.txt or pyproject.toml |

The table below summarizes the steps covered in this case study:

| Section | What it does |
|---|---|
| Lang | Adds python/py to the grammar and config |
| Manifest | PythonConfig with deps; has_runtime |
| Init | Creates requirements.txt |
| Build | Creates venv, runs pip install |
| Runtime | initialize, generate_check_source, compile_check, run_benchmark, shutdown |
| ErrorMapper | Passthrough (or real mapping for compiled langs) |
| Plugin | Bundles all; #[distributed_slice(PLUGINS)] |
For the full checklist and LSP details, see Adding a Runtime.