Real-Time Code Generation¶
Status: shipped — Python codegen pipeline + C99 target + PIL parity bench. ARM Cortex-M7 / Zynq target profiles + cycle-budget enforcement are the natural follow-ups.
pulsim.codegen takes a Pulsim Circuit plus a fixed step dt and
emits a self-contained C99 module that integrates the same dynamics on
any target with a float and a basic libm. The generated model.c
has no malloc, no globals, no platform dependencies — drop it into
your control loop as-is.
This is the headline feature for HIL (Hardware-in-the-Loop) / RCP (Rapid Control Prototyping) workflows: simulate on the desktop, code-gen, run the same model on a Cortex-M7 / Zynq / microcontroller / FPGA-soft-core / RTOS without re-tuning anything.
TL;DR¶
import pulsim
from pulsim.codegen import generate
ckt = ... # your existing Circuit
summary = generate(ckt, dt=1e-6, out_dir="gen/buck")
print(f"State: {summary.state_size}D, "
f"stability radius {summary.stability_radius:.4f}")
print(f"ROM ≈ {summary.rom_estimate_bytes} B, "
f"RAM ≈ {summary.ram_estimate_bytes} B")
That writes three files:
| File | Purpose |
|---|---|
model.h |
Header with PulsimModel struct + step API |
model.c |
Discrete-time matrices as static const float[][] + step body |
model_test.c |
Tiny PIL harness — reads n_steps, u_const from argv, prints the y trace |
The step API is target-agnostic:
typedef struct PulsimModel {
float x[PULSIM_STATE_SIZE];
} PulsimModel;
void pulsim_model_init(PulsimModel* m);
void pulsim_model_step(PulsimModel* m, const float* u, float* y);
Pipeline¶
┌─ DC OP ┌─ matrix exponential
▼ ▼
Circuit → linearize → reduce → discretize → emit C99
(E,A,B,C,D) (drop alg. rows) (A_d, B_d) (model.c/h)
↓
stability check
rho(A_d) < 1
- Linearize at the DC operating point via the C++
Simulator::linearize_around(shipped inadd-frequency-domain-analysis). Returns continuous-time(E, A, B, C, D)in descriptor form. - Reduce the descriptor form to a regular state-space by
eliminating the algebraic rows of
E(V-source branch equations, etc). Outputs a regular(A_red, B_red)plus the state-projection matrix. -
Discretize via the matrix exponential trick:
┌A B⎤ ┌A_d B_d⎤ ⎢ ⎥ ·dt → exp ⎢ ⎥ ⎣0 0⎦ ⎣ 0 I ⎦
Single expm call gives both A_d and B_d even when A is
singular (integrator-only systems).
4. Stability check: every eigenvalue of A_d must sit inside the
unit circle. Codegen fails loud with RuntimeError when
rho(A_d) ≥ 1 and recommends shrinking dt.
5. Emit C99 from a Python template. Matrices are baked in as
static const float[N][N]; the step body is a plain triply-nested
loop (no BLAS dependency).
Target profiles¶
| Target | Status |
|---|---|
c99 |
shipped — portable baseline; tested via gcc on macOS / Linux |
arm-cortex-m7 |
deferred — wants CMSIS-DSP optional matrix-vector + cycle-budget enforcement |
zynq-baremetal |
deferred — wants HW-IP-block stub + Xilinx tool integration |
The generated c99 code is naturally portable: any C99 compiler
including arm-none-eabi-gcc, xc32-gcc, riscv-gcc, etc. will
compile it. The deferred targets are about target-specific
optimization (CMSIS-DSP, custom NEON / SSE intrinsics) and
toolchain integration (Makefiles, IDE projects), not about the C99
itself.
PIL parity (gate G.1)¶
test_codegen.py::test_codegen_pil_parity_against_native is the
contract:
- Generate
model.c / model.h / model_test.cfrom a Pulsim Circuit. - Compile the harness with the system gcc (or
cc). - Run the binary with constant input for N steps; capture the y trace.
- Compute the same trace independently in Python via direct matrix
multiplication using the
summary.A_d / B_d / C / D. - Assert per-step output agreement within ±0.1 % relative tolerance (with a small absolute floor near zero) — gate G.1 of the change.
When no C compiler is on PATH the test gracefully pytest.skips so
it survives bare CI environments.
Stability check¶
The codegen pipeline rejects any (circuit, dt) pair whose discrete
state matrix has spectral radius ≥ 1. The error message names the
spectral radius and recommends shrinking dt:
RuntimeError: codegen: discrete state matrix is unstable
(spectral radius 1.234567 ≥ 1.0). Choose a smaller dt or check the
circuit for sign-flipped passive elements.
The standard rule of thumb is |λ_max(A) · dt| ≤ ln(2) ≈ 0.693, which
gives rho(A_d) ≤ 2. We enforce the tighter < 1 because anything
≥ 1 doesn't decay; the error margin is the user's call to make
explicitly.
ROM / RAM estimate¶
CodegenSummary.rom_estimate_bytes is 4·(|A_d| + |B_d| + |C| + |D|)
— the total number of floats baked in. RAM is 4·2·state_size (the
state vector plus one scratch). These are coarse but accurate to ±20 %
on the gcc -O2 baseline; tighter estimates require actually compiling
for the target which is the deferred work.
| Circuit | State | ROM | RAM |
|---|---|---|---|
| RC low-pass | 1 | 56 B | 8 B |
| RLC | 2 | 192 B | 16 B |
| Buck output stage (LC + R) | 2 | 192 B | 16 B |
| 16-topology PFC | 16 | ~ 4 KB | 128 B |
The 8-KB ROM / 512-B RAM gate from the proposal is comfortable for
single-topology converters; multi-topology PWL state machines
(switching between several A_d matrices per topology bitmask) push
toward the budget on Cortex-M0 / M3, which is why the cycle-budget
enforcement (Phase 6) is tracked as a follow-up.
Limitations / follow-ups¶
- Multi-topology switching: today the codegen captures a single
topology (the linearization at the DC OP). A production HIL model
for a switching converter wants
Ndiscrete matrix sets keyed by the switch state bitmask, with aswitch (topology)body inpulsim_model_step. Tracked alongside therefactor-pwl-switching-enginefollow-up that surfaces the topology bitmask as a runtime input. - Cycle-budget enforcement (
--max-cyclesflag): requires a target-specific cycle model (Cortex-M7 in-order, Zynq A9, etc). Deferred. - YAML
codegen:section: declarative config fordt,target,out_dir,reachable_topologies. Lands once the Circuit-variant integration parser dispatch is final. - CLI
pulsim codegen <netlist.yaml>: the Python API atpulsim.codegen.generate(...)is final; the CLI wrapper is a thin argparse layer that can ship later. - PIL on Cortex-M7 via qemu-arm: works in principle (the C99 is portable), needs the qemu integration in CI to run.
See also¶
ac-analysis.md— thelinearize_aroundAPI the codegen pipeline calls under the hood.linear-solver-cache.md— the per-key cache the multi-topology codegen will reuse via the topology bitmask.converter-templates.md— the auto- designed converter circuits that are typical codegen inputs.- Runnable example:
examples/python/05_codegen_pil.py(Pulsim → C99 → gcc → ctypes round-trip with rel. error ≤ 1e-6).