Live Benchmark Monitor

Real-time observability while a benchmark run executes: system telemetry, task progression, per-task streaming details, and live pass/fail accounting.

Open Live Monitor ↗

What It Is

The live monitor is a standalone dashboard that connects to the benchmark runner's SSE telemetry endpoint (/api/benchmark/telemetry/stream) and updates in real time. It is useful only while a run_benchmark.py session is active; outside of that context it idles and waits for events.
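To make the SSE mechanism concrete, here is a minimal Python sketch of how a client could split a Server-Sent Events text stream into individual events. The framing rules (blank line ends an event, lines starting with ":" are comments, "event:" and "data:" fields) come from the SSE format itself. The event name task_result and the JSON payload fields are hypothetical; the actual schema published by the runner is not described here.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of decoded text lines."""
    event_type, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            # Events without any data lines are not dispatched.
            if data_parts:
                yield {"event": event_type, "data": "\n".join(data_parts)}
            event_type, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is stripped
            if field == "event":
                event_type = value
            elif field == "data":
                data_parts.append(value)


# Hypothetical sample stream; the real payload schema is an assumption.
sample = [
    ": keep-alive\n",
    "event: task_result\n",
    'data: {"task": "t1", "passed": true}\n',
    "\n",
]
events = list(iter_sse_events(sample))
payload = json.loads(events[0]["data"])
```

Against a live run, the same function could be fed the decoded lines of an HTTP response from /api/benchmark/telemetry/stream; the host and port depend on how the runner is deployed.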

It complements the static benchmark reports (Guided, Claude) by showing the execution side: Is the server keeping up? Is the GPU thermally stable? Which tasks just passed or failed?

Access

Realtime Telemetry

Six auto-refreshing ECharts time-series, each toggleable via the buttons above the grid. Each chart scrolls a 60-sample window.
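The 60-sample scrolling window can be illustrated with a bounded buffer. The sketch below is in Python for illustration only (the dashboard itself renders with ECharts), and the RollingSeries class and its push method are hypothetical names, not part of the monitor's code.

```python
from collections import deque

WINDOW = 60  # samples kept per chart, as stated above


class RollingSeries:
    """Keeps only the most recent `maxlen` (timestamp, value) points."""

    def __init__(self, maxlen: int = WINDOW) -> None:
        # deque with maxlen drops the oldest point automatically on append.
        self.points = deque(maxlen=maxlen)

    def push(self, timestamp: float, value: float) -> None:
        self.points.append((timestamp, value))


series = RollingSeries()
for i in range(75):
    series.push(i, i * 0.5)
```

After 75 pushes the buffer holds only the newest 60 points, which is what makes the chart appear to scroll rather than grow.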

Task Lists

Three pill grids updated live as tasks complete. Each grid can be hidden via the toggle buttons at the top of the section.

A pulsing amber halo on a running task dot indicates the active LLM call. It disappears as soon as the result is evaluated.

Dashboard Panel

Three KPI cards that summarize the current moment without needing to read the charts.

Streaming Details

The bottom card shows the full drill-down for the task currently executing (or the last one that ran, labelled "last task").

Workflow Progress pills

Five sequential stages, each showing elapsed time once passed:

Text panels

When No Run Is Active

All panels show dashes or empty grids. A yellow waiting-indicator banner appears: "Waiting for benchmark events…". The SSE connection retries automatically. As soon as run_benchmark.py starts publishing events, the monitor fills in live without any page reload.
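In a browser, the standard EventSource API reconnects on its own after a dropped connection. A non-browser client would need its own retry loop; the following Python sketch shows one way to do that. The function name consume_with_retry, the connect and handle callables, the retry delay, and the event names in the example are all assumptions for illustration.

```python
import time
from typing import Callable, Iterable, Optional


def consume_with_retry(
    connect: Callable[[], Iterable[str]],
    handle: Callable[[str], None],
    retry_delay: float = 3.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call connect() repeatedly, passing each received event to handle().

    Returns the number of connection attempts made. With max_attempts=None
    the loop runs until interrupted, like an always-open dashboard.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            for event in connect():
                handle(event)
        except OSError:
            pass  # connection refused or dropped; retry after the delay
        if max_attempts is None or attempts < max_attempts:
            sleep(retry_delay)
    return attempts


# Example with a fake connection: the first attempt fails (no run active),
# the second delivers two hypothetical events.
calls = []


def fake_connect():
    calls.append(1)
    if len(calls) == 1:
        raise ConnectionRefusedError("runner not started yet")
    return iter(["run_started", "task_result"])


seen = []
attempts = consume_with_retry(
    fake_connect, seen.append, retry_delay=0, max_attempts=2, sleep=lambda s: None
)
```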

Troubleshooting