Roadmap
What stepper has shipped, what is being hardened for real-world use, and the limits deliberately deferred.
This page tracks stepper's delivery status at a glance: shipped capabilities, work in progress toward day-to-day use, limits deliberately deferred (with rationale), and maintenance follow-ups. The other pages cover how each piece works; this one is the high-level state.
Shipped
Implemented, test-covered, and available in the current release.
- **Layered pipeline & multi-provider** — an orchestrator delegates through an ordered pipeline of sub-agent layers, each with its own provider, model, and fresh context window; sequential or parallel fan-out (`assign_tasks` + a live worker panel).
- **Permission system** — `auto`/`plan`/`accept-edits` modes, `allow`/`ask`/`deny` rules with persisted approvals, compound-bash escalation, and fail-closed headless runs.
- **Auth** — provider keys via env vars or the OS keyring (`stepper auth set-key`/`delete-key`), plus Codex (ChatGPT) OAuth.
- **Sessions & control** — session resume, checkpoint + `/rewind`, model-driven compaction, hooks, skills (progressive disclosure), slash commands, and MCP (stdio/HTTP) servers.
- **Opt-in OS sandbox** — a macOS Seatbelt profile confines the `bash` tool's writes to the project (defense-in-depth under the permission engine).
- **Test hardening** — isolation-invariant CI, core integration tests (orchestrator, compaction, session/rewind, cost, parallel layer, dispatch, cancellation), the permission matrix, TUI render snapshots, hermetic MCP echo, and provider fixtures (837 network-less tests in total).
- **Live end-to-end** — the two-layer pipeline (ollama-cloud → oMLX), streaming, `/rewind`, and resume are validated against real providers (kept `#[ignore]` + `STEPPER_E2E`-gated so the default `cargo test` skips them).
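The double gate on the live end-to-end tests can be sketched as follows. This is a minimal illustration, not stepper's actual test code: the test name, the `e2e_enabled` helper, and the rule that any non-empty value other than `0` enables the run are all assumptions. Only the `#[ignore]` attribute and the `STEPPER_E2E` variable name come from the text above.

```rust
/// Decides whether live tests may run, given the raw value of the gate variable.
/// Assumption: any non-empty value other than "0" counts as enabled.
fn e2e_enabled(value: Option<&str>) -> bool {
    matches!(value, Some(v) if !v.is_empty() && v != "0")
}

#[cfg(test)]
mod tests {
    use super::e2e_enabled;
    use std::env;

    // First gate: `#[ignore]` keeps this out of a plain `cargo test`.
    #[test]
    #[ignore = "live end-to-end test; needs real providers"]
    fn two_layer_pipeline_live() {
        // Second gate: even when ignored tests are requested, return early
        // unless STEPPER_E2E is set.
        let flag = env::var("STEPPER_E2E").ok();
        if !e2e_enabled(flag.as_deref()) {
            eprintln!("STEPPER_E2E not set; skipping live test");
            return;
        }
        // Hypothetical: drive the real pipeline here and assert on its output.
    }
}
```

With a layout like this, a plain `cargo test` never touches the network, while something like `STEPPER_E2E=1 cargo test -- --ignored` opts in to the live run.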
In progress
Implemented but still being hardened against live, day-to-day use.
- The interactive tty TUI driven by a live streaming model (the headless `-p` path and the orchestrator are already validated live).
- Codex (ChatGPT) backend live auth and streaming.
- `/init` scaffolding refinement and the `/rewind` / resume user experience.
Deferred (accepted limits)
Known limits intentionally not addressed yet, with the reasoning.
- **WriteFile TOCTOU symlink swap** — out of scope for a single-user dev CLI.
- **gix-backed checkpoints** — the copy-based snapshotter works; a git backend is a later optimization.
- **Hermetic keyring test** — the OS keychain is unavailable in CI, so the keyring stays integration-tested only.
- **Live MCP HTTP auth** and the `McpManager::connect` success path — both need a live server.
- **Background `!cmd &` sandbox parity** — the foreground `bash` tool is confined; the background path (`proc.rs`) needs the writable roots threaded from the TUI before it can be sandboxed too.
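For readers unfamiliar with the WriteFile TOCTOU item above, the general shape of a check-then-write race looks roughly like this. The sketch is generic and is not taken from stepper's source: `checked_path` and `write_checked` are hypothetical names, and the containment check shown is only one plausible way such a tool might validate a path.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// The "check" step: resolve the target's parent directory and confirm it
/// lies under `root`. Hypothetical helper, not stepper's WriteFile code.
fn checked_path(root: &Path, target: &Path) -> io::Result<PathBuf> {
    let root = root.canonicalize()?;
    // Canonicalize the parent so the file itself does not need to exist yet.
    let parent = match target.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    };
    let parent = parent.canonicalize()?;
    if !parent.starts_with(&root) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path escapes project root",
        ));
    }
    let name = target
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "missing file name"))?;
    Ok(parent.join(name))
}

/// The "use" step: write to the path that was just checked.
fn write_checked(root: &Path, target: &Path, contents: &[u8]) -> io::Result<()> {
    let path = checked_path(root, target)?;
    // Race window: if another process swaps a path component (or the file
    // itself) for a symlink between the check above and the write below,
    // the write follows the new link and can land outside `root`.
    fs::write(path, contents)
}
```

Closing that window generally means resolving and opening the path in a single step relative to an already-open directory handle, which is more machinery than a single-user dev CLI usually needs.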
Maintenance
- De-duplicate the transitive `reqwest` 0.12 / 0.13 versions.
- Move the GitHub Actions release/deploy workflows off the deprecated Node.js 20 actions before they are removed.