Layers

Configure per-layer models, permissions, tools, and skills; run layers sequentially or in parallel fan-out with task assignment.

Layers are the individual step agents in your pipeline. Each one runs in its own sub-process with its own provider, model, and fresh context window. Layers execute in the order given by the step array in setting.json. Each layer hands only its free-text summary to the next layer, not the full conversation history or tool outputs.
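The handoff rule can be pictured with a small Python model. This is an illustrative sketch only, not the tool's implementation: `run_pipeline`, `run_layer`, and `fake_layer` are hypothetical names, and a real run spawns a sub-process per layer rather than calling a function.

```python
from typing import Callable, List

def run_pipeline(steps: List[str], run_layer: Callable[[str, str], str], task: str) -> str:
    """Run layers in order; each layer sees only the previous summary."""
    handoff = task
    for name in steps:
        # The layer starts with a fresh context: its only input is the
        # previous layer's free-text summary (or the initial task).
        handoff = run_layer(name, handoff)
    return handoff

def fake_layer(name: str, incoming: str) -> str:
    # Hypothetical stand-in for a real layer run; returns a short summary.
    return f"{name} done (saw: {incoming})"

final_summary = run_pipeline(["plan", "implement", "test"], fake_layer, "add a login form")
```

Nothing other than the returned string crosses a layer boundary, which is why each layer's closing summary has to carry everything the next layer needs.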

Layer file structure

Each name in step may have a layer file at .stepper/layer/<name>/index.md. This file contains a YAML frontmatter block (configuration) followed by the layer's system prompt body.
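As a rough illustration of that two-part layout, the sketch below splits a layer file into its frontmatter text and prompt body using only the standard library. The function name `split_layer_file` is hypothetical, the error handling is an assumption, and the frontmatter is left as raw text here rather than being parsed with a YAML library.

```python
from typing import Tuple

def split_layer_file(text: str) -> Tuple[str, str]:
    """Return (frontmatter_yaml, prompt_body) from a layer file's text."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("layer file must start with a '---' frontmatter fence")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the two fences is configuration;
            # everything after the closing fence is the system prompt.
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("frontmatter block is not closed with '---'")

sample = """---
description: implementation layer
steps: 40
---
You are the implementation layer.
"""
frontmatter, body = split_layer_file(sample)
```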

markdown

---
description: implementation layer        # required
model: omlx/deepseek-coder               # or set provider: to use that provider's default model
temperature: 0.2                         # sampling overrides sent with the request
top_p: 0.9
tools:
  allow: [read_file, write_file, edit_file, bash]
  deny:  [web_fetch]
permission:                              # per-layer overrides (tighten-only)
  Bash(rm *): deny
  Write(**): ask
mcp:
  allow: [context7]                      # which MCP servers this layer sees
skills: [rust-style]                     # skill bodies injected into the system prompt
steps: 40                                # step cap (ReAct iterations)
on-failure: skip                         # stop (default) | skip the layer and continue
retries: 1                               # extra attempts before applying on-failure
color: green
---
You are the implementation layer. Carry out the plan using the tools.

Frontmatter fields

description (required): A brief description of the layer's role.
model or provider: Override the default model for this layer. Set model in provider/model-id format (e.g., omlx/deepseek-coder), or set provider alone to use that provider's default model.
temperature and top_p: Sampling parameters passed to the API request.
tools.allow and tools.deny: Control which tools (e.g., read_file, write_file, bash) this layer can call.
permission: Per-layer permission rules that merge with the global rules; they can only tighten restrictions (see Permission tightening).
mcp.allow: The MCP servers this layer can access (e.g., [context7]).
skills: Array of skill names available to this layer; bodies are lazy-loaded on demand.
steps: Maximum number of ReAct iterations before the layer stops.
on-failure: Either stop (default; halt the pipeline) or skip (skip this layer and continue with the next one).
retries: Number of extra attempts before on-failure is applied.
color: Optional TUI color label for the layer.
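The way retries and on-failure interact can be sketched as follows. This is a minimal model under stated assumptions: `run_with_policy` is a hypothetical helper, a failed attempt is represented by a `RuntimeError`, and a skipped layer is represented by `None`. The source does not say what the next layer receives after a skip.

```python
from typing import Callable, Optional

def run_with_policy(attempt: Callable[[], str], retries: int = 0,
                    on_failure: str = "stop") -> Optional[str]:
    """Try a layer up to 1 + retries times, then apply on-failure."""
    for _ in range(1 + retries):
        try:
            return attempt()
        except RuntimeError:
            continue  # attempt failed; use the next attempt if any remain
    if on_failure == "skip":
        return None  # layer skipped; the pipeline continues
    raise RuntimeError("layer failed; pipeline stopped")

calls = {"n": 0}

def flaky() -> str:
    # Fails on the first call, succeeds on the second.
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient failure")
    return "summary"

def always_fail() -> str:
    raise RuntimeError("permanent failure")

result = run_with_policy(flaky, retries=1, on_failure="skip")
skipped = run_with_policy(always_fail, retries=1, on_failure="skip")
```

With retries: 1 the layer gets two attempts in total, and on-failure is consulted only after the last attempt fails.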

Permission tightening

Per-layer permission rules are merged onto the global rules, and conflicts resolve in the order deny > ask > allow, so the strictest rule wins. A layer can therefore only **tighten** restrictions; it cannot relax a base deny rule. For example, if the global config denies Bash(rm *), a layer cannot allow it, but a layer can set ask to require confirmation for an action that is globally allowed.
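A tighten-only merge of this kind might look like the sketch below. It is a simplification under stated assumptions: `merge_permissions` and `STRICTNESS` are hypothetical names, rules are compared by exact pattern string (overlapping glob patterns are not modeled), and a layer rule with no global counterpart is simply added.

```python
# Higher number = stricter action, mirroring deny > ask > allow.
STRICTNESS = {"allow": 0, "ask": 1, "deny": 2}

def merge_permissions(global_rules: dict, layer_rules: dict) -> dict:
    """Overlay layer rules on global rules, keeping the stricter action."""
    merged = dict(global_rules)
    for pattern, action in layer_rules.items():
        base = merged.get(pattern)
        # A layer rule replaces the base rule only when it is stricter.
        if base is None or STRICTNESS[action] > STRICTNESS[base]:
            merged[pattern] = action
    return merged

global_rules = {"Bash(rm *)": "deny", "Write(**)": "allow"}
layer_rules = {"Bash(rm *)": "allow", "Write(**)": "ask"}
effective = merge_permissions(global_rules, layer_rules)
```

Here the layer's attempt to allow Bash(rm *) has no effect, while its ask on Write(**) takes precedence over the global allow.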

Parallel layers (fan-out)

A layer can run as a fan-out: one concurrent worker per subtask, each with its own fresh context window and the same layer configuration. All workers are joined before the next layer processes the result. To enable parallel execution, set parallel: true in the layer's frontmatter.

Task assignment

The **preceding** layer (the one before the parallel layer) is offered the assign_tasks tool. It calls assign_tasks({ tasks: [{label, prompt}, …] }) to split the work into subtasks, and each subtask becomes one worker in the parallel layer. If the preceding layer assigns no tasks, the parallel layer runs once with no task context.
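To make the task-to-worker mapping concrete, here is a small Python model of it. The helper `plan_workers` is hypothetical, the sample labels and prompts are invented for illustration, and the check that both label and prompt are present is an assumption about validation rather than documented behavior.

```python
from typing import Dict, List, Optional

def plan_workers(tasks: List[Dict[str, str]]) -> List[Optional[Dict[str, str]]]:
    """Map assigned subtasks to workers: one worker per task."""
    for task in tasks:
        if not {"label", "prompt"} <= set(task):
            raise ValueError("each task needs 'label' and 'prompt'")
    # No assigned tasks: the parallel layer still runs once, with no task context.
    return list(tasks) if tasks else [None]

# Shape of the assign_tasks argument, written as a Python dict.
payload = {"tasks": [
    {"label": "api", "prompt": "Implement the /login endpoint."},
    {"label": "ui", "prompt": "Build the login form component."},
]}
workers = plan_workers(payload["tasks"])
fallback = plan_workers([])
```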

Concurrency and worker panel

Use parallel-max to cap the number of workers running at once (e.g., parallel-max: 4). The cap limits concurrency but never drops tasks: excess tasks queue and start as running workers finish. The TUI displays a live **worker panel** with one row per worker, showing its current status, the last tool called, and its token count. After all workers finish, their summaries converge into a single handoff for the next layer.
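The queueing behavior of a concurrency cap can be demonstrated with Python's standard thread pool. This is only an analogy under stated assumptions: `fan_out` and `run_worker` are hypothetical names, real workers are full layer runs rather than function calls, and joining summaries with newlines in task order is a guess at the merge format.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

def run_worker(task: Dict[str, str]) -> str:
    # Hypothetical stand-in for a real worker run; returns that worker's summary.
    return f"[{task['label']}] finished"

def fan_out(tasks: List[Dict[str, str]], parallel_max: int) -> str:
    """Run every task with at most parallel_max workers active at once."""
    # Tasks beyond the cap wait in the executor's queue; none are dropped.
    with ThreadPoolExecutor(max_workers=parallel_max) as pool:
        summaries = list(pool.map(run_worker, tasks))  # results keep task order
    # Join all worker summaries into one handoff for the next layer.
    return "\n".join(summaries)

tasks = [{"label": f"task-{i}", "prompt": f"do part {i}"} for i in range(6)]
handoff = fan_out(tasks, parallel_max=4)
```

Six tasks with a cap of four still produce six summaries: the last two tasks start once earlier workers free up a slot.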

markdown

---
description: implementation layer
parallel: true            # fan out — one worker per assigned subtask
parallel-max: 4           # max workers running AT ONCE (concurrency cap; no task is dropped)
model: omlx/deepseek-coder
tools:
  allow: [read_file, write_file, edit_file, bash]
---
You are one implementation worker. Complete only your assigned subtask, then
end with a concise summary of what you changed.

Workflow example

With step: ["plan", "implement", "test"] and implement marked parallel: true, the pipeline runs as follows. The plan layer outputs its summary and calls assign_tasks to split the work into implementation subtasks. Each subtask starts one implement worker with the full layer config. Once all workers complete, test receives the merged summary of all worker changes.

The model-callable dispatch tool (available when dispatch.enabled: true) uses the same worker panel for its sub-agents.