Runtime
Runtime
Section titled “Runtime”The Perstack runtime combines probabilistic LLM reasoning with deterministic state management — making agent execution predictable, reproducible, and auditable.
Execution model
Section titled “Execution model”The runtime organizes execution into a three-level hierarchy:
Job (jobId) ├── Run 1 (runId, Coordinator Expert) │ └── Checkpoints... ├── Run 2 (runId, Delegated Expert A) │ └── Checkpoints... └── Run 3 (runId, Delegated Expert B) └── Checkpoints...| Concept | Description |
|---|---|
| Job | Top-level execution unit. Created per perstack run invocation. Contains all Runs. |
| Run | Single Expert execution. Each delegation creates a new Run within the same Job. |
| Checkpoint | Snapshot at the end of each step within a Run. |
Coordinator vs. Delegated Expert
Section titled “Coordinator vs. Delegated Expert”| Role | Description |
|---|---|
| Coordinator Expert | The initial Expert that starts a Job. Has full capabilities. |
| Delegated Expert | Expert started via delegation. Restricted capabilities. |
Key differences:
| Capability | Coordinator | Delegated |
|---|---|---|
| Interactive tool calls | ✅ Available | ❌ Not available |
--continue / --resume-from | ✅ Supported | ❌ Not supported |
| Context from parent | N/A | Only the query (no shared history) |
Delegated Experts cannot use interactive tools. See Why no interactive tools for delegates?
Agent loop
Section titled “Agent loop”Each Run executes through an agent loop:
┌─────────────────────────────────────────────────────┐│ 1. Reason → LLM decides next action ││ 2. Act → Runtime executes tool ││ 3. Record → Checkpoint saved ││ 4. Repeat → Until completion or limit │└─────────────────────────────────────────────────────┘The loop ends when:
- LLM calls
attemptCompletionwith all todos complete (or no todos) - Job reaches
maxStepslimit - External signal (SIGTERM/SIGINT)
When attemptCompletion is called, the runtime checks the todo list. If incomplete todos remain, they are returned to the LLM to continue work. This prevents premature completion and ensures all planned tasks are addressed.
Step counting
Section titled “Step counting”Step numbers are continuous across all Runs within a Job. When delegation occurs, the delegated Run continues from the parent’s step number:
Job (totalSteps = 8) ├── Run 1 (Coordinator): step 1 → 2 → delegates at step 3 │ ↓ ├── Run 2 (Delegate A): step 3 → 4 → completes │ ↓ └── Run 1 continues: step 5 → 6 → 7 → 8The maxSteps limit applies to the Job’s total steps across all Runs.
Stopping and resuming
Section titled “Stopping and resuming”npx perstack run my-expert "query" --max-steps 50| Stop condition | Behavior | Resume from |
|---|---|---|
attemptCompletion (no remaining todos) | Task complete | N/A |
attemptCompletion (remaining todos) | Continue loop | N/A (loop continues) |
maxSteps reached | Graceful stop | Coordinator’s last checkpoint |
| SIGTERM/SIGINT | Immediate stop | Coordinator’s previous checkpoint |
--continueand--resume-fromonly work with the Coordinator Expert’s checkpoints. You cannot resume from a Delegated Expert’s checkpoint.
Deterministic state
Section titled “Deterministic state”LLMs are probabilistic — same input can produce different outputs. Perstack draws a clear boundary:
| Probabilistic (LLM) | Deterministic (Runtime) |
|---|---|
| Which tool to call | Tool execution |
| Todo management decisions | State recording |
| Reasoning | Checkpoint creation |
The “thinking” is probabilistic; the “doing” and “recording” are deterministic. This boundary enables:
- Reproducibility: Replay from any checkpoint with identical state
- Testability: Mock the LLM, test the runtime deterministically
Event, Step, Checkpoint
Section titled “Event, Step, Checkpoint”Runtime state is built on three concepts:
| Concept | What it represents |
|---|---|
| Event | A single state transition (tool call, result, etc.) |
| Step | One cycle of the agent loop |
| Checkpoint | Complete snapshot at step end — everything needed to resume |
This combines Event Sourcing (complete history) with Checkpoint/Restore (efficient resume).
The perstack/ directory
Section titled “The perstack/ directory”The runtime stores execution history in perstack/jobs/ within the workspace:
/workspace└── perstack/ └── jobs/ └── {jobId}/ ├── job.json # Job metadata └── runs/ └── {runId}/ ├── run-setting.json # Run configuration ├── checkpoint-{timestamp}-{step}-{id}.json └── event-{timestamp}-{step}-{type}.jsonThis directory is managed automatically — don’t modify it manually.
For cloud deployments, see Storage Backends to configure S3 or R2 storage.
Event notification
Section titled “Event notification”The runtime emits events for every state change. Two options:
stdout (default)
Section titled “stdout (default)”Events are written to stdout as JSON. This is the safest option for sandboxed environments — no network access required.
npx perstack run my-expert "query"Your infrastructure reads stdout and decides what to do with events. See Sandbox Integration for the rationale.
Custom event listener
Section titled “Custom event listener”When embedding the runtime programmatically, use a callback:
import { run } from "@perstack/runtime"
await run(params, { eventListener: (event) => { // Send to your monitoring system, database, etc. }})Skills (MCP)
Section titled “Skills (MCP)”Experts use tools through MCP (Model Context Protocol). The runtime handles:
- Lifecycle: Start MCP servers with Expert, clean up on exit
- Environment isolation: Only
requiredEnvvariables are passed - Error recovery: MCP failures are fed back to LLM, not thrown as runtime errors
For skill configuration, see Skills.
Base skill optimization
Section titled “Base skill optimization”The @perstack/base skill provides essential tools (file operations, exec, etc.) required by every Expert. To minimize startup latency, the runtime bundles this skill and uses in-process communication:
| Configuration | Transport | Latency |
|---|---|---|
| Default (no version specified) | InMemoryTransport | <50ms |
Explicit version (@perstack/base@1.0.0) | StdioTransport via npx | ~500ms |
Custom command (perstackBaseSkillCommand) | StdioTransport | Varies |
How it works:
- InMemoryTransport: The bundled base skill runs in the same process as the runtime, using MCP SDK’s
InMemoryTransportfor zero-IPC communication - Version pinning fallback: When you specify an explicit version (e.g.,
@perstack/base@0.0.34), the runtime falls back to spawning vianpxto ensure reproducibility
When to pin versions:
Pin a specific base skill version when:
- You need reproducible builds across environments
- You’re debugging version-specific behavior
- Your deployment requires exact version control
# Default: uses bundled base (fastest)[experts.my-expert.skills."@perstack/base"]type = "mcpStdioSkill"command = "npx"packageName = "@perstack/base"
# Pinned version: uses npx (slower but reproducible)[experts.my-expert.skills."@perstack/base"]type = "mcpStdioSkill"command = "npx"packageName = "@perstack/base@0.0.34"Native reasoning
Section titled “Native reasoning”Perstack supports native LLM reasoning (extended thinking / test-time scaling) for providers that support it. Configure via reasoningBudget:
reasoningBudget = "medium" # minimal, low, medium, high, or token countNative reasoning is applied during the final result generation (after tool execution), not during tool selection. This ensures reasoning depth is used where it matters most — synthesizing results.
| Provider | Support | Implementation |
|---|---|---|
| Anthropic | Yes | Extended thinking (budgetTokens) |
| OpenAI | Yes | Reasoning effort (reasoningEffort) |
| Planned | Flash Thinking mode | |
| DeepSeek | Partial | Use deepseek-reasoner model |
Streaming events
Section titled “Streaming events”The runtime supports real-time streaming of LLM output through fire-and-forget events. Events are emitted as deltas arrive, without waiting for acknowledgment.
Event sequence
Section titled “Event sequence”| Phase | Events | Description |
|---|---|---|
| Reasoning | startReasoning → streamReasoning... → completeReasoning | Extended thinking output |
| Result | startRunResult → streamRunResult... → completeRun | Final completion text |
Streaming vs state machine events
Section titled “Streaming vs state machine events”| Event type | State transition? | Purpose |
|---|---|---|
start* | No | Marks stream beginning (display hint) |
stream* | No | Incremental delta (fire-and-forget) |
complete* | completeRun only | Final result with full text |
When streaming is used
Section titled “When streaming is used”- GeneratingToolCall: Streams reasoning; text streamed only if no tool calls
- GeneratingRunResult: Full streaming (reasoning + result)
Tool calls are NOT streamed
Section titled “Tool calls are NOT streamed”Tool calls are structured data requiring complete parsing. They are processed after the stream completes, not incrementally.
Providers and models
Section titled “Providers and models”Perstack uses standard LLM features available from most providers:
- Chat completion (including PDF/image in messages)
- Tool calling
- Native reasoning (where supported)
For supported providers and models, see Providers and Models.