Agent Sandbox
Your agents get their own computer
Agents can do anything a person at a computer could do: analyze files, generate reports, write and run code, and coordinate with other agents. Each agent gets an isolated computing environment with full access to your workspace.
Powered by Runwork AI
Access from anywhere
Web Dashboard
Watch execution streams in real time from the Agents Dashboard. See tool calls, costs, token usage, and output files. Retry failed runs with one click.
Capabilities
Delegate Any Task
Agents hand off complex work to Claude CLI running in an isolated sandbox. Data analysis, report generation, code review, file processing, multi-step research. If it can be done at a computer, your agent can do it.
Full Computing Environment
Not a constrained tool call. A real computer. File system, web access, MCP tools, workspace integrations. Your agents work with the same power a developer has in their terminal.
Cross-Agent Coordination
Agents read each other's outputs and continue previous sessions. No complex protocols needed. Agents coordinate through the filesystem, the simplest interface there is.
Session Persistence
Resume work where you left off. Pass a session ID and the agent picks up with full context of previous work. This suits long-running analysis, iterative refinement, and multi-day projects.
Cost Controls
A three-layer budget system keeps spending predictable: platform credits, agent-level limits, and per-task overrides. Budget warnings at 90% let agents wrap up gracefully, and hard stops apply at 100%.
Programmatic or Built-in
Every agent automatically gets the delegate_task tool. Or call delegateTask() in code for full control over inputs, outputs, and execution limits. Both paths, same power.
Execution Stream
Watch sandbox agents work in real time from the dashboard. Every tool call, text output, and error rendered with timestamps. Expand any execution to see the full event log.
Token and Cost Tracking
Every execution tracks input tokens, output tokens, and USD cost. See the cost per run in the execution history.
Why It Matters
- No infrastructure to manage. Sandboxes are provisioned automatically.
- Workspace credentials flow to the sandbox. No API key juggling.
- The same Claude CLI developers use locally powers the sandbox.
- Agents coordinate through the filesystem. Simple, no protocol overhead.
- You don't need a separate app to run an agent. The workspace computer is already an app.
How It Works
The Agent Sandbox gives your AI agents their own computing environment. When an agent needs to do complex work, it delegates the task to Claude CLI running inside an isolated sandbox with full access to your workspace.
The framework agent decides what needs doing and calls delegate_task with a prompt, input files, and output patterns. The platform provisions a sandbox session in the workspace computer. Claude CLI runs with full workspace context: your skills, MCP servers, integrations, and data. Results flow back to the orchestrating agent.
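As a rough sketch of the hand-off described above, a delegation request could be modeled like this in TypeScript. The interface, its field names (prompt, inputFiles, outputPatterns, sessionId), the buildDelegateRequest helper, and the example paths are all assumptions made for illustration; the actual delegate_task schema is not shown on this page.

```typescript
// Hypothetical request shape. Field names are illustrative, not the documented schema.
interface DelegateTaskRequest {
  prompt: string;            // what the sandboxed agent should do
  inputFiles: string[];      // workspace files the task should read
  outputPatterns: string[];  // patterns for files to hand back to the caller
  sessionId?: string;        // optional: continue a previous session
}

// Hypothetical helper that validates and assembles a request object.
function buildDelegateRequest(
  prompt: string,
  inputFiles: readonly string[],
  outputPatterns: readonly string[],
  sessionId?: string,
): DelegateTaskRequest {
  if (prompt.trim() === "") {
    throw new Error("prompt must not be empty");
  }
  const request: DelegateTaskRequest = {
    prompt,
    inputFiles: [...inputFiles],
    outputPatterns: [...outputPatterns],
  };
  // Only attach a session ID when the caller wants to resume earlier work.
  if (sessionId !== undefined) {
    request.sessionId = sessionId;
  }
  return request;
}

// Example usage with made-up paths.
const request = buildDelegateRequest(
  "Summarize this week's analyzer outputs into an executive report.",
  ["outputs/data-analyzer/weekly.csv"],
  ["outputs/*.md"],
);
console.log(JSON.stringify(request, null, 2));
```

Keeping the request a plain data object means the same shape could back either the built-in tool call or a programmatic call, though how the platform actually encodes it may differ.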
Agents share work through the filesystem. Every agent session lives in /workspace/agents/ where other agents can read previous outputs and continue where someone left off. No inter-agent messaging protocol needed. Just files.
You don't need to build a separate app just to run an agent. The workspace computer is itself an app. Write agents directly against your workspace for single-purpose tasks: a new schedule, a one-off analysis, a recurring report. The CLI works the same way locally, so the experience is consistent everywhere.
Cost controls work in three layers. Your platform credit balance sets the ceiling. Agent definitions set per-task budgets, turn limits, and timeouts. Individual delegateTask() calls can tighten these further. At 90% of budget, the agent gets a warning to wrap up. At 100%, it stops gracefully and returns whatever results it has.
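One way to picture the three layers is as a minimum over the limits in play, with thresholds at 90% and 100% of the result. The sketch below assumes exactly that; the type and function names are hypothetical, and the platform's real enforcement logic is not described here beyond the thresholds.

```typescript
// Hypothetical names throughout. This models the layering, not the platform's code.
interface BudgetLayers {
  platformCreditsUsd: number;  // remaining platform credit balance (the ceiling)
  agentTaskBudgetUsd: number;  // per-task budget from the agent definition
  callOverrideUsd?: number;    // optional per-call limit that can only tighten
}

type BudgetStatus = "ok" | "warning" | "stopped";

// Assumption: the effective budget is the tightest of the layers that are set.
function effectiveBudgetUsd(layers: BudgetLayers): number {
  const limits = [layers.platformCreditsUsd, layers.agentTaskBudgetUsd];
  if (layers.callOverrideUsd !== undefined) {
    limits.push(layers.callOverrideUsd);
  }
  return Math.min(...limits);
}

// Warn at 90% of the budget, stop at 100%.
function budgetStatus(spentUsd: number, budgetUsd: number): BudgetStatus {
  if (spentUsd >= budgetUsd) return "stopped";
  if (spentUsd >= 0.9 * budgetUsd) return "warning";
  return "ok";
}

// Example: a 2 USD per-call override tightens a 5 USD agent budget.
const budget = effectiveBudgetUsd({
  platformCreditsUsd: 50,
  agentTaskBudgetUsd: 5,
  callOverrideUsd: 2,
});
console.log(budget, budgetStatus(1.9, budget));
```

Taking the minimum means a per-call value larger than the agent-level budget has no effect, which matches the idea that individual calls can tighten limits but not loosen them.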
The Execution Stream shows exactly what your sandbox agents are doing. From the Agents Dashboard, expand any execution to see the full event log: every tool call with its input parameters, text responses, errors, and status changes, all with timestamps. The stream auto-updates while agents are running. Each execution also tracks token usage (input and output) and cost in USD, so you see exactly what every run costs. Failed or partial executions can be retried with one click. From the terminal, runwork agents execution <sessionId> shows the same stream.