Runwork
Agent Sandbox

Your agents get their own computer

Agents can do anything a person at a computer could do: analyze files, generate reports, write and run code, and coordinate with other agents. Each agent gets an isolated computing environment with full access to your workspace.

Agent Sandbox dashboard (example): 3 active sessions in /workspace/agents/

data-analyzer · session-a8f3e2 · CRM App · Running
$ claude -p "Analyze Q2 sales data..."
Reading inputs/q2-sales.csv (2,847 rows)
Computing regional breakdown...
Writing outputs/q2-analysis.md
Turn 12/50 · $1.24 of $10.00 · 3m 42s elapsed

report-writer · session-c4d1b7 · CRM App · Complete
Generated weekly executive summary from data-analyzer outputs. 3 files written to outputs/.
8 turns · $0.67 · 1m 15s

code-reviewer · session-f9a2d1 · Workspace · Running
Reviewing auth middleware changes across 14 files. Reading previous session history for context.
Turn 5/30 · $0.89 of $5.00 · 2m 08s elapsed

Access from anywhere

AI Agents

Claude, ChatGPT, Cursor

Web Dashboard

Watch execution streams in real time from the Agents Dashboard. See tool calls, costs, token usage, and output files. Retry failed runs with one click.

CLI

Capabilities

Delegate Any Task

Agents hand off complex work to Claude CLI running in an isolated sandbox. Data analysis, report generation, code review, file processing, multi-step research. If it can be done at a computer, your agent can do it.

Full Computing Environment

Not a constrained tool call. A real computer. File system, web access, MCP tools, workspace integrations. Your agents work with the same power a developer has in their terminal.

Cross-Agent Coordination

Agents read each other's outputs and continue previous sessions. No complex protocols needed. Agents coordinate through the filesystem, the simplest interface there is.

Session Persistence

Resume work where you left off. Pass a session ID and the agent picks up with full context of previous work. This suits long-running analysis, iterative refinement, and multi-day projects.

Cost Controls

A three-layer budget system keeps spending predictable: platform credits, agent-level limits, and per-task overrides. Budget warnings at 90% let agents wrap up gracefully. Hard stops apply at 100%.

Programmatic or Built-in

Every agent automatically gets the delegate_task tool. Or call delegateTask() in code for full control over inputs, outputs, and execution limits. Both paths, same power.

Execution Stream

Watch sandbox agents work in real time from the dashboard. Every tool call, text output, and error rendered with timestamps. Expand any execution to see the full event log.

Token and Cost Tracking

Every execution tracks input tokens, output tokens, and USD cost. See cost per run in the execution history. Budget warnings at 90% let agents wrap up gracefully before the hard stop.

Use Cases

Analyze spreadsheets and generate reports
Review and refactor code across repositories
Process and transform files in bulk
Multi-step research with web access
Data pipeline orchestration
Automated testing and validation

How It Works

The Agent Sandbox gives your AI agents their own computing environment. When an agent needs to do complex work, it delegates the task to Claude CLI running inside an isolated sandbox with full access to your workspace.

The framework agent decides what needs doing and calls delegate_task with a prompt, input files, and output patterns. The platform provisions a sandbox session in the workspace computer. Claude CLI runs with full workspace context: your skills, MCP servers, integrations, and data. Results flow back to the orchestrating agent.
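As a rough illustration of the delegation step described above, the sketch below models a request with a prompt, input files, and output patterns, then filters the files a session wrote against those patterns. The `DelegateRequest` class, its field names, and the `collect_outputs` helper are hypothetical and are not the Runwork API; glob-style matching of output patterns is also an assumption.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class DelegateRequest:
    """Hypothetical shape of a delegate_task call; field names are assumptions."""
    prompt: str
    input_files: list[str] = field(default_factory=list)
    output_patterns: list[str] = field(default_factory=list)


def collect_outputs(request: DelegateRequest, written_files: list[str]) -> list[str]:
    """Return the written files that match any requested output pattern."""
    return [
        path
        for path in written_files
        if any(fnmatch(path, pattern) for pattern in request.output_patterns)
    ]


request = DelegateRequest(
    prompt="Analyze Q2 sales data and summarize regional trends.",
    input_files=["inputs/q2-sales.csv"],
    output_patterns=["outputs/*.md"],
)

# Files a sandbox session might have written (illustrative values only).
written = ["outputs/q2-analysis.md", "scratch/tmp.json"]
matched = collect_outputs(request, written)
print(matched)
```

Only files matching an output pattern are handed back to the orchestrating agent in this sketch; scratch files stay in the session directory.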

Agents share work through the filesystem. Every agent session lives in /workspace/agents/ where other agents can read previous outputs and continue where someone left off. No inter-agent messaging protocol needed. Just files.
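The file-based handoff above might look something like the following. This is a minimal sketch that uses a temporary directory as a stand-in for /workspace/agents/ so it runs anywhere. The `<session-id>/outputs/` layout and the `read_session_outputs` helper are assumptions made for illustration, not a documented directory structure.

```python
import tempfile
from pathlib import Path


def read_session_outputs(agents_root: Path, session_id: str) -> dict[str, str]:
    """Read every file under <agents_root>/<session_id>/outputs/ (assumed layout)."""
    outputs_dir = agents_root / session_id / "outputs"
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(outputs_dir.glob("*"))
        if path.is_file()
    }


# Temporary directory standing in for /workspace/agents/.
with tempfile.TemporaryDirectory() as tmp:
    agents_root = Path(tmp)

    # The first agent writes its result into its own session directory.
    analyzer_out = agents_root / "session-a8f3e2" / "outputs"
    analyzer_out.mkdir(parents=True)
    (analyzer_out / "q2-analysis.md").write_text("# Q2 analysis\n", encoding="utf-8")

    # The second agent reads those outputs and writes its own file.
    upstream = read_session_outputs(agents_root, "session-a8f3e2")
    writer_out = agents_root / "session-c4d1b7" / "outputs"
    writer_out.mkdir(parents=True)
    summary = f"Summary based on {len(upstream)} upstream file(s): {', '.join(upstream)}"
    (writer_out / "summary.md").write_text(summary, encoding="utf-8")

print(summary)
```

The only shared contract between the two agents here is a directory convention, which is the point of coordinating through files rather than messages.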

You don't need to build a separate app just to run an agent. The workspace computer is itself an app. Write agents directly against your workspace for single-purpose tasks: a new schedule, a one-off analysis, a recurring report. The CLI works the same way locally, so the experience is consistent everywhere.

Cost controls work in three layers. Your platform credit balance sets the ceiling. Agent definitions set per-task budgets, turn limits, and timeouts. Individual delegateTask() calls can tighten these further but never loosen them. At 90% of budget, the agent gets a warning to wrap up. At 100%, it stops gracefully and returns whatever results it has.
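The 90% warning and 100% stop can be pictured as a simple threshold check. The sketch below is an assumption about how such a check could be written, not the platform's implementation; the `BudgetState` enum and `budget_state` function are hypothetical names.

```python
from enum import Enum


class BudgetState(Enum):
    OK = "ok"
    WARN = "warn"  # at or above 90% of budget: agent is told to wrap up
    STOP = "stop"  # at or above 100% of budget: execution stops


def budget_state(spent_usd: float, budget_usd: float) -> BudgetState:
    """Classify spend against a budget using the 90% / 100% thresholds."""
    if budget_usd <= 0:
        return BudgetState.STOP  # no budget means nothing can run
    ratio = spent_usd / budget_usd
    if ratio >= 1.0:
        return BudgetState.STOP
    if ratio >= 0.9:
        return BudgetState.WARN
    return BudgetState.OK


# Spend figures are illustrative.
for spent in (1.24, 9.50, 10.00):
    print(f"${spent:.2f} of $10.00 -> {budget_state(spent, 10.00).value}")
```

In this sketch a caller would check the state after each turn, pass a wrap-up notice to the agent on `WARN`, and collect partial results on `STOP`.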

The Execution Stream shows exactly what your sandbox agents are doing. From the Agents Dashboard, expand any execution to see the full event log: every tool call with its input parameters, text responses, errors, and status changes, all with timestamps. The stream auto-updates while agents are running. Each execution also tracks token usage (input and output) and cost in USD, so you see exactly what every run costs. Failed or partial executions can be retried with one click. From the terminal, runwork agents execution <sessionId> shows the same stream.
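To make the per-execution tracking concrete, here is a small sketch that totals tokens and cost from an event log. The `ExecutionEvent` record and its fields are hypothetical; the actual event schema is not described on this page, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExecutionEvent:
    """Hypothetical event record; the real stream's schema may differ."""
    timestamp: str  # ISO 8601 string
    kind: str       # "tool_call", "text", "error", or "status"
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0


def summarize(events: list[ExecutionEvent]) -> dict:
    """Aggregate event count, error count, token usage, and USD cost."""
    return {
        "events": len(events),
        "errors": sum(1 for e in events if e.kind == "error"),
        "input_tokens": sum(e.input_tokens for e in events),
        "output_tokens": sum(e.output_tokens for e in events),
        "cost_usd": round(sum(e.cost_usd for e in events), 4),
    }


# Illustrative log for one execution.
log = [
    ExecutionEvent("2025-01-01T10:00:00Z", "status"),
    ExecutionEvent("2025-01-01T10:00:02Z", "tool_call", 1200, 150, 0.25),
    ExecutionEvent("2025-01-01T10:00:09Z", "text", 800, 400, 0.5),
    ExecutionEvent("2025-01-01T10:00:10Z", "error"),
]
totals = summarize(log)
print(totals)
```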

Frequently Asked Questions

What can agents do in the sandbox?
Anything Claude CLI can do: read and write files, run commands, use MCP tools, access workspace integrations, browse the web. The sandbox is a full computing environment, not a restricted API call. If you can do it in a terminal, your agent can do it in the sandbox.
How do agents share data with each other?
Through the filesystem. All agents in a workspace can access /workspace/agents/ and read each other's session outputs. One agent can analyze data, write results to its output directory, and another agent can pick up those results and continue the work. No special protocols or message passing required.
Do I need to create an app to use agent sandboxes?
No. The workspace computer is itself an app. You can write agents directly against your workspace for single-purpose tasks like a new schedule, a one-off data analysis, or a recurring report. You only need a separate app when you want a dedicated frontend or a complex multi-agent system.
How are costs controlled?
Three layers. Your platform credit balance is the ceiling. Agent definitions can set per-task budgets (e.g. $10 max), turn limits, and timeouts. Individual delegateTask() calls can tighten these further but never loosen them. At 90% of budget, the agent gets a warning to wrap up gracefully. At 100%, execution stops and partial results are returned.
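The "tighten but never loosen" rule can be expressed by taking the minimum at each layer. The following is a sketch under that assumption; the `Limits` dataclass, the `effective_limits` function, and their parameter names are hypothetical and do not mirror the delegateTask() signature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    budget_usd: float
    max_turns: int
    timeout_s: int


def tighten(base, override):
    """An override can lower a limit but never raise it."""
    return base if override is None else min(base, override)


def effective_limits(
    credit_balance_usd: float,
    agent: Limits,
    *,
    budget_usd: Optional[float] = None,
    max_turns: Optional[int] = None,
    timeout_s: Optional[int] = None,
) -> Limits:
    """Combine the three layers: credit ceiling, agent definition, per-call override."""
    return Limits(
        budget_usd=min(credit_balance_usd, tighten(agent.budget_usd, budget_usd)),
        max_turns=tighten(agent.max_turns, max_turns),
        timeout_s=tighten(agent.timeout_s, timeout_s),
    )


agent_limits = Limits(budget_usd=10.0, max_turns=50, timeout_s=600)

# A per-call override lowers the budget; the attempt to raise turns is ignored.
print(effective_limits(100.0, agent_limits, budget_usd=5.0, max_turns=80))

# A low credit balance caps the budget regardless of the other layers.
print(effective_limits(3.0, agent_limits))
```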

Ready to try Agent Sandbox?

Try the work cloud for AI agents.