Detailed guide • Docker Sandboxes • AI agent safety

Docker Sandboxes for AI Agents

Run capable coding agents in “YOLO mode” without handing them your whole computer. Docker Sandboxes use isolated microVMs, a private Docker daemon, workspace mounts, secret injection, and network policies so an agent can work quickly while you keep meaningful guardrails.

MicroVM isolation: Separate Linux kernel, filesystem, Docker Engine, and network
Agent-ready: Claude Code, Codex, Copilot, Cursor, Gemini, OpenCode, and more
Policy control: Open, Balanced, or Locked Down outbound network behavior

Source video: Docker Sandboxes Hands-On Guide – A Safe Space for AI Agents! by Bijan Bowen (25:08). Official docs cross-checked 2026-05-23.

1. What Docker Sandboxes solve

Modern coding agents are most useful when they can run commands, install packages, create files, run tests, build containers, and fix errors without asking for approval every few seconds. The downside is obvious: an agent running directly on your host can see local files, reach the network, and execute scripts in ways you may not intend.

Core idea: give the agent broad freedom inside a controlled sandbox, not on your full host system.

Good use cases

  • Letting Claude Code or Codex run with fewer approval prompts.
  • Testing untrusted build scripts or package installs.
  • Running multiple agents on separate branches.
  • Giving a local model a safer execution environment.
  • Auditing what hosts an agent tries to contact.

Not a magic replacement for review

  • The agent can still change mounted workspace files.
  • You still need to review diffs, generated code, and PRs.
  • Broad network allow rules can reintroduce risk.
  • Secrets must be configured thoughtfully.

2. Safety model in plain English

Docker’s docs describe four isolation layers: hypervisor isolation, network isolation, Docker Engine isolation, and credential proxying.

MicroVM

Each sandbox runs in a lightweight VM with its own Linux kernel. Unlike a plain container, it does not share the host kernel, so the primary boundary is the hypervisor rather than kernel namespaces.

Private Docker daemon

The agent can build and run containers, but it does not receive the host Docker socket. That avoids giving it control over your host Docker environment.

Network proxy

Outbound traffic is mediated by a host-side proxy and policy rules. HTTP/HTTPS rules can allow or deny destinations; UDP/ICMP are blocked at the network layer.

Important nuance: inside the sandbox, the agent has powerful capabilities, including sudo access and package installation. The safety boundary is the VM and policy layer, not “the agent is restricted inside Linux.”

3. Prerequisites

Platform | Requirements from Docker docs | Notes
macOS | macOS Sonoma 14+ and Apple silicon | The video demonstrates a Mac setup. Install Homebrew first if brew is missing.
Windows | Windows 11, x86_64, Windows Hypervisor Platform enabled | Run the HypervisorPlatform command in an elevated PowerShell before installing if needed.
Ubuntu Linux | Ubuntu 24.04+, x86_64, KVM enabled, user in kvm group | Verify with "lsmod | grep kvm"; log out/in after adding your user to the group.

You also need a Docker account for sbx login and an authentication method for the agent you want to run: for example, OpenAI credentials for Codex, an Anthropic key or Claude subscription for Claude Code, or another provider or local endpoint.
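
On Ubuntu, the two KVM requirements from the table can be checked before installing. The following is a minimal sketch and not part of sbx: the function name kvm_preflight is hypothetical, and it checks for the /dev/kvm device node (an assumption on my part; the guide itself suggests lsmod | grep kvm) and for kvm among the current user's groups.

```shell
# Hypothetical helper, not an sbx command.
# Prints one "ok:" or "missing:" line per check and returns non-zero if any check fails.
kvm_preflight() {
  rc=0
  # Assumption: a usable KVM setup exposes the /dev/kvm device node.
  if [ -e /dev/kvm ]; then
    echo "ok: /dev/kvm exists"
  else
    echo "missing: /dev/kvm not found (is KVM enabled?)"
    rc=1
  fi
  # id -nG lists the current user's group names separated by spaces.
  if id -nG | tr ' ' '\n' | grep -qx kvm; then
    echo "ok: current user is in the kvm group"
  else
    echo "missing: current user is not in the kvm group"
    rc=1
  fi
  return "$rc"
}
```

Call kvm_preflight in a terminal on the Linux host. A newly added group only shows up after logging out and back in, as noted in the table.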

4. Install and sign in

macOS

brew install docker/tap/sbx
sbx login

Windows

winget install -h Docker.sbx
sbx login

Ubuntu Linux

curl -fsSL https://get.docker.com | sudo REPO_ONLY=1 sh
sudo apt-get install docker-sbx
sbx login

sbx login opens Docker OAuth in a browser and displays a one-time confirmation code. On first login, choose a default network policy. The video uses Balanced as the practical default because it permits common development services while still blocking everything else.

Docker Desktop is not required to use sbx, according to Docker’s get-started docs.

5. Secrets and agent authentication

Most coding agents need a model-provider key or OAuth session. Docker Sandboxes keep secrets on the host and use a proxy to inject credentials into outbound requests. The sandbox sees a sentinel value, not the raw credential.

Agent/provider | Host-side setup | Why it matters
Codex / OpenAI | sbx secret set -g openai or sbx secret set -g openai --oauth | Lets Codex call OpenAI without placing the raw API key in the sandbox.
Claude Code / Anthropic | sbx secret set -g anthropic, or use Claude OAuth inside the sandbox when supported | Useful for Claude Code runs that would otherwise need an Anthropic key.
GitHub | sbx secret set -g github -t "$(gh auth token)" | Lets agents create PRs or interact with GitHub through controlled credential injection.

Do not paste secrets into prompts. Prefer sbx secret set or OAuth; use environment variables only when appropriate, and CI-safe --password-stdin patterns in automation.

6. First agent run: Codex or Claude Code

Create or enter a project folder

mkdir -p ~/sandbox-test
cd ~/sandbox-test

The agent only sees the workspace you give it, plus sandbox VM state. Start with a disposable folder for your first test.

Run Codex in the sandbox

sbx secret set -g openai
sbx run codex

Docker’s Codex docs also support passing a workspace path: sbx run codex ~/my-project.

Or run Claude Code

sbx secret set -g anthropic
sbx run claude ~/my-project

Claude Code can also use a Claude subscription/OAuth flow. The sandbox templates may launch agents with reduced approval prompts by default, so treat the sandbox as the safety boundary.

Check status

sbx ls
sbx

sbx without arguments opens the interactive dashboard where you can create, start/stop, attach, open a shell, remove sandboxes, and inspect network governance.

7. Network policies: the most important control

The video’s strongest demo is network policy: a script that can ping a host from the native terminal is blocked/logged when run through the sandbox. Docker’s policy system is designed so the sandbox can only reach destinations you allow.

Policy | Meaning | Recommendation
Open | All outbound network traffic allowed. | Only for disposable tests where network risk is acceptable.
Balanced | Deny-by-default plus common dev services such as AI APIs, package managers, code hosts, registries, and cloud services. | Best first choice for most agent coding workflows.
Locked Down | Everything blocked unless explicitly allowed, including model-provider APIs. | Best for high-security work, but expect to add allow rules.

Core commands

sbx policy ls
sbx policy log
sbx policy allow network -g api.anthropic.com
sbx policy allow network -g "api.openai.com,*.npmjs.org,*.pypi.org,files.pythonhosted.org,github.com"
sbx policy deny network -g ads.example.com
sbx policy reset

Wildcard caution: *.example.com does not match example.com. Specify both when you need both. Broad domains like github.com allow access to user-generated content on that domain, so allow only what your workflow needs.
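
The wildcard caution is easy to get wrong, so here is a toy model of it in plain shell. This is only an illustration of the behavior stated above, not Docker's matching code: rule_matches is a hypothetical name, and treating multi-level subdomains such as a.b.example.com as a match is an assumption the guide does not confirm.

```shell
# Toy model of the wildcard rule; NOT Docker's implementation.
# Usage: rule_matches RULE HOST  (returns 0 on match, 1 otherwise)
rule_matches() {
  rule=$1
  host=$2
  case "$rule" in
    '*.'*)
      # "*.example.com" becomes the required suffix ".example.com".
      suffix=${rule#'*'}
      case "$host" in
        *"$suffix") return 0 ;;  # host ends with ".example.com"
        *) return 1 ;;           # bare "example.com" lacks the leading dot
      esac
      ;;
    *)
      # No wildcard: only an exact host name matches.
      [ "$rule" = "$host" ]
      ;;
  esac
}
```

In this model, rule_matches '*.example.com' api.example.com succeeds while rule_matches '*.example.com' example.com fails, which is why an allow list that needs both must name both.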

For non-interactive environments, set a policy up front:

sbx policy set-default balanced
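
For repeatable setups, it can help to keep the allow list in a reviewed text file and generate the command from it. The sketch below assumes a hypothetical file with one host per line and a hypothetical helper name, build_allow_cmd. It only prints the comma-separated form of sbx policy allow network -g shown in the core commands; it does not run sbx.

```shell
# Hypothetical helper: print (do not run) one allow command for all hosts in a file.
# Blank lines and lines starting with '#' are skipped.
build_allow_cmd() {
  hosts=$(grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | paste -s -d , -)
  if [ -z "$hosts" ]; then
    echo "no hosts found in $1" >&2
    return 1
  fi
  printf 'sbx policy allow network -g "%s"\n' "$hosts"
}
```

Running build_allow_cmd allowed-hosts.txt (the file name is illustrative) prints a single command to review and paste. Printing instead of executing keeps a human in the loop before any rule is added.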

8. Filesystem and workspace boundaries

The video compares an agent listing files on the host versus inside a sandbox. On the host, it can see the host directory contents. Inside the sandbox, it sees only the mounted workspace and sandbox-specific files. Docker’s isolation docs also state that symlinks pointing outside the workspace scope are not followed.

What is protected

  • Host files outside the selected workspace.
  • Host localhost and private host network surfaces.
  • Host Docker daemon/socket.
  • Other sandboxes.

What is not protected

  • Files inside the mounted workspace.
  • Uncommitted work in direct mode.
  • Secrets you manually paste into files/prompts.
  • External services you broadly allow by policy.

# Ask an agent to prove what it can see:
ls -la
pwd
find . -maxdepth 2 -type f | sort

9. Git workflow and branch mode

By default, Docker Sandboxes use direct mode: the agent modifies your working tree. For serious work, especially with multiple agents, use branch mode so the agent gets a Git worktree under .sbx/.

cd ~/my-project
sbx run codex --branch my-feature

# Review later:
git worktree list
cd .sbx/<sandbox-name>-worktrees/my-feature
git log
git diff main
git push -u origin my-feature
gh pr create

Best practice: add .sbx/ to your repo or global gitignore so sandbox worktrees do not clutter status.

echo '.sbx/' >> .gitignore
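
The echo line appends a new entry every time it runs. If the step lives in a setup script, a guarded version avoids duplicates. This is a minimal sketch; ignore_sbx is a hypothetical helper name, and it defaults to .gitignore in the current directory.

```shell
# Hypothetical helper: add ".sbx/" to a gitignore file only if that exact line is absent.
ignore_sbx() {
  file=${1:-.gitignore}
  touch "$file"
  # If the file is non-empty and does not end with a newline, add one first
  # so the new entry does not get glued onto the last line.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  # -x matches whole lines, -F treats the pattern as a fixed string.
  grep -qxF '.sbx/' "$file" || printf '%s\n' '.sbx/' >> "$file"
}
```

Calling ignore_sbx twice leaves a single .sbx/ line in the file.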

10. Local model / LM Studio pattern

The video also demonstrates running a Codex-style sandbox against a local model served by LM Studio. The exact command depends on your agent image and local OpenAI-compatible endpoint, but the pattern is:

  1. Start LM Studio’s local server on the host, commonly 127.0.0.1:1234.
  2. Allow the sandbox to reach that local endpoint.
  3. Create a named sandbox.
  4. Run the agent with an OpenAI-compatible base URL/model identifier pointing at LM Studio.

# Allow the sandbox to reach local LM Studio.
# Use the exact host/port your local server exposes.
sbx policy allow network -g localhost:1234

mkdir -p ~/codex-lms && cd ~/codex-lms
sbx create --name codex-lms codex .

# Then run Codex with your local endpoint/model settings.
# Consult the Codex CLI and Docker agent docs for the exact current flags/env vars.

Why local models still need a sandbox: local/open-source models can be more willing to run unsafe commands or follow malicious instructions. A local model reduces cloud/API exposure, but it does not remove execution risk.

11. Daily operation commands

Task | Command
Start/attach agent | sbx run codex or sbx run claude
List running sandboxes | sbx ls
Open interactive dashboard | sbx
Shell into sandbox | sbx exec -it <sandbox-name> bash
Stop without deleting | sbx stop <sandbox-name>
Delete sandbox and VM state | sbx rm <sandbox-name>
Inspect network attempts | sbx policy log
List policies | sbx policy ls

Sandboxes persist after the agent exits. Removing with sbx rm deletes the VM and its contents; workspace files on the host remain.

12. Troubleshooting

Symptom | Likely cause | Fix
brew: command not found | Homebrew missing on macOS | Install from brew.sh, then rerun brew install docker/tap/sbx.
sbx will not start on Linux | KVM unavailable or user not in kvm group | Run "lsmod | grep kvm", install diagnostics such as kvm-ok, add user to kvm, then log out/in.
Agent cannot call model API | Secret missing or network policy blocks provider | Set secret with sbx secret set -g openai/anthropic; check sbx policy log; allow needed host.
Package install fails | Package registry blocked by policy | Use Balanced or allow required domains such as npm, PyPI, GitHub, registry hosts.
Agent can’t see expected files | Wrong workspace path or branch-mode worktree | Run pwd, ls -la, and confirm the path passed to sbx run.
Changes appear in wrong place | Direct mode vs branch mode confusion | Use git worktree list; review under .sbx/... for branch mode.
Local LM Studio endpoint unreachable | Host localhost/private networking blocked | Confirm LM Studio server port, then add a policy rule for the exact host/port. Avoid broad wildcards.

13. First-run checklist

  • ☐ Confirm platform prerequisites: macOS Apple silicon/Sonoma+, Windows Hypervisor Platform, or Ubuntu KVM.
  • ☐ Install sbx and run sbx login.
  • ☐ Choose Balanced unless you intentionally need Open or Locked Down.
  • ☐ Store model provider secrets with sbx secret set, not in prompts or files.
  • ☐ Start with a disposable workspace.
  • ☐ Run sbx run codex or sbx run claude.
  • ☐ Use sbx policy log after tests to see what the agent tried to reach.
  • ☐ Use --branch for real repositories or parallel agents.
  • ☐ Review diffs and generated files before merging.
  • ☐ Remove old sandboxes with sbx rm when done.
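
The first items on the checklist come down to having the right tools on PATH. A small preflight like the one below can catch a missing install early. It is a sketch under stated assumptions: require_cmds is a hypothetical helper, and which commands to check (for example sbx, git, or gh) depends on your workflow.

```shell
# Hypothetical helper: return 0 if every named command is on PATH.
# Each missing command is reported on stderr.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, require_cmds sbx git returns non-zero and names the missing tool if either is not installed.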

14. Sources and related links