
Sandboxed Code Execution for AI Agents: Security, Architecture, and Production Patterns

Run agent-generated code in an environment with minimal capabilities and controlled resource limits, and enforce that boundary with sandboxing.


Code execution is the most powerful tool an AI agent can have. It is also the most dangerous. An agent that can run arbitrary code can read files it shouldn't, make network calls to exfiltrate data, consume unbounded compute, install persistent backdoors, or crash the host system.

The temptation with early agent prototypes is to run code directly on the host machine with subprocess or exec. This works fine until the agent generates import os; os.system("rm -rf /") or a dependency chain that triggers a network fetch to an attacker-controlled server. In demos, the code you feed the agent is carefully crafted and benign. In production, users provide the input. The attack surface is as wide as everything the model has been trained on.
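Concretely, the naive pattern looks like this. It is shown only to name the anti-pattern; `run_unsafely` is an illustrative name, not a real API:

```python
import subprocess

def run_unsafely(agent_code: str) -> str:
    """The anti-pattern: agent code runs with ALL of the host's privileges."""
    # Nothing here stops agent_code from reading ~/.ssh, opening sockets,
    # or spawning background processes. It shares the host's filesystem,
    # network, and user account.
    result = subprocess.run(
        ["python3", "-c", agent_code],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout
```

The timeout is the only control here, and it does nothing against exfiltration or file deletion that completes quickly.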

Production-grade code execution for AI agents requires a security boundary between the agent's execution environment and everything else: the host filesystem, the network, the other users of the same system. That boundary is enforced by sandboxing: isolating code execution in an environment with minimal capabilities and controlled resource limits.

This article walks through the full stack for production sandboxed code execution: the threat model for AI code execution, the available sandbox technologies (containers, gVisor, Firecracker microVMs, E2B), resource limits and output capture, secure file transfer patterns, and the architecture for integrating sandboxed execution into an agent's tool call pipeline.

The Threat Model: What Can Go Wrong With AI Code Execution

The threat model for AI code execution differs from traditional code execution because the code is generated by a model that users can prompt. The attack surface includes:

Direct attacks via code generation:

  • File system access: read /etc/passwd, write to arbitrary paths, delete files
  • Process spawning: launch background processes, install cron jobs
  • Network access: exfiltrate data, download malware, callback to C2 servers
  • Resource exhaustion: fork bombs, infinite loops, large memory allocations

Prompt injection attacks:

  • User provides input that causes the agent to generate malicious code
  • Data in the execution environment contains injected instructions
  • LLM output itself is adversarial (jailbroken or misconfigured model)

Indirect attacks:

  • Package imports that have side effects: import malicious_package runs setup.py which executes shell commands
  • Data exfiltration through output: code writes sensitive data to stdout and the agent forwards it to an attacker
  • SSRF (Server Side Request Forgery): code makes HTTP requests to internal network endpoints

The minimum security requirements for any production code execution environment:

  1. Filesystem isolation: The sandbox can only access a specific, controlled filesystem. No access to host filesystem or other users' data.

  2. Network isolation: Code cannot make outbound network connections by default. If network access is needed, it is explicitly allowlisted.

  3. Process isolation: Code cannot spawn processes that outlive the sandbox execution.

  4. Resource limits: CPU, memory, file size, and execution time are all bounded.

  5. No privilege escalation: Code cannot become root, modify kernel parameters, or escape the sandbox.
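Requirement 4 is the only one enforceable from plain Python. A minimal sketch using the stdlib resource module (POSIX-only; `run_limited` and its specific limits are illustrative choices, and requirements 1-3 and 5 still need namespace or VM isolation as described below):

```python
import resource
import subprocess

def _apply_limits():
    # Runs in the child just before exec: bound CPU time and address space.
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))              # 5 CPU-seconds
    resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20,) * 2)   # 512 MB

def run_limited(code: str) -> subprocess.CompletedProcess:
    # Covers requirement 4 only: a runaway loop hits RLIMIT_CPU, a large
    # allocation hits RLIMIT_AS, and timeout= is the wall-clock backstop.
    return subprocess.run(
        ["python3", "-c", code],
        preexec_fn=_apply_limits,   # POSIX-only
        capture_output=True, text=True,
        timeout=10,
    )
```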

Sandbox Isolation Levels: Containers vs gVisor vs MicroVMs

Three isolation technologies provide progressively stronger security boundaries:

Container isolation (Docker, Podman):

  • Isolation: Linux namespaces (pid, network, mount, user)
  • Security level: Moderate (container processes share the host kernel)
  • Attack surface: Kernel exploits can escape container isolation
  • Performance: Near-native (~1% overhead for CPU-bound tasks)
  • Startup time: 100-500ms

gVisor (Google's container sandbox):

  • Isolation: User-space kernel (Sentry) intercepts all system calls
  • Security level: High (syscalls are validated before reaching the host kernel)
  • Attack surface: Limited to gVisor's syscall implementation (about 300 syscalls vs 400+ in Linux)
  • Performance: 10-30% overhead for syscall-heavy workloads, near-native for CPU-bound
  • Startup time: 200-800ms

Firecracker microVMs:

  • Isolation: Hardware virtualization (KVM) with separate VM kernel
  • Security level: Highest (true VM isolation with minimal attack surface: 5 device types)
  • Attack surface: Limited to hypervisor interface (intentionally minimal)
  • Performance: Near-native for CPU computation, about 5% memory overhead
  • Startup time: 100-300ms (Firecracker's fast VM boot is a core feature)

Choosing isolation level by use case:

Use case                              Recommended isolation           Rationale
Internal tooling, trusted users       Container with resource limits  Speed, simplicity
SaaS product, user-provided inputs    gVisor or microVM               Kernel exploit risk from user inputs
Multi-tenant, adversarial inputs      Firecracker microVM             Maximum isolation for hostile code
Data science notebooks                gVisor                          Balance security and performance

These choices map directly onto a sandbox configuration object:

from dataclasses import dataclass
from enum import Enum
 
class IsolationLevel(Enum):
    CONTAINER = "container"      # Docker namespaces
    GVISOR = "gvisor"            # gVisor user-space kernel
    MICROVM = "microvm"          # Firecracker/Cloud Hypervisor
 
@dataclass
class SandboxConfig:
    isolation: IsolationLevel = IsolationLevel.GVISOR
    cpu_cores: float = 0.5          # Fractional CPU allocation
    memory_mb: int = 512            # Memory limit
    disk_mb: int = 1024             # Filesystem size limit
    timeout_seconds: int = 30       # Hard execution timeout
    network_enabled: bool = False   # Network access (off by default)
    allowed_domains: list = None    # If network enabled, allowlist
    max_file_size_mb: int = 50      # Maximum output file size
    max_processes: int = 10         # Maximum concurrent processes
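The decision table above can also be encoded as a small selection helper. A sketch: `choose_isolation` and its two flags are illustrative, and the enum repeats `IsolationLevel` from the snippet above so the example is self-contained:

```python
from enum import Enum

class IsolationLevel(Enum):   # mirrors the enum defined above
    CONTAINER = "container"
    GVISOR = "gvisor"
    MICROVM = "microvm"

def choose_isolation(user_supplied_input: bool, multi_tenant: bool) -> IsolationLevel:
    """Map the decision table onto an isolation level."""
    if multi_tenant:
        return IsolationLevel.MICROVM     # adversarial inputs: maximum isolation
    if user_supplied_input:
        return IsolationLevel.GVISOR      # kernel-exploit risk from user inputs
    return IsolationLevel.CONTAINER       # trusted internal use: speed, simplicity
```

Encoding the policy in code keeps the security decision auditable instead of leaving it implicit in deployment configs.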

E2B: Managed Sandboxes for AI Agents

E2B (e2b.dev) is a cloud service that provides on-demand Linux sandboxes designed for AI agent code execution. It abstracts the VM/container management and exposes a clean Python SDK.

from e2b_code_interpreter import CodeInterpreter
import asyncio
 
async def run_code_with_e2b(
    code: str,
    language: str = "python",
    timeout_seconds: int = 30,
) -> dict:
    """
    Execute code in an E2B sandbox.
    Each sandbox is an isolated microVM (Firecracker-based).
    """
    async with CodeInterpreter() as sandbox:
        # Execute code
        execution = await sandbox.notebook.exec_cell(
            code,
            timeout=timeout_seconds,
        )
 
        return {
            "stdout": execution.text or "",
            "stderr": execution.error.traceback if execution.error else "",
            "outputs": [
                {"type": output.type, "data": output.data}
                for output in execution.results
            ],
            "error": bool(execution.error),
        }
 
async def run_multi_step_analysis(
    steps: list[str],
    shared_files: dict[str, bytes] = None,
) -> list[dict]:
    """
    Run multiple code steps in the same sandbox (state persists between steps).
    Key E2B capability: stateful execution across multiple agent steps.
    """
    results = []
 
    async with CodeInterpreter() as sandbox:
        # Upload any shared files to the sandbox
        if shared_files:
            for filename, content in shared_files.items():
                await sandbox.files.write(f"/home/user/{filename}", content)
 
        # Execute each step, maintaining state
        for step_code in steps:
            execution = await sandbox.notebook.exec_cell(step_code, timeout=60)
            results.append({
                "stdout": execution.text,
                "error": bool(execution.error),
                "outputs": [{"type": o.type, "data": str(o.data)[:1000]}
                             for o in execution.results],
            })
 
            # Stop if there's an error
            if execution.error:
                break
 
    return results
 
# Real-world agent tool implementation
async def agent_code_execution_tool(
    code: str,
    context_files: dict[str, str] = None,
) -> dict:
    """
    Tool function for AI agent code execution via E2B.
    context_files: {filename: content} dict of files to pre-load
    """
    try:
        async with CodeInterpreter() as sandbox:
            # Pre-load context files so the generated code can read them
            for filename, content in (context_files or {}).items():
                await sandbox.files.write(f"/home/user/{filename}", content.encode())

            execution = await sandbox.notebook.exec_cell(code, timeout=30)

        if execution.error:
            return {
                "status": "error",
                "message": f"Code execution failed: {execution.error.traceback}",
                "stdout": execution.text or "",
            }

        return {
            "status": "success",
            "stdout": (execution.text or "")[:5000],  # Cap output size
            "outputs": [                              # Cap number of outputs
                {"type": o.type, "data": str(o.data)[:1000]}
                for o in execution.results[:10]
            ],
        }

    except TimeoutError:
        return {"status": "timeout", "message": "Code execution exceeded 30 second limit"}
    except Exception as e:
        return {"status": "error", "message": f"Sandbox error: {e}"}

E2B architecture: Each sandbox is a Firecracker microVM started from a pre-built snapshot. The snapshot approach reduces cold start time to 100-300ms. The VM is not booted from scratch but resumed from a frozen state. This makes E2B practical for interactive agent loops where code execution tools are called frequently.

E2B security properties:

  • Each sandbox is an isolated Firecracker microVM (no shared kernel with host or other sandboxes)
  • Network access disabled by default (explicitly enabled with domain allowlisting)
  • Filesystem is ephemeral (destroyed when the sandbox closes)
  • CPU and memory limits enforced at hypervisor level
  • Maximum sandbox lifetime: configurable, default 30 minutes

DIY Sandboxing: Implementing Your Own Secure Executor

For teams that cannot use managed services, a DIY sandbox can be built with Docker and seccomp profiles. This provides container-level isolation (not microVM-level) but is significantly more secure than direct subprocess execution.

import docker
import tempfile
import os
import json
from pathlib import Path
import asyncio
from typing import Optional
 
class DockerSandboxExecutor:
    """
    Execute code in an isolated Docker container.
    Uses seccomp profile and read-only mounts for hardening.
    """
 
    # Seccomp profile: denylist of dangerous syscalls.
    # (A default-deny allowlist is stronger; a denylist is shown for brevity.)
    SECCOMP_PROFILE = {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                "names": [
                    "mount", "umount2", "ptrace",
                    "swapon", "swapoff", "reboot", "kexec_load",
                    "init_module", "delete_module", "create_module",
                    # "clone" blocks ALL new threads and processes, including
                    # fork bombs; drop it if the sandboxed code needs threading.
                    "clone",
                ],
                "action": "SCMP_ACT_ERRNO",
            }
        ]
    }
 
    def __init__(self, base_image: str = "python:3.11-slim"):
        self.client = docker.from_env()
        self.base_image = base_image
 
    async def execute(self,
                       code: str,
                       timeout_seconds: int = 30,
                       memory_mb: int = 512,
                       cpu_quota: int = 50000,   # 50% of one CPU
                       network_enabled: bool = False,
                       environment: dict = None) -> dict:
        """Execute code in an isolated container."""
 
        # Write code to a temp file for injection
        with tempfile.NamedTemporaryFile(mode='w', suffix='.py',
                                          delete=False) as f:
            f.write(code)
            code_file = f.name
 
        try:
            # Run container with restrictions
            container = await asyncio.to_thread(
                self.client.containers.run,
                self.base_image,
                command=["python", "/sandbox/code.py"],
                volumes={code_file: {"bind": "/sandbox/code.py", "mode": "ro"}},
                mem_limit=f"{memory_mb}m",
                memswap_limit=f"{memory_mb}m",  # No swap
                cpu_quota=cpu_quota,
                cpu_period=100000,
                network_mode="none" if not network_enabled else "bridge",
                read_only=True,
                tmpfs={"/tmp": "size=64m,noexec"},  # Small writable /tmp, noexec
                security_opt=[
                    "no-new-privileges",
                    f"seccomp={json.dumps(self.SECCOMP_PROFILE)}",
                ],
                user="nobody",   # Run as non-root
                environment=environment or {},
                remove=False,    # Keep for log retrieval
                detach=True,
            )
 
            # Wait for completion or timeout
            try:
                result = await asyncio.wait_for(
                    asyncio.to_thread(container.wait),
                    timeout=timeout_seconds,
                )
                exit_code = result["StatusCode"]
                stdout = container.logs(stdout=True, stderr=False).decode()
                stderr = container.logs(stdout=False, stderr=True).decode()
                timed_out = False
            except asyncio.TimeoutError:
                container.kill()
                stdout = container.logs(stdout=True, stderr=False).decode()
                stderr = "Execution timed out"
                exit_code = -1
                timed_out = True
            finally:
                container.remove(force=True)
 
            return {
                "stdout": stdout[:10000],   # Cap at 10KB
                "stderr": stderr[:2000],
                "exit_code": exit_code,
                "timed_out": timed_out,
                "success": exit_code == 0 and not timed_out,
            }
 
        finally:
            os.unlink(code_file)
 
# Stricter: block all network access at the OS level using netns
def create_network_isolated_sandbox():
    """
    Create a sandbox with network namespace isolation.
    The container gets a new network namespace with only loopback.
    """
    import subprocess
 
    def run_in_isolated_network(code: str) -> dict:
        script = f"""
import sys
import signal
 
signal.alarm(30)  # 30 second timeout via SIGALRM
 
try:
    exec(compile({repr(code)}, '<agent_code>', 'exec'))
except SystemExit:
    pass
"""
        result = subprocess.run(
            ["unshare", "--net", "--user",   # New network and user namespace
             "--map-root-user",               # Map current user to root in namespace
             "python3", "-c", script],
            capture_output=True,
            text=True,
            timeout=35,
            env={},  # Empty environment
        )
 
        return {
            "stdout": result.stdout[:10000],
            "stderr": result.stderr[:2000],
            "exit_code": result.returncode,
            "success": result.returncode == 0,
        }
 
    return run_in_isolated_network

Resource Limits: CPU, Memory, Network, and Time

Every production sandbox needs explicit, enforced limits on all resource dimensions:

@dataclass
class ResourceLimits:
    """Resource limits for sandbox execution."""
    # Compute
    cpu_cores: float = 1.0
    cpu_time_seconds: int = 30    # CPU-time budget (enforce a wall-clock timeout separately)
    cpu_quota_percent: int = 50   # Max % of one CPU
 
    # Memory
    ram_mb: int = 512
    swap_mb: int = 0             # No swap by default
 
    # I/O
    disk_read_mb_per_s: int = 50
    disk_write_mb_per_s: int = 20
    max_output_bytes: int = 1_000_000   # 1MB output cap
 
    # Process
    max_processes: int = 10
    max_file_descriptors: int = 50
 
    # Network
    network_enabled: bool = False
    max_outbound_connections: int = 0   # 0 = none allowed
    allowed_domains: list[str] = None
 
def enforce_python_resource_limits():
    """
    Apply resource limits using Python's resource module.
    Call this at the start of sandboxed Python execution.
    """
    import resource
 
    # CPU time limit (raises SIGXCPU after limit)
    resource.setrlimit(resource.RLIMIT_CPU, (30, 30))
 
    # Memory limit
    memory_bytes = 512 * 1024 * 1024  # 512 MB
    resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))
 
    # File size limit
    max_file_bytes = 50 * 1024 * 1024  # 50 MB
    resource.setrlimit(resource.RLIMIT_FSIZE, (max_file_bytes, max_file_bytes))
 
    # Process limit (fork bomb prevention)
    resource.setrlimit(resource.RLIMIT_NPROC, (10, 10))
 
    # File descriptor limit
    resource.setrlimit(resource.RLIMIT_NOFILE, (50, 50))
 
def validate_code_before_execution(code: str) -> list[str]:
    """
    Static analysis to flag potentially dangerous code.
    A warning layer only; not a substitute for sandbox isolation.
    """
    import ast
 
    warnings = []
 
    dangerous_patterns = [
        "__import__", "importlib", "subprocess", "os.system",
        "eval(", "exec(", "compile(",
        "open(", "file(",
        "socket", "urllib", "requests", "httpx",
        "ctypes", "cffi",
        "sys.exit", "os.kill", "signal.raise_signal",
    ]
 
    for pattern in dangerous_patterns:
        if pattern in code:
            warnings.append(f"Potentially dangerous: '{pattern}' found in code")
 
    # Try AST parse: catch syntax errors early
    try:
        ast.parse(code)
    except SyntaxError as e:
        warnings.append(f"Syntax error: {e}")
 
    return warnings

Output size limits: Uncontrolled output can cause memory exhaustion in the agent's context window. A loop that prints 100MB of data causes multiple downstream failures. Cap stdout/stderr at a reasonable limit (1-10MB) and truncate with a message if exceeded.
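The truncation itself is a few lines. A sketch (`cap_output` is an illustrative name):

```python
def cap_output(text: str, limit: int = 1_000_000) -> str:
    """Truncate sandbox output to `limit` characters with an explicit marker."""
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    # The marker tells the agent (and the human reading logs) that output
    # was cut, rather than silently dropping data.
    return text[:limit] + f"\n... [output truncated: {omitted:,} characters omitted]"
```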

Output Capture: Stdout, Stderr, Files, and Return Values

import io
import sys
import traceback
from contextlib import redirect_stdout, redirect_stderr
import ast
 
def execute_python_safely(code: str,
                            max_output_bytes: int = 100_000,
                            global_vars: dict = None,
                            local_vars: dict = None) -> dict:
    """
    Execute Python code with full output capture.
    NOT a security sandbox (use within a sandboxed environment).
    Captures stdout, stderr, return value, and created variables.
    """
    stdout_buffer = io.StringIO()
    stderr_buffer = io.StringIO()

    # Use a single namespace for globals and locals. With separate dicts,
    # top-level names are invisible inside functions defined by the code
    # (the classic exec(code, globals, locals) pitfall).
    namespace = dict(global_vars or {})
    namespace.update(local_vars or {})
    initial_names = set(namespace)

    error = None
    return_value = None

    try:
        with redirect_stdout(stdout_buffer), redirect_stderr(stderr_buffer):
            # Try to get return value from last expression
            tree = ast.parse(code, mode='exec')

            if tree.body and isinstance(tree.body[-1], ast.Expr):
                # Split: exec all but the last expression, then eval the last
                last_expr = ast.Expression(body=tree.body[-1].value)
                preceding = ast.Module(body=tree.body[:-1], type_ignores=[])
                ast.fix_missing_locations(last_expr)
                ast.fix_missing_locations(preceding)

                exec(compile(preceding, '<sandbox>', 'exec'), namespace)
                return_value = eval(compile(last_expr, '<sandbox>', 'eval'),
                                    namespace)
            else:
                exec(compile(tree, '<sandbox>', 'exec'), namespace)

    except Exception as e:
        error = {
            "type": type(e).__name__,
            "message": str(e),
            "traceback": traceback.format_exc(),
        }

    stdout = stdout_buffer.getvalue()
    stderr = stderr_buffer.getvalue()

    # Cap output sizes (stderr too, not just stdout)
    if len(stdout) > max_output_bytes:
        stdout = stdout[:max_output_bytes] + f"\n... [truncated at {max_output_bytes} bytes]"
    if len(stderr) > max_output_bytes:
        stderr = stderr[:max_output_bytes] + f"\n... [truncated at {max_output_bytes} bytes]"

    # Collect any new variables defined by the execution
    new_vars = {
        k: v for k, v in namespace.items()
        if k not in initial_names
        and not k.startswith("_")
        and not callable(v)
    }

    return {
        "stdout": stdout,
        "stderr": stderr,
        "return_value": return_value,
        "error": error,
        "success": error is None,
        "new_variables": {k: repr(v)[:200] for k, v in new_vars.items()},
    }

Multi-Turn Code Execution: Maintaining State Across Steps

The most valuable aspect of sandboxed execution for agents is stateful multi-turn execution: variables defined in step 1 are available in step 2. This enables agents to incrementally build analysis, load data once and manipulate it across multiple steps, and debug by inspecting intermediate state.
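The mechanism in miniature (a toy sketch, not a sandbox): successive cells share one namespace dict, so names defined in cell 1 resolve in cell 2.

```python
# Each exec() call is one "cell"; the shared dict is the persistent state.
ns: dict = {}

exec(compile("import math\nradius = 2.0", "<cell 1>", "exec"), ns)
exec(compile("area = math.pi * radius ** 2", "<cell 2>", "exec"), ns)  # sees radius

print(round(ns["area"], 3))  # prints 12.566
```

A notebook kernel inside a sandbox does the same thing, with the namespace living in the sandboxed process instead of the orchestrator.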

class StatefulSandbox:
    """
    Manages a persistent execution environment across multiple code cells.
    Works with E2B or any persistent sandbox implementation.
    """
 
    def __init__(self, sandbox_factory):
        self.sandbox = sandbox_factory()
        self.execution_history: list[dict] = []
        self.defined_variables: dict[str, str] = {}  # name → repr
 
    async def execute_cell(self, code: str,
                            timeout_seconds: int = 30) -> dict:
        """Execute a code cell in the persistent environment."""
        result = await self.sandbox.execute(code, timeout_seconds)
 
        self.execution_history.append({
            "code": code,
            "result": result,
            "step": len(self.execution_history) + 1,
        })
 
        # Track new variables from this execution
        if result.get("new_variables"):
            self.defined_variables.update(result["new_variables"])
 
        return result
 
    def get_context_summary(self) -> str:
        """
        Generate a summary of the current execution context for the agent.
        Shows the agent what variables exist and what has been done.
        """
        lines = []
 
        if self.defined_variables:
            lines.append("Variables in scope:")
            for name, repr_val in list(self.defined_variables.items())[:20]:
                lines.append(f"  {name} = {repr_val[:100]}")
 
        if self.execution_history:
            lines.append(f"\nExecution history: {len(self.execution_history)} cells executed")
            last_result = self.execution_history[-1]["result"]
            if last_result.get("error"):
                lines.append(f"Last execution: ERROR - {last_result['error']['message']}")
            else:
                lines.append(f"Last execution: SUCCESS")
 
        return "\n".join(lines) if lines else "No code executed yet."
 
    async def reset(self):
        """Start a fresh execution environment."""
        await self.sandbox.close()
        # Assumes the sandbox type has a no-arg constructor; if the factory
        # passed to __init__ takes arguments, store and reuse it instead.
        self.sandbox = type(self.sandbox)()
        self.execution_history = []
        self.defined_variables = {}

Security Checklist: What to Audit Before Deploying

SANDBOX SECURITY AUDIT CHECKLIST
 
Isolation:
☐ Code runs in a separate process/container/VM from the host
☐ Filesystem access is restricted to a controlled directory
☐ Container/VM cannot access host filesystem via bind mounts
☐ Network access is disabled by default
☐ Process cannot spawn privileged children
 
Resource limits:
☐ Hard timeout enforced (not just soft signal)
☐ Memory limit enforced at OS/hypervisor level
☐ CPU limit enforced (prevents 100% CPU consumption)
☐ Disk write limit enforced (prevents disk exhaustion)
☐ Output size limit enforced (prevents context window flooding)
 
Input validation:
☐ Code size limit enforced (before execution)
☐ Static analysis warnings logged (not blocking, but informational)
☐ Code encoding validated (prevent injection via encoding tricks)
 
Output handling:
☐ stdout/stderr captured and size-limited before return
☐ File outputs validated before transfer to agent
☐ No secrets or host credentials accessible to sandbox
 
Multi-tenancy (if applicable):
☐ Each user/session gets a separate sandbox instance
☐ No shared filesystem state between user sandboxes
☐ Sandbox cleanup verified after session ends
 
Monitoring:
☐ Execution time logged for anomaly detection
☐ Memory usage logged
☐ Error patterns monitored for attack signatures
☐ Resource limit violations alerted on

Key Takeaways

  • The minimum security requirements for AI code execution are: filesystem isolation (sandbox can only access a controlled directory), network isolation (outbound connections blocked by default), process isolation (no persistent processes after execution), resource limits (CPU, memory, disk, time), and no privilege escalation. Running AI-generated code without these controls is not a prototype shortcut. It is an active security incident waiting to happen.

  • Three isolation technologies provide progressively stronger security: containers (Linux namespaces, shared kernel, fastest), gVisor (user-space kernel intercepts syscalls, blocks kernel exploits), and Firecracker microVMs (hardware virtualization, full VM isolation, maximum security). For multi-tenant or user-input-driven code execution, use gVisor or Firecracker.

  • E2B provides managed Firecracker microVMs with cold start times of 100-300ms (via snapshot resumption), making it practical for interactive agent loops. The SDK handles the VM lifecycle, file transfer, output capture, and cleanup automatically. E2B is the fastest path to production-safe agent code execution.

  • Stateful multi-turn execution is the key capability that makes sandboxed code useful for agents: variables from step 1 are available in step 2. Use persistent sandboxes (or E2B's persistent CodeInterpreter context) for multi-step agent tasks. Track defined variables and pass context summaries to the agent at each step.

  • Static code analysis (pattern matching, AST analysis) is a useful warning layer but is not a security control. Determined attackers can bypass all pattern-based filters. The security guarantee comes from the sandbox isolation, not from rejecting dangerous-looking code patterns. Run static analysis for logging and alerting, not for access control.

  • Output size limits are critical and frequently overlooked. A print loop that generates 1GB of output causes context window flooding, agent confusion, and potential memory exhaustion in the orchestration layer. Cap stdout/stderr at 100KB-1MB, truncate with a clear message, and add per-character limits on file outputs transferred back to the agent.

FAQ

How do you safely run AI-generated code in production?

Safely running AI-generated code in production requires isolating the execution in a sandbox with enforced security boundaries. The minimum requirements are: filesystem isolation (the code can only access a designated directory, not the host filesystem), network isolation (outbound connections blocked by default), process isolation (no persistent processes after execution), and resource limits (CPU time, memory, disk writes, execution timeout). For multi-tenant or user-facing deployments, use gVisor or Firecracker microVM isolation rather than standard Docker containers. Docker's shared kernel creates a risk of kernel-level escape exploits. Managed services like E2B (Firecracker-based) provide all these guarantees with a simple SDK and 100-300ms cold start times.

What is E2B and how is it used for AI agents?

E2B is a cloud service that provides on-demand isolated Linux sandboxes (Firecracker microVMs) designed for AI agent code execution. The Python SDK provides a CodeInterpreter context that handles VM lifecycle, code execution, output capture, and file transfer. Each sandbox is an isolated VM that cannot access the host system or other sandboxes. E2B sandboxes start quickly (100-300ms via snapshot resumption) making them practical for interactive agent loops. Key features: stateful execution (variables persist across multiple code cells within a session), file upload/download between the agent and sandbox, support for Python and Node.js, and configurable environment (custom packages, system dependencies). E2B is the fastest path to production-safe agent code execution for teams that don't want to build sandbox infrastructure themselves.

What is the difference between gVisor and standard Docker for AI code execution?

Standard Docker uses Linux namespaces for isolation but allows processes inside the container to make direct syscalls to the host kernel. If AI-generated code exploits a kernel vulnerability (or a container escape technique), it can access the host system. gVisor adds a user-space kernel layer (called Sentry) that intercepts all syscalls from the container before they reach the host kernel. The syscalls are implemented in Go/Rust in user space, validated, and then translated to a minimal set of host syscalls. This eliminates most kernel exploit vectors because the container's code never directly touches the host kernel. The tradeoff is 10-30% performance overhead for syscall-heavy workloads (network I/O, file I/O), generally acceptable for the security guarantee provided. For AI code execution with user-provided inputs, gVisor is strongly preferred over standard Docker.

AI agents with code execution capabilities are more powerful than agents without them. An agent that can run code can process arbitrary data, test its own hypotheses, automate tasks that would otherwise require brittle string manipulation, and verify its outputs against hard facts. The capability is real and significant.

The security debt from treating code execution as another tool call is also real. The history of security incidents in adjacent domains (server-side template injection, eval() misuse, arbitrary file read in web apps) follows the same pattern: someone found a way to get user input into an execution context that didn't have appropriate boundaries. AI agents make this attack surface larger because the path from user input to code execution is shorter.

The practical answer is the same as in every security context: assume the code will try to do things you don't want, and make it technically impossible rather than unlikely. The sandbox technologies are mature, the managed services are fast and cheap, and the implementation patterns are established. There is no reason to deploy AI code execution without appropriate isolation.

Build the sandbox first. Add the capability second. The order matters.

Written & published by Chaitanya Prabuddha