The End of Ephemeral Sandboxes: How Fly.io Solved AI Agent Hosting

The infrastructure shift that will change how you run Claude Code and OpenCode

Running an AI coding agent on your laptop is like giving a stranger the keys to your house and hoping they don't rearrange the furniture. Claude Code with --dangerously-skip-permissions can execute any command, modify any file, and install whatever it wants. It's powerful. It's also terrifying.

For the past year, the industry consensus was clear: run AI agents in ephemeral containers. Spin up a Docker container, let Claude do its thing, then burn it down. Clean. Isolated. Safe.

But there's a problem nobody talks about: ephemeral containers are terrible for AI agents.

The Secure Cloud Computer

The Rebuild Tax

Every time you invoke Claude in a fresh container, it starts from zero. No installed packages. No cached dependencies. No memory of the project it spent three hours understanding yesterday.

I watched Claude reinstall the same npm packages twelve times in one afternoon. Each invocation cost 2-3 minutes of setup before it could do actual work. That's not just wasteful—it's actively hostile to how AI agents operate.

Agents work iteratively. They build mental models of your codebase. They accumulate context. Forcing a clean slate on every invocation is like giving someone amnesia between every sentence of a conversation.

Fly.io's Answer: Computers, Not Sandboxes

Earlier this month, Fly.io launched Sprites—and in doing so, articulated what we've been dancing around: AI agents don't want sandboxes. They want computers.

Sprites are persistent Linux VMs built on Firecracker microVMs. They boot in 1-2 seconds, persist your installed packages and configurations between sessions, and—here's the magic—automatically idle when inactive to stop billing.

"Ephemeral sandboxes are obsolete. Stop killing your sandboxes every time you use them." — Kurt Mackey, Fly.io CEO

The numbers tell the story:

Metric	Value
Boot time	1-2 seconds (first), <1 second (warm)
Checkpoint creation	~300ms
Checkpoint restore	<1 second
Storage	100GB persistent
4-hour session cost	~$0.44

But the headline feature isn't speed or price—it's the checkpoint/restore capability.

Ephemeral vs Persistent

Checkpoint as Version Control

Imagine you're about to let Claude refactor your authentication system. It's a big change. Risky.

In the old world, you'd either pray or spend time setting up a reproducible environment. With Sprites:

sprite checkpoint create my-agent --name before-auth-refactor
# Let Claude do its thing
# Something goes wrong
sprite checkpoint restore my-agent --name before-auth-refactor

Entire environment state—files, installed packages, running processes—captured in 300ms and restored in under a second.

It's git for your development environment.

The Security Foundation: Firecracker

All of this would be meaningless without real isolation. Sprites run on Firecracker, the same microVM technology behind AWS Lambda.

Each Sprite gets:

Dedicated CPU cores (no contention)
Isolated memory space
Separate kernel
Its own network namespace

This is hardware-level isolation, not the shared-kernel model of Docker containers. If AI-generated code tries to escape, it hits a hardware boundary, not a software abstraction.

Given that AI-generated code was introducing 10,000+ new security vulnerabilities per month at large organizations by mid-2025, this isn't paranoia—it's table stakes.

Running Claude Code in the Cloud

Let's get practical. Here's how to run Claude Code on Sprites:

# Install Sprites CLI
curl https://sprites.dev/install.sh | bash

# Login
sprite login

# Create your Sprite
sprite create my-agent

# Claude comes pre-installed
sprite exec -s my-agent "claude -p 'What is this codebase?'"

That's it. Claude Code is pre-installed with --dangerously-skip-permissions enabled by default—because the environment is designed to handle it.

For more control, use Claude's headless mode flags:

sprite exec -s my-agent "claude -p 'Run tests and fix failures' \
  --allowedTools 'Bash,Read,Edit' \
  --output-format json"

Multi-Step Workflows

Claude Code supports session continuity:

# Start a task
sprite exec -s my-agent "claude -p 'Analyze the auth module'"

# Continue the conversation
sprite exec -s my-agent "claude -p 'Now refactor for better security' --continue"

# Keep building
sprite exec -s my-agent "claude -p 'Add comprehensive tests' --continue"

Context persists across invocations. Claude remembers what it learned.

The Alternative: OpenCode Self-Hosted

Not everyone wants cloud execution. OpenCode offers a self-hosted alternative with multi-provider support (OpenAI, Anthropic, Gemini, Bedrock, Groq, and local models).

For Docker-based deployment:

FROM ghcr.io/anomalyco/opencode:latest

# Add your dependencies
RUN apt-get update && apt-get install -y nodejs npm python3

# Configure provider
ENV ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}

The trade-off is clear: Sprites give you zero-maintenance persistent VMs with enterprise isolation. Self-hosted OpenCode gives you full control and local model support at the cost of infrastructure management.

Git for Environments

Architecture for Teams

For teams building on this infrastructure, Fly.io provides the building blocks:

Pre-warmed Machine Pools: Create machines before you need them for instant assignment
Subdomain Routing: Route user1.yourapp.com to User 1's machine automatically
Persistent Volumes: Attach durable storage per user
External Backup: Sync to S3/Tigris for disaster recovery

The pattern:

Router App (*.yourapp.com)
    ↓ fly-replay header
Machine Pool
├── Machine 1 (alice's env)
├── Machine 2 (bob's env)
└── Machine N (pending assignment)

The Economics Make Sense Now

Here's the math that matters:

Usage Pattern	Always-On VM	Auto-Idle Sprites
4 hours/day active	$67/month	$8.40/month
8 hours/day active	$67/month	$16.80/month
20 hours/day active	$67/month	$42/month

AI agents are bursty. You interact intensively for a few hours, then context switch to something else. Paying for 24/7 compute when you need 4 hours is wasteful. Auto-idle billing aligns cost with actual usage.

What This Means for You

If you're running AI coding agents today, here's the action plan:

Individual developers: Stop running agents locally. Use Sprites. It's $15/month for moderate usage and infinitely safer than your laptop.

Teams building AI products: Architect on Fly Machines with pre-warmed pools. The Machines API gives you the primitives; the autostop/autostart gives you the economics.

Enterprises: The combination of hardware isolation, checkpoint audit trails, and multi-region deployment addresses most compliance requirements out of the box.

The Prediction

Within 12 months, every major AI agent platform will offer persistent VMs. The ephemeral container model made sense for stateless functions—it fails for iterative, stateful agent workloads.

Fly.io saw this first. They built for it. And with Sprites, they've given us the blueprint for what AI agent infrastructure should look like.

The age of ephemeral sandboxes is ending. The age of persistent AI workspaces has begun.

Start with Sprites at sprites.dev or read the Fly.io AI documentation. For Claude Code headless mode, see the Agent SDK docs.

Global Builders

The End of Ephemeral Sandboxes: How Fly.io Solved AI Agent Hosting

The End of Ephemeral Sandboxes: How Fly.io Solved AI Agent Hosting

The Rebuild Tax

Fly.io's Answer: Computers, Not Sandboxes

Checkpoint as Version Control

The Security Foundation: Firecracker

Running Claude Code in the Cloud

Multi-Step Workflows

The Alternative: OpenCode Self-Hosted

Architecture for Teams

The Economics Make Sense Now

What This Means for You

The Prediction

Related Posts

The Complete Guide to Orchestrating Multiple Claude Code Agents

Claude Code Subagents: The Complete Guide to AI Agent Orchestration

I Tested OpenCode vs Claude Code for a Week. Here's What Actually Matters.

Enjoyed this article?