AVAILABLE IN PREVIEW

CoreWeave Sandboxes

CoreWeave Sandboxes gives your teams an execution layer for RL, agent tool use, and model evaluation in isolated, scalable environments.

from cwsandbox import Sandbox
 
# Context manager for automatic cleanup
with Sandbox.run(container_image="python:3.11") as sb:
    result = sb.exec(["python", "-c", "print(2 + 2)"]).result()
    print(result.stdout)  # 4
from wandb.sandbox import Sandbox

with Sandbox.run(container_image="python:3.11",
                 max_lifetime_seconds=180) as sb:
    result = sb.exec(["python", "-c", "print(2 + 2)"]).result()
    print(result.stdout)
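The `max_lifetime_seconds` argument caps how long the sandbox may live. As a rough local analogy only (this uses the standard library's `subprocess`, not the sandbox API), a hard timeout on a child process behaves similarly:

```python
import subprocess
import sys

# Run a short script in a child process with a hard time limit.
# subprocess.run raises subprocess.TimeoutExpired if the limit is
# exceeded, loosely analogous to a sandbox lifetime cap.
result = subprocess.run(
    [sys.executable, "-c", "print(2 + 2)"],
    capture_output=True,
    text=True,
    timeout=180,  # seconds, mirroring max_lifetime_seconds above
)
print(result.stdout.strip())  # 4
```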

Stop moving workloads, start moving faster

CoreWeave Sandboxes brings training, experimentation, and execution into a single environment—no data movement, no fragmented toolchains, no context switching between clouds. Less overhead. More control.

CKS and SERVERLESS

Run Sandboxes your way

CoreWeave Sandboxes supports two deployment models — dedicated infrastructure through CKS or a fully managed serverless runtime — so teams can match their environment to how they actually work.

CKS

Sandboxes on CoreWeave Kubernetes Service (CKS)

Run isolated sandbox environments on your own compute across one or more CKS clusters.

  1. Runs on existing clusters
  2. Governance built into execution
  3. Scale across idle capacity

Serverless

Sandboxes on serverless runtime

Run agents in on-demand, isolated environments managed by Weights & Biases. Experience built-in security with no infrastructure to manage.

  1. Serverless, no infrastructure to manage
  2. Start with your Weights & Biases API key
  3. Kata VM for strong isolation
CKS FEATURES

More value from the compute you already have

Sandboxes on CKS schedule across the clusters you already operate, picking up idle capacity—including CPU on GPU nodes—when run alongside SUNK.

Isolated environments, built in

Run code in sandboxes with namespace isolation, configurable network policies, and controlled resource limits. Supports agent tool use, reward verification, and evaluation workloads with execution boundaries that fit your existing environment.

Closer to your models and data

Run sandbox workloads on the same CoreWeave infrastructure used for training. Keep execution close to models, data, and training workflows without introducing a separate stack or additional vendor surface.

Scale without the overhead

Run hundreds of concurrent sandboxes with support for batch and distributed workloads. Scale evaluation pipelines, RL rollouts, and agent workflows on infrastructure you already control.

Governance that extends, not duplicates

Keep workloads inside your existing CoreWeave environment under the same placement controls, network policies, and governance model used for training. No new operational surface to manage.


Run your first sandbox in minutes with the CoreWeave Sandboxes Python Client

Launch, manage, and parallelize sandbox workloads for ML and agent workflows from a single Python client

uv pip install cwsandbox

Run and manage sandbox workloads

Run code and scale execution across parallel workloads for batch jobs.

from cwsandbox import Sandbox

# Context manager for automatic cleanup
with Sandbox.run(container_image="python:3.11") as sb:
    result = sb.exec(["python", "-c", "print(2 + 2)"]).result()
    print(result.stdout)  # 4
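The single-sandbox pattern above extends naturally to batches. As an illustrative sketch of the fan-out pattern — using only the standard library rather than the cwsandbox API — independent jobs can be submitted in parallel and their outputs collected in order:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_task(expr: str) -> str:
    # Stand-in for a per-sandbox exec: run a short Python snippet
    # in a child process and capture its stdout.
    out = subprocess.run(
        [sys.executable, "-c", f"print({expr})"],
        capture_output=True, text=True,
    )
    return out.stdout.strip()

# Fan out a batch of independent jobs; map preserves input order.
exprs = ["2 + 2", "3 * 7", "10 - 1"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_task, exprs))
print(results)  # ['4', '21', '9']
```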

Run Python functions directly in sandboxes

Execute code in isolated environments without scripts or command strings—just write Python.

from cwsandbox import Sandbox, SandboxDefaults

with Sandbox.session(SandboxDefaults()) as session:
    @session.function()
    def add(x: int, y: int) -> int:
        return x + y

    result = add.remote(2, 3).result()  # 5
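The `remote(...).result()` shape mirrors the future pattern in Python's standard library, where submitting work returns a handle whose `.result()` blocks until the value is ready. A stdlib comparison sketch, not the cwsandbox implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def add(x: int, y: int) -> int:
    return x + y

with ThreadPoolExecutor() as pool:
    # submit(...) is the local counterpart of add.remote(...):
    # it returns a Future, and .result() blocks for the value.
    future = pool.submit(add, 2, 3)
    result = future.result()
print(result)  # 5
```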

Stream output in real time

Monitor execution as it happens—capture logs line by line and respond instantly to long-running processes.

# Returns Process immediately
process = sandbox.exec(["python", "long_script.py"])

# Stream stdout line by line
for line in process.stdout:
    print(f"[stdout] {line}", end="")

# Get final result
result = process.result()
print(f"Exit code: {result.returncode}")
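For comparison, the same read-as-it-arrives loop works with a local child process via the standard library's `subprocess.Popen` (a sketch, not the sandbox API):

```python
import subprocess
import sys

# Start a child process and read its stdout as it is produced.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('step 1'); print('step 2')"],
    stdout=subprocess.PIPE,
    text=True,
)

lines = []
for line in proc.stdout:  # yields lines as they arrive
    lines.append(line.rstrip())

returncode = proc.wait()
print(lines, returncode)  # ['step 1', 'step 2'] 0
```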

CKS BUILT-IN OBSERVABILITY

Monitor sandbox operations in real time

Pre-installed Grafana dashboards give platform teams full visibility into sandbox usage, performance, and reliability. Ready out of the box.

  • Global visibility — Track sandbox activity, states, and usage over time
  • Performance insights — Monitor start rates, cold starts, and execution patterns
  • System health — View sandbox distribution and infrastructure signals
  • Troubleshooting — Drill into sandbox metrics mapped to nodes and pods

SERVERLESS SANDBOXES

Run AI agents and model-generated code safely at scale

Give agents isolated environments to run themselves and model-generated code safely and scalably, with built-in observability so teams can debug failures faster and monitor agent workflows more reliably.

Serverless, on-demand at scale

Go from your Weights & Biases API key to running thousands of agents concurrently on CoreWeave cloud in three lines of Python. No cluster, no YAML, no kubectl.

All telemetry in one workspace

Log sandbox status and metrics to your active W&B run. When something fails, investigate the sandbox, code, and LLM calls, all in your workspace, and debug agent failure modes faster.

Hardware-level isolation by default

Every serverless sandbox runs inside a Kata VM with its own kernel, filesystem, and network. Strong isolation without additional configuration.

Consistent state, every run

Start every run from a known state so results are easier to trust, compare, and reproduce. Sandboxes spin up clean and tear down when done — consistent across rollouts, evaluations, and post-training runs.

SERVERLESS SANDBOXES

Correlate sandbox telemetry with run metrics

Every sandbox is captured as part of the training run, with lifecycle events, traces, code, inputs, and outputs tied back to the rollouts they belong to.

  • Run timeline: Analyze sandbox events alongside metrics in the same W&B run
  • Trace correlation: W&B Weave traces connect model calls, tool calls, and return values to the sandbox that produced them
  • Sandbox records: Code, inputs, and outputs are stored per sandbox and can be queried later

SECURITY AND GOVERNANCE

Security and governance for execution at scale

Run sandbox workloads on infrastructure designed for isolated, governed execution.

  • Isolated, policy-controlled execution
  • Cluster-level governance and access controls
  • Backed by CoreWeave infrastructure with SOC 2 and ISO certifications
Trusted by AI pioneers shaping the future of AI
Zoho
Rev.com
Altum
Alethea
Databricks
OpenAI
Google
MistralAI
Cohere
Jane Street
Decart
Cloudflare
Abridge
Stability AI
RunDiffusion
Mozilla
Inflection
Fireworks AI
Debuild
Augment
Conjecture
Chai
NovelAI
Runway
General Intuition

Core execution workloads for AI teams

Support agent tool use, reward verification, and model evaluation on CoreWeave.

Agent tool use

Run tool-using agents and code interpreters in isolated environments with control over execution boundaries.

Model evaluation

Execute benchmarks and test harnesses at scale with parallel sandbox execution in reproducible environments.

RL reward verification

Run reward verification for multi-step RL workflows in isolated environments on CoreWeave.


Frequently asked questions

What are CoreWeave sandboxes?

CoreWeave sandboxes are an execution layer for RL, agent tool use, and model evaluation that lets AI teams run code in isolated environments on CoreWeave.

How do CoreWeave sandboxes work?

CoreWeave sandboxes use a Python SDK to create and manage isolated execution environments. Teams can execute code, run functions, and scale workloads across parallel sandboxes using simple API calls.

Are CoreWeave sandboxes available to all customers?

CoreWeave sandboxes on CKS are currently available in preview for existing customers. Contact CoreWeave to request access. Serverless sandboxes are available through Weights & Biases; sign in to access them on demand.

Where do sandbox workloads run?

Sandbox workloads run inside your existing CoreWeave environment, on the same infrastructure and contracted capacity already used for training. With the serverless option, provisioning and infrastructure management are handled for you.

Do sandbox workloads run under existing governance controls?

Yes. CoreWeave sandboxes supports governed execution with controls aligned to your existing CoreWeave environment, including placement controls, networking policies, and execution isolation.

What types of workloads can I run?

You can run AI workloads including agent tool use, evaluation benchmarks, reinforcement learning reward verification, and model experimentation workflows.

Do I need to provision new infrastructure?

No. CoreWeave sandboxes runs on your existing contracted CoreWeave infrastructure, so there is no need to add a separate execution vendor. With the serverless option, you don’t even need to configure the infrastructure; we take care of it.

Do CoreWeave sandboxes support GPUs?

Yes. You can schedule sandboxes on specific GPU types or by memory requirements, enabling execution on the same infrastructure used for training.

How do CoreWeave sandboxes ensure security?

Each sandbox runs in an isolated environment with configurable network policies, namespace isolation, and controlled resource access.

Can I run workloads in parallel?

Yes. CoreWeave sandboxes supports parallel execution with distributed workloads, allowing teams to run hundreds of concurrent sandboxes.

How do I get started?

Install the Python SDK, create an API token, and start running sandboxes in minutes using the CoreWeave documentation.
