AVAILABLE IN PREVIEW

CoreWeave Sandboxes

CoreWeave Sandboxes gives your teams an execution layer for RL, agent tool use, and model evaluation in isolated, scalable environments.

from cwsandbox import Sandbox
 
# Context manager for automatic cleanup
with Sandbox.run(container_image="python:3.11") as sb:
    result = sb.exec(["python", "-c", "print(2 + 2)"]).result()
    print(result.stdout)  # 4
from wandb.sandbox import Sandbox

with Sandbox.run(container_image="python:3.11",
                 max_lifetime_seconds=180) as sb:
    result = sb.exec(["python", "-c", "print(2 + 2)"]).result()
    print(result.stdout)
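The `max_lifetime_seconds` argument caps how long the sandbox may live. As a rough local analogy only (this uses the standard library's `subprocess`, not the sandbox API), a hard timeout on a child process behaves similarly:

```python
import subprocess
import sys

# Run a short script in a child process with a hard time limit.
# subprocess.run raises subprocess.TimeoutExpired if the limit is
# exceeded, loosely analogous to a sandbox lifetime cap.
result = subprocess.run(
    [sys.executable, "-c", "print(2 + 2)"],
    capture_output=True,
    text=True,
    timeout=180,  # seconds, mirroring max_lifetime_seconds above
)
print(result.stdout.strip())  # 4
```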

Stop moving workloads, start moving faster

CoreWeave Sandboxes brings training, experimentation, and execution into a single environment—no data movement, no fragmented toolchains, no context switching between clouds. Less overhead. More control.

CKS and SERVERLESS

Run Sandboxes your way

CoreWeave Sandboxes supports two deployment models — dedicated infrastructure through CKS or a fully managed serverless runtime — so teams can match their environment to how they actually work.

CKS

Sandboxes on CoreWeave Kubernetes Service (CKS)

Run isolated sandbox environments on your own compute across one or more CKS clusters.

  1. Runs on existing clusters
  2. Governance built into execution
  3. Scale across idle capacity

Serverless

Sandboxes on serverless runtime

Run agents in on-demand, isolated environments managed by Weights & Biases. Experience built-in security with no infrastructure to manage.

  1. Serverless, no infrastructure to manage
  2. Start with your Weights & Biases API key
  3. Kata VM for strong isolation
CKS FEATURES

More value from the compute you already have

Sandboxes on CKS schedule across the clusters you already operate, picking up idle capacity—including CPU on GPU nodes—when run alongside SUNK.

Isolated environments, built in

Run code in sandboxes with namespace isolation, configurable network policies, and controlled resource limits. Supports agent tool use, reward verification, and evaluation workloads with execution boundaries that fit your existing environment.

Closer to your models and data

Run sandbox workloads on the same CoreWeave infrastructure used for training. Keep execution close to models, data, and training workflows without introducing a separate stack or additional vendor surface.

Scale without the overhead

Run hundreds of concurrent sandboxes with support for batch and distributed workloads. Scale evaluation pipelines, RL rollouts, and agent workflows on infrastructure you already control.

Governance that extends, not duplicates

Keep workloads inside your existing CoreWeave environment under the same placement controls, network policies, and governance model used for training. No new operational surface to manage.


Run your first sandbox in minutes with the CoreWeave Sandboxes Python Client

Launch, manage, and parallelize sandbox workloads for ML and agent workflows from a single Python client

uv pip install cwsandbox

Run and manage sandbox workloads

Run code and scale execution across parallel workloads for batch jobs.

from cwsandbox import Sandbox

# Context manager for automatic cleanup
with Sandbox.run(container_image="python:3.11") as sb:
    result = sb.exec(["python", "-c", "print(2 + 2)"]).result()
    print(result.stdout)  # 4
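The single-sandbox pattern above extends naturally to batches. As an illustrative sketch of the fan-out pattern — using only the standard library rather than the cwsandbox API — independent jobs can be submitted in parallel and their outputs collected in order:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_task(expr: str) -> str:
    # Stand-in for a per-sandbox exec: run a short Python snippet
    # in a child process and capture its stdout.
    out = subprocess.run(
        [sys.executable, "-c", f"print({expr})"],
        capture_output=True, text=True,
    )
    return out.stdout.strip()

# Fan out a batch of independent jobs; map preserves input order.
exprs = ["2 + 2", "3 * 7", "10 - 1"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_task, exprs))
print(results)  # ['4', '21', '9']
```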

Run Python functions directly in sandboxes

Execute code in isolated environments without scripts or command strings—just write Python.

from cwsandbox import Sandbox, SandboxDefaults

with Sandbox.session(SandboxDefaults()) as session:
    @session.function()
    def add(x: int, y: int) -> int:
        return x + y

    result = add.remote(2, 3).result()  # 5
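The `remote(...).result()` shape mirrors the future pattern in Python's standard library, where submitting work returns a handle whose `.result()` blocks until the value is ready. A stdlib comparison sketch, not the cwsandbox implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def add(x: int, y: int) -> int:
    return x + y

with ThreadPoolExecutor() as pool:
    # submit(...) is the local counterpart of add.remote(...):
    # it returns a Future, and .result() blocks for the value.
    future = pool.submit(add, 2, 3)
    result = future.result()
print(result)  # 5
```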

Stream output in real time

Monitor execution as it happens—capture logs line by line and respond instantly to long-running processes.

# Returns Process immediately
process = sandbox.exec(["python", "long_script.py"])

# Stream stdout line by line
for line in process.stdout:
    print(f"[stdout] {line}", end="")

# Get final result
result = process.result()
print(f"Exit code: {result.returncode}")
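For comparison, the same read-as-it-arrives loop works with a local child process via the standard library's `subprocess.Popen` (a sketch, not the sandbox API):

```python
import subprocess
import sys

# Start a child process and read its stdout as it is produced.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('step 1'); print('step 2')"],
    stdout=subprocess.PIPE,
    text=True,
)

lines = []
for line in proc.stdout:  # yields lines as they arrive
    lines.append(line.rstrip())

returncode = proc.wait()
print(lines, returncode)  # ['step 1', 'step 2'] 0
```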

CKS BUILT-IN OBSERVABILITY

Monitor sandbox operations in real time

Pre-installed Grafana dashboards give platform teams full visibility into sandbox usage, performance, and reliability. Ready out of the box.

  • Global visibility — Track sandbox activity, states, and usage over time
  • Performance insights — Monitor start rates, cold starts, and execution patterns
  • System health — View sandbox distribution and infrastructure signals
  • Troubleshooting — Drill into sandbox metrics mapped to nodes and pods

SERVERLESS SANDBOXES

Run AI agents and model-generated code safely at scale

Give agents isolated environments to run themselves and model-generated code safely and scalably, with built-in observability so teams can debug failures faster and monitor agent workflows more reliably.

Serverless, on-demand at scale

Go from your Weights & Biases API key to running thousands of agents concurrently on CoreWeave cloud in three lines of Python. No cluster, no YAML, no kubectl.

All telemetry in one workspace

Log sandbox status and metrics to your active W&B run. When something fails, investigate the sandbox, code, and LLM calls, all in your workspace, and debug agent failure modes faster.

Hardware-level isolation by default

Every serverless sandbox runs inside a Kata VM with its own kernel, filesystem, and network. Strong isolation without additional configuration.

Consistent state, every run

Start every run from a known state so results are easier to trust, compare, and reproduce. Sandboxes spin up clean and tear down when done — consistent across rollouts, evaluations, and post-training runs.

SERVERLESS SANDBOXES

Correlate sandbox telemetry with run metrics

Every sandbox is captured as part of the training run, with lifecycle events, traces, code, inputs, and outputs tied back to the rollouts they belong to.

  • Run timeline: Analyze sandbox events alongside metrics in the same W&B run
  • Trace correlation: W&B Weave traces connect model calls, tool calls, and return values to the sandbox that produced them
  • Sandbox records: Code, inputs, and outputs are stored per sandbox and can be queried later

SECURITY AND GOVERNANCE

Security and governance for execution at scale

Run sandbox workloads on infrastructure designed for isolated, governed execution.

  • Isolated, policy-controlled execution
  • Cluster-level governance and access controls
  • Backed by CoreWeave infrastructure with SOC 2 and ISO certifications
Trusted by AI pioneers shaping the future of AI
Zoho
Rev.com
Altum
Alethea
Databricks
OpenAI
Google
MistralAI
Cohere
Jane Street
Decart
Cloudflare
Abridge
Stability AI
RunDiffusion
Mozilla
Inflection
Fireworks AI
Debuild
Augment
Conjecture
Chai
NovelAI
Runway
General Intuition

Core execution workloads for AI teams

Support agent tool use, reward verification, and model evaluation on CoreWeave.

Agent tool use

Run tool-using agents and code interpreters in isolated environments with control over execution boundaries.

Model evaluation

Execute benchmarks and test harnesses at scale with parallel sandbox execution in reproducible environments.

RL reward verification

Run reward verification for multi-step RL workflows in isolated environments on CoreWeave.


Frequently asked questions

What are CoreWeave sandboxes?

CoreWeave sandboxes are an execution layer for RL, agent tool use, and model evaluation that lets AI teams run code in isolated environments on CoreWeave.

How do CoreWeave sandboxes work?

CoreWeave sandboxes use a Python SDK to create and manage isolated execution environments. Teams can execute code, run functions, and scale workloads across parallel sandboxes using simple API calls.

Are CoreWeave sandboxes available to all customers?

CoreWeave sandboxes on CKS are currently available in preview for existing customers. Contact CoreWeave to request access. Serverless sandboxes are available through Weights & Biases; sign in to access them on demand.

Where do sandbox workloads run?

Sandbox workloads run inside your existing CoreWeave environment, on the same infrastructure and contracted capacity already used for training. With the serverless option, provisioning and infrastructure management are handled for you.

Do sandbox workloads run under existing governance controls?

Yes. CoreWeave sandboxes supports governed execution with controls aligned to your existing CoreWeave environment, including placement controls, networking policies, and execution isolation.

What types of workloads can I run?

You can run AI workloads including agent tool use, evaluation benchmarks, reinforcement learning reward verification, and model experimentation workflows.

Do I need to provision new infrastructure?

No. CoreWeave sandboxes runs on your existing contracted CoreWeave infrastructure, so there is no need to add a separate execution vendor. With the serverless option, you don’t even need to configure the infrastructure; we take care of it.

Do CoreWeave sandboxes support GPUs?

Yes. You can schedule sandboxes on specific GPU types or by memory requirements, enabling execution on the same infrastructure used for training.

How do CoreWeave sandboxes ensure security?

Each sandbox runs in an isolated environment with configurable network policies, namespace isolation, and controlled resource access.

Can I run workloads in parallel?

Yes. CoreWeave sandboxes supports parallel execution with distributed workloads, allowing teams to run hundreds of concurrent sandboxes.

How do I get started?

Install the Python SDK, create an API token, and start running sandboxes in minutes using the CoreWeave documentation.
