Published on

November 2, 2022


Burst Compute: Scaling Workloads Across Thousands of GPUs in the Cloud, Instantly

Max Hjelm

The smartest companies are evolving toward more flexible, on-demand cloud infrastructure using a technique called burst compute, which provides enterprises with accessible, efficient, and cost-effective computing.

What is Burst Compute?

Burst compute is a use case that requires GPUs to be spun up to run workloads as needed, and spun down when they finish. Examples include batch simulations that can be run in parallel across thousands of GPUs, online (or batch) inference that scales GPUs up and down in response to end-user demand, and VFX rendering to deliver projects on a short timeline.

This differs from the traditional definition of cloud bursting, which directs overflow traffic onto the public cloud to avoid interruptions in service. Bursting on modern, specialized cloud infrastructure like CoreWeave lets companies that need high-performance NVIDIA GPUs scale up and down across hundreds or thousands of GPUs instantly, saving up to 80% compared with legacy clouds at a time when every IT department needs to batten down the hatches.

Accessing On-Demand GPUs at Scale on Legacy Cloud Infrastructure Has Been Virtually Impossible

Whether you’re consistently deploying workloads across thousands of GPUs or just need a few instances, there’s an increasing challenge in the industry: it is extremely difficult to access the compute you need, when you need it, on legacy cloud infrastructure. When you are able to access compute, legacy providers often charge exorbitant fees for ingress/egress, which can be debilitating for many clients.

Businesses that rely on on-demand cloud infrastructure, like AI start-ups, VFX and animation studios, biotech companies, and Metaverse platforms, often need to scale up and down across hundreds or thousands of GPUs for short periods of time, but too often find themselves stuck without this option.

The result? Paying for idle compute cycles you don’t need, to make sure you can access them when you do.

CoreWeave Cloud is designed to address availability constraints, making it dead simple to scale up when your workloads require it, and scale down when they don’t. We care deeply about making sure our clients have practical access to scale, and built our Kubernetes-native infrastructure to make sure you can consume it efficiently.

The solution? Scaling seamlessly across the industry's broadest range of NVIDIA GPUs on CoreWeave Cloud, only paying for the compute you need, when you need it. And zero charges for ingress or egress.
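To make the idle-compute argument concrete, here is a minimal arithmetic sketch. Every number in it (the peak of 100 GPUs, the 8 busy hours per day, and the price per GPU-hour) is a hypothetical placeholder chosen for illustration, not a CoreWeave rate or a measured workload.

```python
def reserved_cost(gpus_reserved: int, hours: float, rate_per_gpu_hour: float) -> float:
    """Cost of keeping a fixed pool of GPUs running for the whole period."""
    return gpus_reserved * hours * rate_per_gpu_hour


def burst_cost(gpu_hours_used: float, rate_per_gpu_hour: float) -> float:
    """Cost when you pay only for the GPU-hours actually consumed."""
    return gpu_hours_used * rate_per_gpu_hour


# Hypothetical workload: a peak of 100 GPUs, busy 8 hours a day for 30 days.
PEAK_GPUS = 100
HOURS_IN_PERIOD = 30 * 24   # 720 hours in the billing period
BUSY_HOURS = 30 * 8         # 240 hours spent at peak
RATE = 2.0                  # placeholder price per GPU-hour, not a real quote

always_on = reserved_cost(PEAK_GPUS, HOURS_IN_PERIOD, RATE)
on_demand = burst_cost(PEAK_GPUS * BUSY_HOURS, RATE)
idle_share = 1 - on_demand / always_on  # fraction of the always-on bill spent on idle GPUs

print(f"always-on: {always_on:.0f}, burst: {on_demand:.0f}, idle share: {idle_share:.0%}")
```

Under these assumptions, the share of the always-on bill that goes to idle GPUs depends only on utilization, which is why bursty workloads are the ones that benefit most from scaling down.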

Thanks to integrations with open-source Kubernetes projects (like Knative, KEDA, and Argo Workflows) and industry-standard software (like Determined.AI, Scalable Pixel Streaming, Zeet, and Deadline), our clients regularly:

Auto-scale inference requests seamlessly to accommodate real-time fluctuations in end-user demand
Parallelize batch processing workloads across thousands of NVIDIA GPUs
Scale across virtually unlimited render capacity to hit any client deadline
Deliver immersive experiences in the Metaverse to thousands of users on-demand
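As a rough illustration of how demand-driven auto-scaling can work, the sketch below implements the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: scale the replica count by the ratio of the observed metric to its target. Whether a given deployment on CoreWeave uses exactly this rule is an assumption; Knative and KEDA each have their own configuration, and the function name, parameters, and bounds here are hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                     min_replicas: int = 0, max_replicas: int = 1000) -> int:
    """Replica count that would bring the per-replica metric back to its target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Proportional rule: ceil(currentReplicas * currentMetric / targetMetric).
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (illustrative defaults).
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas each seeing 30 in-flight requests against a target of 10 per replica.
print(desired_replicas(4, 30, 10))
# Demand drops: 12 replicas each seeing only 2 in-flight requests.
print(desired_replicas(12, 2, 10))
```

Note that the rule returns zero when there are zero replicas, so scaling up from zero needs a separate trigger; this sketch does not model that step.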

Modern Infrastructure for the Most Intensive, Scalable Workloads

CoreWeave’s Kubernetes-native environment is purpose-built for large-scale NVIDIA GPU-accelerated workloads. Each component of our infrastructure has been carefully designed to help clients access both the volume and the variety of compute they need in real time, with responsive auto-scaling across thousands of GPUs.

For clients, this means dramatically faster spin-up times, no delays when running parallel jobs across different geographies, teams and models, and zero spend on idle time.

Thanks to container image caching and specialized schedulers, workloads on CoreWeave can be up and running in as little as 5 seconds. Lightning-fast spin-up times mean you can scale elastically and access massive amounts of resources in the same cluster, instantly.

Examples of Compute-Intensive Workloads We Support

Machine Learning

CoreWeave is optimized for natural language processing and speech AI, utilizing containerized workloads with streaming responses and context-aware load balancing. On CoreWeave, you can deploy inference with a single YAML file.

“CoreWeave’s deployment architecture enables us to scale up extremely fast when there is more demand. We are able to serve requests 3x faster after migrating to CoreWeave, leading to a much better user experience while saving 75% in cloud costs. For the users, this means the generation speeds will never slow down, even when there is peak load.”

— Eren Doğan, CEO NovelAI

VFX, Animation & Rendering

Accelerate artist workflows by eliminating the render queue, leveraging container auto-scaling across virtually unlimited render capacity.

“There weren’t any other companies able to take on this type of task, which would have meant adding months to the timeline. To be honest, without CoreWeave, this part of the project would not have been completed.”

— Riley and Aston, Procedural Space

“CoreWeave provides us with virtual workstations that have high-end NVIDIA GPUs. Not just for the individual artists, but also for rendering on the queue as well. If we need to provision hundreds of GPUs for a long sequence, we are able to do that quickly and easily – and that’s been awesome.”

– Rajesh Sharma, VP of Engineering at Spire Animation Studios

Drug Discovery

Run thousands of NVIDIA GPUs for parallel simulations, leveraging our Kubernetes orchestration tools such as Argo Workflows to run and manage the lifecycle of parallel processing pipelines.
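The fan-out pattern behind parallel simulation pipelines can be sketched locally as follows. This is not Argo Workflows code: it uses Python's standard thread pool as a stand-in for a workflow engine, and `run_simulation`, `fan_out`, and the toy scoring arithmetic are hypothetical names and logic invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_simulation(seed: int) -> dict:
    """Hypothetical stand-in for one independent simulation job."""
    # A real job would launch a GPU container; here we only do toy arithmetic.
    return {"seed": seed, "score": (seed * 7919) % 101}


def fan_out(seeds: list[int], max_parallel: int) -> list[dict]:
    """Run independent jobs in parallel and gather results in input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_simulation, seeds))


results = fan_out(list(range(8)), max_parallel=4)
best = max(results, key=lambda r: r["score"])
print(len(results), "jobs finished; best seed:", best["seed"])
```

Because each job depends only on its own input, the degree of parallelism is a tuning knob rather than a correctness concern, which is what makes this kind of workload a natural fit for bursting.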

“CoreWeave is critical to our infrastructure. They tailored a perfect burst compute solution for our large-scale molecular dynamics simulation, dramatically impacting the pace of our actionable scientific discovery. The pricing differential alone is instrumental in enabling us to execute proteome wide simulation.”

— Haotian Li, CTO Redesign Science

Metaverse & Pixel Streaming

Whether your Unreal Engine experience runs in VMs or containers, lightning-fast spin-up times and responsive auto-scaling mean you can serve users in real time, rather than spinning up and paying for idle compute.

“CoreWeave helped us deliver a true on-demand solution for our clients, providing maximum flexibility and unparalleled access to scale.”

– Chris Jarabek, VP Product Development, PureWeb

No Charges for Ingress / Egress

Another barrier to running burst compute workloads on other cloud providers is data transfer, which carries alarmingly high rates for ingress and egress. Any time you transfer data into the cloud, move data between regions, access your data remotely, or send something you're storing to a client, you are charged what is effectively a tax per GB of data you move. These costs can be prohibitively expensive and can lock clients into unfavorable contracts.

At CoreWeave, we don’t charge for ingress or egress. The cost of bursting on CoreWeave Cloud is limited to the compute you use and the storage volumes you allocate. That's it.
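A short sketch of how a per-GB transfer fee changes a burst bill. All rates and volumes below are hypothetical placeholders, not published prices from any provider; the only structure taken from the text is that the bill consists of compute plus allocated storage, with an optional per-GB transfer charge.

```python
def burst_bill(gpu_hours: float, gpu_rate: float,
               storage_gb: float, storage_rate: float,
               transfer_gb: float = 0.0, transfer_rate: float = 0.0) -> dict:
    """Break a bill into compute, storage, and data-transfer components."""
    compute = gpu_hours * gpu_rate
    storage = storage_gb * storage_rate
    transfer = transfer_gb * transfer_rate  # zero when transfer is not billed
    return {"compute": compute, "storage": storage, "transfer": transfer,
            "total": compute + storage + transfer}


# Same hypothetical job, billed with and without a per-GB transfer fee.
job = dict(gpu_hours=1000, gpu_rate=2.0, storage_gb=500, storage_rate=0.1, transfer_gb=10_000)
no_transfer_fee = burst_bill(**job, transfer_rate=0.0)
with_transfer_fee = burst_bill(**job, transfer_rate=0.05)

extra = with_transfer_fee["total"] - no_transfer_fee["total"]
print(f"transfer fee adds {extra:.2f} on top of {no_transfer_fee['total']:.2f}")
```

The transfer component grows with the data moved rather than with the compute used, so data-heavy jobs such as rendering or simulation output are the most exposed to it.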

Solve Tomorrow’s Problems Today

At CoreWeave, you won’t be forced into a box. We meet clients where they are, and provide economics that empower them to scale. Our modern infrastructure helps clients reach maximum efficiency, saving between 50% and 80% compared with legacy clouds. We’d love to help you too! Get started by speaking with one of our engineers.
