Real Cloud Infrastructure for Real AI Workloads: Training and Inference at Production Scale
CoreWeave

Event details

Chen Goldberg, EVP, Product & Engineering, CoreWeave
Corey Sanders, SVP of Product, CoreWeave

Schedule: 30 minutes

Infrastructure Built for Production-Scale AI

Today's frontier and mixture-of-experts models aren't small. They span multiple trillions of parameters and require precise coordination across thousand-GPU clusters.

Traditional cloud environments simply weren't built for this scale. To move from experimentation to real-world deployment, teams need infrastructure purpose-built for sustained, large-scale workloads.

In this session, CoreWeave details how we optimize every layer of the AI stack, from infrastructure to orchestration to observability, to run large-scale training and inference workloads efficiently. We also examine the architectural breakthroughs that enable rack-scale systems to operate with ultra-low latency and high reliability.

These are the essential cloud components that power the next generation of agentic AI. The question is: how does your infrastructure stack up?

What you’ll learn in this on-demand session

  • How infrastructure requirements change when scaling to trillion-parameter and mixture-of-experts models
  • How full-stack optimization across infrastructure, orchestration, and observability improves performance and efficiency
  • Architectural innovations enabling ultra-low latency, rack-scale AI systems
  • Best practices for running production-grade AI workloads, including agentic AI systems

Speakers

Chen Goldberg
EVP, Product & Engineering, CoreWeave

Corey Sanders
SVP of Product, CoreWeave
