On-demand Webinar: How to measure and optimize AI infrastructure for large-scale training

Wes Brown

Distinguished Engineer

CoreWeave

Deok Filho

Product Manager

CoreWeave

Is infrastructure purpose-built for AI training at scale really better? If so, how much better?

Our engineering team set out to answer this question. The effort took months of research and testing, and even involved training our own AI model. The results are captured in our latest performance benchmarking whitepaper.

Join Distinguished Engineer Wes Brown and Product Manager Deok Filho as they pull back the curtain on the methodology, the surprises, and the hard-won optimizations that delivered up to 20% more throughput, 10x longer uptime, and 97–98% utilization.

In this session, you’ll learn:

The hard data, charts, and benchmarks that prove an AI-first cloud outperforms industry training benchmarks
How we measured model FLOPS utilization (MFU), mean time to failure (MTTF), and effective training time ratio (ETTR) at massive scale, and why those metrics matter
What optimizations move the needle, from high-throughput tokenization to async checkpointing and automated recovery
Actionable next steps for applying these measurement and optimization techniques to your own AI training workflows
