MLPerf v6.0 Results

The only AI cloud leading MLPerf 6.0 in training & inference

Once again, CoreWeave delivers benchmark-leading performance across MLPerf 6.0 training and inference.

What is MLPerf?

MLPerf is an industry-standard benchmark suite developed by MLCommons to measure machine learning training and inference performance. MLPerf Inference measures performance across realistic deployment scenarios. The speed at which systems process inputs and generate outputs from a trained model directly influences user experience, making the MLPerf Inference benchmark a critical performance metric for both CoreWeave and our customers.

CoreWeave delivers unmatched performance benchmarks

CoreWeave consistently sets new records in MLPerf benchmarking, leading the industry in both AI training and inference performance.

2-minute frontier training

Our MLPerf® Training v6.0 submission trained DeepSeek-V3 671B in 2.02 minutes on 8,192 NVIDIA GB300 GPUs — the fastest time recorded in the round across all available-cloud submissions on this benchmark.

16x larger cluster

Our MLPerf® Training v6.0 DeepSeek-V3 671B submission ran on the largest GB300 NVL72 cluster ever benchmarked on this workload — 8,192 GPUs, 16x the next-largest GB300 submission in the round.

2.8x faster

Our Llama 3.1 405B Training v6.0 submission hit the same MLPerf® quality target in 9.77 minutes — a 2.8x wall-clock speedup over our v5.0 result (27.33 minutes) on the same benchmark.
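The 2.8x figure follows directly from the two time-to-train values quoted above. A minimal sketch of that arithmetic, where the `speedup` helper is a hypothetical name used only for illustration:

```python
def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new run is than the baseline."""
    if baseline_minutes <= 0 or new_minutes <= 0:
        raise ValueError("times must be positive")
    return baseline_minutes / new_minutes


# Time-to-train values taken from the text above (Llama 3.1 405B).
v5_minutes = 27.33  # MLPerf Training v5.0 result
v6_minutes = 9.77   # MLPerf Training v6.0 result

ratio = speedup(v5_minutes, v6_minutes)
print(f"{ratio:.1f}x faster")  # prints "2.8x faster"
```

The ratio compares wall-clock time to reach the same quality target, so it says nothing by itself about cluster size or per-GPU efficiency.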

MLPerf Training v5.0

Achieve faster training performance

CoreWeave, NVIDIA, and IBM partnered to deliver groundbreaking MLPerf Training v5.0 results, showcasing an NVIDIA GB200 cluster 34x larger than the next-largest submission. Our results demonstrate exceptional scalability and efficiency, dramatically shortening training times so you can innovate faster.

MLPerf Inference v5.0

Industry-leading MLPerf Inference results for unmatched production speed

CoreWeave is the first and only cloud provider to submit MLPerf Inference v5.0 results for NVIDIA GB200 Grace Blackwell instances, delivering over 800 tokens per second on the Llama 3.1 405B model, a 2.86x per-chip performance boost over NVIDIA H200 GPUs. Our NVIDIA H200 GPU instances also reached 33,000 tokens per second on the Llama 2 70B model, improving throughput by 40% compared to NVIDIA H100 GPUs. This unmatched inference performance ensures maximum GPU utilization and faster innovation cycles for our customers.

Trusted by leading AI labs, enterprises, and startups
Rev.com
Decart
Cloudflare
Abridge
OpenAI
Jane Street
Cohere
Google
WaveForms AI
Inflection
Fireworks AI
Augment
Altum
Conjecture
Chai
MistralAI
NovelAI

Frequently asked questions

What is MLPerf?

MLPerf is an industry-standard benchmark suite developed by MLCommons to measure and compare machine learning training and inference performance across hardware and platforms.

Why are MLPerf benchmarks important?

MLPerf benchmarks provide a fair, transparent method for evaluating AI hardware and cloud platforms, helping businesses choose solutions that offer optimal performance, scalability, and cost efficiency.

What does MLPerf Training v6.0 measure?

MLPerf Training v6.0 measures how quickly a computing system can train complex machine learning models—such as Meta's Llama 3.1 405B and the new DeepSeek-V3 671B mixture-of-experts benchmark—from initialization to a specified quality target, enabling fair and transparent performance comparisons across hardware platforms and cloud providers.

What does MLPerf Inference v5.0 measure?

MLPerf Inference v5.0 measures how quickly computing systems process inputs and generate outputs using fully trained machine learning models. It focuses on throughput (tokens per second) and latency across realistic deployment scenarios, so the inference performance of hardware and cloud infrastructure providers can be evaluated and compared.
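To make the two metrics concrete, here is a minimal sketch of how throughput and tail latency can be computed from per-request measurements. The sample numbers and function names are hypothetical, and this is not the MLPerf measurement harness, which applies its own scenario rules; it only illustrates the definitions.

```python
import math


def throughput_tokens_per_second(total_tokens: int, wall_seconds: float) -> float:
    """Tokens generated per second of wall-clock time."""
    return total_tokens / wall_seconds


def percentile_latency(latencies_s: list[float], pct: float) -> float:
    """Nearest-rank percentile of a list of per-request latencies."""
    ordered = sorted(latencies_s)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical measurements: four overlapping requests.
request_tokens = [120, 80, 200, 100]
request_latencies_s = [0.5, 0.4, 1.0, 0.6]
wall_seconds = 1.25  # total elapsed time; requests ran concurrently

tps = throughput_tokens_per_second(sum(request_tokens), wall_seconds)
p50 = percentile_latency(request_latencies_s, 50)
p99 = percentile_latency(request_latencies_s, 99)
print(f"throughput={tps:.0f} tok/s, p50={p50}s, p99={p99}s")
```

Throughput is computed over wall-clock time rather than summed request time because concurrent requests overlap; a system can raise throughput through batching while individual request latency gets worse, which is why both metrics are reported.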

How does CoreWeave's performance compare in MLPerf benchmarks?

In MLPerf Training v6.0, CoreWeave trained DeepSeek-V3 671B in just 2.02 minutes on 8,192 NVIDIA GB300 GPUs, the fastest time in the round among available-cloud submissions on this benchmark. It also improved its Llama 3.1 405B time-to-train 2.8x over its v5.0 result, from 27.33 minutes to 9.77 minutes. These results were achieved on the same production GB300 NVL72 infrastructure customers run on every day.

What makes CoreWeave GPUs unique for MLPerf results?

CoreWeave's results come from the full platform, not hardware alone. By combining the latest NVIDIA GB300 NVL72 systems with NVIDIA Spectrum-X networking, the topology-aware SUNK scheduler, and CoreWeave Mission Control, CoreWeave sustains industry-leading performance and scaling efficiency from 64 to 8,192 GPUs—on the same infrastructure available to customers.


Ready to accelerate your roadmap?

Gain a competitive edge with the most performant AI cloud on the market.

Disclaimer

Result verified by MLCommons Association. The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.