2-minute frontier training
Our MLPerf® Training v6.0 submission trained DeepSeek-V3 671B in 2.02 minutes on 8,192 NVIDIA GB300 GPUs — the fastest time recorded in the round across all available-cloud submissions on this benchmark.
16x larger cluster
Our MLPerf® Training v6.0 DeepSeek-V3 671B submission ran on the largest GB300 NVL72 cluster ever benchmarked on this workload — 8,192 GPUs, 16x the next-largest GB300 submission in the round.
2.8x faster
Our Llama 3.1 405B Training v6.0 submission hit the same MLPerf® quality target in 9.77 minutes — a 2.8x wall-clock speedup over our v5.0 result (27.33 minutes) on the same benchmark.




















