Why Inference Engineering Is the Next AI Frontier

Part of the Who’s Ready for Anything series, this episode features Baseten’s Philip Kiely at NVIDIA GTC 2026. Learn why inference, not training, is the real challenge in AI, and what it takes to run models in production at scale.

In this video:

  • Why inference engineering is essential for production AI systems
  • The hidden complexity of running models with low latency and high reliability
  • How to optimize performance, cost, and scalability for real-world workloads
  • Why an incremental, real-world approach leads to successful AI deployment