If you run GPU workloads on Kubernetes at any meaningful scale, you've probably hit a point where the default scheduler isn't enough. Fractional GPU requests, quota enforcement, gang scheduling, preemption — none of that comes out of the box. That's the gap KAI Scheduler fills.
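To make the fractional-GPU point concrete, here is a minimal sketch of what a KAI-scheduled pod can look like. The `kai.scheduler/queue` label, `gpu-fraction` annotation, and `schedulerName: kai-scheduler` follow the patterns in the KAI Scheduler quickstart; the queue name `team-a` and pod details are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fractional-gpu-demo
  labels:
    # Assigns the pod to a scheduling queue (queue name is hypothetical)
    kai.scheduler/queue: team-a
  annotations:
    # Requests half a GPU instead of a whole device
    gpu-fraction: "0.5"
spec:
  # Opt in to KAI instead of the default kube-scheduler
  schedulerName: kai-scheduler
  containers:
    - name: cuda-workload
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
```

Note that the fraction is expressed as an annotation, not a `resources.limits` entry — the default scheduler only understands whole-device `nvidia.com/gpu` counts, which is exactly the gap described above.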

KAI Scheduler is NVIDIA's open-source Kubernetes-native GPU scheduler, originally built inside the Run:ai platform and released under the Apache 2.0 license in April 2025. It's now a CNCF Sandbox project with over 1,200 GitHub stars, and it's quickly becoming the go-to scheduler for teams running AI workloads on Kubernetes — whether on-prem, colo, or cloud.

At BaaZ, we work with KAI Scheduler in production GPU clusters. We recently contributed a queue validation webhook (PR #857) that prevents a class of misconfiguration bugs in hierarchical queue setups. This post explains the problem, the fix, and why it matters for anyone operating multi-tenant GPU infrastructure.