04/21/2026 | Press release | Distributed by Public on 04/21/2026 07:15
Cast AI, the leading automation platform, today released its 2026 State of Kubernetes Optimization Report, a comprehensive analysis of GPU, CPU, and memory utilization across non-optimized Kubernetes clusters. Drawing on data from tens of thousands of clusters, the report delivers a clear and urgent message: GPU utilization averages just 5%, despite the high cost of the hardware. The efficiency gains that Kubernetes was designed to unlock are not emerging naturally with scale, and the gap between what organizations pay for and what they actually use is widening.
As Kubernetes adoption accelerates across organizations of every size and industry, resource utilization is moving in the opposite direction. Average CPU utilization across clusters stood at just 8% in 2025, while memory utilization was 20%.
A newer and rapidly escalating pressure is amplifying the problem: the expansion of GPU-equipped nodes as Kubernetes becomes the default platform for AI and ML workloads. Yet the data tells the same story for GPUs as it does for CPU and memory. GPU utilization averaged just 5% across the clusters analyzed, representing an enormous and largely invisible cost for organizations investing heavily in AI infrastructure.
As enterprises race to build AI capabilities on Kubernetes, the report warns that without the right optimization infrastructure in place, GPU waste will emerge as one of the most expensive inefficiencies in the modern cloud stack.
Cast AI's report identifies a critical misconception holding back Kubernetes efficiency: the belief that configuration is a deployment-time task. Rightsizing that runs once at deployment is not rightsizing. Workloads change, traffic patterns shift, and the configuration that was accurate six months ago is unlikely to remain accurate today. The same applies to Spot Instance selection, autoscaler configuration, commitment utilization, and node lifecycle management: each has a time dimension that manual processes simply cannot keep pace with at scale.
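To illustrate the difference between one-time and continuous rightsizing, the following is a minimal Python sketch that recomputes a CPU request from recent usage samples. It is not Cast AI's method: the function name, the 95th-percentile target, the 15% headroom, and the sample figures are all assumptions chosen for illustration.

```python
def recommend_cpu_request(usage_millicores, percentile_pct=95, headroom_pct=15):
    """Suggest a CPU request (in millicores) from recent usage samples.

    Hypothetical helper for illustration only. The percentile and headroom
    defaults are assumptions, not values taken from the report.
    """
    if not usage_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_millicores)
    # Nearest-rank percentile: the smallest sample that covers
    # percentile_pct of the observations (integer ceiling division).
    rank = max(1, -(-percentile_pct * len(ordered) // 100))
    target = ordered[rank - 1]
    # Add headroom on top of the percentile, rounding up to a whole millicore.
    return -(-target * (100 + headroom_pct) // 100)


# A request fixed at deployment time versus one derived from recent usage.
# Both the request and the samples are made-up illustrative numbers.
deployed_request = 2000
recent_usage = [110, 140, 95, 180, 160, 120, 130, 150, 105, 170]
recommended = recommend_cpu_request(recent_usage)
print(f"deployed: {deployed_request}m, recommended: {recommended}m")
```

Run once, a calculation like this only captures a single moment. Run on a schedule over a rolling window of samples, the same calculation follows the workload as traffic patterns shift, which is the time dimension the report describes.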
"A GPU sitting idle costs dollars per hour. A CPU sitting idle costs cents. And 95% of GPU capacity is doing nothing," said Laurent Gil, co-founder and president, Cast AI. "Cloud vendors just raised H200 prices 15%, breaking a 20-year trend of falling compute costs. That's not a configuration problem as much as it is a business emergency. Autonomous optimization is the only rational response to infrastructure economics that are moving against you."
The 2026 State of Kubernetes Optimization Report is based on Cast AI's analysis of real-world utilization data across tens of thousands of Kubernetes workloads. It is designed to give engineering and infrastructure leaders the insights they need to understand where inefficiencies originate, and what it takes to correct them. Cast AI has unique visibility into this data by virtue of its position as the go-to platform for organizations running Kubernetes at scale.