Capacity allocation and the next generation of AI-era KPIs

AI workloads are reshaping data center infrastructure at an unprecedented scale. Rack densities exceeding 50 kW are becoming increasingly common. AI training clusters often operate near their peak power envelope for extended periods, placing sustained demand on facility infrastructure. At the same time, AI inference workloads exhibit different characteristics, often behaving more like traditional compute environments in terms of utilization patterns.

In this environment, the industry continues to rely heavily on power usage effectiveness (PUE) as its primary infrastructure efficiency benchmark. PUE remains essential; it provides a simple, widely understood measure of facility overhead relative to IT load and allows operators to benchmark mechanical and electrical performance across sites. However, AI-scale deployments are introducing an additional structural consideration that PUE was not designed to address. PUE evaluates how efficiently a facility operates once energized; it does not assess how a site’s provisioned electrical capacity is allocated under its declared redundancy basis, nor how much of that capacity is structurally available for IT.

The shift is not about replacing PUE. It is about recognizing that robust operation and full utilization of the available power capacity must be measured and managed as a critical operational metric.

PUE and the limits of operational efficiency

For nearly two decades, measuring and monitoring PUE has driven significant improvements in mechanical design, electrical distribution, airflow management and UPS efficiency. It remains a valuable baseline indicator of infrastructure discipline.

However, PUE focuses on operational overhead relative to IT load. It does not address a growing structural question in AI deployments: how much of a site’s provisioned electrical capacity is available to IT in the first place?

In regions where utility interconnection capacity is constrained, this structural availability is becoming more consequential than incremental overhead optimization.

The permitted power envelope constraint

As AI deployments scale into the hundreds of megawatts, the limiting factor is often not chiller efficiency or airflow containment. Instead, it is the amount of electrical capacity permitted and provisioned under a site’s declared redundancy configuration.

The redundancy architecture of the facility directly affects how that power envelope is allocated. Higher-redundancy designs, such as N+1 or 2N configurations, require additional electrical and mechanical infrastructure to maintain reliability, which in turn consumes a larger share of the provisioned capacity.

Operators increasingly need to evaluate:

Total provisioned site capacity from grid and on-site sources.
Capacity structurally reserved for cooling systems at peak design load.
Electrical conversion and distribution losses.
Auxiliary building loads.
The capacity available to IT compute.

These are allocation questions and reflect facility system design decisions made before workloads are energized.
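The allocation questions listed above can be sketched as a simple capacity budget. This is a minimal illustration, not a method described by any vendor or standard: it assumes the capacity available to IT is whatever remains after the other reservations are subtracted, and the class name, field names and megawatt figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteAllocation:
    """Hypothetical breakdown of a provisioned envelope, all values in MW."""

    total_provisioned_mw: float   # grid plus on-site sources
    cooling_reserved_mw: float    # cooling at peak design load
    conversion_losses_mw: float   # electrical conversion and distribution losses
    auxiliary_mw: float           # auxiliary building loads

    def it_available_mw(self) -> float:
        """Capacity left for IT, assuming a plain subtractive budget."""
        remaining = (
            self.total_provisioned_mw
            - self.cooling_reserved_mw
            - self.conversion_losses_mw
            - self.auxiliary_mw
        )
        if remaining < 0:
            raise ValueError("reservations exceed provisioned capacity")
        return remaining


# Illustrative figures only; they do not describe a real site.
site = SiteAllocation(
    total_provisioned_mw=100.0,
    cooling_reserved_mw=22.0,
    conversion_losses_mw=6.0,
    auxiliary_mw=2.0,
)
print(site.it_available_mw())  # 70.0
```

A real design would derive each reservation from the declared redundancy basis and peak design conditions rather than from fixed inputs.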

This allocation lens is formalized in the concept of power and compute effectiveness (PCE).

PCE as a capacity allocation metric

PCE is a metric developed by cooling system provider Airsys to provide greater transparency into power allocation within a data center’s provisioned electrical envelope. As defined by Airsys, PCE is the ratio of provisioned IT compute power allocation to total provisioned site electrical capacity.
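As a minimal sketch of that ratio, the function below divides provisioned IT allocation by total provisioned site capacity. The function name, the input validation and the example figures are assumptions made for illustration; they are not part of the Airsys definition.

```python
def pce(it_allocation_mw: float, total_site_capacity_mw: float) -> float:
    """Ratio of provisioned IT power allocation to total provisioned site capacity."""
    if total_site_capacity_mw <= 0:
        raise ValueError("total site capacity must be positive")
    if not 0 <= it_allocation_mw <= total_site_capacity_mw:
        raise ValueError("IT allocation must lie within the site envelope")
    return it_allocation_mw / total_site_capacity_mw


# Hypothetical site: 70 MW allocatable to IT within a 100 MW envelope.
example_pce = pce(70.0, 100.0)
print(f"PCE = {example_pce:.2f}")  # PCE = 0.70
```

Both inputs are design-time, provisioned values, so the result does not change with workload.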

PCE does not measure real-time utilization, evaluate workload efficiency or compete with PUE. Instead, it answers a structural question:

Given a permitted power envelope and designed redundancy, how much of that envelope is sustainably allocatable to IT?

In markets where utility interconnection capacity is constrained, this ratio can help operators understand how infrastructure design decisions affect the amount of compute that can ultimately be deployed within a fixed electrical envelope. This visibility can inform site planning, infrastructure design choices and long-term expansion strategies.

A layered KPI landscape

AI-scale infrastructure requires complementary metrics that operate at different layers of the system. Each key performance indicator (KPI) answers a distinct question about performance, allocation or impact, so no single metric replaces another. These distinctions are summarized in Table 1.

Table 1. KPI comparison overview

A facility may exhibit strong PUE and PCE yet deliver poor compute productivity due to low IT infrastructure utilization. Conversely, a site may achieve high compute productivity but face structural limits on expansion because cooling provisioning constrains allocatable IT capacity.
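The first case can be made concrete with a rough sketch that places the layers side by side. Several simplifications are assumed here: PUE is approximated from a single power snapshot (total facility draw divided by IT draw) rather than from annualized energy, and the share of the IT allocation in use is a hypothetical stand-in for utilization, not a metric defined in this report. All names and figures are illustrative.

```python
def layered_view(
    total_provisioned_mw: float,
    it_allocation_mw: float,
    facility_draw_mw: float,
    it_draw_mw: float,
) -> dict[str, float]:
    """Return three ratios that each answer a different question."""
    return {
        # Operational overhead: measured facility draw relative to IT draw.
        "pue": facility_draw_mw / it_draw_mw,
        # Structural allocation: share of the envelope provisioned for IT.
        "pce": it_allocation_mw / total_provisioned_mw,
        # Hypothetical utilization proxy: share of the IT allocation in use.
        "it_allocation_used": it_draw_mw / it_allocation_mw,
    }


# Hypothetical site: 100 MW envelope, 75 MW allocated to IT,
# but only 20 MW of IT load running (24 MW total facility draw).
view = layered_view(100.0, 75.0, 24.0, 20.0)
for name, value in view.items():
    print(f"{name}: {value:.2f}")
```

In this invented case the overhead and allocation ratios both look healthy, while only about a quarter of the IT allocation is doing work, which is the gap that neither PUE nor PCE would reveal on its own.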

Understanding AI infrastructure performance requires viewing these metrics together rather than in isolation.

Cooling architecture and allocation trade-offs

As rack densities increase in high-performance computing environments, liquid cooling technologies — such as direct-to-chip cooling, rear-door heat exchangers and immersion systems — are moving from pilot deployments to production-scale use. These approaches can reduce fan energy and improve heat removal efficiency compared with traditional air-based cooling. However, they also change how cooling loads appear within a facility’s electrical envelope. Pumping systems, heat rejection infrastructure and associated redundancy layers all draw from the same provisioned capacity that supports the rest of the facility.

As a result, cooling architecture decisions directly influence how much of the provisioned power envelope is allocated to facility systems and how much remains available for IT compute.

In many facilities, the provisioned electrical envelope is not fully utilized. Operators may secure interconnection capacity and install infrastructure sized for future growth, yet only a portion of that capacity actively supports IT load. Metrics such as PCE provide visibility into this allocation by showing how much of the provisioned capacity is available to IT compute.

The Uptime Intelligence View

PCE does not seek to redefine redundancy tiers or alter declared design conditions; rather, it introduces a structural lens on power capacity allocation that can sit alongside existing reliability frameworks. A key consideration is whether maximizing PCE within a given redundancy architecture helps operators make better use of the available power envelope while maintaining the facility’s intended reliability posture.

The industry is understandably cautious when metrics intersect with reliability design. At the same time, as grid constraints tighten, the allocation of available power between facility systems and IT within a fixed envelope is becoming increasingly important to making full use of installed infrastructure. The value and applicability of PCE will depend on whether operators begin to track and manage the metric and on whether broader industry data emerges demonstrating that it provides meaningful insight into data center performance.