Uptime Institute Blog
Performance expectations of liquid cooling need a reality check

March 20, 2024 / in Design, Executive, Operations / by Daniel Bizo, Research Director, Uptime Institute Intelligence, dbizo@uptimeinstitute.com

The idea of using liquids to cool IT hardware, exemplified by technologies such as cold plates and immersion cooling, is frequently hailed as the ultimate solution to the data center’s energy efficiency and sustainability challenges. If a data center replaces air cooling with direct liquid cooling (DLC), chilled water systems can operate at higher supply and return water temperatures, which are favorable for both year-round free cooling and waste heat recovery.

Indeed, there are some larger DLC system installations that use only dry coolers for heat rejection, and a few installations are integrated into heat reuse schemes. As supply chains remain strained and regulatory environments tighten, the attraction of leaner and more efficient data center infrastructure will only grow.

However, thermal trends in server silicon will challenge engineering assumptions, chiefly the DLC coolant design temperature points that ultimately underpin operators’ technical, economic and sustainability expectations of DLC. Some data center operators say the mix of technical and regulatory changes on the horizon is difficult to understand when planning for future capacity expansions, and the evolution of data center silicon will only add to the complications.

The only way is up: silicon power keeps escalating

Uptime Institute Intelligence has repeatedly noted the gradual but inescapable trend towards higher server power, barring a fundamental change in chip manufacturing technology (see Silicon heatwave: the looming change in data center climates). Not long ago, a typical enterprise server used less than 200 watts (W) on average, and stayed well below 400 W even when fully loaded. More recent highly performant dual-socket servers can reach 700 W to 800 W thermal power, even when lightly configured with memory, storage and networking. In a few years, mainstream data center servers with high-performance configurations will require as much as 1 kilowatt (kW) in cooling, even without the addition of power-hungry accelerators.

Two factors drive this trend: semiconductor physics and server economics. First, even though semiconductor circuits’ switching energy is dropping, the energy gains are being outpaced by an increase in the scale of integration. As semiconductor technology advances, the same area of silicon will gradually consume (and dissipate) ever more power as a result. Chips are also increasing in size, compounding this effect.

Second, many large server buyers prefer highly performant chips that can process greater software payloads faster because these chips drive infrastructure efficiency and business value. For some, such as financial traders and cloud services providers, higher performance can translate into more direct revenue. In return for these benefits, IT customers are ready to pay hefty price premiums and accept that high-end chips are more power-hungry.

DLC to wash cooling problems away

The escalation of silicon power is now supercharged by the high demand for artificial intelligence (AI) training and other supercomputing workloads, which will make the use of air cooling more costly. Fan power in high-performance servers can often account for 10% to 20% of total system power. This comes on top of higher silicon static power losses caused by operating near the upper temperature limit. There is also a loss of server density, resulting from the need to accommodate larger heat sinks and fans, and to allow more space between the electronics.

In addition, air cooling may soon see restrictions in operating temperatures after nearly two decades of gradual relaxation of set points. In its 2021 Equipment thermal guidelines for data processing environments, US industry body ASHRAE created a new environmental class for high-density servers with a recommended supply temperature maximum of 22°C (71.6°F) — a whole 5°C (9°F) lower than the general guidelines (Class A1 to A4), with a corresponding dip in data center energy efficiency (see New ASHRAE guidelines challenge efficiency drive).
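The two Fahrenheit figures above follow different conversion rules: an absolute temperature carries the 32-degree offset, while a temperature difference does not. A minimal Python sketch of that arithmetic follows. The function names are illustrative, and the 27°C general maximum is simply what the two quoted figures imply.

```python
def c_to_f(temp_c: float) -> float:
    """Convert an absolute temperature from Celsius to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def delta_c_to_delta_f(delta_c: float) -> float:
    """Convert a temperature difference; the 32-degree offset does not apply."""
    return delta_c * 9.0 / 5.0


high_density_max_c = 22.0  # recommended supply maximum quoted in the article
reduction_c = 5.0          # reduction versus the general guidelines, per the article
# Implied by the two figures above (an inference, not a quoted value).
general_max_c = high_density_max_c + reduction_c

print(round(c_to_f(high_density_max_c), 1))       # absolute temperature
print(round(delta_c_to_delta_f(reduction_c), 1))  # temperature difference
print(round(c_to_f(general_max_c), 1))            # implied general maximum
```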

Adopting DLC offers relief from the pressure of these trends. The superior thermal performance of liquids, whether water or engineered fluids, makes the job of removing several hundred watts of thermal energy from compact IT electronics more straightforward. Current top-of-the-line processors (up to 350 W thermal design power) and accelerators (up to 700 W on standard parts such as NVIDIA data center GPUs) can be effectively cooled even at high liquid coolant temperatures, allowing the facility water supply for the DLC system to run as high as 40°C (104°F), and even up to 45°C (113°F).

High facility water temperatures could enable the use of dry coolers in most climates. Alternatively, the facility can offer valuable waste heat to a potential offtaker. The promise is attractive: much reduced IT and facility fan power, elimination of compressors (which also lowers capital and maintenance needs), and little to no water use for cooling. Today, several high-performance computing facilities with DLC systems take advantage of the heat-rejection or heat-reuse benefits of high temperatures.
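To illustrate why the facility water set point matters for dry coolers, the sketch below assumes that a dry cooler can deliver water no colder than the ambient dry-bulb temperature plus a fixed approach. The 8 K approach, the sample ambient readings and the function names are all hypothetical, chosen only to show the shape of the calculation. They are not measurements or design values from the article.

```python
from typing import Sequence


def max_ambient_for_dry_cooling(supply_setpoint_c: float, approach_c: float) -> float:
    """Highest ambient dry-bulb temperature at which the set point is still reachable."""
    return supply_setpoint_c - approach_c


def dry_cooling_fraction(ambient_temps_c: Sequence[float],
                         supply_setpoint_c: float,
                         approach_c: float) -> float:
    """Fraction of ambient readings at which a dry cooler alone meets the set point."""
    limit_c = max_ambient_for_dry_cooling(supply_setpoint_c, approach_c)
    hours_ok = sum(1 for t in ambient_temps_c if t <= limit_c)
    return hours_ok / len(ambient_temps_c)


# Made-up ambient dry-bulb readings in degrees C (illustration only).
sample_ambient_c = [18.0, 24.0, 29.0, 33.0, 36.0, 38.0, 31.0, 22.0]
ASSUMED_APPROACH_K = 8.0  # hypothetical dry cooler approach temperature

# 40 and 45 are the high set points quoted above; 32 is a lower one for comparison.
for setpoint_c in (32.0, 40.0, 45.0):
    share = dry_cooling_fraction(sample_ambient_c, setpoint_c, ASSUMED_APPROACH_K)
    print(f"{setpoint_c:.0f} C set point: dry cooling sufficient for {share:.1%} of readings")
```

With real hourly weather data for a site in place of the sample list, the same function would give a first estimate of how many hours per year need supplementary heat rejection.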

Temperature expectations need to cool down

Achieving these benefits is not necessarily straightforward. Details of DLC system implementation, further increases in component thermal power, and temperature restrictions on some components all complicate the process further.

  • Temperatures depend on the type of DLC implementation. Water-cooled IT systems, the most common type in use today, often serialize multiple cold plates within a server to simplify tubing, which means downstream components will receive a higher temperature coolant than the original supply. This is particularly true for densified compute systems with very compact chassis, and restricts coolant supply temperatures to well below what would be theoretically permissible with a parallel supply to every single cold plate.
  • Thermal design power has not peaked. The forces underlying the rise in silicon power (discussed above) remain in play, and the data center industry widely expects even more power-hungry components in the coming years. Yet these expectations rest on anecdotes, rumors and leaks in the trade press rather than on publicly available information. Server chip vendors refuse to publicize the details of their roadmaps, and only select customers under nondisclosure agreements have better visibility. From our discussions with suppliers, Uptime Intelligence can surmise that more powerful processors are likely to surpass the 500 W mark by 2025. Some suppliers are running proofs of concept simulating silicon heat loads of 800 W and higher.
  • Temperature restrictions of processors. It is not necessarily the heat load that will cap facility water temperatures, but the changing silicon temperature requirements. As thermal power goes up, the maximum temperature permitted on the processor case (known as Tcase) is coming down, to create a larger temperature difference to the silicon and boost heat flux. Intel has also introduced processor models specified for liquid cooling, with Tcase as low as 57°C (134.6°F), which is more than a 20°C (36°F) drop from comparable air-cooled parts. These low-Tcase models are intended to take advantage of the lower operating temperature made possible by liquid cooling to maximize peak performance levels when running computationally intense code, which is typical in technical and scientific computing.
  • Memory module cooling. In all the speculation around high-power processors and accelerators, a potentially overlooked issue is the cooling of server memory modules, whose heat output was once treated as negligible. As module density, operating speeds and overall capacity increase with successive generations, maintaining healthy operating temperature ranges is becoming more challenging. Unlike logic chips such as processors, which can withstand higher operating temperatures, dynamic memory (DRAM) cells show performance degradation above 85°C (185°F), including elevated power use, higher latency, and (if thermal escalation is unchecked) bit errors and overwhelmed error correction schemes. Because some of the memory modules will typically be downstream of processors in a cold-plate system, they receive higher temperature coolant. In many cases it won’t be the processor’s Tcase that will restrict coolant supply temperatures, but the limits of memory chips.
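The interaction between serialized cold plates and component temperature limits can be sketched with a simple energy balance: each plate warms the coolant by Q / (m_dot * c_p) before it reaches the next one. The Python sketch below is a rough illustration under stated assumptions, not a design method. The flow rate, the thermal resistances and the memory heat load are hypothetical placeholders. The 350 W, 57°C and 85°C figures are borrowed from the text purely as example inputs, and each plate is modeled with a single lumped resistance between its case and its coolant inlet.

```python
from dataclasses import dataclass
from typing import List, Sequence

WATER_CP_J_PER_KG_K = 4186.0  # approximate specific heat of water


@dataclass
class ColdPlate:
    name: str
    heat_w: float        # heat transferred to the coolant at this plate
    case_limit_c: float  # maximum permitted component temperature
    r_th_k_per_w: float  # assumed lumped resistance, case to coolant inlet


def inlet_temps_series_c(supply_c: float, flow_kg_s: float,
                         plates: Sequence[ColdPlate]) -> List[float]:
    """Coolant temperature arriving at each plate when plates are in series."""
    temps = []
    coolant_c = supply_c
    for plate in plates:
        temps.append(coolant_c)
        coolant_c += plate.heat_w / (flow_kg_s * WATER_CP_J_PER_KG_K)
    return temps


def max_supply_series_c(flow_kg_s: float, plates: Sequence[ColdPlate]) -> float:
    """Highest supply temperature that keeps every plate in the chain within its limit."""
    preheat_k = 0.0  # coolant temperature rise accumulated upstream
    limit_c = float("inf")
    for plate in plates:
        allowed_c = plate.case_limit_c - plate.heat_w * plate.r_th_k_per_w - preheat_k
        limit_c = min(limit_c, allowed_c)
        preheat_k += plate.heat_w / (flow_kg_s * WATER_CP_J_PER_KG_K)
    return limit_c


def max_supply_parallel_c(plates: Sequence[ColdPlate]) -> float:
    """Same limit if every plate received coolant straight from the supply."""
    return min(p.case_limit_c - p.heat_w * p.r_th_k_per_w for p in plates)


FLOW_KG_S = 1.0 / 60.0  # hypothetical: about 1 L/min of water through the loop
loop = [
    ColdPlate("cpu0", heat_w=350.0, case_limit_c=57.0, r_th_k_per_w=0.03),
    ColdPlate("cpu1", heat_w=350.0, case_limit_c=57.0, r_th_k_per_w=0.03),
    ColdPlate("dimm_bank", heat_w=80.0, case_limit_c=85.0, r_th_k_per_w=0.2),
]

print([round(t, 1) for t in inlet_temps_series_c(40.0, FLOW_KG_S, loop)])
print(round(max_supply_series_c(FLOW_KG_S, loop), 1))
print(round(max_supply_parallel_c(loop), 1))
```

With these placeholder values, the serial chain tolerates a supply temperature several degrees lower than a parallel arrangement would. Which component ends up as the limiting one depends entirely on the assumed heat loads, resistances and plate order.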

The net effect of all these factors is clear: widespread deployment of DLC to promote virtually free heat rejection and heat reuse will remain aspirational in all but a few select cases where the facility infrastructure is designed around a specific liquid-cooled IT deployment.

There are too many moving parts to accurately assess the precise requirements of mainstream DLC systems in the next five years. What is clear, however, is that the very same forces that are pushing the data center industry towards liquid cooling will also challenge some of the engineering assumptions around its expected benefits.

Operators that are considering dedicated heat rejection for DLC installations will want to make sure they prepare the infrastructure for a gradual decrease in facility supply temperatures. They can achieve this by planning increased space for additional or larger heat rejection units — or by setting the water temperature conservatively from the outset.

Temperature set points are not dictated solely by IT requirements, but also by flow rate considerations, which have consequences for pipe and pump sizing. Operating close to temperature limits means loss of cooling capacity for the coolant distribution units (CDUs), requiring either larger CDUs or more of them. Slim margins also mean any degradation or loss of cooling may have a near immediate effect at full load: a cooling failure in water or single-phase dielectric cold-plate systems may have less than 10 seconds of ride-through time.
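The link between temperature margin and flow rate follows from the heat balance Q = m_dot * c_p * ΔT: for a fixed heat load, halving the allowable coolant temperature rise doubles the required flow. A minimal Python sketch follows, assuming water as the coolant with approximate property values. The 100 kW load and the two temperature rises are hypothetical inputs, not figures from the article.

```python
WATER_CP_J_PER_KG_K = 4186.0   # approximate specific heat of water
WATER_DENSITY_KG_PER_L = 1.0   # approximation; slightly lower at elevated temperatures


def required_flow_l_per_min(heat_load_w: float, delta_t_k: float) -> float:
    """Volumetric water flow needed to carry heat_load_w with a rise of delta_t_k."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_s = heat_load_w / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_DENSITY_KG_PER_L * 60.0


HEAT_LOAD_W = 100_000.0  # hypothetical 100 kW liquid-cooled load

wide_margin = required_flow_l_per_min(HEAT_LOAD_W, delta_t_k=10.0)
slim_margin = required_flow_l_per_min(HEAT_LOAD_W, delta_t_k=5.0)

print(f"10 K rise: {wide_margin:.0f} L/min")
print(f" 5 K rise: {slim_margin:.0f} L/min")
```

The sketch covers only the flow rate itself. Pipe diameters, pump head and CDU capacity all follow from that flow, which is why a tight temperature budget carries through to the sizing of the whole loop.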

Today, temperatures seem to be converging around 32°C (89.6°F) for facility water — a good balance between facility efficiency, cooling capacity and support for a wide range of DLC systems. Site manuals for many water-cooled IT systems also have the same limit. Although this is far higher than any elevated water temperature for air-cooling systems, it still requires additional heat rejection infrastructure either in the form of water evaporation or mechanical cooling. Whether lower temperatures will be needed as server processors approach 500 W — with large memory arrays and even higher power accelerators — will depend on a number of factors, but it is fair to assume the likely answer will be “yes”, despite the high cost of larger mechanical plants.

These considerations and limitations are mostly defined by water cold-plate systems. Single-phase immersion with forced convection and two-phase coolants, probably in the form of cold-plate evaporators rather than immersion, offer alternative approaches to DLC that should help ease supply temperature restrictions. For the time being, water cold plates remain the most widely available and commonly deployed option, and mainstream data center operators will need to ensure they meet the requirements of the IT systems that use them.

In many cases, Uptime Intelligence expects operators to opt for lower facility supply water temperatures for their DLC systems, which brings benefits in lower pumping energy and fewer CDUs for the same cooling capacity, and is also more future-proof. Many operators have already opted for conservative water temperatures as they upgrade their facilities for a blend of air- and liquid-cooled IT. Others will install DLC systems that are not connected to a water supply but are air-cooled using fans and large radiators.


The Uptime Intelligence View

The switch to liquid to cool IT electronics offers a host of energy and compute performance benefits. However, future expectations based on the past performance of DLC installations are unlikely to be met. The challenges of silicon thermal management will only become more difficult as new generations of high-power server and memory chips develop. Stricter component temperature limits mean that future maximum facility water temperatures will have to be set at more conservative levels. For now, the vision of a lean data center cooling plant without either compressors or evaporative water consumption remains elusive.

Tags: cooling, Data Center, digital Infrastructure, Energy Efficiency, thermal management
© 2014-2025 Uptime Institute, LLC All rights reserved.