Blog Single Author Small - Uptime Institute Blog

Direct liquid cooling bubbles to the surface

March 31, 2022/in Design, Executive, Operations/by Lenny Simon, Senior Research Associate, Uptime Institute

Conditions will soon be ripe for widespread use of direct liquid cooling (DLC) — a collection of techniques that uses fluid to remove heat from IT electronics instead of air — and it may even become essential.

Currently, air cooling is still dominant with DLC remaining a niche option — around 85% of enterprise IT operations do not use it at all, according to Uptime Institute Intelligence’s recent Direct Liquid Cooling Survey of nearly 200 enterprises. About 7% of respondents to our survey report some application of DLC, such as in mainframes or a few racks, and this leaves only 9% that use DLC in more substantial ways.

Despite the current low uptake, Uptime Intelligence expects this view to shift markedly in favor of DLC in the coming years. In Uptime Intelligence’s survey, nearly two-thirds of enterprise IT users would consider it as a future option (see Figure 1).

diagram: DLC adoption and attitudes among enterprise operators — **Figure 1 DLC adoption and attitudes among enterprise operators**

Why would operators consider such a major change beyond select use cases? A likely key factor is silicon power density: the heat flux and temperature limits of some next-generation server processors and accelerators coming to market in 2022 / 2023 will push air cooling to its limits. Processors with thermal ratings of 300 watts and above — some with steeply lowered temperature limits — are on the short-term roadmap. To keep up with future advances in silicon performance, while also meeting cost and sustainability goals, air cooling alone could become inadequate surprisingly soon.

Industry body ASHRAE has already issued a warning in the form of adding a new H1 class of high-density IT equipment in its most recent update to its data center thermal guidelines. (See New ASHRAE guidelines challenge efficiency drive.) Class H1 equipment, classified as such by the server maker, require inlet temperatures of under 22°C (71.6°F), well under the recommended upper limit of 27°C (80.6°F) for general IT equipment. Many modern data center designs operate at temperatures above 22°C (71.6°F) to lower cooling energy and water consumption by minimizing the use of compressors and evaporation. Lowering temperatures (which may not be possible without a major overhaul of cooling systems) to accommodate for class H1 equipment would frustrate these objectives.

Collectively, the data center industry considers the use of DLC to be not a question of if, but when and how much. Most enterprises in our survey expect at least 10% of their cabinets to have DLC within five years. A previous Uptime Institute survey supports this data, showing that for data centers with an IT load of 1 megawatt or greater, the industry consensus for mass adoption of DLC is around seven years. Only a few respondents think that air cooling will remain dominant beyond 10 years. (See Does the spread of direct liquid cooling make PUE less relevant?)

Today, there is no single DLC approach that ticks all boxes without compromises. Fortunately for data center operators, there has been rapid development in DLC products since the second half of the 2010s, and vendors now offer a wide range of choice with very different trade-offs. Our survey confirms that cold plates using water chemistries (for example, water that is purified, deionized or a glycol mix) is the most prevalent category. However, there are still significant concerns around leakage among survey participants, which can curb an operator’s appetite for this type of DLC when it comes to large-scale installations.

Other options include:

Cold plates that circulate a dielectric fluid (either single-phase or two-phase) to reduce the risks associated with a leak.
Chassis-based immersion systems that are self-contained (single-phase).
Pool immersion tanks (also either single-phase or two-phase).

Nonetheless, major hurdles remain. There are some standardization efforts in the industry, notably from the Open Compute Project Foundation, but these are yet to bear fruit in the form of applicable products. This can make it difficult and frustrating for operators to implement DLC in their data centers, let alone in a colocation facility — for example, different DLC systems use different, often incompatible, coolant distribution units. Another challenge is the traditional division between IT and facilities teams. In a diverse and large-scale environment, DLC requires close collaboration between the two teams (including IT and facility equipment vendors) to resolve mechanical and material compatibility issues, which are both still common.

Despite the challenges, Uptime Institute maintains a positive view of DLC. As the demands on cooling systems grow with every generation of silicon, operators will find the advantages of DLC harder to ignore. We will explore various DLC options and their trade-offs in a subsequent Uptime Intelligence report (due to be published later in 2022).

Data center operators ponder the nuclear option

March 15, 2022/in Executive, Operations/by Andy Lawrence, Executive Director of Research, Uptime Institute, alawrence@uptimeinstitute.com

As major businesses feel a growing sense of urgency to dramatically cut carbon emissions, opinions are starting to shift in favor of nuclear power, which is not classed as clean, but is a near-zero carbon energy source. The digital infrastructure industry, a major global consumer of energy, has a role to play in rehabilitating nuclear, and in marrying it to intermittent renewable energy to provide firm zero-emission power.

There is considerable reluctance to use, or endorse, the use of nuclear power, largely stemming from a fear of meltdowns and concerns about nuclear waste. These worries are likely to be overcome by the need for dispatchable, low-carbon power generation that does not depend on local weather. From 2022, we expect some major data center operators, and industry influencers and leaders, to support nuclear power more actively and openly — even pressuring governments and utilities to invest in this option.

We see three areas where the data center industry may have an active role in supporting a significant shift toward nuclear power. First, some data center operators are likely to publicly endorse nuclear as a carbon neutral power source. The EU’s Carbon Neutral Data Centre Pact, an industry initiative, has publicly declared nuclear power acceptable in a zero-emission energy mix. US hyperscalers Amazon Web Services, Google and Microsoft all mention on their websites the role of nuclear power in providing low-carbon electricity.

Second, we expect more data center operators to buy nuclear power, where available, as part of the mix in their power purchase agreements (PPAs), as a carbon-friendly alternative to 100% renewable energy. Some operators, for example, have already started exploring a combination of renewable and nuclear electricity purchases to guarantee a facility is always supplied with low-carbon content. Historically, most corporate PPAs and green financing mechanisms have been limited to renewable power (and excluded nuclear) to satisfy environmental concerns.

The financial case for including nuclear power in a PPA contract is that there will be greater long-term price stability – compared with all-renewable PPAs (as well as energy certificates and carbon offsets), which can become increasingly oversubscribed, pushing up prices. But without investments in more new plants, nuclear power will gradually diminish in the US, Europe and Japan — even if other countries, such as China and India, continue to invest in the construction of new nuclear plants. Utilities will dispatch coal and gas power to fill the shortfall when there is insufficient renewable energy.

This leads on to the third point: next-generation reactors. Some significant private backers, and several governments, including the US, UK and France, are funding the development of new, modularized designs (known as SMRs, or small modular reactors). In a plan revised in 2021, the US Department of Energy (DOE) has committed $1.4B to the construction of a power plant using a cluster of SMRs from NuScale, a US startup. In the UK, the government awarded a total of £210M ($280M) to Rolls-Royce, matched by a private equity investment of £195M ($260M), to move the company’s SMR design toward completion. South Korea is close to fully certifying an SMR from local startup the Korea Atomic Energy Research Institute (KAERI). These are only some examples of active development projects in SMR development.

There are three key innovations behind SMRs:

Simplified reactor and plant design.
Extensive prefabrication.
Passive safety systems, which don’t rely on active controls for cooling but use convection, pressure or other forces. This eliminates the requirement for emergency backup power, as already agreed by nuclear safety regulators in the case of NuScale.

Not only are SMRs proposed for the grid, but some of the hyperscalers have reportedly expressed interest in funding new sites that can supply power directly. At 10–100 megawatts electrical output, some of the next-generation modular reactors will be small enough for large data center operators to fund through PPAs. The projected cost for SMR plants, once in volume production, is a few hundred million dollars, and each would take only three to five years to build (compared with the current timeline of seven to 10 years for conventional nuclear plants).

If they deliver, SMRs could potentially become attractive for large data center campuses (potentially connecting directly), and for municipalities with a large concentration of data centers and other large energy users. SMRs also tend to have fuel preloaded for over 10 years at full capacity, eliminating the need for costly, complex and disruptive refueling. There are, of course, many issues and objections, including the need for power redundancy, cost, security, safety and the difficulty of matching the SMR lifespans to the data center.

Opponents argue there is neither the time nor need for next-generation nuclear. But a renewable energy-only strategy is a bet on grid-scale energy storage technologies that do not exist, and on high-capacity power transmission links for which there are no plans. Worse still, a renewable energy-only strategy adds risk to power security due to the unpredictability of extreme weather events because of climate change.

Owing to its more dynamic build activity, and large energy requirement compared with other industries, the data center sector could play an outsized role in rehabilitating nuclear and steering the future of energy.

The full report Five data center predictions for 2022 is available here.

Flexibility drives cloud lock-in risk

March 1, 2022/in Executive, Operations/by Dr. Owen Rogers, Senior Research Director for Cloud Computing, Uptime Institute, orogers@uptimeinstitute.com

Vendor lock-in is regularly levied as a criticism of cloud services. But the reality of IT is, there has always been lock-in. Even before cloud, enterprises were locked into hardware, operating systems, database platforms, data centers and network providers. Practically, the lock-in challenge with cloud isn’t that moving from one location or platform is not possible — it’s that the cost of moving can negate the benefit of doing so.

Cloud’s highly flexible model is a driver of lock-in. At the core of the cloud model, there are still data centers, servers and operating systems. Cloud has been revolutionary because of its pricing model, not necessarily because of the core technology. Rather than the buyer being concerned with capacity planning, for example, on-demand consumption allows users to consume the infrastructure they need with no capex, no advanced notification and no prepayment – buyers pay only for what they use, in arrears. The cloud provider is responsible for capacity planning and absorbs the risk of under- or overprovisioning.

Lock-in is more of a threat when using cloud compared with non-cloud infrastructure because of this inherent flexibility. The buyer is free to consume what it needs when it needs it, but the cloud provider is free to make changes as well. The provider could increase prices spontaneously with only a few weeks’ or even days’ notice (although this doesn’t happen in practice today). This would not breach any contract, as the consumer would be made aware of the new price and would not be obliged to consume or pay if they do not think it is fair value.

Realistically, few organizations can just “switch off” their mission-critical workloads because prices have gone up. They are locked into a venue as they always have been, but with cloud they are more beholden to price increases. Because price changes can happen quickly, if organizations decide to migrate to a different provider, they typically have only a few days’ notice to act before the increases take effect — and there are egress fees when exiting an existing cloud service (which can be significant), as well as ingress fees when joining a new service.

The cloud model does expose the buyer to unexpected price increases. However, if cloud providers spontaneously reduce their prices, buyers receive instant cost savings. So far, cloud pricing overall has trended downward — but there are no guarantees this will always be the case. In 2018, Google increased the prices of its Maps API (a cloud service that allows its maps to be integrated into applications) by 1,400%. Developers didn’t have to use the Maps API if they didn’t agree with the price change — but for many, the API had already been integrated into their applications. When the change happened, they had two choices: switch off the app or pay the higher price. In this case, moving to a different maps provider would not be trivial. Developers would need to change the application code to communicate with a new mapping service.

With traditional enterprise-owned (on-premises) and colocation infrastructure, prices may go up — but these do not have such an immediate effect as an on-demand service. Server prices might increase, but this will only be seen at the next server refresh. Existing servers can be “sweated” for longer to delay the impact of price changes (by delaying a server refresh). Colocation costs might increase, but this would take effect at the end of the contract term (not considering fluctuations with consumption-based models). The colo buyer is given more time to decide if migrating out is worthwhile before feeling the effect of an increase, whereas a cloud buyer is effectively forced into an increase; any decisions can happen only after the fact.

While cloud pricing has and continues to trend downward, there are no guarantees for the future. A server in an on-premises or colo facility that has already been paid for is inherently more predictable cost-wise than the future purchase of cloud capacity. A fixed-term, fixed-price colo contract is explicitly more stable than on-demand cloud pricing. Cloud buyers should plan now for their options to move venues if sharp cost increases are anticipated, to avoid the excess costs an unplanned migration typically entails.

Data center operators give themselves a “Fail” for sustainability

February 22, 2022/in Executive, Operations/by Lenny Simon, Senior Research Associate, Uptime Institute

The global data center industry describes its own sustainability performance as average, compared with other industries — and most think efforts are not substantially reducing environmental impacts.

A recent Uptime Institute global survey of over 400 data center owners and operators asked whether the industry’s environmental commitments are effective in curbing data center energy use, water use and greenhouse gas emissions. Only 38% say yes, while just under half (45%) of respondents say the industry’s commitments deliver few or no meaningful reductions in environmental impacts (Figure 1).

Diagram: Industry rates itself average for sustainability but is critical of its environmental impact — **Figure 1: Industry rates itself average for sustainability but is critical of its environmental impact**

Data center operators in Europe are the least likely to take a positive view of progress. Only about a third (35%) say meaningful reductions in data center environmental impacts are occurring. This critical stance by Europeans is reflected elsewhere in Uptime Institute’s research and may be attributed to a greater concern for sustainability within the European Union compared with North America. In a previous survey, Europe-based respondents were the most likely to say that government regulations should play a greater role (i.e., more regulation is needed) in improving the overall sustainability of the data center sector. (See Bring on regulations for data center sustainability, say Europe and APAC.)

But awareness of sustainability issues, and the limits of their own efforts, is growing among data center owners and operators globally. Most survey respondents in the US and Canada are also critical of industry commitments to reduce environmental impacts, although less so than their counterparts in Europe. Excluding the don’t know responses, 55% of North American respondents say commitments do not deliver meaningful reductions, with slightly less than half (45%) taking an optimistic view.

One possible explanation for the difference may be that more North American respondents reject the notion of the question itself. In Uptime’s 2020 climate change survey, nearly a third of North American data center managers stated that they believe climate change is not caused by human activity or is not happening at all. (See Climate change skepticism in the data center.) Almost a quarter of our 2021 North American survey respondents do not know whether the industry’s sustainability commitments are having an impact.

Our 2021 climate change survey also explored how the data center industry views its own efforts in terms of sustainability when compared with other areas of business activity (for example, in-store commerce, manufacturing, logistics). Most respondents (48% globally) say the industry is average for sustainability performance. This view is consistent across Europe (48%), the US and Canada (45%), and the Asia Pacific region (not including China; 43%).

This view may surprise some, given the heavily publicized environmental programs of some hyperscale operators (Amazon, Google and Microsoft, for example). Data also shows the industry buys more renewable energy (relative to consumption) than other sectors. However, given the lack of a universal measurement for sustainability across different industrial sectors, it isn’t easy to accurately assess sustainability performance.

Overall, the survey data suggests that Europeans believe the industry should do more to reduce environmental impacts. The opposite seems true in the Asia Pacific region, where data center owners and operators mostly view the industry’s commitments as effective in reducing environmental impacts. Views in North America are mixed and probably changing — possibly reflecting the relatively early stages of regulatory scrutiny of the industry’s sustainability performance to date.

The full Uptime Institute Climate Change Survey 2021 report is available to Uptime Institute members here.

Concerns over cloud concentration risk grow

February 10, 2022/in Executive, Operations/by Rhonda Ascierto, Vice President, Research, Uptime Institute

Control over critical digital infrastructure is increasingly in the hands of a small number of major providers. While a public cloud provides a flexible, stable and distributed IT environment, there are growing concerns around its operational resiliency.

Following some recent high-profile cloud failures, and with regulators asking more questions, there is increasing anxiety that using a big cloud provider can be a single point of failure, not just technically but also from a business-risk perspective.

Many organizations and regulators take issue with the lack of transparency of cloud providers, and the lack of control (see Figure 1) that important clients have — some of which are part of the national critical infrastructure. Concentration risk, where key services are dependent on one or a few key suppliers, is a particular concern.

Diagram: More mission-critical workloads in public clouds but visibility issues persist — Figure 1. More mission-critical workloads in public clouds but visibility issues persist

However, because the range, scope of services, management tools and developer environments vary among major cloud providers, organizations are often forced to choose a single provider (at least, for each business function). Even in highly regulated and critical sectors, such as financial services, a multicloud strategy is often neither feasible, nor is it easy to change suppliers — whatever the reason.

In 2021, for example, two major US financial firms Bank of America and Morgan Stanley announced they would standardize on a primary public cloud provider (IBM and Microsoft Azure, respectively). Spreading workloads across multiple clouds that use different technologies, and retraining developers or hiring a range of specialists, had proved too complex and costly.

Big cloud providers say that running workloads just in their environment does not lead to an over-reliance. For example, diversifying within a single cloud can mitigate risk, such as deploying workloads using platform as a service (PaaS) and using an infrastructure as a service (IaaS) configuration for disaster recovery. Providers also point to the distributed nature of cloud computing, which, combined with good monitoring and automated recovery, makes it highly reliable.

Reliability and resiliency, however, are two different things. High reliability suggests there will be few outages and limited downtime, while high resilience means that a system is not only less likely to fail but it, and other systems that depend on it, can quickly recover when there is a failure. While in enterprise and colocation data centers, and in corporate IT, the designs can be scrutinized, single points of failure eliminated, and the processes for system failure rehearsed, in cloud services it is mostly (or partly) a black box. These processes are conducted by the cloud provider, behind the scenes and for the benefit of all their clients, and not to ensure the best outcomes for just a few.

Our research shows that cloud providers have high levels of reliability, but they are not immune to failure. Complex backup regimes and availability zones, supported by load and traffic management, improve the resiliency and responsiveness of cloud providers, but they also come with their own problems. When issues do occur, many customers are often affected immediately, and recovery can be complex. In 2020, Uptime Institute recorded 21 cloud / internet giant outages that had significant financial or other negative consequences (see Annual Outage Analysis 2021).

Mindful of these risks, US financial giant JPMorgan, for example, is among few in its sector taking a multicloud approach. JPMorgan managers have cited concerns over a lack of control with a single provider and, in the case of a major outage, the complexity and the time needed to migrate to another provider and back again.

Regulators are also concerned — especially in the financial services industry where new rules are forcing banks to conduct due diligence on cloud providers. In the UK, the Bank of England is introducing new rules to ensure better management oversight over large banks’ reliance on cloud. And the European Banking Authority mandates that a cloud (or other third-party) operator allows site inspections of data centers.

A newer proposed EU law has wider implications: the Digital Operational Resiliency Act (DORA) puts cloud providers under financial regulators’ purview for the first time. Expected to pass in 2022, cloud providers — among other suppliers — could face large fines if the loss of their services causes disruption in the financial services industry. European governments have also expressed political concerns over growing reliance on non-European providers.

In 2022, we expect these “concentration risk” concerns to rise up more managers’ agendas. In anticipation, some service providers plan to focus more on enabling multicloud configurations.

However, the concentration risk goes beyond cloud computing: problems at one or more big suppliers have been shown to cause technical issues for completely unrelated services. In 2021, for example, a technical problem at the content distribution network (CDN) provider Fastly led to global internet disruption; while an outage at the CDN provider Akamai took down access to cloud services from AWS and IBM (as well as online services for many banks and other companies). Each incident points to a broader issue: the concentration of control over core internet infrastructure services in relatively few major providers.

How will these concerns play out? Some large customers are demanding a better view of cloud suppliers’ infrastructure and a better understanding of potential vulnerabilities. As our research shows, more IT and data center managers would consider moving more of their mission-critical workloads into public clouds if visibility of the operational resiliency of the service improves.

While public cloud data centers may have adequate risk profiles for most mission-critical enterprise workloads already, details about the infrastructure and its risks will increasingly be inadequate for regulators or auditors. And legislation, such as the proposed DORA, with penalties for outages that go far beyond service level agreements, are likely to spur greater regulatory attention in more regions and across more mission-critical sectors.

The full Five Data Center Predictions for 2022 report is available here.

Bring on regulations for data center sustainability, say Europe and APAC

January 12, 2022/in Design, Executive/by Jacqueline Davis, Research Analyst, Uptime Institute, jdavis@uptimeinstitute.com

As the data center sector increases its focus on becoming more environmentally sustainable, regulators still have a part to play — the question is to what extent? In a recent Uptime Institute survey of nearly 400 data center operators and suppliers worldwide, a strong majority would favor regulators playing a greater role in improving the overall sustainability of data centers — except for respondents in North America.

Globally, more than three in five respondents favor greater reliance on statutory regulation. The strongest support (75% of respondents) is in Europe and APAC (Asia-Pacific, including China). However, in the US and Canada, fewer than half (41%) want more government involvement, with the majority of respondents saying the government plays an adequate role, or should play a lesser role, in sustainability regulation (See Figure 1).

Diagram: Appetites for sustainability laws differ by region — Figure 1. Appetites for sustainability laws differ by region

Our survey did not delve into attitudes toward governments’ role in this area, but there are a few possible explanations for North America being an outlier. Globally, there is often a technical knowledge gap between industry professionals and government policymakers. As North America is the largest mature data center market, this gap may be more pronounced, fueling a general distrust by the sector toward legislators’ ability to create effective, meaningful laws. Indeed, North American participants have a lower opinion of their regulators’ understanding of data center matters compared with the rest of the world: four of 10 respondents rate their regulators as “not at all informed or knowledgeable.”

There are, however, cases of non-US legislation lacking technical merit, such as Amsterdam’s annual power usage effectiveness (PUE) limit of 1.2 for new data center builds. Although low PUEs are important, this legislation lacks nuance and does not factor in capacity changes — PUE tends to escalate at low utilization levels (for example, below 20% of the facility’s rated capacity). The requirement for a low PUE could incentivize behavior that is counterproductive to the regulation’s intent, such as enterprises and service providers moving (and leased operators commercially attracting) power-hungry applications to achieve a certain PUE number to avoid penalties. Also, these rules do not consider the energy efficiency of the IT.

Even if we accept the PUE’s limitations, the metric will likely have low utility as a regulatory dial in the future. Once a feature of state-of-the-art data centers, strong PUEs are now straightforward to achieve. Also, major technical shifts, such as the use of direct liquid cooling, may render PUE inconsequential. (See Does the spread of direct liquid cooling make PUE less relevant?)

The issue is not simply one of over-regulation: there are instances of legislators setting the bar too low. The industry-led Climate Neutral Data Centre Pact is a case in point. Formed in the EU, this self-regulatory agreement has signatory data center operators working toward reaching net-zero emissions by 2030 — 20 years earlier than the goal set by the EU government (as part of its European Green Deal).

Why, then, are most operators (outside of North America) receptive to more legislation? Perhaps it is because regulation, in some cases, benefitted the industry’s sustainability profile and received global attention as a reliable framework. Although Amsterdam’s one-year ban on new data center construction in 2019 was largely met with disapproval from the sector, it resulted in policies (including the PUE mandate) offering a clearer path toward sustainable development.

The new regulations for Amsterdam include designated campuses for new facility construction within the municipal zones, along with standards for improving the efficient use of land and raw materials. There are also regulations relating to heat re-use and multistory designs, where possible — all of which force the sector to explore efficient, sustainable siting, design and operational choices.

Amsterdam’s temporary ban on new facilities provided a global case study for the impacts of extreme regulatory measures on the industry’s environmental footprint. Similar growth-control measures are planned in Frankfurt, Germany and Singapore. If they realize benefits similar to those experienced in Amsterdam, support for regulation may increase in these regions.

In the grand scheme of sustainability and local impact, regulatory upgrades may have minimal effect. A clear policy, however, builds business confidence by removing uncertainty — which is a boon for data center developments with an investment horizon beyond 10 years. As for North America’s overall resistance, it could simply be that the US is more averse to government regulation, in general, than elsewhere in the world.

By: Jacqueline Davis, Research Analyst, Uptime Institute and Douglas Donnellan, Research Associate, Uptime Institute

Direct liquid cooling bubbles to the surface

Data center operators ponder the nuclear option

Data center operators give themselves a “Fail” for sustainability

Bring on regulations for data center sustainability, say Europe and APAC

Explaining the Uptime Institute’s Tier Classification System (April 2021 Update)

The Making of a Good Method of Procedure

A Look at Data Center Cooling Technologies

Data Center Cooling: CRAC/CRAH redundancy, capacity, and selection metrics

Implementing Data Center Cooling Best Practices