Cloud generations drive down prices

Cloud providers need to deliver the newest capability to stay relevant. Few enterprises will accept working with outdated technology just because it’s consumable as a cloud service. However, existing cloud instances don’t migrate automatically. Similarly to on-premises server infrastructure, users need to refresh their cloud services regularly.

Typically, cloud operators prefer product continuity between generations, often creating nearly identical instances. A virtual instance belongs to a “family”, which dictates the physical server’s profile, such as a bias towards more computing power or faster memory. A “size” dictates the amount of memory, the number of virtual processors and disks, and other attributes assigned to the virtual instance. The launch of a new generation usually consists of a range of virtual instances with similar definitions of family and size as the previous generation. The major difference is the underlying server hardware’s technology.

A new generation doesn’t replace an older version. The older generation is still available to purchase. The user can migrate their workloads to the newer generation if they wish, but it is their responsibility to do so. By supporting older generations, the cloud provider is seen to be allowing the user to upgrade at their own pace. The provider doesn’t want to appear to be forcing the user into migrating applications that might not be compatible with the newer server platforms.

More generations create more complexity for users: greater choice, and more virtual instance generations to manage. More recently, cloud operators have started to offer different processor architectures in the same generation. Users can now pick between Intel, Advanced Micro Devices (AMD) or, in Amazon Web Services’ (AWS’s) case, servers using ARM-based processors. The variety of cloud processor architectures is likely to expand over the coming years.

Cloud operators provide price incentives so that users gravitate towards newer generations (and between server architectures). Figure 1 shows lines of best fit for the average cost per virtual central processing unit (vCPU, essentially a physical processor thread, as most processor cores run two threads simultaneously) of a range of AWS virtual instances over time. The data was obtained from AWS’s Price List API. For clarity, we only show pricing for AWS’s US-East-1 region, but the observations are similar across all regions. The analysis only considers x86 processors from AMD and Intel.
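
To make this concrete, the sketch below shows one way per-vCPU pricing can be pulled programmatically. It is a minimal illustration, not the methodology behind Figure 1: it assumes the boto3 “pricing” client, and the instance types and filters are illustrative choices.

```python
# Minimal sketch: pull on-demand Linux pricing for a few m-family instance
# types in us-east-1 via the AWS Price List API (boto3 "pricing" client),
# then derive a cost per vCPU. Instance types here are illustrative.
import json
import boto3

pricing = boto3.client("pricing", region_name="us-east-1")

def price_per_vcpu(instance_type):
    resp = pricing.get_products(
        ServiceCode="AmazonEC2",
        Filters=[
            {"Type": "TERM_MATCH", "Field": "instanceType", "Value": instance_type},
            {"Type": "TERM_MATCH", "Field": "location", "Value": "US East (N. Virginia)"},
            {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
            {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
            {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
            {"Type": "TERM_MATCH", "Field": "capacitystatus", "Value": "Used"},
        ],
        MaxResults=1,
    )
    product = json.loads(resp["PriceList"][0])
    vcpus = int(product["product"]["attributes"]["vcpu"])
    # Walk the nested OnDemand terms to find the hourly USD rate.
    on_demand = next(iter(product["terms"]["OnDemand"].values()))
    dimension = next(iter(on_demand["priceDimensions"].values()))
    hourly_usd = float(dimension["pricePerUnit"]["USD"])
    return hourly_usd / vcpus

for itype in ["m4.xlarge", "m5.xlarge", "m6i.xlarge"]:  # successive m generations
    print(itype, round(price_per_vcpu(itype), 4), "USD per vCPU-hour")
```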

The trend for most virtual instances is downward, with the average cost of the m family general-purpose virtual instances dropping 50% from its first generation to the present time. Each family has different configurations of memory, network and other attributes that aren’t accounted for in the price of an individual vCPU, which explains the price differences between families.

Figure 1 Average cost per AWS vCPU generation over time

One hidden factor is that compute power per vCPU also increases over generations, often incrementally. This is because more advanced manufacturing technologies tend to improve both clock speeds (frequency) and the “smartness” of processor cores in executing code faster. Users can expect greater processing speed with newer generations compared with older versions, while paying less. The cost efficiency gap is more substantial than simple pricing suggests.
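
A hypothetical worked example (the figures below are assumptions, not measured AWS data) shows how a modest price cut and a modest per-vCPU performance gain compound:

```python
# Hypothetical illustration (not measured AWS data): combine a price cut with
# a per-vCPU performance gain to estimate the real cost-efficiency change.
old_price_per_vcpu_hr = 0.050   # USD, assumed older generation
new_price_per_vcpu_hr = 0.040   # USD, assumed newer generation (20% cheaper)
perf_gain = 1.15                # assumed 15% more work per vCPU-hour

old_cost_per_unit_work = old_price_per_vcpu_hr
new_cost_per_unit_work = new_price_per_vcpu_hr / perf_gain

improvement = 1 - new_cost_per_unit_work / old_cost_per_unit_work
print(f"Cost per unit of work falls by {improvement:.0%}")  # ~30%, vs 20% on price alone
```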

AWS and other cloud operators are reaping the economic benefits of Moore’s law, which translate into a steep downward trajectory in the cost of performance, and are passing some of these savings on to customers. Giving customers lower prices works in AWS’s favor by incentivizing them to move to newer server platforms that are often more energy efficient and can carry more customer workloads, generating greater revenue and gross margin. However, how much of the cost savings AWS passes on to its customers versus adding to its gross margin remains hidden from view. In terms of demand, cloud customers prioritize cost over performance for most of their applications and, partly because of this price pressure, cloud virtual instances are coming down in price.

The trend of lower costs and higher clock speeds fails for one type of instance: graphics processing units (GPUs). GPU instances of the g and p families have higher prices per vCPU over time, while g instances also have a lower CPU clock speed. This is not directly comparable with the non-GPU instances because GPUs are typically not broken down into standard units of capacity, such as a vCPU. Instead, customers tend to have (and want) access to the full resources of a GPU instance for their accelerated applications. Here, the rapid growth in total performance and the high value of the customer applications that use them (for example, training deep neural networks or solving large, massively parallel computational problems) have allowed cloud operators (and their chip suppliers, chiefly NVIDIA) to raise prices. In other words, customers are willing to pay more for newer GPU instances if they deliver value by solving complex problems more quickly.

On average, virtual instances (at AWS at least) are coming down in price with every new generation, while clock speed is increasing. However, users need to migrate their workloads from older generations to newer ones to take advantage of lower costs and better performance. Cloud users must keep track of new virtual instances and plan how and when to migrate. The migration of workloads from older to newer generations is a business risk that requires a balanced approach. There may be unexpected issues of interoperability or downtime while the migration takes place — maintaining an ability to revert to the original configuration is key. Just as users plan server refreshes, they need to make virtual instance refreshes part of their ongoing maintenance.

Cloud providers will continue to automate, negotiate and innovate to drive costs lower across their entire operations, of which processors constitute a small but vital part. They will continue to offer new generations, families and sizes so buyers have access to the latest technology at a competitive price. The likelihood is that new generations will continue the trend of being cheaper than the last — by just enough to attract increasing numbers of applications to the cloud, while maintaining (or even improving) the operator’s future gross margins.

Industry consensus on sustainability looks fragile

Pressed by a sense of urgency among scientists and the wider public, and by governments and investors who must fulfill promises made at COP (Conference of the Parties) summits, major businesses are facing ever more stringent sustainability reporting requirements. Big energy users, such as data centers, are in the firing line.

Many of the reporting requirements, and proposed methods of reducing carbon emissions, are proving to be complicated and may appear contradictory and counterproductive. Many managers will be bewildered and frustrated.

To date, most of the commitments on climate change made by the digital infrastructure sector have been voluntary. This has allowed a certain laxity in the definitions, targets and terminology used — and in the level of scrutiny applied. But these are all set to be tested: reporting requirements will increasingly become mandatory, either by law or because of commercial pressures. Failure to publish data or meet targets will carry penalties or have other negative consequences.

The European Union (EU) is the flag bearer in what is likely to be a wave of legislation spreading around the world. Its much-strengthened Energy Efficiency Directive, part of its “Fit for 55” initiative (a legislative package to help meet the target of a 55% reduction in carbon emissions by 2030), is but one example. This legislation will require much more granular and open reporting, with even smaller data centers (around 300–400 kilowatts of total load) likely to face public audits for energy efficiency.

For operators in each part of the critical digital infrastructure sector, there may be some difficult decisions and trade-offs to make. Cloud companies, enterprises and colocation companies all want to halt climate change, but each has its own perspective and interests to protect.

Cloud suppliers and some of the bigger colocation providers, for example, are lobbying against some of the EU’s proposed reporting rules. Most of these organizations are already highly energy efficient and, by using matching and offsets, claim a very high degree of renewable use. Almost all also publish power usage effectiveness (PUE) data and some produce high-level carbon calculators for clients. Significant, step-change improvements would be complex and costly. Additionally, they argue, a bigger part of the sector’s energy waste takes place in smaller data centers, which may not have to fully report their energy use or carbon emissions — and may not be audited.
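
For reference, power usage effectiveness is simply the ratio of total facility energy to the energy delivered to the IT equipment, so lower values are better. The figures in the sketch below are assumed for illustration, not drawn from any operator’s reporting.

```python
# Simple PUE illustration with assumed annual figures (not reported data):
# PUE = total facility energy / IT equipment energy, so lower is better.
total_facility_kwh = 1_300_000   # assumed total, including cooling and power losses
it_equipment_kwh = 1_000_000     # assumed IT load

pue = total_facility_kwh / it_equipment_kwh
print(f"PUE = {pue:.2f}")        # 1.30; efficient hyperscale sites report figures near 1.1
```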

Colocation companies have a particular conundrum. Their energy consumption is high profile and huge — and clients now expect their colocation companies to use electricity from low-carbon or renewable sources. But this requires the purchase of ever more expensive RECs (renewable energy certificates), also known as Guarantees of Origin, and/or expensive, risky PPAs (power purchase agreements).

Purchasing carbon offsets or sourcing renewable power alone, however, is not likely to be enough in the years ahead. Regulators and investors will want to see annual improvements in energy efficiency or in reductions in energy use and carbon emissions.

For a colocation provider, achieving significant energy efficiency gains every year may not be possible. More than 70% of their energy use is tied to (and controlled by) their IT customers — many of whom are also pushing for more resiliency, which usually uses more energy. This can also apply to bare metal cloud customers.

In most data centers, the IT systems consume the most power and are operated wastefully. To encourage more energy efficiency at colocation sites, it makes sense for enterprises to take direct, Scope 2 responsibility for the carbon associated with the purchased electricity powering their systems. At present, most enterprises in a colocation site categorize the carbon associated with their IT as embedded Scope 3, which has weaker oversight and is not usually covered by expensive carbon offsets.

While many (including Uptime Institute) advocate that IT owners and operators take Scope 2 responsibility, it is clearly problematic. The owners and operators of the IT would have to be accountable for the carbon emissions resulting from the energy purchases made by their colocation or cloud companies — something many will not yet be ready to do. And, if they are responsible for the carbon emissions, they may have to also take on more responsibility for the expensive RECs and PPAs. This may be onerous – although the change might, at least, encourage IT owners to take on the considerable task of improving IT efficiency.

IT energy waste is a challenge for most in the digital critical infrastructure sector. After a decade of trying, the industry has yet to settle on metrics for measuring IT efficiency, although there are good measurements available for utilization and server efficiency (see Figure 1). In 2022, this challenge will rise up the agenda as stakeholders once again seek to define and apply the elusive metric of “useful work per watt” of IT. There won’t be any early resolution, though: these metrics are specific to each application, limiting their usefulness to regulators or overseers — and executives may fear the results will be alarmingly revealing.

Figure 1 Power consumption and PUE top sustainability metrics

The full report Five data center predictions for 2022 is available to Uptime Institute members here.

Why cloud is a kludge of complexity

The cloud model was designed to be simple and nimble. Simple and nimble doesn’t necessarily mean fit for purpose. Over the past decade, new layers of capability have been added to cloud to address its shortcomings. While this has created more options and greater functionality, it has also meant greater complexity in its management.

Today, it is possible to create a virtual server on a public cloud and deploy an IT application within minutes. This simplicity is a significant value driver of cloud uptake. But building applications that are resilient, performant and compliant requires far greater consideration by cloud users.

Public cloud providers make few guarantees regarding the performance and resiliency of their services. They state that users should design their applications across availability zones, which are networked groups of data centers within a region, so they are resilient to an outage in a single zone. The onus is on the cloud user to build an IT application that works across multiple availability zones. This can be a complex task, especially for existing applications that were not designed for multi-availability zone cloud architecture. In other words, availability zones were introduced to make cloud more resilient, but they are just an enabler — the user must architect their use.
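
As a rough sketch of what this architecting involves at the most basic level, the snippet below spreads identical instances across three zones explicitly. It is a simplified illustration under stated assumptions: the subnet IDs and AMI are placeholders, and a production design would also need load balancing, health checks and cross-zone data replication.

```python
# Sketch: nothing about a single EC2 launch is multi-AZ by default, so the
# deployment has to spread instances across zones itself. Subnet IDs and the
# AMI below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# One pre-created subnet per availability zone (hypothetical IDs).
subnets_by_az = {
    "us-east-1a": "subnet-aaaa1111",
    "us-east-1b": "subnet-bbbb2222",
    "us-east-1c": "subnet-cccc3333",
}

for az, subnet_id in subnets_by_az.items():
    ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder AMI
        InstanceType="m6i.large",
        MinCount=1,
        MaxCount=1,
        SubnetId=subnet_id,               # pins this instance to that zone
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "az", "Value": az}],
        }],
    )
```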

One of the original core tenets of cloud was the centralization of computing for convenience and outsourced management. The reality is many cloud buyers aren’t comfortable giving full control of all their workloads to a third party. Many are also bound by regulations requiring them to keep data in certain jurisdictions or under their own management. Private clouds were created to provide governance and control where public cloud failed, albeit with less scalability and flexibility than a public cloud.

Hybrid cloud makes public cloud more scalable by making it more distributed, which also means it is more flexible in terms of compliance and control. But this means cloud buyers must wrestle with designing and managing IT applications to work across different venues, where each venue has different capabilities and characteristics.

Public cloud providers now offer appliances or software that provide the same services found on public cloud but located in an on-premises environment. These appliances and software are designed to work “out of the box” with the public cloud, thereby allowing hybrid cloud to be implemented quicker than through a bespoke design. The hardware manufacturers, seeing the cloud providers enter their traditional territory of the on-premises data center, have responded with pay-as-you-go cloud servers that are billed according to usage.

Cloud management platforms provide a common interface to manage hybrid cloud, another consideration for cloud buyers. To manage applications effectively across venues, new application architectures are required. Software containers (an abstraction of code from operating systems) provide the basis of microservices, where applications are broken down into small, independent pieces of code that can scale independently — across venues if needed.

Applications that can scale effectively on the cloud are referred to as “cloud native.” Containers, microservices and cloud-native architectures were all introduced to make cloud scale effectively, but they all introduce new complexity. The Cloud Native Computing Foundation (CNCF) tracks over 1,200 projects, products and companies associated with cloud-native practices. The CNCF aims to reduce technical complexity in cloud-native practices, but these practices are all nascent and there is no clear standard approach to implementing cloud-native concepts.
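
As a minimal illustration of that independent scaling, the sketch below uses the Kubernetes Python client to scale one microservice without touching the others. The deployment and namespace names are hypothetical, and a real cluster would typically drive this through autoscaling policies rather than a manual call.

```python
# Sketch of microservice-style independent scaling with the Kubernetes Python
# client: only the "checkout" service is scaled up; other services are untouched.
from kubernetes import client, config

config.load_kube_config()            # or load_incluster_config() inside a cluster
apps = client.AppsV1Api()

apps.patch_namespaced_deployment_scale(
    name="checkout",                 # hypothetical microservice deployment
    namespace="shop",                # hypothetical namespace
    body={"spec": {"replicas": 8}},  # scale only this microservice
)
```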

To the uninitiated, cloud might appear a simple and nimble means to access capacity and cloud-enabling technologies (such as cloud software tools, libraries of application programming interfaces for integrations, etc.). This can still be the case for simple use cases, such as non-mission critical websites. However, users face complex and often onerous requirements for many of their workloads to run in a public cloud according to their business needs (such as resiliency and cost). The original cloud promised much, but the additional capabilities that have made cloud arguably more scalable and resilient have come at the cost of simplicity.

Today, there is no standard architecture for a particular application, no “best” approach or “right” combination of tools, venues, providers or services. Cloud users face a wall of options to consider. Amazon Web Services, the largest cloud provider, alone has over 200 products, with over five million variations. Most cloud deployments today are kludged — improvised or put together from an ill-assorted collection of parts. Different venues, different management interfaces and different frameworks, working together as best they can. Functional, but not integrated.

The big threat of complexity is that more things can go wrong. When they do, the cause can be challenging to trace. The cloud sector has exploded with new capabilities to address mission-critical requirements — but choosing and assembling these capabilities to satisfactorily support a mission-critical application is a work in progress.

Direct liquid cooling bubbles to the surface

Conditions will soon be ripe for widespread use of direct liquid cooling (DLC) — a collection of techniques that uses fluid to remove heat from IT electronics instead of air — and it may even become essential.

Currently, air cooling is still dominant, with DLC remaining a niche option — around 85% of enterprise IT operations do not use it at all, according to Uptime Institute Intelligence’s recent Direct Liquid Cooling Survey of nearly 200 enterprises. About 7% of respondents to our survey report some application of DLC, such as in mainframes or a few racks, which leaves only 9% that use DLC in more substantial ways.

Despite the current low uptake, Uptime Intelligence expects this view to shift markedly in favor of DLC in the coming years. In Uptime Intelligence’s survey, nearly two-thirds of enterprise IT users would consider it as a future option (see Figure 1).

Figure 1 DLC adoption and attitudes among enterprise operators

Why would operators consider such a major change beyond select use cases? A likely key factor is silicon power density: the heat flux and temperature limits of some next-generation server processors and accelerators coming to market in 2022/2023 will push air cooling to its limits. Processors with thermal ratings of 300 watts and above — some with steeply lowered temperature limits — are on the short-term roadmap. To keep up with future advances in silicon performance, while also meeting cost and sustainability goals, air cooling alone could become inadequate surprisingly soon.

Industry body ASHRAE has already issued a warning in the form of a new H1 class of high-density IT equipment in the most recent update to its data center thermal guidelines. (See New ASHRAE guidelines challenge efficiency drive.) Class H1 equipment, classified as such by the server maker, requires inlet temperatures of under 22°C (71.6°F), well under the recommended upper limit of 27°C (80.6°F) for general IT equipment. Many modern data center designs operate at temperatures above 22°C (71.6°F) to lower cooling energy and water consumption by minimizing the use of compressors and evaporation. Lowering temperatures (which may not be possible without a major overhaul of cooling systems) to accommodate class H1 equipment would frustrate these objectives.


Collectively, the data center industry considers the use of DLC to be not a question of if, but when and how much. Most enterprises in our survey expect at least 10% of their cabinets to have DLC within five years. A previous Uptime Institute survey supports this data, showing that for data centers with an IT load of 1 megawatt or greater, the industry consensus for mass adoption of DLC is around seven years. Only a few respondents think that air cooling will remain dominant beyond 10 years. (See Does the spread of direct liquid cooling make PUE less relevant?)

Today, there is no single DLC approach that ticks all boxes without compromises. Fortunately for data center operators, there has been rapid development in DLC products since the second half of the 2010s, and vendors now offer a wide range of choices with very different trade-offs. Our survey confirms that cold plates using water chemistries (for example, purified water, deionized water or a water-glycol mix) are the most prevalent category. However, there are still significant concerns around leakage among survey participants, which can curb an operator’s appetite for this type of DLC when it comes to large-scale installations.


Other options include:

  • Cold plates that circulate a dielectric fluid (either single-phase or two-phase) to reduce the risks associated with a leak.
  • Chassis-based immersion systems that are self-contained (single-phase).
  • Pool immersion tanks (also either single-phase or two-phase).

Nonetheless, major hurdles remain. There are some standardization efforts in the industry, notably from the Open Compute Project Foundation, but these are yet to bear fruit in the form of applicable products. This can make it difficult and frustrating for operators to implement DLC in their data centers, let alone in a colocation facility — for example, different DLC systems use different, often incompatible, coolant distribution units. Another challenge is the traditional division between IT and facilities teams. In a diverse and large-scale environment, DLC requires close collaboration between the two teams (including IT and facility equipment vendors) to resolve mechanical and material compatibility issues, which are both still common.

Despite the challenges, Uptime Institute maintains a positive view of DLC. As the demands on cooling systems grow with every generation of silicon, operators will find the advantages of DLC harder to ignore. We will explore various DLC options and their trade-offs in a subsequent Uptime Intelligence report (due to be published later in 2022).

Data center operators ponder the nuclear option

As major businesses feel a growing sense of urgency to dramatically cut carbon emissions, opinions are starting to shift in favor of nuclear power, which is not classed as renewable but is a near-zero carbon energy source. The digital infrastructure industry, a major global consumer of energy, has a role to play in rehabilitating nuclear, and in marrying it to intermittent renewable energy to provide firm zero-emission power.

There is considerable reluctance to use, or endorse, the use of nuclear power, largely stemming from a fear of meltdowns and concerns about nuclear waste. These worries are likely to be overcome by the need for dispatchable, low-carbon power generation that does not depend on local weather. From 2022, we expect some major data center operators, and industry influencers and leaders, to support nuclear power more actively and openly — even pressuring governments and utilities to invest in this option.

We see three areas where the data center industry may have an active role in supporting a significant shift toward nuclear power. First, some data center operators are likely to publicly endorse nuclear as a carbon neutral power source. The EU’s Climate Neutral Data Centre Pact, an industry initiative, has publicly declared nuclear power acceptable in a zero-emission energy mix. US hyperscalers Amazon Web Services, Google and Microsoft all mention on their websites the role of nuclear power in providing low-carbon electricity.

Second, we expect more data center operators to buy nuclear power, where available, as part of the mix in their power purchase agreements (PPAs), as a carbon-friendly alternative to 100% renewable energy. Some operators, for example, have already started exploring a combination of renewable and nuclear electricity purchases to guarantee a facility is always supplied with low-carbon content. Historically, most corporate PPAs and green financing mechanisms have been limited to renewable power (and excluded nuclear) to satisfy environmental concerns. 

The financial case for including nuclear power in a PPA contract is that there will be greater long-term price stability – compared with all-renewable PPAs (as well as energy certificates and carbon offsets), which can become increasingly oversubscribed, pushing up prices. But without investments in more new plants, nuclear power will gradually diminish in the US, Europe and Japan — even if other countries, such as China and India, continue to invest in the construction of new nuclear plants. Utilities will dispatch coal and gas power to fill the shortfall when there is insufficient renewable energy.

This leads on to the third point: next-generation reactors. Some significant private backers, and several governments, including the US, UK and France, are funding the development of new, modularized designs (known as SMRs, or small modular reactors). In a plan revised in 2021, the US Department of Energy (DOE) has committed $1.4B to the construction of a power plant using a cluster of SMRs from NuScale, a US startup. In the UK, the government awarded a total of £210M ($280M) to Rolls-Royce, matched by a private equity investment of £195M ($260M), to move the company’s SMR design toward completion. South Korea is close to fully certifying an SMR designed by the Korea Atomic Energy Research Institute (KAERI). These are only some of the active SMR development projects.

There are three key innovations behind SMRs:

  • Simplified reactor and plant design.
  • Extensive prefabrication.
  • Passive safety systems, which don’t rely on active controls for cooling but use convection, pressure or other forces. This eliminates the requirement for emergency backup power, as already agreed by nuclear safety regulators in the case of NuScale.

Not only are SMRs proposed for the grid, but some of the hyperscalers have reportedly expressed interest in funding new sites that can supply power directly. At 10–100 megawatts electrical output, some of the next-generation modular reactors will be small enough for large data center operators to fund through PPAs. The projected cost for SMR plants, once in volume production, is a few hundred million dollars, and each would take only three to five years to build (compared with the current timeline of seven to 10 years for conventional nuclear plants).

If they deliver, SMRs could potentially become attractive for large data center campuses (potentially connecting directly), and for municipalities with a large concentration of data centers and other large energy users. SMRs also tend to have fuel preloaded for over 10 years of operation at full capacity, eliminating the need for costly, complex and disruptive refueling. There are, of course, many issues and objections, including the need for power redundancy, cost, security, safety and the difficulty of matching SMR lifespans to those of data centers.

Opponents argue there is neither the time nor the need for next-generation nuclear. But a renewable energy-only strategy is a bet on grid-scale energy storage technologies that do not exist, and on high-capacity power transmission links for which there are no plans. Worse still, a renewable energy-only strategy adds risk to power security, given the unpredictability of extreme weather events caused by climate change.

Owing to its more dynamic build activity, and large energy requirement compared with other industries, the data center sector could play an outsized role in rehabilitating nuclear and steering the future of energy. 

The full report Five data center predictions for 2022 is available here.

Flexibility drives cloud lock-in risk

Vendor lock-in is regularly levied as a criticism of cloud services. But the reality is that IT has always involved lock-in. Even before cloud, enterprises were locked into hardware, operating systems, database platforms, data centers and network providers. Practically, the lock-in challenge with cloud isn’t that moving from one location or platform is impossible — it’s that the cost of moving can negate the benefit of doing so.

Cloud’s highly flexible model is a driver of lock-in. At the core of the cloud model, there are still data centers, servers and operating systems. Cloud has been revolutionary because of its pricing model, not necessarily because of the core technology. Rather than the buyer being concerned with capacity planning, for example, on-demand consumption allows users to consume the infrastructure they need with no capex, no advance notification and no prepayment; buyers pay only for what they use, in arrears. The cloud provider is responsible for capacity planning and absorbs the risk of under- or overprovisioning.

Lock-in is more of a threat when using cloud compared with non-cloud infrastructure because of this inherent flexibility. The buyer is free to consume what it needs when it needs it, but the cloud provider is free to make changes as well. The provider could increase prices spontaneously with only a few weeks’ or even days’ notice (although this doesn’t happen in practice today). This would not breach any contract, as the consumer would be made aware of the new price and would not be obliged to consume or pay if they do not think it is fair value.

Realistically, few organizations can just “switch off” their mission-critical workloads because prices have gone up. They are locked into a venue as they always have been, but with cloud they are more beholden to price increases. Because price changes can happen quickly, if organizations decide to migrate to a different provider, they typically have only a few days’ notice to act before the increases take effect — and there are egress fees when exiting an existing cloud service (which can be significant), as well as ingress fees when joining a new service.
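
A back-of-envelope example (the data volume and per-gigabyte rate below are assumptions, not any provider’s quoted price) shows why egress alone can weigh heavily on a migration decision:

```python
# Back-of-envelope illustration of why egress fees matter in a migration.
# The per-GB rate and data volume are assumptions, not a quoted price.
data_to_move_tb = 500
egress_usd_per_gb = 0.09          # assumed blended internet egress rate

egress_cost = data_to_move_tb * 1024 * egress_usd_per_gb
print(f"One-off egress bill: ${egress_cost:,.0f}")   # roughly $46,000
```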

The cloud model does expose the buyer to unexpected price increases. However, if cloud providers spontaneously reduce their prices, buyers receive instant cost savings. So far, cloud pricing overall has trended downward — but there are no guarantees this will always be the case. In 2018, Google increased the prices of its Maps API (a cloud service that allows its maps to be integrated into applications) by 1,400%. Developers didn’t have to use the Maps API if they didn’t agree with the price change — but for many, the API had already been integrated into their applications. When the change happened, they had two choices: switch off the app or pay the higher price. In this case, moving to a different maps provider would not be trivial. Developers would need to change the application code to communicate with a new mapping service.

With traditional enterprise-owned (on-premises) and colocation infrastructure, prices may go up — but these do not have such an immediate effect as an on-demand service. Server prices might increase, but this will only be seen at the next server refresh. Existing servers can be “sweated” for longer to delay the impact of price changes (by delaying a server refresh). Colocation costs might increase, but this would take effect at the end of the contract term (not considering fluctuations with consumption-based models). The colo buyer is given more time to decide if migrating out is worthwhile before feeling the effect of an increase, whereas a cloud buyer is effectively forced into an increase; any decisions can happen only after the fact.

While cloud pricing has trended, and continues to trend, downward, there are no guarantees for the future. A server in an on-premises or colo facility that has already been paid for is inherently more predictable cost-wise than the future purchase of cloud capacity. A fixed-term, fixed-price colo contract is explicitly more stable than on-demand cloud pricing. Cloud buyers should plan now for their options to move venues if sharp cost increases are anticipated, to avoid the excess costs an unplanned migration typically entails.