Hyperscale cloud providers have opened numerous operating regions in all corners of the world over the past decade. The three most prominent — Amazon Web Services (AWS), Google Cloud and Microsoft Azure — now offer 105 distinct regions (excluding government and edge locations) in which customers can locate their applications and data. Over the next year, this will grow to 130 regions. Other large cloud providers such as IBM, Oracle and Alibaba are also expanding globally, and this trend is likely to continue.
Each region requires enormous investments in data centers, IT, software, people, and networks. The opening of a region may both develop and disrupt the digital infrastructure of the countries involved. This Update, part of Uptime Intelligence’s series of publications explaining and examining the development of the cloud, shows how investment can be tracked — and, to a degree, predicted — by looking at the size of the markets involved.
Providers use the term “region” to describe a geographical area containing a collection of independent availability zones (AZs), which are logical representations of data center facilities. A country may have many regions, with each region typically having two or three AZs. The three leading hyperscalers’ estates include more than 300 hyperscale AZs and many more data centers (including both hyperscale-owned and hyperscale-leased facilities) in operation today. Developers use AZs to build resilient applications in a single region.
The primary reason providers offer a range of regions is latency. In general, no matter how good the network infrastructure is, the further the end user is from the cloud application, the greater the delay and the poorer the end-user experience (especially on latency-sensitive applications, such as interactive gaming). Another important driver is that some cloud buyers are required to keep applications and user data in data centers in a specific jurisdiction for compliance, regulatory or governance reasons.
Figure 1 shows how many of the three largest cloud providers have regions in each country.
Figure 1. Count of the three largest cloud providers (AWS, Google, Microsoft) operating a cloud region in a country: current and planned
The economics of a hyperscale public cloud depends on scale. Implementing a cloud region of multiple AZs (and, therefore, data centers) requires substantial investment, even if it relies on colocation sites. Cloud providers need to expect enough return to justify such an investment.
To achieve this return on investment, a geographical region must have the telecommunications infrastructure to support the entire cloud region. Practically too, the location must be able to support the data center itself, and be able to provide reliable power, telecommunications, security and skills.
Considering these requirements, cloud providers focus their expansion plans on economies with the largest gross domestic product (GDP). GDP measures economic activity but, more generally, is an indicator of the health of an economy. Typically, countries with a high GDP have broad and capable telecommunications infrastructure, high technology skills, robust legal and contractual frameworks, and the supporting infrastructure and supply chains required for data center implementation and operation. Furthermore, organizations in countries with higher GDPs have greater spending power and access to borrowing. In other words, they have the cash to spend on cloud applications to give the provider a high enough return on investment.
The 17 countries where all three hyperscalers currently operate cloud regions, or plan to, account for 56% of global GDP. The GDP of countries where at least one hyperscaler intends to operate is 87% of global GDP across just 40 countries (for comparison, the United Nations comprises 195 countries).
Figure 2 shows GDP against hyperscalers present in a country. (US and China’s GDPs are not shown because they are significant outliers.) The figure shows a trend: a greater GDP increases the likelihood of a hyperscaler presence in the region. Three countries buck this trend: Mexico, Turkey and Russia.
Figure 2. GDP against hyperscaler presence (China and US removed because of outlying GDP)
Observations
The US is due to grow to 24 hyperscaler cloud regions across 13 states (excluding US government regions), which is substantially more than any other country. This widespread presence is because Google, Microsoft and AWS are US companies with significant experience of operating in the country. The US is the single most influential and competitive market for digital services, with a shared official language, an abundance of available land, a business-friendly environment, and relatively few differences in regulatory requirements between local authorities.
Despite China’s vast GDP, only two of the big three US hyperscalers operate there today: AWS and Microsoft Azure. However, unlike all of their other cloud regions, the AWS and Microsoft China regions are outsourced to Chinese companies to comply with local data protection requirements: AWS outsources its regions to Sinnet and Ningxia Western Cloud Data Technology (NWCD), while Azure outsources its cloud to 21Vianet. Notably, China’s cloud regions are totally isolated from all non-China cloud regions in terms of connectivity, billing and governance. Google considered opening a China region in 2018 but abandoned the idea in 2020, reportedly in part because of a reluctance to operate through a partner. China has its own hyperscaler clouds: Alibaba Cloud, Huawei Cloud, Tencent Cloud and Baidu AI Cloud. These hyperscalers have implemented regions beyond China, across the wider Asia-Pacific, Europe, the US and the Middle East, primarily so that China-based organizations can reach other markets.
Mexico has a high GDP but only one cloud region, which Microsoft Azure is currently developing. Mexico’s proximity to the US and its good international telecommunications infrastructure mean applications targeting Mexican users do not necessarily suffer significant latency. The success of the Mexico region will depend on the eventual price of cloud resources there. If Mexico does not offer substantially lower costs and higher revenues than nearby US regions (for example, San Antonio in Texas, where Microsoft Azure operates), and if customers are not legally required to keep data local, Mexican users could be served from the US, despite minor latency effects and added network bandwidth costs. Uptime thinks other hyperscale cloud providers are unlikely to create new regions in Mexico in the next few years for this reason.
Today, no multinational hyperscaler cloud provider offers a Russia region. This is unlikely to change soon because of sanctions imposed by a raft of countries since Russia invaded Ukraine. Cloud providers have historically steered clear of Russia because of geopolitical tensions with the US and Europe; even before the Ukraine invasion, AWS had a policy of not working with the Russian government. Other hyperscalers, such as IBM, Oracle Cloud and China’s Alibaba, are also absent from Russia, and the wider Russia-led Commonwealth of Independent States has no hyperscaler presence. Yandex, Russia’s most-used search engine, is the country’s key cloud provider.
A total of 16 European countries have either a current or planned hyperscaler presence and represent 70% of the continent’s GDP. Although latency is a driver, data protection is a more significant factor. European countries tend to have greater data protection requirements than the rest of the world, which drives the need to keep data within a jurisdiction.
Turkey has a high GDP but no hyperscaler presence today. This is perhaps because the country can be served, with low latency, by nearby EU regions. Data governance concerns may also be a barrier to investment. However, Turkey may be a target for future cloud provider investment.
Today, the three hyperscalers are present in only one African country, South Africa — even though Egypt and Nigeria have larger GDPs. Many applications aimed at a North African audience may be suitably located in Southern Europe with minimal latency. However, Nigeria could be a potential target for a future cloud region: it has a high GDP, good connectivity through several submarine cables, and would appeal to the central and western African markets.
South American cloud regions were previously restricted to Brazil, but Google now has a Chilean region and Azure has one in the works. Argentina and Chile have relatively high GDPs, and it would not be surprising if AWS followed suit.
Conclusions
As discussed in Cloud scalability and resiliency from first principles, building applications across different cloud providers is challenging and costly. As a result, customers will seek cloud providers that operate in all the regions they want to reach. To meet this need, providers are following the money. Higher GDP generally equates to more resilient, stable economies, where companies are likely to invest and infrastructure is readily available. The current exception is Russia. High GDP countries yet to have a cloud presence include Turkey and Nigeria.
In practice, most organizations will be able to meet most of their international needs using hyperscaler cloud infrastructure. However, they need to carefully consider where they may want to host applications in the future. Their current provider may not support a target location, but migrating to a new provider that does is often not feasible. (A future Uptime Intelligence update will further explore specific gaps in cloud provider coverage.)
There is an alternative to building data centers or using colocation providers in regions without hyperscalers: organizations seeking new markets could consider where hyperscaler cloud providers may expand next. Rather than directly tracking market demand, software vendors may launch new services when a suitable region is brought online. The cost of duplicating an existing cloud application into a new region is small (especially compared with a new data center build or multi-cloud development). Sales and technical support can often be provided remotely without an expensive in-country presence.
Similarly, colocation providers can follow the money and consider the cloud providers’ expansion plans. A location such as Nigeria, with a high GDP, good telecommunications infrastructure and no hyperscaler presence, may be ideal for data center buildouts aimed at future hyperscaler requirements.
Colocation providers also have opportunities outside the GDP leaders. Many organizations still need local data centers for compliance or regulatory reasons, or for peace of mind, even if a hyperscaler data center is relatively close in terms of latency. In the Uptime Institute Data Center Capacity Trends Survey 2022, 44% of 65 respondents said they would use their own data center if their preferred public cloud provider was unavailable in a country, and 29% said they would use a colocation provider.
Cloud providers increasingly offer private cloud appliances that can be installed in a customer’s data center and connected to the public cloud for a hybrid deployment (e.g., AWS Outposts, VMware, Microsoft Azure Stack). Colocation providers should consider if partnerships with hyperscaler cloud providers can support hybrid cloud implementations outside the locations where hyperscalers operate.
Cloud providers have no limits in terms of country or market. If they see an opportunity to make money, they will take it. But they need to see a return on their investment. Such returns are more likely where demand is high (often where GDP is high) and infrastructure is sufficient.
Data center operators and IT tenants have traditionally adopted a binary view of cooling performance: it either meets service level commitments, or it does not. The relationship is also coldly transactional: as long as sufficient volumes of air of the right temperature and quality (in accordance with service-level agreements that typically follow ASHRAE’s guidance) reach the IT rack, the data center facility’s mission has been accomplished. What happens after that point with IT cooling, and how it affects IT hardware, is not facilities’ business.
This practice was born in an era when the power density of IT hardware was much lower, and when server processors still had a fixed performance envelope. Processors ran at a nominal frequency, defined at the time of manufacturing, under any load. This frequency was always guaranteed if sufficient cooling was available, whatever the workload.
Chipmakers guide IT system builders and customers to select the right components (heat sinks, fans) via processor thermal specifications. Every processor is assigned a power rating for the amount of heat its cooling system must be able to handle at the corresponding temperature limit. This is not theoretical maximum power but rather the maximum that can realistically be sustained (seconds or more) running real-world software. This maximum is called thermal design power (TDP).
The majority of software applications don’t stress the processor enough to get close to the TDP, even if they use 100% of the processor’s time — typically only high-performance computing code makes processors work that hard. With frequencies fixed, this results in power consumption (and thermal power) that is considerably below the TDP rating. Since the early 2000s, nominal processor speeds have tended to be limited by power rather than the maximum speed of circuitry, so for most applications there is untapped performance potential within the TDP envelope.
This gap is wider still in multicore processors when the software cannot benefit from all the cores present. This results in an even larger portion of the power budget not being used to increase application performance. The higher the core count, the bigger this gap can be unless the workload is highly multithreaded.
Processors looking for opportunities
Most server processors and accelerators that came to market in the past decade have mechanisms to address this (otherwise ever-growing) imbalance. Although implementation details differ between chipmakers (Intel, AMD, NVIDIA, IBM), they all dynamically deploy available power budget to maximize performance when and where it is needed most.
This balancing happens in two major ways: frequency scaling and management of power allocation to cores. When a modern server processor enters a phase of high utilization but remains under its thermal specification, it starts to increase supply voltage and then raises frequency in incremental steps. It continues stepping up until it reaches any one of the preset limits: frequency, current, power or temperature — whichever comes first.
If the workload is not evenly distributed across cores, or leaves some cores unused, the processor allocates unused power to highly utilized cores (if power was the limiting factor for their performance) to enable them to scale their frequencies even higher. The major beneficiary of independent core scaling is the vast repository of single- or lightly threaded software, but multithreaded applications also benefit where they struggle with Amdahl’s law (when the application is hindered by parts of the code that are not parallelized, so that overall performance depends largely on how fast a core can work through those segments).
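A toy model helps illustrate this behavior. The sketch below is illustrative only: the core count, the 200 W package budget, the base and maximum frequencies and the cubic power-versus-frequency approximation are all assumptions chosen for the example, not figures for any real processor, whose turbo algorithms are vendor-specific and far more complex.

```python
# Illustrative sketch only: a toy model of opportunistic frequency scaling.
# The 8 cores, 200 W package budget, 2.0 GHz base, 3.5 GHz ceiling and the
# cubic power-versus-frequency approximation are assumptions for this example.

BUDGET_W = 200.0                   # package power budget (assumed)
BASE_GHZ = 2.0                     # nominal (base) frequency
MAX_GHZ = 3.5                      # per-core frequency ceiling
CORES = 8
CORE_W_AT_BASE = BUDGET_W / CORES  # 25 W per core at base frequency

def boosted_frequency(active_cores: int) -> float:
    """Highest common frequency the active cores can sustain within the budget.

    Dynamic power is approximated as proportional to f^3 (frequency plus the
    accompanying voltage scaling); idle cores are assumed to draw nothing.
    """
    per_core_budget = BUDGET_W / active_cores
    scale = (per_core_budget / CORE_W_AT_BASE) ** (1 / 3)
    return min(BASE_GHZ * scale, MAX_GHZ)

for n in (1, 2, 4, CORES):
    print(f"{n} active core(s): ~{boosted_frequency(n):.2f} GHz each")
```

In this toy model, a single busy core runs at the frequency ceiling, while a fully loaded package falls back to the base frequency — the same pattern of behavior described above.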
This opportunistic behavior of modern processors means the quality of cooling, considering both supply of cold air and its distribution within the server, is not binary anymore. Considerably better cooling increases the performance envelope of the processor, a phenomenon that supercomputing vendors and users have been exploring for years. It also tends to improve overall efficiency because more work is done for the energy used.
Performance is best served cold
Better cooling unlocks performance and efficiency in two major ways:
The processor operates at lower temperatures (everything else being equal).
It can operate at higher thermal power levels.
The lowering of operational temperature through improved cooling brings many performance benefits such as enabling individual processor cores to run at elevated speeds for longer without hitting their temperature limit.
Another, likely sizeable, benefit lies in reducing static power in the silicon. Static power is power lost to leakage currents that perform no useful work, yet keep flowing through transistor gates even when they are in the “off” state. Static power was not an issue 25 years ago, but has become more difficult to suppress as transistor structures have become smaller, and their insulation properties correspondingly worse. High-performance logic designs, such as those in server processors, are particularly burdened by static power because they integrate a large number of fast-switching transistors.
Semiconductor technology engineers and chip designers have adopted new materials and sophisticated power-saving techniques to reduce leakage currents. However, the issue persists. Although chipmakers do not reveal the static power consumption of their products, it is likely to account for a considerable share of the processor’s power budget, probably a low double-digit percentage.
Various academic research papers have shown that static leakage currents depend on the temperature of silicon, but the exact profile of that correlation varies greatly across chip manufacturing technologies — such details remain hidden from the public eye.
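For intuition, the sketch below models leakage as growing exponentially with silicon temperature. The reference power, reference temperature and the assumed doubling interval are placeholders chosen purely for illustration; as noted above, real leakage profiles vary by manufacturing technology and are not published.

```python
# Illustrative sketch only: static (leakage) power modeled as growing
# exponentially with silicon temperature. The 25 W reference value at 60°C and
# the assumed "doubling every 25°C" behavior are placeholders; actual profiles
# vary by process technology and are not published.

LEAKAGE_W_AT_REF = 25.0   # assumed leakage power at the reference temperature
T_REF_C = 60.0            # reference silicon temperature
DOUBLING_C = 25.0         # assumed temperature rise that doubles leakage

def leakage_power(t_silicon_c: float) -> float:
    return LEAKAGE_W_AT_REF * 2 ** ((t_silicon_c - T_REF_C) / DOUBLING_C)

for t in (50, 60, 70, 85, 95):
    print(f"{t}°C silicon: ~{leakage_power(t):.1f} W of modeled leakage")
```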
Upgraded air coolers can measurably improve application performance when the processor is thermally limited during periods of high load, though such a speed-up tends to be in the low single digits. This can be achieved by lowering inlet air temperatures or, more commonly, by upgrading the processors’ cooling to lower thermal resistance: adding larger, CFD-optimized heat sinks built from alloys with better thermal conductivity (e.g., copper-based), using better thermal interface materials, and introducing more powerful fans to increase airflow. If combined with better facility air delivery and lower inlet temperatures, the speed-up is higher still.
No silver bullets, just liquid cooling
But the markedly lower thermal resistance, and consequently lower silicon temperature, that direct liquid cooling (DLC) brings make a more pronounced difference. Compared with air coolers at the same temperature, DLC (cold plate and immersion) can free up more power by reducing the temperature-dependent component of static leakage currents.
There is an even bigger performance potential in the better thermal properties of liquid cooling: prolonging the time that server processors can spend in controlled power excursions above their TDP level, without hitting critical temperature limits. This behavior, now common in server processors, is designed to offer bursts of extra performance, and can result in a short-term (tens of seconds) heat load that is substantially higher than the rated cooling requirement.
Typically, excursions reach 15% to 25% above the TDP, which did not previously pose a major challenge. However, in the latest generation of products from AMD and Intel, this results in up to 400 watts (W) and 420 W, respectively, of sustained thermal power per processor — up from less than 250 W about five years ago.
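A rough steady-state estimate shows why the cooling path matters at these power levels. The sketch below applies the simple relationship case temperature ≈ supply temperature + thermal resistance × power; the thermal resistance values and the 30°C supply temperature are assumed placeholders rather than measured figures for any specific heat sink or cold plate.

```python
# Illustrative sketch only: steady-state case temperature estimated as
# T_case ≈ T_supply + R_th * P. The thermal resistances and the 30°C supply
# temperature are assumed placeholders, not data for any specific product.

SUPPLY_TEMP_C = 30.0   # air inlet / coolant supply temperature (assumed)
THERMAL_RESISTANCE = {
    "high-performance air heat sink": 0.15,  # K/W (assumed)
    "DLC cold plate": 0.05,                  # K/W (assumed)
}

for power_w in (250, 400):
    for cooler, r_th in THERMAL_RESISTANCE.items():
        t_case = SUPPLY_TEMP_C + r_th * power_w
        print(f"{power_w} W with {cooler}: case temperature ~{t_case:.0f}°C")
```

Under these assumed values, the liquid-cooled case stays far below its temperature limit even at excursion power levels, which is exactly the headroom that allows longer or higher excursions.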
Such high-power levels are not exclusive to processor models aimed at high-performance computing applications: a growing number of mainstream processor models intended for cloud, hosting and enterprise workload consolidation can have these demanding thermal requirements. The favorable economics of higher performance servers (including their energy efficiency across an array of applications) generates demand for powerful processors.
Although these TDPs and power excursion levels are still manageable with air when using high-performance heat sinks (at the cost of rack density, because of very large heat sinks and high fan power), peak performance levels will start to slip out of reach for standard air cooling. Server processor development roadmaps call for even more powerful models in the coming years, probably reaching 600 W in thermal excursion power by the mid-2020s.
As processor power escalates and temperature limits grow more restrictive, even DLC temperature choices will be a growing trade-off dilemma as data center and IT infrastructure operators try to balance capital costs, cooling performance, energy efficiency and sustainability credentials. Inevitably, the relationship between data center cooling, server performance and overall IT efficiency will demand more attention.
European countries narrowly avoided an energy crisis in the past winter months, as a shortfall in fossil fuel supplies from Russia threatened to destabilize power grids across the region. This elevated level of risk to the normally robust European grid has not been seen for decades.
A combination of unseasonably mild weather, energy saving initiatives and alternative gas supplies averted a full-blown energy crisis, at least for now, although business and home consumers are paying a heavy price through high energy bills. The potential risk to the grid forced European data center operators to re-evaluate both their power arrangements and their relationship with the grid. Even without an energy security crisis, power systems elsewhere are becoming less reliable, including some of the major grid regions in the US.
Most mission-critical data centers are designed not to depend on the availability of an electrical utility, but to benefit from its lower power costs. On-site power generation — usually provided by diesel engine generators — is the most common option to back up electricity supplies, because it is under the facility operator’s direct control.
A mission-critical design objective of power autonomy, however, does not shield data center operators from problems that affect utility power systems. The reliability of the grid affects:
The cost of powering the data center.
How much diesel to buy and store.
Maintenance schedules and costs.
Cascading risks to facility operations.
South Africa provides a case study in how grid instability affects data center operations. The country has emerged as a regional data center hub over the past decade (largely due to its economic and infrastructure head-start over other major African countries), despite experiencing its own energy crisis over the past 16 years.
A total of 11 major subsea network cables land in South Africa, and its telecommunications infrastructure is the most developed on the continent. Although it cannot match the capacity of other global data center hubs, South Africa’s data center market is highly active — and is expanding (including recent investments by global colocation providers Digital Realty and Equinix). Cloud vendors already present in South Africa include Amazon Web Services (AWS), Microsoft Azure, Huawei and Oracle, with Google Cloud joining soon. These organizations must contend with a notoriously unreliable grid.
Factors contributing to grid instability
Most of South Africa’s power grid is operated by state-owned Eskom, the largest producer of electricity in Africa. Years of under-investment in generation and transmission infrastructure have forced Eskom to impose periods of load-shedding — planned rolling blackouts based on a rotating schedule — since 2007.
Recent years have seen substation breakdowns, cost overruns, widespread theft of coal and diesel, industrial sabotage, multiple corruption scandals and a $5 billion government bail-out. Meanwhile, energy prices nearly tripled in real terms between 2007 and 2020.
In 2022, the crisis deepened, with more power outages than in any of the previous 15 years — nearly 300 load-shedding events, three times the previous record set in 2020 (Figure 1). Customers are usually notified about upcoming disruption through the EskomSePush (ESP) app. Eskom’s load-shedding measures do not distinguish between commercial and residential properties.
Figure 1. Number of load-shedding instances initiated by Eskom from 2018 to 2022
Blackouts normally last for several hours, and there can be several a day. Eskom’s app recorded at least 3,212 hours of load-shedding across South Africa’s grid in 2022. For more than 83 hours, South Africa’s grid remained in “Stage 6”, which means the grid was in a power shortfall of at least 6,000 megawatts. A new record was set in late February 2023, when the grid entered “Stage 8” load-shedding for the first time. Eskom has, in the past, estimated that in “Stage 8”, an average South African could expect to be supplied with power for only 12 hours a day.
Reliance on diesel
In this environment, many businesses depend on diesel generators as a source of power — including data centers, hospitals, factories, water treatment facilities, shopping centers and bank branches. This increased demand for generator sets, spare parts and fuel has led to supply shortages.
Load-shedding in South Africa often affects road signs and traffic lights, which means fuel deliveries are usually late. In addition, trucks often have to queue for hours to load fuel from refineries. As a result, most local data center operators have two or three fuel supply contracts, and some are expanding their on-site storage tanks to provide fuel for several days (as opposed to the 12-24 hours typical in Europe and the US).
There is also the cost of fuel. The general rule in South Africa is that generating power on-site costs about seven to eight times more than buying utility power. With increased runtime hours on generators, this quickly becomes a substantial expense compared with utility energy bills.
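Rough arithmetic illustrates the scale of both problems. The sketch below assumes a hypothetical 2 MW load on generators, a generic diesel burn rate and round-number fuel and utility prices (not South African market data); it counts fuel only, so it understates the full seven-to-eight-times cost multiple cited above, which also includes maintenance, logistics and capital costs.

```python
# Illustrative sketch only: fuel volume and fuel-only generation cost for
# extended generator running. The load, burn rate and prices are assumed
# round numbers, not South African market data; the fuel-only ratio below
# understates the 7-8x figure cited, which includes maintenance and capital.

LOAD_MW = 2.0                  # facility load carried by generators (assumed)
FUEL_L_PER_KWH = 0.3           # assumed diesel burn per kWh generated
DIESEL_PRICE_PER_L = 1.20      # assumed $/liter
UTILITY_PRICE_PER_KWH = 0.10   # assumed $/kWh utility tariff

kwh_per_day = LOAD_MW * 1000 * 24
liters_per_day = kwh_per_day * FUEL_L_PER_KWH
generated_cost_per_kwh = FUEL_L_PER_KWH * DIESEL_PRICE_PER_L

print(f"Fuel for one day on generators: ~{liters_per_day:,.0f} liters")
print(f"Fuel-only generation cost: ~${generated_cost_per_kwh:.2f}/kWh, "
      f"about {generated_cost_per_kwh / UTILITY_PRICE_PER_KWH:.1f}x the "
      f"assumed ${UTILITY_PRICE_PER_KWH:.2f}/kWh utility tariff")
```

Even under these placeholder assumptions, several days of autonomy implies tens of thousands of liters of on-site storage, which is why expanding fuel tanks is a significant undertaking.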
Running generators for prolonged periods accelerates wear and heightens the risk of mechanical failure, with the result that the generator units need to be serviced more often. As data center operations staff spend more time monitoring and servicing generators and fuel systems, other maintenance tasks are often deferred, which creates additional risks elsewhere in the facility.
To minimize the risk of downtime, some operators are adapting their facilities to accommodate temporary external generator set connections. This enables them to provision additional power capacity in four to five hours. One city, Johannesburg, has access to gas turbines as an alternative to diesel generators, but these are not available in other cities.
Even if the data center remains operational through frequent power cuts, its connectivity providers, which also rely on generators, may not. Mobile network towers, equipped with UPS systems and batteries, are frequently offline because they do not get enough time to recharge between load-shedding periods if there are several occurrences a day. MTN, one of the country’s largest network operators, had to deploy 2,000 generators to keep its towers online and is thought to be using more than 400,000 liters of fuel a month.
Frequent outages on the grid create another problem: cable theft. In one instance, a data center operator’s utility power did not return following a scheduled load-shedding period: the copper cables leading to the facility had been stolen by thieves who used the load-shedding announcements to work out when it was safe to cut them.
Lessons for operators in Europe and beyond
Frequent grid failures increase the cost of digital services and alter the terms of service level agreements.
Grid issues may take years to emerge. The data center industry needs to be vigilant and respond to early warning signs.
Data center operators must work with utilities, regulators and industry associations to shape the development of grids that power their facilities.
Uptime’s view is that data center operators will find a way to meet demand — even in hostile environments.
Issues with the supply of Russian gas to Europe might be temporary, but there are other concerns for power grids around the world. In the US, an aging electricity transmission infrastructure — much of it built in the 1970s and 1980s — requires urgent modernization, which will cost billions of dollars. It is not clear who will foot this bill. Meanwhile, power outages across the US over the past six years have more than doubled compared with the previous six years, according to federal data.
While the scale of grid disruption seen in South Africa is extreme, it offers lessons on what happens when a grid destabilizes and how to mitigate those problems. An unstable grid will cause similar problems for data center operators around the world, ranging from ballooning power costs to a higher risk of equipment failure. This risk will creep into other parts of the facility infrastructure if operations staff do not have time to perform the necessary maintenance tasks. Generators may be the primary source of data center power, but they are best used as an insurance policy.
In the struggle to reduce carbon emissions and increase renewable energy, the US Inflation Reduction Act (IRA), passed in August 2022, is a landmark development. The misleadingly named Act, which is lauded by environmental experts and castigated by foreign leaders, is intended to rapidly accelerate the decarbonization of the world’s largest economy by introducing nearly $400 billion in federal funding over the next 10 years.
Reducing the carbon intensity of electricity production is a major focus of the act, and the US clean energy industry will greatly benefit from the tax credits encouraging renewable energy development. But the act also includes provisions intended to “re-shore” industries and create jobs in the US, and to ensure that US companies have greater control over the energy supply chain. Foreign leaders have raised objections over these protectionist provisions, which create (or aggravate) a political rift between the US and its allies and trading partners. In response to the IRA, the EU has redirected funds to buoy its low-carbon industries, threatened retaliatory measures and is considering the adoption of similar legislation.
While the politicians argue, stakeholders in the US have been scouring the IRA’s 274 pages for opportunities to capitalize on these lucrative incentives. Organizations that use a lot of electricity are also likely to benefit — including large data centers and their suppliers. Lawyers, accountants and investors working for organizations planning large-scale digital infrastructure investments will see opportunities too. Some of these will be substantial.
Summary of opportunities
For digital infrastructure companies, it may be possible to secure support in the following areas:
Renewable energy prices / power purchase agreements. Demand for renewable energy and the associated renewable energy credits is likely to be very high in the next decade, so prices are likely to rise. The tax incentives in the IRA will help to bring these prices down for renewable energy generators. By working with electricity providers and possibly co-investing, some data center operators will be able to secure lower energy prices.
Energy efficiency. Commercial building operators will find it easier to earn tax credits for reducing energy use. However, data centers that have already improved energy efficiency will struggle to reach the 25% reduction required to qualify. Operators may want to reduce energy use on the IT side, but this would not meet the eligibility requirements for this credit.
Equipment discounts / tax benefits. The act provides incentives for energy storage equipment (batteries or other technologies) which are necessary to operate a carbon-free grid. There are tax concessions for low-carbon energy generation, storage and microgrid equipment. Vendors may also qualify for tax benefits that can be sold.
Renewable energy generation. Most data centers generate little or no onsite renewable energy. In most cases, the power generated on site can support only a tiny fraction of the IT load. Even so, the many new incentives for equipment and batteries may make this more cost effective; and large operators may find it worthwhile to invest in generation at scale.
Detailed provisions
Any US operator considering significant investments in renewable generation and/or energy storage (including, for example, a UPS) is advised to study the act closely.
Uptime Institute Intelligence’s view is that the following IRA tax credits will apply to operators:
Investment tax credit (ITC), section 48. Of the available tax credits, the most significant for operators is the ITC. It encourages renewable, low-carbon energy use by reducing capital costs by up to 30% through 2032, and applies to capital spending on assets including solar, wind and geothermal equipment, electrochemical fuel cells, energy storage and microgrid controllers. The ITC will make investing in solar, wind, energy storage and fuel cells more attractive, and is likely to catalyze investment in, and the deployment of, low-carbon energy technologies.
Energy efficiency commercial buildings deduction, section 179D. This tax credit will encourage investment in energy efficiency and retrofits in commercial buildings. The incentive applies to projects that deliver at least a 25% energy efficiency improvement (reduced from the existing 50% threshold) for a building, compared with ASHRAE’s 90.1 standard reference building. The energy efficiency tax credit applies to projects in the following categories: interior lighting, heating, cooling, ventilation or the building envelope. To meet the 25% threshold, operators can retrofit several building systems. Qualified projects earn a tax credit of between 50 cents and $5 a square foot, depending on efficiency gains and labor requirements.
Production tax credit (PTC), section 45. This incentive does not directly apply to data center operators but will affect their business if they buy renewable energy. The tax credit rewards low-carbon energy producers by increasing their profit margin. The PTC applies only to energy producers that sell their energy to a third party rather than consuming it directly. Qualifying projects include wind, solar and hydropower facilities. The PTC scales with inflation and lasts for 10 years. In 2022, the maximum value of the PTC was 2.6 cents per kilowatt-hour (kWh) — for reference, the average US industrial energy price in September 2022 was 10 cents per kWh. If the credit is fully passed on to consumers, energy costs will be reduced by about 25%. (Note: eligible projects must choose between the PTC and the ITC.)
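The arithmetic behind the “about 25%” figure is straightforward, using the 2022 values quoted above:

```python
# Worked arithmetic behind the "about 25%" figure, using the 2022 values
# quoted in the text: a 2.6 cent/kWh credit against an average US industrial
# price of roughly 10 cents/kWh.

ptc_cents_per_kwh = 2.6
industrial_price_cents_per_kwh = 10.0

reduction = ptc_cents_per_kwh / industrial_price_cents_per_kwh
print(f"Maximum reduction if fully passed through: ~{reduction:.0%}")
```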
For the tax credits mentioned above, organizations must meet prevailing wage and apprenticeship requirements (initial guidance has been issued by the US Treasury Department and the Internal Revenue Service) to receive the maximum credit, unless the nameplate generation capacity of the project is less than 1 megawatt for the ITC and PTC.
The incentives listed above will be available until 2032, creating certainty for operators considering an investment in renewables, efficiency retrofits or the renewable energy industry. Additionally, these tax credits are transferable: they can be sold — for cash — to another company with tax liability, such as a bank.
Hyperscalers and large colocation providers are best positioned to capitalize on these tax credits: they are building new capacity quickly, possess the expertise and staffing capacity to navigate the legal requirements, and have ambitious net-zero targets.
However, data center operators of all sizes will pursue these incentives where there is a compelling business case. Owners / operators from Uptime Institute’s 2022 global data center survey said more renewable energy purchasing options would deliver the most significant gains in sustainability performance in the next three to five years.
The IRA may also lower the cost barriers for innovative data center designs and topologies. For example, IRA incentives will strengthen the business case for pairing a facility on a microgrid with renewable generation and long-duration energy storage (LDES). Emerging battery chemistries in development (including iron-air, liquid metal and nickel-zinc) offer discharge durations of 10 hours to 10 days and would benefit from large deployments to prove their viability.
LDES is essential for a reliable low-carbon grid. As the IRA speeds up the deployment of renewables, organizations will need multi-day energy storage to smooth out the high variability of intermittent generators such as solar and wind. Data center facilities may be ideal sites for LDES, even if they are not dedicated for data center use.
Additionally, low-carbon baseload generators such as nuclear, hydrogen and geothermal — all eligible for IRA tax credits — will be needed to replace reliable fossil fuel generators, such as gas turbines and coal power plants.
The incentives in the IRA, welcomed with enthusiasm by climate campaigners the world over, will strengthen the business case in the US for reducing energy consumption, deploying low-carbon energy and energy storage, and/or investing in the clean energy economy.
There is, however, a more problematic side: the rare earth materials and critical components the US will need to meet the objectives of the IRA may be hard to source in sufficient quantities, and allegations of protectionism may cause political rifts with other countries.
Data centers have become victims of their own success. Ever-larger data centers have mushroomed across the globe in line with an apparently insatiable demand for computing and storage capacity. The associated energy use is not only expensive (and generating massive carbon emissions) but is also putting pressure on the grid. Most data center developments tend to be concentrated in and around metropolitan areas — making their presence even more palpable and attracting scrutiny.
Despite major achievements in energy performance throughout the 2010s — as witnessed by Uptime data on industry-average PUE — this growth has created challenges for data center builders and operators. Delivering bulletproof and energy-efficient infrastructure at a competitive cost is already a difficult balancing act, even without having to engage with local government, regulators and the public at large on energy use, environmental impact and carbon footprint.
IT is conspicuously absent from this dialogue. Server and storage infrastructure account for the largest proportion of a data center’s power consumption and physical footprint. As such, they also offer the greatest potential for energy-efficiency gains and footprint compression. Often the issue is not wasted power but unused power: poor capacity-planning practices create demand for additional data center developments even where unused (but provisioned) capacity is available.
Nonetheless, despite growing costs and sustainability pressures, enterprise IT operators — as well as IT vendors — continue to show little interest in the topic.
This will be increasingly untenable in the years ahead. In the face of limited power availability in key data center markets, together with high power prices and mounting pressure to meet sustainability legislation, enterprise IT’s energy footprint will have to be addressed more seriously. This will involve efficiency-improvement measures aimed at using dramatically fewer server and storage systems for the same workload.
Uptime has identified four key areas where pressure on IT will continue to build — all of them pointing in the same direction:
Municipal (local) resistance to new large data centers.
The limited availability of grid power to support increasing data center capacity.
Increasing regulation governing sustainability and carbon reduction, and more stringent reporting requirements.
High energy costs.
Municipalities — and utility providers — need the pace to drop
Concerns over power and land availability have, since 2019, led to greater restrictions on the construction of new data centers (Table 1). This is likely to intensify. Interventions on the part of local government and utility providers typically involve more rigorous application processes, more stringent energy-efficiency requirements and, in some cases, the outright denial of new grid connections for major developments. These restrictions have resulted in costly project delays (and, in some cases, cancellations) for major cloud and colocation providers.
Frankfurt, a key financial hub and home to one of the world’s largest internet exchange ecosystems, set an example. Under a new citywide masterplan (announced in 2022), the city stipulates densified, multistory and energy-optimized data center developments — chiefly out of concerns for sprawling land use and changes to the city’s skyline.
The Dublin area (Ireland) and Loudoun County (Northern Virginia, US) are two stand-out examples (among others) of the grid being under strain and power utilities having temporarily paused or capped new connections because of current shortfalls in generation or transmission capacity. Resolving these limitations is likely to take several years. A number of data center developers in both Dublin and Loudoun County have responded to these challenges by seeking locations further afield.
Table 1 Restrictions on new data centers since 2019 — selected examples
New sustainability regulations
Following years of discussion with key stakeholders, authorities have begun introducing regulation governing performance improvements and sustainability reporting for data centers — a key example being the EC’s Energy Efficiency Directive recast (EED), which will subject data centers directly to regulation aimed at reducing both energy consumption and carbon emissions (see Critical regulation: the EU Energy Efficiency Directive recast).
This regulation creates new, detailed reporting requirements for data centers in the EU and will force operators to improve their energy efficiency and to make their energy performance metrics publicly available — meaning investors and customers will be better equipped to weigh business decisions on the basis of the organizations’ performance. The EED is expected to enter into force in early 2023. At the time of writing (December 2022), the EED could still be amended to include higher targets for efficiency gains (increasing from 9% to 14.5%) by 2030. The EC has already passed legislation mandating regulated organizations to report on climate-related risks, their potential financial impacts and environmental footprint data every year from 2025; this will affect swathes of data centers.
Similar initiatives are now appearing in the US, with the White House Office of Science and Technology Policy’s (OSTP’s) Climate and Energy Implications of Crypto-assets in the US report, published in September 2022. Complementary legislation is being drafted that addresses both crypto and conventional data centers and sets the stage for the introduction of regulation similar to the EED over the next three to five years (see First signs of federal data center reporting mandates appear in US).
Current and draft regulation is predominantly focused on the performance of data center facility infrastructure (power and cooling systems) in curbing the greenhouse gas (GHG) emissions associated with utility power consumption (Scope 2). While definitions and metrics remain vague (and are subject to ongoing development), it is clear that EC regulators intend to ultimately extend the scope of such regulation to include IT efficiency as well.
Expensive energy is here to stay
The current energy crises in the UK, Europe and elsewhere are masking some fundamental energy trends. Energy prices and, consequently, power prices were on an upward trajectory before Russia’s invasion of Ukraine. Wholesale forward prices for electricity were already shooting up — in both the European and US markets — in 2021.
Certain long-term trends also underpin the trajectory towards costlier power and create an environment conducive to volatility. Structural elements to long-term power-price inflation include:
The global economy’s continued dependence on (and continued increasing consumption of) oil and gas.
Underinvestment in fossil-fuel supply capacities while alternative low-carbon generation and energy storage capacities remain in development.
Gargantuan build-out of intermittent power generation capacity (overwhelmingly wind and solar) as opposed to firm low-carbon generation.
Steady growth in power demand arising from economic growth and electrification in transport and industry.
More specifically, baseload power is becoming more expensive because of the economic displacement effect of intermittent renewable energy. Regardless of how much wind and solar (or even hydro) is connected to the grid, reliability and availability considerations mean the grid has to be fully supported by dispatchable generation such as nuclear, coal and, increasingly, gas.
However, customer preference for renewable energy (and its low operational costs) means fleets of dispatchable power plants operate at reduced capacity, with an increasing number on standby. Grid operators — and, ultimately, power consumers — still need to pay for the capital costs and upkeep of this redundant capacity, to guarantee grid security.
IT power consumption will need to be curbed
High energy prices, carbon reporting, grid capacity shortfalls and efficiency issues have been, almost exclusively, a matter of concern for facility operators. But facility operators have now passed the point of diminishing returns, with greater intervention delivering fewer and fewer benefits. In contrast, every watt saved by IT reduces pressures elsewhere. Reporting requirements will, sooner or later, shed light on the vast potential for greater energy efficiency (or, to take a harsher view, expose the full extent of wasted energy) currently hidden in IT infrastructure.
For these reasons, other stakeholders in the data center industry are likely to call upon IT infrastructure buyers and vendors to engage more deeply in these conversations, and to commit to major initiatives. These demands will be completely justified: currently, IT has considerable scope for delivering improved power management and energy efficiency, where required.
Architecting IT infrastructure for energy efficiency (through better hardware configuration choices, dynamic workload consolidation practices and the use of power-management features, including energy-saving states and power throttling / capping) will deliver major gains. Server utilization, and the inherent efficiency of server hardware, are two key dials that could bring manifold improvements in energy performance compared with typical enterprise IT.
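A simple model shows why those two dials matter. The sketch below uses a linear server power model (idle draw plus a component proportional to utilization); the 500 W peak figure, the 40% idle fraction and the utilization scenarios are assumptions for illustration, not measurements of any particular fleet.

```python
# Illustrative sketch only: why consolidation saves energy, using a linear
# server power model (idle draw plus a component proportional to utilization).
# The 500 W peak, 40% idle fraction, workload size and server counts are
# assumptions for illustration, not measurements of any real fleet.

PEAK_W = 500.0
IDLE_FRACTION = 0.4   # assumed idle draw as a share of peak power

def server_power_w(utilization: float) -> float:
    return PEAK_W * (IDLE_FRACTION + (1 - IDLE_FRACTION) * utilization)

workload_units = 20.0  # total work, equivalent to 20 fully busy servers

for servers in (100, 40, 25):
    utilization = workload_units / servers
    fleet_kw = servers * server_power_w(utilization) / 1000
    print(f"{servers:>3} servers at {utilization:.0%} utilization: "
          f"~{fleet_kw:.1f} kW for the same work")
```

Under these assumptions, consolidating the same workload onto a quarter of the servers roughly halves the energy drawn, because idle power no longer dominates.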
These efficiency gains are not just theoretical: web technology and cloud services operators exploit them wherever they can. There is no reason why other organizations cannot adopt some of these practices and move closer to the performance metrics achievable. In an era of ever-more expensive (and scarce) power resources, together with mounting regulatory pressure, it will be increasingly difficult for IT C-level managers to deny calls to engage in the battle for better energy efficiency.
The full report, Five data center predictions for 2023, is available here.
See our Five Data Center Predictions for 2023 webinar here.
Daniel Bizo
Douglas Donnellan
The past decade has seen numerous reports of so-called cloud “repatriations” — the migration of applications back to on-premises venues following negative experiences with, or unsuccessful migrations to, the public cloud.
A recent Uptime Update (High costs drive cloud repatriation, but impact is overstated) examined why these migrations might occur. The Update revealed that unexpected costs were the primary reason for cloud repatriation, with the cost of data storage being a significant factor in driving expenditure.
Software vendor 37signals recently made headlines after moving its project management platform Basecamp and email service HEY from Amazon Web Services (AWS) and Google Cloud to a colocation facility.
The company has published data on its monthly AWS bills for HEY (Figure 1). The blue line in Figure 1 shows the company’s monthly AWS expenditure. This Update examines this data to understand what lessons can be learned from 37signals’ experience.
Figure 1. 37signals’ monthly AWS spend to support HEY in 2022, with capacity profiles
37signals’ AWS spend — observations
Based on the spend charts included in 37signals’ blog (simplified in Figure 1), some observations stand out:
The applications that are part of HEY scale proportionally. When database costs increase, for example, the cost of other services increases similarly. This proportionality suggests that applications (and the total resources used across various services) have been architected to scale upwards and downwards, as necessary. As HEY’s costs scale proportionally, it is reasonable to assume that costs are proportional to resources consumed.
Costs (and therefore resource requirements) are relatively constant over the year — there are no dramatic increases or decreases from month to month.
Database and search are substantial components of 37signals’ bills. The company’s database is not expanding, however, suggesting that the company is effective in preventing sprawl. 37signals’ data does not appear to have “gravity” — “gravity” here meaning the greater the amount of data stored in a system the more data (and, very often, software applications) it will attract over time.
While 37signals’ applications are architected to scale upwards and downwards as necessary, these applications rarely need to scale rapidly to address unexpected demand. This consistency allows 37signals to purchase servers that are likely to be utilized effectively over their life cycle without performance being impacted due to low capacity.
This high utilization level supports the company’s premise that — at least for its own specific use cases — on-premises infrastructure may be cheaper than public cloud.
Return on server investment
As with any capital investment, a server is expected to provide a return — either through increased revenue, or higher productivity. If a server has been purchased but is sitting unused on a data center floor, no value is being obtained, and CAPEX is not being recovered while that asset depreciates.
At the same time, there is a downside to using every server at its maximum capacity. If asset utilization is too high, there is nowhere for applications to scale up if needed. The lack of a capacity buffer could result in application downtime, frequent performance issues, and even lost revenue or productivity.
Suppose 37signals decided to buy all server hardware one year in advance, predicted its peak usage over the year precisely, and purchased enough IT to deliver that peak (shown in orange on Figure 1). Under this ideal scenario, the company would achieve a 98% utilization of its assets over that period (in a financial, not computing or data-storage sense) — that is, 98% of its investment would be used over the year for a value-adding activity.
It is unlikely that the company could make such a perfect prediction. Overestimating capacity requirements would result in lower utilization and, accordingly, more waste. Underestimating capacity requirements would result in performance issues. A more sensible approach would be to purchase servers as soon as they are required (shown in green on Figure 1). This strategy would achieve 92% utilization. In practice, however, the company would have to keep more servers idle to provide immediate capacity, decreasing utilization further.
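The sketch below shows how a financial utilization figure of this kind can be computed from a demand profile and a purchasing strategy. The monthly demand numbers are hypothetical (they are not 37signals’ data); with a suitably flat profile, buying the predicted peak up front yields a utilization close to the 98% quoted above.

```python
# Illustrative sketch only: computing a "financial utilization" figure. The
# monthly demand profile is hypothetical (it is not 37signals' data);
# utilization is the share of purchased server-months that carried demand.

monthly_demand = [100, 99, 98, 98, 97, 98, 99, 97, 98, 98, 97, 99]  # assumed

def financial_utilization(purchased_per_month):
    used = sum(min(d, p) for d, p in zip(monthly_demand, purchased_per_month))
    return used / sum(purchased_per_month)

# Buying enough capacity for the (perfectly predicted) annual peak, up front:
buy_the_peak = [max(monthly_demand)] * len(monthly_demand)
print(f"Utilization when buying the peak up front: "
      f"{financial_utilization(buy_the_peak):.0%}")

# The same function can score any other purchasing profile, such as the
# stepped buy-as-required line shown in green on Figure 1.
```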
Cloud providers can never achieve such a high level of utilization (although non-guaranteed “spot” purchases can help). Their entire proposition relies on being able to deliver capacity when needed. As a result, cloud services must have servers available when required — and lots of them.
Why utilization matters
Table 1 makes simple assumptions that demonstrate the challenge a cloud provider faces in provisioning excess capacity.
Table 1 Demonstration of how utilization affects server economics
These calculations show that this on-premises implementation costs $10,000 in total, with the cloud provider’s total costs being $16,000. Cloud buyers rent units of resources, however, with the price paid covering operating costs (such as power), the resources being used, and the depreciating value (and costs) of servers held in reserve. A cloud buyer would pay a minimum of $1,777 per unit, compared with a unit cost of $1,111 in an on-premises venue. The exact figures are not directly relevant: what matters is that the input cost using public cloud is 60% more per unit — purely because of utilization.
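The sketch below is one reconstruction of that arithmetic. The assumptions (a $1,000 server, nine units of demand, ten servers held on-premises versus sixteen held by the cloud provider) are chosen only because they reproduce the quoted figures to within rounding; the actual assumptions behind Table 1 may differ.

```python
# A minimal reconstruction of the arithmetic above. The assumptions (a $1,000
# server, nine units of demand, ten servers on-premises versus sixteen held by
# the cloud provider) are chosen only because they reproduce the quoted
# figures to within rounding; the actual Table 1 assumptions may differ.

SERVER_COST = 1_000
UNITS_OF_DEMAND = 9

on_prem_servers = 10   # modest buffer above demand
cloud_servers = 16     # larger buffer, held ready for any customer

on_prem_unit_cost = on_prem_servers * SERVER_COST / UNITS_OF_DEMAND
cloud_unit_cost = cloud_servers * SERVER_COST / UNITS_OF_DEMAND

print(f"On-premises: ~${on_prem_unit_cost:,.0f} per unit "
      f"({UNITS_OF_DEMAND / on_prem_servers:.0%} utilization)")
print(f"Cloud:       ~${cloud_unit_cost:,.0f} per unit "
      f"({UNITS_OF_DEMAND / cloud_servers:.0%} utilization)")
print(f"Cloud input-cost premium: {cloud_unit_cost / on_prem_unit_cost - 1:.0%}")
```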
Of course, this calculation is a highly simplified explanation of a complex situation. But, in summary, the cloud provider is responsible for making sure capacity is readily available (whether this be servers, network equipment, data centers, or storage arrays) while ensuring sufficient utilization such that costs remain low. In an on-premises data center this balancing act is in the hands of the organization. If enterprise capacity requirements are stable or slow-growing, it can be easier to balance performance against cost.
Sustaining utilization
It is likely that 37signals has done its calculations and is confident that migration is the right move. Success in migration relies on several assumptions. Organizations considering migrating from the public cloud back to on-premises infrastructure are best placed to make a cost-saving when:
There are unlikely to be sudden drops in resource requirements, such that on-premises servers are sitting idle and depreciating without adding value.
Unexpected spikes in resource requirements (that would mean the company could not otherwise meet demand, and the user experience and performance would be impacted) are unlikely. An exception here would be if a decline in user experience and performance did not impact business value — for example, if capacity issues meant employees were unable to access their CEO’s blog simultaneously.
Supply chains can deliver servers (and data center space) quickly in line with demand without the overheads involved in holding many additional servers (i.e., depreciating assets) in stock.
Skills are available to manage those aspects of the infrastructure for which the cloud provider was previously responsible (e.g., MySQL, capacity planning). These factors have not been considered in this update.
The risk is that 37signals (or any other company moving away from the public cloud) might not be confident of these criteria being met in the longer term. Were the situation to change unexpectedly, the cost profile of on-premises versus public cloud could be substantially altered.
Data shows the cloud goes where the money is
By Dr. Owen Rogers, Senior Research Director for Cloud Computing, Uptime Institute (orogers@uptimeinstitute.com)
The 17 countries where all three hyperscalers currently operate cloud regions, or plan to, account for 56% of global GDP. The GDP of the 40 countries where at least one hyperscaler intends to operate amounts to 87% of global GDP (for comparison, the United Nations recognizes 195 countries).
Figure 2 shows GDP against the number of hyperscalers present in a country. (The US and China are not shown because their GDPs are significant outliers.) The figure shows a trend: the greater a country’s GDP, the more likely it is to have a hyperscaler presence. Three countries buck this trend: Mexico, Turkey and Russia.
Observations
Conclusions
As discussed in Cloud scalability and resiliency from first principles, building applications across different cloud providers is challenging and costly. As a result, customers will seek cloud providers that operate in all the regions they want to reach. To meet this need, providers are following the money. Higher GDP generally equates to more resilient, stable economies, where companies are likely to invest and infrastructure is readily available. The current exception is Russia. High GDP countries yet to have a cloud presence include Turkey and Nigeria.
In practice, most organizations will be able to meet most of their international needs using hyperscaler cloud infrastructure. However, they need to carefully consider where they may want to host applications in the future. Their current provider may not support a target location, but migrating to a new provider that does is often not feasible. (A future Uptime Intelligence update will further explore specific gaps in cloud provider coverage.)
There is an alternative to building data centers or using colocation providers in regions without hyperscalers: organizations seeking new markets could consider where hyperscaler cloud providers may expand next. Rather than tracking market demand directly, software vendors can time the launch of new services to when a suitable region comes online. The cost of duplicating an existing cloud application into a new region is small (especially compared with a new data center build or multi-cloud development). Sales and technical support can often be provided remotely without an expensive in-country presence.
Similarly, colocation providers can follow the money and consider the cloud providers’ expansion plans. A location such as Nigeria, with high GDP, no hyperscaler presence and good telecommunications infrastructure, may be ideal for data center buildouts aimed at future hyperscaler requirements.
Colocation providers also have opportunities outside the GDP leaders. Many organizations still need local data centers for compliance or regulatory reasons, or for peace of mind, even if a hyperscaler data center is relatively close in terms of latency. In the Uptime Institute Data Center Capacity Trends Survey 2022, 44% of 65 respondents said they would use their own data center if their preferred public cloud provider was unavailable in a country, and 29% said they would use a colocation provider.
Cloud providers increasingly offer private cloud appliances that can be installed in a customer’s data center and connected to the public cloud for a hybrid deployment (e.g., AWS Outposts, VMware, Microsoft Azure Stack). Colocation providers should consider if partnerships with hyperscaler cloud providers can support hybrid cloud implementations outside the locations where hyperscalers operate.
Cloud providers have no limits in terms of country or market. If they see an opportunity to make money, they will take it. But they need to see a return on their investment. Such returns are more likely where demand is high (often where GDP is high) and infrastructure is sufficient.
Cooling to play a more active role in IT performance and efficiency
By Daniel Bizo, Research Director, Uptime Institute Intelligence (dbizo@uptimeinstitute.com)
Data center operators and IT tenants have traditionally adopted a binary view of cooling performance: it either meets service level commitments, or it does not. The relationship is also coldly transactional: as long as sufficient volumes of air of the right temperature and quality (in accordance with service-level agreements that typically follow ASHRAE’s guidance) reach the IT rack, the data center facility’s mission has been accomplished. What happens after that point with IT cooling, and how it affects IT hardware, is not facilities’ business.
This practice was born in an era when the power density of IT hardware was much lower, and when server processors still had a fixed performance envelope. Processors ran at a nominal frequency defined at the time of manufacturing. This frequency was guaranteed under any load, as long as sufficient cooling was available.
Chipmakers guide IT system builders and customers to select the right components (heat sinks, fans) via processor thermal specifications. Every processor is assigned a power rating for the amount of heat its cooling system must be able to handle at the corresponding temperature limit. This is not the theoretical maximum power, but rather the maximum that can realistically be sustained (for seconds or more) when running real-world software. This maximum is called thermal design power (TDP).
The majority of software applications don’t stress the processor enough to get close to the TDP, even if they use 100% of the processor’s time — typically only high-performance computing code makes processors work that hard. With frequencies fixed, this results in power consumption (and thermal power) that is considerably below the TDP rating. Since the early 2000s, nominal processor speeds have tended to be limited by power rather than the maximum speed of circuitry, so for most applications there is untapped performance potential within the TDP envelope.
This gap is wider still in multicore processors when the software cannot benefit from all the cores present. This results in an even larger portion of the power budget not being used to increase application performance. The higher the core count, the bigger this gap can be unless the workload is highly multithreaded.
Processors looking for opportunities
Most server processors and accelerators that came to market in the past decade have mechanisms to address this (otherwise ever-growing) imbalance. Although implementation details differ between chipmakers (Intel, AMD, NVIDIA, IBM), they all dynamically deploy available power budget to maximize performance when and where it is needed most.
This balancing happens in two major ways: frequency scaling and management of power allocation to cores. When a modern server processor enters a phase of high utilization but remains under its thermal specification, it increases supply voltage and frequency in incremental steps. It continues stepping up until it reaches any one of the preset limits: frequency, current, power or temperature, whichever comes first.
If the workload is not evenly distributed across cores, or leaves some cores unused, the processor allocates unused power to highly utilized cores (if power was the limiting factor for their performance) to enable them to scale their frequencies even higher. The major beneficiary of independent core scaling is the vast repository of single- or lightly threaded software, but multithreaded applications also benefit where they struggle with Amdahl’s law (when the application is hindered by parts of the code that are not parallelized, so that overall performance depends largely on how fast a core can work through those segments).
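The following sketch illustrates the general idea of sharing a fixed package power budget across cores. The wattages, frequencies and linear power model are invented for illustration and do not represent any chipmaker’s actual algorithm.

# Toy model of opportunistic frequency scaling: idle cores release power
# budget that busy cores can spend on higher clocks. All figures are assumed.
def sustained_frequency_ghz(active_cores, total_cores=32, power_budget_w=200.0,
                            idle_core_w=1.0, watts_per_ghz=4.0,
                            base_ghz=2.0, max_boost_ghz=3.8):
    budget_left = power_budget_w - (total_cores - active_cores) * idle_core_w
    per_core_w = budget_left / max(active_cores, 1)
    freq = min(max_boost_ghz, per_core_w / watts_per_ghz)   # power- or frequency-limited
    return max(base_ghz, freq)                              # never below the nominal base

# Fewer busy cores -> more budget per core -> higher sustained clocks.
for busy in (32, 16, 8, 1):
    print(f"{busy:>2} active cores: {sustained_frequency_ghz(busy):.2f} GHz")

In this toy model, halving the number of busy cores releases enough of the shared budget for the remaining cores to run at or near their maximum boost frequency, which is the essence of the behavior described above.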
This opportunistic behavior of modern processors means the quality of cooling, considering both supply of cold air and its distribution within the server, is not binary anymore. Considerably better cooling increases the performance envelope of the processor, a phenomenon that supercomputing vendors and users have been exploring for years. It also tends to improve overall efficiency because more work is done for the energy used.
Performance is best served cold
Better cooling unlocks performance and efficiency in two major ways:
The lowering of operational temperature through improved cooling brings many performance benefits such as enabling individual processor cores to run at elevated speeds for longer without hitting their temperature limit.
Another, likely sizeable, benefit lies in reducing static power in the silicon. Static power is power lost to leakage currents that perform no useful work, yet keep flowing through transistor gates even when they are in the “off” state. Static power was not an issue 25 years ago, but has become more difficult to suppress as transistor structures have become smaller and their insulation properties correspondingly worse. High-performance logic designs, such as those in server processors, are particularly burdened by static power because they integrate a large number of fast-switching transistors.
Semiconductor technology engineers and chip designers have adopted new materials and sophisticated power-saving techniques to reduce leakage currents. However, the issue persists. Although chipmakers do not reveal the static power consumption of their products, it is likely to account for a considerable share of the processor’s power budget, probably a low double-digit percentage.
Various academic research papers have shown that static leakage currents depend on the temperature of silicon, but the exact profile of that correlation varies greatly across chip manufacturing technologies — such details remain hidden from the public eye.
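As a purely illustrative model (the real leakage-versus-temperature curves are, as noted, process-specific and not public), the sketch below assumes leakage roughly doubles for every 25 degrees Celsius rise in silicon temperature. The reference point and doubling interval are assumptions, chosen only to be broadly consistent with the low double-digit share mentioned above.

# Assumed exponential leakage model; the reference point and doubling interval
# are illustrative, not measured or vendor-supplied figures.
def static_power_w(die_temp_c, ref_temp_c=60.0, ref_leakage_w=40.0,
                   doubling_interval_c=25.0):
    return ref_leakage_w * 2 ** ((die_temp_c - ref_temp_c) / doubling_interval_c)

# Better cooling lowers die temperature and, in this model, frees tens of watts
# that dynamic power management can redeploy as extra performance.
for temp in (85, 70, 55):
    print(f"{temp} C: {static_power_w(temp):.0f} W of static (leakage) power")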
Upgraded air coolers can measurably improve application performance when the processor is thermally limited during periods of high load, though such a speed-up tends to be in the low single digits (in percentage terms). This can be achieved by lowering inlet air temperatures or, more commonly, by upgrading the processors’ cooling to lower thermal resistance. Examples include adding larger, CFD-optimized heat sinks built from alloys with better thermal conductivity (e.g., copper-based alloys); using better thermal interface materials; and introducing more powerful fans to increase airflow. If combined with better facility air delivery and lower inlet temperatures, the speed-up is higher still.
No silver bullets, just liquid cooling
But direct liquid cooling (DLC) makes a more pronounced difference, owing to its markedly lower thermal resistance and the consequently lower silicon temperature. Compared with air coolers at the same temperature, DLC (cold plate and immersion) can free up more power by reducing the temperature-dependent component of static leakage currents.
There is an even bigger performance potential in the better thermal properties of liquid cooling: prolonging the time that server processors can spend in controlled power excursions above their TDP level, without hitting critical temperature limits. This behavior, now common in server processors, is designed to offer bursts of extra performance, and can result in a short-term (tens of seconds) heat load that is substantially higher than the rated cooling requirement.
Typically, excursions reach 15% to 25% above the TDP, which did not previously pose a major challenge. However, in the latest generation of products from AMD and Intel, this results in up to 400 watts (W) and 420 W, respectively, of sustained thermal power per processor — up from less than 250 W about five years ago.
Such high-power levels are not exclusive to processor models aimed at high-performance computing applications: a growing number of mainstream processor models intended for cloud, hosting and enterprise workload consolidation can have these demanding thermal requirements. The favorable economics of higher performance servers (including their energy efficiency across an array of applications) generates demand for powerful processors.
Although these TDPs and power excursion levels are still manageable with air cooling when using high-performance heat sinks (at the cost of rack density, because of very large heat sinks and high fan power), peak performance levels will start to slip out of reach for standard air cooling in the coming years. Server processor roadmaps call for even more powerful models, probably reaching 600 W in thermal excursion power by the mid-2020s.
As processor power escalates and temperature limits grow more restrictive, even DLC temperature choices will present a growing trade-off as data center and IT infrastructure operators try to balance capital costs, cooling performance, energy efficiency and sustainability credentials. Inevitably, the relationship between data center cooling, server performance and overall IT efficiency will demand more attention.
The effects of a failing power grid in South Africa
By Max Smolaks, Research Analyst (msmolaks@uptimeinstitute.com)
European countries narrowly avoided an energy crisis in the past winter months, as a shortfall in fossil fuel supplies from Russia threatened to destabilize power grids across the region. This elevated level of risk to the normally robust European grid has not been seen for decades.
A combination of unseasonably mild weather, energy saving initiatives and alternative gas supplies averted a full-blown energy crisis, at least for now, although business and home consumers are paying a heavy price through high energy bills. The potential risk to the grid forced European data center operators to re-evaluate both their power arrangements and their relationship with the grid. Even without an energy security crisis, power systems elsewhere are becoming less reliable, including some of the major grid regions in the US.
Most mission-critical data centers are designed not to depend on the availability of an electrical utility, but to benefit from its lower power costs. On-site power generation, usually provided by diesel engine generators, is the most common option for backing up electricity supplies because it is under the facility operator’s direct control.
A mission-critical design objective of power autonomy, however, does not shield data center operators from problems that affect utility power systems. The reliability of the grid affects:
South Africa provides a case study in how grid instability affects data center operations. The country has emerged as a regional data center hub over the past decade (largely due to its economic and infrastructure head-start over other major African countries), despite experiencing its own energy crisis over the past 16 years.
A total of 11 major subsea network cables land in South Africa, and its telecommunications infrastructure is the most developed on the continent. Although it cannot match the capacity of other global data center hubs, South Africa’s data center market is highly active — and is expanding (including recent investments by global colocation providers Digital Realty and Equinix). Cloud vendors already present in South Africa include Amazon Web Services (AWS), Microsoft Azure, Huawei and Oracle, with Google Cloud joining soon. These organizations must contend with a notoriously unreliable grid.
Factors contributing to grid instability
Most of South Africa’s power grid is operated by state-owned Eskom, the largest producer of electricity in Africa. Years of under-investment in generation and transmission infrastructure have forced Eskom to impose periods of load-shedding — planned rolling blackouts based on a rotating schedule — since 2007.
Recent years have seen substation breakdowns, cost overruns, widespread theft of coal and diesel, industrial sabotage, multiple corruption scandals and a $5 billion government bail-out. Meanwhile, energy prices nearly tripled in real terms between 2007 and 2020.
In 2022, the crisis deepened, with more power outages than in any of the previous 15 years — nearly 300 load-shedding events, which is three times the previous record of 2020 (Figure 1). Customers are usually notified about upcoming disruption through the EskomSePush (ESP) app. Eskom’s load-shedding measures do not distinguish between commercial and residential properties.
Blackouts normally last for several hours, and there can be several a day. Eskom’s app recorded at least 3,212 hours of load-shedding across South Africa’s grid in 2022. For more than 83 hours, South Africa’s grid remained in “Stage 6”, which means the grid was in a power shortfall of at least 6,000 megawatts. A new record was set in late February 2023, when the grid entered “Stage 8” load-shedding for the first time. Eskom has, in the past, estimated that in “Stage 8”, an average South African could expect to be supplied with power for only 12 hours a day.
Reliance on diesel
In this environment, many businesses depend on diesel generators as a source of power — including data centers, hospitals, factories, water treatment facilities, shopping centers and bank branches. This increased demand for generator sets, spare parts and fuel has led to supply shortages.
Load-shedding in South Africa often affects road signs and traffic lights, which means fuel deliveries are usually late. In addition, trucks often have to queue for hours to load fuel from refineries. As a result, most local data center operators have two or three fuel supply contracts, and some are expanding their on-site storage tanks to provide fuel for several days (as opposed to the 12-24 hours typical in Europe and the US).
There is also the cost of fuel. The general rule in South Africa is that generating power on-site costs about seven to eight times more than buying utility power. With increased runtime hours on generators, this quickly becomes a substantial expense compared with utility energy bills.
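A rough, hedged calculation shows the scale involved. The facility load and tariff below are assumed figures; the outage hours are derived from the 3,212 load-shedding hours recorded in 2022 (pessimistically assuming the facility runs on generators for every one of them); and the cost multiple is the seven-to-eight-times rule of thumb quoted above.

# Back-of-the-envelope generator premium; all inputs are assumed or derived
# from figures quoted in this article, and the output is indicative only.
facility_load_kw = 1_000                # assumed 1 MW of facility load
outage_hours_per_month = 3_212 / 12     # ~268 hours, from Eskom's 2022 load-shedding total
utility_price_per_kwh = 0.05            # assumed utility tariff, in USD
onsite_cost_multiple = 7.5              # the "seven to eight times" rule of thumb quoted above

generated_kwh = facility_load_kw * outage_hours_per_month
extra_cost = generated_kwh * utility_price_per_kwh * (onsite_cost_multiple - 1)
print(f"{generated_kwh:,.0f} kWh/month on generators adds roughly ${extra_cost:,.0f} to the energy bill")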
Running generators for prolonged periods accelerates wear and heightens the risk of mechanical failure, with the result that the generator units need to be serviced more often. As data center operations staff spend more time monitoring and servicing generators and fuel systems, other maintenance tasks are often deferred, which creates additional risks elsewhere in the facility.
To minimize the risk of downtime, some operators are adapting their facilities to accommodate temporary external generator set connections. This enables them to provision additional power capacity in four to five hours. One city, Johannesburg, has access to gas turbines as an alternative to diesel generators, but these are not available in other cities.
Even if the data center remains operational through frequent power cuts, its connectivity providers, which also rely on generators, may not. Mobile network towers, equipped with UPS systems and batteries, are frequently offline because they do not get enough time to recharge between load-shedding periods if there are several occurrences a day. MTN, one of the country’s largest network operators, had to deploy 2,000 generators to keep its towers online and is thought to be using more than 400,000 liters of fuel a month.
Frequent outages on the grid create another problem: cable theft. In one instance, a data center operator’s utility power did not return following a scheduled load-shedding period. The copper cables leading to the facility had been stolen; the thieves used the load-shedding announcements to work out when it was safe to cut them.
Lessons for operators in Europe and beyond
Issues with the supply of Russian gas to Europe might be temporary, but there are other concerns for power grids around the world. In the US, an aging electricity transmission infrastructure — much of it built in the 1970s and 1980s — requires urgent modernization, which will cost billions of dollars. It is not clear who will foot this bill. Meanwhile, power outages across the US over the past six years have more than doubled compared with the previous six years, according to federal data.
While the scale of grid disruption seen in South Africa is extreme, it offers lessons on what happens when a grid destabilizes and how to mitigate the problems. An unstable grid will cause similar issues for data center operators around the world, ranging from ballooning power costs to a higher risk of equipment failure. This risk will creep into other parts of the facility infrastructure if operations staff do not have time to perform the necessary maintenance tasks. Generators can become the primary source of data center power, but they are best used as an insurance policy.
US operators scour Inflation Reduction Act for incentives
By Lenny Simon, Senior Research Associate, Uptime Institute
In the struggle to reduce carbon emissions and increase renewable energy, the US Inflation Reduction Act (IRA), passed in August 2022, is a landmark development. The misleadingly named Act, which is lauded by environmental experts and castigated by foreign leaders, is intended to rapidly accelerate the decarbonization of the world’s largest economy by introducing nearly $400 billion in federal funding over the next 10 years.
Reducing the carbon intensity of electricity production is a major focus of the Act, and the US clean energy industry will benefit greatly from the tax credits encouraging renewable energy development. But the Act also includes provisions intended to “re-shore” and create jobs in the US, and to ensure that US companies have greater control over the energy supply chain. Abroad, foreign leaders have objected to these protectionist provisions, which are creating (or aggravating) a political rift between the US and its allies and trading partners. In response to the IRA, the EU has redirected funds to buoy its low-carbon industries, threatened retaliatory measures and is considering the adoption of similar legislation.
While the politicians argue, stakeholders in the US have been scouring the IRA’s 274 pages for opportunities to capitalize on these lucrative incentives. Organizations that use a lot of electricity are also likely to benefit — including large data centers and their suppliers. Lawyers, accountants and investors working for organizations planning large-scale digital infrastructure investments will see opportunities too. Some of these will be substantial.
Summary of opportunities
For digital infrastructure companies, it may be possible to secure support in the following areas:
Detailed provisions
Any US operator considering significant investments in renewable generation and/or energy storage (including, for example, a UPS) is advised to study the Act closely.
Uptime Institute Intelligence’s view is that the following IRA tax credits will apply to operators:
For the tax credits mentioned above, organizations must meet the prevailing wage and apprenticeship requirements (the US Treasury Department and the Internal Revenue Service have published initial guidance) to receive the maximum credit, unless the nameplate generation capacity of the project is less than 1 megawatt for the investment tax credit (ITC) and production tax credit (PTC).
The incentives listed above will be available until 2032, creating certainty for operators considering an investment in renewables, efficiency retrofits or the renewable energy industry. Additionally, these tax credits are transferable: they can be sold, for cash, to another company with tax liability, such as a bank.
Hyperscalers and large colocation providers are best positioned to capitalize on these tax credits: they are building new capacity quickly, possess the expertise and staffing capacity to navigate the legal requirements, and have ambitious net-zero targets.
However, data center operators of all sizes will pursue these incentives where there is a compelling business case. Owners and operators responding to Uptime Institute’s 2022 global data center survey said more renewable energy purchasing options would deliver the most significant gains in sustainability performance in the next three to five years.
The IRA may also lower the cost barriers for innovative data center designs and topologies. For example, IRA incentives will strengthen the business case for pairing a facility on a microgrid with renewable generation and long-duration energy storage (LDES). Emerging battery chemistries in development (including iron-air, liquid metal and nickel-zinc) offer discharge durations of 10 hours to 10 days and would benefit from large deployments to prove their viability.
LDES is essential for a reliable low-carbon grid. As the IRA speeds up the deployment of renewables, organizations will need multi-day energy storage to smooth out the high variability of intermittent generators such as solar and wind. Data center facilities may be ideal sites for LDES, even if the storage is not dedicated to data center use.
Additionally, low-carbon baseload generators such as nuclear, hydrogen and geothermal — all eligible for IRA tax credits — will be needed to replace reliable fossil fuel generators, such as gas turbines and coal power plants.
The incentives in the IRA, welcomed with enthusiasm by climate campaigners the world over, will strengthen the business case in the US for reducing energy consumption, deploying low-carbon energy and energy storage, and/or investing in the clean energy economy.
There is, however, a more problematic side: the rare earth materials and critical components the US will need to meet the objectives of the IRA may be hard to source in sufficient quantities, and allegations of protectionism may cause political rifts with other countries.
Lenny Simon, Senior Research Associate lsimon@uptimeinstitute.com
Andy Lawrence, Executive Director of Research alawrence@uptimeinstitute.com
Energy-efficiency focus to shift to IT — at last
By Daniel Bizo, Research Director, Uptime Institute Intelligence (dbizo@uptimeinstitute.com)
Data centers have become victims of their own success. Ever-larger data centers have mushroomed across the globe in line with an apparently insatiable demand for computing and storage capacity. The associated energy use is not only expensive (and generating massive carbon emissions) but is also putting pressure on the grid. Most data center developments tend to be concentrated in and around metropolitan areas — making their presence even more palpable and attracting scrutiny.
Despite major achievements in energy performance throughout the 2010s, as witnessed by Uptime data on industry-average power usage effectiveness (PUE), this growth has created challenges for data center builders and operators. Delivering bulletproof and energy-efficient infrastructure at a competitive cost is already a difficult balancing act, even without having to engage with local government, regulators and the public at large on energy use, environmental impact and carbon footprint.
IT is conspicuously absent from this dialogue. Server and storage infrastructure account for the largest proportion of a data center’s power consumption and physical footprint. As such, they also offer the greatest potential for energy-efficiency gains and footprint compression. Often the issue is not wasted power but unused power: poor capacity-planning practices create demand for additional data center developments even where unused (but provisioned) capacity is available.
Nonetheless, despite growing costs and sustainability pressures, enterprise IT operators — as well as IT vendors — continue to show little interest in the topic.
This will be increasingly untenable in the years ahead. In the face of limited power availability in key data center markets, together with high power prices and mounting pressure to meet sustainability legislation, enterprise IT’s energy footprint will have to be addressed more seriously. This will involve efficiency-improvement measures aimed at using dramatically fewer server and storage systems for the same workload.
Uptime has identified four key areas where pressure on IT will continue to build — all of them pointing in the same direction:
Municipalities — and utility providers — need the pace to drop
Concerns over power and land availability have, since 2019, led to greater restrictions on the construction of new data centers (Table 1). This is likely to intensify. Interventions on the part of local government and utility providers typically involve more rigorous application processes, more stringent energy-efficiency requirements and, in some cases, the outright denial of new grid connections for major developments. These restrictions have resulted in costly project delays (and, in some cases, cancellations) for major cloud and colocation providers.
Frankfurt, a key financial hub and home to one of the world’s largest internet exchange ecosystems, set an example. Under a new citywide masterplan (announced in 2022), the city stipulates densified, multistory and energy-optimized data center developments — chiefly out of concerns for sprawling land use and changes to the city’s skyline.
The Dublin area (Ireland) and Loudoun County (Northern Virginia, US) are two stand-out examples (among others) of the grid being under strain and power utilities having temporarily paused or capped new connections because of current shortfalls in generation or transmission capacity. Resolving these limitations is likely to take several years. A number of data center developers in both Dublin and Loudoun County have responded to these challenges by seeking locations further afield.
Table 1. Restrictions on new data centers since 2019 — selected examples
New sustainability regulations
Following years of discussion with key stakeholders, authorities have begun introducing regulation governing performance improvements and sustainability reporting for data centers — a key example being the EC’s Energy Efficiency Directive recast (EED), which will subject data centers directly to regulation aimed at reducing both energy consumption and carbon emissions (see Critical regulation: the EU Energy Efficiency Directive recast).
This regulation creates new, detailed reporting requirements for data centers in the EU and will force operators to improve their energy efficiency and to make their energy performance metrics publicly available — meaning investors and customers will be better equipped to weigh business decisions on the basis of the organizations’ performance. The EED is expected to enter into force in early 2023. At the time of writing (December 2022), the EED could still be amended to include higher targets for efficiency gains (increasing from 9% to 14.5%) by 2030. The EC has already passed legislation mandating regulated organizations to report on climate-related risks, their potential financial impacts and environmental footprint data every year from 2025; this will affect swathes of data centers.
Similar initiatives are now appearing in the US, with the White House Office of Science and Technology Policy’s (OSTP’s) Climate and Energy Implications of Crypto-assets in the US report, published in September 2022. Complementary legislation is being drafted that addresses both crypto and conventional data centers and sets the stage for the introduction of regulation similar to the EED over the next three to five years (see First signs of federal data center reporting mandates appear in US).
Current and draft regulation is predominantly focused on the performance of data center facility infrastructure (power and cooling systems) in curbing the greenhouse gas (GHG) emissions associated with utility power consumption (Scope 2). While definitions and metrics remain vague (and are subject to ongoing development), it is clear that EC regulators intend ultimately to extend the scope of such regulation to include IT efficiency as well.
Expensive energy is here to stay
The current energy crises in the UK, Europe and elsewhere are masking some fundamental energy trends. Energy prices and, consequently, power prices were on an upward trajectory before Russia’s invasion of Ukraine. Wholesale forward prices for electricity were already shooting up — in both the European and US markets — in 2021.
Certain long-term trends also underpin the trajectory towards costlier power and create an environment conducive to volatility. Structural elements to long-term power-price inflation include:
More specifically, baseload power is becoming more expensive because of the economic displacement effect of intermittent renewable energy. Regardless of how much wind and solar (or even hydro) is connected to the grid, reliability and availability considerations mean the grid has to be fully supported by dispatchable generation such as nuclear, coal and, increasingly, gas.
However, customer preference for renewable energy (and its low operational costs) means fleets of dispatchable power plants operate at reduced capacity, with an increasing number on standby. Grid operators — and, ultimately, power consumers — still need to pay for the capital costs and upkeep of this redundant capacity, to guarantee grid security.
IT power consumption will need to be curbed
High energy prices, carbon reporting, grid capacity shortfalls and efficiency issues have been, almost exclusively, a matter of concern for facility operators. But facility operators have now passed the point of diminishing returns, with greater intervention delivering fewer and fewer benefits. In contrast, every watt saved by IT reduces pressures elsewhere. Reporting requirements will, sooner or later, shed light on the vast potential for greater energy efficiency (or, to take a harsher view, expose the full extent of wasted energy) currently hidden in IT infrastructure.
For these reasons, other stakeholders in the data center industry are likely to call upon IT infrastructure buyers and vendors to engage more deeply in these conversations, and to commit to major initiatives. These demands will be completely justified: currently, IT has considerable scope for delivering improved power management and energy efficiency, where required.
Architecting IT infrastructure for energy efficiency, through better hardware configuration choices, dynamic workload consolidation practices and the use of power-management features (including energy-saving states and power throttling/capping), will deliver major gains. Server utilization and the inherent efficiency of server hardware are two key dials that could bring manifold improvements in energy performance compared with typical enterprise IT.
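As one concrete, and deliberately hedged, example of the power-management features mentioned above, the sketch below reads and switches the CPU frequency-scaling governor through the standard Linux cpufreq sysfs interface. Paths, available governors and their effects vary by platform and driver, and a production change of this kind belongs in configuration management rather than an ad-hoc script.

# Minimal sketch: inspect and switch the CPU frequency-scaling governor on a
# Linux host. Writing requires root; the sysfs layout varies by platform.
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")
CPUFREQ_GLOB = "cpu[0-9]*/cpufreq/scaling_governor"

def report_governors() -> None:
    for gov in sorted(CPU_ROOT.glob(CPUFREQ_GLOB)):
        print(gov.parent.parent.name, gov.read_text().strip())

def set_governor(governor: str = "powersave") -> None:
    for gov in CPU_ROOT.glob(CPUFREQ_GLOB):
        gov.write_text(governor)          # raises PermissionError without root

if __name__ == "__main__":
    report_governors()
    # set_governor("powersave")           # left commented out deliberately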
These efficiency gains are not just theoretical: web technology and cloud services operators exploit them wherever they can. There is no reason why other organizations cannot adopt some of these practices and move closer to the performance levels these operators achieve. In an era of ever-more expensive (and scarce) power resources, together with mounting regulatory pressure, it will be increasingly difficult for IT C-level managers to deny calls to engage in the battle for better energy efficiency.
The full report Five data center predictions for 2023 is available here.
See our Five Data Center Predictions for 2023 webinar here.
Daniel Bizo
Douglas Donnellan