First signs of federal data center reporting mandates appear in US

The past year (2022) has seen regulators in many countries develop or mandate requirements to report data centers’ operating information and environmental performance metrics. The first of these, the European Commission (EC) Energy Efficiency Directive (EED) recast, is currently under review by the European Parliament and is expected to become law in 2023. This directive will mandate three levels of information reporting, the application and publication of energy performance improvement and efficiency metrics, and conformity with certain energy efficiency requirements (see EU’s EED recast set to create reporting challenges).

Similar legislative and regulatory initiatives are now appearing in the US with the White House Office of Technology and Science Policy’s (OTSP’s) Climate and energy implications of crypto-assets in the US report, published in September 2022. Concurrently, Senator Sheldon Whitehouse is drafting complementary legislation that addresses both crypto and conventional data centers and sets the stage for the introduction of regulation similar to the EED over the next three to five years.

The OTSP report focuses on the impacts of the recent precipitous increase in energy consumption resulting from cryptocurrency mining in the US — initially driven by high crypto prices, low electricity costs and China’s prohibition of cryptomining operations. The OTSP report estimates cryptomining energy consumption (for both Bitcoin and Ethereum mining) to be responsible for 0.9% to 1.7% of US electricity consumption, and for 0.4% to 0.8% of greenhouse gas (GHG) emissions, in 2021.
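Those percentage shares translate into substantial absolute quantities. A back-of-the-envelope conversion (the ~3,900 TWh total is an assumed approximation of 2021 US electricity consumption, not a figure from the report):

```python
US_ELECTRICITY_2021_TWH = 3_900        # assumed approximate 2021 US total
low_share, high_share = 0.009, 0.017   # OTSP's 0.9%-1.7% estimate

low_twh = US_ELECTRICITY_2021_TWH * low_share
high_twh = US_ELECTRICITY_2021_TWH * high_share
print(f"{low_twh:.0f}-{high_twh:.0f} TWh per year")  # roughly 35-66 TWh
```

On these assumptions, cryptomining consumed on the order of 35 TWh to 66 TWh in 2021 — comparable, as the report notes, to conventional data centers.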

The OTSP’s projections may already be out of date due to the current high energy prices and the collapse in value of most crypto assets. The OTSP’s projections, moreover, do not take into account the likely impact of Ethereum mining operations (estimated to account for one-quarter to one-third of industry consumption) moving from “proof of work” (PoW) to “proof of stake” (PoS).

PoW is the original “consensus mechanism” used in cryptocurrency transactions, whereby miners compete to solve increasingly difficult algorithms to validate transactions — at the cost of ever-increasing energy consumption. PoS transactions are mediated by randomly selected validators, who stake a quantity of cryptocurrency (and their experience level) for the right to confirm transactions — enabling the use of less computationally intense (and therefore less energy-intense) algorithms. Ethereum converted to PoS in September 2022 in an initiative known as “the Merge”: this change is expected to reduce its mining energy consumption by over 99%.
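The energy gap between the two mechanisms can be illustrated with a toy sketch. This is a deliberate simplification, not a real blockchain implementation: PoW must brute-force a hash puzzle, while PoS merely performs a stake-weighted random draw:

```python
import hashlib
import random

def pow_mine(block_data: str, difficulty: int) -> int:
    """Proof of work: brute-force a nonce whose SHA-256 hash starts with
    `difficulty` zero hex digits. Expected work (and energy) grows
    exponentially with difficulty."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

def pos_select(stakes: dict, seed: int) -> str:
    """Proof of stake: a single weighted random draw picks the validator.
    No hash race, so the energy cost is negligible by comparison."""
    validators = sorted(stakes)
    weights = [stakes[v] for v in validators]
    return random.Random(seed).choices(validators, weights=weights, k=1)[0]

nonce = pow_mine("block-42", difficulty=4)   # tens of thousands of hash attempts
validator = pos_select({"alice": 32.0, "bob": 64.0, "carol": 4.0}, seed=7)
```

Raising `difficulty` by one multiplies the expected PoW search by 16, while the PoS selection stays a constant-time draw — the essence of the Merge’s 99%-plus energy saving.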

The OTSP report implies that the broader adoption of crypto assets and the application of the underlying blockchain software used across a range of business processes will continue to drive increasing blockchain-related energy consumption. The report does not offer a specific projection of increasing energy consumption from cryptomining and further blockchain deployments. Given that most, if not all, enterprise blockchain deployments use PoS validation, and given the ability of PoW infrastructure to move quickly to locations with minimal regulation and energy costs, much of this anticipated energy growth may not materialize.

To mitigate this projected growth in energy consumption, the OTSP report calls on the federal government to encourage and ensure the responsible development of cryptomining operations in three specific areas.

  • Minimizing GHG emissions and other impacts from cryptomining operations. The report proposes that the US government implement a collaborative process to develop effective, evidence-based environmental performance standards governing the development, design and operation of cryptomining facilities. It proposes that the Department of Energy (DOE) or the Environmental Protection Agency (EPA) should be empowered to set energy performance standards for “crypto-asset mining equipment, blockchain and other operations.”
  • Requiring cryptomining organizations to obtain and publicly report data in order to understand, monitor and mitigate impacts. The report stipulates that cryptomining operations should publicly report their location(s), energy consumption, energy mix, GHG emissions (using existing protocols), electronic waste recycling, environmental justice implications and demand-response participation.
  • Promoting further research to improve understanding and innovation. The report recommends prioritizing research and development in next-generation digital asset technologies that promote the US’ goals in terms of security, privacy, equity, resiliency and climate.
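The disclosure fields proposed in the second recommendation map naturally onto a structured record. A minimal sketch of what such a report might contain (the field names are invented for illustration and do not come from any draft rule):

```python
from dataclasses import dataclass

@dataclass
class FacilityReport:
    """Illustrative annual disclosure record for one cryptomining site,
    loosely following the categories listed in the OTSP report."""
    location: str                  # site address or coordinates
    energy_consumed_mwh: float     # total electricity consumed
    energy_mix: dict               # fuel source -> share of consumption
    ghg_emissions_tco2e: float     # reported under an existing protocol
    ewaste_recycled_kg: float      # electronic waste recycled
    demand_response_hours: float   # grid demand-response participation

    def emissions_intensity(self) -> float:
        """tCO2e per MWh consumed: a simple derived performance metric."""
        return self.ghg_emissions_tco2e / self.energy_consumed_mwh

report = FacilityReport(
    location="Example County, TX",
    energy_consumed_mwh=120_000.0,
    energy_mix={"wind": 0.4, "gas": 0.6},
    ghg_emissions_tco2e=31_200.0,
    ewaste_recycled_kg=5_400.0,
    demand_response_hours=85.0,
)
print(round(report.emissions_intensity(), 2))  # 0.26
```

A derived metric such as emissions intensity is the kind of evidence-based performance figure the DOE or EPA could standardize on once raw reporting is in place.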

While these recommendations are primarily directed at cryptomining operations, the report also assesses conventional (i.e., non-crypto-asset) data center operations, noting that cryptomining energy consumption in 2021 was roughly comparable to that of conventional data centers. This clearly raises the question: if cryptomining energy consumption warrants public data reporting and energy performance standards, then why should conventional data center operations not also be included in that mandate?

Under US law, Congress would need to pass legislation authorizing an administrative agency to require data centers to report their location(s), operational data and environmental performance information. Senator Whitehouse is developing draft legislation to address both crypto-asset and conventional data centers, using the EED as a blueprint. The Senator’s proposals would amend the Energy Independence and Security Act of 2007 (EISA) to require all public and private conventional and cryptomining data center locations with more than 100 kW of installed IT equipment (nameplate power) to report data to the Energy Information Administration (EIA). These data center locations would need to outline their operating attributes: a requirement remarkably similar to the EED’s information reporting mandates.

The proposals also require the DOE to promulgate a final rule covering energy conservation standards for “Servers and Equipment for Cryptomining” within two years of the EISA amendments going into force. While this requirement is specific to cryptomining equipment, it is likely that the DOE will lobby Congress to include energy conservation standards for conventional data center IT equipment as part of these proposed amendments. The DOE has already attempted to set energy conservation standards for computer servers (79 FR 11350 02/28/2014) through authority granted under the EISA regulating commercial office equipment.

Little will happen immediately. Legislative and regulatory processes and procedures in the US can be laborious, and final standards governing data center information and energy efficiency reporting are likely to remain several years away. But the release of the OTSP report and the development of draft US legislation indicate that the introduction and adoption of these standards is a matter of “when” (and “how strictly”) rather than “if”.

Owners and operators of digital infrastructure need to be prepared. The eventual promulgation of these standards, taken in conjunction with the proposed climate change disclosure rules from the Securities and Exchange Commission, will, sooner or later, dictate that operators establish data collection and management processes to meet information reporting requirements. Operators will need a strategy for meeting these requirements, and policies that ensure they undertake projects that increase the work delivered per megawatt-hour of energy consumed across their data center operations.

Data center managers would also be wise to engage with industry efforts to develop simple and effective energy-efficiency metrics. These metrics are required under both US draft legislation and the EC EED recast and are likely to be included in legislation and regulation in other jurisdictions. An ITI Green Grid (TGG) Working Group has been put in place to work on this issue, and other efforts have been proposed by groups and institutions such as Infrastructure Masons (iMasons) and the Climate Neutral Data Centre Pact. Uptime Institute is also providing detailed feedback on behalf of its members on an EC study proposing options and making recommendations for data reporting and metrics as required under the EED recast.

Industry initiatives that encompass all types of IT operations are going to be important. Just as importantly, the industry will need to converge on a single and cohesive globally applicable metric (or set of metrics) to facilitate standardized reporting and minimize confusion.

Rapid interconnectivity growth will add complexity and risk

Recent geopolitical concerns, predictions of a looming recession and continued supply chain difficulties are unlikely to dampen growth in digital bandwidth on private networks, according to Equinix’s 2022 Global Interconnection Index (GXI). Global interconnection bandwidth (the volume of data exchanged between companies directly, bypassing the public internet) is a barometer for digital infrastructure and sheds light on the differing dynamics between verticals. High growth in private interconnection is a boon for Equinix, the world’s largest colocation provider by market share, but makes resiliency more challenging for its customers: every interconnect is also a potential point of failure.

The Equinix GXI projects strong growth across the industry in 2023, with global interconnection bandwidth expected to increase by 41% compared with 2022. Overall, global interconnection bandwidth is projected to grow at a compound annual growth rate (CAGR) of 40% through 2025, when it is expected to reach nearly 28,000 terabits per second (Tbps). These numbers include direct connections between enterprises and their digital business partners (such as telecommunications, cloud, edge and software as a service (SaaS) providers).
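The GXI figures can be sanity-checked with the standard CAGR formula; the 2022 baseline below is back-computed from the projected 2025 value rather than taken from the report:

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1

def project(start: float, rate: float, years: int) -> float:
    """Value after `years` of compound growth at `rate`."""
    return start * (1 + rate) ** years

# Implied 2022 baseline, working back from ~28,000 Tbps in 2025 at 40% CAGR
base_2022 = 28_000 / 1.40 ** 3   # ~10,200 Tbps
print(round(project(base_2022, 0.40, 3)))    # 28000
print(round(cagr(base_2022, 28_000, 3), 2))  # 0.4
```

Three years of 40% compounding is a 2.7x increase — context for just how fast private interconnection is expected to scale.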

The Equinix study projects faster growth in private interconnection for enterprises than for networks operated by telecommunications companies or cloud providers. This growth in private interconnection is driven by high demand for digital services and products — many of which also require a presence with multiple cloud providers as well as integration with major SaaS companies.

The energy and utility sector is likely to see the greatest growth in private network interconnection through 2025, with a CAGR of 53%, as energy data becomes increasingly important for managing intermittent renewable energy and decarbonizing the grid. Digital services supporting sustainability efforts such as carbon accounting are likely to require additional private interconnection with SaaS providers to accurately track operational sustainability metrics.

The banking and insurance and manufacturing sectors are expected to see CAGRs of 49% and 45%, respectively, over the same period. These industries are particularly sensitive to errors and outages, however, and appropriate planning will be necessary.

There is a reason Equinix has been drawing attention to the benefits of interconnection for the past six years: as of Q2 2022, the company operated 435,800 cross-connects throughout its own data centers. Its closest competitor, Digital Realty, reported just 185,000 cross-connects at its facilities in the same quarter. Equinix defines a cross-connect as a point-to-point cable link between two customers in the same retail colocation data center. For colocation companies, cross-connects not only represent core recurring revenue streams but also make their network-rich facilities more valuable as integration hubs between organizations.

As private interconnection increases, so too does the interdependency of digital infrastructure. Strong growth in interconnection may be responsible for the increasing proportion of networking and third-party-related outages in recent years. Uptime’s 2022 resiliency survey sheds light on the two most common causes of connectivity-related outages: misconfiguration and change management failure (reported by 43% of survey respondents); and third-party network-provider failure (43%). Asked specifically if their organization had suffered an outage caused by a problem with a third-party supplier, 39% of respondents confirmed this to be the case (see Figure 1).

Figure 1. The most common causes of major third-party outages

When third-party IT and data center service providers do have an outage, customers are immediately affected — and may seek compensation. Enterprise end-users will need additional transparency and stronger service-level agreements from providers to better manage additional points of failure, as well as the outsourcing of their architecture resiliency. Importantly, managing the added complexity of an enterprise IT architecture spanning on-premises, colocation and cloud facilities demands more organizational resources in terms of skilled staff, time and budget.

Failing that, businesses might encounter unexpected availability and reliability issues rather than any anticipated improvement. According to Uptime’s 2021 annual survey of IT and data center managers, one in eight (of those who had a view) reported that using a mix of IT venues had resulted in their organization experiencing a deterioration in service resiliency, rather than the reverse.

By: Lenny Simon, Senior Research Associate and Max Smolaks, Analyst

Reports of cloud decline have been greatly exaggerated

Cloud providers have experienced unprecedented growth over the past few years. CIOs the world over, often prompted by CFOs and CEOs, have been favoring the cloud over on-premises IT for new and major projects — with the result that the largest cloud provider, Amazon Web Services (AWS), has seen revenue increase by 30% to 40% every year since 2014 (when it recorded an 80% jump in turnover). Microsoft Azure and Google have reported similar numbers in recent times.

But there are signs of a slowdown:

  • While AWS reported a year-on-year revenue increase of 27.5% for Q3 2022, this is down from 33% in Q2 — the slowest growth in its history.
  • Microsoft’s CFO has also said that Azure revenue growth could decline in the next quarter, following disappointing 35% growth in the three months to September 2022.

Why this slowdown in cloud growth?

The global macroeconomic environment — specifically, high energy costs together with inflation — is making organizations more cautious about spending money. Cloud development projects are no different from many others and are likely to be postponed or deprioritized due to rising costs, skill shortages and global uncertainty.

Some moves to the cloud may have been indefinitely deferred. Public cloud is not always cheaper than on-premises implementations, and many organizations may have concluded that migration is just not worthwhile in light of other financial pressures.

For those organizations that have already built cloud-based applications, it is neither feasible nor wise to turn off applications or resources to save money: these organizations are, instead, spending more time examining and optimizing their costs.

Cutting cloud costs, not consumption

Cloud providers’ top-line revenue figures suggest customers are successfully reducing their cloud costs. How are they doing this?

Optimizing cloud expenditure involves two key activities: first, eliminating waste (such as orphaned resources and poorly sized virtual machines); and second, more cost-effective procurement, through alternative pricing models such as consistent-usage commitments or spot instances — both of which, crucially, reduce expenditure without impacting application performance.
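The two levers compound, as a toy calculation shows (the hourly rates are placeholders, not real cloud prices):

```python
HOURS_PER_MONTH = 730

def monthly_cost(vcpus: int, rate_per_vcpu_hour: float) -> float:
    """Monthly bill for a steady fleet of virtual CPUs."""
    return vcpus * rate_per_vcpu_hour * HOURS_PER_MONTH

on_demand = 0.05   # $/vCPU-hour, placeholder list price
committed = 0.03   # discounted rate for a usage commitment, placeholder

baseline   = monthly_cost(64, on_demand)   # oversized fleet, list price
rightsized = monthly_cost(40, on_demand)   # waste eliminated
optimized  = monthly_cost(40, committed)   # plus cheaper procurement

saving = 1 - optimized / baseline
print(f"${baseline:,.0f} -> ${optimized:,.0f} ({saving:.1%} lower)")
```

In this sketch the same workload costs 62.5% less once both levers are applied — without any change to application performance.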

Hyperscaler cloud providers, which are more interested in building longer-term relationships than in deriving higher gross margins in the short term, offer tools to help users reduce expenditure. These tools have improved significantly over the past few years.

Many organizations have now crossed a threshold in terms of cloud use, where the savings to be made mean it is to their benefit to invest in optimization (using these tools). One factor driving optimization here is higher cloud expenditure — in part an ongoing consequence of the pandemic, which saw businesses retooling IT to survive, rather than focusing on cutting IT costs.

It should, perhaps, have been anticipated that customers would, at some point, start using these tools to their own advantage — current pressures on other costs having made cutting IT expenditure more critical than before.

Will cloud prices rise?

Cloud providers’ overriding objective of winning and keeping customers over the long term explains why hyperscalers are likely to try to avoid increasing their prices for the foreseeable future. Providers want to maintain good relationships with their customers so that they are the de facto provider of choice for new projects and developments: price hikes would damage the customer trust they’ve spent so long cultivating.

AWS’s Q3 2022 gross margin was 26%, some three percentage points down on Q2. This drop could be attributed to rising energy costs, which AWS states almost doubled over the same period (hedging and long-term purchase agreements notwithstanding). Microsoft has reported that it will face additional energy costs of $800 million this financial year. While AWS and Microsoft could have increased prices to offset rising energy costs and maintain their profit margins, they have, so far, chosen not to do so rather than risk damaging customers’ trust.

How will this play out, going forward? Financial pressures may make organizations more careful about cloud spending. Projects may be subject to more stringent justification and approval, and some migrations are likely to be delayed (or even cancelled) for now. As revenue increases in absolute terms, achieving high-percentage revenue gains becomes increasingly difficult. Nonetheless, while the days of 40% revenue jumps may be over, this recent downturn is unlikely to be the start of a rapid downward spiral. AWS’s Q3 2022 revenue growth may have shrunk in percentage terms: but it was still in excess of $4 billion.

Applications architected for the cloud should be automatically scalable, and capable of meeting customers’ requirements without their having to spend more than necessary. Cloud applications allow organizations to adapt their business models and / or drive innovation — which may be one of the reasons many have been able to survive (and, in some cases, thrive) during challenging times. In a sense, the decline in growth that the cloud companies have suffered recently demonstrates that the cloud model is working exactly as intended.

The hyperscaler cloud providers are likely to continue to expand globally and create new products and services. Enterprise customers, in turn, are likely to continue to find cloud services competitive in comparison with colocation-based or on-premises alternatives. Much of the cloud’s value comes from a perception of it offering “unlimited” resources. If providers don’t increase capacity, they risk failing to meet customers’ expectations when required — damaging credibility, and relationships. AWS, Google and Microsoft continue to compete for market share, worldwide. Reducing investment now could risk future profitability.

AWS currently has 13,000 vacancies advertised on its website — hardly the sign of a sector in retreat. If anything, it suggests future growth will be strong.

Major data center fire highlights criticality of IT services

Uptime Institute’s outages database suggests data center fires are infrequent, and rarely have a significant impact on operations. Uptime has identified 14 publicly reported, high-profile data center outages caused by fire or fire suppression systems since 2020. The frequency of fires is not increasing relative to the IT load or number of data centers but, uncontained, they are potentially disastrous to facilities, and subsequent outages can be ruinous for the business.

SK Group, South Korea’s second largest conglomerate, is the latest high-profile organization to suffer a major data center fire, after a blaze broke out at a multistory colocation facility operated by its SK Inc. C&C subsidiary in Pangyo (just south of Seoul) on October 15. According to police reports, the fire started in a battery room before spreading quickly to the rest of the building. It took firefighters around eight hours to bring the blaze under control.

While there were no reported injuries, this incident could prove to be the largest data center outage caused by fire to date. It is a textbook example of how seemingly minor incidents can escalate to wreak havoc through cascading interdependencies in IT services.

The incident took tens of thousands of servers offline, including not only SK Group’s own systems but also the IT infrastructure running South Korea’s most popular messaging and single sign-on platform, KakaoTalk. The outage disrupted its integrated mobile payment system, transport app, gaming platform and music service — all of which are used by millions. The outage also affected domestic cloud giant Naver (the “Google of South Korea”) which reported disruption to its online search, shopping, media and blogging services.

While SK Group has yet to disclose the root cause of the fire, Kakao, the company behind KakaoTalk, has pointed to the lithium-ion (Li-ion) batteries deployed at the facility — manufactured by SK on, another SK Group subsidiary. In response, SK Group has released what it claims are records from its battery management system (BMS) showing no deviation from normal operations prior to the incident. Some local media reports contradict this, however, claiming multiple warnings were, in fact, generated by the BMS. Only a thorough investigation will settle these claims. In the meantime, both sides are reported to be “lawyering up.”

The fallout from the outage is not limited to service disruptions or lost revenue, and has prompted a statement from the country’s president, Yoon Suk-yeol, who has promised a thorough investigation into the causes of, and the extent of the damages arising from, the fire. The incident has, so far, led to a police raid on SK Inc. C&C headquarters; the resignation of Kakao co-CEO Whon Namkoong; and the establishment of a national task force for disaster prevention involving military officials and the national intelligence agency. Multiple class-action lawsuits against Kakao are in progress, mainly based on claims that the company has prioritized short-term profits over investment in more resilient IT infrastructure.

The South Korean government has announced a raft of measures aimed at preventing large-scale digital service failures. All large data centers will now be subject to disaster management procedures defined by the government, including regular inspections and safety drills. Longer-term, the country’s Ministry of Science and ICT will be pushing for the development of battery technologies posing a lower fire risk — a matter of national interest for South Korea, home to some of the world’s largest Li-ion cell manufacturers including Samsung SDI and LG Chem, in addition to SK on.

The fire in South Korea will inevitably draw comparisons with the blaze that brought down the OVHcloud Strasbourg facility in 2021. That fire affected some 65,000 customers, many of whom lost their data (see Learning from the OVHcloud data center fire), and, as in Pangyo, is thought to have involved uninterruptible power supply (UPS) systems. According to the French Bureau of Investigation and Analysis on Industrial Risks (BEA-RI), the lack of an automatic fire extinguishing system, delayed electrical cutoff and the building’s design all contributed to the spread of the blaze.

A further issue arising from this outage, and one that remains to be determined, is the financial cost to SK Group, Kakao and Naver. The fire at the OVHcloud Strasbourg facility was estimated to cost the operator more than €105 million — with less than half of this being covered by insurance. The cost of the fire in Pangyo is likely to run into tens (if not hundreds) of millions of dollars. This should serve as a timely reminder of the importance of fire suppression, particularly in battery rooms.

Li-ion batteries in mission-critical applications — risk creep?

Li-ion batteries present a greater fire risk than valve-regulated lead-acid batteries, regardless of their specific chemistries and construction — a position endorsed by the US National Fire Protection Association, among others. The breakdown of cells in a Li-ion battery releases flammable gases as well as oxygen, which can feed a major thermal-runaway event (in which fire spreads uncontrollably between cells, across battery packs and, potentially, even between cabinets if these are inappropriately spaced); the fires they cause are therefore notoriously difficult to suppress.

Many operators have, hitherto, found the risk-reward profile of Li-ion batteries (in terms of their lower footprint and longer lifespan) to be acceptable. Uptime surveys show major UPS vendors reporting strong uptake of Li-ion batteries in data center and industrial applications: some vendors report shipping more than half their major three-phase UPS systems with Li-ion battery strings. According to the Uptime Institute Global Data Center Survey 2021, nearly half of operators have adopted this technology for their centralized UPS plants, up from about a quarter three years ago. The Uptime Institute Global Data Center Survey 2022 found Li-ion adoption levels to be increasing still further (see Figure 1).

Figure 1. Data centers are embracing Li-ion batteries

The incident at the SK Inc. C&C facility highlights the importance of selecting appropriate fire suppression systems, and the importance of fire containment as part of resiliency. Most local regulation governing fire prevention and mitigation concentrates (rightly) on securing people’s safety, rather than on protecting assets. Data center operators, however, have other critically important issues to consider — including equipment protection, operational continuity, disaster recovery and mean time to recovery.

While gaseous (or clean agent) suppression is effective at slowing down the spread of a fire in the early stages of Li-ion cell failure (when coupled with early detection), it is arguably less suitable for handling a major thermal-runaway event. The cooling effects of water and foam mean these are likely to perform better; double-interlock pre-action sprinklers also limit the spread. Placing battery cabinets farther apart can help prevent or limit the spread of a major fire. Dividing battery rooms into fire-resistant compartments (a measure mandated by Uptime Institute’s Tier IV resiliency requirements) can further decrease the risk of facility-wide outages.

Such extensive fire prevention measures could, however, compromise the benefits of Li-ion batteries in terms of their higher volumetric energy density, lower cooling needs and overall advantage in lifespan costs (particularly where space is at a premium).

Advances in Li-ion chemistries and cell assembly will address some operational safety concerns — lithium iron phosphate, with its higher ignition point and lack of oxygen release during decomposition, being a case in point. Longer term, inherently safer chemistries — such as sodium-ion and nickel-zinc — will probably offer a more lasting solution to the safety (and sustainability) conundrum around Li-ion. Until then, the growing volume of Li-ion batteries deployed in data centers means the likelihood of violent fires can only grow — with potentially dire financial consequences.

By: Max Smolaks, Analyst, Uptime Institute Intelligence and Daniel Bizo, Research Director, Uptime Institute Intelligence

Tweak to AWS Outposts reflects demand for greater cloud autonomy

Amazon Web Services (AWS) has made a minor change to its private-cloud appliance, AWS Outposts, that could significantly impact resiliency. The cloud provider has enabled local access to cloud administration, removing the appliance’s reliance on the public cloud. In the event of a network failure between the public cloud and the user’s data center, the private-cloud container platform can still be configured and maintained.

Many public-cloud providers have extended their offerings so that their services are accessible through the user’s own choice of data center. Services are typically billed in the same way as in the public cloud, and accessed through the same portal and software interfaces, but are delivered from hardware and software hosted in the user’s own facility. Such services are in demand from customers seeking to meet compliance or data protection requirements, or to improve the end-user experience through lower latency.

In one business model, the cloud provider ships a server-storage private-cloud appliance to an organization’s data center. The organization manages the data center. The public-cloud provider is responsible for the hardware and middleware that delivers the cloud functionality.

The term “private cloud” describes a cloud platform where the user has access to elements of the platform not usually accessible in the public cloud (such as the data center facility, hardware and middleware). These appliances are a particular type of private cloud, not designed to be operated independently of the public cloud. They are best thought of as extensions of the public cloud to the on-premises data center (or colocation facility) since administration and software maintenance is performed via the public cloud.

As the public and private cloud use the same platform and application programming interfaces (APIs), applications can be built across the organization’s and the cloud provider’s data centers, and the platform can be managed as one. For more information on private-cloud appliances, see the Uptime Institute Intelligence report Cloud scalability and resiliency from first principles.

The resilience of this architecture has not, hitherto, been assured, because the application still relies on the cloud provider’s ability to manage some services, such as the management interface. The public-cloud provider controls the interface for interacting with the user’s on-premises cloud (the “control plane”); if that interface goes down, so too does the ability to administer the on-premises cloud.

Ironically, it is precisely during an outage that an administrator is most likely to want to make such changes to configuration — to reserve capacity for mission-critical workloads or to reprioritize applications to handle the loss of public-cloud capacity, for example. If an AWS Outpost appliance were being used in a factory to support manufacturing machinery, for instance, the inability to configure local capabilities during a network failure could significantly affect production.

It is for this reason that AWS’s announcement that its Elastic Kubernetes Service product (Amazon EKS) can be managed locally on AWS Outposts is important. Kubernetes is a platform used to manage containers. This new capability allows users to configure API endpoints on the AWS Outposts appliance, meaning the container configuration can be changed via the local network without connecting to the public cloud.

In practical terms, this addition makes AWS Outposts more resilient to outages because it can function in the event of a connectivity failure between the cloud provider and the data center. AWS Outposts is now far more feasible as a disaster-recovery or failover location, and more appropriate for edge locations, where connectivity might be less assured.

The most important aspect of this development, however, is that it indicates AWS — the largest cloud provider — is perhaps acknowledging that users don’t just want an extension of the public cloud to their own facilities. Although many organizations are pursuing a hybrid-cloud approach, where public and private cloud platforms can work together, they don’t want to sacrifice the autonomy of each of those environments.

Organizations want venues to work independently of each other if required, avoiding single points of failure. To address this desire, other AWS Outposts services may be made locally configurable over time as users demand autonomy and greater control over their cloud applications.

Why are governments investigating cloud competitiveness?

In any market, fewer sellers or providers typically results in less choice for buyers. Where the number of sellers is very low this could, theoretically, lead to exploitation, through higher prices or lower-quality goods and services — with buyers having no choice but to accept such terms.

Three hyperscale cloud providers — Amazon Web Services, Google Cloud and Microsoft Azure — have become dominant throughout most of the world. This has triggered investigations by some governments to check that limited competition is not impacting customers.

The UK government’s Office of Communications’ (Ofcom’s) Cloud services market study is intended to investigate the role played by these “cloud provider hyperscalers” in the country’s £15 billion public cloud services market. Ofcom’s objective, specifically, is to understand the strength of competition in the market and to investigate whether the dominance of these hyperscalers is limiting growth and innovation.

Although there is a debate about the cost and strategic implications of moving core workloads to the cloud, competition among cloud provider hyperscalers, so far, seems to be good for users: recent inflation-driven increases notwithstanding, prices have generally decreased (across all providers) over the past few years. Apart from the hyperscalers, users can procure cloud services from local providers (and established brands), colocation providers and private cloud vendors. The cloud provider hyperscalers continue to develop innovative products, sold for pennies per hour through the pay-as-you-go pricing model and accessible to anyone with a credit card.

However, Ofcom is concerned. It cites research from Synergy Research Group showing that the combined market share of the hyperscalers is growing at the expense of smaller providers (at a rate of 3% per year) with the hyperscalers’ UK market share now standing at over 80%. As discussed in Uptime Institute Intelligence’s Cloud scalability and resiliency from first principles report, vendor lock-in can make it harder for users to change cloud providers to find a better deal.

The Herfindahl-Hirschman Index (HHI) is commonly used to assess market competitiveness on the basis of market share. A market with an HHI of over 2,500 suggests a limited number of companies have significant power to control market prices — a “high concentration.” The UK cloud services market is estimated to have an HHI of over 2,900. Given the global HHI of 1,600 for this sector, the UK’s high value validates the need for the Ofcom investigation.
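As a hypothetical illustration (the market shares below are invented for the arithmetic and are not Ofcom’s figures), the HHI is simply the sum of the squares of each competitor’s percentage market share:

```python
def hhi(shares_pct):
    """Herfindahl-Hirschman Index: sum of squared percentage market shares.

    Ranges from near 0 (many tiny competitors) up to 10,000 (a monopoly
    holding 100% of the market). Values above 2,500 are conventionally
    treated as indicating high concentration.
    """
    return sum(s ** 2 for s in shares_pct)

# Invented example: three large providers at 40%, 25% and 20%,
# with the remaining 15% split among small players.
concentrated = [40, 25, 20, 5, 5, 5]
print(hhi(concentrated))  # 2700 — above the 2,500 "high concentration" line

# Four equal competitors at 25% each sit exactly at the threshold.
print(hhi([25, 25, 25, 25]))  # 2500
```

A market such as the UK’s, where a handful of hyperscalers hold over 80% between them, will therefore score well above the 2,500 threshold even before the long tail of small providers is counted.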

Such a high market concentration isn’t necessarily a problem, however, if competing companies keep prices low while offering innovative products and services to a large population. A high concentration is only problematic if the cloud providers are in a stalemate (or worse, in collusion) — not cutting prices, not releasing new products, and not fighting to win each other’s customers. UK law prevents cloud providers from colluding to fix prices or restrict competition. But with so few competitors, such anti-competitive behavior might emerge accidentally (although there are few — if any — signs of such a stalemate so far).

The most intriguing part of Ofcom’s study will be its recommendations on how to make the market more competitive. Unless Ofcom can find evidence of anti-competitive behavior, there may be very little it can do to help smaller players compete, apart from limiting the hyperscalers’ ambitions, through regulation or divestiture. Outward signs are that cloud providers have come to dominate the market by providing users with the services they expect, at a price they’re willing to pay, rather than through any nefarious means.

Hyperscale cloud providers require colossal capital, substantial and cutting-edge expertise, and global-scale efficiency investments — all of which means they can cut prices, over time, while expanding into new markets and releasing new products. The hyperscalers themselves have not created the significant barrier to entry faced by smaller players in attempting to compete here: that barrier exists because of the sheer scale of operations fundamental to cloud computing’s raison d’être.

In most countries, competition authorities — or governments generally — have limited ability to help smaller providers overcome this barrier, whether through investment or support. In the case of the UK, Ofcom’s only option is to restrict the dominance of the hyperscalers.

One option open to competition authorities would be regulating cloud prices by setting price caps, or by forcing providers to pass on cost savings. But price regulation only makes sense if prices are going up and users have no alternatives. Many users of cloud services have seen prices come down; they are, in any case, at liberty to use noncloud infrastructure if providers are not delivering good value.

Ofcom (and other regulators) could, alternatively, enforce the divestment of hyperscalers’ assets. But breaking up a cloud provider on the basis of the products and services offered would penalize those users looking for integrated services from a single source. It would also be an extremely bold and highly controversial step that the UK government would be unlikely to undertake without wider political consensus. In the US, there is bipartisan support for an investigation into tech giant market power, which could provide that impetus.

Regulators could also legislate to force suppliers to offer greater support in migrating services between cloud providers: but this could stifle innovation, with providers unable to develop differentiated features that might not work elsewhere. Theoretically, a government could even nationalize a major cloud provider (although this is highly unlikely).

Given the high concentration of this market, Ofcom’s interest in conducting an investigation is understandable: while there is limited evidence to date, there could be anti-competitive factors at play that are not immediately obvious to customers. Ofcom’s study may well not uncover many competitive concerns at the moment, but it might, equally, focus attention on the nation’s over-reliance on a limited number of cloud providers in the years ahead.

In this Note, we have focused purely on Amazon’s, Google’s and Microsoft’s cloud infrastructure businesses (Amazon Web Services, Google Cloud and Microsoft Azure). But these tech giants also provide many other products and services in many markets, each of which has different levels of competitiveness.

Microsoft, for example, has recently been pressured into making changes to its software licensing terms following complaints from EU regulators and European cloud providers (including Aruba, NextCloud and OVHcloud). These regulators and cloud providers argue that Microsoft has an unfair advantage in delivering cloud services (via its Azure cloud), given it owns the underlying operating system. Microsoft, they claim, could potentially price its cloud competitors out of the market by increasing its software licensing fees.

As their market power continues to increase, these tech giants will continue to face antitrust regulation and lawsuits in some, or many, of these markets. In the UK, how far Ofcom will investigate the hyperscalers’ impact in particular subsectors, such as retail, mobile, operating systems and internet search, is yet to be seen.