Accountability – the “new” imperative

Outsourcing the requirement to own and operate data center capacity is the cornerstone of many digital transformation strategies, with almost every large enterprise spreading their workloads across their own data centers, colocation sites and public cloud. But ask any regulator, any chief executive, any customer: You can’t outsource responsibility — for incidents, outages, security breaches or even, in the years ahead, carbon emissions.

Chief information officers, chief technology officers and other operational heads knew this three or four decades ago (and many have learned the hard way since). That is why data centers became physical and logical fortresses, and why almost every component and electrical circuit has some level of redundancy.

In 2021, senior executives will grapple with a new iteration of the accountability imperative. Even the most cautious enterprises now want to make more use of the public cloud, while the use of private clouds is enabling greater choice of third-party venues and IT architectures. But this creates a problem: cloud service operators, software-as-a-service (SaaS) providers and even some colos are rarely fully accountable or transparent about their shortcomings — and they certainly do not expect to be held financially accountable for the consequences of failures. Investors, regulators, customers and partners, meanwhile, want more oversight, more transparency and, where possible, more accountability.

This is forcing many organizations to take a hard look at which workloads can be safely moved to the cloud and which cannot. For some, such as the European financial services sector, regulators will require an assessment of the criticality of workloads — a trend that is likely to spread to other sectors over time. The most critical applications and services will either have to stay in-house, or enterprise executives will need to satisfy themselves and their regulators that these services are run well by a third-party provider, and that they have full visibility into the operational practices and technical infrastructure of that provider.

The data suggests this is a critical period in the development of IT governance. The shift of enterprise IT workloads from on-premises data centers to cloud and hosted services is well underway. But there is a long way to go, and some of the issues around transparency and accountability have arisen only recently, as more critical and sensitive data and functionality are considered for migration to the cloud.

The first tranche of workloads moving to third parties often did not include the most critical or sensitive services. For many organizations, a public cloud is (or was initially) the venue of choice for specific types of workloads, such as application test and development; big-data processing, such as AI; and new applications that are cloud-native. But as more IT departments become familiar with cloud providers’ tool sets, such as those for application development and deployment orchestration, a wider range of workloads has begun moving into public clouds, with more critical applications to follow (or perhaps not). High-profile, expensive public cloud outages, increased regulatory pressures and an increasingly uncertain macroeconomic outlook will force many enterprises to assess — or reassess — where workloads should actually be running (a process that has been called “The Big Sort”).

Uptime Institute believes that many mission-critical workloads are likely to remain in on-premises or colo data centers — at least for many years to come: More than 70% of IT and critical infrastructure operators we surveyed in 2020 do not put any critical workloads in a public cloud, with over a quarter of this group (21% of the total sample) saying the reason is a lack of visibility/accountability about resiliency. And over a third of those who do place critical applications in a public cloud also say they do not have enough visibility (see chart below). Clearly, providers’ assurances of availability and of adherence to best practices are not enough for mission-critical workloads. (These results were almost identical when we asked the same question in our 2019 annual survey.)



The issues of transparency, reporting and governance are likely to ripple through the cloud, SaaS and hosting industries, as customers seek assurances of excellence in operations — especially when financial penalties for failures by third parties are extremely light. While even the largest cloud and internet application providers operate mostly concurrently maintainable facilities, experience has shown that unaudited (“mark your own homework”) assurances frequently lead to poor outcomes.

Creeping criticality

There is an added complication. While the definitions and requirements of criticality in IT are dictated by business requirements, they are not fixed in time. Demand patterns and growing IT dependency mean many workloads/services have become more critical — but the infrastructure and processes supporting them may not have been updated (“creeping criticality”). This is a particular concern for workloads subject to regulatory compliance (“compliance drift”).

COVID-19 may have already caused a reassessment of the criticality or risk profile of IT; extreme weather may prompt another. When Uptime Institute recently asked over 250 on-premises and colo data center managers how the pandemic would change their operations, two-thirds said they expect to increase the resiliency of their core data center(s) in the years ahead. Many said they expected their costs to increase as a result. One large public cloud company recently asked its leased data center providers to upgrade their facilities to N+1 redundancy, if they were not already at that level.

But even before the pandemic, there was a trend toward higher levels of redundancy for on-premises data centers. There is also an increase in the use of active-active availability zones, especially as more workloads are designed using cloud or microservices architectures. Workloads are more portable, and instances are more easily copied than in the past. But we see no signs that this is diminishing the need for site-level resiliency.

Colos are well-positioned to provide both site-level resiliency (which is transparent and auditable) and outsourced IT services, such as hosted private clouds. We expect more colos will offer a wider range of IT services, in addition to interconnections, to meet the risk (and visibility) requirements of more mission-critical workloads. The industry, it seems, has concluded that more resiliency at every level is the least risky approach — even if it means some extra expense and duplication of effort.

Uptime Institute expects that the number of enterprise (privately owned/on-premises) data centers will continue to dwindle but that enterprise investment in site-level resiliency will increase (as will investment in data-driven operations). Data centers that remain in enterprise ownership will likely receive more investment and continue to be run to the highest standards.


The full report Five data center trends for 2021 is available to members of the Uptime Institute Inside Track community here.

Five Trends for 2021: accountability, automation, edge, sustainability, innovation

What can we expect for mission-critical digital infrastructure in 2021?

Each autumn Uptime Institute, like many other organizations, puts together a list of some of the big trends and themes for the year ahead. This time, we have focused on five big trends that might not have been so obvious 12 months ago.

Heading into 2021, even amid a macroeconomic downturn, the critical digital infrastructure sector continues to expand and to attract enviable levels of new investment. The ongoing build-out of new data centers and networks is largely being driven by cloud, hosted, and “as-a-service” workloads, as more enterprises seek to outsource more of their IT and/or data center capacity. However, for many managers, the COVID-19 pandemic has forced a reassessment — of working practices and, in particular, of risk. The global economy’s dependence on IT is growing, and this is catching the attention of an increasing number of customers, governments and watchdogs.

The coming year (and beyond) also holds new opportunities: Edge computing, artificial intelligence (AI) and innovations in hardware and software technologies promise greater efficiency and agility.

Here are Uptime Institute’s five trends for 2021, summarized:

1. Accountability — the “new” imperative

Enterprises want more cloud and greater agility, but they can’t outsource responsibility — for incidents, outages, security breaches or even, in the years ahead, carbon emissions. In 2021, hybrid IT, with workloads running in both on- and off-premises data centers, will continue to dominate, but investments will increasingly be constrained and shaped by the need for more transparency, oversight and accountability. More will be spent on cloud and other services, as well as in on-premises data centers.

2. Smarter, darker data centers

Following a scramble to staff data centers effectively during the pandemic, many wary managers are beginning to see remote monitoring and automation systems in a more positive light, including those driven by AI. An adoption cycle that has been slow and cautious will accelerate. But it will take more than just investment in software and services before the technology reduces staffing requirements.

3. Edge — the next frontier

Significant new demand for edge computing, fueled by technologies such as 5G, the internet of things and AI, is likely to build slowly — but the infrastructure preparation is underway. Expect new alliances and investments across enterprise, mobile and wireline networks, and for a wide range of edge data centers, small and large. Smart and automated software-defined networks and interconnections will become as important as the physical infrastructure.

4. Sustainability: More challenging, more transparent

For years, operators could claim environmental advances based on small, incremental and relatively inexpensive steps — or by adopting new technologies that would pay for themselves anyway. But the time of easy wins and greenwashing is ending: Regulators, watchdogs, customers and others will increasingly expect operators of digital infrastructure to provide hard and detailed evidence of carbon reductions, water savings and significant power savings — all while maintaining, if not improving, resiliency.

5. A surge of innovation

Data center operators (and enterprise IT) are mostly cautious, if not late, adopters of new technologies. Few beyond hyperscale operators can claim to have gained a competitive advantage through technology. However, several new technologies are maturing at the same time, promising advances in the performance and manageability of data centers and IT. Storage-class memory, silicon photonics, ARM servers and software-defined power are ready for greater adoption.

This is a summary of the full report on new Trends for 2021. The full report is available to members of the Uptime Institute. Click to learn more.

Rack Density is Rising


The power density per rack (kilowatts [kW] per cabinet) is a critical number in data center design, capacity planning, and cooling and power provisioning. There have been industry warnings about a meteoric rise in IT equipment rack power density for the past decade (at least). One reason for this prediction is the proliferation of compute-intensive workloads (e.g., AI, IoT, cryptocurrencies, and augmented and virtual reality), all of which drive the need for high-density racks.

Our recent annual surveys found that racks with densities of 20 kW and higher are becoming a reality for many data centers (we asked about highest rack density) — but not to the degree predicted. Year over year, most respondents said their highest-density racks were in the 10-19 kW range, which is not enough to merit wholesale technical changes. When rack densities rise above 20-25 kW, direct liquid cooling and precision air cooling become more economical and efficient. But from what we see in the field, such high densities are not yet pervasive enough to have an impact on most data centers.

This does not mean that the trend should be ignored. It is clear from our latest research that mean rack density in data centers is rising steadily, as the figure below shows. Excluding respondents reporting densities above 30 kW as high-performance outliers, the mean density in our 2020 survey sample was 8.4 kW/rack. This is consistent with other industry estimates and safely within the provisioned range of most facilities.


In our 2020 survey, we asked about the most common (modal average) server rack density, which is perhaps a better metric than the overall average. More than two-thirds (71%) reported a modal average of below 10 kW/rack, with just 16% widely deploying rack densities of 20 kW or higher (Figure 7). The most common density band was 5-9 kW/rack. Overprovisioning of power and cooling is probably a more common issue than underprovisioning caused by rising rack densities.

Figure 7: Modal average rack power density
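
As an illustrative aside, here is a minimal sketch of how figures like these (a mean that excludes outliers, and the most common density band) could be derived from raw survey responses. The response values below are hypothetical, not Uptime Institute data.

    # Hypothetical sketch: deriving a trimmed mean and a modal density band
    # from raw survey responses (the sample values below are made up).
    from collections import Counter

    responses_kw = [3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 35]  # reported kW/rack (hypothetical)

    # Mean density, excluding responses above 30 kW as high-performance outliers
    trimmed = [kw for kw in responses_kw if kw <= 30]
    mean_density = sum(trimmed) / len(trimmed)

    def band(kw):
        """Assign a response to a reporting band."""
        if kw < 5:
            return "<5 kW"
        if kw < 10:
            return "5-9 kW"
        if kw < 20:
            return "10-19 kW"
        return "20+ kW"

    # Modal band: the density band reported most often
    modal_band, _ = Counter(band(kw) for kw in responses_kw).most_common(1)[0]

    print(f"Trimmed mean: {mean_density:.1f} kW/rack; most common band: {modal_band}")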

Assuming racks will increasingly be filled with higher-powered servers that are well utilized, we anticipate that the modal average kW/rack will increase over time. Figure 8 shows that for a large share of organizations — roughly half of those surveyed — average density is increasing, albeit only slowly.

Figure 8: Rack density is changing

We expect density to keep rising. Our research shows that the use of virtualization and software containers pushes IT utilization up, in turn requiring more power and cooling. With Moore’s law slowing, further performance gains increasingly come from adding processor cores, which can mean more power consumption per operation, especially if utilization is low. Even setting aside new workloads, rising density can be regarded as a long-term trend.
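
A rough, hypothetical illustration of the utilization point, using a simple linear power model (the idle and peak wattages below are assumptions, not measured figures):

    # Hypothetical illustration: energy per unit of work rises sharply at low utilization,
    # because idle power is spread over fewer operations. All figures are assumed.

    IDLE_W = 150.0   # assumed server power draw at idle (watts)
    MAX_W = 450.0    # assumed server power draw at full load (watts)

    def power_at(utilization: float) -> float:
        """Very rough linear power model between idle and full load."""
        return IDLE_W + (MAX_W - IDLE_W) * utilization

    def relative_energy_per_op(utilization: float) -> float:
        """Power divided by useful work, with work taken as proportional to utilization."""
        return power_at(utilization) / utilization

    for u in (0.2, 0.5, 0.8):
        print(f"{u:.0%} utilization: relative energy per operation = {relative_energy_per_op(u):.0f}")
    # At 20% utilization, energy per operation is roughly double that at 80%.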

But, as our 2020 survey findings demonstrate, the expected industry-wide shift to 20 kW racks has not yet materialized. We believe that many compute-intensive workloads — those that will significantly push up power use, rack density and heat — currently reside in a relatively small group of hyperscale cloud data centers and are consumed by organizations as a service.

Want to know more about this and other data center trends and strategies? Download a copy of our complete 2020 survey.

Common Factors for IT venue selection (Choosing In-house or Outsourced)

Data center capacity is rapidly expanding in outsourced, third-party IT venues, such as colocation data centers and public cloud. Whether measured in megawatts (MW) of uninterruptible power supply capacity or IT load, or by some other measure (such as square feet of white space or leased space, or units of compute or storage), overall capacity is growing rapidly.

There are many factors driving this growth, including newer cloud-based services, such as social media and streaming; mobile applications and services; new enterprise services and applications; the migration of corporate workloads into colocation sites; and the adoption of software as a service (SaaS) and public cloud platforms.

Does this mean that almost all IT workloads will end up running in a third-party data center? In a word, no: not for a long time, and in some cases never. In the 2020 Uptime Institute survey, as in years past, we asked respondents to estimate, by percentage, how much of their workload/data is processed/stored in different types of data centers today and how this might look in two years’ time.

The majority (58%) said that most of their workloads run in corporate data centers today — that is, enterprise-owned, on-premises facilities. And if you add in micro data centers and server closets, that figure rises to 69%!

These findings are similar to those of our previous years’ surveys. They confirm Uptime Institute’s view that the enterprise-owned data center sector, while not necessarily the most innovative, will continue to be the foundation of enterprise IT for the next decade. In our survey, nearly two-thirds of IT workloads are expected to be running in privately owned environments (large data centers, server closets and micro data centers) by 2022, with the remainder contracted to external suppliers. Although the enterprise data center sector is falling as a percentage of the whole, the absolute amount of enterprise data center capacity is still growing.

A mix of factors typically drives enterprise demand for third-party IT venues; similarly, multiple factors drive demand for on-premises data centers. Often, some combination of the factors listed below comes into play when deciding the best execution venue for workloads.

COMMON DRIVERS FOR IT VENUE SELECTION DECISIONS

Choosing Outsourced:

Cost: Outsourcing can lower costs in the short to medium term. For organizations “born” in a public cloud or colo, it typically is cost-prohibitive to move to an enterprise data center.

Cost allocation: Outsourcing shifts cost allocations from capex toward more repeatable opex models.

IT agility and flexibility: Outsourcing provides the ability to readily and quickly adapt to changing capacity needs without the burden of managing the full stack of IT and applications; IT can be used for a project’s duration only (e.g., for test and development).

Access to resources: Third parties may provide access to a wider range of resources, including technology, interconnections, software tools, services and application environments.

Security: Third parties can offer the most advanced, highly resourced security features.

 

Choosing on-premises:

Cost: Ownership delivers total cost of ownership benefits over the long term; in the shorter term, owners avoid the data transport costs of moving to an outsourced venue.

Governance: On-premises environments may be necessary for compliance with data governance and regulatory requirements.

Control: Owners can closely monitor and control factors such as latency, availability and application performance. While most outsourced venues are strong in these areas, service level agreements vary and are limited.

Risk: Ownership ensures full visibility into (and the ability to adjust) the risk profile of every workload.

Security: Ownership provides the ability to maintain control and governance (dedicated rather than shared physical infrastructure) over security features.

Want to know more about venue selection, data center staffing and skills, latest technologies in use and the cloud, risk and cost management, operational strategies and more? Check out our full 2020 Annual Survey.

COVID-19, air filtration and energy use

The COVID-19 pandemic has caused concerns about data center HVAC (heating, ventilation and air conditioning) filtration. As a result, many data center operators are adjusting their filtration protocols, including upgrading to finer MERV (minimum efficiency reporting value) 13 filters, to better filter out aerosols that can carry the COVID-19 virus.1 But there is no free lunch. Depending on the design of the data center’s HVAC system, the increase in filtration static pressure can slightly increase energy consumption.

Technically, with a PSC (permanent split capacitor) blower motor, extra static pressure from this denser filter would result in lower airflow, reduced system performance and less air distribution. With ECMs (electronically commutated motors) — the type of motor most commonly found in data center cooling systems — this extra static pressure can result in higher blower motor power consumption.

Recently, an Uptime Institute member reported an 8% increase in energy use following a filtration upgrade from a MERV 8 to a denser MERV 13 filter, due to the increased fan energy required to maintain static pressure and airflow. Since fan power accounts for 7-10% of their total data center energy use, any increase in fan energy affects their PUE (power usage effectiveness).
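
As a rough sketch of how an increase in fan energy feeds through to PUE, assuming (for illustration only, not the member’s actual figures) a starting PUE of 1.6 and fans accounting for 8% of total facility energy:

    # Illustrative only: propagate a fan energy increase into PUE.
    # Assumed inputs: starting PUE of 1.6, fans drawing 8% of total facility
    # energy, and fan energy rising by 8%.

    def pue_after_fan_increase(pue: float, fan_share: float, fan_increase: float) -> float:
        """New PUE when fan energy (a fraction `fan_share` of total energy) rises by `fan_increase`.

        IT load is assumed unchanged, so PUE scales with total facility energy.
        """
        return pue * (1.0 + fan_share * fan_increase)

    new_pue = pue_after_fan_increase(pue=1.6, fan_share=0.08, fan_increase=0.08)
    print(f"PUE: 1.600 -> {new_pue:.3f}")  # about 1.610, a roughly 0.6% increase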

But this single data point may not be representative of all environments, since the quality and quantity of the media in the filter and the configuration of the HVAC system can significantly affect static pressure and airflow. Denser MERV 13 filters won’t necessarily require more fan energy if the operator installs filters that increase surface area by having more and deeper pleats.2

There is a direct correlation between airflow in CFM (cubic feet per minute) and initial resistance, measured in IWC (inches of water column), so some HVAC adjustments might be needed (see chart below).2


 

Source: Springer, David. “Is There a Downside to High-MERV Filters? The new high-MERV filters extract an energy penalty.” Home Energy: The Home Performance Magazine, November 2, 2009. http://www.homeenergy.org/show/article/nav/issues/page/4/id/667

Some guidance is in order: When choosing a filter, remember that airflow (static pressure increase) is just as important to consider as filtration (MERV rating). Just because a filter fits doesn’t mean it’s the best filter for the system, so it’s very important to check the rating of the facility’s filter and the design specifications of the cooling system.

Many data center cooling systems use an airflow rate between 330–490 CFM, so the system would experience 0.10-0.16 inches of water column of initial resistance (static pressure increase) with the MERV 11 filter example shown in the image below.3


 

 


Typical data center underfloor static pressure is 20-25 pascals, or 0.08-0.1 inches of water column. Therefore, while this filter would introduce some additional static pressure, it pales in comparison to the 220 pascal static pressure required to move air through the cooling coils in the CRAC (computer room air conditioning) units, or the pressure caused by the impingement and turning of cooled air in the subfloor, which can be as high as 200 pascals.4
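
To put the mixed units in this section on a common footing, here is a small conversion sketch (using 1 inch of water column ≈ 249 pascals):

    # Static pressure unit conversions used in this section (1 inH2O is roughly 249 Pa).
    PA_PER_INH2O = 249.0

    def pa_to_inh2o(pa: float) -> float:
        return pa / PA_PER_INH2O

    def inh2o_to_pa(inh2o: float) -> float:
        return inh2o * PA_PER_INH2O

    print(f"20-25 Pa underfloor pressure  = {pa_to_inh2o(20):.2f}-{pa_to_inh2o(25):.2f} inH2O")
    print(f"0.10-0.16 inH2O filter drop   = {inh2o_to_pa(0.10):.0f}-{inh2o_to_pa(0.16):.0f} Pa")
    print(f"220 Pa cooling-coil pressure  = {pa_to_inh2o(220):.2f} inH2O")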

With careful airflow planning, there should be minimal airflow and pressure impacts from denser filters, such as MERV 13. These finer filters, though, will need to be changed more frequently, since they trap more particulates and so will clog sooner. In addition to the higher change frequency, budget for higher per-unit costs, as filters with larger surface area media will likely be more expensive than smaller, less efficient filters.

But all things considered, given the modest energy trade-offs and the improved particulate and virus filtering they provide, MERV 13 filters are quickly becoming the new normal.


Endnotes

1 https://www.ashrae.org/technical-resources/commercial#general
2 Springer, David. “Is There a Downside to High-MERV Filters? The new high-MERV filters extract an energy penalty.” Home Energy: The Home Performance Magazine, November 2, 2009. http://www.homeenergy.org/show/article/nav/issues/page/4/id/667
3 https://www.hvacrschool.com/air-filter-static-pressure-drop/
4 http://asmedigitalcollection.asme.org/heattransfer/article-pdf/132/7/073001/5794918/073001_1.pdf

Why ASHRAE is concerned about edge data centers

There are few organizations that have had as big an impact on data center design as ASHRAE — and specifically, its Technical Committee (TC) 9.9. ASHRAE’s 2004 publication Thermal Guidelines for Data Processing Environments (now in its fourth edition) described the optimal environmental operating conditions for the electronics in a data center, and in doing so, effectively established some of the key design goals that have been followed by almost every data center builder and operator for the past 15 years.

In October 2020, ASHRAE published what it describes as a “groundbreaking” Technical Bulletin — this time looking specifically at edge data centers. The Bulletin is a lighter, shorter document than most previous TC9.9 publications, and it does not signal any changes in the thermal operating guidelines that remain so important. Nor does it reveal any significant new findings — even though TC9.9 members do have access to a lot of data and tests conducted by equipment manufacturers.

But its brevity does not imply that it lacks importance. The ASHRAE committee members want to send a warning: unless careful processes and guidelines are followed, and good technical choices made, there will be a significantly increased risk of electronic equipment failure in many edge data centers. And given that a lot more processing will be done at the edge in the next decade or so, that means there could be more failures and service disruptions in the future than in the past.

In the document, and in separate conversations with Uptime Institute (which has members on the TC9.9 committee), ASHRAE TC9.9 members identify two major issues as they relate to edge data centers.

First, the simple fact is that almost all “core” enterprise, large, and colocation data centers have a strictly climate-controlled white space. Small edge data centers, however, cannot always maintain this — especially during maintenance. Edge data center doors may be open during maintenance, exposing equipment to rapid temperature change, rain and pollution. Cooling may be turned off during maintenance, leading to rapid heat rise; routine or urgent maintenance may not be possible or timely; and edge environments are more likely to have a higher density and to be hotter, colder, wetter, dustier or more insecure than big, remote, well-managed facilities.

A second issue is that a lot of the equipment on the market has been designed for use in large data centers where failure is more easily tolerated — or even deliberately allowed, as part of an overall “abandon in place” strategy. Put bluntly, the quality of IT equipment is no longer required to be as high as it used to be, because it has been designed to be quickly swapped out. Some leading suppliers no longer do the systematic, lifecycle-based testing that was once a necessity.

What can be done?

The first place to start is to avoid poor site selection — but that, of course, is not always an option for small edge data centers, which need to be near the point of use. Ideally, the actual data center (which may be prefabricated) should be designed to allow ease of maintenance without exposure to outside conditions, to minimize damage (such as to batteries and IT equipment). A data center infrastructure management system that remotely monitors key environmental variables and IT performance will help prevent failures. It also makes sense to design with some concurrent maintainability/fault tolerance, if the design goals/budgets allow.

Other key points of advice are:

  • Protect the edge facility by servicing only during moderate weather or when using a mantrap or tent, to avoid potentially deleterious conditions. Even time of day can be a factor (e.g., humidity is often higher in the morning).
  • Monitor humidity, condensation and temperature during servicing.
  • Monitor the rate of temperature or humidity change when doors are opened.
  • Beware of local particulates/pollution, such as ocean spray, dust, industrial pollutants and heavy vehicle exhaust.
  • Ensure air filtration meets the necessary standards (MERV 11 or MERV 13).
  • Beware the effect of gaseous pollutants that cause corrosion (e.g., of components containing copper or silver).
  • Use remote monitoring where possible to track corrosion rate, filter performance, etc.

The key takeaway from this ASHRAE Bulletin is that it is far more difficult (and possibly more expensive) to control virtually all environmental variables in edge data centers. This will almost certainly lead to more failures. Vigilance and good practices can improve early detection, reduce the likelihood of failures and mitigate their impact.

The full document Edge Computing: Considerations for Reliable Operation is available on the ASHRAE website.