Renewed Pressure for Renewables to Power the Data Center

In a recent presentation at the Energy Smart data center conference in Stockholm, Gary Cook, the Greenpeace activist who has tracked data center carbon emissions for a decade, showed a slide of logos representing companies that have committed to using 100 percent renewable energy for their IT. Cook showed that the commitment started with big-brand, consumer-facing IT companies (such as Google and Apple), then spread to big data center operators (such as Equinix and Digital Realty), and is now being adopted by business-to-business companies such as HP Enterprise.

Our research supports Cook’s view that this small cluster of logos will grow into a forest in the years ahead, with a surge of renewed enthusiasm coming from top-level executives. The reason is not altruistic: corporate leaders, investors and shareholders are exerting increasing pressure on enterprises to actively address climate change risk, better manage natural resources such as water, and become more energy efficient.

At present, data center operators may not be heavily exposed to the effects of this top-level interest in climate change, but Uptime Institute advises them to prepare for more interest and more pressure.

Financial pressure is one big reason: According to The Forum for Sustainable and Responsible Investment, a U.S.-based membership association formed to advance sustainable, responsible and impact investing, the amount of funds invested by money managers that incorporate environmental, social and governance (ESG) criteria increased from $8.1 trillion in 2016 to $11.6 trillion in 2018 (see chart below).

Source: The Forum for Sustainable and Responsible Investment, 2018

C-level executives have little choice but to prioritize company objectives and allocate funds in response to these increased investor calls for climate change and sustainability efforts — it could affect the share price. Whether the motivation is altruistic or financial, the investments are being made: In a recent report, Schneider Electric states that companies spend more than $450 billion on energy efficiency and sustainability initiatives, and that 63 percent of Fortune 100 companies have set one or more clean energy targets.

There is some evidence, although not conclusive, that companies that commit themselves to time-bound greenhouse gas emission reduction targets outperform other companies on the financial markets. This may be due to better management in the first place, the push for efficiency or access to more capital. In recent years, Ceres, the MIT Sloan Management Review (a Massachusetts Institute of Technology publication that covers management practices) and The Boston Consulting Group have all drawn similar conclusions linking a commitment to ESG with improved revenues and share prices.

Schneider Electric took note of this investment trend in its 2019 Corporate Energy & Sustainability Progress Report, which it discussed in a recent webinar. Schneider reported that 42 percent of enterprises have customer/investor relations in mind when they publicly commit to energy- and carbon-reduction initiatives, only slightly trailing environmental concerns (44 percent).

Source: Schneider Electric, 2019

In recent weeks, no fewer than four data center operators in Sweden, Singapore, France and the U.S. have told us about the growing importance of reducing energy use and carbon emissions. There is a resurgence in green thinking, often coming from top management. These changes will eventually reach many others in IT and data center operations, requiring them to improve their environmental and sustainability performance and to reduce risk.

——————————————————————————–

For more information on the renewed interest in, and the fiscal pressure on, companies to adopt cleaner energy strategies for their infrastructure, join the Uptime Institute Network. Members enjoy a continuous stream of relevant and actionable knowledge from our analysts and share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.

For most, virtualization reduces data center capacity demand more than anything else

The public cloud is dampening demand for data center capacity and leading to closures, consolidation and a big rethink on data center ownership. Right? Not quite, according to the latest Uptime Intelligence research.

In enterprise and colocation data centers, we found that virtualization helps free up data center capacity more than any other technology or service, with public cloud and new server technologies coming some way behind. And despite this, participants in our research told us enterprise data center demand (especially for storage) is still rising.

In an April 2019 report by Uptime Institute, “Capacity planning in a complex, hybrid world”, we asked more than 250 C-level executives and data center and IT managers at enterprises globally which technologies have the highest impact on data center demand. Virtualization was cited by 51 percent and public cloud by only 32 percent. This was a surprise to us — we had expected cloud to have a greater impact.

The findings underline the power of virtualization, which is mostly adopted for other purposes (such as rapid provisioning) but helps push up server utilization and thus saves space, use of capital (IT) equipment, cooling and, of course, power. Some 40 percent of respondents said virtual machine (VM) compression, increasing the number of VMs per host server, is further reducing capacity demand.
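To make the arithmetic behind this concrete, the sketch below estimates how many physical servers a fixed set of workloads needs before and after consolidation onto virtualized hosts. The workload count, consolidation ratio and per-server power draw are illustrative assumptions, not figures from the survey.

```python
# Hypothetical sketch: estimate the capacity freed by consolidating workloads
# onto virtualized hosts. All figures are illustrative assumptions, not survey data.

def hosts_required(workloads: int, workloads_per_host: int) -> int:
    """Number of physical servers needed, rounding up partial hosts."""
    return -(-workloads // workloads_per_host)  # ceiling division

workloads = 400                 # assumed number of application workloads
bare_metal_per_host = 1         # one application per server before virtualization
vms_per_host = 20               # assumed consolidation ratio after virtualization
watts_per_host = 500            # assumed average power draw per server (W)

before = hosts_required(workloads, bare_metal_per_host)
after = hosts_required(workloads, vms_per_host)

print(f"Physical servers: {before} -> {after}")
print(f"Estimated IT load: {before * watts_per_host / 1000:.0f} kW -> "
      f"{after * watts_per_host / 1000:.0f} kW")
```

Even allowing for generous headroom per host, an order-of-magnitude reduction in server count (and with it space, capital equipment, cooling and power) is the effect respondents are describing.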

A warning to operators, however: The capacity benefits of virtualization, once realized, may be short-lived. One-third of respondents said that virtualization helped initially but is no longer a factor in reducing capacity demand. Levels of virtualization are now very high in many organizations — above 90 percent is common.

Some operators are adopting a form of virtualization known as application containers (‘containers’), the most common of which is Docker. Unlike VMs, containers do not require a dedicated, pre-provisioned guest operating system and, therefore, usually require less compute and memory capacity. Just 23 percent of respondents said they are using containers. About one-quarter of those using or considering containers expect them to reduce their physical server footprint further (or to offset growth).
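As a rough, hypothetical comparison of the two approaches, the sketch below estimates how many application instances fit on one host when each instance carries a full guest OS (a VM) versus sharing the host kernel (a container). All memory figures are assumptions for illustration only.

```python
# Back-of-envelope sketch of why containers can shrink the physical footprint:
# each VM carries a full guest OS, while containers share the host kernel.
# All capacity and overhead figures are illustrative assumptions only.

host_memory_gb = 256
app_memory_gb = 2              # assumed memory needed by one application instance
vm_guest_os_overhead_gb = 2    # assumed per-VM guest OS overhead
container_overhead_gb = 0.1    # assumed per-container runtime overhead

apps_per_host_as_vms = host_memory_gb // (app_memory_gb + vm_guest_os_overhead_gb)
apps_per_host_as_containers = int(host_memory_gb // (app_memory_gb + container_overhead_gb))

print(f"App instances per host as VMs:        {apps_per_host_as_vms}")
print(f"App instances per host as containers: {apps_per_host_as_containers}")
```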

——————————————————————————–

For more information on capacity planning and the role virtualization plays in strategic plans for essential IT service delivery, join the Uptime Institute Network. Members enjoy a continuous stream of relevant and actionable knowledge from our analysts and share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.

The Data Center Staffing and Skills Shortage is here NOW!

Sometimes it can be hard to get people to talk about their issues — other times, it can be hard to keep them quiet. A recent Uptime Institute Network member’s meeting began as an open discussion but was soon dominated by one issue: data center staffing.

The members’ concerns reflect the growing disquiet in the industry. Data centers are struggling to recruit and retain sufficient qualified staff to sustain and grow reliable operations. In Uptime Institute’s 2018 annual global survey of data center operators, over half of the respondents reported that they were either having difficulty finding candidates to fill open jobs or were having trouble retaining data center staff.

A tight labor market is exacerbating the issue: job vacancies in the United States hit a record high in December 2018, and the US is not the only country where the market favors job seekers. With a large number of experienced managers set to leave the workforce in the next decade or two, analysts now question whether labor shortages will prove a drag on growth. Data center operators have reported losing staff not only to other mission-critical industries, such as hospitals and utilities, but also to unexpected employers — even fairgrounds. Added to this is competition from hyperscale operators, which are luring experienced data center staff away with hard-to-resist salaries.

An aging workforce is of particular concern in the fast-growing IT/data center industry. Almost three-quarters of the respondents to our 2018 survey had more than 15 years of work experience, and more than a third had over 25 years’ experience.

Despite the need for more qualified workers, over half of respondents reported that women comprise less than six percent of their data center design, build or operations staff. But a majority (73 percent) felt that the lack of diversity was not a concern.

This may prove to be complacent. McKinsey’s longitudinal data on over 1,000 companies in 12 countries shows a significant correlation between diversity and business performance. And a large study (over 1,000 firms in 35 countries and 24 industries) recently profiled in the Harvard Business Review clarified two important questions about the impact of gender diversity on business performance: First, intention matters. Gender diversity yields benefits only in those industries that view inclusion as important — this may be an important issue for the data center sector to address. Second, the study distinguished cause and effect: Women weren’t just more attracted to high-performing companies; hiring more women led to better performance.

There are many strategies for attracting and keeping data center staff, but none will be a panacea. Watch for new Uptime Institute initiatives and research in the coming months, available to Uptime Institute Members.

——————————————————————————–

For more information on the staffing and skills needed for the data center, and the impact this growing concern is already having on operational execution, join the Uptime Institute Network. Members enjoy a continuous stream of relevant and actionable knowledge from our analysts and share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.

Is PUE actually going UP?

One of the more intriguing results of the Uptime Institute Global Data Center Survey 2019 concerned energy efficiency.  For years, data centers have become ever more efficient, with power usage effectiveness (PUE) ratings across the industry (apparently) falling.  Big operators, such as hyperscale cloud companies and big colos, regularly claim annual or design PUE figures between 1.1 and 1.4.  It is an industry success story — a response to both higher power prices and concerns about carbon dioxide emissions.

Uptime Institute has tracked industry average PUE numbers, at intervals, over 12 years (see figure below).  And this year, for the first time, there was no recorded improvement. In fact, energy efficiency deteriorated slightly, from an average PUE of 1.58 in 2018 to 1.67 in 2019 (lower is better). Can this really be right, and if so, how do we explain it?
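PUE is simply the ratio of total facility energy to the energy delivered to the IT equipment, so a value of 1.67 means roughly two-thirds as much energy again is spent on cooling, power distribution and other overheads as on the IT itself. The sketch below shows the calculation, with annual energy figures assumed purely to reproduce the 2019 survey average.

```python
# Minimal sketch of the PUE calculation: total facility energy divided by IT
# equipment energy. The kilowatt-hour figures are assumptions chosen only to
# reproduce the 2019 survey average of 1.67.

it_energy_kwh = 1_000_000        # assumed annual IT equipment energy
facility_energy_kwh = 1_670_000  # assumed annual total facility energy (IT + overhead)

pue = facility_energy_kwh / it_energy_kwh
print(f"PUE = {pue:.2f}")        # -> PUE = 1.67
```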

Has PUE Improvement Stalled?

The first question is, “Is the data good?” Our respondents are informed (data center operations staff and IT management from around the world) and our sample size for this topic was quite large (624) — those who didn’t know the answer were removed from the sample. While there may be a margin of error, the year-by-year data shows that improvements have flattened out. We can at least conclude that energy efficiency has stopped improving.

The number is also realistic. We know that most operators cannot match the energy efficiency of the finely tuned, aggressively efficient hyperscale data centers, nor indeed of newer, highly efficient colocation sites. As noted, in these sectors, PUE values of 1.1 to 1.4 are frequently claimed.

What explanations do we have? It is speculation, but we think several factors could have caused a slight, and probably temporary, halt in PUE improvements. For example, the higher and more extreme temperatures experienced in the last year in many parts of the world where data centers are situated could account for increased use of cooling and, hence, higher PUEs. Another factor is that utilization in many data centers — although certainly not in all — has fallen as certain workloads are moved to public cloud services. This means more data centers may be operating below their optimal design efficiency, or they may be cooling inefficiently due to poor layout of servers. Another possible reason is that more operators have higher-density racks (we know this from separate data). These racks may push cooling systems to work harder or to switch from free cooling to mechanical modes.

Certainly, there is an explanation for the flattening out of the numbers over the 12 years. The most dramatic increases in energy efficiency were achieved between 2007 and 2013, often by taking steps such as hot/cold air separation, raising temperatures, or applying more control on cooling, fans and power distribution. The widespread adoption of free air cooling (direct and indirect) in newer builds has also helped to bring the overall level of energy use down. But it is clear that the easiest steps have largely been taken.

Even so, we do still find these results a little puzzling. Smaller data centers tend to have much higher PUEs, and we know there is an industry trend of consolidation, so many are closing. Most colos, a thriving sector, have PUEs below 1.5. Finally, of course, there is the addition of new data centers — which tend to have lower PUEs. These factors, coupled with the overall improvement in technology and knowledge, mean PUEs should still be edging down.

One thing we do know and must emphasize: The average PUE per data center does not equal the overall PUE per kW of IT load. This is undoubtedly going down, although it is harder to track. Our data, along with everyone else’s, shows a rapid growth in the proportion of workloads in the public cloud — and there, PUEs are very low. Similarly, more work is in large colos.
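A simple numerical sketch shows how these two views of PUE can move in opposite directions. The sites, PUE values and IT loads below are assumptions for illustration: because most of the total IT load sits in the efficient hyperscale and colo facilities, the load-weighted PUE comes out far lower than the straight average of the individual sites.

```python
# Illustrative sketch: a simple average of per-site PUEs can rise even as the
# load-weighted PUE (per kW of IT load) falls, because more of the total IT
# load sits in very efficient hyperscale and colo facilities.
# All figures are assumptions for illustration only.

sites = [
    # (site PUE, IT load in MW)
    (1.12, 120.0),   # assumed hyperscale cloud campus
    (1.35, 40.0),    # assumed large colocation facility
    (1.90, 2.0),     # assumed enterprise data center
    (2.10, 1.0),     # assumed small server room
]

simple_average = sum(pue for pue, _ in sites) / len(sites)
weighted_average = (sum(pue * load for pue, load in sites)
                    / sum(load for _, load in sites))

print(f"Simple average of site PUEs: {simple_average:.2f}")   # ~1.62
print(f"Load-weighted PUE:           {weighted_average:.2f}") # ~1.19
```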

But it would also be a mistake to think this is the solution. Most mission-critical enterprise IT is not currently moving to the public cloud, and enterprise energy efficiency remains important.

A final point: PUE is not the only, or even the most important, metric for tracking energy efficiency. Data center operators should always watch and understand the total energy consumption of their data centers, with the goal of improving both IT and facility energy efficiency.

——————————————————————————–

The full report, Uptime Institute Global Data Center Survey 2019, is available to members of the Uptime Institute Network here. Our upcoming webinar (May 29, 2019 at 12 noon EDT) discussing the survey results is open to the general public.

Comparing the severity of IT service outages: Uptime Institute’s Outage Severity Rating

Avoiding IT service outages is a big concern for any operator or service provider, especially one providing a business-critical service.  But when an outage does occur, the business impact can vary from “barely noticeable” to “huge and expensive.”  Anticipating and modeling the impact of a service interruption should be a part of incident planning and is key to determining the level of investment that should be made to reduce incidents and their impact.

In recent years, Uptime Institute has been collecting data about service outages, including the costs, the consequences and, most notably, the most common causes. One of our findings is that organizations often don’t collect full financial data about the impact of outages, or if they do, it can take months for the full costs to become apparent. Many of the costs are hidden, even if the outcry from managers and even non-paying customers is most certainly not. But cost is not a proxy for impact: even a relatively short and inexpensive outage at a big, consumer-facing service provider can attract negative national headlines.

Another clear trend, now that so many applications are distributed and interlinked, is that “outages” can often be partial, affecting users in different ways. This has, in some cases, enabled some major operators to claim very impressive availability figures in spite of poor customer experience. Their argument: Just because a service is slow or can’t perform some functions doesn’t mean it is “down.”

To give managers a shorthand way to talk about the impact of a service outage, Uptime Institute developed the Outage Severity Rating (below). The rating is not scientific and might be compared to the internationally used Beaufort Scale, which describes how various wind speeds are experienced on land and sea.

The Uptime Institute Outage Severity Rating describes the business/service impact of an IT service interruption, regardless of the cause.

By applying this scale to widely reported outages from 2016-2018, Uptime Institute tracked 11 “Severe” Category 5 outages and 46 “Serious” Category 4 outages. Of these 11 severe outages, no fewer than five occurred at airlines. In each case, multi-million-dollar losses occurred, as flights were cancelled and travelers stranded. Compensation was paid, and negative headlines ensued.

Analysis suggests both obvious and less obvious reasons why airlines were hit so hard. The obvious one is that airlines are not only highly dependent on IT for almost all elements of their operations, but also that the impact of disruption is immediate and expensive. Less obviously, many airlines have been disrupted by low-cost competition and forced to “do more with less” in IT. This leads to errors and over-thrifty outsourcing, and it makes incidents more likely.

If we consider Categories 4 and 5 together, the banking and financial services sector is the most over-represented. For this sector, outage causes varied widely, and in some cases, cost cutting was a factor. More commonly, the real challenge was simply managing complexity and recovering from failures fast enough to limit the impact.

——————————————————————————–

Members of the Uptime Institute Network experience HALF as many of the incidents that cause these types of service disruptions. Members share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.

Data Center AI (Artificial Intelligence) Creates New Risks

Artificial intelligence (AI) is being used in data centers to drive up efficiencies and drive down risks and costs. But it also creates new types of risk. This is one of the findings from a recent Uptime Intelligence research report, “Very smart data centers: How artificial intelligence will power operational decisions” (report #25, published in April 2019 and available to Uptime Institute Network members).

Some of these risks are not clear-cut. Take, for example, new AI-driven cloud services, such as data center management as a service (DMaaS), which pool anonymized data from hundreds or thousands of other customers’ data centers. Suppliers apply AI to this vast store of information and then deliver individualized insight to customers via a wide area network, usually the internet. But that raises a big question: Who owns the data, the supplier or the customer? The answer is usually both: customers keep their own data, but the supplier typically also retains a copy (even if the paid service stops, the data remains an anonymous part of the supplier’s data lake).
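To illustrate why ownership becomes blurred, the hypothetical sketch below shows one way site telemetry might be anonymized before it is pooled: identifying fields are replaced with a stable pseudonym, so the customer keeps its raw record while the supplier’s data lake keeps a copy that is useful for modeling but carries no location details. The field names and hashing scheme are our assumptions, not a description of any specific DMaaS product.

```python
# Hypothetical sketch of a DMaaS data flow: site telemetry is stripped of
# identifying fields before it is pooled in the supplier's data lake. The
# field names and hashing scheme are assumptions for illustration only.

import hashlib

def anonymize(reading: dict, salt: str = "supplier-secret") -> dict:
    # Replace the site identifier with a stable pseudonym; keep only metrics.
    token = hashlib.sha256((salt + reading["site_id"]).encode()).hexdigest()[:12]
    return {
        "site_token": token,            # no name or location details retained
        "pue": reading["pue"],
        "it_load_kw": reading["it_load_kw"],
    }

customer_copy = {"site_id": "ACME-Frankfurt-1", "pue": 1.42, "it_load_kw": 850}
supplier_copy = anonymize(customer_copy)   # what lands in the shared data lake
print(supplier_copy)
```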

Whether lack of certainty or clarity over data ownership actually constitutes a risk to data centers is vigorously debated. Some say that if hackers accessed data, it would be of little use as the data is anonymized and, for example, does not include specific location details. Others say hackers could apply techniques, including their own AI analysis, to piece together sensitive information to build up a fairly complete picture.

This is just one example of the risks that should at least be considered when deploying AI. In our new report, we describe four areas of risk with AI offerings:

Commercial risk: AI models and data are (often) stored in the public cloud and outside of immediate control (if using a supplier model) or may be on-site but not understood.

  • Commercial machine learning products and services raise the risk of lock-in because processes and systems may be built on top of models using data that cannot be replicated.
  • Pricing may rise as adoption grows — at present, prices are kept low to attract new data (to build up the effectiveness of AI models) or to drive equipment services or sales.
  • A high reliance on AI could change skills requirements or “deskill” staff positions, which could potentially be an issue.

Legal and service level agreement risk: Again, AI models and data are stored outside of immediate control (if using a supplier model) or may be on-site but not understood.

  • This may be unacceptable for some, such as service providers or organizations operating within strict regulatory environments.
  • In theory, it could also shift liability back to a data center AI service supplier — a particular concern for any automated actions provided by the service.

Technical risk: While we usually understand what types of data are being used for human actions and recommendations, it is not always possible to understand why and exactly how a machine reached a decision.

  • It may not be possible to easily change or override decisions.
  • As machines guide more decisions, core skills may become outsourced, leaving organizations vulnerable.

Interoperability risk and other “unknown unknowns”: The risk of “2001”-style HAL scenarios (i.e., the singularity) is overplayed, but there is an unknown, long-term risk.

  • One example is that data center AI is likely to be embedded in most cases (i.e., inside individual equipment and management systems). This could lead to situations where two, three or five systems all have some ability to take action according to their own models, potentially running away or conflicting with each other. For example, a building management system may turn up the cooling, while an IT system moves workload to another location, which turns up cooling elsewhere (see the sketch below).
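The hypothetical sketch below illustrates the kind of uncoordinated interaction described in the example above: a building management system and a workload manager each react independently to the same hot spot, so the facility ends up both over-cooled and partly emptied of load. The classes, thresholds and setpoints are invented for illustration and do not represent any real product.

```python
# Hypothetical sketch of the interoperability risk: two independent control
# loops (a building management system and an IT workload manager) each react
# to the same hot spot without coordination. All values are invented.

class BuildingManagementSystem:
    def react(self, inlet_temp_c: float, cooling_setpoint_c: float) -> float:
        # Turns cooling up (lowers the setpoint) when the inlet runs hot.
        if inlet_temp_c > 27.0:
            cooling_setpoint_c -= 1.0
        return cooling_setpoint_c

class WorkloadManager:
    def react(self, inlet_temp_c: float, local_load_kw: float) -> float:
        # Migrates workload away from the hot zone, shifting heat elsewhere.
        if inlet_temp_c > 27.0:
            local_load_kw *= 0.8   # assume 20% of load moved to another site
        return local_load_kw

# Both systems respond to the same event; neither knows the other acted,
# so the room ends up over-cooled AND partly evacuated of load.
bms, wlm = BuildingManagementSystem(), WorkloadManager()
setpoint, load = 24.0, 100.0
setpoint = bms.react(inlet_temp_c=28.5, cooling_setpoint_c=setpoint)
load = wlm.react(inlet_temp_c=28.5, local_load_kw=load)
print(f"Cooling setpoint: {setpoint} °C, local IT load: {load} kW")
```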

——————————————————————————–

For more on artificial intelligence in the data center, read our latest report, Very smart data centers: How artificial intelligence will power operational decisions, available to members of the Uptime Institute Network. Members share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.