Are IT Infrastructure Outages Getting Longer?

One of the findings of Uptime Institute’s recently published report Annual outage analysis 2020 is that the most serious categories of outages — those that cause a significant disruption in services — are becoming more severe and more costly. This isn’t entirely surprising: individuals and businesses alike are becoming ever more dependent on IT, and it is becoming harder to replicate or replace an IT service with a manual system.
But one of the findings raises both red flags and questions: Setting aside those that were partial and incidental (which had minimal impact), publicly reported outages over the past three years appear to be getting longer. And this, in turn, is likely to be one of the reasons why the costs and severity of outages have been rising.
The table above shows the numbers of publicly reported outages collected by Uptime Institute in the years 2017-2019, except for those that did not have a financial or customer impact or those for which there was no known cause. The figures show outages are on the rise. This is due to a number of factors, including greater deployment of IT services and better reporting. But they also show a tilt in the data towards longer outages — especially those that lasted more than 48 hours. (This is true even though one of the biggest causes of lengthy outages — ransomware — was excluded from our sample.)
The outage times reported are until full IT service recovery, not business recovery — it may take longer, for example, to move aircraft back to where they are supposed to be, or to deal with backlogs in insurance claims. This trend is not dramatic, but it is real and it is concerning, because a 48-hour interruption can be lethal for many organizations.
Why is it happening? Complexity and interdependency of IT systems and greater dependency on software and data are very likely to be big reasons. For example, Uptime Institute’s research shows that fewer big outages are caused by power failure in the data center and more by IT systems configuration now than in the past. While resolving facility engineering issues may not be easy, it is usually a relatively predictable matter: failures are often binary, and very often recovery processes have been drilled into the operators and spare parts are kept at hand. Software, data integrity and damaged/interrupted cross-organizational business processes, however, can be much more difficult issues to resolve or sometimes even to diagnose — and these types of failure are becoming much more common (and yes, sometimes they are triggered by a power failure). Furthermore, because failures can be partial, files may become out of sync or even be corrupted.
There are lessons to be drawn. The biggest is that the resiliency regimes that facilities staff have lived by for three decades or more need to be extended and integrated into IT and DevOps and fully supported and invested in by management. Another is that while disaster recovery may be slowly disappearing as a type of commercial backup service, the principles of vigilance, recovery, and failover – especially when under stress – are more important than ever.
The full report Annual outage analysis 2020 is available to members of the Uptime Institute Network; membership can be requested here.
COVID-19: IT organizations move from planning to implementation
Over the past few weeks, Uptime Institute held multiple customer roundtables to discuss the impact of the COVID-19 virus on data center operations and potential operational responses to its spread. We gathered our community’s insights and best practices, combined them with our own 25 years’ worth of infrastructure operational management knowledge, and we are now making this information freely available to the data center industry here.
A little background to get you started right away…
Dozens of organizations were represented at these roundtables, which were open to a global audience of Uptime Institute Network members. What we learned is that while most organizations have a plan for foreseen emergency situations, few have one specific to a global pandemic. As a result, many have been hurriedly modifying existing plans based on gut feel and good intentions: creating tiered response levels, identifying events that would trigger the next level of response, researching concerns specific to a pandemic (e.g., what does “deep cleaning” mean in a white space, and what are the implications for different data center environments — raised floors, multi-tenant data centers, mixed-use facilities, etc.?).
But this is clearly uncharted territory for ALL of us. For many organizations, the Human Resources and/or Environmental Health & Safety department(s) take the lead in generating an organization-wide response plan, and specific business units, such as data center operations, incorporate that guidance into a plan tailored to their business mission and setting. Because many organizations have data centers in multiple regions, responses may vary by location or facility characteristics. A sample but very broad Emergency Response Plan by the US government’s FDA (with portions pertaining to the delivery of IT services contained within) can be seen here.
But immediately actionable, tangible advice goes a long way in times like these. Several participants mentioned that their facilities now screen all potential visitors with a questionnaire. They do not admit anyone who reports symptoms (personally or in family members) or who has traveled recently to areas with high levels of COVID-19 cases. Some respondents advised that an additional security measure involves prescreening all scheduled visitors: Send the visitor the questionnaire via email 4-8 hours prior to their visit and require completion before the appointment is confirmed. Only permit entry if the questionnaire indicates a low probability of infection (confirm all answers remain unchanged upon arrival) and prohibit unscheduled visitors altogether.
Some facilities – for example, multi-tenant data centers or mixed-use facilities – have a higher volume of visitors, and thus greater potential for COVID-19 spread. To avoid inconvenience and potential client dissatisfaction, be proactive: Inform all affected parties of the COVID-19 preparedness plan in place and its impact on their access to the facility in advance.
Sanitization is a particular challenge, with several participants reporting disinfectant/hand sanitizer shortages. Many had questions specific to deep cleaning the white space environment, given its high rate/volume of air exchange, highly specialized electronic equipment and possible raised floor configuration. Spray techniques are more effective than simply wiping surfaces with disinfectant solutions, as the antiseptic mist coats surfaces for a longer period. Many organizations are hiring specialist cleaning firms and/or following CDC recommendations for disinfection.
As COVID-19 spreads, more organizations are shifting their energy from academically tweaking their written response plans to implementing them. In many companies, that decision is made by a business unit, based on site environment, number of COVID-19 cases in the area and government-mandated restrictions. Mission-critical facilities have a particular remit, though, so they need to create and implement plans specific to their business mission.
Good preparation simplifies decision-making. Roundtable participants suggest the following:
Categorizing essential versus nonessential tasks and calendaring them in advance (makes it easier to identify maintenance items you can postpone, and for how long).
Cross-training personnel and maintaining up-to-date skill inventories/certifications (helps ensure core capabilities are always available).
Having contingency plans in place (means you’re prepared to manage supply chain disruption and staff shortages).
Stress-testing technologies and procedures in advance (gives you confidence that you can accommodate a move to remote work: shifting procedures that are usually performed manually to an automated process, monitoring remotely, interacting virtually with other team members, etc.).
It’s no longer a question of if a plan like this will be needed: we know it is! Most facility operators need to quickly craft and then implement their response plan, and learn from this incident for the future.
Uptime Institute has created a number of resources and will be providing specific guidance regarding the COVID-19 situation here.
Pay-as-you-go model spreads to critical components
As enterprises continue to move from a focus on capital expenditures to operating expenditures, more data center components will also be consumed on a pay-as-you-go, “as a service” basis.
“-aaS” goes mainstream
The trend toward everything “as a service” (XaaS) is now mainstream in IT, ranging from cloud (infrastructure-aaS) and software-aaS (SaaS) to newer offerings, such as bare metal-aaS, container-aaS, and artificial intelligence-aaS (AI-aaS). At the IT level, service providers are winning over more clients to the service-based approach by reducing capital expenditures (capex) in favor of operational expenditures (opex), by offering better products, and by investing heavily to improve security and compliance. More organizations are now willing to trust them.
But this change is not confined to IT: a similar trend is underway in data centers.
Why buy and not build?
While the cost to build new data centers is generally falling, driven partly by the availability of more prefabricated components, enterprise operators have been increasingly competing against lower-cost options to host their IT — notably colocation, cloud and SaaS.
Cost is rarely the biggest motivation for moving to cloud, but it is a factor. Large cloud providers continue to build and operate data centers at scale and enjoy the proportional cost savings as well as the fruits of intense value engineering. They also spread costs among customers and tend to have much higher utilization rates compared with other data centers. And, of course, they invest in innovative, leading-edge IT tools that can be rolled out almost instantly. This all adds up to ever-improving IT and infrastructure services from cloud providers that are cheaper (and often better) than using or developing equivalent services based in a smaller-scale enterprise data center.
Many organizations have now come to view data center ownership as a big capital risk — one that only some want to take. Even when it’s cheaper to deliver IT from their own “on-premises” data center, the risks of data center early obsolescence, under-utilization, technical noncompliance or unexpected technological or local problems are all factors. And, of course, most businesses want to avoid a big capital outlay: Our research shows that, in 2017, the total cost of ownership of an “average” concurrently maintainable 3 megawatt (MW) enterprise data center amortized over 15 years was about $90 million, and that roughly half of the cost is invested in three installments over the first six years, assuming a typical phased build and bricks-and-mortar construction.
This represents a significant amount of risk. To be economically viable, the enterprise must typically operate a facility at a high level of utilization — yet forecasting future data center capacity remains enterprises’ top challenge, according to our research.
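To make the shape of that capital commitment concrete, here is a rough back-of-envelope sketch in Python. Only the $90 million total, the 15-year term and the roughly half-capital split come from the research cited above; the even three-way split of the capital and the installment years are illustrative assumptions.

```python
# Back-of-envelope sketch of the outlay phasing described above.
# Only the $90M total, 15-year term and ~half-capital split come from
# the text; the three equal installments and their years are assumed.

TOTAL_TCO = 90_000_000          # 15-year total cost of ownership (USD)
YEARS = 15

capex = TOTAL_TCO / 2           # roughly half is capital, per the text
installments = {1: capex / 3, 3: capex / 3, 6: capex / 3}  # assumed years
opex_per_year = (TOTAL_TCO - capex) / YEARS

for year in range(1, YEARS + 1):
    outlay = opex_per_year + installments.get(year, 0.0)
    print(f"Year {year:2d}: ${outlay / 1e6:5.1f}M")
```

Even under these simplified assumptions, roughly $18 million must be committed in year one alone, before utilization is proven — the capital risk described above.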
Demand for enterprise data centers remains sizable, in spite of the alternatives. Many enterprises with smaller data centers are closing them and consolidating into premium, often larger, centralized data centers and outsourcing as much else as possible.
The appeal of the cloud will continue to convince executives and drive strategy. Increasingly, public cloud is an alternative way to deliver workloads faster and cheaper without having to build additional on-premises capacity. Scalability, portability, reduced risk, better tools, high levels of resiliency, infrastructure avoidance and fewer staff requirements are other key drivers for cloud adoption. Innovation and access to leading-edge IT will likely be bigger factors in the future, as will more cloud-first remits from upper management.
Colocation, including sale leasebacks
Although rarely thought of in this way, colocation is the most widely used “data center-aaS” offering today. Sale and leaseback of data centers by enterprises to colos is also becoming more common, a trend that will continue to build (see UII Note 38: Capital inflow boosts the data center market).
Colo interconnection services will attract even more businesses. More will likely seek to lease space in the same facility as their cloud or other third-party service provider, enabling lower latency, lower costs and more security for third-party services, such as storage-aaS and disaster recovery-aaS.
While more enterprise IT is moving to colos and managed services (whether or not it is cloud), enterprise data centers will not disappear. More than 600 IT and data center managers told Uptime Institute that, in 2021, about half of all workloads will still be in enterprise data centers, and only 18% of workloads in public cloud/SaaS.
Other “as a service” trends in data centers
Data center monitoring and analysis is another relatively new example of a pay-as-you-go service. Introduced in late 2016, data center management as a service (DMaaS) is a big-data-driven cloud service that provides customized analysis and is paid for on a recurring basis. The move to a pay-as-you-go service has helped unlock the data center infrastructure management market, which was struggling for growth because of costs and complexity.
Energy backup and generation is another area to watch. Suppliers have introduced various pay-as-you-go models for their equipment. These include leased fuel cells owned by the supplier (notably Bloom Energy), which charges customers only for the energy produced. By eliminating the client’s risk and capital outlay, it can make the supplier’s sale easier (although they have to wait to be paid). Some suppliers have ventured into UPS-aaS, but with limited success to date.
More alternatives to ownership are likely for data center electrical assets, such as batteries. Given the high and fast rate of innovation in the technology, leasing large-scale battery installations delivers the capacity and innovation benefits without the risks.
It’s also likely that more large data centers will use energy service companies (ESCOs) to produce, manage and deliver energy from renewable microgrids. Demand for green energy, for energy security (that is, energy produced off-grid) and energy-price stability is growing; ESCOs can deliver all this for dedicated customers that sign long-term energy-purchase agreements but don’t have the capital required to build or the expertise necessary to run a green microgrid.
Demand for enterprise data centers will continue but alongside the use of more cloud and more colo. More will be consumed “as a service,” ranging from data center monitoring to renewable energy from nearby dedicated microgrids.
The full report Ten data center industry trends in 2020 is available to members of the Uptime Institute Network. Membership information can be found here.
Phasing Out Data Center Hot Work
Despite years of discussion, warnings and strict regulations in some countries, data center hot work remains a contentious issue in the data center industry. Hot work is the practice of working on energized electrical circuits (voltage limits differ regionally) — and it is usually done, in spite of the risks, to reduce the possibility of a downtime incident during maintenance.
Uptime Institute advises against hot work in almost all instances. The safety concerns are just too great, and data suggests work on energized circuits may — at best — only reduce the number of manageable incidents, while increasing the risk of arc flash and other events that damage expensive equipment and may lead to an outage or injury. In addition, concurrently maintainable or fault tolerant designs as described in Uptime Institute’s Tier Standard make hot work unnecessary.
The pressure against hot work continues to mount. In the US, electrical contractors have begun to decline some work that involves working on energized circuits, even if an energized work permit has been created and signed by appropriate management, as required by National Fire Protection Association (NFPA) 70E (Standard for Electrical Safety in the Workplace). In addition, the US Department of Labor’s Occupational Safety and Health Administration (OSHA) has repeatedly rejected business continuity as an exception to hot work restrictions, making it harder for management to justify hot work and to find executives willing to sign the energized work permit.
OSHA statistics make clear that work on energized systems is a dangerous practice, especially for construction trades workers; installation, maintenance, and repair occupations; and grounds maintenance workers. For this reason, NFPA 70E sharply limits the situations in which organizations are allowed to work on energized equipment. Personnel safety is not the only issue; personal protective equipment (PPE) protects only workers, not equipment, so an arc flash can destroy many thousands of dollars of IT gear.
Ignoring local and national standards can be costly, too. OSHA reported 2,923 lockout/tagout and 1,528 PPE violations in 2017, among the many safety concerns it addressed that year. New minimum penalties for a single violation exceed $13,000, with top total fines for numerous, willful and repeated violations running into the millions of dollars. Wrongful death and injury suits add to the cost, and violations can lead to higher insurance premiums, too.
Participants in a recent Uptime Institute discussion roundtable agreed that the remaining firms performing work on live loads should begin preparing to end the practice. They said that senior management is often the biggest impediment to ending hot work, at least at some organizations, despite the well-known and documented risks. Executive resistance can be tied to concerns about power supplies or failure to maintain independent A/B feeds. In some cases, service level agreements contain restrictions against powering down equipment.
Despite executive resistance at some companies, the trend is clearly against hot work. By 2015, more than two-thirds of facilities operators had already eliminated the practice, according to Uptime Institute data. A tighter regulatory environment, heightened safety concerns, increased financial risk and improved equipment should combine to all but eliminate hot work in the near future. But there are still holdouts, and the practice is far more acceptable in some countries — China is an example — than in others, such as the US, where NFPA 70E severely limits the practice in all industries.
Also, hot work does not eliminate IT failure risk. Uptime Institute has been tracking data center abnormal incidents for more than 20 years, and the data shows that at least 71 failures occurred during hot work. While these failures are generally attributed to poor procedures or maintenance, a recent, more careful analysis concluded that better procedures or maintenance (or both) would have made it possible to perform the work safely — and without any failures — on de-energized systems.
The Uptime Institute abnormal incident database includes only four injury reports; all occurred during work on energized systems. In addition, the database includes 16 reports of arc flash. One occurred during normal preventive maintenance and one during an infrared scan. Neither caused injury, but the potential risk to personnel is apparent, as is the potential for equipment damage (and legal exposure).
Undoubtedly, eliminating hot work is a difficult process. One large retailer that has just begun the process expects the transition to take several years. And not all organizations succeed: Uptime Institute is aware of at least one organization in which incidents involving failed power supplies caused senior management to cancel their plan to disallow work on energized equipment.
According to several Uptime Institute Network community members, building a culture of safety is the most time-consuming part of the transition away from hot work, as data centers are goal-oriented organizations, well-practiced at developing and following programs to identify and eliminate risk.
It is not necessary or even prudent to eliminate all hot work at once. The IT team can help slowly retire the practice by eliminating the most dangerous hot work first, building experience on less critical loads, or reducing the number of circuits affected at any one time. To prevent common failures when de-energizing servers, the Operations team can increase scrutiny on power supplies and ensure that dual-corded servers are properly fed.
In early data centers, the practice of hot work was understandable — necessary, even. However, Uptime Institute has long advocated against hot work. Modern equipment and higher resiliency architectures based on dual-corded servers make it possible to switch power feeds in the case of an electrical equipment failure. These advances not only improve data center availability, they also make it possible to isolate equipment for maintenance purposes.
The spectre of ransomware
Uptime Institute Intelligence plans to release its 2019/2020 outages report shortly. This report will examine the types, causes and impacts of public outages, as well as further analyze the results of a recent Uptime survey on outages and impacts. The data will once again show that serious IT service interruptions are common and costly, with the impacts often causing serious disruption.
We have excluded one type of outage from the report: those caused by cyberattacks. Data integrity and cybersecurity are, of course, major issues that require vigilant attention and investment, but they are not currently areas in which Uptime Institute researches and advises. Most security issues are data breaches; although they have serious consequences, they do not usually lead to a service interruption.
However, two forms of malicious attack can and often do lead to outages or at least a severe service degradation. The first is a Distributed Denial of Service (DDoS) attack, where a coordinated attempt is made to overwhelm a site with traffic. Uptime has tracked a number of these each year for many years, and security specialists say they are increasingly common. Even so, most organizations that are DDoS targets have developed effective countermeasures that minimize the threat. These measures include such techniques as packet filtering, load balancing and blocking suspect internet protocol addresses. As a result, DDoS attacks are showing up less frequently in our lists of outages.
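As a minimal illustration of one of the countermeasures named above (rate limiting and blocking suspect sources), the sketch below throttles requests per source IP over a sliding window. The threshold and window are assumptions for the example; production defenses live in firewalls, load balancers and scrubbing services rather than in application code like this.

```python
# Illustrative per-source-IP rate limiting over a sliding window.
# Threshold and window size are assumed values, not recommendations.
import time
from collections import defaultdict, deque
from typing import Optional

WINDOW_SECONDS = 1.0
MAX_REQUESTS = 20  # assumed per-IP threshold per window

_history = defaultdict(deque)

def allow_request(src_ip: str, now: Optional[float] = None) -> bool:
    """Return False (drop or divert) when src_ip exceeds the limit."""
    now = time.monotonic() if now is None else now
    q = _history[src_ip]
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()  # forget requests that fell outside the window
    if len(q) >= MAX_REQUESTS:
        return False
    q.append(now)
    return True
```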
The second type, ransomware, is emerging as a major problem and cause of outages. Ransomware attackers deny authorized users access to their own data; the hackers use malware to encrypt the user’s files and refuse to unlock them unless a ransom is paid. Often, operators have no choice but to take down all involved IT services in an attempt to recover access, restore from the last clean backup copy, and purge the systems of viruses. Outages can last days or weeks.
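A small sketch of the recovery logic described above: after an attack, operators must identify the most recent backup taken before the estimated time of compromise. The backup names and timestamps here are hypothetical.

```python
# Illustrative only: select the "last clean backup" -- the newest copy
# taken strictly before the estimated compromise time.
from datetime import datetime

backups = [
    ("nightly-2019-12-28", datetime(2019, 12, 28, 2, 0)),
    ("nightly-2019-12-29", datetime(2019, 12, 29, 2, 0)),
    ("nightly-2019-12-30", datetime(2019, 12, 30, 2, 0)),
]

def last_clean_backup(backups, compromise_time):
    """Return the newest backup older than the compromise, or None."""
    clean = [(name, ts) for name, ts in backups if ts < compromise_time]
    return max(clean, key=lambda b: b[1]) if clean else None

print(last_clean_backup(backups, datetime(2019, 12, 29, 14, 30)))
# -> ('nightly-2019-12-29', ...): everything after it must be rebuilt.
```

The hard part in practice is estimating the compromise time: malware often lies dormant, so the newest backup that predates detection may not actually be clean, which is one reason recovery can take days or weeks.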
In the past two years, ransomware attacks have increased dramatically. The FBI investigated over 1,400 ransomware attacks in 2018. Government offices are a particular target. Kaspersky Research Labs, operated by security software supplier Kaspersky, identified 147 attacks on municipalities in 2019 (up 60%), in which the criminals demanded ransoms of $5.3 million. The IT Governance blog, based in the UK, recorded 19 major ransomware attacks globally in December 2019 alone.
Most US cities have now signed a charter never to pay a ransom to the criminals — but more importantly, most are now also upgrading their infrastructure and practices to prevent attacks. Some that have been targeted, however, have paid the ransom.
Perhaps the two most serious attacks in 2019 were the City of Baltimore, which refused to pay the ransom and budgeted $18 million to fix its problem; and the City of Atlanta, which also refused to pay the ransom and paid over $7 million to fully restore operations. The WannaCry virus attack in 2018 reportedly cost the UK National Health Service over $120 million (£92 million). And on New Year’s Eve 2019, Travelex’s currency trading went offline for two weeks due to a ransomware attack, costing it millions.
Preventing a ransomware attack has become — or should become — a very high priority for those concerned with resiliency. Addressing the risk may involve some stringent, expensive and inconvenient processes, such as multifactor security, since attackers will likely try to copy all passwords as well as encrypt files. In terms of the Uptime Institute Outage Severity Rating, many attacks quickly escalate to the most serious Category 4 or 5 levels — severe enough to cost millions and threaten the survival of the organization. Indeed, one North American health provider has struggled to recover after receiving a $14 million ransom demand.
All of this points to the obvious imperative: The availability and integrity of digital infrastructure, data and services is critical — in the fullest sense of the word — to almost all organizations today, and assessments of vulnerability need to span security, software, systems, power, networks and facilities. Weaknesses are likely to be exploited; sufficient investment and diligence in this area has become essential and must never waver. In hindsight we discover that almost all outages could have been prevented with better management, processes and technology.
Members of the Uptime Institute Network can read more on this topic here.
Micro data centers: An explosion in demand, in slow motion
A wave of new technologies, from 5G to the internet of things (IoT) to artificial intelligence (AI), means much more computing and much more data will be needed near the point of use. That means many more small data centers will be required. But there will be no sudden mass deployment, no single standout use case, no single design dominating. Demand is likely to grow faster from 2022.
Small package, big impact
Suppliers in the data center industry are excited. Big vendors such as Schneider, Vertiv and Huawei have been rapidly adding to their product lines and redrawing their financial forecasts; startups — companies such as Vapor IO, EdgeMicro, EdgeInfra and MetroEDGE — are pioneering new designs; and established telco specialists, such as Ericsson, along with telco operators, are working on new technologies and partnerships. Builders and operators of colocation data centers, such as EdgeConneX, Equinix and Compass, are assessing where the opportunity lies.
The opportunity is to supply, build or operate local edge data centers — small micro data centers that are designed to operate near the point of use, supporting applications that are not suited to run in big, remote data centers, or even in mid-sized regional colocation data centers. Unlike most larger data centers, micro data centers will mostly be built, configured and tested in a factory and delivered on a truck. Typical sizes will be 50 kW to 400 kW, and there are expected to be a lot of them.
But with the anticipation comes consternation — it is possible to commit too early. Some analysts had predicted that the explosion in edge demand would be in full swing by now, fueled by the growing maturity of the IoT and the 2020 launch schedules for 5G services. Suppliers, however, mostly report only a trickle — not a flood — of orders.
Privately, some suppliers admit they have been caught off guard. There is a deep discussion about the extent of data center capacity needed at the local edge; about just how many applications and services really need local edge processing; and about the type and size of IT equipment needed — maybe a small box on the wall will be enough?
While the technical answers to most of these questions are largely understood, questions remain about the economics, the ownership, and the scale and pace of deployment of new technologies and services. These are critical matters affecting deployment.
Edge demand and 5G
In the past decade, data and processing have shifted to a cloudy core, with hundreds of hyperscale data centers built or planned. This will continue. But a rebalancing is underway (see Uptime Institute Intelligence report: The internet tilts toward the edge), with more processing being done not just at the regional edge, in nearby colocation (and other regional) data centers, but locally, in a micro data center that is tens or hundreds of meters away.
This new small facility may be needed to support services that have a lot of data, such as MRI scanners, augmented reality and real-time streaming; it may be needed to provide very low latency, instantly responsive services for both humans and machines — factory machines are one example, driverless cars another; and it may be needed to quickly crunch AI calculations for immediate, real-time responses. There is also a more mundane application: to provide on-site services, such as in a hospital, factory or retail establishment, should the network fail.
With all these use cases, why is there any doubt about the micro data center opportunity?
First, in terms of demand drivers, no new technology has created so much interest and excitement as 5G. The next-generation telecom wireless network standard promises communication speeds of up to 10 gigabits per second (Gbps), latency below five milliseconds (ms), support for one million devices per square kilometer, and five-nines availability. It will ultimately support a vast array of new always-on, low-latency and immersive applications that will require unimaginable amounts of data and compute power — too much to realistically or economically send back to the internet’s hyperscale core. Much of this will require low-latency communications and rapid processing of a few milliseconds or less — which, the speed of light dictates, must be within a few kilometers.
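The speed-of-light point is easy to verify with a back-of-envelope calculation: light in optical fiber covers only about 200 km per millisecond, and radio, switching and compute overheads consume most of a tight latency budget before distance enters the picture. All budget figures in the sketch below are illustrative assumptions.

```python
# Back-of-envelope check on the latency/distance argument above.
# Light in fiber travels at roughly c / 1.47, i.e., about 204 km per
# millisecond one way. The latency budgets below are assumed values.

C_FIBER_KM_PER_MS = 204.0

def max_one_way_distance_km(latency_budget_ms: float,
                            overhead_ms: float) -> float:
    """Distance allowed by pure propagation once fixed overheads
    (radio interface, switching, processing) are subtracted."""
    usable_ms = max(latency_budget_ms - overhead_ms, 0.0)
    return usable_ms * C_FIBER_KM_PER_MS

# A 1 ms end-to-end target with 0.95 ms of radio and compute overhead
# leaves room for only ~10 km of fiber; a 5 ms target with 4.9 ms of
# overhead, ~20 km. Multiple network hops shrink this further.
print(max_one_way_distance_km(1.0, 0.95))   # ≈ 10.2
print(max_one_way_distance_km(5.0, 4.9))    # ≈ 20.4
```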
Few doubt that 5G will create (or satisfy) huge demand and play a pivotal role in IoT. But the rollout of 5G, already underway, is not going to be quick, sudden or dramatic. In fact, full rollout may take 15 years. This is because the infrastructure required to support 5G is too expensive, too complex, and involves too many parties to do all at once. Estimates vary, with at least one analyst firm predicting that telecom companies will need to spend $1 trillion upgrading their networks.
A second issue that is creating uncertainty about demand is that many edge applications — whether supported by 5G or some other networking technology (such as WiFi 6) — may not require a local micro data center. For example, high-bandwidth applications may be best served from a content distribution network at the regional edge, in a colo, or by the colo itself, while many sensors and IoT devices produce very little data and so can be served by small gateway devices. Among 5G’s unique properties is the ability to support data-heavy, low-latency services at scale — but this is exactly the kind of service that will mostly be deployed in 2021 or later.
Suppliers and telcos alike, then, are unsure about the number, type and size of data centers at the local edge. Steve Carlini, a Schneider Electric executive, told Uptime Institute that he expects most demand for micro data centers supporting 5G will be in the cities, where mobile edge-computing clusters would likely each need one micro data center. But the number of clusters in each city, far fewer than the number of new masts, would depend on demand, applications and other factors.
A third big issue that will slow demand for micro data centers is economic and organizational. These issues include licensing, location and ownership of sites; support and maintenance; security and resiliency concerns; and management sentiment. Most enterprises expect to own their own edge micro data centers, according to Uptime Intelligence research, but many others will likely prefer to outsource this altogether, in spite of potentially higher operational costs and a loss of control.
Suppliers are bullish, even if they know demand will grow slowly at first. Among the first-line targets are those simply looking to upgrade server rooms, where the work cannot be turned over to a colo or the cloud; factories with local automation needs; retailers and others that need more resiliency in distributed locations; and telcos, whose small central offices need the security, availability and cost base of small data centers.
This wide range of applications has also led to an explosion of innovation. Expect micro data centers to vary in density, size, shape, cooling types (including liquid), power sources (including lithium-ion batteries and fuel cells) and levels of resiliency.
The surge in demand for micro data centers will be real, but it will take time. Many of the economic and technical drivers are not yet mature; 5G, one of the key underlying catalysts, is in its infancy. In the near term, much of the impetus behind the use of micro data centers will lie in their ability to ensure local availability in the event of network or other remote outages.
The full report Ten data center industry trends in 2020 is available to members of the Uptime Institute Network here.
https://journal.uptimeinstitute.com/wp-content/uploads/2020/01/GettyImages-157394357-blog.jpg10242720Andy Lawrence, Executive Director of Research, Uptime Institute, alawrence@uptimeinstitute.comhttps://journal.uptimeinstitute.com/wp-content/uploads/2022/12/uptime-institute-logo-r_240x88_v2023-with-space.pngAndy Lawrence, Executive Director of Research, Uptime Institute, alawrence@uptimeinstitute.com2020-02-03 05:34:532020-01-30 14:46:06Micro data centers: An explosion in demand, in slow motion
Are IT Infrastructure Outages Getting Longer?
/in Executive, Operations/by Andy Lawrence, Executive Director of Research, Uptime Institute, alawrence@uptimeinstitute.comOne of the findings of Uptime Institute’s recently published report Annual outage analysis 2020 is that the most serious categories of outages — those that cause a significant disruption in services — are becoming more severe and more costly. This isn’t entirely surprising: individuals and businesses alike are becoming ever more dependent on IT, and it is becoming harder to replicate or replace an IT service with a manual system.
But one of the findings raises both red flags and questions: Setting aside those that were partial and incidental (which had minimal impact), publicly reported outages over the past three years appear to be getting longer. And this, in turn, is likely to be one of the reasons why the costs and severity of outages have been rising.
The table above shows the numbers of publicly reported outages collected by Uptime Institute in the years 2017-2019, except for those that did not have a financial or customer impact or those for which there was no known cause. The figures show outages are on the rise. This is due to a number of factors, including greater deployment of IT services and better reporting. But they also show a tilt in the data towards longer outages — especially those that lasted more than 48 hours. (This is true even though one of the biggest causes of lengthy outages — ransomware — was excluded from our sample.)
The outage times reported are until full IT service recovery, not business recovery — it may take longer, for example, to move aircraft back to where they are supposed to be, or to deal with backlogs in insurance claims. This trend is not dramatic, but it is real and it is concerning, because a 48-hour interruption can be lethal for many organizations.
Why is it happening? Complexity and interdependency of IT systems and greater dependency on software and data are very likely to be big reasons. For example, Uptime’s Institute’s research shows that fewer big outages are caused by power failure in the data center and more by IT systems configuration now than in the past. While resolving facility engineering issues may not be easy, it is usually a relatively predictable matter: failures are often binary, and very often recovery processes have been drilled into the operators and spare parts are kept at hand. Software, data integrity and damaged/interrupted cross-organizational business processes, however, can be much more difficult issues to resolve or sometimes even to diagnose — and these types of failure are becoming much more common (and yes, sometimes they are triggered by a power failure). Furthermore, because failures can be partial, files may become out of sync or even be corrupted.
There are lessons to be drawn. The biggest is that the resiliency regimes that facilities staff have lived by for three decades or more need to be extended and integrated into IT and DevOps and fully supported and invested in by management. Another is that while disaster recovery may be slowly disappearing as a type of commercial backup service, the principles of vigilance, recovery, and fail over – especially when under stress – are more important than ever.
The full report Annual outage analysis 2020 is available to members of the Uptime Institute Network which can be requested here.
COVID-19: IT organizations move from planning to implementation
/in Executive, News, Operations/by Sandra VailOver the past few weeks, Uptime Institute held multiple customer roundtables to discuss the impact of the COVID-19 virus on data center operations and potential operational responses to its spread. We gathered our communities insights and best practices, we combined with our own 25 years worth of infrastructure operational management knowledge and we are now making this information available freely to the data center industry. HERE.
A little background to get you started right away….
Dozens of organizations were represented at these roundtables, which was open to a global audience of Uptime Institute Network membership. What we learned is that while most organizations have a plan for foreseen emergency situations, few have one specific to a global pandemic. As a result, many have been hurrily modifying existing plans based on gut feel and good intentions: creating tiered response levels, identifying events that would trigger the next level of response, researching concerns specific to a pandemic (e.g., what does “deep cleaning” mean in a white space, and what are the implications for different data center environments — raised floors, multi-tenant data centers, mixed-use facilities, etc.?).
But this is clearly unchartered territory for ALL of us. For many organizations, the Human Resources and/or Environmental Health & Safety department(s) take the lead in generating an organization-wide response plan, and specific business units, such as data center operations, incorporate that guidance into a plan tailored to their business mission and setting. Because many organizations have data centers in multiple regions, responses may vary by location or facility characteristics. A sample but very broad Emergency Response Plan by the US government’s FDA (with portions pertaining to the delivery of IT services contained within) can be seen here.
But immediately actionable tangible advice goes a long way in times like these. Several participants mentioned that their facilities now screen all potential visitors with a questionnaire. They do not admit anyone who reports symptoms (personally or in family members) or who has traveled recently to areas with high levels of COVID-19 cases. Some repondants advised that nn additional measure of their security involves prescreening all scheduled visitors: Send the visitor the questionnaire via email 4-8 hours prior to their visit and require completion before the appointment is confirmed. Only permit entry if the questionnaire indicates a low probability of infection (confirm all answers remain unchanged upon arrival) and prohibit unscheduled visitors altogether.
Some facilities – for example, multi-tenant data centers or mixed-use facilities – have a higher volume of visitors, and thus greater potential for COVID-19 spread. To avoid inconvenience and potential client dissatisfaction, be proactive: Inform all affected parties of the COVID-19 preparedness plan in place and its impact on their access to the facility in advance.
Sanitization is a particular challenge, with several participants reporting disinfectant/hand sanitizer shortages. Many had questions specific to deep cleaning the white space environment, given its high rate/volume of air exchange, highly specialized electronic equipment and possible raised floor configuration. Spray techniques are more effective than simply wiping surfaces with disinfectant solutions, as the antiseptic mist coats surfaces for a longer period. Many organizations are hiring specialist cleaning firms and/or following CDC recommendations for disinfection.
As COVID-19 spreads, more organizations are moving their energy from academicly tweaking their written response plans to implementing them. In many companies, that decision is made by a business unit, based on site environment, number of COVID-19 cases in the area and government-mandated restrictions. Mission-critical facilities have a particular remit, though, so need to create and implement plans specific to their business mission.
Good preparation simplifies decision-making. Roundtable participants suggest the following:
It’s no longer a question of if a plan like this will be needed, we know it is! Most facility operators need to quickly craft and then implement their response plan, and learn from this incident for the future.
Uptime Institute has created a number of resources and will be providing specific guidance regarding the COVID-19 situation here.
Pay-as-you-go model spreads to critical components
/in Executive, Operations/by Rhonda Ascierto, Vice President, Research, Uptime InstituteAs enterprises continue to move from a focus on capital expenditures to operating expenditures, more data center components will also be consumed on a pay-as-you-go, “as a service” basis.
“-aaS” goes mainstream
The trend toward everything “as a service” (XaaS) is now mainstream in IT, ranging from cloud (infrastructure-aaS) and software-aaS (SaaS) to newer offerings, such as bare metal-aaS, container-aaS, and artificial intelligence-aaS (AI-aaS). At the IT level, service providers are winning over more clients to the service-based approach by reducing capital expenditures (capex) in favor of operational expenditures (opex), by offering better products, and by investing heavily to improve security and compliance. More organizations are now willing to trust them.
But this change is not confined to the IT: a similar trend is underway in data centers.
Why buy and not build?
While the cost to build new data centers is generally falling, driven partly by the availability of more prefabricated components, enterprise operators have been increasingly competing against lower-cost options to host their IT — notably colocation, cloud and SaaS.
Cost is rarely the biggest motivation for moving to cloud, but it is a factor. Large cloud providers continue to build and operate data centers at scale and enjoy the proportional cost savings as well the fruits of intense value engineering. They also spread costs among customers and tend to have much higher utilization rates compared with other data centers. And, of course, they invest in innovative, leading-edge IT tools that can be rolled out almost instantly. This all adds up to ever-improving IT and infrastructure services from cloud providers that are cheaper (and often better) than using or developing equivalent services based in a smaller-scale enterprise data center.
Many organizations have now come to view data center ownership as a big capital risk — one that only some want to take. Even when it’s cheaper to deliver IT from their own “on-premises” data center, the risks of data center early obsolescence, under-utilization, technical noncompliance or unexpected technological or local problems are all factors. And, of course, most businesses want to avoid a big capital outlay: Our research shows that, in 2017, the total cost of ownership of an “average” concurrently maintainable 3 megawatt (MW) enterprise data center amortized over 15 years was about $90 million, and that roughly half of the cost is invested in three installments over the first six years, assuming a typical phased build and bricks-and-mortar construction.
This represents a significant amount of risk. To be economically viable, the enterprise must typically operate a facility at a high level of utilization — yet forecasting future data center capacity remains enterprises’ top challenge, according to our research.
Demand for enterprise data centers remains sizable, in spite of the alternatives. Many enterprises with smaller data centers are closing them and consolidating into premium, often larger, centralized data centers and outsourcing as much else as possible.
The appeal of the cloud will continue to convince executives and drive strategy. Increasingly, public cloud is an alternative way to deliver workloads faster and cheaper without having to build additional on-premise capacity. Scalability, portability, reduced risk, better tools, high levels of resiliency, infrastructure avoidance and fewer staff requirements are other key drivers for cloud adoption. Innovation and access to leading-edge IT will likely be bigger factors in the future, as will more cloud-first remits from upper management.
Colocation, including sale leasebacks
Although rarely thought of in this way, colocation is the most widely used “data center-aaS” offering today. Sale with leaseback of the data center by enterprise to colos is also becoming more common, a trend that will continue to build (see UII Note 38: Capital inflow boosts the data center market).
Colo interconnection services will attract even more businesses. More will likely seek to lease space in the same facility as their cloud or other third-party service provider, enabling lower latency and fewer costs and more security for third-party services, such as storage-aaS and disaster recovery-aaS.
While more enterprise IT is moving to colos and managed services (whether or not it is cloud), enterprise data centers will not disappear. More than 600 IT and data center managers told Uptime Institute that, in 2021, about half of all workloads will still be in enterprise data centers, and only 18% of workloads in public cloud/SaaS.
Other “as a service” trends in data centers
Data center monitoring and analysis is another relatively new example of a pay-as-you-go service. Introduced in late 2016, data center management as a service is a big data-driven cloud service that provides customized analysis and is paid for on a recurring basis. The move to a pay-as-you-go service has helped unlock the data center infrastructure management market, which was struggling for growth because of costs and complexity.
Energy backup and generation is another area to watch. Suppliers have introduced various pay-as-you-go models for their equipment. These include leased fuel cells owned by the supplier (notably Bloom Energy), which charges customers only for the energy produced. By eliminating the client’s risk and capital outlay, it can make the supplier’s sale easier (although they have to wait to be paid). Some suppliers have ventured in UPS-aaS, but with limited success to date.
More alternatives to ownership are likely for data center electrical assets, such as batteries. Given the high and fast rate of innovation in the technology, leasing large-scale battery installations delivers the capacity and innovation benefits without the risks.
It’s also likely that more large data centers will use energy service companies (ESCOs) to produce, manage and deliver energy from renewable microgrids. Demand for green energy, for energy security (that is, energy produced off-grid) and energy-price stability is growing; ESCOs can deliver all this for dedicated customers that sign long-term energy-purchase agreements but don’t have the capital required to build or the expertise necessary to run a green microgrid.
Demand for enterprise data centers will continue but alongside the use of more cloud and more colo. More will be consumed “as a service,” ranging from data center monitoring to renewable energy from nearby dedicated microgrids.
The full report Ten data center industry trends in 2020 is available to members of the Uptime Institute Network. Membership information can be found here.
Phasing Out Data Center Hot Work
/in Executive, Operations/by Kevin HeslinDespite years of discussion, warnings and strict regulations in some countries, data center hot work remains a contentious issue in the data center industry. Hot work is the practice of working on energized electrical circuits (voltage limits differ regionally) — and it is usually done, in spite of the risks, to reduce the possibility of a downtime incident during maintenance.
Uptime Institute advises against hot work in almost all instances. The safety concerns are just too great, and data suggests work on energized circuits may — at best — only reduce the number of manageable incidents, while increasing the risk of arc flash and other events that damage expensive equipment and may lead to an outage or injury. In addition, concurrently maintainable or fault tolerant designs as described in Uptime Institute’s Tier Standard make hot work unnecessary.
The pressure against hot work continues to mount. In the US, electrical contractors have begun to decline some work that involves working on energized circuits, even if an energized work permit has been created and signed by appropriate management, as required by National Fire Protection Association (NFPA) 70E (Standard for Electrical Safety in the Workplace). In addition, US Department of Labor’s Occupational Safety and Hazards Agency (OSHA) has repeatedly rejected business continuity as an exception to hot work restrictions, making it harder for management to justify hot work and to find executives willing to sign the energized work permit.
OSHA statistics make clear that work on energized systems is a dangerous practice, especially for construction trades workers; installation, maintenance, and repair occupations; and grounds maintenance workers. For this reason, NFPA 70E sharply limits the situations in which organizations are allowed to work on energized equipment. Personnel safety is not the only issue; personal protective equipment (PPE) protects only workers, not equipment, so an arc flash can destroy many thousands of dollars of IT gear.
Ignoring local and national standards can be costly, too. OSHA reported 2,923 lockout/tagout and 1,528 PPE violations in 2017, among the many safety concerns it addressed that year. New minimum penalties for a single violation exceed $13,000, with top total fines for numerous, willful and repeated violations running into the millions of dollars. Wrongful death and injury suits add to the cost, and violations can lead to higher insurance premiums, too.
Participants in a recent Uptime Institute discussion roundtable agreed that the remaining firms performing work on live loads should begin preparing to end the practice. They said that senior management is often the biggest impediment to ending hot work, at least at some organizations, despite the well-known and documented risks. Executive resistance can be tied to concerns about power supplies or failure to maintain independent A/B feeds. In some cases, service level agreements contain restrictions against powering down equipment.
Despite executive resistance at some companies, the trend is clearly against hot work. By 2015, more than two-thirds of facilities operators had already eliminated the practice, according to Uptime Institute data. A tighter regulatory environment, heightened safety concerns, increased financial risk and improved equipment should combine to all but eliminate hot work in the near future. But there are still holdouts, and the practice is far more acceptable in some countries — China is an example — than in others, such as the US, where NFPA 70E severely limits the practice in all industries.
Also, hot work does not eliminate IT failure risk. Uptime Institute has been tracking data center abnormal incidents for more than 20 years and when studying the data, at least 71 failures occurred during hot work. While these failures are generally attributed to poor procedures or maintenance, a recent, more careful analysis concluded that better procedures or maintenance (or both) would have made it possible to perform the work safely — and without any failures — on de-energized systems.
The Uptime Institute abnormal incident database includes only four injury reports; all occurred during work on energized systems. In addition, the database includes 16 reports of arc flash. One occurred during normal preventive maintenance and one during an infrared scan. Neither caused injury, but the potential risk to personnel is apparent, as is the potential for equipment damage (and legal exposure).
Undoubtedly, eliminating hot work is a difficult process. One large retailer that has just begun the process expects the transition to take several years. And not all organizations succeed: Uptime Institute is aware of at least one organization in which incidents involving failed power supplies caused senior management to cancel their plan to disallow work on energized equipment.
According to several Uptime Institute Network community members, building a culture of safety is the most time-consuming part of the transition from hot work, as data centers are goal-oriented organizations, well-practiced at developing and following programs to identify and eliminate risk.
It is not necessary or even prudent to eliminate all hot work at once. The IT team can help slowly retire the practice by eliminating the most dangerous hot work first, building experience on less critical loads, or reducing the number of circuits affected at any one time. To prevent common failures when de-energizing servers, the Operations team can increase scrutiny on power supplies and ensure that dual-corded servers are properly fed.
In early data centers, the practice of hot work was understandable — necessary, even. However, Uptime Institute has long advocated against hot work. Modern equipment and higher resiliency architectures based on dual-corded servers make it possible to switch power feeds in the case of an electrical equipment failure. These advances not only improve data center availability, they also make it possible to isolate equipment for maintenance purposes.
The spectre of ransomware
Uptime Institute Intelligence plans to release its 2019/2020 outages report shortly. This report will examine the types, causes and impacts of public outages, as well as further analyze the results of a recent Uptime survey on outages and impacts. The data will once again show that serious IT service interruptions are common and costly, with the impacts often causing serious disruption.
We have excluded one type of outage from the report: those caused by cyberattacks. Data integrity and cybersecurity are, of course, major issues that require vigilant attention and investment, but they are not currently areas in which Uptime Institute researches and advises. Most security incidents are data breaches; although these have serious consequences, they do not usually lead to a service interruption.
However, two forms of malicious attack can and often do lead to outages or at least a severe service degradation. The first is a Distributed Denial of Service (DDoS) attack, where a coordinated attempt is made to overwhelm a site with traffic. Uptime has tracked a number of these each year for many years, and security specialists say they are increasingly common. Even so, most organizations that are DDoS targets have developed effective countermeasures that minimize the threat. These measures include such techniques as packet filtering, load balancing and blocking suspect internet protocol addresses. As a result, DDoS attacks are showing up less frequently in our lists of outages.
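As a rough illustration of the packet-filtering idea, the sketch below implements a per-source-IP token bucket in Python. The rate, burst size and class name are illustrative assumptions; real deployments perform this filtering in dedicated network equipment or scrubbing services, not in application code.

    import time
    from collections import defaultdict

    RATE = 10.0   # assumed allowance: tokens (requests) replenished per second, per source IP
    BURST = 20.0  # assumed bucket capacity: the largest burst a single source may send

    class TokenBucketFilter:
        """Per-source-IP token bucket, a common building block of rate limiting."""
        def __init__(self):
            self.buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

        def allow(self, src_ip: str) -> bool:
            bucket = self.buckets[src_ip]
            now = time.monotonic()
            # Refill tokens in proportion to elapsed time, capped at the burst size.
            bucket["tokens"] = min(BURST, bucket["tokens"] + (now - bucket["last"]) * RATE)
            bucket["last"] = now
            if bucket["tokens"] >= 1.0:
                bucket["tokens"] -= 1.0
                return True   # forward the request
            return False      # drop it: this source is over its allowance

A flood from one address quickly drains its bucket and is dropped, while well-behaved sources are unaffected; blocking a suspect address outright is simply the degenerate case of a zero-token allowance.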
The second type, ransomware, is emerging as a major problem and cause of outages. Ransomware attackers deny authorized users access to their own data; the hackers use malware to encrypt the user’s files and refuse to unlock them unless a ransom is paid. Often, operators have no choice but to take down all involved IT services in an attempt to recover access, restore from the last clean backup copy, and purge the systems of viruses. Outages can last days or weeks.
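Restoring from the last clean backup presupposes that operators can tell which backups are clean. Below is a minimal sketch of one common check, assuming each backup ships with a checksum manifest recorded at backup time; the manifest format, paths and function name here are hypothetical.

    import hashlib
    import json
    from pathlib import Path

    def backup_is_intact(backup_dir: Path, manifest_file: Path) -> bool:
        """Recompute each file's SHA-256 and compare it with the hash recorded
        when the backup was taken; a mismatch suggests corruption or tampering."""
        manifest = json.loads(manifest_file.read_text())  # {"relative/path": "sha256 hex", ...}
        for rel_path, expected in manifest.items():
            actual = hashlib.sha256((backup_dir / rel_path).read_bytes()).hexdigest()
            if actual != expected:
                return False
        return True

The manifest itself must be kept out of the attacker's reach (offline or on immutable storage); an intruder who can encrypt the files can otherwise rewrite the hashes too.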
In the past two years, ransomware attacks have increased dramatically. The FBI investigated over 1,400 ransomware attacks in 2018. Government offices are a particular target: security software supplier Kaspersky identified 147 attacks on municipalities in 2019 (up 60%), in which the criminals demanded ransoms of $5.3 million. The IT Governance blog, based in the UK, recorded 19 major ransomware attacks globally in December 2019 alone.
Most US cities have now signed a charter never to pay a ransom to the criminals — but more importantly, most are now also upgrading their infrastructure and practices to prevent attacks. Some that have been targeted, however, have paid the ransom.
Perhaps the most serious attack in 2019 was on the City of Baltimore, which refused to pay the ransom and budgeted $18 million to fix its problem; the City of Atlanta, attacked in 2018, also refused to pay and spent over $7 million to fully restore operations. The WannaCry virus attack in 2017 reportedly cost the UK National Health Service over $120 million (£92 million). And on New Year's Eve 2019, Travelex's currency trading went offline for two weeks because of a ransomware attack, costing the company millions.
Preventing a ransomware attack has become, or should become, a very high priority for those concerned with resiliency. Addressing the risk may involve some stringent, expensive and inconvenient processes, such as multifactor authentication, since attackers will likely try to copy all passwords as well as encrypt files. In terms of the Uptime Institute Outage Severity Rating, many attacks quickly escalate to the most serious Category 4 or 5 levels, severe enough to cost millions and threaten the survival of the organization. Indeed, one North American health provider has struggled to recover after receiving a $14 million ransom demand.
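As one concrete example of a second factor, the sketch below computes an RFC 6238 time-based one-time password (TOTP), the scheme behind most authenticator apps. It is a minimal Python illustration only; secret handling and parameters are assumptions, not a recommended implementation.

    import base64
    import hashlib
    import hmac
    import struct
    import time

    def totp(secret_b32: str, time_step: int = 30, digits: int = 6) -> str:
        """Derive the current RFC 6238 one-time code from a shared base32 secret."""
        key = base64.b32decode(secret_b32)
        counter = int(time.time()) // time_step  # number of 30-second windows elapsed
        digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
        offset = digest[-1] & 0x0F  # dynamic truncation, per RFC 4226
        code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

Because the code changes every 30 seconds and is derived from a secret that never crosses the network, a stolen password alone is not enough to log in.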
All of this points to the obvious imperative: The availability and integrity of digital infrastructure, data and services are critical, in the fullest sense of the word, to almost all organizations today, and assessments of vulnerability need to span security, software, systems, power, networks and facilities. Weaknesses are likely to be exploited; sufficient investment and diligence in this area have become essential and must never waver. In hindsight, we almost always discover that outages could have been prevented with better management, processes and technology.
Members of the Uptime Institute Network can read more on this topic here.
Micro data centers: An explosion in demand, in slow motion
A wave of new technologies, from 5G to the internet of things (IoT) to artificial intelligence (AI), means much more computing and much more data will be needed near the point of use. That means many more small data centers will be required. But there will be no sudden mass deployment, no single standout use case, no single design dominating. Demand is likely to grow faster from 2022.
Small package, big impact
Suppliers in the data center industry are excited. Big vendors such as Schneider, Vertiv and Huawei have been rapidly adding to their product lines and redrawing their financial forecasts; startups — companies such as Vapor IO, EdgeMicro, EdgeInfra and MetroEDGE — are pioneering new designs; and established telco specialists, such as Ericsson, along with telco operators, are working on new technologies and partnerships. Builders and operators of colocation data centers, such as EdgeConneX, Equinix and Compass, are assessing where the opportunity lies.
The opportunity is to supply, build or operate local edge data centers: small micro data centers designed to operate near the point of use, supporting applications that are not suited to run in big, remote data centers or even in mid-sized regional colocation data centers. Unlike most larger data centers, micro data centers will mostly be built, configured and tested in a factory and delivered on a truck. Typical sizes will be 50 kW to 400 kW, and there are expected to be a lot of them.
But with the anticipation comes consternation — it is possible to commit too early. Some analysts had predicted that the explosion in edge demand would be in full swing by now, fueled by the growing maturity of the IoT and the 2020 launch schedules for 5G services. Suppliers, however, mostly report only a trickle — not a flood — of orders.
Privately, some suppliers admit they have been caught off guard. There is deep discussion about how much data center capacity is needed at the local edge; about just how many applications and services really need local edge processing; and about the type and size of IT equipment needed (perhaps a small box on the wall will be enough?).
While the technical answers to most of these questions are largely understood, uncertainty remains about the economics, the ownership, and the scale and pace at which new technologies and services will be rolled out. These matters are critical to deployment.
Edge demand and 5G
In the past decade, data and processing have shifted to a cloudy core, with hundreds of hyperscale data centers built or planned. This will continue. But a rebalancing is underway (see the Uptime Institute Intelligence report The internet tilts toward the edge), with more processing being done not just at the regional edge, in nearby colocation (and other regional) data centers, but locally, in micro data centers tens or hundreds of meters away.
This new small facility may be needed to support services that have a lot of data, such as MRI scanners, augmented reality and real-time streaming; it may be needed to provide very low latency, instantly responsive services for both humans and machines — factory machines are one example, driverless cars another; and it may be needed to quickly crunch AI calculations for immediate, real-time responses. There is also a more mundane application: to provide on-site services, such as in a hospital, factory or retail establishment, should the network fail.
With all these use cases, why is there any doubt about the micro data center opportunity?
First, in terms of demand drivers, no new technology has created as much interest and excitement as 5G. The next-generation telecom wireless network standard promises communication speeds of up to 10 gigabits per second (Gbps), latency below five milliseconds (ms), support for one million devices per square kilometer, and five-nines availability. It will ultimately support a vast array of new always-on, low-latency and immersive applications that will require enormous amounts of data and compute power, too much to realistically or economically send back to the internet's hyperscale core. Much of this will require low-latency communications and processing within a few milliseconds or less, which, the speed of light dictates, must happen within a few kilometers.
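To see why distance matters, consider propagation delay alone. The short Python sketch below uses roughly 200 km per millisecond as the approximate speed of light in optical fiber; this figure is an assumption for illustration, and real paths add queuing, routing and processing delays on top.

    SPEED_IN_FIBER_KM_PER_MS = 200.0  # light covers roughly 200 km per ms in fiber

    def round_trip_ms(distance_km: float) -> float:
        """Round-trip propagation delay only; no queuing or compute time."""
        return 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS

    for d in (1, 10, 100, 1000):
        print(f"{d:>5} km -> {round_trip_ms(d):.2f} ms round trip")
    # 1000 km already consumes 10 ms before any processing happens, so a
    # latency budget of a few ms leaves room only for a nearby site once
    # radio, switching and compute time are added.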
Few doubt that 5G will create (or satisfy) huge demand and play a pivotal role in IoT. But the rollout of 5G, already underway, is not going to be quick, sudden or dramatic. In fact, full rollout may take 15 years. This is because the infrastructure required to support 5G is too expensive, too complex, and involves too many parties to do all at once. Estimates vary, with at least one analyst firm predicting that telecom companies will need to spend $1 trillion upgrading their networks.
A second issue creating uncertainty about demand is that many edge applications, whether supported by 5G or some other networking technology (such as WiFi 6), may not require a local micro data center. For example, high-bandwidth applications may be best served from a content distribution network at the regional edge, hosted in a colocation data center or run by the colocation provider itself, while many sensors and IoT devices produce very little data and can be served by small gateway devices. Among 5G's unique properties is the ability to support data-heavy, low-latency services at scale, but services of exactly this kind will mostly be deployed in 2021 or later.
Suppliers and telcos alike, then, are unsure about the number, type and size of data centers at the local edge. Steve Carlini, a Schneider Electric executive, told Uptime Institute that he expects most demand for micro data centers supporting 5G will be in the cities, where mobile edge-computing clusters would likely each need one micro data center. But the number of clusters in each city, far fewer than the number of new masts, would depend on demand, applications and other factors.
A third big issue that will slow demand for micro data centers is economic and organizational. These issues include licensing, location and ownership of sites; support and maintenance; security and resiliency concerns; and management sentiment. Most enterprises expect to own their own edge micro data centers, according to Uptime Intelligence research, but many others will likely prefer to outsource this altogether, in spite of potentially higher operational costs and a loss of control.
Suppliers are bullish, even if they know demand will grow slowly at first. Among the first-line targets are those simply looking to upgrade server rooms, where the work cannot be turned over to a colo or the cloud; factories with local automation needs; retailers and others that need more resiliency in distributed locations; and telcos, whose small central offices need the security, availability and cost base of small data centers.
This wide range of applications has also led to an explosion of innovation. Expect micro data centers to vary in density, size, shape, cooling type (including liquid), power source (including lithium-ion batteries and fuel cells) and level of resiliency.
The surge in demand for micro data centers will be real, but it will take time. Many of the economic and technical drivers are not yet mature; 5G, one of the key underlying catalysts, is in its infancy. In the near term, much of the impetus behind the use of micro data centers will lie in their ability to ensure local availability in the event of network or other remote outages.
The full report Ten data center industry trends in 2020 is available to members of the Uptime Institute Network here.