Mixed resiliency at the edge

Many analysts have forecast an explosion in demand for edge data centers. After a long, slow start, demand is beginning to build, with small, prefabricated and mostly remotely operated data centers ready to be deployed to support a wide variety of applications.

There are still many uncertainties surrounding the edge market, ranging from business models to ownership, and from numbers to types of deployment. One open question is how much resiliency will be needed, and how it will be achieved.

While on-site infrastructure redundancy (site-level resiliency) remains the most common approach to achieving edge data center resiliency, Uptime Institute’s research shows increased interest in software- and network-based distributed resiliency. Nine of 10 edge data center owners and operators believe it will be very or somewhat commonly used in two to three years.

Distributed resiliency, which involves synchronous or asynchronous replication of data across multiple sites, has, until recently, mainly been used by large cloud and internet service providers. It is commonly deployed in cloud availability zones and combined with site-level resiliency at three or more connected physical data centers.

While site-level redundancy is primarily a defense against equipment faults at a site, distributed resiliency can also protect against major local events or sabotage that take out a full site. It can reduce infrastructure costs (by reducing site-level redundancy needs) and increase business agility through flexible placement and shifting of IT workloads. Edge data centers making use of distributed resiliency are connected and operated in a coordinated manner, as illustrated in Figure 1. The redundant element in this case is at least one full edge data center (not a component or system). When a disruption occurs, capacity limits are reached, or planned maintenance is required, some (or all) of the IT workloads in an edge data center are shifted to one or more other edge data centers.
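
The workload shifting described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of any operator's orchestration software: the `EdgeSite` class, the abstract capacity units, and the "largest workload first, to the peer with the most headroom" placement rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeSite:
    """Hypothetical model of one edge data center (all fields are assumptions)."""
    name: str
    capacity: int                      # abstract capacity units
    healthy: bool = True               # False after a disruption
    in_maintenance: bool = False       # True during planned maintenance
    workloads: dict[str, int] = field(default_factory=dict)  # name -> load

    @property
    def load(self) -> int:
        return sum(self.workloads.values())

    @property
    def available(self) -> bool:
        return self.healthy and not self.in_maintenance

    @property
    def headroom(self) -> int:
        # An unavailable site cannot accept shifted workloads.
        return self.capacity - self.load if self.available else 0


def shift_workloads(site: EdgeSite, peers: list[EdgeSite]) -> list[tuple[str, str]]:
    """Move workloads off `site` until it is available and within capacity.

    Returns the list of (workload, destination site) moves that were made.
    Workloads that fit on no peer are left in place.
    """
    moves = []
    # Consider the largest workloads first (an assumed, simple heuristic).
    for name, size in sorted(site.workloads.items(), key=lambda kv: -kv[1]):
        if site.available and site.load <= site.capacity:
            break  # nothing more needs to move
        candidates = [p for p in peers if p.headroom >= size]
        if not candidates:
            continue
        target = max(candidates, key=lambda p: p.headroom)
        target.workloads[name] = site.workloads.pop(name)
        moves.append((name, target.name))
    return moves
```

A disrupted or under-maintenance site sheds all the workloads its peers can absorb, while an overloaded but healthy site sheds only enough to get back within its capacity, matching the "some (or all)" wording above.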

Figure 1. Different resiliency approaches are used for edge data centers

Site-level resiliency relies on redundant capacity components (including major equipment) for critical power, cooling, and network connectivity. This is the approach adopted by almost all data centers of any size. Edge data centers using only site-level resiliency tend to run their own IT workloads independently of other edge data centers.

Edge data centers making use of distributed resiliency are connected and operated in a coordinated manner, commonly using either a hierarchical topology or a mesh topology to deliver multisite resiliency.
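
To make the difference between the two topologies concrete, the following sketch builds each one as a simple adjacency map and lists which sites could take over from a failed one. It assumes, purely for illustration, that in a hierarchical topology each edge site links only to a single parent (hub) site, and that a mesh is fully connected. Real deployments may differ, and the site names are hypothetical.

```python
def hierarchical(parent: str, children: list[str]) -> dict[str, set[str]]:
    """Assumed hierarchical layout: every child links only to one parent site."""
    links = {parent: set(children)}
    for child in children:
        links[child] = {parent}
    return links


def mesh(sites: list[str]) -> dict[str, set[str]]:
    """Assumed full mesh: every site links directly to every other site."""
    return {site: set(sites) - {site} for site in sites}


def failover_targets(links: dict[str, set[str]], failed: str) -> list[str]:
    """Sites directly linked to the failed site that could receive its workloads."""
    return sorted(links[failed])


edges = ["edge-1", "edge-2", "edge-3"]
print(failover_targets(hierarchical("regional", edges), "edge-1"))
print(failover_targets(mesh(edges), "edge-1"))
```

Under these assumptions, the hierarchical layout concentrates failover on the parent site, while the mesh spreads it across peer edge sites.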

These approaches and topologies are not mutually exclusive. Distributed resiliency does, however, create opportunities to reduce component redundancy at individual edge sites without risking service continuity.

Uptime Institute’s research suggests that organizations deploying edge data centers can benefit from the combined use of site-level resiliency and distributed resiliency.

Organizations deploying distributed resiliency should expect some challenges before the system works flawlessly, due to the increased software and network complexity. Because edge data centers are typically unstaffed, resilient remote monitoring and good network management/IT monitoring are essential for early detection of disruption and capacity limitations, regardless of the resiliency approach used.
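
As a rough illustration of what early detection could look like for an unstaffed site, the sketch below flags a missed heartbeat (a possible disruption) and high utilization (an approaching capacity limit). The function name, the 30-second heartbeat timeout, and the 85% utilization threshold are assumptions chosen for the example, not recommended values.

```python
def site_alerts(
    last_heartbeat: float,
    now: float,
    utilization: float,
    heartbeat_timeout: float = 30.0,   # seconds; assumed value
    capacity_threshold: float = 0.85,  # fraction of capacity; assumed value
) -> list[str]:
    """Return alert messages for one remotely monitored edge site."""
    alerts = []
    if now - last_heartbeat > heartbeat_timeout:
        alerts.append("disruption: heartbeat missed")
    if utilization >= capacity_threshold:
        alerts.append("capacity: utilization above threshold")
    return alerts
```

In practice the monitoring path itself would also need to be resilient, since a failed monitoring link is indistinguishable from a failed site in a check like this.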