Long shifts in data centers — time to reconsider?

Human error has been, and remains, a major cause of outages in data centers. Uptime Intelligence’s research shows that about four in 10 operators have had a major outage in the past three years in which human error played a role (Annual outage analysis 2023). Half of these respondents said errors were made because staff failed to follow the correct procedures.

Thorough training, regular practice in equipment testing and work experience all help to reduce these errors — particularly in an emergency when a prompt reaction is crucial. An often underappreciated factor is the importance of mental performance and the effects of fatigue.

The relationship between shift length, fatigue and human error is well documented, but less clear is how the data center industry can define shifts that help minimize human error. The recommended best practices for other industries do not always translate into the data center world, where 24/7 service availability is the standard. Additionally, data center owners and operators wanting to optimize shift length to limit fatigue need to navigate employee preferences and region-specific constraints.

What the research says

Studies indicate there is a tipping point after which the performance of most staff deteriorates. Researchers at the Chinese University of Hong Kong Department of Systems Engineering and Engineering Management analyzed 241 papers on the relationship between shift length and occupational health and found that individuals working shifts of more than 10 hours are significantly more likely to experience fatigue. A similar review from the Finnish Institute of Occupational Health shows the risk of workplace injury due to fatigue-related accidents across a range of industries is 15% higher in 10-hour shifts than in 8-hour shifts, and jumps to 38% higher at 12 hours.

Disruption to circadian rhythms (biological processes that follow a 24-hour cycle) and mental exhaustion both give rise to errors, some of which lead to injury (e.g., from improper machine operation). These errors can be considered products of cognitive oversight: an unintentional failure to interpret events correctly. Such oversight is at the root of much human error in data centers and can result not just in injury, but also in disruption to services.

Currently, 8- to 10-hour single-day shifts are most common in the data center industry across all major regions, according to the Uptime Institute Data Center Staffing Survey 2023 (Operators struggle to overcome ongoing staff and skills shortage). There are, however, some geographic variations in the results: while 17% of all respondents report single-day shifts of more than 10 hours, Asia-Pacific leads at 22%. In contrast, respondents from Europe report more than three times as many 5- to 7-hour shifts as respondents from Asia-Pacific, and only 13% of them (just over half the Asia-Pacific share) report shifts of more than 10 hours.

Policy variations across different regions are clearly a factor in how data center owners and operators choose specific shift lengths for their employees, particularly in relation to night shifts. In Europe, labor laws in several major countries do not allow night shifts to exceed 8 or 10 hours as standard. Exceptions can be made to meet 24/7 staffing requirements, with night shifts extended to 12 hours, as long as employees are compensated with sufficient paid time off work.

These policy restrictions in Europe — along with the survey results indicating that European respondents provide more 5- to 7-hour shifts than respondents from other regions — may indicate that these companies are hiring more part-time employees to make up their staffing shortfall.

Companies in other regions attempting to replicate a similar strategy to reduce shift length face obstacles. Unlike European employees, workers in the US and several Latin American countries risk losing access to healthcare coverage if their shifts become shorter. In the US there is no statutory obligation for the employer to provide healthcare coverage if employees work fewer than 40 hours a week. Staff are therefore reluctant to reduce their weekly hours.

Employers can limit long shifts — particularly night shifts (which have higher workplace injury risk) — to 8 hours. While this may appear to be an intuitive solution to avoid performance deterioration, Uptime Institute’s technical consultants advise that any change will not be without friction, and shift length may not even be the primary contributory factor. Some key considerations are:

  • Complacency and ownership. Shift structure should promote sharing of knowledge, break monotony of routines and help develop a sense of inclusion through rotating shifts. Shift silos, such as staff having a fixed schedule, with some only working at weekends or nights, may create unhealthy attitudes resulting from complacency or a lack of team cohesion.
  • Meeting staff lifestyle preferences. Despite data suggesting that long shifts are detrimental to performance, it is difficult for some operators to cut back hours. Uptime Institute technical consultants often see a staff preference for 12-hour shifts over several days, for the benefits of both additional overtime pay and extended blocks of time off work.
  • Relief shifts. Consensus in the industry is that extending shifts to more than 12 hours is ultimately worse for the business than sending employees home. For many operators, however, extending shifts to beyond 12 hours is unavoidable as a means of meeting staffing requirements. In practice, identifying individuals that can handle these extended shift lengths is not easy. It is not just very long shifts that carry the risks associated with fatigue. Staff not being able to rest sufficiently due to covering the shifts of absentee staff is another source of potential exhaustion, even if these shifts are not particularly long.

Long-term impact

Sourcing the appropriate, qualified individual for a relief shift in an understaffed industry is challenging. Typically, companies request employees to clock in on their rest days. This may work well for an employee during a week they are already off work, but it could also force employees to clock back on before they have had sufficient rest between shifts. Adding more staff into the shift rotation may prevent other employees from having to extend shifts or clock in with insufficient rest, but this simply patches over the root of the problem: the absence of staff from their scheduled shifts.

Operators need to monitor absence levels and understand the reasons behind them. The cumulative long-term impact of working shifts of more than 10 hours increases the risk of developing a range of health conditions, as well as fatigue. Although many data center operators have developed shift schedules to minimize errors, this needs to be balanced with a long-term view of health, work-life balance and burnout.

Planning ahead

Retroactively adjusting the shift lengths of established employees could lower morale and, counterintuitively, lead to higher levels of fatigue as staff adjust to their new schedules. Many data center owners and operators, however, are undergoing significant infrastructure expansion, and these new facilities need to be staffed on a shift rotation that minimizes human error and limits the risks of disruption to service availability. Owners and operators should consider the following recommendations:

  • Avoid shift lengths of more than 12 hours. Staffing levels and schedules should be defined to minimize the occurrences of abnormally long shifts.
  • Identify shifts that are not appropriate as relief shifts. Establish a system for ensuring well-rested coverage. Monitor overtime and rest periods between shifts to avoid calling in exhausted staff.
  • Consider individual employee preferences but remain mindful that shift workers often ignore potential risks to their own job performance and health when requesting their preferred schedule.
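
The first two recommendations can be partly checked against a shift roster. The following is a minimal sketch, not a recommended tool: the Shift record, the flag_fatigue_risks function and the 11-hour minimum rest period are all assumptions made for illustration (only the 12-hour shift limit comes from the recommendations above), and any real threshold should reflect local labor law and company policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_SHIFT = timedelta(hours=12)  # limit taken from the recommendations above
MIN_REST = timedelta(hours=11)   # assumption: illustrative value, not from the article


@dataclass(frozen=True)
class Shift:
    """Hypothetical roster record: one worked shift for one employee."""
    employee: str
    start: datetime
    end: datetime


def flag_fatigue_risks(shifts, max_shift=MAX_SHIFT, min_rest=MIN_REST):
    """Return (employee, reason, shift_start) tuples for risky shifts.

    reason is "long_shift" when the shift exceeds max_shift, and
    "short_rest" when the gap since the employee's previous shift
    is shorter than min_rest.
    """
    # Group shifts per employee so rest gaps are measured per person.
    by_employee = {}
    for shift in shifts:
        by_employee.setdefault(shift.employee, []).append(shift)

    flags = []
    for employee, own in by_employee.items():
        own.sort(key=lambda s: s.start)
        for i, shift in enumerate(own):
            if shift.end - shift.start > max_shift:
                flags.append((employee, "long_shift", shift.start))
            if i > 0 and shift.start - own[i - 1].end < min_rest:
                flags.append((employee, "short_rest", shift.start))
    return flags
```

Run before a relief shift is assigned, a check of this kind would show which candidates have not yet had the chosen minimum rest. It says nothing about why staff are absent, which still needs to be understood separately.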

The Uptime Intelligence View

While many data center managers take a flexible approach to staffing, relief shifts remain a common source of human error. Employees experiencing long-term effects of extended shift work, in terms of risks to health and performance, may be perpetuating difficulties in filling the required shifts due to increased levels of staff absence. These factors can result in an operational stress of lower-than-ideal staffing levels in many facilities, leaving data center managers with few options to optimize shifts.
