Explaining the Uptime Institute’s Tier Classification System (April 2021 Update)
Note: This is an April 2021 update to an article previously published.
Uptime Institute’s Tier Standard and its Tier Classification System for data centers have been applied by owners and operators of data centers for nearly 30 years. Since its creation in the mid-1990s, the system has evolved from a shared industry terminology into the global standard for performance management of data center critical infrastructure topology and operational plans. (The Tier Standard is a two-volume set: one volume focuses on the design topologies that create new capacity, and the other focuses on the operational plans associated with that capacity.)
Over the years, some industry observers and pundits have questioned the complexity of the Tier System and, in some cases, have misrepresented the purpose and purview of the program. In many of these cases, what we find is the influencer is simply trying to fit the results-oriented Tier Standard into some kind of more basic checklist-style framework with which they are more familiar. To be clear, the Tier Standard states the desired results, not how to achieve them. This fundamental and simple approach allows Tier Standard users to attain the desired results in whatever innovative way they desire – as long as they attain the performance objective results. Invariably, we find that many of these authors and interview subjects have never been intimately involved with an actual Tier Standard-based design project and the subsequent Tier Certification of the site. Typically, the commenter’s understanding of the Tiers is secondhand and years out of date.
Anyone in the industry who knew our late founder Ken Brill knows the Uptime Institute doesn’t shy away from rigorous debate. And we happily engage in substantive discussions about the Tier Standard program with clients and interested parties. In fact, we welcome the opportunity to transform technology discussions into business impact discussions, which is really what the Tier Standard is all about.
In light of the above, let’s take this opportunity to explain what the Tier Standard looks like today, illustrate how Tier Certification works, list some companies that have invested in Tier Certification, and offer Uptime Institute’s vision for the future.
What is the Tier Standard?
Uptime Institute created the Tier Standard and its Tier Classification System to consistently evaluate various data center facilities in terms of potential site infrastructure performance, or uptime. It has two parts: one covers the design aspects and the other covers the operational aspects.
For the design, there are four levels of performance, referred to as Tier I (Basic Capacity), Tier II (Redundant Capacity), Tier III (Concurrently Maintainable) and Tier IV (Fault Tolerant). As the names suggest, each of these is identified by its resulting characteristics, and each higher level incorporates all of the features of the previous levels. So a Concurrently Maintainable Tier III design would include all of the result definitions found in the Redundant Capacity Tier II requirements, which would also include all of the performance definitions found in Basic Capacity Tier I.
Tier I: Basic Capacity
A Tier I data center provides dedicated site infrastructure to support information technology beyond an office setting. Tier I infrastructure includes a dedicated space for IT systems; an uninterruptible power supply (UPS) to filter power spikes, sags, and momentary outages; dedicated cooling equipment that won’t get shut down at the end of normal office hours; and an engine generator to protect IT functions from extended power outages.
Tier II: Redundant Capacity
Tier II facilities include redundant critical power and cooling components to provide select maintenance opportunities and an increased margin of safety against IT process disruptions that would result from site infrastructure equipment failures. The redundant components include power and cooling equipment such as UPS modules, chillers or pumps, and engine generators.
Tier III: Concurrently Maintainable
A Tier III data center requires no shutdowns for equipment replacement and maintenance. A redundant delivery path for power and cooling is added to the redundant critical components of Tier II facilities so that each and every component needed to support the IT processing environment can be shut down and maintained without impacting IT operations.
Tier IV: Fault Tolerant
Tier IV site infrastructure builds on Tier III, adding the concept of Fault Tolerance to the site infrastructure topology. Fault Tolerance means that when individual equipment failures or distribution path interruptions occur, the effects of the events are stopped short of the IT operations.
Please refer to Tier Standard: Topology for full criteria.
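The cumulative relationship between the four levels described above can be illustrated with a minimal sketch. This is only an illustration, not part of the Tier Standard: the `Tier` enum, the `ADDED` table, the `characteristics` function and the short labels are all hypothetical names chosen for this example, and the labels paraphrase the level names rather than reproduce the formal criteria.

```python
from enum import IntEnum
from typing import List


class Tier(IntEnum):
    """The four topology levels, ordered from lowest to highest."""
    I = 1
    II = 2
    III = 3
    IV = 4


# What each level adds on top of the levels below it.
# Labels are illustrative paraphrases, not the formal criteria.
ADDED = {
    Tier.I: "basic capacity",
    Tier.II: "redundant capacity components",
    Tier.III: "concurrent maintainability",
    Tier.IV: "fault tolerance",
}


def characteristics(tier: Tier) -> List[str]:
    """Return the characteristics of a level, including all lower levels."""
    return [ADDED[t] for t in Tier if t <= tier]


print(characteristics(Tier.III))
```

Asking for Tier III returns the Tier I and Tier II characteristics as well, which mirrors the statement that each higher level incorporates all of the features of the previous levels.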
Which one is right for me?
Data center infrastructure costs and operational complexities increase with Tier level, and it is up to the data center owner to determine the Tier level that fits his or her business’s need. A Tier IV solution is not always “better” than a Tier II solution, because the business value for the specific application to be run in that site may be lower. Remember, this must be more than a technical discussion. The data center infrastructure needs to match the business application, otherwise companies can overinvest for less critical services or take on too much risk for key applications. And it’s also important to remember that regardless of which Tier is chosen for the design, the long-term value of any site is determined by how well it performs in production, which is largely a function of how it is operated! This critical operational aspect is covered in detail by the Tier Standard as well, in the second volume covered below, entitled “Operational Sustainability.”
Uptime Institute recognizes that most data center designs are custom endeavors, with complex design elements and multiple technology choices. With all of the technology choices available to designers today, including renewable power sources, innovative distribution and advanced cooling approaches, the Tier Standard of Topology is unique in being able to encourage that usage, as long as the resulting infrastructure precisely meets the outcomes defined in the Tier Standard for the level chosen. As such, the Tier Classification System does not prescribe any specific technology, schematic or other design criteria beyond those resulting outcomes stated above. It is up to the data center owner to meet those outcome criteria in any method that fits his or her infrastructure goals based on the business parameters.
One point of clarification: In 2009, Uptime Institute removed all references to “expected downtime per year” when using the Tier Standard. The current Tier Standard of Topology does not assign availability predictions to Tier levels. This change was due to a maturation of the data center industry, and the understanding that operational behaviors can have a huge impact on site availability regardless of the technical prowess of the design and build.
The Tier Standard
Both volumes of the Tier Standard documents (Topology and Operational Sustainability) are freely available and can be downloaded with the click of a button after agreeing to a simple MOU. Tens of thousands of data center sites have already used the Tier Standard as their reference when creating new capacity. Many of these sites are designed using best effort to create capacity that meets the desired outcomes specified in the Tier Standard. But in 2021, with all of the global pressures being seen, “best effort” is no longer enough. Digital business transformation has created an infrastructure-centric world, and when capacity stumbles, business stops. So the critical nature of that capacity is now paramount.
In most cases, executive management and other stakeholders expect the technologists to have taken every rational step available to assure that business continues to operate under the entire range of expected operating conditions. Ultimately most organizations have adopted the ‘trust but verify’ mentality. So while they trust their technical teams to have created business-aligned capacity at the right size, at the right cost, and with the ability to operate properly under all expected operating conditions, they require some form of verification that their work has been done correctly. And that is where Uptime Institute’s Certification services come into the discussion.
Leveraging Tier Certification
The Tier Certification process typically starts with a company deploying new data center capacity. The data center owner defines a need to achieve a specific Tier Level to match a business demand. Data center construction projects usually have two main phases: 1) the design, and 2) the construction. This is followed by a third phase that begins with commissioning and continues indefinitely throughout production operations. It’s very important to have a solid plan for each of these three phases!
Data center owners turn to Uptime Institute during all of these phases for an unbiased review of their work, to ensure that data center designers, contractors and service providers are delivering against their requirements and expectations. In essence, they look to Uptime Institute to verify that the designers have implemented the Tier Standard properly (what we call Tier Certification of Design Documents), that the construction contractors have built properly based on the design (what we call Tier Certification of Constructed Facility), and that the operators have a comprehensive plan to assure the site performs as needed over time, over the wide range of expected operating conditions.
As the author of the Tier Standard, and to assure absolute client confidence in the resulting review, Uptime Institute has chosen to be the only organization that can formally certify data centers against the Tier Classification System and issue resulting Tier Standard award documents. And since Uptime Institute does not design, build or operate data centers, our only role is to evaluate designs, constructed projects and operational plans for their ability to deliver the outcomes defined in the Tier Standard. Clients who engage with Uptime Institute and receive their respective awards can be assured that what they are creating meets the outcomes defined in the Tier Standard.
Tier Certification Steps
The first step in a Tier Certification process is a Tier Certification of Design Documents (TCDD). Uptime Institute consultants review 100% of the design documents, ensuring that each subsystem (electrical, mechanical, monitoring, and automation) meets the fundamental concepts and that there are no weak links in the chain. Uptime Institute establishes the ability for any facility built faithfully using the reviewed design documents to meet the outcomes defined in the Tier Standard itself. The review process may be an iterative one, and may require follow-up discussions and potential changes to the design to meet the stated objectives. Problems identified in the design phase are much less expensive to remediate than those that would be caught subsequently in the construction phase. Once Tier Standard compliance is confirmed, Uptime Institute then awards a TCDD Certification and the appropriate documentation stating this compliance. This is a great first step, as it assures that the design itself will meet the stated performance needs of the business.
Uptime Institute has reviewed every conceivable type of data center across the world. We’ve worked with thousands of clients in more than 100 countries. As you might imagine, we’ve learned a few things along the way. One of the most important observations we’ve made is that mistakes in data center construction projects are much more common than you would intuitively think, and some companies may not realize this. The chosen construction firm must ultimately interpret the design drawings and implement the subsystems accurately. In our nearly 30 years of experience, we find that more than 85% of all data center designs are incorrectly executed in construction. So while a Tier Certification of Design Documents verifies that the project will deliver the required performance on paper, the real-world as-built performance may be drastically different depending on how accurately the construction phase is executed. And that is why Uptime Institute offers the Tier Certification of Constructed Facility (TCCF) as well. It assures that contractors accurately build what was designed, which is the most important assurance executive teams, shareholders and stakeholders demand.
Remember, there are two phases in data center construction projects. The TCDD Certification is never supposed to be a ‘final stage’ in the Certification process, but rather a checkpoint for companies to demonstrate that the first phase of the capital project met Tier Standard-based performance requirements. And since design requirements and technologies change quickly, Uptime Institute Certifies our clients’ designs for a period of two years to assure that resulting construction projects are always the most effective they can be and that they are continuously testing their assertions, embracing new business needs and leveraging new technological approaches as needed.
With construction projects regularly exceeding $100 million USD, data center owners use the Tier Certification of Design Documents to hold the project teams accountable, and to ensure that the resulting construction expenditures will produce performance that meets their stated design objectives.
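The two-year validity of a design certification mentioned above can be expressed as a small date check. This is a minimal sketch under stated assumptions: the text does not say exactly when the two-year period starts or how the end date is computed, so counting from the award date, the leap-day handling, and the names `tcdd_expiry` and `is_current` are all assumptions made for illustration.

```python
from datetime import date

# Validity period for a design certification, as stated in the text.
VALIDITY_YEARS = 2


def tcdd_expiry(award_date: date) -> date:
    """Return the assumed expiry date, counted from the award date."""
    try:
        return award_date.replace(year=award_date.year + VALIDITY_YEARS)
    except ValueError:
        # Award date is Feb 29 and the target year is not a leap year:
        # fall back to Feb 28 (an assumption of this sketch).
        return award_date.replace(year=award_date.year + VALIDITY_YEARS, day=28)


def is_current(award_date: date, today: date) -> bool:
    """Return True while the design certification is within its period."""
    return today < tcdd_expiry(award_date)


print(tcdd_expiry(date(2021, 4, 15)))
print(is_current(date(2021, 4, 15), date(2022, 1, 1)))
```

A project team could use a check like this in planning to flag when construction is likely to start after the design certification has lapsed.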
This brings us to the next phase in a Tier Certification process: TCCF. During a TCCF, a team of Uptime Institute consultants conducts a site visit, identifying discrepancies between the design drawings and installed equipment. Our consultants observe tests and demonstrations to prove Tier Standard compliance and the resulting performance goals. Fundamentally, this is the value of the Tier Certification – finding these blind spots and weak points in the chain. When the data center owner addresses deficiencies, and Uptime Institute verifies the ultimate performance of the facility (including a complete “pull the plug” test at the end), we award the TCCF documentation.
Going Beyond Tier Standard Design and Construction Certification: Tier Certification of Operational Sustainability
And the final Tier Certification phase of any new project is referred to as the “Tier Certification of Operational Sustainability” (TCOS). This is a critically important, yet often overlooked, component for performance. Uptime Institute will assess the operational plans and parameters for any Tier Certified data center, and help the client understand where issues may occur that could derail the production in a Tier Certified site. In 2021, we find that human error still accounts for more than 70% of all downtime, and some of the most widely known outages have found their root cause in preventable scenarios. In our 2021 outage analysis survey of nearly 1000 operators, more than 75% admitted that their most recent outage was PREVENTABLE! Hence, TCOS must be considered as the critically important final step when any new capacity is planned.
As mentioned previously, Uptime Institute recognizes the huge role operations plays in keeping data center services available. To that end, Uptime Institute developed the Tier Standard: Operational Sustainability and certifies data center operations much in the same manner as it does with the physical facility itself. This is a site-specific assessment and benchmarking of a facilities management team’s operational processes, with an on-site visit and detailed reporting. Any Tier-Certified site can add operational certification through the TCOS process.
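The sequence of Tier Certification steps described so far (TCDD, then TCCF, then TCOS) can be sketched as a simple ordered progression. This is an illustration only: the `STEPS` list and the `next_step` function are hypothetical names, and treating the order as strictly enforced is an assumption of this sketch, not a statement of Uptime Institute’s process rules.

```python
from typing import List, Optional

# Order of the certification steps as described in the text.
STEPS = ["TCDD", "TCCF", "TCOS"]


def next_step(completed: List[str]) -> Optional[str]:
    """Return the next certification step, or None when all are done.

    Raises ValueError if the completed steps are not a prefix of STEPS,
    i.e. if a step was skipped or taken out of order (assumed rule).
    """
    if completed != STEPS[:len(completed)]:
        raise ValueError(f"steps out of order: {completed}")
    if len(completed) == len(STEPS):
        return None
    return STEPS[len(completed)]


print(next_step([]))
print(next_step(["TCDD"]))
print(next_step(["TCDD", "TCCF"]))
```

Modeling the steps as a prefix check keeps the sketch small while still rejecting, for example, a claimed constructed-facility certification with no design certification behind it.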
(For clients with existing non-Tier Certified sites, or that have for whatever reason chosen not to certify data center facilities against Tier Standard performance goals, a similar certification of operational sustainability can be performed, referred to as the Management & Operations (M&O) Stamp of Approval.)
Just as with the Tier Certification of Operational Sustainability, the client and Uptime Institute work together to assess the selected site(s) against the M&O criteria and desired outcomes. And again, since the criteria were drawn from Uptime Institute’s Tier Standard: Operational Sustainability, the assessment follows the same processes and methodology to objectively assess compliance. The M&O Stamp of Approval is a certification of operations, and has been vetted through enterprise owners, outsourced operations teams, and multi-tenant industry practitioners to assure compatibility with a wide variety of management solutions and across multiple computing environments.
Common elements assessed in the TCOS and M&O engagements:
-Staffing and Organization (staffing levels, qualifications, and skill mix)
-Training and Professional Development Assessment
-Preventative Maintenance Program and Processes
-Operating Conditions and Housekeeping
-Planning, Management, and Coordination practices and resources
Please refer to Tier Standard: Operational Sustainability for full criteria.
Tier Certification Clients
Literally thousands of clients use the Tier Standard for their data center projects. It is the de-facto standard because it focuses on results rather than checklists. The clearest proof of its value is the huge list of companies investing in Tier Certification. And while it is easy for less-invested parties to ‘claim Tier compliance,’ it is a wholly different matter to lay your solution open to a rigorous review by Uptime Institute.
Look at adoption among the Financial Services community, telecommunications companies, colocation providers and the many data center developers, including Digital Realty, Equinix, Compass, Stack and Cyxtera. We have been pleased to impress each and every one of those companies with our dedication to quality and thoroughness, because we understand all that is on the line for them and their clients. Once engaged, we have responsibilities in their core business platforms, and take that responsibility very seriously.
Here is the full list of Tier Certification awards.
By covering the essential operational areas assessed in the TCOS and M&O engagements, a management team can operate a site to its full uptime potential, obtain maximum leverage of the installed infrastructure/design and improve the efficacy of operations.