A new process to compare IT deployment options across both in-house and outsourced alternatives
By Julian Kudritzki and Matt Stansberry, Uptime Institute
Uptime Institute FORCSS™ is an original system to capture, compare, and prioritize the various impacts of the many IT deployment alternatives.
On an ongoing basis, enterprise organizations decide to deploy IT assets in an internal data center, colocation facility, hosting environment, or cloud solution. These decisions may not holistically view the financial, risk, performance, or other impacts. FORCSS enables the management of an enterprise organization to identify, weigh, and communicate the advantages and risks of IT application deployment options using a consistent and relevant set of criteria based on business drivers and influences.
Since its inception, the mission of the Uptime Institute has been to assist the enterprise in devising feasible and adaptable data center solutions that are responsive to the business. Successful solutions align data center design, technology selection, construction, and operation to achieve high reliability. One of the leading challenges today is deciding the most viable IT deployment option.
FORCSS helps the enterprise to overcome this challenge by focusing on the critical selection factors, thereby reducing or eliminating unfounded assumptions and organizational “blind spots.” FORCSS establishes a consistent and repeatable set of evaluation criteria and a structure to communicate the informed decision to stakeholders.
A coherent IT deployment strategy is often difficult to achieve because the staff responsible for IT assets and IT services across multiple geographies and multiple operating units are themselves spread over multiple geographies and operating units. The result can be a range of operating goals, modes, and needs that are virtually impossible to incorporate into a single, unified deployment strategy. And when a single strategy is developed from the “top down,” the staff responsible for implementing that strategy often struggle to adapt it to their operational requirements and environments.
FORCSS was developed to provide organizations with the flexibility to respond to varying organizational needs while maintaining a consistent overall strategic approach to IT deployments. FORCSS represents a process a) to apply consistent selection criteria to specific deployment options, and b) to translate the outcome of the key criteria into a concise structure that can be presented to “non-IT” executive management.
The FORCSS system is composed of six necessary and sufficient selection factors relevant to an effective deployment decision. These six factors, or criteria, provide a holistic evaluation system, and drive a succinct decision exercise that avoids analytical paralysis. FORCSS identifies the relevant internal and external input.
And, by scaling the importance of the criteria within the system, FORCSS allows each organization to align the decision process to organizational needs and business drivers.
FORCSS Factors And Their Definitions
The Uptime Institute FORCSS system is a means to evaluate deployment alternatives. Accordingly, it is crucial to have a working knowledge of the tangible or intangible values associated with the application being deployed. Tangible values are notably revenues; intangible values include end-user satisfaction. To be effective and lasting, FORCSS must involve the stakeholder requesting the IT deployment. In other words, don’t lose sight of your client.
Financial: The fiscal consequences associated with deployment alternatives.
- Net Revenue Impact: An estimation of gross profit margin—estimated revenues of IT service or application minus cost of ownership.
- Comparative Cost of Ownership: The identified differential cost of deploying the alternative plus ongoing operations and maintenance, including the incremental cost of scaling the alternative as business grows. For example: Significant cost centers can include real estate development, MEP infrastructure, cost of financing, taxes, IT equipment, software license and customization, staffing, service provider and consulting fees. The most definitive cost information for each alternative is from a Total Cost Of Ownership (TCO) accounting protocol, for those few companies that have the capability to reliably determine TCO. Differential and incremental cost is often more directly determined.
- Cash and Funding Commitment: Representation of liquidity—cash necessary at appropriate intervals for the projected duration of the business service.
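As a rough illustration of how the three Financial components relate, the sketch below computes a net revenue impact and a running cash requirement for one alternative. The class name, field names, and all figures are hypothetical and are not part of FORCSS; they only show the arithmetic described above.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class FinancialEstimate:
    """Hypothetical inputs for one deployment alternative (not a FORCSS structure)."""
    name: str
    estimated_revenue: float   # revenue of the IT service over its projected life
    cost_of_ownership: float   # deployment plus ongoing operations and maintenance
    cash_by_year: list[float]  # cash required in each year of the service

    def net_revenue_impact(self) -> float:
        # Estimated revenues minus cost of ownership
        return self.estimated_revenue - self.cost_of_ownership

    def cumulative_cash(self) -> list[float]:
        # Running total of cash committed at each interval
        return list(accumulate(self.cash_by_year))


# Illustrative figures only
colo = FinancialEstimate(
    name="colocation",
    estimated_revenue=5_000_000,
    cost_of_ownership=3_200_000,
    cash_by_year=[400_000, 900_000, 900_000, 1_000_000],
)
print(colo.net_revenue_impact())
print(colo.cumulative_cash())
```

Running the same calculation for each alternative gives a like-for-like view of margin and liquidity, even where a full TCO accounting is not available.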
Opportunity: A deployment alternative’s ability to fulfill compute capacity demand over time.
- Time to Value: The time period from decision to IT service availability. Timeline must include department deployment schedules of IT, facilities, network, and service providers.
- Scalable Capacity: Available capacity for expansion of a given deployment alternative.
- Business Leverage and Synergy: Significant ancillary benefits of a deployment alternative outside of the specific application or business service.
For example: Improve economies of scale and pricing for other applications. Or, geographic location of a particular site provides business benefits beyond the scope of a single application.
Risk: A deployment alternative’s potential for negative business impacts.
- Cost of Downtime vs. Availability: Estimated cost of an IT service outage vs. forecasted availability of deployment alternative.
- Acceptable Security Assessment: Internal security staff evaluation of deployment alternative’s physical and data security.
- Supplier Flexibility: Potential “lock-ins” from a technical or contractual standpoint.
For example: Rating situations as simple, difficult/costly, or impossible to negotiate regarding software, hardware, site, and service provider commitments.
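One way to put "Cost of Downtime vs. Availability" on a common footing is to convert a forecast availability into expected outage hours per year and multiply by an estimated hourly outage cost. The sketch below assumes a simple linear model; the function names, availability figures, and hourly cost are all hypothetical.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def expected_annual_downtime_hours(availability: float) -> float:
    """Outage hours per year implied by a forecast availability between 0 and 1."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


def expected_annual_downtime_cost(availability: float, cost_per_hour: float) -> float:
    """Expected yearly outage cost, assuming cost scales linearly with outage hours."""
    return expected_annual_downtime_hours(availability) * cost_per_hour


# Illustrative forecasts for two alternatives, using the same hourly outage cost
alternatives = {"internal data center": 0.9995, "colocation": 0.999}
for name, availability in alternatives.items():
    cost = expected_annual_downtime_cost(availability, cost_per_hour=50_000)
    print(f"{name}: {cost:,.0f} per year")
```

A linear model ignores effects such as reputational damage from a single long outage, so the result is better read as a comparison aid than as a forecast.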
Compliance: Verification, internal and/or third-party, of a deployment alternative’s compliance with regulatory, industry, or other relevant criteria.
- Government: Legally mandated reporting obligations associated with the application or business service. For example: HIPAA, Sarbanes-Oxley, PCI-DSS.
- Corporate Policies: Internal reporting requirements associated with the application or business service. For example: Data protection and privacy, ethical procurement, Corporate Social Responsibility.
- Compliance & Certifications to Industry Standards: Current or recurring validations achieved by the site or service provider, beyond internal and governmental regulations. For example: SAS 70®, SSAE 16, Uptime Institute Tier Certification or M&O Stamp of Approval, ISO®.
Sustainability: Environmental consequences of a deployment alternative.
- Carbon and Water Impact: Carbon and water use for a given site or service. For example: The Green Grid’s Carbon Usage Effectiveness (CUE)™ and Water Usage Effectiveness (WUE)™ metrics.
- Green Compliance & Certifications: Current or recurring validations achieved by the site or service provider, beyond internal and governmental regulations, of sustainable design and/or operations practices. For example: LEED®, BREEAM®, Carbon Credits, European Union Code of Conduct, U.S. EPA Energy Star®, and The Green Grid’s® DC Maturity Model equalizer.
- PUE Reporting: PUE is an industry-accepted indicator of a site or service provider’s efficiency commitment.
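The three efficiency metrics named above are simple ratios against IT equipment energy, as commonly defined by The Green Grid. The sketch below shows the arithmetic; the function signatures and the annual figures are hypothetical and are included only to make the units concrete.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    return total_facility_kwh / it_equipment_kwh


def cue(total_co2_kg: float, it_equipment_kwh: float) -> float:
    """Carbon Usage Effectiveness: kg CO2-equivalent per kWh of IT energy."""
    return total_co2_kg / it_equipment_kwh


def wue(site_water_liters: float, it_equipment_kwh: float) -> float:
    """Water Usage Effectiveness: liters of site water per kWh of IT energy."""
    return site_water_liters / it_equipment_kwh


# Illustrative annual figures for one site
it_kwh = 8_000_000
print(pue(12_000_000, it_kwh))  # dimensionless; lower is better, 1.0 is the floor
print(cue(6_000_000, it_kwh))   # kg CO2e per IT kWh
print(wue(14_400_000, it_kwh))  # liters per IT kWh
```

Because all three share the same denominator, they are only comparable across alternatives when IT energy is measured the same way at each site.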
Service Quality: A deployment alternative’s capability to meet end-user performance requirements.
- Application Availability: Computing environment uptime at the application or operating system level.
- Application Performance: Evaluation of an application functional response; acceptable speeds at the end-user level.
- End-User Satisfaction: Stakeholder response that an application or deployment alternative addresses end-user functional needs. For example: End-user preference for Graphical User Interfaces or Operating/Management Systems tied to a specific deployment alternative.
Using Uptime Institute FORCSS
This system was developed and validated by thought leaders in the enterprise IT industry to ensure usefulness by those who inform senior-level decision makers. Many organizations already perform due diligence that would include most of this process. But the Uptime Institute FORCSS system provides the following:
- A structure and a set of common definitions agreed upon by an elite group of data center owners and operators from around the world.
- A succinct and effective way to communicate recommendations to the C-level executives.
Uptime Institute believes the FORCSS system is sufficiently flexible and comprehensive to improve IT investment decisions.
Notes on using FORCSS:
Uptime Institute acknowledges that there are overlaps and dependencies across all six factors. But, in order to provide a succinct, sufficient process to inform C-level decision makers, categories must be finite and separate to avoid analysis paralysis. The purpose of FORCSS is to identify the business requirements of the IT service, and pragmatically evaluate capabilities of potential deployment options as defined.
Uptime Institute recognizes organizations will have unique business demands and priorities. Therefore, it will be necessary for each company conducting a FORCSS analysis to weigh each criterion according to specific business requirements. For example, most companies try to maximize data center efficiency. But, for a growing number of organizations, overall environmental sustainability of operations and supplier choices is a very public (therefore critical) aspect of their business. Organizations that put a high value on sustainability will weigh the criteria accordingly when applying FORCSS in their organizations. Other organizations may weigh sustainability at a low value, as inconsequential.
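To make the idea of weighing criteria concrete, here is a minimal sketch in which an organization assigns its own weight to each of the six factors behind the FORCSS acronym and applies those weights to stakeholder ratings. The 1-5 rating scale, the weights, and the function name are assumptions for illustration; FORCSS itself does not prescribe them. The result is kept per factor rather than summed into one score.

```python
FACTORS = ("Financial", "Opportunity", "Risk",
           "Compliance", "Sustainability", "Service Quality")


def weighted_profile(ratings: dict[str, float], weights: dict[str, float]) -> dict[str, float]:
    """Scale each factor rating by the organization's weight for that factor."""
    missing = [f for f in FACTORS if f not in ratings or f not in weights]
    if missing:
        raise KeyError(f"missing factors: {missing}")
    return {f: ratings[f] * weights[f] for f in FACTORS}


# Hypothetical weights for an organization that treats sustainability as critical
weights = {"Financial": 1.0, "Opportunity": 1.0, "Risk": 1.5,
           "Compliance": 1.0, "Sustainability": 2.0, "Service Quality": 1.0}

# Hypothetical 1-5 ratings agreed by stakeholders for two alternatives
ratings = {
    "internal data center": {"Financial": 2, "Opportunity": 3, "Risk": 4,
                             "Compliance": 4, "Sustainability": 2, "Service Quality": 4},
    "colocation": {"Financial": 4, "Opportunity": 4, "Risk": 3,
                   "Compliance": 3, "Sustainability": 4, "Service Quality": 3},
}

profiles = {name: weighted_profile(r, weights) for name, r in ratings.items()}
for factor in FACTORS:
    row = "  ".join(f"{name}={profiles[name][factor]:.1f}" for name in profiles)
    print(f"{factor:<16}{row}")
```

An organization that regards sustainability as inconsequential would simply lower that weight, and the same ratings would produce a different profile.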
Uptime Institute is currently evaluating numerous concepts for FORCSS ‘displays.’ These displays will be graphical in nature, rather than a numerical score, to allow for evaluation of each factor within FORCSS and provide a visual comparison of one deployment alternative against another. Please visit FORCSS on the Uptime Institute Web site for the latest information and tools.
Uptime Institute’s Unique Approach To FORCSS Development
In order to ensure the development of a well-rounded, thorough, and useful methodology, Uptime Institute facilitated a series of Charrettes. (A Charrette is a design process that brings stakeholders together at one time, in one place, as a group completing design tasks in a focused, time-limited effort.) The benefits of this approach are that the stakeholders begin with a common understanding of the design objective, share in the development process, and receive immediate feedback on the result of their deliberations.
In October 2011 the first Charrette was held, composed of peers within Uptime Institute and the 451 Group. The fundamental objective was to define the problem and assemble an original framework to be submitted at a second Charrette of key industry stakeholders. This initial work created the structure of a multiple-component solution, including business functions, facilities infrastructure, computing hardware, and applications performance perspectives.
Building on this foundational effort, in January 2012, Uptime Institute hosted over 25 hand-picked senior technology executives from large organizations across multiple industries at a second Charrette. Uptime Institute invited executive leaders at organizations whose decisions impacted international markets and brands and provided broad experience making decisions influenced by multiple factors and challenges.
This group edited and crystallized the original structure into six top-level criteria, or principal factors, that make up the FORCSS framework. Following the second Charrette, Uptime Institute identified three key components for each of the six top-level criteria to further define the FORCSS criteria, and presented the expanded system at Uptime Institute Symposium in Santa Clara, CA, in May 2012.
At Symposium, Uptime Institute reconvened the previous group of executives who comprised the second Charrette, as well as new end-user participants, for a follow-up Charrette on FORCSS.
Some of the new participants represented companies that had been in business for more than 100 years and plan to be in business another 100 years. Many of these organizations are at a strategic inflection point—do they modernize or minimize their IT infrastructures? The participants recognized the FORCSS approach as a means to improve confidence in decision making and avoid unintended consequences.
The third Charrette participants were tasked with vetting the expanded 18-point FORCSS process. The discussions and debate provided substantive insight resulting in changes to the components making up the six factors.
The majority of executives at the second Charrette reported consistent and enduring challenges within their organizations:
- Incomplete data when evaluating internal assets, such as data center capital costs that aren’t included in TCO calculations for IT projects, or lack of insight into personnel costs associated with providing internal
- Lack of insight into cloud computing security, pricing models, and reliability data. Lack of credible cloud computing case studies.
- Inconsistency in reporting structures across geographies and divisions and between internal resources and
- Difficulty articulating business value for criteria not tied to a specific cost metric, like redundancy or service quality. Difficulty connecting IT metrics to business performance metrics.
- Challenge of capacity planning for IT requirements forecast beyond six months due to evolving architecture/application strategy and shifting vendor roadmaps.
- Difficulty collecting information across the various stakeholders, from application development to corporate real estate.
FORCSS Begins With These Steps:
- The first step is to identify the new application workload to be analyzed. The process is designed to evaluate a specific application workload against specific, existing assets or external resources (or, in cases where a new site or service may be considered, a detailed evaluation of the planned asset).
- Identify and engage the decision maker or C-level executive who will sign off on the final project. Provide background on FORCSS as a selection tool for winnowing deployment choices and eliminating blind spots in an organization.
- Identify senior management in adjacent divisions to assess the implementation being considered. No one person will have sufficient insight into all areas of an organization. Be sure to include application owners and users, facilities/real estate, IT operations, and any other stakeholders.
- Set parameters for your application to determine the functional life cycle of the application or IT service being analyzed in order to determine the value of the application, appropriate cost profile, and other necessary attributes that ensure the viability of the business solution.
Uptime Institute recognizes the many challenges in conducting a FORCSS analysis:
- Getting buy-in and understanding of the FORCSS language across disciplines and at the C-level.
- Avoiding inappropriate weighting of Risk or other criteria based on division bias.
- Obtaining objective data on third-party service provider pricing and availability.
Also, many companies may be challenged by the subjective nature of some of the inputs or have difficulty determining the true costs and benefits of various projects.
The purpose of this timely initiative is to improve a company’s investments and decision making, not to compare one company’s decisions against another’s. The way one organization determines the business value of an application or total cost of providing a service does not need to be the same as how another organization gathers those same data inputs.
A FORCSS analysis may pose tough questions without easy answers, but will help organizations make IT deployment decisions with confidence.
Julian Kudritzki joined the Uptime Institute in 2004 and currently serves as Chief Operating Officer. He is responsible for the global proliferation of Uptime Institute Standards. He has supported the founding of Uptime Institute offices in numerous regions, including Brazil, Russia and North Asia. He has collaborated on the development of numerous Uptime Institute publications, education programs and unique initiatives such as Server Roundup and FORCSS. He is based in Seattle, WA.
Matt Stansberry is Director of Content and Publications for the Uptime Institute and also serves as Program Director for the Uptime Institute Symposium, an annual spring event that brings together 1,500 stakeholders in enterprise IT, data center facilities, and corporate real estate to deal with the critical issues surrounding enterprise computing. He was formerly Editorial Director for TechTarget’s Data Center and Virtualization media group, and was managing editor of Today’s Facility Manager magazine. He has reported on the convergence of IT and Facilities for over a decade.
Three data center design professionals talk about their work and Uptime Institute’s ATD program.
By Kevin Heslin, Uptime Institute
Uptime Institute’s Accredited Tier Designer (ATD) program and its Tier Certification program have affected data center design around the world, raised standards for construction, and brought a new level of sophistication to facility owners, operators, and designers everywhere, according to three far-flung professionals who have completed the ATD program. Adel Rizk of Saudi Arabia’s Edarat, Gerard Thibault, senior technical director, Design and Construction division of Digital Realty (DLR) in the U.K., and Michael Kalny, head of Metronode Engineering, Leighton Telecommunications Group, in Australia, have applied the concepts they learned in the ATD program to develop new facilities and improve the operation of legacy facilities while also aggressively implementing energy-efficiency programs. Together, they prove that high reliability and energy efficiency are not mutually exclusive goals. Of course, they each work in different business environments in different countries, and the story of how they achieve their goals under such different circumstances makes interesting reading.
In addition to achieving professional success, the ATDs each noted that Tier certification and ATD programs had helped them innovate and develop new approaches to data center design and operations while helping market facilities and raise the standards of construction in their countries.
More than that, Rizk, Thibault, and Kalny have followed career arcs with some similarities. Each developed data center expertise after entering the field from a different discipline, Rizk from telephony and manufacturing, Thibault from real estate, and Kalny from building fiber transmission networks. They each acknowledge the ATD program as having deepened their understanding of data center design and construction and having increased their ability to contribute to major company initiatives. This similarity has particular significance in the cases of Rizk and Kalny, who have become data center experts in regions that often depend on consultants and operators from around the globe to ensure reliability and energy efficiency. It is in these areas, perhaps, that the ATD credential and Tier certification have their greatest impact.
On the other hand, the U.K., especially London, has been the home of many sophisticated data center operators and customers for years, making Thibault’s task of modifying Digital Realty’s U.S. specification to meet European market demands a critical one.
On the technology front, all three see continued advances in energy efficiency, and they all see market demand for greater sustainability and energy efficiency. Kalny and Thibault both noted increased adoption of higher server air supply temperatures in data centers and the use of outside air. Kalny, located in Australia, noted extreme interest in a number of water-saving technologies.
Hear from these three ATDs below:
Just tell me a little about yourself.
I’m a consulting engineer. After graduating from a civil engineering program in 1998 and working for a few years on public projects for the Public Switched Telephone Network (PSTN) Outside Plant (OSP), I decided in 2000 to change my career and joined a manufacturer of fast-moving consumer goods. During this period, I also pursued my MBA.
After gaining knowledge and experience in IT by enhancing and automating the manufacturer’s operations and business processes, I found an opportunity to start my own business in IT consulting with two friends and colleagues of mine and co-founded Edarat Group in 2005.
As a consultant working in Edarat Group, I also pursued professional certifications in project management (PMP) and business continuity (MBCI) and was in charge of implementing the Business Continuity Management Program for telecom and financial institutions in Saudi Arabia.
How did you transition from this IT environment to data centers?
One day, a customer who was operating a strategic and mission-critical data center facility asked me to help him improve the reliability of his MEP infrastructure. I turned his problem into an opportunity and ventured into the data center facility infrastructure business in 2008.
In 2009-2010, Edarat Group, in partnership with IDGroup, a leading data center design company based out of Boston, developed the design for two Tier IV and two Tier III data centers for a telecom operator and the smart cities being built in Riyadh by the Public Pension Agency. In 2010, I got accredited as a Tier Designer (ATD) by the Uptime Institute, and all four facilities achieved Tier Certification of Design Documents (TCDD).
What was the impact of the Tier certification?
Once we succeeded in achieving the Tier Certification, it was like a tipping point.
We became the leading company in the region in data center design. Saudi Arabia values certifications very highly. Any certification is considered valuable and even considered a requirement for businesses, as well as for professionals. By the same token, the ATD certificate positioned me as the lead consultant at that time.
Since that time, Edarat has grown very rapidly, working on the design and construction supervision of Tier III, Tier IV, and even Tier II facilities. Today, we have at least 10 facilities that have received design Tier certifications and one facility that is Tier III Certified as a Constructed Facility (TCCF).
What has been your personal involvement in projects at Edarat?
I am involved in every detail in the design and construction process. I have full confidence in these facilities being built, and Uptime Institute Certifications are mere evidence of these significant successful achievements.
What is Edarat doing today?
Currently, we are involved in design and construction. In construction, we review material submittals and shop drawings and apply value engineering to make sure that the changes during construction don’t affect reliability or Tier certification of the constructed facility. Finally, we manage the integrated testing and final stages of commissioning and ensure smooth handover to the operations team.
Are all your projects in Saudi Arabia?
No. We also obtained Tier III certification for a renowned bank in Lebanon. We also have done consultancy work for data centers in Abu Dhabi and Muscat.
What stimulates demand for Tier certification in Saudi Arabia?
Well, there are two factors: the guarantee of quality and the show-off factor due to competition. Some customers have asked us to design and build a Tier IV facility for them, though they can tolerate a long period of downtime and would not suffer great losses from a business outage.
Edarat Group is vendor-neutral, and as consultants, it is our job to educate the customer and raise his awareness because investing in a Tier IV facility should be justifiable when compared to the cost of disruption.
My experience in business continuity enables me to help customers meet their business requirements. A data center facility should be fit-for-purpose, and every customer is unique, each having different business, regulatory, and operational requirements. You can’t just copy and paste. Modeling is most important at the beginning of every data center design project.
Though it may seem like hype, I strongly believe that Uptime Institute certification is a guarantee of reliability and high availability.
What has the effect of the ATD been on the data center industry in Saudi Arabia?
Now you can see other players in the market, including systems integrators, getting their engineers ATD certified. Being ATD certified really helps. I personally always refer to the training booklet; you can’t capture and remember everything about Tiers after just three days of training.
What’s unique about data centers in Saudi Arabia?
Energy is cheap; telecom is also cheap. In addition, Saudi Arabia is a gateway from Europe to Asia. The SAS1 cable connects Europe to India through Saudi Arabia. Energy-efficient solutions are difficult to achieve. Free cooling is not available in the major cities, and connectivity is not yet available in remote areas where free cooling is available for longer periods during the year. In addition, the climate conditions are not very favorable to energy-efficient solutions; for example, dust and sand make it difficult to rely on solar power. In Riyadh, the cost of water is so high that it makes the cost of investing in cooling towers unjustifiable compared to air-cooled chillers. It could take 10 years to get payback on such a system.
Budget can sometimes be a constraint on energy efficiency because, as you know, green solutions have high capex, which is unattractive because energy is cheap in Saudi Arabia. If you use free cooling, there are limited hours, plus the climate is sandy, which renders maintenance costs high. So the total cost of ownership for a green solution is not really justifiable from an economic perspective, and the government so far does not have any regulations on carbon emissions and so forth.
Therefore, in the big cities, Riyadh, Dammam, and Jeddah, we focus primarily on reliability. Nevertheless, some customers still want to achieve LEED Gold.
What’s the future for Edarat?
We are expanding geographically and expanding our services portfolio. After design and building, the challenge is now in the operation. As you already know, human errors represent 70 percent of the causes for downtime. Customers are now seeking our help to provide consultancy in facility management, such as training, drafting SOPs, EOPs, capacity management, and change management procedures.
I received an honors degree in electrical engineering in the 1980s and completed a postgraduate diploma in communications systems a couple of years later. In conjunction with practical experience working as a technical officer and engineer in the telecommunications field, I gained a very sound foundation on which to progress my career in the ICT space.
My career started with a technical officer position in a company called Telecom, the monopoly carrier operating in Australia. I was there about 14 years and worked my way through many departments within the company including research, design, construction and business planning. It was a time spent learning much about the ICT business and applying engineering skills and experience to modernize and progress the Telecom business. Around 1990 the Australian government decided to end the carrier monopoly, and a company by the name of Optus emerged to compete directly with Telecom. Optus was backed by overseas carriers BellSouth (USA) and Cable and Wireless (UK). There were many new exciting opportunities in the ICT carrier space when Optus began operations. At this point I left Telecom and started with Optus as project manager to design and construct Optus’ national fiber and data center network around Australia.
Telecom was viewed by many as slow to introduce new technologies and services and not competitive compared to many overseas carriers. Optus changed all that. They introduced heavily discounted local and overseas calls, mobile cellular systems, pay TV, point-to-point high-capacity business network services and a host of other value-added services for business and residential customers. At the time, Telecom struggled to develop and launch service offerings that could compete with Optus, and a large portion of the Australian population embraced the growth and service offerings available from Optus.
I spent 10 years at Optus, where I managed the 8500-kilometer rollout of fiber that extended from Perth to Adelaide, Melbourne, Canberra, Sydney and Brisbane. I must have done a good job on the build, as I was promoted to the role of Field/National Operations manager to manage all the infrastructure that was built in the first four years. Maybe that was the punishment in some way? I had a workgroup of some 300-400 staff during this period and gained a great deal of operational experience.
The breadth of knowledge, experience and networks established during my time at Optus was invaluable and led me to my next exciting role in the telecommunications industry during 2001. Nextgen Networks was basically formed to fill a void in the Australian long-haul, high-capacity digital transmission carrier market, spanning all mainland capital cities with high-speed fiber networks. Leighton Contractors was engaged to build the network and maintain it. In conjunction with transmission carriage services, Nextgen also pre-empted the introduction and development of transmission nodes and data center services. Major rollouts of fiber networks, associated transmission hubs and data centers were a major undertaking, providing exciting opportunities to employ innovation and new technologies.
My new role within Leighton Telecommunications Group to support and build the Nextgen network had several similarities with Optus. Included were design activities and technical acceptance from the builder of all built infrastructure, including transmission nodes and data centers. During 2003 Nextgen Networks went into administration as the forecast demand for large amounts of transmission capacity did not eventuate. Leighton Telecommunications realized the future opportunities and potential of the Nextgen assets and purchased the company. Through good strategic planning and business management, the Nextgen business has continued to successfully expand and grow. It is now Australia’s third largest carrier. This was indeed a success story for the Leighton Telecommunications Group.
Metronode was established as a separate business entity to support the Nextgen network rollout by way of providing capital city transmission nodes for the long-haul fiber network and data center colo space. Metronode is now one of the biggest data center owner/operators in Australia, with the largest coverage nationally.
I’ve now been with the Leighton Telco Group for 12 years and have worked in the areas of design, development, project management and operations. Much of the time was actually spent in the data center area of the business.
For the last three years I have headed Metronode’s Engineering Group and have been involved in many exciting activities including new technology assessment, selection and data center design. All Metronode capital city data centers were approaching full design capacity a couple of years ago. In order to continue on a successful growth path for data center space, services and much improved energy efficiency, Metronode could no longer rely on traditional data center topologies and builds to meet current and future demands in the marketplace. After very careful consideration and planning, it was obvious that any new data center we would build in Australia would have to meet several important design criteria. These included good energy efficiency (sub 1.2 PUE), modular construction, quick to build, high availability and environmental sustainability. Formal certification of the site to an Uptime Institute Tier III standard was also an important requirement.
During 2011 I embarked on a mission to assess a range of data center offerings and technologies that would meet all of Metronode’s objectives.
So your telecom and fiber work set you up to be the manager of operations at Metronode, but where did you learn the other data center disciplines?
During the rollout of fiber for the Optus network, capital city “nodes” or data centers were built to support the transmission network. I was involved with the acceptance of all the transmission nodes and data centers, and I guess that’s where I got my first exposure to data centers. Also, when Nextgen rolled out its fiber network, it was also supported by nodes and data centers in all the capital cities.
I was primarily responsible for acceptance of the fiber network, all regenerator sites (basically mini data centers) and the capital city data centers that the fiber passed through. I wasn’t directly involved with design, but I was involved with commissioning and acceptance, which is where I got my experience.
Do you consider yourself to be more of a network professional or facilities professional?
A portion of both, however, stronger affinity with the data center side of things. I’m an electrical engineer and relate more closely to the infrastructure side of data centers and transmission regeneration sites–they all depend on cooling, UPS, batteries, controlled environments, high levels of redundancy and all that sort of thing.
Prior to completion of any data center build (includes fiber transmission nodes), a rigorous commissioning and integrated services testing (IST) regime is extremely important. A high level of confidence in the design, construction and operation is gained after the data center is subjected to a large range of different fault types and scenarios to successfully prove resilience. My team and I always work hard to cover all permutations and combinations of fault/operational scenarios during IST to demonstrate resilience of the site before handing it over to our operations colleagues.
What does the Australian data center business look like? Is it international or dependent on local customers?
Definitely dependent on both local and international customers. Metronode is a bit different from most of the competition, in that we design, build, own and operate each of our sites. We specialize in providing wholesale colo services to the large corporations, state and federal government departments and carriers. Many government departments in Australia have embarked on plans to consolidate and migrate their existing owned, leased and “back of office” sites into two or more modern data centers managed and operated by experienced operators with a solid track record. Metronode recently secured the contract to build two new major data centers in NSW to consolidate and migrate all of government’s requirements.
Can you describe Metronode’s data centers?
We have five legacy sites that are designed on traditional technologies and builds comprising raised floor, chilled-water recirculation, CRACs on floor, under-floor power cabling from PDUs to racks and with relatively low-power density format. About three years ago, capacity was reaching design limits and the requirement to expand was paramount. I was given the task of reviewing new and emerging data center technologies that would best fit Metronode’s business plans and requirements.
It was clear that a modular data center configuration would provide significant capital savings upfront by way of only expanding when customer demand dictated. The data center had to be pre-built in the factory, tested and commissioned and proven to be highly resilient under all operating conditions. It also had to be able to grow in increments of around 800kW of IT load up to a total 15 MW if need be. Energy efficiency was another requirement, with a PUE of less than 1.2 set as a non-negotiable target.
BladeRoom technology from the UK was the technology chosen. Metronode purchased BladeRoom modules from the U.K. They were shipped to Australia and assembled on-site. We also coordinated the design of plant rooms to accommodate site HV intake, switchgear, UPS and switchboards to support the BladeRoom modules. The first BladeRoom deployment of 1.5 MW in Melbourne took around nine months to complete.
BladeRoom uses direct free air and evaporative cooling as the primary cooling system. It uses an N+1 DX cooling system as a backup. The evaporative cooling system was looked at very favorably in Australia mainly because of the relatively high temperatures and low humidity levels throughout the year in most capital cities.
To date, we have confirmed that free air/evaporative cooling is used for 95-98% of the year, with DX cooling systems used for the balance. Our overall energy usage is very low compared to any traditional type sites.
We were the first data center owner/operator to have a fully Uptime Institute-certified Tier III site in Australia. This was another point of differentiation we used to present a unique offering in the Australian marketplace.
In Australia and New Zealand, we have experienced many other data center operators claiming all sorts of Tier ratings for their sites, such as Tier IV, Tier IV+, etc. Our aim was to formalize our tier rating by gaining a formal accreditation that would stand up to any scrutiny.
The Uptime Institute Tier ratings appealed to us for many reasons, so we embarked on a Tier III rating for all of our new BladeRoom sites. From a marketing perspective, it’s been very successful. Most customers have stopped asking many questions relating to concurrent maintainability, now that we have been formally certified. In finalizing the design for our new generation data centers, we also decided on engineering out all single points of failure. This is over and above the requirement for a Tier III site, which has been very well received in the marketplace.
What is the one-line of the BladeRoom electrical system?
The BladeRoom data hall comprises a self-contained cooling unit, UPS distribution point and accommodation space for all the IT equipment.
Support of the BladeRoom data hall requires utility power, a generator, UPS and associated switchgear to power the BladeRoom data hall. These components are built into what we call a “duty plant room.” The design is based on a block-redundant architecture and does not involve the paralleling of large strings of generators to support the load under utility power failure conditions.
A separate duty plant room is dedicated or assigned to each BladeRoom. A separate “redundant plant room” is part of the block-redundant design. In the case of any duty plant room failure, critical data hall load will be transferred to the redundant plant room via a pair of STS switches. Over here we refer to that as a block-redundant architecture. We calculate that we will achieve better than 5 nines availability.
Our objective was to ensure a simple design in order to reduce any operational complexity.
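To put the "5 nines" figure in concrete terms, the short Python sketch below converts an availability fraction into expected downtime per year. It is an arithmetic illustration only: the function names are invented for this example, and it does not represent how Metronode calculated its own availability estimate.

```python
# Illustrative arithmetic only; the function names are hypothetical.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def nines(n: int) -> float:
    """Return the availability fraction with n nines, e.g. nines(5) is 0.99999."""
    return 1.0 - 10.0 ** (-n)


def annual_downtime_minutes(availability: float) -> float:
    """Expected downtime in minutes per year for an availability between 0 and 1."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


for n in (3, 4, 5):
    minutes = annual_downtime_minutes(nines(n))
    print(f"{n} nines: about {minutes:.1f} minutes of downtime per year")
```

On this arithmetic, five nines corresponds to a little over five minutes of downtime per year.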
What’s the current footprint and power draw?
The data center built in Melbourne is based on a BladeRoom data hall with a cooling module on either end, allowing it to support 760 kW of IT load.
To minimize footprint, we build data halls into a block, comprising four by 760 kW and double-stack them. This provides us with two ground floor data halls and two first floor data halls, with a total IT capacity of 3 MW. Currently, the Melbourne site comprises a half block, which is 1.5 MW; we’re planning to build the next half block by the end of this year. Due to the modularity aspect and the similar design, we simply replicate the block structure across the site, based on client-driven demand. Our Melbourne site has capacity to accommodate five blocks or 15 MW of IT load.
In terms of the half block in operation in Melbourne, we have provisioned about 1 MW of IT load to our clients; however, utilization is still low at around 100 kW. In general, we have found that many clients do not reach their full allocated capacity for some time, possibly due to being conservative about demand forecast and the time it takes for complex migrations and new installations to be completed.
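The block arithmetic described above can be checked with a few lines of Python. This is a minimal sketch that uses only the figures quoted in the interview; the constant and function names are chosen for illustration.

```python
# Figures quoted in the interview; names are illustrative.
HALL_KW = 760        # IT load supported by one BladeRoom data hall
HALLS_PER_BLOCK = 4  # two ground-floor halls plus two first-floor halls
SITE_BLOCKS = 5      # blocks the Melbourne site can accommodate


def capacity_kw(blocks: float) -> float:
    """IT capacity in kW for a number of blocks (0.5 means a half block)."""
    return blocks * HALLS_PER_BLOCK * HALL_KW


for label, blocks in (("Half block", 0.5), ("Full block", 1), ("Full site", SITE_BLOCKS)):
    print(f"{label}: {capacity_kw(blocks) / 1000:.2f} MW of IT capacity")
```

The results (1.52 MW, 3.04 MW and 15.2 MW) match the rounded 1.5 MW, 3 MW and 15 MW figures given in the text.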
What about power density?
We can accommodate 30 kW per rack. A supercomputer was recently installed that took up six or seven rack spaces. With a future power demand in excess of 200kW, we are getting close to the 30 kW per rack mark.
At our new Melbourne site, we currently support an average IT load of 100 kW across 1.5 MW of IT load capacity, which approximates to 7% IT load, a very light load which would reflect an unmentionable PUE in a traditionally designed site! Our monthly PUE is now running at about 1.5. Based on trending over the last three months (which have been summer months in Australia) we are well on target to achieving our design PUE profile. We’re very confident we’ll have a 1.2 annual rolling PUE once we reach 30% load; we should have a sub 1.2 PUE as the load approaches 100%.
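The load and PUE figures quoted here can be related with simple arithmetic. The sketch below assumes the standard definition of PUE (total facility power divided by IT power); the function names are illustrative, and the inputs are the approximate values from the interview rather than measured data.

```python
def load_fraction(it_load_kw: float, capacity_kw: float) -> float:
    """Fraction of the IT capacity that is actually in use."""
    if capacity_kw <= 0:
        raise ValueError("capacity_kw must be positive")
    return it_load_kw / capacity_kw


def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power implied by a PUE (total power / IT power)."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_load_kw * pue


it_kw, capacity, monthly_pue = 100.0, 1500.0, 1.5  # values quoted in the interview
total_kw = facility_power_kw(it_kw, monthly_pue)
print(f"Load fraction: {load_fraction(it_kw, capacity):.1%}")
print(f"Facility power at PUE {monthly_pue}: {total_kw:.0f} kW "
      f"({total_kw - it_kw:.0f} kW of overhead)")
```

At roughly 7% load, a PUE of 1.5 means about 50 kW of cooling and electrical overhead on top of the 100 kW of IT load.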
What got you interested in the ATD program and what have been its benefits?
During my assessment of new-generation data centers, there was some difficulty experienced in fully understanding the resilience that various data center configurations provided and comparing them against one another. At this point it was decided that some formal standard/certification would be the logical way to proceed, so that data center performance/characteristics could be compared in a like-for-like manner. A number of standards and best practices were reviewed, including those published by IBM, BICSI, AS/NZ Standards, TIA 942, UI, etc.; many of which quoted different rating definitions for each “tier.” The Uptime Institute tiering regime appealed to me the most, as it was not prescriptive in nature and yet provided an objective basis for comparing the functionality, maintainability, redundancy and fault tolerance levels for different site infrastructure topologies.
My view was that formal certification of a site would provide a clear differentiator between ourselves and competition in the marketplace. To further familiarize myself with the UI standards, applications and technical discussions with clients, an ATD qualification was considered a valuable asset. I undertook the ATD course nearly two years ago. Since then, the learnings have been applied to our new designs, in discussions with clients on technical performance, in design reviews with consultants, and for comparing different data center attributes.
I found the ATD qualification very useful in terms of assessing various designs. During the design of our Melbourne data center, our local design consultants didn’t have anyone who was ATD accredited locally, and I found that designs presented did not always comply with the minimum requested requirements for Tier III, and in many cases exceeded the Tier III requirement. This was another very good reason for completing the ATD course in order to keep an eye on consultant designs.
Michael, are you the only ATD at Metronode?
I’m the only Tier accredited designer in Metronode. Our consultants have since had a few people accredited because it looked a little bit odd that the client had an ATD and the consultant didn’t.
What does the future hold?
We won a major contract with the NSW government last year, which involved building two new data centers. The NSW government objective was to consolidate around 200 existing “data centers” and migrate services into the two main facilities. They’re under construction as we speak. There’s one in Sydney and one just south of Sydney near Wollongong. We recently obtained Tier III design certification from Uptime for both sites.
The Sydney data center will be ready for live operations in mid-July this year and the site near Wollongong operational a couple of months later. The facilities in Sydney and near Wollongong have been dimensioned to support an ultimate IT capacity of 9 MW.
Metronode also has a new data center under construction in Perth. It will support an ultimate IT capacity of 2.2 MW, and the first stage will support 760 kW. We hope to obtain Tier III design certification on the Perth site shortly and expect to have it completed and operational before the end of the year.
The other exciting opportunity is in Canberra, and we’re currently finalizing our design for this site. It will be a Tier III site with 6 MW of IT capacity.
With my passion for sustainability and high efficiency, we’re now looking at some major future innovations to further improve our data center performance. We are now looking at new hydrogel technologies where moisture from the data hall exhaust air can be recycled back into the evaporative cooling systems. We are also harvesting rainwater from every square meter of roof at our data centers. Rainwater is stored in tanks and used for the evaporative cooling systems.
Plant rooms containing UPS, switchboards, ATS, etc. in our legacy sites are air conditioned. If you walked into one of these plant rooms, you’d experience a very comfortable 23 or 24 degrees Centigrade temperature all year round. Plant rooms in our new data centers run between 35-40 degrees Centigrade, allowing them to be free air-cooled for most of the year. This provides significant energy savings and allows our PUE to be minimized.
We are now exploring the use of the hot exhaust air from the data halls to heat the generators, rather than using electrical energy to heat engine water jackets and alternators. Office heating is another area where use of data hall exhaust is being examined.
How did you get started?
I had quite a practical education up to the point I undertook a degree in electronics and electrical engineering and then came to the market.
I worked on a number of schemes unconnected with data centers until the point that I joined CBRE in 1998, and it was working with them that I stumbled across the data center market in support of Global Crossing, looking for real estate across Europe trying to build its Pan European network. Within CBRE, I provided a lot of support to the project management of their data center/POP builds as well as consultancy to the CBRE customer base. When Global Crossing got their network established, I joined them to head up their building services design function within the Global Center, the web-hosting portion of Global Crossing. Together with a colleague leading construction, we ran a program of building five major data centers across the tier one cities with two more in design.
After the crash of 2001, I returned to CBRE, working within the technical real estate group. I advised clients such as HSBC, Goldman Sachs and Lehman Brothers about the design and programming and feasibility of potential data centers. So I had a lot of exposure to high capital investment programs, feasibility studies for HSBC, for instance, covering about £220 million investment in the U.K. for a pair of mirrored data centers to replace a pair of dated facilities for a global bank.
During the period, 2005-2006, I became aware of Digital Realty through a number of projects undertaken by our team. I actually left CBRE to work for Digital directly, becoming European employee #4, setting up an embryonic group as part of the REIT that is Digital Realty.
Since then, I’ve been responsible for a number of new builds in Europe, including Dublin and London, driving the design standards that Digital builds to in Europe, which included taking the U.S. guidelines requirements and adapting them to not only European units but also to European data center requirements. That, in a way, led to my current role.
Today, I practice less of the day-to-day development and am more involved with strategic design and how we design and build projects. I get involved with the sales team about how we try to invite people into our portfolio. One of those key tools was working with Uptime Institute to get ATD accreditation so that we could talk authoritatively in assessing customer needs for reliability and redundancy.
What are the European data center requirements, other than units of measure and power differences?
I think it’s more about the philosophy of redundancy. It’s my own view that transactions in the U.S. are very much more reliant on the SLA and how the developer or operator manages his own risk compared to certification within the U.K. End users of data centers, whether it is an opex (rental) or build-to-own requirement, seem to exercise more due diligence for the product they are looking to buy. Part of the development of the design requirement was that we had to modify the electrical arrangements to be more of a 2N system and provide greater resilience in the cooling system to meet the more stringent view of the infrastructure.
So you feel your U.K. customers are going to examine claims of redundancy more carefully, and they’re going to look for higher standards?
Yes. It’s a matter of where they see the risks. Historically, we’ve seen that in the U.K. and across Europe too, they’ve sought to eradicate the risks, whereas I think maybe because of the approach to rental in the U.S., the risk is left with the operator. DLR has a robust architecture globally to mitigate our risk, but other operators may adopt different risk mitigation measures. It seems to me that over the years people in the U.K. market want to tick all the boxes and ensure that there is no risk before they take on the SLA, whereas in the U.S., it’s left to the operator to manage.
How has the ATD program played into your current role and helped you meet skepticism?
I think one of the only ways people can benchmark a facility is by having a certain stamp that says it meets a certain specification. In the data center market, there isn’t really anything that gives you a benchmark to judge a facility by, aside from the Uptime Institute ATD and certification programs.
It’s a shame that people in the industry will say that a data center is Tier X, when it hasn’t been assessed or certified. More discerning clients can easily see that a data center that has been certified in line with the specs of the Uptime Institute will give them the assurance they need compared to a facility where they have no visibility of what a failure is going to do to the data center service. Certification, particularly to the Uptime Institute guidance, is really a good way of benchmarking and reducing risk. That is certainly what customers are looking for. And perhaps it helps them sleep better at night; I don’t know.
Did the standards influence the base design documents of Digital Realty? Or was the U.S. version more or less complete?
I think the standards have affected the document quite a lot within Europe. We were the first to introduce the 2N supply right through the medium voltage supply to the rack supply. In the U.S., we always operated at 2N UPS, but the switchgear requires the skills of the DLR Tech Ops team to make it concurrently maintainable. Additional features are required to meet the TUI standards that I’ve come to understand from taking the course.
I think we always had looked at achieving concurrent maintainability, but that might be by taking some additional operational risks. When you sit back and analyze systems using the Uptime philosophy, you can see that having features such as double valving or double electrical isolation gives you the ability to maintain the facility not just by maintaining your N-capacity, or further resilience if you have a Tier IV system, but in a safe and predictable manner.
We’ve often considered something concurrently maintainable on systems where a pipe freeze could be used to replace a critical valve. Now that might well be an industry-accepted technique, but if it goes wrong, the consequences could be very significant. What I’ve learned from the ATD process is regularizing the approach to a problem in a system and how to make sure that it is fail safe in its operation to either concurrent maintainability or fault tolerant standards if you are at the Tier III or Tier IV level, respectively.
How does Digital deploy its European Product Development strategy?
What we tried to do through product development is offer choice to people. In terms of the buildings, we are trying to build a non-specific envelope that can adapt to multiple solutions and thereby give people the choice to elect to have those data center systems deployed as the base design or upgrade them to meet Tier requirements.
Within a recent report that I completed on product development, the approach has made bringing the facility up to a Tier level simpler or cheaper to adapt. We don’t by default build everything to a Tier III in every respect, although that’s changing now with DLR’s decision to gain a higher degree of certification. So far, we’ve pursued certification within the U.S. market as required and more frequently on the Asia-Pacific new build sites we have.
I think that difference may be due to the maturity of each market. There are a lot of people building data centers who perhaps don’t have the depth of maturity in engineering that DLR has. So people are looking for the facilities they buy to be certified so they can be sure they are reliable. Perhaps in the U.S. and Europe, they might be more familiar with data centers; they can look at the design themselves and make that assessment.
And that’s why in the European arena, we want to offer not only a choice of system but also to improve higher load efficiencies. The aim is to offer chilled water and then outside air, direct or indirect, or Tier certified designs, all within the same building and all offered as a sort of off-the-shelf product from a catalog of designs.
It would seem that providing customers with the option to certify to Tier would be a lot easier in a facility where you have just one customer.
Yes, but we are used to having buildings where we have a lot of customers and sometimes a number of customers in one data hall. Clearly, the first customer that goes into that space will often determine the infrastructure of that space. You can’t go back when you have shared systems beyond what the specification was when the initial build was completed. It is complicated sometimes, but it is something we’re used to because we do deal in multi-customer buildings.
What is the current state of Digital’s footprint in terms of size?
Within the U.K., we now have properties (ten buildings) in the London metro area, which represents over 1.2 million ft2 of space. This is now our largest metro region outside of the U.S. We have a very strong presence south of central London near Gatwick airport, but that has increased with recently acquired stock. We have two facilities there, one we built for a sole client—it was a build-to-suit option—and then a multi-tenant one. Then we have another multi-tenant building, probably in the region of 8 MW in the southwest of London. To the north in Manchester, we have another facility that is fully leased, and it’s probably in the neighborhood of 4-5 MW.
We’re actually under way with a new development for a client, a major cloud and hosting provider. We are looking to provide a 10-MW data center for them, and we’re going through the design and selection process for that project at the moment.
That’s the core of what we own within the U.K., but we also offer services for design and project management. We actually assisted HSBC with two very, very secure facilities; one to the north of London in the region of 4.5 MW with 30,000 ft2 raised floor base and another in the north of the U.K. with an ultimate capacity of 14.5 MW and a full 2N electrical system, approaching a Tier IV design but not actually certified.
The most recent of our projects that we finished in Europe has been the first phase of a building for Telefonica in Madrid. This was a project where we acted as consultants and design manager in a process to create a custom designed Tier IV data center with outside air, and that was done in conjunction with your team and Keith Klesner. I believe that’s one of only nine Tier IV data centers in Europe.
Walk me through the Madrid project.
It’s the first of five planned phases in which we assisted in creating a total of seven data halls, along with a support office block. The data halls are approximately 7,500 ft2 each. We actually advised on the design and fit-out of six of these data halls, and the design is based on an outside-air, direct-cooling system, which takes advantage of the very dry climate. Even though the dry-bulb temperatures are quite high, Madrid, being at high altitude, gives Telefonica the ability to get a good number of free-cooling hours within the year, driving down their PUE and running costs.
Each of the data centers has been set to run at four different power levels. The initial phase is at 1,200 kW per data hall, but the ultimate capacity is 4.8 MW per hall. All of it is supported by a full concurrently maintainable and fault-tolerant Tier IV-certified infrastructure. On the cooling side, the design was based on N+2 direct-air cooling units on the roof. Each unit is provided with a chilled water circuit for cooling in recirculation mode when outside air free-cooling is not available. There are two independent chilled-water systems in physically separate support buildings, separated from the main data center building.
The electrical system is based on a full 2N+1 UPS system with transformerless UPS to help the power efficiency and reduce the losses within the infrastructure. Those are based around the Schneider Galaxy 7000 UPS. Each of those 2N+1 UPS systems and the mechanical cooling systems were supported by a mains infrastructure at medium voltage, with a 2N on-site full continuous duty-rated backup generation system.
Is the PUE for a fully built out facility or partially loaded?
Bear with me, because people will focus on that. The approximate annualized PUEs based on recorded data at 100% were 1.25; 75%, 1.3; 50%, 1.35; and not surprisingly, as you drop down the curve, at 25% it rose to about 1.5.
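The four PUE points quoted above trace out a part-load curve. The sketch below estimates PUE at intermediate loads by linear interpolation between those points. The interpolation is an assumption made for illustration (the interview gives only the four values), and the function name is hypothetical.

```python
from bisect import bisect_left

# (load fraction, approximate annualized PUE) as quoted in the interview.
PUE_POINTS = [(0.25, 1.50), (0.50, 1.35), (0.75, 1.30), (1.00, 1.25)]


def pue_at(load: float) -> float:
    """Estimate PUE at a load fraction by linear interpolation (an assumption)."""
    loads = [point[0] for point in PUE_POINTS]
    if not loads[0] <= load <= loads[-1]:
        raise ValueError("load fraction is outside the quoted 25%-100% range")
    i = bisect_left(loads, load)
    if loads[i] == load:
        return PUE_POINTS[i][1]  # exact match with a quoted point
    (x0, y0), (x1, y1) = PUE_POINTS[i - 1], PUE_POINTS[i]
    return y0 + (y1 - y0) * (load - x0) / (x1 - x0)


for load in (0.25, 0.40, 0.625, 1.00):
    print(f"{load:.1%} load: estimated PUE {pue_at(load):.3f}")
```

The curve is steepest at the low end, which is consistent with the remark that PUE rises as load drops toward 25%.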
What do you foresee for future development?
In the last few years, there has been quite a significant change in how people look at the data center and how people are prepared to manage the temperature parameters. Within the last 18 months I would say, the desire to adopt ASHRAE’s 2011 operating parameters for servers has been fairly uniform. Across the business, there has been quite a significant movement, which has been brought to a head by a combination of a lot of new cooling technologies going forward. So now you have the ability to use outside air direct with full mechanical backup or outside air indirect where there is a use of evaporative cooling, but in the right climates, of course.
I think there is also an extreme amount of effort looking at various liquid-cooled server technology. From my standpoint, we still see an awful lot of equipment that wants to be cooled by air because that is the easiest presentation of the equipment, so it may be a few years before liquid rules the day.
There’s been a lot of development mechanically and I think we’re sort of pushing the limits of what we can achieve with our toolkit.
In the next phase of development, there’s got to be ways to improve the electrical systems’ efficiencies, so I think there is going to be a huge pressure on UPS technology to reduce the losses and different types of voltage distribution, whether that be direct current or elevated ac voltages. All these ideas have been around before but may not have been fully exploited. The key thing is the potential, provided you are not operating on recirc for a lot of the time, with outside air direct you’ve got PUEs approaching 1.2 and closing in on 1.15 in the right climate. At that level of PUE, the UPS and the electrical infrastructure can score a significant part of the remaining PUE uplift, with the amount of waste that currently exists. So I think one of the issues going forward will be addressing the efficiency issues on the electrical side of the equation.
Kevin Heslin is senior editor at the Uptime Institute. He served as an editor at New York Construction News, Sutton Publishing, the IESNA, and BNP Media, where he founded Mission Critical, the leading publication dedicated to data center and backup power professionals. In addition, Heslin served as communications manager at the Lighting Research Center of Rensselaer Polytechnic Institute. He earned the B.A. in Journalism from Fordham University in 1981 and a B.S. in Technical Communications from Rensselaer Polytechnic Institute in 2000.
Uptime Institute Accredited Tier Designers address particulates and seismic activity in the Chilean capital.
By Panagiotis Laziridis, ATD and Jan Carlos Sens, ATD
The city of Santiago, Chile, is a challenging place to site a data center. It has a history of strong earthquakes, it is close to dozens of active volcanoes, and the nearby Andes Mountains capture air particulates and VOCs, so the air is very polluted. On the other hand, its moderate climate means that it enjoys low temperatures practically every day of the year, with average temperature ranging from 15° C to 25°C, which is a highly favorable environment for an air conditioning energy recovery system.
In 2011, Sonda, a South American IT provider, decided to tackle Santiago’s difficult building environment by building a data center that would meet the Uptime Institute’s Tier III, withstand earthquakes and achieve a PUE between 1.25 and 1.50 by making use of the city’s potential for free cooling.
Other features of the data center include:
- Total area of 6,500 square meters (m2)
- Six 250 m2 IT rooms to be installed in three steps
- Installed load of 600 kilowatts (kW) per room (density of 2.4 kW/m2, 6 kW/rack)
- 48-hours minimum storage of diesel oil for generators
- 48-hours minimum storage of potable water for replenishing evaporative coolers.
In addition, the site would include a well capable of supplying all the water consumed in the data center, making the data center self-sufficient with regard to makeup water for the cooling system.
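The density figures in the list above are consistent with one another, as the following Python sketch shows. The rack count per room is not stated in the article; it is simply what the quoted room load and per-rack load imply. The function names are illustrative.

```python
# Figures from the feature list; names are illustrative.
ROOMS = 6
ROOM_AREA_M2 = 250.0
ROOM_LOAD_KW = 600.0
RACK_LOAD_KW = 6.0


def density_kw_per_m2(load_kw: float, area_m2: float) -> float:
    """Installed IT load per square meter of IT room."""
    return load_kw / area_m2


def implied_racks(load_kw: float, rack_kw: float) -> float:
    """Number of racks implied by a room load and a per-rack load."""
    return load_kw / rack_kw


print(f"Density: {density_kw_per_m2(ROOM_LOAD_KW, ROOM_AREA_M2):.1f} kW/m2")
print(f"Implied racks per room: {implied_racks(ROOM_LOAD_KW, RACK_LOAD_KW):.0f}")
print(f"Total installed IT load: {ROOMS * ROOM_LOAD_KW / 1000:.1f} MW")
```

With all six rooms installed, the quoted figures imply 3.6 MW of IT load in total.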
Project designers began by evaluating a direct free-cooling solution for the site. That idea was discarded after the design team analyzed the risks posed by the local environment, particularly the particulates and VOCs, and the low humidity level. Santiago sits in a valley created by the Andes ridge and the Pre-cordillera ridge, which causes thermal inversion throughout the year, especially in the winter.
Based on recommendations from ASHRAE TC 9.9 – Mission Critical Facilities, Technology Spaces and Electronic Equipment and ASHRAE – Gaseous and Particulate Contamination Guidelines for Data Centers, an indirect free-cooling system replaced the direct free-cooling system. The air conditioning system was designed using chillers with a centrifugal compressor and water condenser. Dual coil CRAC units were used for energy recovery.
The new design met the challenges posed by local temperatures that can go as low as -6.0°C, which could cause water in the piping and cooling tower and thermal storage tank to freeze. As a consequence, the design team chose evaporative coolers instead of cooling towers, which minimized the volume of water exposed to low temperatures. In addition, an ethylene glycol-clean water solution prevents freezing in closed circuits and reduces fouling sludge in the cooling coil of the free cooling, piping and chillers. The racks work with return ductwork in order to enable the use of cold aisles at 25°C and return above 35°C, maximizing the hours of free cooling.
The evaporative cooler meets the total capacity of the central chiller (CAG), but takes into account the transient flow of the CAG plus free cooling. The total thermal load that reaches the cooler is at most equal to the maximum load of the CAG; i.e., the energy dissipated by the evaporative cooler operating with the CAG at full load corresponds to the thermal load of the rooms (IT) plus the work of the compressor, and the total heat rejected defines the capacity of the evaporative cooler. When free cooling dissipates the load, the portion of heat corresponding to the work of the compressor is lower, and the more efficient the free cooling, the lower it will be (so the total heat is always lower).
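The heat balance described in this paragraph can be sketched as follows. The model is deliberately simplified and rests on assumptions not found in the article: compressor work is taken as the mechanically cooled load divided by a constant chiller COP, the COP value of 5.0 is a hypothetical placeholder, and any load handled by free cooling is assumed to need no compressor work.

```python
def heat_rejected_kw(it_load_kw: float, chiller_cop: float,
                     free_cooling_fraction: float = 0.0) -> float:
    """Heat the evaporative cooler must reject: IT load plus compressor work.

    Simplified model: compressor work = mechanically cooled load / COP.
    """
    if chiller_cop <= 0:
        raise ValueError("chiller_cop must be positive")
    if not 0.0 <= free_cooling_fraction <= 1.0:
        raise ValueError("free_cooling_fraction must be between 0 and 1")
    compressor_work_kw = (1.0 - free_cooling_fraction) * it_load_kw / chiller_cop
    return it_load_kw + compressor_work_kw


ROOM_LOAD_KW = 600.0  # installed load per IT room, from the article
ASSUMED_COP = 5.0     # hypothetical chiller COP, not from the article

for fraction in (0.0, 0.5, 1.0):
    rejected = heat_rejected_kw(ROOM_LOAD_KW, ASSUMED_COP, fraction)
    print(f"Free cooling {fraction:.0%}: {rejected:.0f} kW rejected")
```

Under these assumptions the cooler sees its largest load when the chiller carries everything, which is why full-load chiller operation sizes the evaporative cooler.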
The automation system was designed to take into account the performance curve of the chiller (NPLV) and evaporative coolers. The number of machines in operation is always determined by the equation of load times setup at the best operating point, and the automation system can even activate a backup machine to achieve the best energy efficiency point.
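The staging idea can be illustrated with a minimal Python sketch. It is not the project's control logic: the part-load curve `assumed_cop`, the minimum part-load limit and the capacities are hypothetical placeholders, and only the chillers are modeled (the evaporative cooler curves are left out). The sketch simply picks the number of running machines that meets the load with the lowest total compressor power.

```python
def assumed_cop(part_load_ratio):
    """Hypothetical part-load curve: COP peaks at 60% load (not manufacturer data)."""
    return 6.0 - 8.0 * (part_load_ratio - 0.6) ** 2


def best_machine_count(load_kw, unit_capacity_kw, installed_units, min_part_load=0.2):
    """Return (machines, electrical_kw) for the lowest total chiller power."""
    best = None
    for machines in range(1, installed_units + 1):
        part_load = load_kw / (machines * unit_capacity_kw)
        # Skip counts that overload the machines or run them below the assumed minimum.
        if part_load > 1.0 or part_load < min_part_load:
            continue
        power_kw = load_kw / assumed_cop(part_load)
        if best is None or power_kw < best[1]:
            best = (machines, power_kw)
    if best is None:
        raise ValueError("no feasible number of machines for this load")
    return best


machines, power_kw = best_machine_count(1000.0, 1000.0, installed_units=3)
print(machines, round(power_kw, 1))
```

Under this assumed curve, a 1,000 kW load on 1,000 kW machines is served most efficiently by two machines at half load rather than one at full load, which mirrors the point about activating a backup machine to reach a better efficiency point.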
The free cooling also reduces the consumption of makeup water, since the volume of evaporated water directly relates to the thermal load dissipated by the evaporative cooler.
The heat load is also lower because there is no need to dissipate heat generated by the chiller compressor, so the volume of water to evaporate is less. Under extremely cold conditions (wet-bulb temperature, TBU, below 2.0° C), it is possible to switch off the evaporative cooler’s recirculation pump so that the unit operates as a dry cooler.
The table below shows the overall energy and water performance of the data center, including PUE and WUE values for the year. Carrier’s HAP 4.50 software program was used both for the thermal load calculation and the energy simulation. The data were transferred to Excel spreadsheets, where chiller, cooler and pump data were entered along with the energy losses in the transformers and UPS as installed.
In addition, the use of indirect free cooling eliminated the risk of particulate and VOC contamination in the data center as well as the problems associated with low humidity in the air. This avoided the cost of a filtration system and of the power and water needed for humidification, and it reduced the risk to the IT equipment.
Even in the presence of volcanic ash, the only elements exposed are the fresh air conditioner (which can be turned off) and the parts of the evaporative cooler in contact with the outside environment: its water, its basin and the external components of its heat exchanger.
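The relationship between free cooling, compressor work and evaporated water can be sketched numerically. The Python below is a rough illustration under stated assumptions, not the project's calculation: it treats all rejected heat as removed by evaporation (an upper bound that ignores sensible heat transfer, drift and blowdown), uses an approximate latent heat of about 2,450 kJ/kg for water near ambient temperature, and takes a fixed, hypothetical chiller COP.

```python
LATENT_HEAT_KJ_PER_KG = 2450.0  # approximate latent heat of vaporization of water


def heat_rejected_kw(it_load_kw, free_cooling_fraction, chiller_cop):
    """IT load plus compressor work for the share of load still on the chiller."""
    if not 0.0 <= free_cooling_fraction <= 1.0:
        raise ValueError("free_cooling_fraction must be between 0 and 1")
    mechanical_load_kw = it_load_kw * (1.0 - free_cooling_fraction)
    compressor_work_kw = mechanical_load_kw / chiller_cop
    return it_load_kw + compressor_work_kw


def evaporation_kg_per_hour(heat_kw):
    """Upper-bound evaporation rate if all heat leaves as latent heat."""
    return heat_kw * 3600.0 / LATENT_HEAT_KJ_PER_KG


# Hypothetical 1,000 kW IT load and a chiller COP of 5.
for fraction in (0.0, 0.5, 1.0):
    heat = heat_rejected_kw(1000.0, fraction, chiller_cop=5.0)
    print(fraction, heat, round(evaporation_kg_per_hour(heat)))
```

With these placeholder figures, heat rejection falls from 1,200 kW with no free cooling to 1,000 kW with full free cooling, and the estimated evaporation falls in proportion.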
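For readers who want to reproduce the two metrics from their own annual totals, a minimal Python sketch follows. It uses the commonly cited definitions (PUE as total facility energy divided by IT equipment energy, and site WUE as annual site water use in liters divided by IT equipment energy in kWh). The figures in the example are hypothetical and are not the Sonda Quilicura results.

```python
def pue(it_kwh, cooling_kwh, losses_kwh, other_kwh=0.0):
    """Power usage effectiveness: total facility energy over IT energy."""
    total_kwh = it_kwh + cooling_kwh + losses_kwh + other_kwh
    return total_kwh / it_kwh


def wue(site_water_liters, it_kwh):
    """Site water usage effectiveness in liters per kWh of IT energy."""
    return site_water_liters / it_kwh


# Hypothetical year: 1 MW of IT load running for 8,760 hours.
it_kwh = 8_760_000.0
print(round(pue(it_kwh, cooling_kwh=1_752_000.0, losses_kwh=876_000.0), 2))
print(round(wue(site_water_liters=4_380_000.0, it_kwh=it_kwh), 2))
```

Here `cooling_kwh` stands for chillers, coolers and pumps, and `losses_kwh` for transformer and UPS losses, matching the categories named in the text.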
The concentration of particulates can be controlled by increasing the blowdown flow of the cooler, so the cooler ends up acting as an air scrubber.
The data center building was designed using advanced technology against earthquakes. It was built as an independent block sitting on foundations with a vibration dampening system. All piping and penetrations are equipped with vibration damping elements (flexible).
As a general practice, the project met the challenges of building a data center in a region full of hazards, with high incidence of earthquakes and high levels of environmental contamination, but also one featuring a low operating cost and reliable utility power supply (urban area).
Commissioning on this project has been concluded, with integrated testing underway in May. Sonda Quilicura received Tier III Certification of Constructed Facility in the spring of 2013.
Panagiotis Lazaridis is a mechanical engineer who graduated from Faculdade de Engenharia Industrial (FEI) in 1985. He is director of L&M Engenharia, an engineering company specializing in the design of electrical installations, hydraulic installations and air conditioning systems for office buildings, industrial plants and mission critical areas. He has participated in projects for numerous clients from the banking, manufacturing, and technology sectors, including EDS, Barclays, Alstom, IBM and Dell.
Jan Carlos Sens is a mechanical engineer who graduated from Faculdade Armando Álvares Penteado in 1985 and completed postgraduate studies in Complements of Thermodynamics Applied to Processes at Universidade de São Paulo (USP). He is an engineering manager for mechanical installations and ventilation and air conditioning systems. He has participated in projects for mission critical data centers; industrial facilities; hydroelectric, nuclear and thermoelectric plants; and other technical projects. His data center clients have included Vivo, Ativas, T-System, Uol, Banco do Brasil Caixa, CIPD PB and Barclays.
The Dream Job
By Fred Dickerman
Editor’s note: Mr. Dickerman’s feature on the challenges of starting a Facilities Team breaks new and unexpected ground, as Mr. Dickerman adapts the new FORCSS methodology to help resolve a staffing question in a hypothetical case. The Uptime Institute did not anticipate this use of FORCSS as it developed the new methodology. In fact, records of the two Charrettes of industry stakeholders do not include any discussion of staffing levels.
Nonetheless, the Uptime Institute and the authors of the FORCSS document are gratified by Mr. Dickerman’s imaginative, if hypothetical, application of FORCSS, recognizing that the success of FORCSS as a tool depends on individuals like Mr. Dickerman finding it an easy-to-use tool to prioritize IT deployments.
It is Monday morning, and Pat, newly promoted to the position of Data Center program manager, is flying to the construction site of the company’s new data center. Last Friday the CIO of the company said to Pat, “I have good news and bad news! The good news is that based on the great job you did with the FORCSS™ analysis for our company’s five-year data center strategy, we have decided to promote you. You are going to be responsible for the operation of the new Tier III data center we’re building so we can consolidate all our IT into one facility.”
Naturally Pat wondered, “What’s the bad news?”
The CIO continued, “Since this is the first data center we’re going to own, we don’t actually have a Facility Management team for you to manage. You’re going to have to create the team starting from scratch. And the team has to be in place within four months to help commission the new data center and accept the facility from the contractor. Fly out to the site on Monday, and come back to me in two weeks with a draft of your plans for the new Facility Management team.”
Of course, some program managers, informed that they will be creating a Facility Management (FM) team from scratch, might view that as more good news rather than bad news.
Experienced facility managers will immediately recognize that Pat’s challenge is much bigger than just hiring a new FM team. The “People” component is certainly part of what needs to be done in the next four months; however, operating the new data center will also require establishing relationships with service vendors, utilities and suppliers. Creating a maintenance plan with tasks and schedules is essential, and each maintenance task, whether preventive, predictive or reactive, will require a written procedure for the operators to follow. In addition, the data center will need a complete set of operating policies and rules. While all this is being created, Pat (and the FM team) will need to monitor the construction of the new facility and participate in the commissioning, acceptance and certifications of the site. Finally, the FM team will want to establish a good working relationship with the team’s “customers,” the IT personnel who will be installing and operating IT equipment in the data center.
Pat’s first task is to decide on a mission statement for the new FM team. Employees sometimes think of a mission statement as an enterprise-level declaration of the goals and objectives of a company, developed to have something to put on the first page of the annual report but having little relevance to operations. But all employees of an enterprise, and certainly all the managers, should have mission statements of their own, with three key elements:
- What am I supposed to do (goals and objectives)?
- Who am I supposed to do it for (clients/stakeholders)?
- How am I going to be measured (measures of value)?
In Pat’s case, the mission statement might start out quite simply:
Goals and objectives:
- Operate the data center with 100% safety and 100% availability. [In our story, Pat’s data center is a critical facility, with a consequent requirement for a commitment to 100% availability. An organization with several sites and a resilient overall architecture might, in theory, accept an objective of less than 100% availability, but it is hard to imagine any facility manager going to a CIO and saying “I’m committed to 95% uptime for this data center!”]
- Install and activate IT equipment as it migrates into the data center, on time and within budget.
- Manage the facility within an approved budget.
Clients and stakeholders:
- The CIO
- The company’s IT departments and IT users
- Stakeholders – The rest of the company, vendors, the company’s customers, shareholders.
Measures of value:
- Safety and availability records – no incidents, no events
- IT fit-out scheduled dates versus actual dates
- Facility budget versus actuals.
Of course, when Pat presents a mission statement to the CIO some additional objectives might be added:
- A PUE target
- A renewable energy target to support the company’s environmental sustainability goals
- Certification targets for the team to achieve Uptime Institute, ISO or other certifications within specified time frames.
Creating the Structure, and Structuring the Team
Once management has approved the mission of the new department, Pat can begin to develop the strategies required to reach the objectives. From this point forward all decisions will be based on the technical and business conditions specific to the new corporate data center, so Pat will need to understand those conditions. In addition to the obvious review of the design and equipment selections for the new data center, Pat will engage in in-depth discussions with all the IT user groups to understand their migration plans, capacity growth forecasts and the criticality of the applications that will be hosted in the data center.
Pat will also need to understand the exostructure–the external resources and factors that can either support or constrain the FM team or that pose a risk to the operation of the data center. [Note that IT departments sometimes refer to cloud-based services and other external IT resources as exostructure, but that is not how Pat’s company views it].
To understand the exostructure, Pat will need to interview key equipment vendors, local utilities and service providers to determine their capabilities, strengths and weaknesses and look at all the physical constraints and risks in the area. Finally, since almost all the decisions will result in some level of expenditure of company funds, Pat will want to carefully document everything learned during those investigations and the subsequent strategic decisions for inclusion in the budgeting process for the new department.
To develop strategies in a relatively simple manner, Pat might consider starting with three focus areas:
- Materials and Methods (M&M)
- Policies and Procedures (P&P)
For each focus area, Pat must identify critical decision points; factors influencing each decision can be listed, perhaps in a simple spreadsheet or decision matrix as shown in Figure 1.
As Pat investigates the infrastructure and the exostructure specific to the new data center, the critical factors for each item can be replaced with real world data–costs, conditions, existing regulations or standards and so on. Then the decisions column can be filled in with Pat’s recommendations on each item. During the planning sessions with the CIO, the approval of those decisions can be documented.
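The figure is not reproduced here, but the columns described above (item, critical factors, decision, approval) map naturally onto a small data structure. The Python sketch below is one hypothetical way to keep such a matrix outside a spreadsheet; the field names and the sample rows are illustrative assumptions, not content from Figure 1.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionItem:
    focus_area: str                      # e.g., "Materials and Methods"
    decision_point: str                  # the question to be decided
    critical_factors: list = field(default_factory=list)
    decision: str = ""                   # recommendation, filled in later
    approved: bool = False               # recorded during planning sessions


def open_items(matrix):
    """Items still waiting for a recommendation or for approval."""
    return [item for item in matrix if not item.decision or not item.approved]


# Hypothetical rows for illustration only.
matrix = [
    DecisionItem("Materials and Methods", "On-site spare parts inventory",
                 ["vendor lead times", "capital cost", "storage space"]),
    DecisionItem("Policies and Procedures", "Security and access policy",
                 ["corporate standards", "visitor and vendor rules"],
                 decision="Adopt corporate policy with site addendum",
                 approved=True),
]

for item in open_items(matrix):
    print(item.focus_area, "-", item.decision_point)
```

Replacing the placeholder factors with real costs, conditions and regulations, then filling in `decision` and `approved`, follows the same sequence the text describes.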
The most significant decisions will require separate documentation, to explain the costs, risks and benefits of a particular issue (e.g. using FTEs or a service vendor to provide operations coverage after hours and on weekends or investing in a large inventory of spare parts on site).
Pat is comfortable with the FORCSS methodology and wants to make FORCSS a standard decision tool within the company. The more times the methodology is used in the decision process, the better. So, here’s how Pat might apply FORCSS to presenting one of the key decisions that needs to be made, justified and documented: whether to use FTEs or a service contractor for operational staffing of the new data center (see Figure 2).
The Facility Management Structure
After spending time going through the decision matrix and reviewing the FORCSS analyses of the major decisions with the CIO, Pat will be well positioned to assemble the facility management structure for the new data center.
The elements will include:
- The team’s mission statement with clear objectives and measurements. One of the objectives will be an internal service level agreement, essentially a contract between the FM team and the IT department, committing to availability, response and communication levels.
- A table of organization and responsibilities for the new department, including both internal positions and key vendors. For each internal position, there will be an associated job description; for each vendor, a scope description and service level agreement (see Figure 3).
- A staffing plan that details hours of operation, shift structures and tasks to be self-performed and outsourced. As this is a new data center, the staffing plan will include a recruiting plan to find and hire the employees required.
- A training plan with descriptions of each required training session and course, ranging from the safety overview session (SOS) given to every person who enters the data center to the individual development plans for full-time employees. Licenses, certifications or professional credentials required for any employee will be included in the training plan.
- A list of internal and external policies that will need to be created for the data center. This will include policies for safety, work rules, human resources, security and access, environmental sustainability, purchasing and materials management, cleaning and rules of conduct for employees, visitors and vendors.
- The vendor management plan, listing the vendors the FM team intends to contract with and ultimately detailing the scope of the vendor’s responsibilities and commitments, the service level agreement with that vendor, the vendor’s contacts and escalation, and the vendor’s freedom of action–tasks the vendor is allowed to do on a routine basis, tasks the vendor must ask permission to do and tasks the vendor must be supervised for.
- A list of the plans and procedures for preventive, predictive and reactive maintenance that will need to be created over the next four months. The procedures themselves can be developed based on manufacturer’s maintenance recommendations, industry standards, the design of the data center (including Tier level) and the framework of policies and rules that will regulate the operation of the data center.
- A description of the maintenance management system that the team will use to schedule, track and document preventive, predictive and reactive maintenance. This will include an acquisition plan to purchase the system and an implementation plan to place the system in operation, load the full list of maintenance tasks with associated tools, spares and consumables and a full set of operational procedures for using the system starting from Day 1.
- And since this is a new data center, Pat will include a commissioning and acceptance plan that will detail the steps in the transition from a site under construction to site in operation.
The last element of the facility-structuring plan that Pat will develop is a budget, which will have both capital expense and operating expense forecasts. Capital expenses in the first year will be high, since tools and equipment will need to be purchased, along with initial stocks of spares and consumables. Since Pat has carefully documented all the decisions that combine to create the facility management structure for the new data center, the operating budget can be derived from those decisions.
With a decision matrix and the outlines of the facility-structuring plan in hand, Pat is ready to meet with the CIO and get approval for these decisions and strategies. And once management has approved the plan, the hard work of turning that plan into reality can begin.
How much involvement the FM team has in the design, construction and commissioning of the site will affect the facility’s availability and hence the ability of the team to meet its mission statement, especially in the first days and weeks after a facility is accepted and handed over. Many of us have had experience with design engineers and skilled contractors who have never actually operated a critical facility, and who include elements in the infrastructure that will make the operations team’s task more difficult. Examples range from the mundane (valves that can only be reached by ladder or scissor lift, no convenience outlets in critical spaces to plug in tools) to the silly (security doors with the magnetic lock on the unsecure side of the door, roof drains at the high point of a roof section) to the serious (obstructed fire corridors and poorly positioned smoke detectors). Commissioning agents usually test for reliability, not operability, so many of these operational snags will be missed during commissioning. In addition, at many new sites, commissioning tests (and sometimes Uptime Institute Tier Certification of Constructed Facility demonstrations) are performed by the contractor with the commissioning agent watching.
The FM team that will actually operate the facility may only be observing, not actively participating. So, the first time the facilities engineers take “hands-on” control of the systems is when the facility is in production.
The first time the critical systems’ maintenance tasks are performed is when the facility is in production. And, the first time the FM team must respond to a system or component failure is when the facility is in production.
It is understandable that owners may be reluctant to pay salaries or fees to an operations team with “nothing to operate” (while the facility is in design and construction), but having the FM team on site early on will reduce risk of human error during the critical first months of operation, when the site is at its most vulnerable. In addition, FM team involvement in the selection and purchase negotiations for major infrastructure systems should result in lower total cost of ownership (by reducing operational costs) as well as better documentation and training packages (two items which construction contractors seldom include in purchase negotiations, but which make a big impact on the operability of the site).
Fred Dickerman is vice president, Data Center Operations for DataSpace. In this role, Mr. Dickerman oversees all data center facility operations for DataSpace, a colocation data center owner/operator in Moscow, Russian Federation. He has more than 30 years of experience in data center and mission-critical facility design, construction and operation. His project resume includes owner representation and construction management for over $1 billion (US) in facilities development, including 500,000 square feet of data center and five million square feet of commercial properties. Prior to joining DataSpace, Mr. Dickerman was the VP of Engineering and Operations for a Colocation Data Center Development company in Silicon Valley.
Excel is the basis of a surprisingly simple visual tool to document underfloor conditions
By Chad Beery, ATD
Engineers regularly employ pictures and drawings to communicate design ideas. They sketch on whiteboards during a meeting, talk with their hands and sometimes even sketch on the back of napkins. In fact, the deliverables produced by design engineers and consultants include drawings—although today these are typically computer-generated. And while it is true that engineers and design consultants rely on all sorts of images, it is equally true that sometimes only a photograph can capture and convey important details accurately. The composite underfloor picture is one such instance.
Take, for example, a large enterprise that has a 14,000-square-foot (ft²) computer room with a 30-inch raised floor. Chilled-water air-handling units fed from a piping loop in the raised floor provided cooling, with most of the piping having been installed in the early 1990s. As part of a room refresh, the enterprise engaged Peters, Tschantz & Associates to develop a plan to replace the piping without interrupting service to the active computer room.
Even with a very clear underfloor plenum, replacing piping without interrupting service in a computer room is a difficult task—demolition and welding present hazards to sensitive equipment, and floor space for rigging and staging is at a premium. In this case, extensive copper and fiber data cables as well as power whips run under the floor compounded the complexity. In many areas, the cabling had been abandoned in place as equipment was removed. Over time, the problem became worse, as more and more abandoned cables made it even less clear what was in service and what could be safely removed.
Peters, Tschantz & Associates began its design work with a field investigation to help it fully understand the existing conditions. We were able to document the existing piping and create a three-dimensional model (Revit) of the piping. We knew that understanding the location of the power and data cabling was crucial to completing this project successfully.
Simple observation told us that it would be very impractical, if not impossible, to document and present the underfloor condition with enough detail to accurately convey the complexity of the work. As an alternative, we developed a technique to create a composite photograph of the entire underfloor plenum (see Figure 1).
First, we established a labeling nomenclature for floor tiles (letters for columns, numbers for rows). Then we lifted the floor tiles one at a time, so that we could take a digital photograph of the underfloor plenum directly beneath each tile. In some cases, equipment such as PDUs, CRACs and IT cabinets on the tiles blocked our efforts. As the photographs were taken, we recorded their addresses on a dry erase board laid next to the open tile.
Post-processing work began as soon as the last of the photographs was taken. We renamed the photographs using the column-and-row nomenclature. We cropped all the photos to a 1:1 aspect ratio (see Figure 3) to match the 24-by-24-inch floor tile opening and then compressed the files.
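The 1:1 crop can be expressed as a small calculation. The Python sketch below assumes a centered crop, which the article does not specify, and returns the box as (left, top, right, bottom), the same tuple order that imaging libraries such as Pillow accept for cropping.

```python
def center_square_box(width, height):
    """Largest centered square inside a width x height image."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# A hypothetical 4000 x 3000 pixel photograph.
print(center_square_box(4000, 3000))
```

If the tile opening is not centered in the frame, the offsets would need to be adjusted per photograph.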
Next, we constructed a grid matching the floor tile layout in Microsoft Excel, using cell borders to outline the room. A VBA macro written by our design team scans through a folder of pictures, reading their addresses from their file names and inserting them in the proper location in the grid. The result is a single file composed of over 2,500 individual pictures.
As can be expected, the file is quite large. To allow for higher-resolution photographs of smaller regions of the floor, the macro was enhanced to allow the user to filter for a specific location in the room, instead of generating a picture of the entire room (see Figure 4), which helps us generate composite pictures of small areas of the floor quickly.
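The authors' macro was written in VBA and is not reproduced in the article. As a language-neutral illustration of the same idea, the Python sketch below reads a tile address from each file name, converts it to a grid position and optionally filters for a region of the room. The file-name pattern (column letters followed by a row number, such as `AB12.jpg`) and all function names are assumptions made for this example.

```python
import re
from pathlib import Path

TILE_NAME = re.compile(r"^([A-Z]+)([0-9]+)$")


def column_index(letters):
    """Spreadsheet-style column number: A=1, Z=26, AA=27."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_tile(filename):
    """Return (column, row) from a name like 'AB12.jpg', or None if it does not match."""
    match = TILE_NAME.match(Path(filename).stem.upper())
    if match is None:
        return None
    return column_index(match.group(1)), int(match.group(2))


def place_tiles(filenames, col_range=None, row_range=None):
    """Map (row, column) grid cells to file names, optionally limited to a region."""
    placed = {}
    for name in filenames:
        address = parse_tile(name)
        if address is None:
            continue  # not a tile photograph
        col, row = address
        if col_range and not col_range[0] <= col <= col_range[1]:
            continue
        if row_range and not row_range[0] <= row <= row_range[1]:
            continue
        placed[(row, col)] = name
    return placed


print(place_tiles(["A1.jpg", "B2.jpg", "C3.jpg", "readme.txt"], col_range=(1, 2)))
```

The resulting dictionary could then drive whatever tool assembles the composite, whether a spreadsheet grid or an image library.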
Because of the size of the picture, the client wanted us to develop a method of identifying each floor tile on the composite photo, so we added a user-selected label-view function to the macro. These labels make it much easier to reference specific floor tiles in reports, meetings or telephone conversations (see Figure 5).
Having an accurate record of underfloor conditions has been helpful to the engineers as they plan minor moves, adds and changes in the computer room, as well as for developing master planning strategies for system replacement and upgrade. Concepts that seem practical on the surface (looking above the floor only) can be reviewed after looking deeper (under the floor).
The facilities staff has also found the composite photo useful for conveying the need for underfloor cleanup of abandoned cabling.
In one case, the photo was used to overcome a manager’s reluctance to approve a project for removal of underfloor cabling.
While photographs do not appear poised to take the place of traditional engineering drawings in the near future, this project is an example of how an innovative use of technology can provide great benefit to the designer, facility owner and installing contractor.
Chad Beery, PE, ATD, LEED AP, is one of the three ATDs at Peters, Tschantz & Associates, Inc. in Akron, OH. The firm’s 30+ employees provide MEP engineering services to a variety of industries, with a focus on the mission-critical and health-care sectors. Since joining the firm in 2007, Mr. Beery has been involved in many different project types. His mission-critical work has involved data center design, equipment replacements, CFD modeling and redundancy and capacity consulting. Another important part of his work has been numerous control system design projects, both new and retrofit. Enjoying hands-on work and seeing systems in action, Mr. Beery has also developed system commissioning skills on a number of projects ranging from data centers to schools to research facilities.