Decommissioning as a Discipline: Server Roundup Winners Share Success
How did these six enterprises find and eliminate so much waste?
Comatose IT equipment, meaning servers long abandoned by application owners and users but still racked and running, is hiding in plain sight within even the most sophisticated IT organizations. Obsolete or unused servers represent a double threat in terms of energy waste: they squander power at the plug and also waste data center facility power and capacity.
Uptime Institute Research circa 2009 states that decommissioning one rack unit (1U) of servers can save US$500 per year in energy costs, an additional US$500 in operating system licenses, and US$1,500 in hardware maintenance costs. But reaping those rewards is no easy task.
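As a rough illustration of how those per-unit figures add up, the Python sketch below totals the three savings categories for a given number of rack units. It assumes, for simplicity, that all three figures recur annually (the research as quoted says so explicitly only for energy) and that savings scale linearly with the number of units. The constant and function names are hypothetical.

```python
# Per-1U savings figures quoted from Uptime Institute Research (circa 2009).
# Assumption: all three are treated as annual amounts; only the energy
# figure is explicitly stated as "per year" in the text.
ENERGY_USD = 500
OS_LICENSE_USD = 500
HW_MAINTENANCE_USD = 1_500


def estimated_annual_savings_usd(rack_units: int) -> int:
    """Estimate yearly savings from decommissioning `rack_units` of 1U servers.

    Assumes savings scale linearly with the number of rack units removed.
    """
    if rack_units < 0:
        raise ValueError("rack_units must be non-negative")
    per_unit = ENERGY_USD + OS_LICENSE_USD + HW_MAINTENANCE_USD
    return rack_units * per_unit


print(estimated_annual_savings_usd(1))    # 2500
print(estimated_annual_savings_usd(100))  # 250000
```

Under these assumptions, each 1U server removed is worth about US$2,500 per year, which is the kind of dollar figure that tends to resonate with executives more than kilowatts do.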
According to Uptime Institute’s estimates based on industry experience, around 20% of servers in data centers today are obsolete, outdated or unused. That percentage may in fact be conservative.
According to one media report, LexisNexis found that 50% of its servers were comatose in one of its audit samples. When the insurance firm Sun Life took back management from an outsourced data center engagement firm in 2011, it found that 40% of its servers were doing absolutely nothing.
As early as 2006, Uptime Institute Founder Ken Brill identified comatose servers as one of the biggest opportunities for companies to improve overall IT energy efficiency. While Mr. Brill advocated for industry action on this issue, he often cautioned, “Nobody gets promoted for going around in the data center and unplugging servers.” Mr. Brill meant that data center professionals had no incentive to remove comatose machines and that IT executives lacked insight into the impact idle IT equipment was having on the cost structures of their organizations, as their departments do not pay the data center power bill.
The corporate disconnect between IT and Facilities Operations continues to challenge the data center industry. Data center managers need to overcome that organizational barrier and get executive-level buy-in to implement an effective server decommissioning program.
This is why Uptime Institute invited companies around the globe to help address and solve the problem of comatose servers by participating in the Server Roundup, an initiative to promote IT and Facilities integration and improve data center energy efficiency.
The annual Uptime Institute Server Roundup contest was launched in October 2011 to raise awareness about the removal and recycling of comatose and obsolete IT equipment in an effort to reduce data center energy use. In 2012, Uptime Institute named AOL and NBC Universal inaugural Server Roundup champions. AOL had removed nearly 10,000 obsolete servers, and NBC Universal culled 1,090 comatose machines, representing 29% of its overall IT footprint. The following year’s results were even more impressive.
2013 Winners and Finalists
WINNER: AOL won in back-to-back years for its overall tally of servers removed. The global Web services company decommissioned 8,253 servers in calendar year 2012. This produced gross total savings of almost US$3 million from reduced utility and maintenance costs and asset resale/scrap. Environmental benefits included reducing carbon emissions by more than 16,000 tons, according to AOL.
WINNER: Barclays, a global financial organization, removed 5,515 obsolete servers in 2012, yielding power savings of around 3 megawatts, annualized power cost savings of US$3.4 million, and a further US$800,000 in hardware maintenance savings.
FINALIST: TD Bank removed 513 servers in 2012. The team from this Canadian financial firm has removed 2,941 units in the five years it has been working to clear obsolete machines from the raised floor. Although TD Bank’s annual server count does not approach the impressive numbers put up by AOL, the organization makes up for it in the volume of waste that it diverts from local and municipal waste sites. All the equipment sent through the e-waste recycler is salvaged within a 110-mile radius of TD Bank’s primary data centers. Nothing is shipped overseas for processing.
FINALIST: McKesson pulled 586 servers in 2012, reducing data center power usage by 931.7 kilowatts and saving US$734,550.
FINALIST: Sun Life Financial removed 387 servers in 2012, which resulted in 32 kilowatts of power savings across three data centers and financial savings of US$8,800 per month.
Since the contest’s launch two years ago, Server Roundup participants have decommissioned and recycled 30,000 units of obsolete IT equipment.
In the sidebars, Server Roundup winner Paul Nally and finalist Rocco Alonzi discuss the challenges and benefits of a server-decommissioning program and detail their strategies for success.
Takeaways From Last Year’s Winners
During the 2013 Uptime Institute Symposium, last year’s winners provided the following advice:
- Get senior management to buy into the program. “There is risk involved, but we need to get senior management buy-off on the risk,” Nally said. “There’s short-term risk and long-term risk. If you flip the wrong switch, and we have, you’ll cause an outage. But if you leave it on the wire to stagnate for five to six years, when it eventually dies, we will not be able to recover it.”
- When you pitch server decommissioning to execs, discuss business impacts. “The easiest way to find yourself alone in an empty room is to call a meeting about server retirement,” Nally said. “People don’t understand the challenge. When we have the conversations with the C-level suite, we tell them what 5,000 servers means. We don’t talk in terms of kilowatts. We talk in terms of dollars.”
- The biggest roadblock will be cultural. “Executives have other things on their roadmaps that are more interesting, like developing revenue. Getting buy-in requires getting people to commit to doing stuff they don’t like doing. People would rather move on to the next great thing, rather than dealing with the management problem they have,” said Scott Killian, Senior Technical Director of Data Center Services at AOL.
- Get some help. “We brought in a couple of university students to do a book-to-floor audit of all the servers over three months under supervision of my group,” Alonzi said. “We took that information and started to cross-reference based on applications. All these data were about 80% accurate. Once we gathered all the information, we found question marks around a lot of hardware. There was work we had to do with our service providers, network people and storage guys. We literally had to drag people onto the raised floor and point to a cabinet or a bank of servers and say, ‘What are these doing?’”
- Don’t be afraid to perform the “scream test.” “This is where you have a server that you know is not live, but you cannot find or establish the server owner. You pull the network cable from the back of the server and see who calls you to report the server being down and then investigate from there,” said Guy Pattison, Technical Solutions Officer, Data Center Management, TD Bank.
- Document as much as possible. “Having a good DCIM is key. We have a backend system polling the servers to understand how machines are being used and who’s using them,” Nally said.
- Keep up with incoming servers. “Any new hardware purchased comes through the data center operations group,” Alonzi said. “We don’t make a decision on what they’re buying, but we make sure it’s assigned to a project, and it’s not landing on the dock because the vendor was having a fire sale. Unless there’s a net new project or growth, we challenge more now.”
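The book-to-floor audit mentioned in the takeaways above is, at its core, a reconciliation of two inventories: what the asset database says exists and what is physically racked. The Python sketch below shows one way such a comparison might be expressed. The asset tags, the normalization step, and the function name are all hypothetical; none of the companies quoted here described their tooling at this level of detail.

```python
def book_to_floor_audit(book_assets, floor_assets):
    """Compare asset-database records ("book") with a physical floor walk.

    Both arguments are iterables of asset tags. Tags are normalized
    (whitespace stripped, upper-cased) so minor transcription
    differences do not create false mismatches.
    """
    book = {tag.strip().upper() for tag in book_assets}
    floor = {tag.strip().upper() for tag in floor_assets}
    return {
        # Racked hardware with no database record: candidates for follow-up.
        "unrecorded_on_floor": sorted(floor - book),
        # Database records with no hardware found on the floor.
        "missing_from_floor": sorted(book - floor),
        # Records that agree in both inventories.
        "matched": sorted(book & floor),
    }


# Hypothetical sample data for illustration only.
book = ["SRV-001", "SRV-002", "SRV-003"]
floor = ["srv-002", "SRV-003 ", "SRV-104"]

audit = book_to_floor_audit(book, floor)
print(audit["unrecorded_on_floor"])  # ['SRV-104']
print(audit["missing_from_floor"])   # ['SRV-001']
print(audit["matched"])              # ['SRV-002', 'SRV-003']
```

The two mismatch lists are where the "question marks" show up: each entry still needs a human to track down an owner, whether by asking support teams or, as a last resort, by running a scream test.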
Paul Nally, Director at Barclays
“It has been said that the greenest data center is the one that’s never built. That is the main reason we have our server decommissioning program at Barclays. We are looking to shrink our data center footprint and benefit from the savings that this affords us, while allowing ourselves to massively expand our overall compute capability. When obsolete servers are removed in the thousands, it creates the capacity that we need to bring the next generation of systems in.
We save in space; we save in power. It helps us meet our carbon targets. When we eliminate or virtualize a server, we also save on network, SAN, and software costs. A server that may have cost US$100,000 seven years ago, took up half a rack of space, and required a couple of kW to run is absolutely crushed in compute performance by a modern blade costing US$5,000. But the benefits extend throughout the overall organization. A focus on removing these obsolete systems simplifies the environment from a network and systems administration perspective. Applications teams benefit from a more stable system that is easily maintained and integrated into contemporary management frameworks. We end up in a cleaner, safer, cheaper place with the capacity in hand that we need to continue to grow our business. There is real work, and some risk, in getting this job done, but the benefits are simply too many to ignore.”
Rocco Alonzi, AVP Data Center Governance at Sun Life Financial
“The removal of an under-utilized server sounds much easier than it really is. The thought of turning off a server and removing it from the raised floor can be overwhelming even if you are 100% certain that it is no longer required. Think about the process for a moment. As the server connections (electrical power, network, SAN storage) are removed and the server physically pulled out of a production cabinet, the hard drive data must be permanently destroyed and finally the server needs to be returned to the vendor or disposed of properly. The logical aspect includes another entirely separate process, so that in the end it is much easier on everyone to leave it powered on.
This is the message I communicated to the Leadership team, followed by a solution and a promise. The solution included a dedicated resource (a contractor), an asset database, and cooperation from the Server, Storage, and Network support teams. The contractor walked the raised floor performing a book-to-floor audit against the asset database. And, yes, this did take some time: three months, to be exact. This rich information was used to identify the servers that were not in the database but physically on the raised floor. We also challenged the support groups to associate their service offering with corresponding hardware infrastructure. These two exercises led to approximately 400 servers being switched off.
The promise was that the Data Centre Operations team would do all the work after the hardware device was switched off. This included working with the support groups to reclaim IP addresses, SAN storage ports, and electrical power cords. We also provided the Financial department with detailed hardware information to reclaim cost savings that were passed on to the business unit. Finally, a process was put into place to remove the physical server from the raised floor, destroy the data, and properly dispose of the hardware.
The message: Raise the Leadership team’s awareness of the issue and take a dedicated approach to decommissioning hardware infrastructure. It is well worth the effort.”
Matt Stansberry is director of Content and Publications for the Uptime Institute and also serves as program director for the Uptime Institute Symposium, an annual spring event that brings together 1,500 stakeholders in enterprise IT, data center facilities, and corporate real estate to deal with the critical issues surrounding enterprise computing. He was formerly Editorial Director for TechTarget’s Data Center and Virtualization media group and managing editor of Today’s Facility Manager magazine. He has reported on the convergence of IT and Facilities for over a decade.