An artificial intelligence (AI) strategy for data center management and operation requires more than just data and some very smart humans. Selecting specific use cases and understanding the types of data that influence AI outcomes — and then validating those outcomes — will be key if the needs of the business are to be met.
By focusing on specific use cases, early successes can then be scaled, and further value can be extracted incrementally. Managers don’t need to be AI experts, but Uptime Institute does recommend that they understand the fundamental depth and breadth of the AI being applied. Doing so means they can better determine how much data is required and how the AI will be using the data, which will be critical when validating results and recommendations.
As a first step, let’s address a few points about AI. First, what is the difference between algorithms and models? AI marketers often use these terms interchangeably, although they do not mean the same thing.
An algorithm is a sequence of mathematical steps or computational instructions. It is an automated instruction set. An algorithm can be a single instruction or a sequence of instructions — its complexity depends on how simple or complex each individual instruction is and/or the sheer number of instructions that the algorithm needs to execute.
In AI, a model refers to a mathematical model that is able to process data and provide the expected response to or outcome of that data. For example, if an algorithm is applied to a data set, the outcome would be the model. So, the model is the outcome of one or many algorithms. A model changes if the data fed into the algorithm changes, or if the same data is fed through a different algorithm.
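To make the distinction concrete, here is a minimal Python sketch. The least-squares routine plays the role of the algorithm, and the slope and intercept it returns are the model. The function names and the sample readings (IT load against cooling power) are illustrative assumptions, not data from the report.

```python
def fit_line(xs, ys):
    """Algorithm: ordinary least squares for a straight line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept  # this pair of numbers is the model


def fit_constant(xs, ys):
    """A different algorithm: ignore x and always predict the mean of y."""
    return 0.0, sum(ys) / len(ys)


def predict(model, x):
    """Apply a model (slope, intercept) to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical readings: IT load (kW) and cooling power (kW).
it_load = [100, 200, 300, 400]
cooling_a = [40, 70, 100, 130]
cooling_b = [50, 90, 130, 170]

model_a = fit_line(it_load, cooling_a)      # same algorithm, data set A
model_b = fit_line(it_load, cooling_b)      # same algorithm, different data: different model
model_c = fit_constant(it_load, cooling_a)  # same data, different algorithm: different model
```

Feeding different data through `fit_line`, or the same data through `fit_constant`, yields a different model each time, which is the behavior described above.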
Another very important distinction is between the two main types of AI techniques being used in data centers today: machine learning and deep learning.
There are three main types of machine learning techniques:
Supervised learning: Humans supply a model and training data. Algorithms take the training data and fine-tune the model so the inputs and outputs/responses are more closely aligned. As more data is added over time, the algorithms further improve the model and can make reasonable predictions for responses to new data. Supervised machine learning is the technique most commonly used in data centers and other industries.
Unsupervised learning: Algorithms find patterns or intrinsic structures in unlabeled data. In some scenarios, unsupervised machine learning techniques are combined with supervised ones. In effect, the output of unsupervised machine learning can become the training data for supervised machine learning.
Reinforcement learning: Humans supply a model and unlabeled data. When an algorithm determines an optimal outcome for the data, it is reinforced by a positive mathematical “reward.” (An open-source reinforcement learning framework from Google is appropriately called Dopamine.) Through this feedback, the algorithm learns by trying different variations. Of the three, reinforcement learning is the newest machine learning technique.
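The three techniques above can be sketched in a few lines of Python each. This is a toy illustration under stated assumptions: the temperature readings, the labels, the candidate setpoints and the reward function are all hypothetical, and real products use far more sophisticated methods.

```python
# Supervised: labeled examples (inlet temperature in C, label) tune a human-supplied
# model, here a single decision threshold.
def train_threshold(samples):
    normal = [t for t, label in samples if label == "normal"]
    hot = [t for t, label in samples if label == "hot"]
    return (sum(normal) / len(normal) + sum(hot) / len(hot)) / 2


# Unsupervised: no labels; a one-dimensional 2-means clustering finds the groups itself.
# Assumes the values are not all identical, so neither group ends up empty.
def two_means(values, iterations=10):
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return low, high


# Reinforcement: try actions, receive a numeric reward, favor the best average reward.
def learn_setpoint(actions, reward_fn, rounds=20):
    totals = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(rounds):
        untried = [a for a in actions if counts[a] == 0]
        if untried:
            action = untried[0]  # explore: try every action once
        else:
            action = max(actions, key=lambda a: totals[a] / counts[a])  # exploit
        totals[action] += reward_fn(action)
        counts[action] += 1
    tried = [a for a in actions if counts[a] > 0]
    return max(tried, key=lambda a: totals[a] / counts[a])


labeled = [(20, "normal"), (22, "normal"), (24, "normal"),
           (30, "hot"), (32, "hot"), (34, "hot")]
unlabeled = [20, 22, 24, 30, 32, 34]

threshold = train_threshold(labeled)    # midpoint between the two labeled groups
centers = two_means(unlabeled)          # cluster centers found without labels
# Hypothetical reward: the closer a setpoint is to 25 C, the higher the reward.
best = learn_setpoint([21, 23, 25, 27], lambda s: -abs(s - 25))
```

Note that the unsupervised cluster centers could be used to label the readings, which would turn them into training data for the supervised step, as described above.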
Deep learning, a subset of machine learning, uses multiple layers of artificial neural networks to build algorithms, based on vast data, that find an optimal way to make decisions or perform tasks on their own. Humans supply training data and algorithms, and the computer breaks down these inputs into a hierarchy of very simple concepts. Each concept becomes a mathematical node on the neural network. Instead of relying on machine learning models supplied by humans, deep learning feeds the training data through the neural network, which works somewhat like a decision tree, and builds new models from its own analysis of that data.
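As a rough picture of what “layers of nodes” means, the sketch below passes two inputs through a hidden layer and an output layer. It is an assumption-laden toy: the weights are set by hand so the result is easy to check, whereas a deep learning system would learn them from training data and would use many more nodes and layers.

```python
def relu(x):
    """A common activation function: pass positive values, clamp negatives to zero."""
    return max(0.0, x)


def layer(inputs, weights, biases, activation):
    """One layer: each node is a weighted sum of the inputs plus a bias, then an activation."""
    return [activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
            for node_weights, b in zip(weights, biases)]


def tiny_network(x1, x2):
    # Hidden layer with two nodes (hand-set weights, for illustration only).
    hidden = layer([x1, x2], weights=[[1.0, 1.0], [1.0, 1.0]],
                   biases=[0.0, -1.0], activation=relu)
    # Output layer with one node that combines the hidden nodes.
    (output,) = layer(hidden, weights=[[1.0, -2.0]], biases=[0.0],
                      activation=lambda v: v)
    return output


# The network outputs 1.0 when exactly one input is 1, and 0.0 otherwise.
results = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
```

Neither hidden node answers the question alone; the output node combines two very simple intermediate concepts, which is the layering idea described above.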
Which technique is best for which use case? It depends on the quality and sophistication of the algorithm, as well as the model and data being used. If all these things are equal, however, there are certain techniques that are particularly well-suited to certain use cases.
Some say deep learning can find greater levels of inefficiencies because it is unfettered by known models. On the other hand, supervised machine learning is more transparent (making it easier for domain-expert humans to validate results) and, arguably, quicker to automate.
Suitability can vary, but below are some use cases that can be well suited to different types of machine learning and to deep learning.
It is still early days, but it is likely that certain techniques will dominate specific use cases over time.
At a minimum, operators should understand the fundamental level of the depth and breadth of the AI being applied. Ask the supplier to show the data points in the model and the relationship between those items — in other words, how the AI is using the data to make recommendations for action. And, of course, it is always important to track the results when actions are taken (by a human operator).
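Tracking results can start very simply. The sketch below, with hypothetical record fields and made-up numbers, logs what the AI predicted for each action an operator took and what was actually measured, then summarizes the gap.

```python
from dataclasses import dataclass


@dataclass
class ActionRecord:
    recommendation: str
    predicted_saving_kwh: float  # what the AI said the action would save
    measured_saving_kwh: float   # what the operator measured afterwards


def mean_absolute_error(records):
    """Average gap between predicted and measured savings, in kWh."""
    return sum(abs(r.predicted_saving_kwh - r.measured_saving_kwh)
               for r in records) / len(records)


# Hypothetical log of actions taken by a human operator.
log = [
    ActionRecord("raise supply air setpoint by 1 C", 120.0, 100.0),
    ActionRecord("reduce fan speed by 5%", 80.0, 90.0),
    ActionRecord("rebalance cooling units", 50.0, 20.0),
]

error_kwh = mean_absolute_error(log)
```

A persistently large error is a prompt to go back to the supplier and ask how the model is using the data.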
The full report Very smart data centers: How artificial intelligence will power operational decisions is available to members of the Uptime Institute Network community. For more information about membership, click here.
Data center AI: Start with the end in mind, by Rhonda Ascierto, Vice President, Research, Uptime Institute (2019-08-12 08:00:44; 2019-11-18 08:49:50)
We have heard for years about the Open Compute Project (OCP) and its non-traditional approach to computing hardware, from the racks to the servers, storage and networking. And over the last few years, the Open19 Foundation has started to promote its alternative platform, which resembles the more traditional 19-inch rack approach we have all known since the 1980s. (Note: the formal Open19 specification was only publicly released this past year.) But neither of these approaches has resulted in earth-shattering, data center-changing deployments outside of the web-scalers and a handful of early-adopter pilots and test beds. And many argue that companies like Facebook that are heavily invested in the OCP platform have essentially created a hardware design that works for THEM (mostly), and that very few other companies could realize the savings Facebook reports from this ‘non-standard’ approach, due to many factors including hardware, software, staffing, support, training and compatibility concerns.
But time continues to move forward and some of those core values we have grown up on in the data center are changing too. Think of storage and networking approaches now versus 10 years ago. Think of what VMware looked like back then versus today. Vastly different. Think of the skills needed back then versus now. So perhaps there is room for a new platform. As the staffing for IT gets new blood, perhaps they can cut their teeth on a new platform?
As such the Open19 Foundation has designated 2019 the Year of Accelerated Adoption for the ‘new platform’, its Open19 specification, which defines four server profiles, based on a standard 19-inch rack. Open19 expects the specification to be the basis of flexible and economic data center and edge solutions for facilities of many sizes and densities.
At the organization’s May 2019 summit, Yuval Bachar, president of the Open19 Foundation and principal engineer of data center architecture at LinkedIn, told Data Center Knowledge that Open19 gear has been deployed at scale in the social network’s facilities in Oregon and Texas. In addition, two mega data centers are running proofs of concept and six other companies are deploying or evaluating Open19 technology.
These early deployments support recent Uptime Institute Intelligence findings: Just 3% of respondents to our Ninth Annual Uptime Institute Data Center Survey (available to Uptime Institute Network members) said they were deploying Open19 hardware or designs, with another 8% evaluating Open19. That’s a total of about 50 respondents deploying or evaluating Open19. However, 54% of respondents said that they were not aware of Open19.
Despite these survey results, we agree with the Foundation: conditions may be right for an increase in Open19 adoption.
Viewed from one perspective, these adoption (or planned adoption) figures are really quite impressive: until its public release on March 12, 2019, the Open19 project specification was available only to the current Foundation members (including founding members Flex, GE Digital, Hewlett Packard Enterprise, Packet, LinkedIn and Vapor IO). The public release of the Open19 standard greatly increases the potential for new product options and deployments.
We found an additional point of interest in our survey data: senior executives (56%) and designers (47%) are more aware of Open19 than IT management (41%) and critical facilities management (41%). Senior executives (16%) and design engineers (17%) are also far more likely to say that they are deploying or considering Open19 designs or hardware than IT management (6%) and critical facilities management (9%). One possibility: Open19 designs and hardware are making their way into production without serious disruption to the routines of IT management and critical facilities management. That would be a promising development for Open19.
——————————————————————————–
For more information on OCP, Open19 and other data center standards, a wealth of research is available to members of the Uptime Institute Network. Members enjoy a continuous stream of relevant and actionable knowledge from our analysts and share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.
Open19 expects 2019 to be the year of “Accelerated Adoption”, by Kevin Heslin (2019-08-05 09:00:03; 2019-07-07 06:15:37)
In our October 2018 report, A mission-critical industry unprepared for climate change, Uptime Institute Intelligence urged data center operators and owners to plan for the effects of climate change. We specifically encouraged data center owners and operators to meet with government officials and utility executives to learn about local and regional disaster preparation and response plans.
A recent public filing by Pacific Gas and Electric (PG&E), California’s largest electric utility, underlines our point and gives data center owners and operators in that state, including areas near Silicon Valley, a lot to discuss.
According to The Wall Street Journal, PG&E plans to dramatically expand the number and size of areas where it will cut power when hot and dry weather makes wildfires likely, effectively eliminating transmission and distribution gear as a cause of wildfires. In addition, the utility announced plans to spend $28 billion over the next four years to modernize infrastructure.
Extreme wildfires, arguably a result of climate change, have caused PG&E and its customers big problems. In 2018, PG&E intentionally interrupted service in two different areas, disrupting essential services and operations in one area (Calistoga) for two days. And on May 16, 2019, California confirmed that utility-owned power lines started last November’s Camp Fire, which killed 85 people and destroyed the town of Paradise.
The utility has been forced to take drastic steps: In January 2019, it sought bankruptcy protection, citing more than $30 billion in potential damages (including as much as $10.5 billion related to the Camp Fire) from wildfires caused by its aging infrastructure and its failure to address the growing threat of extreme wildfires caused by climate change.
PG&E is on the front line but is not unique. The case demonstrates that it is unwise for data center operators and owners to address reliability in isolation. Circumstances affecting data centers in the PG&E service territory, for instance, can vary widely, making communication with utility officials and local authorities essential to maintaining operations in a disaster and to any recovery plan.
In this case, one might identify three distinct periods:
In the past, when climate change and aging infrastructure combined to gradually increase the risk of wildfire to a crisis point.
Now, when the bankrupt utility suddenly announced a plan to intentionally interrupt service to reduce wildfire risk, even though the experience in Calistoga suggests that customers and local governments are not prepared for the consequences of emergency power outages.
Sometime in the future, when PG&E’s major infrastructure investments begin to bear fruit and utility reliability begins to increase.
Each of these stages brings markedly different reliability and cost considerations, depending on how close a data center is to areas vulnerable to wildfires and where (and when) PG&E makes its infrastructure investments.
The full report, “A mission-critical industry unprepared for climate change” is available to members of Uptime Institute Network. Want to know more about this organization? Check out the complete benefits of membership here.
PG&E Turns Power Off (a.k.a. Climate Change and the Data Center), by Kevin Heslin (2019-07-29 08:00:29; 2019-07-07 06:18:23)
The spectacular growth of the public cloud has many drivers, only one of which is the deployment, redevelopment or migration of enterprise IT into the cloud. But many groups within the industry — data center builders and operators, hardware and software suppliers, networking companies, and providers of skills and services alike — are closely watching the rate at which enterprise workloads are moving to the cloud, and for good reason: Enterprise IT is a proven, high-margin source of revenue, supported by large, reliable budgets. Attracting — or keeping — enterprise IT business is critical to the existing IT ecosystem.
The popular view is that enterprise IT is steadily migrating to the public cloud, with the infrastructure layers being outsourced to cheaper, more reliable, more flexible, pay-as-you-go cloud services. Amazon Web Services (AWS), Microsoft Azure and Google are the biggest beneficiaries of this shift. There is very little contention on this directional statement.
It is only when we add the element of TIMEFRAME that we start to lose universal agreement. It is frequently reported and forecast that we will reach a tipping point where traditional data center-centric approaches (including non-public cloud or non-SaaS provided IT) will become prohibitively expensive, less efficient and too difficult to support. Looking at any complex system of change, it’s pretty clear this is sound thinking, BUT when will this happen? Some of the data published by industry analysts and (self-)proclaimed experts suggests it will be many years, while other studies suggest we will soon be nearing that point. It’s really a mixed bag, depending on who you are talking to.
The Uptime Intelligence view is that this kind of fundamental change usually happens much more slowly than technologists predict, and we expect the traditional infrastructure platforms (including in-house data centers and customer-managed co-location sites) to be the bedrock of enterprise IT for many years to come.
So how can the views vary so widely? In 2018, 451 Research’s “Voice of the Enterprise Digital Pulse” survey asked 1,000 operators, “Where are the majority of your workloads/applications deployed — now, and in two years’ time?” In the second half of 2018, 60% said the bulk of their loads were “on-premises IT”. Only 20% said the bulk of their workloads were already in the public cloud or in SaaS.
That is still a fair portion in the public cloud or SaaS, but it has taken time: public cloud has been available for about 13 years (AWS debuted in 2006) and SaaS, over 20 (Salesforce was launched in 1999). Over this period, enterprises have had the choice of co-location, managed hosting (cloud or otherwise), or cloud and SaaS. If we view all but traditional IT infrastructure as “new,” a rough summary might be to say that, over that time, just over a third of enterprises have mostly favored the public cloud as a location for their NEW applications, while two-thirds of organizations have favored other ways of running their IT that give them more control.
But what happens next? 451 Research’s data (see below) does suggest that the move away from on-site, traditional IT is really starting to step up. Thirty-nine percent (39%) of organizations say that by 2020, the bulk of their data will be in SaaS or a public cloud service. That’s a big number, although it is an aspiration — many organizations have not transitioned anywhere near as fast as they would have liked. But look at this another way: Even in 2020, nearly half of all organizations say their loads will still mostly be in enterprise data centers or under their control in a co-location facility. So a lot of enterprise infrastructure will remain in place, with over a third of these organizations still hosting applications mostly in their own data centers.
But the real story is buried in the science of survey design. The way 451 Research (and other research firms) asked its question doesn’t reveal all the nuances that we see at Uptime Institute Intelligence. In fact, the simpler questioning misses the fact that total infrastructure capacity and the data center portion of it are both growing, albeit at different rates: data center-based capacity is growing, but at a slower rate than total infrastructure capacity. Some big organizations are indeed shifting the bulk of their NEW work to the cloud, but they are simultaneously maintaining and even expanding a premium data center and some edge facilities. This is a complex topic, to be sure.
So let’s focus on brand new data gathered in 2019, which included much more granular questions, in the context of where capacity exists and where business applications, old and new, are being deployed. In Uptime Institute’s 2019 operator survey (results were released in May 2019), Uptime Institute asked a more comprehensive question: “What percentage of your organization’s total IT would you describe as running in the following environments?” This question focused on the percentage of existing workloads, rather than on where the main workload was located or what would happen in the future (a more objective representation).
This Uptime Institute study confirmed that the shift to the public cloud and service oriented capacity was happening, but revealed that the shift was less dramatic and much slower than most industry reporters and pundits would suggest. Over 600 operators said that, in 2021, about half of all workloads will still be in enterprise data centers, and only 18% of workloads in public cloud/SaaS services. Just over three-quarters of all workloads, they believe, will still be managed by their own enterprise staff at a variety of locations that include enterprise data centers, co-location venues, server closets and micro/edge type data centers. The science of any survey confirms that the more granularity in the question, the more actionable the intelligence.
So what does it all mean? The IT ecosystem that already supports a diverse hybrid infrastructure is not yet facing a tipping point for any one portion. That doesn’t mean it won’t come eventually. Basic physics kicks in: workloads that are re-engineered to be cloud-compatible are much more easily moved to the cloud! But there are many other factors as well, and as such, core infrastructure change will be gradual. These factors include the economics of work, dynamic scalability, risk management, supportability, application support, overall performance, service reliability and the changing business models facing every modern business today. Simply put: The enterprise IT environment is changing every day and its hybrid nature is already very challenging, but enterprise IT is not in imminent terminal decline.
Public cloud operators, meanwhile, will continue to assess what they need to do to attract more critical IT workloads. Uptime Institute Intelligence has found that issues of transparency, trust, governance and service need addressing — outstanding tools and raw infrastructure capacity alone are not sufficient. We continue to monitor the actual shifts in workload placements and will report regularly as needed.
More detailed information on this topic, along with a wealth of other strategic planning tools, is available to members of the Uptime Institute Network, a members-only community of industry leaders who drive the data center industry. Details can be found here.
Enterprise IT and the public cloud: What the numbers tell us, by Andy Lawrence, Executive Director of Research, Uptime Institute (2019-07-22 09:00:00; 2019-05-30 14:29:50)
It is still very early days, but it is clear that artificial intelligence (AI) is set to transform the way data centers are designed, managed and operated — eventually. There has been a lot of misrepresentation and hype around AI, and it’s not always clear how it will be applied, and when. The Uptime Institute view is that it will be rolled out slowly, with initially conservative and limited use cases now and for the next few years. But its impact will grow.
There have been some standout applications to date (for example, predictive maintenance and peer benchmarking), and we expect there will be more as suppliers and large companies apply AI to analyze a wider range of relationships and patterns among a vast range of variables, including resource use, environmental impacts, resiliency and equipment configurations.
Today, however, AI is mostly being used in data centers to improve existing functions and processes. Use cases are focused on delivering tangible operational savings, such as cooling efficiency and alarm suppression/rationalization, as well as predicting known risks with greater accuracy than other technologies can offer.
Artificial Intelligence is currently being applied to perform existing well-understood and defined functions and processes faster and more accurately. In other words, not much new, just better. The table below is taken from the new Uptime Intelligence report “Very smart data centers: How artificial intelligence will power operational decisions” (available to Uptime Institute Network Members) and shows AI functions/services that are being offered or that are in development; with a few exceptions, they are likely to be familiar to data center managers, particularly those that have already deployed data center infrastructure management (DCIM) software.
So where might AI be applied beyond these examples? We think it is likely AI will be used to anticipate failure rates, as well as to model costs, budgetary impacts, supply-chain needs and the impact of design changes and configurations. Data centers not yet built could be modeled and simulated in advance, for example, to compare the operational and/or performance profile and total cost of ownership of a Tier II design data center versus a Tier III design.
Meanwhile, we can expect more marketing hype and misinformation, fueled by a combination of AI’s dazzling complexity, which only specialists can deeply understand, and by its novelty in most data centers. For example:
Myth #1: There is a best type of AI for data centers. The best type of AI will depend on the specific task at hand. Simpler big-data approaches (i.e., not AI) can be more suitable in certain situations. For this reason, new “AI-driven” products such as data center management as a service (DMaaS) often use a mix of AI and non-AI techniques.
Myth #2: AI replaces the need for human knowledge. Domain expertise is critical to the usefulness of any big-data approach, including AI. Human data center knowledge is needed to train AI to make reasonable decisions/recommendations and, especially in the early stages of a deployment, to ensure that any AI outcome is appropriate for a particular data center.
Myth #3: Data centers need a lot of data to implement AI. While this is true for those developing AI, it is not the case for those looking to buy the technology. DMaaS and some DCIM systems use pre-built AI models that can provide limited but potentially useful insights within days.
The advent of DMaaS, which was first commercialized in 2016, is likely to drive widespread adoption of AI in data centers. With DMaaS, large sets of monitored data about equipment and operational environments from different facilities (and different customers) are encrypted, pooled in data lakes, and analyzed using AI, anomaly detection, event-stream playback and other approaches.
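To give a feel for anomaly detection over pooled data, here is a deliberately simple sketch that uses a z-score test. This is a stand-in technique chosen for brevity, not a description of how any DMaaS supplier works; the site names, readings and threshold are hypothetical.

```python
from statistics import mean, pstdev


def pooled_anomalies(readings_by_site, threshold=2.5):
    """Pool readings from all sites and flag values far from the pooled mean."""
    pooled = [(site, index, value)
              for site, values in readings_by_site.items()
              for index, value in enumerate(values)]
    values = [value for _, _, value in pooled]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all readings identical, nothing to flag
    return [(site, index, value) for site, index, value in pooled
            if abs(value - mu) / sigma > threshold]


# Hypothetical inlet temperatures (C) monitored at two facilities.
readings = {
    "site_a": [22.0, 22.5, 21.8, 22.2, 22.1],
    "site_b": [21.9, 22.3, 22.0, 35.0, 22.4],
}

flagged = pooled_anomalies(readings)
```

Pooling matters here: with only a handful of readings from one site, a single spike is hard to distinguish from normal variation, while the larger combined set makes it stand out.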
Several suppliers now offer DMaaS, a service that parallels the practice of large data center operators who use internal data from across their portfolios to inform decision-making and optimize operations. DCIM suppliers are also beginning to embed AI functions into their software.
Data center AI is here today and is available to almost any facility. The technology has moved beyond just hyperscale facilities and will move beyond known processes and functions, but probably not for another two or three years.
——————————————————————————–
For more information on artificial intelligence and how it is already being applied in the data center, along with a wealth of other research, consider becoming part of the Uptime Institute Network community. Members of this owner-operator community enjoy a continuous stream of relevant and actionable knowledge from our analysts and share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.
Artificial Intelligence in the Data Center: Myth versus Reality, by Rhonda Ascierto, Vice President, Research, Uptime Institute (2019-07-15 08:15:03; 2019-07-07 06:17:22)
Recently I attended the Data Center Dynamics (DCD) Smart Energy conference in Stockholm. During a panel discussion on energy, data centers and innovation, David Hall (Senior Director of Technology Innovation for Equinix) made two observations, almost in passing, about metrics and monitoring. Both were intriguing and, to my ears, suggested that operators’ thinking about sustainability and data center energy use is starting to evolve.
The first metric was power usage effectiveness (PUE) — the universally applied but widely criticized standard way to measure data center energy efficiency. In essence he asked, What does it matter if your data center PUE is (a very inefficient) 4 or 5, if you are capturing and re-using the heat (i.e., the byproduct of wasted energy)? After all, as many at the conference pointed out, waste energy can be used for district, building or campus heating; for greenhouses; for heating swimming pools; or, as both Facebook (in Northern Sweden) and the National Renewable Energy Laboratory (in Colorado) do, for melting snow.
As moderator of the discussion, I took David a little to task, asking if Equinix was really doing a lot of heat recovery (no) and noting the pride that Equinix takes in its mostly low and very healthy PUE numbers. (I might have added that computers don’t make the most efficient heaters.)
But small details aside, his point is right: It does not make economic or environmental sense to “boil the sky” with warm air or the rivers with warm water. Large data centers, even very efficient ones, can and do put out a lot of heat, which wastes money, burns up energy and pushes up carbon emissions.
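The argument about PUE and heat re-use can be put in numbers. PUE is total facility energy divided by IT equipment energy. The second function below uses energy reuse effectiveness (ERE), a related metric from The Green Grid that is not mentioned in the discussion above and is brought in here only as an illustration; the energy figures are hypothetical.

```python
def pue(total_facility_kwh, it_kwh):
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    return total_facility_kwh / it_kwh


def ere(total_facility_kwh, reused_kwh, it_kwh):
    """Energy reuse effectiveness: (total facility energy - reused energy) / IT energy."""
    return (total_facility_kwh - reused_kwh) / it_kwh


# Hypothetical site: 1,000 kWh of IT energy, 4,000 kWh drawn in total,
# of which 2,500 kWh is captured as heat and used elsewhere.
site_pue = pue(4000, 1000)
site_ere = ere(4000, 2500, 1000)
```

In this made-up case the PUE is a poor 4.0, but once the re-used heat is credited the ERE is 1.5, which is the kind of gap the question about PUE was pointing at.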
Heat recovery has, until now, had little adoption. In Europe, it is probably viewed as peculiarly Nordic. But it might get traction in the future: as capacity moves nearer the edge and more processing takes place in the cities, there is more opportunity to use the heat; as data centers are run at higher temperatures, the “quality” of that heat will be better; and as (if?) liquid cooling finally wins over more operators, more people will be attracted by the opportunity to plumb the water directly into a local or district heating system. Bottom line: There is likely to be a lot more heat recovery from data centers in the future.
The second metric? In the years ahead, David predicted that “managing state of charge” for batteries will become a key operational concern — perhaps not a metric, exactly, but that may come. This doesn’t sound wildly exciting, but there is an important reason: At present, most batteries are lead-acid and, because this technology is not suited to multiple or rapid recharges, the batteries are kept in a fully charged state more or less permanently.
But lithium-ion (Li-ion) batteries are a different matter. They can be cycled thousands of times with relatively minimal degradation. Over time, Li-ion batteries, mostly but not only in uninterruptible power supplies, will be used in ways that utilize this capability. They will be charged when energy is cheap and available, then discharged when energy is needed, when it provides additional capacity, when it can be sold, or when it can be re-allocated/distributed to a different area of the data center in greater need. Smart energy systems are already making use of this capability — battery monitoring is coupled with data about IT loads, redundancy, utility power, cooling and other data. In a more advanced, “energy-smart” data center, operators will need to know the state of charge of every energy storage device in the system at all times.
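The charge-when-cheap, discharge-when-needed idea can be sketched as a simple threshold policy. Everything here is an assumption for illustration: the price thresholds, the hourly prices, the step size, and in particular the reserve level held back for backup duty. A real energy management system would weigh IT load, redundancy and utility conditions as well.

```python
def dispatch(prices, soc_kwh, capacity_kwh, reserve_kwh, step_kwh, cheap, expensive):
    """Return a list of (action, state of charge in kWh), one entry per hourly price."""
    schedule = []
    for price in prices:
        if price <= cheap and soc_kwh < capacity_kwh:
            # Energy is cheap: charge, but never beyond capacity.
            soc_kwh = min(capacity_kwh, soc_kwh + step_kwh)
            action = "charge"
        elif price >= expensive and soc_kwh > reserve_kwh:
            # Energy is expensive: discharge, but never below the backup reserve.
            soc_kwh = max(reserve_kwh, soc_kwh - step_kwh)
            action = "discharge"
        else:
            action = "hold"
        schedule.append((action, soc_kwh))
    return schedule


# Hypothetical hourly prices ($/kWh) and battery parameters.
plan = dispatch(
    prices=[0.05, 0.06, 0.20, 0.25, 0.10, 0.30],
    soc_kwh=70, capacity_kwh=100, reserve_kwh=60, step_kwh=20,
    cheap=0.08, expensive=0.18,
)
```

The policy only works if the state of charge is known at every step, which is why it becomes an operational concern once batteries are cycled rather than kept permanently full.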
Another interesting fact that came up at the DCD Smart Energy event: Several delegates said they were working on projects using previously owned lithium-ion batteries, most notably from electric vehicles (which have higher rapid charge requirements). The issues about whether these batteries are the most suitable type of Li-ion chemistry for the data center aside, using former vehicle batteries is very economical and should cause few problems as long as the batteries are monitored. Again, this suggests the economics and use case for Li-ion are steadily tilting in favor of wider adoption: expect more Li-ion and more “energy-smart” data centers.
——————————————————————————–
For more information on advanced energy management and the potential for lithium-ion batteries to transform the data center power cost model, join the Uptime Institute Network. Members enjoy a continuous stream of relevant and actionable knowledge from our analysts and share a wealth of experiences with their peers from some of the largest companies in the world. Membership instills a primary consciousness about operational efficiency and best practices which can be put into action every day. For membership information click here.
Recover heat, re-charge power, by Andy Lawrence, Executive Director of Research, Uptime Institute (2019-07-08 09:00:28; 2019-05-10 15:05:26)
Data center AI: Start with the end in mind
in Executive, Operations / by Rhonda Ascierto, Vice President, Research, Uptime Institute
An artificial intelligence (AI) strategy for data center management and operation requires more than just data and some very smart humans. Selecting specific use cases and understanding the types of data that influence AI outcomes, and then validating those outcomes, will be key if the needs of the business are to be met.
By focusing on specific use cases, early successes can then be scaled, and further value can be extracted incrementally. Managers don’t need to be AI experts, but Uptime Institute does recommend that they understand the fundamental depth and breadth of the AI being applied. Doing so means they can better determine how much data is required and how the AI will be using the data, which will be critical when validating results and recommendations.
A recent Uptime Institute Intelligence report on AI in data centers, “Very smart data centers: How artificial intelligence will power operational decisions,” should provide a better understanding of what is required.
As a first step, let’s address a few points about AI. First, what is the difference between algorithms and models? AI marketers can use these terms interchangeably, although they do not mean the same thing.
An algorithm is a sequence of mathematical steps or computational instructions. It is an automated instruction set. An algorithm can be a single instruction or a sequence of instructions — its complexity depends on how simple or complex each individual instruction is and/or the sheer number of instructions that the algorithm needs to execute.
In AI, a model refers to a mathematical model that is able to process data and provide the expected response to or outcome of that data. For example, if an algorithm is applied to a data set, the outcome would be the model. So, the model is the outcome of one or many algorithms. A model changes if the data fed into the algorithm changes, or if the same data is fed through a different algorithm.
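The distinction can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the function names, the IT load and cooling figures, and the use of ordinary least squares as the algorithm. The algorithm is the fitting procedure; the model is the pair of numbers it returns. Changing the data or swapping the algorithm produces a different model.

```python
from statistics import mean


def fit_line(xs, ys):
    """Algorithm 1: ordinary least squares for a single input variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept  # this pair of numbers is the model


def fit_mean(xs, ys):
    """Algorithm 2: ignore the input and always predict the average output."""
    return 0.0, mean(ys)


def predict(model, x):
    """Apply a fitted model to a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical readings: IT load (kW) and cooling power (kW) at two sites.
it_load = [100.0, 200.0, 300.0, 400.0]
cooling_site_a = [40.0, 70.0, 100.0, 130.0]
cooling_site_b = [50.0, 90.0, 130.0, 170.0]

model_a = fit_line(it_load, cooling_site_a)  # same algorithm, data set A
model_b = fit_line(it_load, cooling_site_b)  # same algorithm, data set B
model_c = fit_mean(it_load, cooling_site_a)  # data set A, different algorithm

for name, model in (("A", model_a), ("B", model_b), ("C", model_c)):
    print(name, model, predict(model, 250.0))
```

Here `model_a` and `model_b` differ because the data differ, while `model_a` and `model_c` differ because the algorithm differs, even though both were given the same data.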
Another very important distinction is between the two main types of AI techniques being used in data centers today: machine learning and deep learning.
There are three main types of machine learning techniques:
Deep learning, a subset of machine learning, uses multiple layers of artificial neural networks to build algorithms, based on vast data, that find an optimal way to make decisions or perform tasks on their own. Humans supply training data and algorithms, and the computer breaks down these inputs into a hierarchy of very simple concepts. Each concept becomes a mathematical node on the neural network. Instead of using machine learning models from humans, deep learning uses the training data like a neural network, which works like a decision tree. It builds new models from its own analysis of the training data.
Which technique is best for which use case? It depends on the quality and sophistication of the algorithm, as well as the model and data being used. If all these things are equal, however, there are certain techniques that are particularly well-suited to certain use cases.
Some say deep learning can find greater levels of inefficiencies because it is unfettered by known models. On the other hand, supervised machine learning is more transparent (making it easier for domain-expert humans to validate results) and, arguably, quicker to automate.
Although it can vary, some use cases are well suited to particular types of machine learning, and others to deep learning.
It is still early days, but it is likely that certain techniques will dominate specific use cases over time.
At a minimum, operators should understand the fundamental depth and breadth of the AI being applied. Ask the supplier to show the data points in the model and the relationships between them; in other words, how the AI is using the data to make recommendations for action. And, of course, it is always important to track the results when actions are taken (by a human operator).
The full report, “Very smart data centers: How artificial intelligence will power operational decisions,” is available to members of the Uptime Institute Network community.
Open19 expects 2019 to be the year of “Accelerated Adoption”
in Design, Executive / by Kevin Heslin
We have heard for a dozen years about the Open Compute Project (OCP) and its non-traditional approach to computing hardware, from the racks to the servers, storage and networking. And over the last few years, the Open19 Foundation has started to promote its alternative platform, which resembles the more traditional 19-inch rack approach we have all known since the 1980s. (Note: the formal Open19 specification was only publicly released this past year.) But neither of these approaches has resulted in earth-shattering, data center-changing deployments outside of the web-scalers and a handful of early adopter pilots and test beds. And many argue that companies like Facebook, which are heavily invested in the OCP platform, have essentially created a hardware design that works for THEM (mostly), and that very few other companies could realize the stated savings that Facebook enjoys with this ‘non-standard’ approach, due to many factors including hardware, software, staffing, support, training and compatibility concerns.
But time continues to move forward and some of those core values we have grown up on in the data center are changing too. Think of storage and networking approaches now versus 10 years ago. Think of what VMware looked like back then versus today. Vastly different. Think of the skills needed back then versus now. So perhaps there is room for a new platform. As the staffing for IT gets new blood, perhaps they can cut their teeth on a new platform?
As such the Open19 Foundation has designated 2019 the Year of Accelerated Adoption for the ‘new platform’, its Open19 specification, which defines four server profiles, based on a standard 19-inch rack. Open19 expects the specification to be the basis of flexible and economic data center and edge solutions for facilities of many sizes and densities.
At the organization’s May 2019 summit, Yuval Bachar, president of the Open19 Foundation and principal engineer of data center architecture at LinkedIn, told Data Center Knowledge that Open19 gear has been deployed at scale in the social network’s facilities in Oregon and Texas. In addition, two mega data centers are running proofs of concept and six other companies are deploying or evaluating Open19 technology.
These early deployments support recent Uptime Institute Intelligence findings: Just 3% of respondents to our Ninth Annual Uptime Institute Data Center Survey (available to Uptime Institute Network members) said they were deploying Open19 hardware or designs, with another 8% evaluating Open19. That’s a total of about 50 respondents deploying or evaluating Open19. However, 54% of respondents said that they were not aware of Open19.
Despite these survey results, we agree with the Foundation: conditions may be right for an increase in Open19 adoption.
Viewed from one perspective, these adoption (or planned adoption) figures are really quite impressive: until its public release on March 12, 2019, the Open19 project specification was available only to the current Foundation members (including founding members Flex, GE Digital, Hewlett Packard Enterprise, Packet, LinkedIn and Vapor IO). The public release of the Open19 standard greatly increases the potential for new product options and deployments.
We found an additional point of interest in our survey data: senior executives (56%) and designers (47%) are more aware of Open19 than IT management (41%) and critical facilities management (41%). Senior executives (16%) and design engineers (17%) are also far more likely to say that they are deploying or considering Open19 designs or hardware than IT management (6%) and critical facilities management (9%). One possibility: Open19 designs and hardware are making their way into production without serious disruption to the routines of IT management and critical facilities management. That would be a promising development for Open19.
——————————————————————————–
For more information on OCP, Open19 and other data center standards, a wealth of research is available to members of the Uptime Institute Network.
PG&E Turns Power Off (a.k.a. Climate Change and the Data Center)
in Design, Executive, Operations / by Kevin Heslin
In our October 2018 report, “A mission-critical industry unprepared for climate change,” Uptime Institute Intelligence urged data center operators and owners to plan for the effects of climate change. We specifically encouraged data center owners and operators to meet with government officials and utility executives to learn about local and regional disaster preparation and response plans.
A recent public filing by Pacific Gas and Electric (PG&E), California’s largest electric utility, underlines our point and gives data center owners and operators in that state, including areas near Silicon Valley, a lot to discuss.
According to The Wall Street Journal, PG&E plans to dramatically expand the number and size of areas where it will cut power when hot and dry weather makes wildfires likely, effectively eliminating transmission and distribution gear as a cause of wildfires. In addition, the utility announced plans to spend $28 billion over the next four years to modernize infrastructure.
Extreme wildfires, arguably a result of climate change, have caused PG&E and its customers big problems. In 2018, PG&E intentionally interrupted service in two different areas, disrupting essential services and operations in one area (Calistoga) for two days. And on May 16, 2019, California confirmed that utility-owned power lines started last November’s Camp Fire, which killed 85 people and destroyed the town of Paradise.
The utility has been forced to take drastic steps: In January 2019, it sought bankruptcy protection, citing more than $30 billion in potential damages (including as much as $10.5 billion related to the Camp Fire) from wildfires caused by its aging infrastructure and its failure to address the growing threat of extreme wildfires caused by climate change.
PG&E is on the front line but is not unique. The case demonstrates that it is unwise for data center operators and owners to address reliability in isolation. Circumstances affecting data centers in the PG&E service territory, for instance, can vary widely, so communicating with utility officials and local authorities is essential to maintaining operations in a disaster and to any recovery plan.
In this case, one might identify three distinct periods:
Each of these stages brings markedly different reliability and cost considerations, depending on how close a data center is to areas vulnerable to wildfires and where (and when) PG&E makes its infrastructure investments.
The full report, “A mission-critical industry unprepared for climate change,” is available to members of the Uptime Institute Network.
Enterprise IT and the public cloud: What the numbers tell us
in Executive / by Andy Lawrence, Executive Director of Research, Uptime Institute
The spectacular growth of the public cloud has many drivers, only one of which is the deployment, redevelopment or migration of enterprise IT into the cloud. But many groups within the industry (data center builders and operators, hardware and software suppliers, networking companies, and providers of skills and services alike) are closely watching the rate at which enterprise workloads are moving to the cloud, and for good reason: Enterprise IT is a proven, high-margin source of revenue, supported by large, reliable budgets. Attracting, or keeping, enterprise IT business is critical to the existing IT ecosystem.
The popular view is that enterprise IT is steadily migrating to the public cloud, with the infrastructure layers being outsourced to cheaper, more reliable, more flexible, pay-as-you-go cloud services. Amazon Web Services (AWS), Microsoft Azure and Google are the biggest beneficiaries of this shift. There is very little contention on this directional statement.
It is only when we add the element of TIMEFRAME that we start to lose universal agreement. It is frequently reported and forecast that we will reach a tipping point where traditional data center-centric approaches (including non-public cloud or non-SaaS provided IT) will become prohibitively expensive, less efficient and too difficult to support. Looking at any complex system of change, it’s pretty clear this is sound thinking, BUT when will this happen? Some of the data published by industry analysts and (self-)proclaimed experts suggests it will be many years, while other studies suggest we will soon be nearing that point. It’s really a mixed bag, depending on who you are talking to.
The Uptime Intelligence view is that this kind of fundamental change usually happens much more slowly than technologists predict, and we expect the traditional infrastructure platforms (including in-house data centers and customer-managed co-location sites) to be the bedrock of enterprise IT for many years to come.
So how can the views vary so widely? In 2018, 451 Research’s “Voice of the Enterprise Digital Pulse” survey asked 1,000 operators, “Where are the majority of your workloads/applications deployed — now, and in two years’ time?” In the second half of 2018, 60% said the bulk of their loads were “on-premises IT”. Only 20% said the bulk of their workloads were already in the public cloud or in SaaS.
That is still a fair portion in the public cloud or SaaS, but it has taken time: public cloud has been available for about 13 years (AWS debuted in 2006) and SaaS, over 20 (Salesforce was launched in 1999). Over this period, enterprises have had the choice of co-location, managed hosting (cloud or otherwise), or cloud and SaaS. If we view all but traditional IT infrastructure as “new,” a rough summary might be to say that, over that time, just over a third of enterprises have mostly favored the public cloud as a location for their NEW applications, while two-thirds of organizations have favored other ways of running their IT that give them more control.
But what happens next? 451 Research’s data (see below) does suggest that the move away from on-site, traditional IT is really starting to step up. Thirty-nine percent (39%) of organizations say that by 2020, the bulk of their data will be in SaaS or a public cloud service. That’s a big number, although it is an aspiration — many organizations have not transitioned anywhere near as fast as they would have liked. But look at this another way: Even in 2020, nearly half of all organizations say their loads will still mostly be in enterprise data centers or under their control in a co-location facility. So a lot of enterprise infrastructure will remain in place, with over a third of these organizations still hosting applications mostly in their own data centers.
But the real story is buried in the science of survey design. The way 451 Research (and other research firms) asked its question doesn’t reveal all the nuances that we see at Uptime Institute Intelligence. In fact, the simpler questioning misses the fact that total infrastructure capacity and the data center portion of it are both growing, albeit at different rates: data center-based capacity is growing, but at a slower rate than total infrastructure capacity. Some big organizations are indeed shifting the bulk of their NEW work to the cloud, but they are simultaneously maintaining and even expanding a premium data center and some edge facilities. This is a complex topic, to be sure.
So let’s focus on brand new data gathered in 2019, which included much more granular questions about where capacity exists and where business applications, old and new, are being deployed. In Uptime Institute’s 2019 operator survey (results were released in May 2019), Uptime Institute asked a more comprehensive question: “What percentage of your organization’s total IT would you describe as running in the following environments?” This question focused on the percentage of existing workloads, a more objective representation than asking where the main workload was located or what would happen in the future.
This Uptime Institute study confirmed that the shift to the public cloud and service oriented capacity was happening, but revealed that the shift was less dramatic and much slower than most industry reporters and pundits would suggest. Over 600 operators said that, in 2021, about half of all workloads will still be in enterprise data centers, and only 18% of workloads in public cloud/SaaS services. Just over three-quarters of all workloads, they believe, will still be managed by their own enterprise staff at a variety of locations that include enterprise data centers, co-location venues, server closets and micro/edge type data centers. The science of any survey confirms that the more granularity in the question, the more actionable the intelligence.
So what does it all mean? The IT ecosystem that already supports a diverse hybrid infrastructure is not yet facing a tipping point for any one portion. That doesn’t mean it won’t come eventually. Basic physics kicks in: workloads that are re-engineered to be cloud-compatible are much more easily moved to the cloud! But there are many other factors as well, and as such, core infrastructure change will be gradual. These factors include the economics of work, dynamic scalability, risk management, supportability, application support, overall performance, service reliability and the changing business models facing every modern business today. Simply put: The enterprise IT environment is changing every day and its hybrid nature is already very challenging, but enterprise IT is not in imminent terminal decline.
Public cloud operators, meanwhile, will continue to assess what they need to do to attract more critical IT workloads. Uptime Institute Intelligence has found that issues of transparency, trust, governance and service need addressing — outstanding tools and raw infrastructure capacity alone are not sufficient. We continue to monitor the actual shifts in workload placements and will report regularly as needed.
More detailed information on this topic, along with a wealth of other strategic planning tools, is available to members of the Uptime Institute Network, a members-only community of industry leaders who drive the data center industry.
Artificial Intelligence in the Data Center: Myth versus Reality
in Executive, Operations / by Rhonda Ascierto, Vice President, Research, Uptime Institute
It is still very early days, but it is clear that artificial intelligence (AI) is set to transform the way data centers are designed, managed and operated, eventually. There has been a lot of misrepresentation and hype around AI, and it’s not always clear how it will be applied, and when. The Uptime Institute view is that it will be rolled out slowly, with initially conservative and limited use cases now and for the next few years. But its impact will grow.
There have been some standout applications to date (for example, predictive maintenance and peer benchmarking), and we expect there will be more as suppliers and large companies apply AI to analyze a wider range of relationships and patterns among a vast range of variables, including resource use, environmental impacts, resiliency and equipment configurations.
Today, however, AI is mostly being used in data centers to improve existing functions and processes. Use cases are focused on delivering tangible operational savings, such as cooling efficiency and alarm suppression/rationalization, as well as predicting known risks with greater accuracy than other technologies can offer.
Artificial Intelligence is currently being applied to perform existing well-understood and defined functions and processes faster and more accurately. In other words, not much new, just better. The table below is taken from the new Uptime Intelligence report “Very smart data centers: How artificial intelligence will power operational decisions” (available to Uptime Institute Network Members) and shows AI functions/services that are being offered or that are in development; with a few exceptions, they are likely to be familiar to data center managers, particularly those that have already deployed data center infrastructure management (DCIM) software.
So where might AI be applied beyond these examples? We think it is likely AI will be used to anticipate failure rates, as well as to model costs, budgetary impacts, supply-chain needs and the impact of design changes and configurations. Data centers not yet built could be modeled and simulated in advance, for example, to compare the operational and/or performance profile and total cost of ownership of a Tier II design data center versus a Tier III design.
Meanwhile, we can expect more marketing hype and misinformation, fueled by a combination of AI’s dazzling complexity, which only specialists can deeply understand, and by its novelty in most data centers. For example:
Myth #1: There is a best type of AI for data centers. The best type of AI will depend on the specific task at hand. Simpler big-data approaches (i.e., not AI) can be more suitable in certain situations. For this reason, new “AI-driven” products such as data center management as a service (DMaaS) often use a mix of AI and non-AI techniques.
Myth #2: AI replaces the need for human knowledge. Domain expertise is critical to the usefulness of any big-data approach, including AI. Human data center knowledge is needed to train AI to make reasonable decisions/recommendations and, especially in the early stages of a deployment, to ensure that any AI outcome is appropriate for a particular data center.
Myth #3: Data centers need a lot of data to implement AI. While this is true for those developing AI, it is not the case for those looking to buy the technology. DMaaS and some DCIM systems use pre-built AI models that can provide limited but potentially useful insights within days.
The advent of DMaaS, which first became commercialized in 2016, is likely to drive widespread adoption of AI in data centers. With DMaaS, large sets of monitored data about equipment and operational environments from different facilities (and different customers) are encrypted, pooled in data lakes, and analyzed using AI, anomaly detection, event-stream playback and other approaches.
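As a rough illustration of the non-AI end of this mix, the sketch below scores new readings from one facility against a baseline pooled from other facilities, using a simple z-score. It is not any supplier's method; the function names, the threshold of 3.0 and the supply air temperature values are all assumptions made for the example.

```python
from statistics import mean, stdev


def build_baseline(pooled_readings):
    """Summarize pooled readings from many facilities as (mean, sample stdev)."""
    return mean(pooled_readings), stdev(pooled_readings)


def flag_anomalies(new_readings, baseline, threshold=3.0):
    """Return readings more than `threshold` standard deviations from the pooled mean."""
    mu, sigma = baseline
    if sigma == 0:
        return []  # no spread in the baseline, so nothing can be scored
    return [r for r in new_readings if abs(r - mu) / sigma > threshold]


# Hypothetical supply air temperatures (deg C) pooled from several facilities.
pooled = [22.0, 22.5, 21.8, 22.2, 22.1, 21.9, 22.4, 22.0]
baseline = build_baseline(pooled)

# Hypothetical new readings from one monitored facility.
flagged = flag_anomalies([22.3, 27.5], baseline)
print(flagged)
```

The value of pooling is in the baseline: a single facility with only a few days of data can still be scored against patterns drawn from a much larger population.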
Several suppliers now offer DMaaS, a service that parallels the practice of large data center operators who use internal data from across their portfolios to inform decision-making and optimize operations. DCIM suppliers are also beginning to embed AI functions into their software.
Data center AI is here today and is available to almost any facility. The technology has moved beyond just hyper-scale facilities and will move beyond known processes and functions — but probably not for another two or three years.
——————————————————————————–
For more information on artificial intelligence and how it is already being applied in the data center, along with a wealth of other research, consider becoming part of the Uptime Institute Network community.
Recover heat, re-charge power
in Operations / by Andy Lawrence, Executive Director of Research, Uptime Institute
Recently I attended the Data Center Dynamics (DCD) Smart Energy conference in Stockholm. During a panel discussion on energy, data centers and innovation, David Hall (Senior Director of Technology Innovation for Equinix) made two observations, almost in passing, about metrics and monitoring. Both were intriguing and, to my ears, suggested that operators’ thinking about sustainability and data center energy use is starting to evolve.
The first metric was power usage effectiveness (PUE) — the universally applied but widely criticized standard way to measure data center energy efficiency. In essence he asked, What does it matter if your data center PUE is (a very inefficient) 4 or 5, if you are capturing and re-using the heat (i.e., the byproduct of wasted energy)? After all, as many at the conference pointed out, waste energy can be used for district, building or campus heating; for greenhouses; for heating swimming pools; or, as both Facebook (in Northern Sweden) and the National Renewable Energy Laboratory (in Colorado) do, for melting snow.
As moderator of the discussion, I took David a little to task, asking if Equinix was really doing a lot of heat recovery (no) and noting the pride that Equinix takes in its mostly low and very healthy PUE numbers. (I might have added that computers don’t make the most efficient heaters.)
But small details aside, his point is right: It does not make economic or environmental sense to “boil the sky” with warm air or the rivers with warm water. Large data centers, even very efficient ones, can and do put out a lot of heat, which wastes money, burns up energy and pushes up carbon emissions.
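The arithmetic behind this point can be sketched briefly. PUE is total facility energy divided by IT energy, so it does not change when waste heat is reused. The second function below follows the energy reuse effectiveness (ERE) formula as it is commonly attributed to The Green Grid, which subtracts reused energy from the facility total; treat that attribution, and all of the figures, as assumptions made for illustration.

```python
def pue(total_facility_kwh, it_kwh):
    """Power usage effectiveness: total facility energy per unit of IT energy."""
    return total_facility_kwh / it_kwh


def ere(total_facility_kwh, it_kwh, reused_kwh):
    """Energy reuse effectiveness: like PUE, but credits energy reused elsewhere."""
    return (total_facility_kwh - reused_kwh) / it_kwh


# Hypothetical annual figures for a very inefficient site that exports heat.
it_kwh = 1000.0
total_kwh = 4000.0   # gives the "very inefficient" PUE of 4
reused_kwh = 2500.0  # heat delivered to a district heating system (assumed)

site_pue = pue(total_kwh, it_kwh)
site_ere = ere(total_kwh, it_kwh, reused_kwh)
print(site_pue, site_ere)
```

With these numbers the PUE stays at 4.0 whether or not any heat is recovered, while the reuse-adjusted figure falls to 1.5, which is why PUE alone says little about a site that captures its heat.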
Heat recovery has, until now, had little adoption. In Europe, it is probably viewed as peculiarly Nordic. But it might get traction in the future: as capacity moves nearer the edge and more processing takes place in the cities, there is more opportunity to use the heat; as data centers are run at higher temperatures, the “quality” of that heat will be better; and as (if?) liquid cooling finally wins over more operators, more people will be attracted by the opportunity to plumb the water directly into a local or district heating system. Bottom line: There is likely to be a lot more heat recovery from data centers in the future.
The second metric? In the years ahead, David predicted that “managing state of charge” for batteries will become a key operational concern — perhaps not a metric, exactly, but that may come. This doesn’t sound wildly exciting, but there is an important reason: At present, most batteries are lead-acid and, because this technology is not suited to multiple or rapid recharges, the batteries are kept in a fully charged state more or less permanently.
But lithium-ion (Li-ion) batteries are a different matter. They can be cycled thousands of times with relatively minimal degradation. Over time, Li-ion batteries, mostly but not only in uninterruptible power supplies, will be used in ways that utilize this capability. They will be charged when energy is cheap and available, then discharged when energy is needed, when it provides additional capacity, when it can be sold, or when it can be re-allocated/distributed to a different area of the data center in greater need. Smart energy systems are already making use of this capability — battery monitoring is coupled with data about IT loads, redundancy, utility power, cooling and other data. In a more advanced, “energy-smart” data center, operators will need to know the state of charge of every energy storage device in the system at all times.
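The charge-when-cheap, discharge-when-needed behavior described above can be sketched as a simple rule. This is a toy under stated assumptions, not a description of any real energy management system: the `Battery` class, the price thresholds, the step size and the 50% reserve held back for backup are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    capacity_kwh: float
    charge_kwh: float
    reserve_fraction: float = 0.5  # assumed share always held back for backup

    @property
    def state_of_charge(self) -> float:
        """Fraction of capacity currently stored (0.0 to 1.0)."""
        return self.charge_kwh / self.capacity_kwh


def dispatch(battery, price, cheap_below, expensive_above, step_kwh):
    """Charge when energy is cheap, discharge above the reserve when it is expensive."""
    if price <= cheap_below and battery.charge_kwh < battery.capacity_kwh:
        battery.charge_kwh = min(battery.capacity_kwh, battery.charge_kwh + step_kwh)
        return "charge"
    reserve_kwh = battery.reserve_fraction * battery.capacity_kwh
    if price >= expensive_above and battery.charge_kwh > reserve_kwh:
        battery.charge_kwh = max(reserve_kwh, battery.charge_kwh - step_kwh)
        return "discharge"
    return "hold"


# Hypothetical UPS battery and hourly energy prices (currency units per kWh).
ups = Battery(capacity_kwh=100.0, charge_kwh=60.0)
prices = [0.05, 0.05, 0.30, 0.30, 0.12]
actions = [
    dispatch(ups, p, cheap_below=0.08, expensive_above=0.25, step_kwh=20.0)
    for p in prices
]
print(actions, ups.state_of_charge)
```

Every decision in the loop depends on the current stored energy, which is the practical reason state of charge becomes something operators need to track continuously rather than assume.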
Another interesting fact that came up at the DCD Smart Energy event: Several delegates said they were working on projects using previously owned lithium-ion batteries, most notably from electric vehicles (which have higher rapid charge requirements). The issues about whether these batteries are the most suitable type of Li-ion chemistry for the data center aside, using former vehicle batteries is very economical and should cause few problems as long as the batteries are monitored. Again, this suggests the economics and use case for Li-ion are steadily tilting in favor of wider adoption: expect more Li-ion and more “energy-smart” data centers.