COOLING TECHNOLOGY Evolving cooling solutions for high-density data centers
As data center size, power density and energy demand steadily increase, more efficient cooling technologies become essential. This article looks at how operators are augmenting or replacing air cooling with liquid cooling, AI techniques, and free cooling.
Data centers form the backbone of today’s digital world and consume enormous amounts of power. This demand is only going to increase: global data center power consumption is forecast to more than double between 2022 and 2026 [i]. Newly constructed hyperscale data centers require power capacities of at least 100 megawatts, equivalent to the annual electricity consumption of more than 400,000 electric vehicles.
This surge is driven by the growing demand for internet services delivered from data centers and for the internet of things (IoT), combined with the increasing integration of artificial intelligence (AI) technology. AI applications are significantly more energy-intensive than traditional online services: a ChatGPT query needs almost 10 times as much processing power to run as a Google search.
This sharply increasing energy usage naturally raises serious concerns, because of both escalating costs and increasing environmental impact. And, as cooling typically accounts for around 40 % of a data center’s power consumption [ii], owners and operators are continuously seeking to improve their facility’s cooling efficiency. Progress over recent years is evidenced by improving Power Usage Effectiveness (PUE) ratios, which divide total data center power consumption by the power consumed by IT equipment. An ideal PUE is 1.0, while a European Commission report showed that average PUE values across the EU fell from 2.03 in 2010 to 1.75 in 2018. As of 2023, the most efficient data centers were achieving levels equal to or better than 1.2.
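The PUE ratio described above is simple to compute. The following Python sketch works through it for a hypothetical facility with 1,000 kW of IT load; the function names and the load figure are illustrative assumptions, not values from the cited reports.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_share(pue_value: float) -> float:
    """Fraction of total facility power that does not reach the IT equipment."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be lower than 1.0")
    return 1.0 - 1.0 / pue_value


# Hypothetical IT load; the PUE values are the ones quoted in the text.
IT_LOAD_KW = 1000.0
for label, value in [
    ("EU average, 2010", 2.03),
    ("EU average, 2018", 1.75),
    ("Efficient site, 2023", 1.2),
]:
    total_kw = IT_LOAD_KW * value
    print(
        f"{label}: total {total_kw:.0f} kW, "
        f"PUE {pue(total_kw, IT_LOAD_KW):.2f}, "
        f"overhead {overhead_share(value):.0%} of total power"
    )
```

Note that the overhead share covers everything that is not IT load (cooling, power conversion losses, lighting), so it is an upper bound on the cooling share rather than a measure of it.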
However, the amount of energy involved is driving the industry’s ongoing quest for even higher efficiency through more innovative solutions. To understand the progress that’s being made, let’s look at why cooling is so important, the established solutions in use around the world, and the technologies now emerging to replace them.
Heating and cooling
Excessive data center heat comes from IT equipment such as servers, storage units, and networking gear. This equipment consumes significant power and releases heat accordingly, is densely packed and typically operates 24/7. The situation is worsening as data centers endeavor to accommodate ever-increasing workloads with more power and more density.
Excessive heat can cause GPUs, CPUs, and memory modules to throttle performance or fail entirely. Conversely, stable temperatures help maintain consistent performance and reduce the risk of unexpected outages. Cooler environments also reduce wear and tear on components, leading to longer hardware life, and more efficient cooling reduces overall power consumption, which lowers operating costs.
Efficient cooling also helps data centers to comply with strict local environmental and operational standards.
How the data center cooling landscape is evolving
Air cooling and liquid cooling are the two most popular types of data center cooling [iii]. Air cooling remains widely used worldwide, as it involves long-standing, tried and tested technology. However, data center operators are increasingly turning to liquid cooling – or a hybrid solution involving both technologies – as an additional or alternative solution, as facilities’ power demands increasingly outstrip air cooling system capabilities.
Air cooling
This cooling method is ideal for smaller or older data centers that combine raised floors with hot and cold aisle designs. When the computer room AC (CRAC) unit or computer room air handler (CRAH) sends out cold air, the pressure below the raised floor increases and sends the cold air into the equipment inlets. The cold air displaces the hot air, which is then returned to the CRAC or CRAH, where it's cooled and recirculated.
Hot and cold air aisles increase the efficiency of air-based cooling systems by enabling more targeted placement of intake and exhaust vents. This prevents hot and cold air mixing so the cooling CRAC or CRAH can work more efficiently.
Also, a CRAH is more efficient than a CRAC, as it draws outside air in and cools it using chilled water instead of refrigerant. A CRAC functions like a residential AC unit that uses refrigerants to cool the air. CRAC units are more appropriate for small data center closets because they can't keep up with enterprise-level data centers.
Liquid cooling
Liquid cooling is a more recent technology. It's efficient and cost-effective because it can be targeted at the data center devices that need it the most. Liquid transfers heat away from emitting sources more efficiently than air, and supports greater equipment densities and equipment that generates above-average heat, as found in high-density and edge-computing data centers.
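One way to see why liquid carries heat away more effectively than air is to compare how much heat a given volume of each can absorb per degree of temperature rise. The sketch below uses approximate room-temperature properties for air and plain water; water is chosen only because its properties are familiar, and actual data center coolants will differ.

```python
# Approximate room-temperature properties (assumed, rounded values).
AIR_DENSITY_KG_M3 = 1.2
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0
WATER_DENSITY_KG_M3 = 1000.0
WATER_SPECIFIC_HEAT_J_KG_K = 4186.0


def volumetric_heat_capacity(density_kg_m3: float, specific_heat_j_kg_k: float) -> float:
    """Heat (J) absorbed by one cubic meter of fluid per kelvin of temperature rise."""
    return density_kg_m3 * specific_heat_j_kg_k


air = volumetric_heat_capacity(AIR_DENSITY_KG_M3, AIR_SPECIFIC_HEAT_J_KG_K)
water = volumetric_heat_capacity(WATER_DENSITY_KG_M3, WATER_SPECIFIC_HEAT_J_KG_K)
ratio = water / air

print(f"Air:   {air:,.0f} J/(m^3*K)")
print(f"Water: {water:,.0f} J/(m^3*K)")
print(f"Water absorbs roughly {ratio:,.0f} times more heat per unit volume")
```

This compares heat capacity only. Real-world cooling performance also depends on flow rates, heat exchanger design and the specific fluid used.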
Immersion cooling
Immersion cooling is a popular form of liquid cooling, involving immersing the equipment directly into a bath of non-conductive liquid. It differs from water cooling, where the liquid is potentially harmful to electronics and thus flows through a sealed loop isolated from the heat source. A watertight water block is used to indirectly transfer the heat from the heat source to the working fluid.
In immersion cooling, however, heat is transferred directly away from the heat source using the working fluid. With immersion cooling, the working fluid must be non-conductive and tends to be within four families of fluids:
- de-ionized water
- mineral oil
- fluorocarbon-based fluids
- synthetic fluids
A wide variety of fluids are available for immersion cooling, with the most suitable being transformer oils and other electrical cooling oils. Non-purpose oils, including cooking, motor, and silicone oils, have also been successfully used to cool computer servers.
Liquid immersion cooling involves submerging computer components or entire servers in a thermally conductive but electrically non-conductive liquid. This method allows for direct heat transfer from the components to the liquid, which can absorb significantly more heat than air, making it highly effective for cooling [iv].
Types of immersion cooling
- Single-Phase Immersion Cooling: In this system, the coolant remains in a liquid state throughout the cooling process. The liquid absorbs heat from the components and is circulated through a heat exchanger to dissipate the heat.
- Two-Phase Immersion Cooling: This method utilizes the phase change of the coolant from liquid to gas. As the liquid absorbs heat, it vaporizes, and the gas is then condensed back into a liquid in a cooling system, allowing for efficient heat transfer.
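The difference between the two approaches can be illustrated with basic heat equations: a single-phase coolant absorbs heat by warming up (sensible heat), while a two-phase coolant absorbs heat by vaporizing (latent heat). The sketch below is a rough comparison only; the fluid properties are round placeholder numbers chosen for illustration, not data for any specific product.

```python
def sensible_heat_j(mass_kg: float, specific_heat_j_kg_k: float, delta_t_k: float) -> float:
    """Heat absorbed when a liquid warms by delta_t_k without changing phase."""
    return mass_kg * specific_heat_j_kg_k * delta_t_k


def latent_heat_j(mass_kg: float, latent_heat_j_kg: float) -> float:
    """Heat absorbed when a liquid at its boiling point fully vaporizes."""
    return mass_kg * latent_heat_j_kg


# Placeholder properties for a hypothetical dielectric fluid (assumptions).
SPECIFIC_HEAT_J_KG_K = 1100.0
LATENT_HEAT_J_KG = 100_000.0

single_phase = sensible_heat_j(1.0, SPECIFIC_HEAT_J_KG_K, delta_t_k=10.0)
two_phase = latent_heat_j(1.0, LATENT_HEAT_J_KG)

print(f"Single-phase, 1 kg warmed by 10 K: {single_phase:,.0f} J")
print(f"Two-phase, 1 kg vaporized:         {two_phase:,.0f} J")
```

With these assumed values, vaporizing a kilogram of fluid absorbs several times more heat than warming it by 10 K, which is why two-phase systems can move a lot of heat with little fluid movement.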
Liquid immersion cooling has many benefits, including higher efficiency – up to 3000 times more effective than air cooling. It also consumes less space, saving valuable data center real estate. Noise levels are lower, as fewer fans are required.
Meanwhile, reliability is enhanced as immersion cooling, unlike air cooling, does not draw in dust and particulate contamination from the atmosphere; this can extend electronic component lifespan.
Direct-to-chip liquid cooling
Direct-to-chip liquid cooling is another liquid cooling variant, used to manage the heat generated by high-performance computing systems [v]. It involves circulating liquid coolant through cold plates mounted directly on the processors and other critical components. This method provides superior heat dissipation, enabling servers to operate at optimal performance levels while reducing energy consumption.
To manage heat efficiently, direct-to-chip liquid cooling involves several key components. Firstly, cold plates, featuring internal channels through which the coolant flows, are attached directly to the chips to absorb heat from the chips' surfaces at source. The coolant, typically a specialized liquid with high thermal conductivity and low electrical conductivity, ensures safety and efficiency in heat transfer. A pump circulates this coolant through the system, maintaining continuous heat removal. Finally, the heat exchanger transfers the absorbed heat from the coolant to an external cooling source, such as a radiator or a cooling tower, completing the cooling cycle.
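The loop described above can be sized, to a first approximation, from the heat balance Q = m_dot × c_p × ΔT. The following sketch estimates the coolant flow a cold plate would need; it assumes plain water as the coolant and a hypothetical 700 W chip, whereas real systems typically use specialized coolants with different properties.

```python
WATER_SPECIFIC_HEAT_J_KG_K = 4186.0  # approximate value for water
WATER_DENSITY_KG_L = 1.0             # approximate value for water


def required_flow_l_per_min(
    heat_load_w: float,
    coolant_delta_t_k: float,
    specific_heat_j_kg_k: float = WATER_SPECIFIC_HEAT_J_KG_K,
    density_kg_l: float = WATER_DENSITY_KG_L,
) -> float:
    """Coolant flow needed to carry heat_load_w away with a given temperature rise."""
    if coolant_delta_t_k <= 0:
        raise ValueError("Coolant temperature rise must be positive")
    mass_flow_kg_s = heat_load_w / (specific_heat_j_kg_k * coolant_delta_t_k)
    return mass_flow_kg_s / density_kg_l * 60.0


# Hypothetical example: a 700 W processor, coolant allowed to warm by 10 K.
flow = required_flow_l_per_min(heat_load_w=700.0, coolant_delta_t_k=10.0)
print(f"Required flow: {flow:.2f} L/min")
```

Halving the allowed temperature rise doubles the required flow, which is one of the trade-offs when choosing pumps and heat exchangers.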
How does direct-to-chip liquid cooling compare with immersion cooling?
Both methods have their own advantages. Direct-to-chip cooling offers precise cooling directly at the heat source, which is efficient for high-density and high-performance systems. It is also easier to integrate into existing data center infrastructure.
However, immersion cooling can provide more uniform cooling and is often more effective at handling extremely high heat loads. For the most demanding environments, immersion cooling may be needed or preferred due to its ability to manage intense thermal challenges more effectively.
The choice between the two depends on specific application needs, cost considerations, and infrastructure compatibility.
Free cooling
The immersion and direct-to-chip cooling technologies described above both offer sophisticated improvements to data center cooling efficiency. However, depending on a data center's location and current cooling methodologies, free cooling may be a viable, lower-cost way to boost cooling efficiency and reduce energy usage.
As the term implies, free cooling cools data center infrastructure with virtually no energy use. However, there are caveats, and the method isn’t for every facility [vi].
In data centers, free cooling involves dissipating heat without artificially cooling air or water. Typically, free cooling systems collect air or water from the ambient environment, then circulate it into data center server rooms or individual server racks.
Free cooling is distinct from mechanical cooling, which relies on refrigerants and compressors to cool air or liquid. Its key benefit is that, through being much more passive than mechanical cooling, it uses much less energy, so it can boost data center power efficiency and sustainability.
For data center operators, free cooling systems also offer the benefit of being low in cost to install and easy to operate and maintain, since they require few components. Also, because these systems do not use refrigerants, there is no potential for refrigerant leakage into the atmosphere.
However, most free cooling systems are not totally energy-free. They often require fans or water circulators to move cooling media, and that equipment uses electricity – albeit much less than a typical HVAC compressor. More importantly, free cooling only works in certain locations and under certain conditions.
To operate a free cooling system, a data center must have access to air or water whose natural temperature is lower than its internal temperature. As a result, free cooling is typically not viable for data centers in warm climates, during hot seasons or during warm periods of the day.
Nor is it realistic for most data centers to cool servers using free cooling alone. Most facilities need mechanical cooling systems in place to sustain operations during periods when free cooling is not viable. This means that free cooling is a complement to, not a replacement for, mechanical cooling for the typical data center.
Additionally, free cooling systems that circulate water could lead to high volumes of water usage – which is already a challenge for many facilities. For this reason, data centers in regions where water is scarce may not be able to take advantage of free cooling, even if water in the environment is naturally cool enough to support this technique.
A final potential challenge is that IT equipment that runs at particularly high temperatures, as modern AI hardware may, might not be a candidate for free cooling. Ambient air or water may not be cool enough to dissipate heat from this equipment at the rate necessary to prevent overheating.
Despite these challenges, the fact that free cooling is relatively inexpensive and simple to install means that data centers can often take advantage of free cooling – at least as an auxiliary cooling measure – easily enough.
In many cases, installing free cooling is as simple as augmenting existing HVAC systems with air exchangers that can pull air from outdoors and circulate it through the system already in place to dissipate heat using mechanical cooling. Leaving the fans on but turning compressors off provides free cooling, assuming the outdoor air is cool enough on its own.
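The "fans on, compressors off" decision can be sketched as a simple temperature comparison. The example below is a minimal illustration, not a real building-management interface: the function names, the 20 °C supply setpoint, the 2 °C safety margin and the hourly temperatures are all assumptions.

```python
from typing import Iterable


def can_free_cool(outdoor_c: float, supply_setpoint_c: float, margin_c: float = 2.0) -> bool:
    """True when outdoor air is cool enough to run fans only, with compressors off."""
    return outdoor_c <= supply_setpoint_c - margin_c


def free_cooling_fraction(
    hourly_outdoor_c: Iterable[float],
    supply_setpoint_c: float,
    margin_c: float = 2.0,
) -> float:
    """Share of the sampled hours in which free cooling alone would be viable."""
    temps = list(hourly_outdoor_c)
    if not temps:
        raise ValueError("At least one temperature sample is required")
    free_hours = sum(can_free_cool(t, supply_setpoint_c, margin_c) for t in temps)
    return free_hours / len(temps)


# Hypothetical 24 hourly outdoor temperatures (deg C) for one day.
day = [12, 11, 10, 10, 9, 9, 10, 12, 15, 18, 21, 23,
       25, 26, 27, 26, 24, 22, 19, 17, 15, 14, 13, 12]
share = free_cooling_fraction(day, supply_setpoint_c=20.0)
print(f"Free cooling viable for {share:.0%} of the day")
```

A calculation like this over a full year of local weather data gives a first indication of whether free cooling is worth adding at a given site. It ignores humidity and air quality, which also constrain the use of outdoor air.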
AI and its many impacts on data center cooling
Irrespective of the cooling technique – or techniques – being used across a data center, artificial intelligence (AI) is being applied in many ways to further improve cooling efficiency. For example, AI algorithms can analyze real-time data from temperature sensors, airflow meters, and power load monitors. This creates a live thermal map of the data center, allowing cooling systems to adapt instantly to hotspots and workload changes [vii].
AI can improve cooling deployment efficiency through precision cooling – targeting only the areas that need it, rather than overcooling entire rooms. Some systems also recover and re-use heat, improving sustainability.
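The idea of cooling only where it is needed can be shown with a very simple rule. The sketch below maps a per-zone thermal map to per-zone cooling output using a proportional rule with clamping. It is a deliberately basic stand-in for the AI-driven control described above, and the zone names, target temperature and gain are hypothetical.

```python
from typing import Dict


def cooling_outputs(
    zone_temps_c: Dict[str, float],
    target_c: float = 25.0,
    base_pct: float = 50.0,
    gain_pct_per_k: float = 10.0,
) -> Dict[str, float]:
    """Return cooling output (0-100 %) per zone, raised only where the zone runs hot."""
    outputs = {}
    for zone, temp_c in zone_temps_c.items():
        demand = base_pct + gain_pct_per_k * (temp_c - target_c)
        outputs[zone] = max(0.0, min(100.0, demand))  # clamp to the valid range
    return outputs


# Hypothetical thermal map: inlet temperatures (deg C) per zone.
thermal_map = {"row-A": 25.0, "row-B": 28.0, "row-C": 22.0}
for zone, pct in cooling_outputs(thermal_map).items():
    print(f"{zone}: cooling output {pct:.0f} %")
```

Only the zone running above target receives extra cooling, while the cooler zone is throttled back, instead of the whole room being cooled to suit its hottest spot.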
AI can also analyze vibration, current, and temperature data to detect early signs of equipment failure, to enable pre-emptive maintenance. This minimizes downtime and extends the life of the cooling infrastructure.
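A basic form of this early-warning analysis is to flag readings that deviate strongly from a recent baseline. The sketch below uses a simple z-score test on hypothetical pump vibration readings. Production systems would use richer models, and the threshold and data here are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], reading: float, threshold: float = 3.0) -> bool:
    """Flag a reading that lies more than `threshold` standard deviations from the baseline."""
    if len(history) < 2:
        raise ValueError("At least two baseline readings are required")
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return reading != baseline
    return abs(reading - baseline) / spread > threshold


# Hypothetical vibration readings (mm/s) from a coolant pump.
baseline_readings = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
print(is_anomalous(baseline_readings, 1.02))  # reading within normal variation
print(is_anomalous(baseline_readings, 2.0))   # reading far outside normal variation
```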
Machine learning models, such as Long Short-Term Memory (LSTM) networks, forecast thermal spikes based on historical and real-time data, so that cooling systems can pre-emptively adjust airflow or liquid cooling before temperatures rise.
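The forecast-then-act pattern can be illustrated without a neural network. The sketch below is not an LSTM: it substitutes a least-squares linear trend as a much simpler stand-in forecaster, then checks whether the predicted temperature would cross a limit. The limit, horizon and sample data are hypothetical.

```python
from typing import Sequence


def forecast_linear(samples: Sequence[float], steps_ahead: int) -> float:
    """Extrapolate a least-squares straight line fitted to equally spaced samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("At least two samples are required")
    x_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)


def should_precool(samples: Sequence[float], steps_ahead: int, limit_c: float) -> bool:
    """True when the forecast temperature reaches the limit within the horizon."""
    return forecast_linear(samples, steps_ahead) >= limit_c


# Hypothetical rack inlet temperatures (deg C), one sample per minute.
recent = [24.0, 24.5, 25.0, 25.5, 26.0]
print(f"Forecast in 5 minutes: {forecast_linear(recent, 5):.1f} deg C")
print("Increase cooling now:", should_precool(recent, 5, limit_c=28.0))
```

A learned model such as an LSTM would replace `forecast_linear` here, while the surrounding decision logic would stay much the same.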
HVAC systems can benefit from smart control. AI can direct variable frequency drives (VFDs) to modulate fan and compressor speeds. This reduces mechanical wear by up to 40 % and cuts energy use during partial-load conditions.
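The energy benefit of slowing fans at partial load follows from the fan affinity laws, under which airflow scales with fan speed while power scales with roughly the cube of speed. The sketch below applies that idealized relationship. The affinity laws are brought in here as an explanatory assumption rather than taken from the cited sources, and real-world savings are smaller once motor and drive losses are included.

```python
def fan_power_fraction(speed_fraction: float) -> float:
    """Idealized fan power, as a fraction of full-speed power, from the cube law."""
    if not 0.0 <= speed_fraction <= 1.0:
        raise ValueError("Speed fraction must be between 0 and 1")
    return speed_fraction ** 3


for speed in (1.0, 0.9, 0.8, 0.5):
    power = fan_power_fraction(speed)
    print(f"Fan at {speed:.0%} speed: about {power:.0%} of full power")
```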
AI supports the deployment of hybrid air and liquid cooling systems, optimizing performance based on server density and heat output. For example, Aligned Data Centers has unveiled an innovation hub in Phoenix, Arizona to test the advanced cooling technologies needed to support future generations of high-powered AI chips and the increasingly power-hungry server racks housing them.
The Advanced Cooling Lab will focus on Aligned’s modular, hybrid infrastructure that can accommodate both air and liquid cooling loads, the company said in a blog post. “Liquid-ready” cooling infrastructure is a consideration for data center customers hoping to future-proof their investments as power densities increase, cooling specialists say [viii].
Future data center cooling: Liquid or air?
As cooling efficiency comes increasingly under the spotlight, the choice between liquid cooling and air cooling has never been more relevant. Which approach will win the race: the well-established air cooling or the rapidly advancing liquid cooling?
Air cooling has been the standard for data centers for decades. It relies on fans, heat sinks, and air conditioning to remove heat from IT equipment. Warm air is expelled, and cooler air is circulated to maintain an optimal temperature. It’s proven and widely used worldwide, with lower upfront costs compared with liquid cooling. Maintenance is easier, with minimal risk of leaks or malfunctions [ix].
However, power consumption is high, while efficiency is limited as server arrays become denser. Additionally, air-cooled data centers require a larger footprint to accommodate the air cooling infrastructure.
While air cooling has lower initial setup costs, its higher operational expenses mean that over time, the total cost of ownership (TCO) increases. Data centers running on air cooling may also face future upgrades or retrofits as AI workloads and high-performance computing (HPC) demands grow, further adding costs.
Conversely, the liquid cooling techniques discussed above dissipate heat much more efficiently than air. Space requirements and operational costs are both reduced, with better support for high-density workloads such as AI and HPC, and for hyperscale data centers.
Nevertheless, liquid cooling does bring challenges, including higher initial investment and a potential risk of leaks, although the latter is mitigated by using dielectric (non-conductive) liquids in advanced systems. Maintenance is also more complex, requiring specialized knowledge and equipment.
As data centers push toward greater efficiency and sustainability, cost remains a critical factor when choosing between liquid cooling and air cooling. While liquid cooling offers superior thermal management, the financial implications of transitioning from traditional air cooling need careful consideration.
Despite its high upfront cost, liquid cooling pays for itself over time through reduced power consumption, longer hardware lifespan (due to lower thermal stress), and lower maintenance costs. Additionally, some government incentives and sustainability programs support liquid cooling adoption, improving ROI.
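Whether liquid cooling "pays for itself" depends on how quickly operating savings recover the extra upfront spend. The sketch below shows a simple payback and cumulative-cost comparison. Every figure in it is a hypothetical placeholder, and it ignores discounting, incentives and hardware refresh cycles.

```python
def simple_payback_years(extra_capex: float, annual_savings: float) -> float:
    """Years needed for annual savings to recover the additional upfront cost."""
    if annual_savings <= 0:
        return float("inf")
    return extra_capex / annual_savings


def cumulative_cost(capex: float, annual_opex: float, years: int) -> float:
    """Undiscounted total cost of ownership after a number of years."""
    return capex + annual_opex * years


# Hypothetical figures in arbitrary currency units (assumptions, not market data).
AIR = {"capex": 1_000_000, "annual_opex": 400_000}
LIQUID = {"capex": 1_500_000, "annual_opex": 275_000}

payback = simple_payback_years(
    LIQUID["capex"] - AIR["capex"],
    AIR["annual_opex"] - LIQUID["annual_opex"],
)
print(f"Payback on the extra investment: {payback:.1f} years")
for years in (2, 5, 10):
    air_total = cumulative_cost(AIR["capex"], AIR["annual_opex"], years)
    liquid_total = cumulative_cost(LIQUID["capex"], LIQUID["annual_opex"], years)
    print(f"After {years} years: air {air_total:,.0f}, liquid {liquid_total:,.0f}")
```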
Many data centers are exploring hybrid cooling solutions, combining air and liquid cooling to balance cost and efficiency. Hybrid systems allow organizations to leverage existing air cooling infrastructure while implementing liquid cooling in high-density areas, minimizing upfront costs while benefiting from liquid cooling's efficiency.
In conclusion, despite liquid cooling’s advantages, air cooling will not disappear overnight. Many data centers will continue to rely on air-based solutions, especially for legacy infrastructure where there is no budget or need for a liquid cooling upgrade.
Hybrid cooling systems are emerging, blending air and liquid cooling for a balanced approach, while new AI-optimized cooling systems are improving the efficiency of traditional air cooling. And, if the data center’s location is right, it may be possible to augment the mechanical cooling methods with free cooling.
References
- [i] Data center power consumption - statistics & facts | Statista
- [ii] Cool runnings: making data centres more energy efficient | Modus | RICS
- [iii] Data center cooling systems and technologies and how they work | TechTarget
- [iv] What Is Immersion Cooling? | Liquid Immersion Cooling | Submer
- [v] What Is Direct-to-Chip Liquid Cooling? | Supermicro
- [vi] Free Cooling for Data Centers: Strategies and Advantages
- [vii] The Future of Data Center Cooling: AI Innovations and Advanced HVAC Motor Technologies
- [viii] Phoenix cooling lab showcases hybrid data center cooling capabilities
- [ix] The Future of Data Center Cooling: Liquid vs. Air – Which Will Dominate?