The hidden realities of AI racks: 1,600kg, 100 cables, and liquid cooling

Four takeaways from a webinar on the realities of next-gen AI data centres.

Photo Caption: (Screenshot) Indrama Purba (top left) and Mark Langford (top right).

I had a great time at the w.media HPC Investment Webinar this morning, where we discussed next-gen AI, HPC, and data centres.

Here are four things I learned.

More bandwidth needed

It used to be that you could simply throw more chips at a compute problem to get more processing power, says SemiAnalysis' Jordan Nanos, as he walked through some common misconceptions about AI data centres.

Increasingly, though, the bottleneck for AI workloads is bandwidth, and that bottleneck is shifting to the network: both AI training and AI inference require high-bandwidth links between machines.
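
To make that concrete, here is a back-of-envelope sketch of my own (not a figure from the webinar) of how long a single gradient synchronisation could take in data-parallel training. The model size, gradient precision, and per-node link speed are all assumed values chosen purely for illustration.

```python
# Back-of-envelope sketch: why inter-node bandwidth matters for AI training.
# All figures below are illustrative assumptions, not numbers from the webinar.

PARAMS = 70e9          # assumed model size: 70 billion parameters
BYTES_PER_GRAD = 2     # BF16 gradients: 2 bytes per parameter
NET_GBPS = 400         # assumed per-node network bandwidth, in gigabits per second

grad_bytes = PARAMS * BYTES_PER_GRAD      # total gradient payload per training step
net_bytes_per_s = NET_GBPS * 1e9 / 8      # convert Gbit/s to bytes/s

# A ring all-reduce moves roughly 2x the payload across each node's link.
allreduce_seconds = 2 * grad_bytes / net_bytes_per_s

print(f"Gradient payload per step: {grad_bytes / 1e9:.0f} GB")
print(f"Approx. all-reduce time per step: {allreduce_seconds:.1f} s")
```

With those assumed numbers, every training step spends several seconds just shuffling gradients between machines, which is why a fast network matters as much as the chips themselves.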

Everything from day one

Data centre customers are increasingly asking for higher-density power and cooling, up to 60 to 100kW per rack, says NeutraDC's Indrama Purba.

And they now want it all from "day one", be it power, cooling, or connectivity, quite unlike the gradual uptake patterns of traditional data centre deployments.

Comparing HPC with AI

Given that they are often deployed together, how do HPC and AI workloads differ? AMD's Paul Skaria and Jordan chimed in on this.

Though much of the infrastructure is similar, Paul observed that HPC workloads demand high numerical precision while AI works at lower precision. Jordan, for his part, framed AI as just another HPC workload, albeit one with very different characteristics.
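
As a rough illustration of that precision gap (my own sketch, not something shown in the webinar), the same tensor shrinks dramatically as you step down from the FP64 typical of HPC codes to the lower-precision formats AI hardware favours, which is also why lower precision eases both memory and bandwidth pressure.

```python
# Minimal sketch: memory footprint of the same tensor at different precisions.
# HPC codes commonly use FP64; AI training and inference increasingly use BF16 or FP8.

N = 1_000_000_000  # one billion values, an assumed tensor size for illustration

bytes_per_value = {
    "FP64 (typical HPC)": 8,
    "FP32": 4,
    "BF16 (common for AI training)": 2,
    "FP8 (emerging for AI inference)": 1,
}

for fmt, nbytes in bytes_per_value.items():
    print(f"{fmt:32s} {N * nbytes / 1e9:6.1f} GB")
```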

AI racks are heavy

Finally, Supermicro's Magelli R. shared some lesser-known facts about the latest AI racks in data centres. For a start, he noted that a 52U rack from Supermicro weighs around 1,600kg.

These racks require server lifts and ladders to access, are designed around liquid cooling rather than space efficiency, and present significant cable-management challenges. According to Magelli, a GB300-class rack can have around 100 cables leaving it, and more if lower-amp PDUs are used. Fibre-optic networking is preferred as the cables are slimmer and reach further than copper Ethernet.

In closing

There was more, including STULZ's Mark Langford, who shared insights about the demarcation point for liquid cooling along with a fair amount of technical detail, but I'm out of space for now.

Thank you to the more than 100 attendees who dialled in and contributed meaningful questions.