NVIDIA and AMD Unveil New GPU Platforms in January Digest

January, typically a quiet month in the hardware sector, witnessed a significant shake-up as both NVIDIA and AMD launched their flagship products. This month, Sergey Kovalev, dedicated server manager at Selectel, shares insights into the most talked-about hardware innovations, spanning from GPUs to new disk and networking equipment.

NVIDIA has introduced the Vera Rubin™ platform, designed using an "extreme co-design" approach that integrates hardware and software. The next-generation platform targets agentic AI, mixture-of-experts (MoE) models, and long-context applications. Vera Rubin™ is a complete system architecture built from six core components, including the NVIDIA Vera™ Arm processor with 88 custom Armv9.2 cores and the Rubin™ GPU, equipped with HBM4 memory and NVLink 6.

The platform boasts impressive specifications: it can achieve up to 50 petaflops in FP4 for inference—five times more than its predecessor, Blackwell—and offers 288 GB of HBM4 memory with a bandwidth of 22 TB/s, 2.8 times Blackwell's memory bandwidth. The NVLink 6 interconnect provides 3.6 TB/s per GPU, double the speed of Blackwell.
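The quoted multipliers can be sanity-checked with simple arithmetic. The sketch below derives the *implied* Blackwell-generation figures from the Rubin numbers above; these derived values are not official specifications, only what the stated ratios imply.

```python
# Back-of-the-envelope check of the quoted Rubin-vs-Blackwell ratios.
# Rubin figures come from the announcement; the Blackwell numbers below
# are derived from the stated multipliers, not official specs.

rubin_fp4_pflops = 50    # FP4 inference throughput per GPU, petaflops
rubin_hbm4_tb_s = 22     # HBM4 memory bandwidth, TB/s
rubin_nvlink_tb_s = 3.6  # NVLink 6 bandwidth per GPU, TB/s

# Implied predecessor figures
blackwell_fp4_pflops = rubin_fp4_pflops / 5    # "five times more"
blackwell_hbm_tb_s = rubin_hbm4_tb_s / 2.8     # "2.8 times" the bandwidth
blackwell_nvlink_tb_s = rubin_nvlink_tb_s / 2  # "double the speed"

print(f"Implied Blackwell FP4: {blackwell_fp4_pflops:.0f} PFLOPS")
print(f"Implied Blackwell HBM bandwidth: {blackwell_hbm_tb_s:.1f} TB/s")
print(f"Implied Blackwell NVLink: {blackwell_nvlink_tb_s:.1f} TB/s")
```

The implied figures (roughly 10 PFLOPS FP4, ~7.9 TB/s memory bandwidth, 1.8 TB/s NVLink) line up plausibly with what has been reported for the Blackwell generation, suggesting the multipliers are internally consistent.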

NVIDIA has also highlighted the energy efficiency of the Vera Rubin platform and its built-in support for confidential computing. The Rubin™ GPU is geared toward training and inference of large models, with an emphasis on low-precision formats and hardware-adaptive data compression. In its maximum configuration, the platform aggregates up to 54 TB of LPDDR5X and 20.7 TB of HBM4 memory, reaching a combined bandwidth of 1.6 PB/s.

In tandem with the Rubin announcement, NVIDIA unveiled a new infrastructure for inference context memory storage, designed to significantly enhance token throughput and overall energy efficiency compared to traditional storage solutions. While NVIDIA has signaled the start of full-scale production for Rubin, server solutions from partners are expected to roll out later this year, drawing interest from major industry players.

AMD, for its part, has revealed its Helios™ rack-scale AI platform, featuring the next-generation Instinct MI455X GPUs. The system, a first of its kind for AMD, is built on EPYC™ Zen 6 processors and supports up to 72 MI455X accelerators with a combined 31 TB of HBM4 memory and 1.4 PB/s of aggregate bandwidth. AMD claims Helios can deliver up to 2.9 exaflops of FP4 compute for inference and 1.4 exaflops of FP8 for training, positioning it for modern AI data centers that require sophisticated power and cooling solutions.
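Dividing the rack-level totals by the 72 accelerators gives a rough sense of each MI455X. The per-GPU figures below are arithmetic implications of the quoted totals, not official per-accelerator specifications:

```python
# Rough per-accelerator figures for AMD Helios, derived from the quoted
# rack-level totals. These are implied values, not official specs.

gpus = 72             # MI455X accelerators per Helios rack
rack_hbm4_tb = 31     # total HBM4 capacity, TB
rack_bw_pb_s = 1.4    # total memory bandwidth, PB/s
rack_fp4_eflops = 2.9 # total FP4 inference compute, exaflops

per_gpu_hbm_gb = rack_hbm4_tb * 1000 / gpus         # ~430 GB HBM4 each
per_gpu_bw_tb_s = rack_bw_pb_s * 1000 / gpus        # ~19 TB/s each
per_gpu_fp4_pflops = rack_fp4_eflops * 1000 / gpus  # ~40 PFLOPS FP4 each

print(f"Per MI455X: {per_gpu_hbm_gb:.0f} GB HBM4, "
      f"{per_gpu_bw_tb_s:.1f} TB/s, {per_gpu_fp4_pflops:.0f} PFLOPS FP4")
```

Interestingly, the implied ~430 GB of HBM4 and ~40 PFLOPS of FP4 per accelerator put the MI455X in the same broad class as NVIDIA's figures for this generation, underscoring how directly the two platforms will compete.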

The new MI400-series accelerators will be manufactured using TSMC's 2nm process technology, marking a significant milestone in GPU development. This lineup introduces a variety of models tailored to different AI applications, moving away from a one-size-fits-all approach to a more diversified product range.

In a related development, Microsoft has launched its Maia 200 AI accelerator, built on a 3nm process and optimized for inference tasks. This second-generation accelerator features innovative scaling architecture using standard Ethernet, enabling it to integrate seamlessly into Microsoft's heterogeneous AI infrastructure.

Finally, Alibaba's T-Head Semiconductor has introduced the Zhenwu 810E, an AI accelerator designed for both training and inference tasks, particularly in autonomous driving.

The unveiling of these advanced GPU platforms by NVIDIA and AMD signals a competitive shift in the market, promising enhanced capabilities for AI applications and setting new benchmarks for performance that will challenge competitors to innovate rapidly in response.
