Today, AMD held a keynote at this year’s Computex, where it presented its new AI chip: the AMD Instinct MI325X.

Let’s take a brief look at the history of AMD’s accelerators and its future plans. It all started in 2020 with the AMD CDNA MI100, the first purpose-built accelerator for FP64 and FP32 HPC workloads. Its successor, the CDNA 2 MI200, followed a year later with a denser compute architecture and higher memory capacity and bandwidth. CDNA 3 was then released in 2022, with a focus on unified memory, AI data format performance, and in-node networking.
Then came the CDNA 3 MI300X with its generative AI leadership, which was the current product until recently, followed by AMD’s latest accelerator for this year: the AMD Instinct MI325X platform with HBM3E memory and higher compute performance. The plan does not end there, because next year brings a new version of CDNA. CDNA 4 is set to offer even higher compute performance and better memory handling, and a next-gen accelerator architecture will then be presented in 2026.

But enough of the history. Let’s take a closer look at the details of the new accelerator. As previously mentioned, it comes with HBM3E memory, which, at up to 288 GB, is said to offer twice the capacity of NVIDIA’s H200; the bandwidth likewise rises by a factor of 1.3x, to 6 TB/s. AMD is clearly taking aim at NVIDIA’s H200 accelerator here, claiming clear superiority over that chip. An AMD Instinct MI325X platform houses eight MI325X GPUs, for a theoretical peak performance of around 10.4 PFLOPS and an impressive 2.3 TB of combined HBM3E memory. The Infinity Fabric bandwidth is specified at around 896 GB/s.
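The 2.3 TB figure follows directly from the platform configuration; here is a quick back-of-the-envelope check using the numbers above:

```python
# Back-of-the-envelope check of the MI325X platform memory figure quoted above.
gpus_per_platform = 8     # MI325X GPUs per platform
hbm3e_per_gpu_gb = 288    # up to 288 GB of HBM3E per GPU
total_gb = gpus_per_platform * hbm3e_per_gpu_gb
print(f"{total_gb} GB ≈ {total_gb / 1000:.1f} TB")  # 2304 GB ≈ 2.3 TB
```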
The Instinct supports the most popular generative AI models such as GPT-4, LLaMA 2 and Stable Diffusion, and AMD’s ROCm software stack is the foundation here: over 700,000 Hugging Face models, for example, run on ROCm (see the short sketch after the quote below), and OpenAI’s Triton also fully supports ROCm. The aim is to push the boundaries of data center AI performance. Speaking for AMD’s partner Microsoft, Satya Nadella noted that the MI300X offers a leading price/performance ratio and is also optimized for Microsoft Azure workloads:
“MI300X offers leading price/performance on GPT4 inference. Optimized for Microsoft Azure workloads”
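To illustrate what running Hugging Face models on ROCm looks like in practice, here is a minimal sketch. It assumes a ROCm build of PyTorch, which exposes AMD GPUs through the familiar torch.cuda interface, plus the transformers library; the checkpoint name is only a placeholder, not one of the models named above.

```python
# Minimal sketch: running a Hugging Face model on an AMD Instinct GPU via ROCm.
# Assumes a ROCm build of PyTorch and the transformers library are installed;
# the "gpt2" checkpoint is a placeholder example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# On ROCm builds of PyTorch, torch.cuda.is_available() reports True for AMD GPUs.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

inputs = tokenizer("The AMD Instinct MI325X is", return_tensors="pt").to(device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```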

A small preview of next year’s model was also given: it will be built on a 3 nm process node with 288 GB of HBM3E memory and FP4/FP6 data type support. We will probably only find out more next year.
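For context on why FP4/FP6 support matters: the memory footprint of model weights scales roughly with bit width, so lower-precision data types let much larger models fit into the same HBM. A rough illustration, where the parameter count is an arbitrary example rather than an AMD figure:

```python
# Rough illustration: weight memory scales with data type bit width.
# The 70B parameter count is an arbitrary example, not an AMD figure.
params = 70e9
for bits in (16, 8, 6, 4):
    gb = params * bits / 8 / 1e9  # bits -> bytes -> GB (decimal)
    print(f"FP{bits}: ~{gb:.0f} GB of weights")
```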
AMD provided this information in advance; the only condition was compliance with the embargo, which lifted on June 3, 2024, at 5:00 a.m.
Source: AMD