On June 12, 2025, AMD presented its current technology roadmap for professional computing platforms at an international trade event. The focus was on new solutions for high-end workstations, AI infrastructure and the further development of the open software platform ROCm. The company presented its next generation of EPYC processors, Radeon graphics cards with AI acceleration and ROCm 7 with extended framework integrations. Strategic partnerships with leading cloud providers and software developers were also highlighted. The event attracted international attention and was well received by the trade press and analysts. Here is a brief summary of the most important points for you.
Key points on architecture and platform strategy
The strategic importance of workstations stems from their central role in numerous technical, creative and scientific fields. Forecasts see the category growing to close to nine million units by 2029. The product strategy presented rests on three main pillars: high-performance processors for desktop workstations, specialized graphics solutions for professional applications, and low-power, compact computing units for mobile or mini workstations.
A comparison between the previous and current generations shows significant progress in parallelization and bandwidth. The newly introduced processor family is based on an architecture with up to 96 cores, each supporting two threads. Every core provides several arithmetic units for integer and floating-point calculations as well as dedicated address-generation units. The cache subsystem has also grown: each core has its own L1 and L2 cache, and each CCD module contributes a total of 40 MiB of cache, which adds up to 480 MiB in the fully populated configuration.
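As a rough back-of-the-envelope check of these totals, here is a minimal sketch. The per-CCD breakdown (eight cores per CCD with 1 MiB of L2 each plus a 32 MiB shared L3) is an assumption that merely reproduces the stated 40 MiB per CCD; it is not spelled out above:

```cpp
// Back-of-the-envelope cache totals for the full configuration.
// Assumptions (not stated in the article): 8 cores per CCD,
// 1 MiB L2 per core, 32 MiB shared L3 per CCD.
#include <cstdio>

int main() {
    constexpr int total_cores   = 96;
    constexpr int cores_per_ccd = 8;                           // assumption
    constexpr int l2_per_core   = 1;                           // MiB, assumption
    constexpr int l3_per_ccd    = 32;                          // MiB, assumption
    constexpr int ccds          = total_cores / cores_per_ccd; // 12

    constexpr int cache_per_ccd = cores_per_ccd * l2_per_core + l3_per_ccd; // 40 MiB
    constexpr int cache_total   = ccds * cache_per_ccd;                     // 480 MiB

    std::printf("Per CCD: %d MiB, total: %d MiB\n", cache_per_ccd, cache_total);
    return 0;
}
```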
Memory is attached via eight DDR5-6400 channels with ECC support, which yields a theoretical bandwidth of just over 400 GB/s. In addition, up to 128 PCIe 5.0 lanes are available. The socket matches the predecessor design, so existing platforms can be reused. In direct comparisons with a competing CPU of the same class, a multi-threaded performance advantage of more than 80 percent in Cinebench is claimed. For other application scenarios from rendering, simulation, software compilation, media production and AI inference, improvements of between 40 and 145 percent are reported.
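The quoted bandwidth follows directly from the stated configuration; a minimal sketch of the arithmetic (eight 64-bit channels, i.e. 8 bytes per transfer):

```cpp
// Theoretical DDR5 bandwidth: channels x transfer rate x bytes per transfer.
#include <cstdio>

int main() {
    constexpr double channels       = 8.0;
    constexpr double transfer_rate  = 6400e6;  // 6400 MT/s
    constexpr double bytes_per_xfer = 8.0;     // 64-bit channel width

    constexpr double gb_per_s = channels * transfer_rate * bytes_per_xfer / 1e9;
    std::printf("Theoretical bandwidth: %.1f GB/s\n", gb_per_s);  // ~409.6 GB/s
    return 0;
}
```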
In addition to the processors, a new GPU solution for professional AI workloads was presented. The architecture is based on a monolithic 4 nm design with around 54 billion transistors. There are 128 specialized AI units, enabling up to 1531 TOPS for INT4 calculations. The video memory comprises 32 GiB of GDDR6 on a wide interface running at a high clock frequency, and the board power is rated at 300 watts.
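Two simple ratios can be derived directly from these figures, without further assumptions, to put the peak numbers into perspective:

```cpp
// Ratios derived purely from the stated figures: peak INT4 throughput,
// number of AI units and rated board power.
#include <cstdio>

int main() {
    constexpr double int4_tops = 1531.0;  // peak INT4 throughput
    constexpr double ai_units  = 128.0;
    constexpr double watts     = 300.0;

    std::printf("~%.1f TOPS per AI unit, ~%.1f TOPS per watt (peak)\n",
                int4_tops / ai_units, int4_tops / watts);
    return 0;
}
```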
Software support and ISV optimizations
A central topic is the increasing parallelization of modern applications. Computing tasks in CAD, simulation, rendering, image processing, code compilation and path tracing now use a large number of threads, and many professional software packages are optimized for processors with up to 96 cores. These include solutions from mechanical engineering, media production, architectural planning and scientific data processing.
While AI models often work with reduced precision and can thereby lower the demands on memory bandwidth and compute, technical applications frequently require FP32 or even FP64. These data types place greater demands on the platform, both in terms of bandwidth and power consumption. To make efficient use of the available hardware, AMD points to parallel programming libraries that support threading and vectorization. The use of OpenMP, TBB and modern memory allocators such as jemalloc is recommended, and AVX2 and AVX-512 instructions are available for SIMD vector operations.
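As an illustration of how these pieces fit together, here is a minimal sketch of an FP64 kernel that combines OpenMP threading with compiler vectorization; the build flags are an assumption and the DAXPY loop is only a generic stand-in for real workloads:

```cpp
// Minimal sketch: OpenMP threading plus compiler auto-vectorization
// (AVX2/AVX-512 when targeted) on an FP64 DAXPY loop.
// Assumed build: g++ -O3 -fopenmp -march=native daxpy.cpp
#include <cstdio>
#include <vector>

int main() {
    const long long n = 1 << 24;
    std::vector<double> x(n, 1.0), y(n, 2.0);
    const double a = 0.5;

    // Iterations are split across all cores; the simple loop body lets the
    // compiler emit SIMD instructions for the inner stretch.
    #pragma omp parallel for simd
    for (long long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }

    std::printf("y[0] = %.2f\n", y[0]);  // 2.50
    return 0;
}
```

An allocator such as jemalloc can usually be dropped in without code changes, for example via LD_PRELOAD, which mainly benefits heavily multi-threaded allocation patterns.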
For GPU computing, HIP-based compilers and runtime environments are used, embedded in the ROCm platform. ROCm provides libraries for linear algebra, image processing and neural networks, exposes a low-level API and integrates with existing AI frameworks such as TensorFlow and PyTorch. Compatibility of the Radeon graphics solutions with professional applications is being expanded step by step; this includes well-known rendering engines as well as specialized design and visualization programs.
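To give an idea of what the low-level HIP API looks like, here is a hypothetical, minimal vector-add sketch (not taken from the presentation) that compiles with hipcc on a ROCm installation:

```cpp
// Minimal HIP sketch: device memory management, a kernel launch and copy-back.
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);

    float *da, *db, *dc;
    hipMalloc((void**)&da, n * sizeof(float));
    hipMalloc((void**)&db, n * sizeof(float));
    hipMalloc((void**)&dc, n * sizeof(float));
    hipMemcpy(da, ha.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(db, hb.data(), n * sizeof(float), hipMemcpyHostToDevice);

    const int block = 256;
    vec_add<<<(n + block - 1) / block, block>>>(da, db, dc, n);

    hipMemcpy(hc.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);
    std::printf("c[0] = %.1f\n", hc[0]);  // 3.0

    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```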
ROCm platform for machine learning
The open software platform for GPU computing supports a range of Linux distributions as well as Windows via WSL integration. The current version, ROCm 6.x, includes backend support for PyTorch, TensorFlow, ONNX Runtime and Triton 3.3. Hardware support covers current graphics cards with RDNA 3 and RDNA 4 architecture, including dedicated models for professional and AI workloads. The mobile AI-Max components are also part of the platform strategy going forward.
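A quick way to check which GPUs the installed ROCm stack actually exposes is to query the HIP runtime directly; a small illustrative sketch (compiled with hipcc), not taken from the presentation:

```cpp
// List the GPUs visible to the installed ROCm/HIP runtime.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
        std::printf("No ROCm-capable device found.\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t prop;
        hipGetDeviceProperties(&prop, i);
        std::printf("GPU %d: %s (arch %s, %.1f GiB VRAM)\n",
                    i, prop.name, prop.gcnArchName,
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```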
System partners offer ready-made solutions for research, development and industrial AI applications. This lowers the hurdle for integrating AI workflows into existing infrastructures.
OEM integration and platform strategy
A key goal is to build a complete ecosystem of hardware, software and partners. Manufacturers are to offer the new components in certified systems with standardized drivers and management interfaces. The platforms are differentiated by the number of available PCIe lanes, the number of memory channels and the platform-management features: the high-end versions offer eight memory channels and up to 128 PCIe lanes, while the HEDT versions for semi-professional users come with four channels and 80 lanes.
The aim is to provide workstations for every application class, from local AI inference to real-time rendering in virtual production environments and scientific software development. This diversification is intended to increase the platform's acceptance in the professional market over the long term.
Summary
The topics presented point to a consistent and technologically sound strategy for strengthening AMD's position in the workstation market. The advances in processor architecture, GPU technology and software support are coordinated and build on one another. The combination of high computing power, broad software compatibility and systematic OEM integration is intended to help handle demanding workloads from science, engineering and media more efficiently. A particular focus is on executing AI workloads with low latency and high parallelism on locally installed hardware. The integration of the ROCm platform underlines the claim of providing an open and extensible environment for high-performance computing and machine learning.