AMD still has one card left, and that is what we have today. The goal is to take the competitive Radeon RX 6800 XT 16 GB one step further, so some time ago Dr. Lisa Su pulled a last ace out of her sleeve and presented the Radeon RX 6900 XT, at first only virtually, as a paper tiger on PowerPoint slides. Now that the card has finally materialized for us, it poses the question of the week: cuddly kitten, house cat or really big cat? It will be close, that much I can give away at this point. The rest is of course in today’s article, which is worth reading on every page.
The big Radeon RX 6900 XT caps the new RDNA2 series at the top for the time being, because Navi22, Navi23 and Navi24 will bring more offspring in the foreseeable future. Further down the stack, of course. But what makes this new card so interesting for the end user? With RDNA 2, AMD has introduced new power-saving techniques, and the “Infinity Cache” is meant to deliver more memory bandwidth per watt, because the memory interface itself is rather narrow at 256 bits for efficiency reasons, while nobody has been stingy with the clock rates.
The new graphics cards can already handle the new AV1 video codec, and they support DirectX 12 Ultimate for the first time and thus also DirectX Raytracing (DXR). With AMD FidelityFX, they also offer a feature set designed to give developers more freedom in the choice of effects. Also included is Variable Rate Shading (VRS), which can save an immense amount of computing power by smartly reducing the rendering quality of image areas that are not in the player’s focus anyway.
The Radeon RX 6900 XT as reference design
With 80 Compute Units (CUs) instead of the 72 of the Radeon RX 6800 XT, the Radeon RX 6900 XT has 5120 shaders in its full configuration. The game clock is 2015 MHz, as with the Radeon RX 6800 XT, and the default boost clock is 2250 MHz. The card also relies on 16 GB of GDDR6 at 16 Gbps, made up of 8 modules of 2 GB each. Also familiar are the 256-bit memory interface and the 128 MB Infinity Cache, which is supposed to solve the bandwidth problem.
The Radeon RX 6900 XT weighs almost 1500 grams, is 26.7 cm long, 12 cm high (11.5 cm installation height from the PEG slot) and 4.5 cm thick (2.5-slot design), with the backplate and the PCB adding a total of four more millimeters. The slot bracket is closed and carries one HDMI 2.1 and two DP 1.4a connectors, plus a USB Type-C socket. The body is made of light metal, the Radeon lettering is illuminated (and even RGB-controllable), and the whole thing is powered, as with the Radeon RX 6800 XT, via two 8-pin sockets. More about this on the next page, in the teardown.
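As a rough cross-check of these figures, the following sketch derives the shader count, the raw memory bandwidth and a theoretical peak FP32 rate from the quoted specifications. The function names are made up for illustration. The 64 shaders per CU follow from dividing 5120 by 80, while the two FP32 operations per shader and clock (one fused multiply-add) are an assumption that the article does not state.

```python
# Derived figures from the specifications quoted above.
# Assumptions (not stated in the article): 64 shaders per CU and
# 2 FP32 operations (one fused multiply-add) per shader per clock.

def shader_count(compute_units: int, shaders_per_cu: int = 64) -> int:
    """Total number of shaders for a given number of Compute Units."""
    return compute_units * shaders_per_cu


def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Raw memory bandwidth: bus width times per-pin data rate, in GB/s."""
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


def peak_fp32_tflops(shaders: int, clock_mhz: float, ops_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput at a given clock, in TFLOPS."""
    return shaders * ops_per_clock * clock_mhz * 1e6 / 1e12


if __name__ == "__main__":
    shaders = shader_count(80)
    print(shaders)                                    # 5120
    print(memory_bandwidth_gb_s(256, 16))             # 512.0
    print(round(peak_fp32_tflops(shaders, 2250), 2))  # 23.04
```

The 512 GB/s is the raw figure for the 256-bit interface alone and ignores the Infinity Cache, whose whole purpose is to raise the effective bandwidth beyond that.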
The screenshot from GPU-Z then provides information about the remaining data of the new card:
Ray tracing / DXR
It has been clear since the presentation of the new Radeon cards, at the latest, that AMD would also support ray tracing. AMD takes a path that differs significantly from NVIDIA’s and implements one so-called “Ray Accelerator” per Compute Unit (CU). Since the Radeon RX 6900 XT has a total of 80 CUs, it also has 80 such accelerators, while the smaller Radeon RX 6800 XT has 72 and the RX 6800 has 60. A GeForce RTX 3090 comes with 82 RT cores, so the Radeon RX 6900 XT nominally has two fewer. Among the smaller cards, it is 72 for the RX 6800 XT versus 68 for the GeForce RTX 3080. However, the RT cores are organized differently, and it remains to be seen what actually arrives at the end, so this would be an apples-to-oranges comparison.
But what has AMD actually come up with? Each of these accelerators can calculate up to 4 ray/box intersections or a single ray/triangle intersection per clock cycle. In this way the intersections of the rays with the scene geometry are calculated (using the Bounding Volume Hierarchy) and pre-sorted, and this information is then returned to the shaders for further processing within the scene or for output of the final shading result. However, NVIDIA’s RT cores seem to be much more complex, as I explained in detail at the Turing launch. What counts is the result alone, and that is exactly what we have benchmarks for.
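To make the ray/box step more tangible, here is a minimal software sketch of such a test. It uses the common “slab” method, which is a stand-in chosen for illustration and not a description of how AMD’s hardware works internally. The names `ray_hits_box` and `intersect_node` are hypothetical, and the limit of four boxes per call simply mirrors the figure quoted above.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner) of an axis-aligned box


def ray_hits_box(origin: Vec3, direction: Vec3, box_min: Vec3, box_max: Vec3) -> bool:
    """Slab test: does the ray origin + t * direction (t >= 0) hit the box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray runs parallel to this pair of planes: miss if it starts outside.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside all slabs so far.
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def intersect_node(origin: Vec3, direction: Vec3, child_boxes: Sequence[Box]) -> List[bool]:
    """Test one ray against up to four child boxes of a BVH node."""
    if len(child_boxes) > 4:
        raise ValueError("this sketch tests at most four boxes per call")
    return [ray_hits_box(origin, direction, lo, hi) for lo, hi in child_boxes]
```

A traversal loop would descend only into the children that report a hit and run the ray/triangle test once it reaches a leaf, which is what lets the hierarchy skip large parts of the scene.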
Smart Access Memory (SAM)
At the presentation of the new Radeon cards, AMD already showed SAM, i.e. Smart Access Memory, a feature that I have activated today in addition to the normal benchmarks, which also allows a direct comparison. But SAM is actually nothing new, just more nicely packaged verbally. Behind it is nothing more than clever handling of the Base Address Register (BAR), and exactly this support must be enabled in the underlying platform. With modern AMD graphics hardware, resizable PCI BARs (see also the PCI-SIG document of April 24, 2008) have been playing an important role for quite some time, since the actual PCI BARs are normally limited to 256 MB, while the new Radeon graphics cards now offer up to 16 GB of VRAM.
The consequence is that only a fraction of the VRAM is directly accessible to the CPU, which without SAM requires a whole range of workarounds in the so-called driver stack. This of course always costs performance and should therefore be avoided. That is exactly where AMD comes in with SAM. The feature is not new, but it must be implemented cleanly in the UEFI and then also activated. This in turn only works if the system is running in UEFI mode and CSM/legacy is deactivated.
CSM stands for Compatibility Support Module. It is only available under UEFI and ensures that older hardware and software also work with UEFI. The CSM is always helpful when not all hardware components are compatible with UEFI. Some older operating systems and the 32-bit versions of Windows cannot be installed on UEFI hardware. However, it is precisely this compatibility setting that often prevents the clean Windows installation required for the new AMD components.
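How small that fraction is can be shown with a few lines of arithmetic. The sketch below is purely illustrative and uses hypothetical helper names. The size index assumes the encoding of the PCIe Resizable BAR capability as I understand it, in which a value of n stands for a BAR of 2^(n+20) bytes, so treat that part as an assumption and not as a quotation from the specification.

```python
MIB = 1 << 20
GIB = 1 << 30


def visible_fraction(bar_bytes: int, vram_bytes: int) -> float:
    """Share of the VRAM that the CPU can address directly through the BAR."""
    return min(bar_bytes, vram_bytes) / vram_bytes


def rebar_size_index(size_bytes: int) -> int:
    """Index n with size == 2**(n + 20) bytes (assumed Resizable BAR encoding)."""
    if size_bytes < MIB or size_bytes & (size_bytes - 1):
        raise ValueError("BAR size must be a power of two of at least 1 MiB")
    return size_bytes.bit_length() - 1 - 20


if __name__ == "__main__":
    # Classic 256 MB window in front of 16 GB of VRAM
    print(visible_fraction(256 * MIB, 16 * GIB))  # 0.015625, i.e. 1/64
    # BAR resized to cover the whole VRAM
    print(visible_fraction(16 * GIB, 16 * GIB))   # 1.0
    print(rebar_size_index(256 * MIB), rebar_size_index(16 * GIB))  # 8 14
```

With the classic window, the CPU sees just one sixty-fourth of the memory at a time, which is why the driver has to move that window around without SAM.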
Benchmarks and evaluation
For the benchmarks this time I deliberately chose 10 games, balanced between old and new as well as between AMD- and NVIDIA-leaning titles. Three of them are additionally measured with DXR, but only in 1080p for the sake of playable frame rates, thus also omitting NVIDIA’s DLSS, which one could fairly include but which does not make sense here at this resolution. In return, I measure the two Radeons once without and once with SAM across all games and resolutions, although SAM is currently as proprietary as NVIDIA’s DLSS. But at least it doesn’t have to be implemented in the games and is therefore always available, suitable hardware provided.
I covered each game on one page with a total of six charts per resolution or setting. These are self-explanatory, so I spare myself the text, which all the charts make redundant anyway. Facts, not words. In exchange, there is a cumulative summary with a detailed explanation at the end. Efficiency and power consumption are also shown per game and cumulatively, plus the frame times and the variances, because percentiles alone are not the last word.
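Why percentiles alone are not enough can be illustrated with a short sketch that reduces a frame time log to an average, a percentile and the frame-to-frame variance. This is a generic example and not the evaluation software used for the charts. The function name, the nearest-rank percentile and the choice of metrics are assumptions made for illustration.

```python
import statistics


def frame_time_summary(frame_times_ms):
    """Summarize a list of per-frame render times given in milliseconds."""
    n = len(frame_times_ms)
    if n < 2:
        raise ValueError("need at least two frame times")
    ordered = sorted(frame_times_ms)
    # Nearest-rank 99th percentile: smallest value with at least 99% of
    # all frames at or below it.
    rank = (99 * n + 99) // 100  # integer form of ceil(0.99 * n)
    p99_ms = ordered[rank - 1]
    # Differences between consecutive frames expose stutter that an
    # average or a percentile can hide.
    deltas = [abs(b - a) for a, b in zip(frame_times_ms, frame_times_ms[1:])]
    return {
        "avg_fps": 1000.0 * n / sum(frame_times_ms),
        "p99_frame_time_ms": p99_ms,
        "p99_fps": 1000.0 / p99_ms,
        "stdev_ms": statistics.pstdev(frame_times_ms),
        "max_delta_ms": max(deltas),
    }
```

Two runs with the same percentile can still feel very different if one of them alternates between short and long frames, and that is exactly what the frame-to-frame differences reveal.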
Specification | Radeon RX 6900 XT | Radeon RX 6800 XT | Radeon RX 6800 |
Stream processors | 5,120 | 4,608 | 3,840 |
Compute units | 80 | 72 | 60 |
Texture units | 320 | 288 | 240 |
Ray accelerators | 80 | 72 | 60 |
Game clock | 2,015 MHz | 2,015 MHz | 1,815 MHz |
Boost clock | 2,250 MHz | 2,250 MHz | 2,105 MHz |
Memory | 16 GB GDDR6 | 16 GB GDDR6 | 16 GB GDDR6 |
Infinity cache | 128 MB | 128 MB | 128 MB |
TDP | 300 W | 300 W | 250 W |
Slot size | 2.5 | 2.5 | 2 |
Test system and evaluation software
The benchmark system is new and is now completely based on AMD. PCIe 4.0 is of course mandatory. It includes a matching X570 motherboard, the MSI MEG X570 Godlike, and the Ryzen 9 5950X, which is water-cooled and slightly overclocked. Added to this are the matching DDR4-4000 RAM from Corsair, the Vengeance RGB, as well as several fast NVMe SSDs. For direct logging during all games and applications, I use both NVIDIA’s PCAT and my own shunt measurement system, which makes things much more convenient. The detailed power consumption and other somewhat more complicated things are measured in a special laboratory, on two tracks, using high-resolution oscilloscope technology…
…and the self-built, MCU-based measurement setup for motherboards and graphics cards (pictures below), where the thermographic infrared images are also taken with a high-resolution industrial camera in an air-conditioned room. The audio measurements are then taken outside, in my chamber (room-in-room).
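The principle behind a shunt measurement is simple enough to sketch in a few lines: the current follows from the voltage drop across a known resistance, and power is that current times the rail voltage. The following example is a generic illustration and not the firmware or software of the setup described here. The class and function names as well as the 5 mOhm shunt are hypothetical, and the 100 samples per second are taken from the equipment list.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    rail_voltage_v: float  # voltage on the supply rail, e.g. roughly 12 V
    shunt_drop_v: float    # voltage drop measured across the shunt resistor


def rail_power_w(sample: Sample, shunt_ohm: float) -> float:
    """Power on one rail: Ohm's law gives the current, P = U * I the power."""
    current_a = sample.shunt_drop_v / shunt_ohm
    return sample.rail_voltage_v * current_a


def average_power_w(samples: Sequence[Sample], shunt_ohm: float) -> float:
    """Mean power over a series of equally spaced samples."""
    return sum(rail_power_w(s, shunt_ohm) for s in samples) / len(samples)


def energy_j(samples: Sequence[Sample], shunt_ohm: float,
             sample_rate_hz: float = 100.0) -> float:
    """Energy in joules: each sample stands for 1 / sample_rate_hz seconds."""
    return sum(rail_power_w(s, shunt_ohm) for s in samples) / sample_rate_hz


if __name__ == "__main__":
    # One second of a constant load: 12 V rail, 50 mV drop across 5 mOhm
    log = [Sample(12.0, 0.05)] * 100
    print(round(average_power_w(log, 0.005), 3))  # 120.0 (W)
    print(round(energy_j(log, 0.005), 3))         # 120.0 (J)
```

At 100 values per second, short load peaks in the microsecond range are averaged away, which is why such a logger is complemented by oscilloscope measurements for transients.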
The software used relies on my own interpreter including evaluation software, as well as a very extensive and flexible Excel sheet for the graphical conversion. I have also summarized the individual components of the test system in tabular form:
Test System and Equipment | |
---|---|
Hardware: | AMD Ryzen 9 5950X OC; MSI MEG X570 Godlike; 2x 16 GB Corsair DDR4 4000 Vengeance RGB Pro; 1x 2 TByte Aorus (NVMe System SSD, PCIe Gen. 4); 1x 2 TB Corsair MP400 (Data); 1x Seagate FastSSD Portable USB-C; Be Quiet! Dark Power Pro 12 1200 Watt |
Cooling: | Alphacool Ice Block XPX Pro; Alphacool Ice Mincer (modified); Thermal Grizzly Kryonaut |
Case: | Raijintek Paean |
Monitor: | BenQ PD3220U |
Power Consumption: | Oscilloscope-based system: Non-contact direct current measurement on PCIe slot (riser card); Non-contact direct current measurement at the external PCIe power supply; Direct voltage measurement at the respective connectors and at the power supply unit; 2x Rohde & Schwarz HMO 3054, 500 MHz multichannel oscilloscope with memory function; 4x Rohde & Schwarz HZO50, current clamp adapter (1 mA to 30 A, 100 kHz, DC); 4x Rohde & Schwarz HZ355, probe (10:1, 500 MHz); 1x Rohde & Schwarz HMC 8012, HiRes digital multimeter with memory function. MCU-based shunt measuring (own build, Powenetics software): Up to 10 channels (max. 100 values per second); Special riser card with shunts for the PCIe x16 slot (PEG). NVIDIA PCAT and FrameView 1.1 |
Thermal imager: | 1x Optris PI640 + 2x Xi400 Thermal Imagers; Pix Connect Software; Type K Class 1 thermal sensors (up to 4 channels) |
Acoustics: | NTI Audio M2211 (with calibration file); Steinberg UR12 (with phantom power for the microphones); Creative X7, Smaart v. 7; Own anechoic chamber, 3.5 x 1.8 x 2.2 m (LxDxH); Axial measurements, perpendicular to the centre of the sound source(s), measuring distance 50 cm; Noise emission in dBA (slow) as RTA measurement; Frequency spectrum as graphic |
OS: | Windows 10 Pro (all updates, current certified or press drivers) |
- 1 - Introduction, Data and Test Setup
- 2 - Teardown: PCB, Power Scheme and Cooler
- 3 - Borderlands 3
- 4 - Control (+DXR)
- 5 - Far Cry New Dawn
- 6 - Ghost Recon Breakpoint
- 7 - Horizon Zero Dawn
- 8 - Metro Exodus (+DXR)
- 9 - Shadow of the Tomb Raider
- 10 - Watch Dogs Legion (+DXR)
- 11 - Wolfenstein Youngblood
- 12 - World War Z
- 13 - Power Consumption and Efficiency in Gaming
- 14 - Power Consumption in Detail, Voltages and Standards
- 15 - Transients and PSU Recommendation
- 16 - Clock Rate and Temperatures
- 17 - Fan Curve and Noise
- 18 - Summary and Conclusion