Today I finally reveal the secret of the GeForce RTX 4070 Ti (originally announced as the GeForce RTX 4080 12GB) in the form of the board partner card MSI GeForce RTX 4070 Ti SUPRIM X 12GB. It is the smallest Ada card so far. It uses the nearly full AD104, while the larger RTX 4080 is based on the much more heavily trimmed AD103 and the RTX 4090 on the somewhat reduced AD102. New chip, new luck? Let’s find out. There is no Founders Edition as a test sample this time, only board partner cards. However, ours follows the standard TBP, so you can definitely treat the two as equivalent.
Of course, there are many benchmarks as usual, the comprehensive teardown, a very elaborate board and cooler analysis with some reverse engineering, as well as the analysis of the power consumption and the load peaks, including a suitable power supply recommendation. Since I know that many colleagues will also repeat all the technical details and theory that have already been presented piecemeal, I’ll largely spare myself that today and only briefly refer to the already known data. After all, you want to see real figures today and not PR fireworks.
Unpack and get to work: those who buy the card get a few nice extras in the scope of delivery. There is the 12VHPWR adapter with 3x 8-pin for older power supplies, a mouse pad and of course the card itself along with a graphics card holder. Well, you need the holder, otherwise you literally have a serious problem. But before I go into more detail about the card itself, here’s a short information insert about the chip and the architecture used.
I had the choice between this card and an MSI RTX 4070 Ti GamingX Trio, which is of course much simpler and thus pushes the price more towards MSRP. That card comes with a 12VHPWR adapter with only 2x 8-pin and a maximum TBP limit of 305 watts. However, since both cards come with 285 watts TBP out of the box and I also tested the GamingX Trio at this setting, the performance differences are close to zero and the GPU lottery decides the rest. In my case they were almost exactly the same speed. If there is interest, I will of course also test that card individually.
The AD104 and the new Ada architecture
The 294.5 mm² chip of the NVIDIA GeForce RTX 4070 Ti is also manufactured in the TSMC 4N process and has 35.8 billion transistors. The Ada architecture relies here on 5 Graphics Processing Clusters (GPC) and 60 new streaming multiprocessors (SM) with 7680 CUDA cores, whose performance and energy efficiency have increased significantly compared to Ampere. In addition, there are 240 4th-generation Tensor Cores and Optical Flow, enabling transformative AI technologies including NVIDIA DLSS and the new NVIDIA DLSS 3 frame rate multiplier.
The 60 RT cores of the 3rd generation offer up to 2x ray tracing performance, and Shader Execution Reordering (SER) improves ray tracing operations by a factor of two. In addition, there are a total of 30 Texture Processing Clusters (TPC), 240 Texture Units (TU) and 80 ROPs. The L2 cache is 49152 KB (48 MB) in total, and the card uses 12 GB of GDDR6X clocked at 10500 MHz on a rather narrow 192-bit interface, which corresponds to a data rate of 21 Gbps and a bandwidth of 504 GB/s.
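The memory figures and unit counts above can be reproduced with a little arithmetic. The following Python sketch does that. The per-SM ratios (4 Tensor Cores, 1 RT core, 4 texture units, 2 SMs per TPC) and the factor of two between the quoted memory clock and the per-pin data rate are assumptions about how the numbers relate, not something stated in this article, and all function names are made up for illustration.

```python
# Figures quoted in the article
SM_COUNT = 60
MEM_CLOCK_MHZ = 10500
BUS_WIDTH_BITS = 192
L2_CACHE_KB = 49152

# Assumed per-SM ratios (illustrative, not taken from the article)
TENSOR_CORES_PER_SM = 4
RT_CORES_PER_SM = 1
TEXTURE_UNITS_PER_SM = 4
SMS_PER_TPC = 2


def data_rate_gbps(mem_clock_mhz, transfers_per_clock=2):
    """Per-pin data rate in Gbps, assuming two transfers per quoted clock."""
    return mem_clock_mhz * transfers_per_clock / 1000


def bandwidth_gb_per_s(rate_gbps, bus_width_bits):
    """Total bandwidth in GB/s: per-pin rate times bus width, divided by 8 bits per byte."""
    return rate_gbps * bus_width_bits / 8


def unit_counts(sm_count):
    """Derive the functional unit counts from the number of SMs."""
    return {
        "tensor_cores": sm_count * TENSOR_CORES_PER_SM,
        "rt_cores": sm_count * RT_CORES_PER_SM,
        "texture_units": sm_count * TEXTURE_UNITS_PER_SM,
        "tpcs": sm_count // SMS_PER_TPC,
    }


if __name__ == "__main__":
    rate = data_rate_gbps(MEM_CLOCK_MHZ)
    print("data rate (Gbps):", rate)
    print("bandwidth (GB/s):", bandwidth_gb_per_s(rate, BUS_WIDTH_BITS))
    print("L2 cache (MB):", L2_CACHE_KB / 1024)
    print("units:", unit_counts(SM_COUNT))
```

With the quoted values this yields 21 Gbps, 504 GB/s and 48 MB, plus 240 Tensor Cores, 60 RT cores, 240 texture units and 30 TPCs, which matches the data above.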
The AD104-400 of the GeForce RTX 4070 Ti has hardly been cut down and only offers one NVDEC (decoder) instead of four in total. However, NVIDIA uses a dual AV1 encoder in both chips, and the 8th-generation NVIDIA encoder (NVENC) with AV1 is said to work up to 40% more efficiently than H.264. The rest remains identical. Except for the trimmed decoder, the AD104-400 is more or less the full configuration.
The changes to all three core types can be summarized quite simply:
- Programmable Shader: Ada’s SM includes an important new technology called Shader Execution Reordering (SER), which reorders work on the fly, providing a 2x speedup for ray tracing. SER is as big an innovation as the out-of-order design for CPUs was at the time. 83 shader TFLOPS are quite a statement.
- 4th-generation Tensor Cores: The new Tensor Core in Ada includes the NVIDIA Hopper FP8 Transformer Engine, which delivers over 1.3 petaFLOPS for AI inference workloads in the RTX 4090. Compared to FP16, FP8 halves data storage requirements and doubles AI performance. The GeForce RTX 4090 thus offers more than twice the total Tensor Core processing power of the RTX 3090 Ti.
- 3rd-generation RT Cores: A new opacity micromap engine accelerates ray tracing of alpha-tested geometry by a factor of 2. Add to this a new micro-mesh engine that handles all the geometric richness without further BVH creation and storage costs. Ray-triangle intersection throughput is 191 RT-TFLOPS, compared to Ampere’s 78 RT-TFLOPS.
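Two of the claims in this list can be checked with quick arithmetic: that FP8 halves the storage of FP16, and how large the step from 78 to 191 RT-TFLOPS is. The Python sketch below is only an illustration. The tensor size is a hypothetical example value and the function names are made up.

```python
# Storage size per value in bytes for the two formats named in the text
BYTES_PER_VALUE = {"fp16": 2, "fp8": 1}


def storage_bytes(num_values, fmt):
    """Memory needed to store num_values numbers in the given format."""
    return num_values * BYTES_PER_VALUE[fmt]


def generational_ratio(new_value, old_value):
    """How many times larger the new figure is than the old one."""
    return new_value / old_value


if __name__ == "__main__":
    num_values = 1_000_000  # hypothetical tensor size, for illustration only
    print("FP16 bytes:", storage_bytes(num_values, "fp16"))
    print("FP8 bytes:", storage_bytes(num_values, "fp8"))
    # RT-TFLOPS figures quoted in the list above (Ada vs. Ampere)
    print("RT-TFLOPS ratio:", round(generational_ratio(191, 78), 2))
```

The same tensor needs half the bytes in FP8, and the quoted RT-TFLOPS figures work out to roughly a 2.4x increase over Ampere.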
The card still relies on a PCIe Gen. 4 interface; only the external power connection, the 12VHPWR connector (12+4 pin), uses an element of the PCIe Gen. 5 specification. The TGP is 285 watts and can be increased up to 365 watts, depending on the board partner (which is rather pointless, because the voltage limit kicks in at some point anyway). The extremely oversized cooler will easily keep the chip away from its maximum permissible 90 °C anyway.
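To put the power figures into perspective, the following Python sketch computes the relative headroom between the 285-watt default and the 365-watt maximum, and compares the maximum with what the 3x 8-pin adapter mentioned earlier can nominally supply. The 150 watts per 8-pin PCIe plug is an assumption based on the usual rating of that connector, not a figure from this article, and the function names are made up for illustration.

```python
DEFAULT_TGP_W = 285  # default power target quoted in the article
MAX_TGP_W = 365      # maximum power target quoted in the article

# Assumption: nominal rating of one 8-pin PCIe plug (not from the article)
WATTS_PER_8PIN = 150


def headroom_percent(default_w, max_w):
    """Relative increase from the default to the maximum power target."""
    return (max_w - default_w) / default_w * 100


def adapter_budget_w(num_8pin, watts_per_plug=WATTS_PER_8PIN):
    """Nominal power an adapter can pass on from its 8-pin inputs."""
    return num_8pin * watts_per_plug


if __name__ == "__main__":
    print("headroom (%):", round(headroom_percent(DEFAULT_TGP_W, MAX_TGP_W), 1))
    budget = adapter_budget_w(3)  # 3x 8-pin adapter supplied with this card
    print("adapter budget (W):", budget)
    print("maximum TGP fits:", MAX_TGP_W <= budget)
```

Under these assumptions the maximum power target is about 28 percent above the default and stays well within the nominal 450 watts of the 3x 8-pin adapter.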
The MSI GeForce RTX 4070 Ti SUPRIM X 12 GB in detail
The RTX 4080 SUPRIM X 16GB weighs 2352 grams, the RTX 4070 Ti SUPRIM X “only” 2004 grams. The length of 34 cm has grown by another centimeter, while the height of 13.5 cm is slightly lower, but still far above the normal size. You need a whopper of a case if you have to use the large adapter. The material looks somewhat spacey, as always, and MSI has once again succeeded perfectly in incorporating the familiar SUPRIM design language. Light metal and a bit of plastic, plus a visually successful light metal backplate with luminous accents: the buyer can live with that, even visually.
MSI uses a dual BIOS with a silent and a gaming mode. The only difference between the two modes is the fan curve, which is a bit less aggressive in silent mode and also lets the card run a bit “warmer”. But with this giant cooler, even that is more than enough. Power limit and clock rates are identical in both modes. MSI has enabled up to 365 watts for the card, which is rather wasteful in the end, because you will never be able to exploit such values in gaming, not even with OC. As always, the voltage limit kicks in first.
But you can not only feed power into the card, you can also plug in video cables. There are four connections, to be precise: three DisplayPort 1.4a and one HDMI 2.1a. For DisplayPort in particular that is a pity, given the newer specifications. Chance missed, unfortunately. And with HDMI, you have to resort to compression from 4K onwards if you want to go above 120 Hz. That’s a pity, but it’s not MSI’s fault.
The screenshot from GPU-Z shows us the default settings of the gaming mode, which correspond to those of the silent mode except for the higher power limit.
With this, the first page is finished and we are slowly preparing for the test.