Today we can finally present the NVIDIA GeForce RTX 4080 Founders Edition (FE), including all technical details and the long-awaited benchmarks, and the suspense had certainly built up long enough. Even though NVIDIA officially positions the card as the successor to the GeForce RTX 3080, it also challenges the RTX 3090 and especially the GeForce RTX 3090 Ti, because the price will probably land exactly where the old top model used to be. The GeForce RTX 4080 again features a really massive cooler. As I already wrote in an article about the origin of the mega-sized coolers for the RTX 4090, the TBP was only cut drastically late in development. Here, the cooler is still oversized by 100 watts, because the originally planned TDP of 420 watts was cut to 320 watts. Of course, this makes me all the more curious, especially about temperatures and noise levels.
Today, we will primarily focus on the technical overview, with the GeForce RTX 4080 FE as the test object. As usual, this includes benchmarks, a teardown, board and cooler analysis, as well as power consumption and load peaks with a power supply recommendation. Of course, we also include important features like DLSS 3.0 and Reflex in the benchmarks (with short introductions), but I already have to point to the many follow-ups that will deal with further benchmarks, image quality comparisons with FSR and XeSS, and latencies.
This subject has become so complex that a quick skim simply can’t do it justice. We will even have a special video analysis for you, the basis of which I had already built into the CMS for the RTX 4090. So, once again, it’s worth enjoying NVIDIA week right to the end. And we will of course use all this effort as a basis to compare everything directly with AMD’s upcoming RDNA3 graphics card generation in November.
The AD103 and the new Ada architecture
The NVIDIA GeForce RTX 4080 is also manufactured in the TSMC 4N process and has 45.9 billion transistors. Spoiler: it also offers a decent leap in performance, efficiency and AI-supported graphics, even if the gap to the GeForce RTX 4090 seems almost huge. The Ada architecture relies here on up to 7 Graphics Processing Clusters (GPC) and up to 80 new Streaming Multiprocessors (SM) with 10,240 CUDA cores, whose performance and energy efficiency have increased significantly.
In addition, there are 4th-generation Tensor Cores and the Optical Flow accelerator, enabling transformative AI technologies including NVIDIA DLSS and the new NVIDIA DLSS 3 frame rate multiplier. The 3rd-generation RT cores offer up to 2x the ray tracing performance, and Shader Execution Reordering (SER) is said to speed up ray tracing operations by a factor of two. In addition, NVIDIA now uses a dual AV1 encoder, with the NVIDIA encoder (NVENC) now in its 8th generation. Encoding with AV1 is said to work up to 40% more efficiently than H.264.
The AD103-300 of the GeForce RTX 4080 has been trimmed a bit: it still offers 7 GPCs in total, but two of them have been cut from 12 to 10 SMs and one even to 8 SMs. This still results in 76 SMs with 9,728 CUDA cores for the chip of the new consumer card. In addition, there are a total of 38 Texture Processing Clusters (TPC), 76 3rd-generation RT cores, 304 4th-generation Tensor Cores, 304 Texture Units (TU) and 112 ROPs. The L2 cache is 65,536 KB (64 MB) in total, and the card uses 16 GB of GDDR6X clocked at 11,200 MHz on a 256-bit interface, which corresponds to a data rate of 22.4 Gbps and a bandwidth of 716.8 GB/s.
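For anyone who wants to check the arithmetic behind these numbers, here is a small Python sketch. The per-SM ratios (128 CUDA cores, 4 Tensor Cores, 4 Texture Units, 1 RT core per SM, 2 SMs per TPC) are not quoted directly in the text; they are inferred from the totals above and should be treated as assumptions. The function and variable names are purely illustrative.

```python
# SM layout of the AD103-300 as described in the text:
# 7 GPCs, two of them cut from 12 to 10 SMs and one cut to 8 SMs.
SMS_PER_GPC = [12, 12, 12, 12, 10, 10, 8]

# Per-SM ratios inferred from the article's totals (assumptions).
CUDA_CORES_PER_SM = 128    # 9728 / 76
TENSOR_CORES_PER_SM = 4    # 304 / 76
TEXTURE_UNITS_PER_SM = 4   # 304 / 76
RT_CORES_PER_SM = 1        # 76 / 76
SMS_PER_TPC = 2            # 76 / 38


def chip_totals(sms_per_gpc):
    """Derive the unit counts of the chip from its SM layout."""
    sm = sum(sms_per_gpc)
    return {
        "sm": sm,
        "tpc": sm // SMS_PER_TPC,
        "cuda_cores": sm * CUDA_CORES_PER_SM,
        "rt_cores": sm * RT_CORES_PER_SM,
        "tensor_cores": sm * TENSOR_CORES_PER_SM,
        "texture_units": sm * TEXTURE_UNITS_PER_SM,
    }


def memory_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Bandwidth in GB/s: per-pin data rate times bus width, divided by 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


totals = chip_totals(SMS_PER_GPC)
bandwidth = memory_bandwidth_gb_s(22.4, 256)
l2_mb = 65536 / 1024

print(totals)
print(f"Memory bandwidth: {bandwidth:.1f} GB/s, L2 cache: {l2_mb:.0f} MB")
```

The same `chip_totals` helper also reproduces the 10,240 CUDA cores of the full chip when fed any layout that sums to 80 SMs.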
The changes to all three core types can be summarized quite simply:
- Programmable Shader: Ada’s SM includes an important new technology called Shader Execution Reordering (SER), which reorders work on the fly, providing a 2x speedup for ray tracing. SER is as big an innovation as out-of-order execution was for CPUs at the time. The 83 shader TFLOPS that NVIDIA quotes for the RTX 4090 are quite a statement.
- 4th-generation Tensor Cores: The new Tensor Core in Ada includes the NVIDIA Hopper FP8 Transformer Engine, which delivers over 1.3 petaFLOPS for AI inference workloads in the RTX 4090. Compared to FP16, FP8 halves data storage requirements and doubles AI performance. The GeForce RTX 4090 thus offers more than twice the total Tensor Core processing power of the RTX 3090 Ti.
- 3rd-generation RT Cores: A new Opacity Micromap engine accelerates ray tracing of alpha-tested geometry by a factor of 2. Add to this a new Micro-Mesh engine that handles all the geometric richness without additional BVH build and storage costs. Ray-triangle intersection throughput is 191 RT-TFLOPS, compared to Ampere’s 78 RT-TFLOPS.
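To make the idea behind SER a bit more tangible, here is a deliberately simplified toy model in Python. It is not how NVIDIA's hardware or scheduler actually works. It only assumes that threads are executed in groups of 32 (warps) and that a warp needs one serialized pass per distinct hit shader it contains. The workload (256 rays, 8 materials in an interleaved pattern) is made up for illustration.

```python
WARP_SIZE = 32  # threads that execute in lockstep on NVIDIA GPUs


def shader_passes(shader_ids, warp_size=WARP_SIZE):
    """Count serialized shader passes in the toy model.

    Each warp needs one pass per distinct shader among its rays,
    so divergent warps cost more than coherent ones.
    """
    passes = 0
    for start in range(0, len(shader_ids), warp_size):
        warp = shader_ids[start:start + warp_size]
        passes += len(set(warp))
    return passes


# Hypothetical workload: 256 rays hitting 8 different materials,
# interleaved so that every warp sees all 8 hit shaders.
rays = [i % 8 for i in range(256)]

incoherent = shader_passes(rays)        # rays processed in arrival order
reordered = shader_passes(sorted(rays)) # rays grouped by hit shader first

print(f"Passes without reordering: {incoherent}")
print(f"Passes with reordering:    {reordered}")
```

In this artificial worst case, reordering cuts the number of passes from 64 to 8. Real scenes are far less extreme, and the reordering itself is not free, which is why a factor of around two is the figure quoted for ray tracing in practice.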
The card still relies on a PCIe Gen 4 interface and adopts only one element of the PCIe Gen 5 specification: the 12VHPWR connector (12+4 pin) for external power. The TGP is 320 watts and can be raised to up to 400 watts, depending on the board partner (which is rather pointless, because the voltage limit kicks in at some point anyway). The extremely oversized cooler will certainly keep the chip from reaching its maximum permissible 90 °C.
The NVIDIA GeForce RTX 4080 FE 16 GB in detail
The FE weighs “only” 2123 grams. With a length of 30.5 cm, a height of 12.5 cm from slot to top edge and a thickness of exactly 6 cm, it is still a real bruiser, but significantly more compact than most board partner cards. With three 6+2-pin connectors, the included adapter to the 12+4-pin connector turns out similar to the old three-way adapter of the GeForce RTX 3090 Ti. I wonder what awaits us in the end? I already know, and you may be in for a surprise!
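As a rough plausibility check on the power side, the following Python sketch puts the figures mentioned so far next to each other. The 150 watts per 6+2-pin connector is the usual rating from the PCIe specification and is an assumption added here, not a measurement. All variable names are illustrative.

```python
# Assumption: each 6+2-pin PCIe connector is rated for 150 W.
PCIE_8PIN_WATTS = 150

ADAPTER_INPUTS = 3      # 6+2-pin connectors on the included adapter
TGP_DEFAULT = 320       # watts, default TGP of the RTX 4080
TGP_MAX = 400           # watts, maximum depending on the board partner
COOLER_DESIGN = 420     # watts, originally planned TDP the cooler was sized for

# What the three adapter inputs are rated to deliver together.
adapter_capacity = ADAPTER_INPUTS * PCIE_8PIN_WATTS

# Remaining headroom on the adapter at default and maximum TGP.
headroom_default = adapter_capacity - TGP_DEFAULT
headroom_max = adapter_capacity - TGP_MAX

# How much the power limit can be raised, in percent.
power_limit_increase = (TGP_MAX / TGP_DEFAULT - 1) * 100

# How far the cooler is oversized relative to the default TGP.
cooler_margin = COOLER_DESIGN - TGP_DEFAULT

print(f"Adapter capacity:      {adapter_capacity} W")
print(f"Headroom at 320 W TGP: {headroom_default} W")
print(f"Headroom at 400 W TGP: {headroom_max} W")
print(f"Max power limit raise: {power_limit_increase:.0f} %")
print(f"Cooler margin:         {cooler_margin} W")
```

This only covers the rated budget of the external connectors and ignores the PCIe slot as well as short load peaks.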
Of course, the card offers not only a power input but also video outputs. Four, to be exact: three DisplayPort 1.4a and one HDMI 2.1a. That is a pity, especially for DisplayPort, given the new 2.1 specification. A missed opportunity, unfortunately. But let’s be honest: when will there be affordable monitors for it? 2023? Hardly…
It looks like a closed affair, but this is not the only magnetic flap you can fold open! But all in good time: for today, we stop at the underwear. The naked facts are coming soon, I promise!
With that, the first page is done and we are slowly preparing for the test.
- 1 - Introduction, technical data and technology
- 2 - Test system and the igor'sLAB MIFCOM-PC
- 3 - Teardown: PCB, components and cooler
- 4 - Gaming performance WQHD (2560 x 1440 Pixels)
- 5 - Gaming performance UHD (3840 x 2160 Pixels)
- 6 - Gaming performance UHD + DLSS/FSR/XeSS (3840 x 2160 Pixels)
- 7 - DLSS 3.0 and the longest benchmark bars
- 8 - NVIDIA Reflex and latencies
- 9 - Workstation performance
- 10 - Power consumption, load levels and standards
- 11 - Transients and PSU recommendation
- 12 - Temperatures, clock rates, OC, fans and noise
- 13 - Summary and conclusion