Correctly limiting the power consumption of the GeForce RTX 4090: MSI Afterburner vs. 12VHPWR / 12V-2×6 adapter or both?

11. January 2024 06:00

What does the hard power limit do?

I recently covered all of this in more detail in an article on NVIDIA’s Boost, but it fits so well here that I’ll briefly summarize it again. The purpose of the telemetry is to achieve maximum graphics performance while minimizing power consumption and its side effects, such as waste heat, using all the available monitoring data. The main objective is to adjust the GPU’s core voltage in real time so that only as much power is supplied as the current GPU load actually requires, and to reach the optimum clock rate in the process. The power limits, on the other hand, set real upper bounds for this mechanism. It is precisely these two tasks that we need to keep apart for a better understanding!

So let’s start with the hard limitation via the input-side power limits (firmware, adapter variants). The firmware estimates the power consumption at very short intervals (virtually in real time), simultaneously queries all sensors and the GPU prediction, and also includes the telemetry data from the voltage regulator and the input monitoring (shunts, image below). These values are sent to the pre-programmed DPM (digital power management), i.e. the arbitrator. This control complex also knows the power, thermal and current limits of the GPU (BIOS, driver), which it can read from the respective registers. If even one of these limits is exceeded, in this case the maximum currents, the arbitrator can reduce the voltage or clock rate, which then limits the power consumption, as in our case!

Monitoring the 12V rail on a GeForce RTX 4070 using shunts
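To make the hard limit a little more tangible, here is a minimal Python sketch of such an arbitration step. It is not NVIDIA's firmware logic: the names `Telemetry`, `Limits` and `arbitrate`, the 15 MHz step and the minimum clock are assumptions chosen purely for illustration. It only shows the principle described above: derive the rail current from the shunt, compare power, current and temperature against the limits, and step the clock down as soon as one of them is exceeded.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    power_w: float    # board power limit (hypothetical value source: BIOS/driver)
    current_a: float  # maximum current on the monitored rail
    temp_c: float     # thermal limit


@dataclass
class Telemetry:
    rail_voltage_v: float  # voltage measured on the 12V input
    shunt_drop_mv: float   # voltage drop across the shunt resistor
    shunt_mohm: float      # shunt resistance in milliohms
    gpu_temp_c: float


def rail_current_a(t: Telemetry) -> float:
    """Ohm's law on the shunt: millivolts / milliohms = amperes."""
    return t.shunt_drop_mv / t.shunt_mohm


def arbitrate(t: Telemetry, limits: Limits, clock_mhz: int,
              step_mhz: int = 15, min_clock_mhz: int = 210) -> int:
    """Return the next clock: one step lower if any limit is exceeded."""
    current = rail_current_a(t)
    power = t.rail_voltage_v * current
    exceeded = (power > limits.power_w
                or current > limits.current_a
                or t.gpu_temp_c > limits.temp_c)
    if exceeded:
        # Assumption: the arbitrator lowers the clock by one fixed step.
        return max(min_clock_mhz, clock_mhz - step_mhz)
    return clock_mhz


sample = Telemetry(rail_voltage_v=12.0, shunt_drop_mv=150.0,
                   shunt_mohm=5.0, gpu_temp_c=60.0)
next_clock = arbitrate(sample, Limits(300.0, 50.0, 83.0), clock_mhz=2700)
```

In this example the shunt reading corresponds to 30 A at 12 V, which is above the assumed 300 W limit, so the sketch steps the clock down once. A real arbitrator can also lower the voltage and works with far more inputs than these four values.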

The software limitation uses the VFE

What does the software route via Afterburner do differently? To put it simply: the individual boost steps, including the default voltage, are stored in the firmware. The clock of the lowest boost step is shifted or set by a so-called offset, and the rest is then calculated by the arbitrator (mediator). With AMD, the clock rates and voltages are set for a number of predefined DPM states, which is significantly coarser (less fine-grained), but ultimately works in a similar way.
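The offset idea can be sketched in a few lines of Python. This is only an illustration under a strong assumption: the function `shift_curve` is hypothetical, and it simply keeps the original spacing between the boost steps, whereas the real arbitrator derives the remaining steps in a way that is not publicly documented. The curve values are made up.

```python
def shift_curve(base, offset_mhz):
    """Anchor the lowest boost step at base + offset and rebuild the rest.

    `base` is a list of (voltage_mv, clock_mhz) points, lowest step first.
    Assumption: the higher steps keep their original distance to the lowest.
    """
    lowest_mv, lowest_mhz = base[0]
    anchor = lowest_mhz + offset_mhz
    return [(mv, anchor + (mhz - lowest_mhz)) for mv, mhz in base]


# Made-up example curve, not taken from a real card.
base_curve = [(800, 2200), (900, 2500), (1000, 2700)]
shifted = shift_curve(base_curve, 100)
```

Under this assumption a +100 MHz offset moves every point of the example curve up by 100 MHz at unchanged voltage.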

The disadvantage of such a publicly visible (and, with suitable software, also customizable) “frequency/voltage curve” is that it is actually not so easy to define in general terms. All the end user can really modify is a partial shift, based on previously calculated, individual limit and reference values of each chip under the current conditions! This is where the so-called VFE (Voltage Frequency Engine) comes into play. It provides a flexible framework for specifying or evaluating the relationship that determines the clock frequency, which is normally a function of voltage, Speedo and temperature. Or to put it in a nutshell: the voltage determined for each frequency point of such a curve is actually a function of the GPU’s Speedo, which is determined by the “Continuous Virtual Binning”. When the power limit is lowered via Afterburner, the VFE is affected as well.

The main function of the VFE is therefore to dynamically adjust the voltage and frequency of the processor in order to optimize performance and energy efficiency. The VFE works closely with the PMU (Power Management Unit) to provide the correct voltage and frequency values for different operating states and load conditions. The VFE is responsible for adjusting the voltage and frequency, while Speedo monitors the PVT (process, voltage, temperature) variations and provides the information the VFE needs to make the correct adjustments.
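As a rough illustration of what "voltage as a function of frequency, Speedo and temperature" can mean, here is a deliberately simple linear model in Python. The real VFE equations are not public, so everything here is an assumption: the function name `required_voltage_mv`, the linear form, the reference points, and all coefficients including their signs are placeholders.

```python
def required_voltage_mv(freq_mhz, speedo, temp_c,
                        base_mv=300.0, mv_per_mhz=0.25,
                        mv_per_speedo=-0.125, mv_per_degc=0.5,
                        speedo_ref=1000.0, temp_ref_c=60.0):
    """Hypothetical VFE-style equation for one frequency point.

    Assumptions (placeholders only): a higher Speedo value stands for a
    faster chip that needs less voltage, and a higher temperature adds
    a small amount of voltage.
    """
    return (base_mv
            + mv_per_mhz * freq_mhz
            + mv_per_speedo * (speedo - speedo_ref)
            + mv_per_degc * (temp_c - temp_ref_c))


# Same frequency point, evaluated for two different chips.
average_chip = required_voltage_mv(2500, speedo=1000, temp_c=60)
better_chip = required_voltage_mv(2500, speedo=1080, temp_c=60)
```

The point of the sketch is only that the same frequency point resolves to a different voltage on each individual chip and at each temperature, which is why a user-side offset is a shift on top of per-chip values rather than an absolute curve.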

In the PMU runtime phase (see telemetry diagram at the top of the page), the perf task then samples the GPU temperature, for example every 200 ms (programmable). If the temperature exceeds a (programmable) hysteresis value, the V/F curve is re-evaluated by solving the corresponding VFE equation and reprogramming the AVFS hardware. The card repeats this loop until we switch the computer off again. The only difference is that we can’t change anything manually here, apart from setting the percentage in Afterburner.
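The sampling loop with hysteresis can be sketched like this in Python. It is a simplified model, not the PMU code: the class `CurveReevaluator` is hypothetical, the 5 °C hysteresis is an invented default, and I assume here that "exceeding the hysteresis" means the temperature has moved by more than that amount since the last evaluation. The actual re-solving of the VFE equation and the AVFS reprogramming are reduced to a counter.

```python
class CurveReevaluator:
    """Decides per temperature sample whether the V/F curve is re-evaluated."""

    def __init__(self, hysteresis_c=5.0, period_ms=200):
        self.hysteresis_c = hysteresis_c  # programmable in the real system
        self.period_ms = period_ms        # sampling interval, informational here
        self.last_eval_temp_c = None      # temperature at the last evaluation
        self.evaluations = 0              # stands in for VFE solve + AVFS update

    def on_sample(self, temp_c):
        """Return True if this sample triggered a re-evaluation."""
        first = self.last_eval_temp_c is None
        if first or abs(temp_c - self.last_eval_temp_c) > self.hysteresis_c:
            self.last_eval_temp_c = temp_c
            self.evaluations += 1
            return True
        return False


perf_task = CurveReevaluator()
# One temperature sample per period, e.g. every 200 ms.
triggered = [perf_task.on_sample(t) for t in (50, 52, 56, 58, 62, 61)]
```

With these samples only the first reading and the two jumps of more than 5 °C trigger a re-evaluation; small fluctuations in between leave the curve untouched, which is exactly what the hysteresis is for.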

Summary and conclusion

Interestingly, a 500-watt card can be run better at 300 watts with Afterburner than a 450-watt card, at least as far as variances and frame times are concerned. Beyond that, I would not try to enforce the hard cut by means of an adapter alone, but always use software so that the entire telemetry, including the VFE, is on board. In the end, this regulates more precisely and is more in line with the purpose of the entire solution, including “Continuous Virtual Binning”, which has far more room to work without tighter hard limits.

Yes, the “Requested Power Limit” can also be slightly exceeded, but that doesn’t matter as long as you stay within approximate limits. Incidentally, today’s article is based on a chance find that I came across while benchmarking. Thanks to Mafia II and switching off a power supply. You never stop learning. You can also find further reading on the subject here:

The “secret” behind NVIDIA’s sophisticated telemetry: the role of buckets, Speedo and Continuous Virtual Binning (CVB)

180 replies

Casi030

Old-timer

11,923 comments 2,339 likes

#1 Jan 11, 2024

How does it behave if you throttle via the clock, slightly below the configured power limit?

Saschman73

Veteran

471 comments 274 likes

#2 Jan 11, 2024

:sneaky:

Igor Wallossek

10,234 comments 18,936 likes

#3 Jan 11, 2024

Throttling via the clock isn't worth it. The VFE regulates that better; better to use a frame rate limiter instead.

1 like

Tronado

Old-timer

3,786 comments 1,981 likes

#4 Jan 11, 2024

The clock is fairly independent of the PL; depending on the chip and cooler quality, you can easily drop to 80% PL and still overclock by 140 MHz.

1 like

Tronado

Old-timer

3,786 comments 1,981 likes

#5 Jan 11, 2024

A frame limit is always good where it's possible. The PL is no problem either: hardly any fewer fps at 75-80% PL.

1 like

big-maec

Old-timer

856 comments 507 likes

#6 Jan 11, 2024

I usually do that fairly often when I notice the card heating up unnecessarily in games. I mostly do it that way because I'm also a bit too lazy to look into limiting the card some other way.

1 like

*andi*

Newcomer

5 comments 5 likes

#7 Jan 11, 2024

I still find the spikes brutal. I wonder whether they couldn't be smoothed out considerably with a combination of inductors and capacitors. I just lack the knowledge to dimension it correctly, as well as the means to measure whether it would help... And I'm afraid the power supply would get thrown off if too large an inductive/capacitive load is attached.

Tronado

Old-timer

3,786 comments 1,981 likes

#8 Jan 11, 2024

What kind of power supply do you have? All good, not-too-old PSUs from 850 W upwards (Corsair/Seasonic/Be Quiet!) have no problems with cards limited to 450 W.

PCIE Express 6.9

Member

27 comments 11 likes

#9 Jan 11, 2024

I also undervolted my 3080 Ti to 850 mV using Afterburner so that my power supply doesn't shut off.
I additionally capped the power limit at 300 watts, but it never felt like that really helped.
Without spikes, even 450 watts ran flawlessly in a benchmark.
But when gaming, the spikes can be extreme depending on the game.

2 likes

Rizoma

Veteran

173 comments 140 likes

#10 Jan 11, 2024

Those are some hefty spikes. What do they look like when the card runs at 600 W? Then you'd get spikes above the 12VHPWR specification; they are very short, but frequent. With 600 W continuous operation and a less-than-perfectly manufactured connector, that would surely be the number one cause of scorched connectors.

MD_Enigma

Member

76 comments 39 likes

#11 Jan 11, 2024

I don't understand the fixation on Afterburner. You can do it from the command line, and the tool is installed automatically with every NVIDIA driver.
nvidia-smi -pl 300

3 likes

grimm

Old-timer

3,105 comments 2,044 likes

#12 Jan 11, 2024

I experimented with both UV and a frame limiter on the 6900 XT as well as on my current 4080. In the end the FL always "won", simply because the card can take what it needs, the solution works for every game and needs no fine-tuning, and there was never any stutter.

3 likes

Starfox555

Old-timer

1,484 comments 750 likes

#13 Jan 11, 2024

What were the temperatures at the contact surfaces when limiting via Afterburner?

Igor Wallossek

10,234 comments 18,936 likes

#14 Jan 11, 2024

The contacts were never hotter than the PCB, which has been proven to get really hot.

Igor Wallossek

10,234 comments 18,936 likes

#15 Jan 11, 2024

There's a reason I wrote NVAPI; it's just that most people won't want to fiddle with that. :D

Starfox555

Old-timer

1,484 comments 750 likes

#16 Jan 11, 2024

And by how many kelvin did the PCB temperature differ between "native power delivery", software limiting and hardware limiting?

arcDaniel

Old-timer

1,624 comments 890 likes

#17 Jan 11, 2024

So, how am I supposed to understand this statement?

When I adjust the voltage/clock curve on my 4080, I get a clock of 2640 MHz at 0.945 V, with consumption at around 225 W and outliers around 240 W.

Now, if no optimization is supposed to be worthwhile here... if I simply limit the card to 240 W, I get measurably and noticeably less performance in games. In the benchmark, the 2640 MHz isn't even reached anymore. With a frame limit, the card also remains thirstier.

If it were that simple, I would use Linux more again, but at the moment there is no way to adjust the voltage/clock curve there...

1 like

Starfox555

Old-timer

1,484 comments 750 likes

#18 Jan 11, 2024

CoreCtrl / CoreCtrl · GitLab: Profile based system control utility (gitlab.com)

arcDaniel

Old-timer

1,624 comments 890 likes

#19 Jan 11, 2024

A useful tool, back when I still had an AMD GPU.....

About the author

Igor Wallossek

Editor-in-chief and name-giver of igor'sLAB as the content successor of Tom's Hardware Germany, whose license was returned in June 2019 in order to better meet the qualitative demands of web content and challenges of new media such as YouTube with its own channel.

Computer nerd since 1983, audio freak since 1979 and pretty much open to anything with a plug or battery for over 50 years.