What does the hard power limit do?
I recently covered all of this in a more detailed article on NVIDIA’s Boost, but it fits in so well here that I’ll briefly summarize it again. The purpose of the telemetry is to achieve maximum graphics performance while minimizing power consumption and its side effects, such as waste heat, and to use all the monitoring data to that end. The main objective is to adjust the GPU core voltage in real time so that only as much power is supplied as the current GPU load actually requires, while still reaching the optimum clock rate. The power limits, on the other hand, set real upper bounds for this mechanism. It is precisely these two tasks that we need to keep apart for a better understanding!
So let’s start with our hard limitation via the input-side power limits (firmware, adapter variants). The firmware constantly estimates the power consumption at very short intervals (virtually in real time). At the same time it queries all the sensors and the GPU prediction, and it also includes the telemetry data from the voltage regulator and the input monitoring (shunts, image below). These values are sent to the pre-programmed DPM (digital power management), i.e. the arbitrator. This control complex also knows the power, thermal and current limits of the GPU (BIOS, driver), which it can read from the respective registers. If even one of the input variables exceeds its limit, in this case the maximum current, the arbitrator can reduce the voltage or the clock rate, which then limits the power consumption, as in our case!
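The decision rule described above can be illustrated with a minimal Python sketch. It is not the actual firmware logic: the names `Limits`, `Telemetry`, `exceeded_limits` and `arbitrate`, the 15 MHz step and the clock floor are all assumptions made for illustration. The only thing taken from the text is the principle that a single exceeded limit is enough for the arbitrator to step down.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # Hypothetical limit set; real values live in BIOS/driver registers.
    power_w: float
    current_a: float
    temp_c: float


@dataclass
class Telemetry:
    # One hypothetical sample from shunts, voltage regulator and sensors.
    power_w: float
    current_a: float
    temp_c: float


def exceeded_limits(sample, limits):
    """Return the names of all limits that the sample exceeds."""
    return [
        name
        for name in ("power_w", "current_a", "temp_c")
        if getattr(sample, name) > getattr(limits, name)
    ]


def arbitrate(clock_mhz, sample, limits, step_mhz=15.0, floor_mhz=210.0):
    """Drop one clock step if any limit is exceeded, otherwise keep the clock.

    step_mhz and floor_mhz are illustrative assumptions, not documented values.
    """
    if exceeded_limits(sample, limits):
        return max(floor_mhz, clock_mhz - step_mhz)
    return clock_mhz
```

In this sketch a sample that stays below the power and temperature limits but exceeds the current limit still lowers the clock, which mirrors the hard cut seen with the input-side limits.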
The software limitation uses the VFE
What does the software do differently with the Afterburner? Put simply: the individual boost steps, including the default voltage, are stored in the firmware. The clock of the lowest boost stage is shifted (or set) by a so-called offset, and the rest is then calculated by the arbitrator (mediator). With AMD, the clock rates and voltages are set for a number of predefined DPM states, which is significantly coarser and therefore less precise, but ultimately works in a similar way.
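As a rough illustration of the offset idea, here is a small Python sketch. The function `boost_table`, the evenly spaced 15 MHz steps and the base clock are hypothetical; the real table and the way the arbitrator derives the remaining steps are not public. The sketch only shows that shifting the lowest stage by an offset moves every derived step along with it.

```python
def boost_table(base_mhz, n_steps, step_mhz=15, offset_mhz=0):
    """Build an illustrative list of boost clocks.

    The lowest stage is base_mhz shifted by offset_mhz; every further stage
    is derived from it. Evenly spaced steps are an assumption of this sketch.
    """
    lowest = base_mhz + offset_mhz
    return [lowest + i * step_mhz for i in range(n_steps)]


stock = boost_table(1500, 4)
shifted = boost_table(1500, 4, offset_mhz=30)
print(stock)
print(shifted)
```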
The disadvantage of such a publicly visible (and, with suitable software, also customizable) “frequency/voltage curve” is that it is actually not so easy to define in general terms. All the end user can really modify is a certain partial shift, based on previously calculated limit and reference values that are individual to each chip under the current conditions! This is where the so-called VFE (Voltage Frequency Engine) comes into play. It provides a flexible framework to specify or evaluate the relationship between the clock frequency and the variables it normally depends on: voltage, speedo and temperature. Or to put it in a nutshell: the voltage determined for each frequency point of such a curve is actually a function of the GPU’s speedo, which is determined by the “Continuous Virtual Binning”. When the power limit is lowered via the Afterburner, the VFE is also affected.
The main function of the VFE is therefore to dynamically adjust the voltage and frequency of the processor in order to optimize performance and energy efficiency. It works closely with the PMU (Power Management Unit) to provide the correct voltage and frequency values for different operating states and load conditions. The division of labor is simple: Speedo monitors the PVT (process, voltage, temperature) variations and supplies the information that the VFE needs to make the correct adjustments.
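To make the idea of a voltage that depends on frequency, speedo and temperature more tangible, here is a deliberately simple Python sketch. The linear form, all coefficients and the function names `vfe_voltage_mv` and `evaluate_curve` are pure assumptions; the real VFE equations are not public and are certainly more complex. The only point is that the same frequency point yields a different voltage for a different chip (speedo) or a different temperature.

```python
def vfe_voltage_mv(freq_mhz, speedo, temp_c, *, v0_mv=650.0, k_f=0.15,
                   k_s=0.5, k_t=0.2, speedo_ref=100.0, temp_ref=50.0):
    """Hypothetical linear model for the voltage needed at one frequency point.

    Assumptions of this sketch: a higher speedo (a "faster" chip) needs less
    voltage, a higher temperature needs more. All coefficients are invented.
    """
    return (v0_mv
            + k_f * freq_mhz
            - k_s * (speedo - speedo_ref)
            + k_t * (temp_c - temp_ref))


def evaluate_curve(freqs_mhz, speedo, temp_c):
    """Evaluate the model for every frequency point of a V/F curve."""
    return [(f, vfe_voltage_mv(f, speedo, temp_c)) for f in freqs_mhz]


# Two hypothetical chips at the same temperature: the one with the higher
# speedo gets a lower voltage for every frequency point.
average_chip = evaluate_curve([1800, 2000, 2200], speedo=100.0, temp_c=60.0)
good_chip = evaluate_curve([1800, 2000, 2200], speedo=110.0, temp_c=60.0)
```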
In the PMU runtime phase (see telemetry diagram at the top of the page), the perf task then samples the GPU temperature at a programmable interval, for example every 200 ms. If the temperature exceeds a (programmable) hysteresis value, the V/F curve is re-evaluated by solving the corresponding VFE equation and reprogramming the AVFS hardware. And the card repeats this nice loop until we switch the computer off again. The only catch is that we can’t change anything manually here, apart from setting the percentage in the Afterburner.
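The sampling loop can be sketched in a few lines of Python. This is a simulation over a list of temperature samples, not PMU code. The function name `run_perf_task` and the 5 °C threshold are hypothetical, and interpreting the hysteresis as "temperature change since the last re-evaluation" is an assumption of this sketch.

```python
def run_perf_task(temps_c, hysteresis_c=5.0, period_ms=200):
    """Return the timestamps (in ms) at which the V/F curve is re-evaluated.

    temps_c holds one temperature sample per period. A re-evaluation is
    triggered whenever the temperature has moved by more than hysteresis_c
    since the last re-evaluation (assumed interpretation).
    """
    reevaluated_at = []
    last_eval_temp = temps_c[0] if temps_c else None
    for i, temp in enumerate(temps_c):
        if abs(temp - last_eval_temp) > hysteresis_c:
            # Here the real system would solve the VFE equation again
            # and reprogram the AVFS hardware.
            last_eval_temp = temp
            reevaluated_at.append(i * period_ms)
    return reevaluated_at


print(run_perf_task([50, 52, 56, 58, 62, 55]))  # → [400, 800, 1000]
```

Small fluctuations around the last evaluated temperature trigger nothing, so the curve is only recomputed when the drift is large enough to matter.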
Summary and conclusion
Interestingly, a 500-watt card can be operated better at 300 watts with the Afterburner than a 450-watt card, at least as far as the variances and frame times are concerned. Beyond that, I would not try to enforce the hard cut simply by means of an adapter, but would always use software so that the entire telemetry, including the VFE, is involved. In the end, this regulates more precisely and is more in line with the purpose of the entire solution, including “Continuous Virtual Binning”, which has much more room to work without tighter hard limits.
Yes, the “Requested Power Limit” can also be slightly exceeded, but that doesn’t matter as long as you stay roughly within the limits. Incidentally, today’s article is based on a chance find that I came across while benchmarking, thanks to Mafia II and a power supply that switched off. You never stop learning.