The telemetry of current NVIDIA graphics cards and the Voltage Frequency Engine (VFE)
Now I want to describe NVIDIA’s Boost (and AMD’s Power Tune in a more generalized form) and put what you have just read into context, even if I have to repeat myself a bit (diagram below). The task of the so-called telemetry is to use all of the monitoring data to achieve maximum graphics performance with the lowest possible power consumption and the fewest resulting side effects, such as waste heat. The main objective is to adjust the core voltage of the GPU in real time so that only as much power is supplied as the current GPU load actually requires, while still reaching the optimum clock rate.
Let’s start by simply calling it a voltage curve (I’m sure everyone has heard the term before), even if I’ll have to go into more detail later. To put it in simple terms: the individual boost steps, including the default voltage, are stored in the firmware. The clock of the lowest boost step is shifted or determined by a so-called offset, and the rest then results from the calculations of the arbitrator (mediator). With AMD, the clock rates and voltages are set for a number of predefined DPM states, which is significantly less precise (coarser-grained), but ultimately works in a similar way.
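To make the idea of a stored curve plus offset more tangible, here is a minimal Python sketch. All voltages and clocks are hypothetical, and the offset is applied uniformly to every point, which is a simplification of what tuning tools expose and not a description of NVIDIA’s internal logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VFPoint:
    voltage_mv: int  # voltage of this boost step in millivolts
    clock_mhz: int   # clock assigned to that voltage


# Hypothetical stock curve; real cards store many more, finer steps.
STOCK_CURVE = [
    VFPoint(800, 1500),
    VFPoint(850, 1650),
    VFPoint(900, 1800),
    VFPoint(950, 1905),
    VFPoint(1000, 1995),
]


def apply_offset(curve, offset_mhz):
    """Shift every clock by the same offset; the voltages stay untouched."""
    return [VFPoint(p.voltage_mv, p.clock_mhz + offset_mhz) for p in curve]


def highest_clock_at(curve, max_voltage_mv):
    """Return the highest clock reachable without exceeding a voltage cap."""
    eligible = [p.clock_mhz for p in curve if p.voltage_mv <= max_voltage_mv]
    if not eligible:
        raise ValueError("voltage cap is below the lowest curve point")
    return max(eligible)


shifted = apply_offset(STOCK_CURVE, 100)
print(highest_clock_at(STOCK_CURVE, 900))  # 1800
print(highest_clock_at(shifted, 900))      # 1900
```

The second call shows why an offset at an unchanged voltage cap amounts to an overclock: the same 900 mV now maps to a higher clock, provided the chip is stable there.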
The firmware estimates the energy consumption at very short intervals (virtually in real time). At the same time, it queries all the sensors and the GPU prediction and incorporates the telemetry data from the voltage regulator and the input monitoring (shunts, image below). These values are passed to the pre-programmed DPM (digital power management), i.e. the arbitrator. This control complex also knows the power, thermal and current limits of the GPU (BIOS, driver), which it can read from the respective registers. Within these limits, it controls the temperatures, all voltages, clock frequencies and fan speeds and always tries to get the maximum performance out of the card. If even one of these limits is exceeded, the arbitrator can reduce the voltage or clock rate.
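The decision rule described above can be sketched as a tiny control step in Python. This is only an illustration under stated assumptions: the names `Reading`, `exceeded_limits` and `next_boost_step`, the limit values and the "one step up or down per sample" policy are all hypothetical, and the real arbitrator is far more elaborate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One set of values; used for both telemetry samples and limits."""
    power_w: float
    temp_c: float
    current_a: float


def exceeded_limits(telemetry, limits):
    """Names of all limits that the current telemetry sample exceeds."""
    return [
        name
        for name in ("power_w", "temp_c", "current_a")
        if getattr(telemetry, name) > getattr(limits, name)
    ]


def next_boost_step(step, top_step, telemetry, limits):
    """Drop one boost step if any limit is exceeded, otherwise climb one."""
    if exceeded_limits(telemetry, limits):
        return max(step - 1, 0)
    return min(step + 1, top_step)


# Hypothetical limits and three consecutive telemetry samples.
LIMITS = Reading(power_w=320.0, temp_c=83.0, current_a=30.0)
SAMPLES = (
    Reading(300.0, 70.0, 25.0),  # everything within limits
    Reading(335.0, 72.0, 27.0),  # power limit exceeded
    Reading(310.0, 74.0, 26.0),  # within limits again
)

step = 3
for sample in SAMPLES:
    step = next_boost_step(step, 5, sample, LIMITS)
    print(step, exceeded_limits(sample, LIMITS))
```

A single exceeded input is enough to step down, which mirrors the "if even one limit is exceeded" behaviour from the text.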
The disadvantage of such a publicly visible (and, with suitable software, also customizable) “frequency/voltage curve” is that it is actually not so easy to define in general terms. All the end user can really modify is a partial shift on top of previously calculated limit and reference values that are individual to each chip under the current conditions! This is where the so-called VFE (Voltage Frequency Engine) comes into play. It provides a flexible framework to specify or evaluate the relationship between clock frequency and voltage, which normally depends on speedo and temperature. Or to put it in a nutshell: the calculated voltage for each frequency point of such a curve is actually a function of the GPU’s speedo, which is determined by “continuous virtual binning”.
You guessed it, now it gets a little trickier. Remember the first paragraphs on binning and the ATE flow: Continuous Virtual Binning (CVB) uses statistical models and algorithms to analyze the performance of semiconductor components continuously and virtually instead of physically testing them. In our case of the GPU, “Continuous Virtual Binning” means that the voltage decreases by one regular step of 10 mV for each corresponding increase of the speedo (based on a linear or quadratic equation). In addition, the voltage for each frequency point is a function of the temperature of the GPU.
The achievable clock frequency, and therefore the voltages of the GPU, depend on the temperature. Semiconductors (p-type and n-type) can have either a positive or a negative temperature coefficient, and as the temperature increases, the carrier mobility in MOS transistors decreases. The threshold voltage (Vt) shifts with temperature as well, but at the voltages in question the loss of mobility dominates: the drive current drops and the transistor becomes slower. At a fixed voltage, an increase in temperature therefore lowers the achievable clock frequency, and vice versa. This temperature dependency is captured in the same quadratic equation that uses the chip’s speedo. Since the frequency specified in the steps must logically remain locked, the voltage has to increase as the temperature rises in order to hold the required frequency (and can drop again as the temperature falls). This quadratic equation, which describes the relationship between the frequencies and their corresponding voltages, is captured by the so-called VFE frame. The frame is stored as part of the configuration data in the VBIOS firmware in the EEPROM and can no longer be overwritten.
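The following Python sketch shows what such a quadratic relationship could look like. It is not NVIDIA’s actual equation: the functional form, the reference point and every coefficient in `VfeCoefficients` are assumptions chosen purely for illustration. Only the directions come from the text: a higher speedo lowers the required voltage in 10 mV steps, and a higher temperature raises it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VfeCoefficients:
    """Hypothetical coefficients; real values live in the VBIOS."""
    speedo_lin: float = -10.0   # mV per speedo unit (faster chip, less voltage)
    speedo_quad: float = 0.0    # mV per speedo unit squared (0 = purely linear)
    temp_lin: float = 0.4       # mV per degree Celsius
    temp_quad: float = 0.002    # mV per degree Celsius squared
    ref_speedo: float = 100.0   # speedo at which base_mv applies
    ref_temp_c: float = 50.0    # temperature at which base_mv applies


def required_voltage_mv(base_mv, speedo, temp_c, c=VfeCoefficients()):
    """Voltage needed to hold one fixed frequency point of the curve."""
    ds = speedo - c.ref_speedo
    dt = temp_c - c.ref_temp_c
    return (
        base_mv
        + c.speedo_lin * ds + c.speedo_quad * ds ** 2
        + c.temp_lin * dt + c.temp_quad * dt ** 2
    )


# Same frequency point (900 mV at the reference conditions), three scenarios.
print(required_voltage_mv(900, speedo=100, temp_c=50))  # reference chip
print(required_voltage_mv(900, speedo=101, temp_c=50))  # better silicon
print(required_voltage_mv(900, speedo=100, temp_c=70))  # same chip, hotter
```

Because speedo and the coefficients are fixed per chip, the only input in this sketch that the user can move at runtime is the temperature, which is exactly why cooling matters so much.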
The main function of the VFE is therefore to dynamically adjust the voltage and frequency of the GPU in order to optimize performance and energy efficiency. To do this, the VFE works closely with the PMU (Power Management Unit) to provide the correct voltage and frequency values for different operating states and load conditions. I will come to this in the next paragraph. In summary, the Voltage Frequency Engine and the Speedo work together: the VFE is responsible for adjusting the voltage and frequency, while the Speedo reflects the PVT variations and provides the information the VFE needs to make the right adjustments.
Now, let’s take a breath, because it’s not as complicated as it might sound at first. To make a long story short: you can neither trick nor override the Speedo. What you can change manually is always based on the stored Speedo and the values of the VFE, over which the end customer also has no influence. And now we also know that good cooling is often worth more than the most brutal OC. With air-cooled cards, it’s the dreaded vicious circle: increasing the power limit for a higher clock rate also leads to higher temperatures and therefore to lower clock rates again. You can do this forever and the card won’t get any faster. Just thirstier. This is exactly why the opposite approach, undervolting, is so clever: it enables higher boost steps thanks to lower temperatures. In effect, an almost lossless OC for free.
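A rough back-of-the-envelope sketch in Python illustrates why undervolting pays off. It uses the common approximation that dynamic power scales with V² · f and a simple steady-state thermal model. The lumped constant `k`, the thermal resistance and all operating points are hypothetical, and leakage power is ignored entirely.

```python
def dynamic_power_w(voltage_v, clock_mhz, k=0.15):
    """Approximate dynamic power as k * V^2 * f (k is a made-up lumped constant)."""
    return k * voltage_v ** 2 * clock_mhz


def steady_state_temp_c(power_w, ambient_c=25.0, r_th_c_per_w=0.2):
    """Steady-state GPU temperature for a cooler with a fixed thermal resistance."""
    return ambient_c + r_th_c_per_w * power_w


CLOCK_MHZ = 1900  # same clock in both cases
for label, volts in (("stock", 1.00), ("undervolted", 0.90)):
    power = dynamic_power_w(volts, CLOCK_MHZ)
    temp = steady_state_temp_c(power)
    print(f"{label}: {power:.0f} W, {temp:.0f} °C")
```

At the same clock, the lower voltage reduces power quadratically, and the cooler then settles at a lower temperature. That thermal headroom is what lets the boost logic hold higher steps.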