Question XFX 5700 XT Cards

Anthony

Member
Member since
Nov 28, 2020
Posts
22
Reaction score
6
Points
2
Preface: @Mini_Me has demonstrated profound knowledge when modding the 5700 XT series of graphics cards, and for those looking for an awesome tutorial, please reference https://www.igorslab.de/community/threads/gigabyte-5700-xt-bios-mod-fails.3304/.

Story: At present, I own 40 XFX 5700 XT video cards, and I have 40 more on order for a total of 80 cards. With this many cards, squeezing out an additional 1 MegaHash (MH) from each card would translate into +80MH. Out of the box, with the stock BIOS, these XFX 5700 XT cards deliver about 51MH per card. With BIOS modifications and tweaks, I can obtain between 54MH and 56MH per card.

Configuration: I have 5 rigs in operation and soon I will have 5 more in operation. Each rig uses 8 cards. I am using HiveOS to manage my rigs.

The importance of bios modifications: Basically, using bios modifications allows each card to provide a better hashrate -- and as hashrate increases, potential ETH mined increases. (More Hash == More $$). Bios modifications also change how each card consumes electricity -- and aside from the initial cost of computer components, electricity is the second greatest cost when it comes to mining.
  • Stock Bios: 80 Cards x 51MH from each card = 4.080GH
  • Modified Bios: 80 Cards x (54MH to 56MH) = 4.320GH to 4.480GH.
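The arithmetic in the two bullets above can be checked with a short sketch. The card count and per-card hashrates come from the post; the function name `farm_hashrate_gh` is only an illustrative helper.

```python
# Farm-wide hashrate from per-card hashrate; all figures are taken from the post.
CARDS = 80


def farm_hashrate_gh(cards, mh_per_card):
    """Return the total hashrate in GH for `cards` cards at `mh_per_card` MH each."""
    return cards * mh_per_card / 1000


stock = farm_hashrate_gh(CARDS, 51)  # stock BIOS
low = farm_hashrate_gh(CARDS, 54)    # modified BIOS, lower bound
high = farm_hashrate_gh(CARDS, 56)   # modified BIOS, upper bound

print(f"stock:    {stock:.3f} GH")
print(f"modified: {low:.3f} to {high:.3f} GH")
print(f"gain:     {(low - stock) * 1000:.0f} to {(high - stock) * 1000:.0f} MH")
```

In other words, the modified BIOS is worth roughly 240MH to 400MH across the whole farm.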
The delicate balance in bios modification reflects a specific relationship between: 1.) Power Consumption, 2.) Temperature, 3.) Hashrate, and 4.) Stability vs. Instability. In my opinion, the goal of proper bios modification means that: (a.) each card consumes as little power as necessary, (b.) each card remains within "safe" temperatures to prevent card damage, (c.) each card produces the maximum hashrate, (d.) each card remains stable, which is to say that the card does not produce "invalid" shares or other errors that get in the way of making money.

The above mentioned thread link reflects a dialogue between two members, @Mini_Me and @AlleyCat. The thread is several pages long because it reflects a series of tests (trials, errors, and new trials with new errors, etc.) with different bios values and with different values in HiveOS. There are multiple variables in play (both on the bios tweaking side with MPT and RBE and on the software tweaking side in HiveOS), and changing one variable influences other variables. The thread is so long because many different settings were tried out in order to achieve the best possible result for specific card manufacturers, in this case manufacturer Sapphire and manufacturer Gigabyte.

This post in particular captured my attention: https://www.igorslab.de/community/threads/gigabyte-5700-xt-bios-mod-fails.3304/post-79315.

RBE modified values in use:
Option 1 (preferable and recommended): apply the Apple Inc. VRAM timing straps linked below once for MT61K256M32 (Micron) and save the vBIOS. Then load the saved vBIOS, apply the straps again for K4Z80325BC (Samsung) if that entry exists, and save the vBIOS again.
https://www.igorslab.de/community/a...-mt61k256m32_gddr6_optimized_timings-zip.6544

MPT modified values in use:
Features Tab:
PPTable Features -> Feature Control = Nothing done. Left at default.
Overdrive Features = All Boxes Checked.

Overdrive Limits Tab:
GFX Maximum Clock = 1550
Memory Maximum Clock = 1000
Power Limit Maximum = 0
Power Limit Minimum = 0
Memory Timing Control = 1
Fan RPM Maximum = 3500
Fan RPM Minimum = 1100
Fan Acoustic Limit RPM Maximum = 3500
Fan Acoustic Limit RPM Minimum = 1100
Zero RPM Control = 0

Power and Voltage Tab:
Maximum Voltage GFX = 900
Maximum Voltage SoC = 1150
Minimum Voltage GFX = 700
Minimum Voltage SoC = 750
Power Limit GPU = 140
TDC Limit GFX (A) = 140
TDC Limit SoC (A) = 14

Frequency Tab:
GFX Maximum = 1400
GFX Minimum = 300
SoC Maximum = 1267
SoC Minimum = 507
Memory DPM 0 = 100
Memory DPM 1 = 500
Memory DPM 2 = 625
Memory DPM 3 = 960

Curve Tab:
(Left Alone, no change).
AVFS (GHz->V)
Override box is not ticked.
a = 0.017810
b = -0.047280
c = 0.054020
StaticVoltageOffset (GHz->V) = 0.000000

Fan Tab:
PWM Minimum = 15
Fan Acoustic Limit RPM = 1550
Fan Throttling RPM = 3200
Fan Maximum RPM = 3500
Fan Target Temperature = 85
Fan GFX Clock = 800
Zero RPM Enable Box is ticked.
Stop Temperature = 60
Start Temperature = 68
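With this many hand-entered values, a quick consistency pass can catch a swapped minimum and maximum before flashing. The following is only a sketch: the numbers are copied from the MPT tabs above, while the checks themselves (lower bound not above upper bound, memory DPM states ascending) and the helper name `find_problems` are my own assumptions, not rules taken from MPT.

```python
# Values copied from the MPT tabs listed in the post.
# The checks are illustrative assumptions, not MPT validation rules.
RANGES = {
    "Fan RPM": (1100, 3500),
    "Voltage GFX (mV)": (700, 900),
    "Voltage SoC (mV)": (750, 1150),
    "GFX clock (MHz)": (300, 1400),
    "SoC clock (MHz)": (507, 1267),
    "Zero RPM stop/start temperature": (60, 68),
}
MEMORY_DPM = [100, 500, 625, 960]  # Memory DPM 0 to 3


def find_problems(ranges, memory_dpm):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [
        f"{name}: lower bound {low} exceeds upper bound {high}"
        for name, (low, high) in ranges.items()
        if low > high
    ]
    if memory_dpm != sorted(memory_dpm):
        problems.append("Memory DPM states are not in ascending order")
    return problems


problems = find_problems(RANGES, MEMORY_DPM)
print(problems or "no inverted ranges found")
```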

As you can see, in order to achieve stability (no invalid shares, rejected shares, or PhoenixMiner reboots), I have to run most cards at 1375 core. Other cards I had to reduce to 1325 core. Some cards run at memory 930, some at 905, some at 900, and some at 890.

I know my cards can give better performance. Even though I have reduced the power values to cut the number of invalid and rejected shares, PhoenixMiner still reboots from time to time, and sometimes a card will "stop" and disappear, which triggers the PhoenixMiner reboot. For these cards, I reduce the values.

Any advice from master @Mini_Me would be appreciated.


RIG #1
RIG1.png

RIG #2
RIG2.png

RIG #3
RIG3.png

RIG #4
RIG4.png

RIG #5
RIG5.png
 
Please use one of the settings below to decrease the memory temperature.

Core clock, VDD : Memory clock, VDDCI, MVDD, Hashrate in TeamRedMiner

- 1330 MHz, 750 mV : 850 (1700/2) MHz, 800 mV, 1350 mV, ~52.10 MH/s
- 1345 MHz, 750 or 760 mV : 865 (1730/2) MHz, 810 mV, 1350 mV, ~53.10 MH/s
- 1360 MHz, 760 or 770 mV : 880 (1760/2) MHz, 820 mV, 1350 mV, ~54.10 MH/s
Thank you very much @Mini_Me!! I'm going to try these values and I will keep you informed of the results obtained.
 
Hey guys. Please try the B-mode of TRM.

Can Mini_Me help us with some core clock settings that should work?

It is supposed to shave 9 W off each card!

Navi 5700XT/5700 Transition
===========================
If a driver that supports large allocations is installed, the miner will default
5700XT/5700s to the new B-mode, using as much vram as possible on the gpu. The
main benefit from using this mode is support for high hashrates at significantly
lower core clk+voltage settings compared to A-mode and previous TRM versions.
For a 5700XT/5700 tuned to previous TRM versions, our tests indicate that you
can drop core clk ~100 MHz and voltage -50mV in B-mode while still preserving
hashrate (although often with a tiny loss). The efficiency increase is an
obvious trade-off win even with a small hashrate decrease. If you're on a well
defined voltage curve, you shouldn't have to touch your voltage settings,
lowering the core clk is enough, otherwise you should lower voltage manually as
well. You might need to retune if gpus crash at the lower settings, slightly
raising voltage until stable again.
If you for some reason can't run in the new B-mode, you can switch to the A-mode
manually with --eth_config=Axxx (if you're not familiar with this argument, see
USAGE.txt for more info).
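As a rough illustration of the retuning advice in the release note above, the sketch below derives a B-mode starting point from an existing A-mode tune. The 100 MHz and 50 mV offsets are the approximate figures from the note; the function name and the example input (1375 MHz core from the first post, with 900 mV assumed as the matching voltage) are only placeholders.

```python
# Approximate offsets quoted in the TRM release note for B-mode.
CORE_DROP_MHZ = 100
VOLTAGE_DROP_MV = 50


def b_mode_starting_point(core_mhz, core_mv):
    """Return a (core MHz, core mV) pair to start retuning from in B-mode.

    This is only a starting point: per the release note, raise the voltage
    slightly if the GPU crashes at the lower settings.
    """
    return core_mhz - CORE_DROP_MHZ, core_mv - VOLTAGE_DROP_MV


# Hypothetical A-mode tune: 1375 MHz core at an assumed 900 mV.
print(b_mode_starting_point(1375, 900))  # (1275, 850)
```

On a card with a well-defined voltage curve, the note suggests that lowering only the core clock may be enough, so the voltage half of the result can be ignored there.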
 
Hi guys!! Sorry for the late update to this thread... busy days at work.
I have tested the 3 sets of values that @Mini_Me kindly gave me... but with no luck regarding mem temps, and only a small reduction in power consumption with the last sets of values.

First, here are the values from before testing (using the numbers from page 32 of @Mini_Me's guide):

Captura de pantalla 2021-01-31 a las 1.24.56 p.m..png

Then... with the first set of values, after 2 hours of mining (these values gave me a couple of reboots with GPU Dead errors):

Captura de pantalla 2021-01-31 a las 2.36.11 p.m..png

Captura de pantalla 2021-01-31 a las 2.36.41 p.m..png

Second set of values - 2 hours mining:
Captura de pantalla 2021-01-31 a las 3.28.33 p.m..png

Third set of values - 2 hours mining:

Captura de pantalla 2021-01-31 a las 4.39.29 p.m..png

I have to say that, as I mentioned in my first post, the reboots with GPU Dead errors have appeared from time to time since before the tests, and they are always on GPU1... I have tried switching cards and risers, but the errors stay on GPU1; now I'm testing without using that particular PCIe slot on the motherboard.
For now I will follow @Mini_Me's advice and turn off the rig for an hour every 2 days. I have not yet tested the vBIOS from @Anthony on my cards... I think that's the next step.
Thanks, guys, for your valuable help.
 
I had a similar issue with an RX 5700 (non-XT), and I solved it by setting the following miner option:
--eth_config=B,B,B,B,A
This setting puts GPU4 in mode A and GPUs 0-3 in mode B.

This is a workaround. I would also try raising the maximum TDC Limit GFX by 4 A.
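For rigs with more cards, the per-GPU list can be generated rather than typed by hand. This is a minimal sketch that only reproduces the comma-separated form shown in the post; the helper name `eth_config_arg` is hypothetical and not part of TeamRedMiner.

```python
def eth_config_arg(gpu_count, a_mode_gpus):
    """Build an --eth_config argument with mode A for the listed GPU indices
    and mode B for all others. GPU indices start at 0, as in the post."""
    invalid = [i for i in a_mode_gpus if not 0 <= i < gpu_count]
    if invalid:
        raise ValueError(f"GPU index out of range: {invalid}")
    modes = ["A" if i in a_mode_gpus else "B" for i in range(gpu_count)]
    return "--eth_config=" + ",".join(modes)


# Five GPUs with GPU4 forced to mode A, as in the post.
print(eth_config_arg(5, {4}))  # --eth_config=B,B,B,B,A
```

For an 8-card rig where GPU1 is the unstable card, `eth_config_arg(8, {1})` would give `--eth_config=B,A,B,B,B,B,B,B`.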
 
Hi @AlleyCat!! Thanks for your answer... One newbie question: do I have to set the card that has the GPU Dead errors to mode A and the others to mode B?
Thanks in advance
 
Hi and welcome,

Could you please tell us your power supply (PSU) model and rating?
 
Correct: the card with the GPU Dead errors goes to mode A and the others stay in mode B. TRM rev. 8.0.0 defaults RX 5700 GPUs to mode B. By forcing the failing GPU to use mode A, it will run as it did with the previous kernel.

Please note that this is just a workaround.
 
Hi!

With the settings you posted, @Mini_Me, what SoC frequency can we use in the HiveOS OC profile? Is it possible to bring power consumption and heat down a bit, or with Micron memory is it recommended to stay at the default, as you said in your guide?

1612261178604.png
This is the rig, and I would like to bring the heat down.
 
Hi and welcome,

For long-term stability, the default SoC maximum clock is recommended.

You can still set the SoC maximum clock to 957 MHz for memory clocks of 910 MHz and below, or to 1093 MHz for memory clocks from 915 MHz up to 950 or 960 MHz.
 
By the way, if the SoC maximum clock is left at default, DPM scaling will automatically set the values based on the memory clock.
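The SoC clock rule from this exchange can be written down as a small lookup. This is a sketch under two assumptions of mine: the upper bound is taken as 960 MHz (the post says "950 or 960"), and memory clocks the post does not cover (for example 911 to 914 MHz) return `None`, meaning "leave the SoC maximum clock at default". The function name is hypothetical.

```python
def suggested_soc_max_mhz(memory_clock_mhz):
    """Return the manual SoC maximum clock suggested in the post for a given
    memory clock, or None where the post gives no value (keep the default)."""
    if memory_clock_mhz <= 910:
        return 957
    if 915 <= memory_clock_mhz <= 960:  # upper bound assumed to be 960 MHz
        return 1093
    return None


# Memory clocks mentioned earlier in the thread.
for mem in (930, 905, 900, 890):
    print(mem, suggested_soc_max_mhz(mem))
```

The recommendation in the thread remains the default SoC maximum clock; these values only apply if it is set manually.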
 
Could you please tell us how the GPUs are connected to the power supply?
Hi @Mini_Me!! Sorry for the late update... When we talked last night it was late here, and I go to work early. The GPUs are connected to the PSU with 1x8-pin and 1x6-pin connectors, and the risers with Molex; I know that 6-pin connectors are better for risers, but I don't have enough of them on my PSUs, and I believe adapters aren't good either. The GPUs are MSI RX 5700 MECH OC (non-XT). I have read that a lot of people have high mem temps with this model in particular; I'm thinking of trying the thermal pad solution too. Thank you for your constant support.
 
Hi and welcome,

No worries. I believe 2 GPUs are connected to 1 PSU, correct?

Please remember that the usable output of your 650 watt Gold PSU is approximately 520 watts, and the cables and risers should be of good quality to minimize risks and sources of problems.

Regarding the memory temperature, you can decrease the memory clock until the desired temperature is reached.
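The PSU figures in this reply imply a usable fraction of 520 / 650 = 0.8. A minimal sketch of the resulting headroom check follows; the per-card load of 140 W is only a placeholder borrowed from the "Power Limit GPU" value in the first post, and the MSI cards discussed here may draw a different amount.

```python
# Usable fraction implied by the reply: about 520 W out of a 650 W rating.
USABLE_FRACTION = 520 / 650


def usable_watts(rated_watts):
    """Approximate sustained output for a PSU with the given rating."""
    return rated_watts * USABLE_FRACTION


def headroom_watts(rated_watts, loads_watts):
    """Usable output minus the summed loads; a negative result means overload."""
    return usable_watts(rated_watts) - sum(loads_watts)


loads = [140, 140]  # two GPUs at a placeholder 140 W each
print(f"usable:   {usable_watts(650):.0f} W")
print(f"headroom: {headroom_watts(650, loads):.0f} W")
```

Riser and motherboard draw are not included, so the real headroom would be somewhat smaller.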
 
Hi @Mini_Me!!! Thanks a lot for your constant help, and sorry to bother you again with a beginner question... Where do I set the Core State to 1 and Mem State to 1 for the RX 5700s?? Are these values set in the page 32 guide? From what I have read, these values in the HiveOS OC profiles aren't for RX 5700 cards.
Thanks in advance
 
Hi and welcome,

There is no need for these options anymore, which is why I removed them from the guide.
 