Question XFX 5700 XT Cards

Anthony

Member
Member since
Nov 28, 2020
Posts
22
Reaction score
6
Points
2
Preface: @Mini_Me has demonstrated profound knowledge when modding the 5700 XT series of graphics cards, and for those looking for an awesome tutorial, please reference https://www.igorslab.de/community/threads/gigabyte-5700-xt-bios-mod-fails.3304/.

Story: I currently own 40 XFX 5700 XT video cards and have 40 more on order, for a total of 80 cards. With this many cards, squeezing an additional 1 megahash (MH) out of each card translates into +80 MH. Out of the box, with the stock BIOS, these XFX 5700 XT cards deliver about 51 MH per card. With BIOS modifications and tweaks, I can obtain between 54 MH and 56 MH per card.

Configuration: I have 5 rigs in operation and soon I will have 5 more in operation. Each rig uses 8 cards. I am using HiveOS to manage my rigs.

The importance of bios modifications: Basically, using bios modifications allows each card to provide a better hashrate -- and as hashrate increases, potential ETH mined increases. (More Hash == More $$). Bios modifications also change how each card consumes electricity -- and aside from the initial cost of computer components, electricity is the second greatest cost when it comes to mining.
  • Stock Bios: 80 Cards x 51MH from each card = 4.080GH
  • Modified Bios: 80 Cards x (54MH || 56MH) = (4.320GH || 4.480GH).
The delicate balance in bios modification reflects a specific relationship between: 1.) Power Consumption, 2.) Temperature, 3.) Hashrate, and 4.) Stability vs. Instability. In my opinion, the goal of proper bios modification means that: (a.) each card consumes as little power as necessary, (b.) each card remains within "safe" temperatures to prevent card damage, (c.) each card produces the maximum hashrate, (d.) each card remains stable, which is to say that the card does not produce "invalid" shares or other errors that get in the way of making money.
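
The fleet totals above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the post; the function name `fleet_hashrate_gh` is made up for illustration, and the per-card rates are the ones quoted above.

```python
def fleet_hashrate_gh(cards: int, mh_per_card: float) -> float:
    """Total fleet hashrate in GH, given a per-card rate in MH."""
    return cards * mh_per_card / 1000.0


CARDS = 80       # card count from the post
STOCK_MH = 51    # stock BIOS rate per card, from the post

for label, mh in [("stock", 51), ("modified, low", 54), ("modified, high", 56)]:
    total_gh = fleet_hashrate_gh(CARDS, mh)
    gain_mh = CARDS * (mh - STOCK_MH)
    print(f"{label}: {total_gh:.3f} GH (+{gain_mh} MH over stock)")
```

At 80 cards, each extra 1 MH per card adds 80 MH to the fleet total, which is why small per-card gains matter at this scale.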

The thread linked above is a dialogue between two members, @Mini_Me and @AlleyCat. It runs to several pages because it documents a series of tests (trials, errors, and new trials with new errors) with different BIOS values and different HiveOS values. Multiple variables are in play, both on the BIOS side (MPT and RBE) and on the software side (HiveOS), and changing one variable influences the others. Many different settings were tried in order to achieve the best possible result for specific card manufacturers, in this case Sapphire and Gigabyte.

This post in particular captured my attention: https://www.igorslab.de/community/threads/gigabyte-5700-xt-bios-mod-fails.3304/post-79315.

RBE modified values in use:
Option 1 (preferable and recommended): apply the Apple Inc. VRAM timing straps linked below, first for MT61K256M32 (Micron), and save the vBIOS. After that, load the saved vBIOS, apply the straps again for K4Z80325BC (Samsung) if that entry exists, and save the vBIOS again.
https://www.igorslab.de/community/a...-mt61k256m32_gddr6_optimized_timings-zip.6544

MPT modified values in use:
Features Tab:
PPTable Features -> Feature Control = Nothing done. Left at default.
Overdrive Features = All Boxes Checked.

Overdrive Limits Tab:
GFX Maximum Clock = 1550
Memory Maximum Clock = 1000
Power Limit Maximum = 0
Power Limit Minimum = 0
Memory Timing Control = 1
Fan RPM Maximum = 3500
Fan RPM Minimum = 1100
Fan Acoustic Limit RPM Maximum = 3500
Fan Acoustic Limit RPM Minimum = 1100
Zero RPM Control = 0

Power and Voltage Tab:
Maximum Voltage GFX = 900
Maximum Voltage SoC = 1150
Minimum Voltage GFX = 700
Minimum Voltage SoC = 750
Power Limit GPU = 140
TDC Limit GFX (A) = 140
TDC Limit SoC (A) = 14

Frequency Tab:
GFX Maximum = 1400
GFX Minimum = 300
SoC Maximum = 1267
SoC Minimum = 507
Memory DPM 0 = 100
Memory DPM 1 = 500
Memory DPM 2 = 625
Memory DPM 3 = 960

Curve Tab:
(Left Alone, no change).
AVFS (GHz->V)
Override box is not ticked.
a = 0.017810
b = -0.047280
c = 0.054020
StaticVoltageOffset (GHz->V) = 0.000000

Fan Tab:
PWM Minimum = 15
Fan Acoustic Limit RPM = 1550
Fan Throttling RPM = 3200
Fan Maximum RPM = 3500
Fan Target Temperature = 85
Fan GFX Clock = 800
Zero RPM Enable Box is ticked.
Stop Temperature = 60
Start Temperature = 68

As you can see, in order to achieve stability (prevent invalid shares, rejected shares, and PhoenixMiner reboots), I have to run most cards at 1375 core. Other cards I had to reduce to 1325 core. Some cards run at memory 930, some at 905, some at 900, and some at 890.

I know my cards can give better performance. Even though I have reduced power values to cut the number of invalid and rejected shares, PhoenixMiner still reboots from time to time, and sometimes a card will "stop" and disappear, which triggers a PhoenixMiner reboot. For these cards, I reduce values.

Any advice from master @Mini_Me would be appreciated.


RIG #1
RIG1.png

RIG #2
RIG2.png

RIG #3
RIG3.png

RIG #4
RIG4.png

RIG #5
RIG5.png
 
PS: I do not touch the BIOS switch. I leave all switch positions at their default, as they came from the box.

I also posted a picture of the wall-power reading for the whole of Rig #5: 1071.5 W, or 1.0715 kW.

Default-Bios-Switch.jpg
RIG#5Watts.jpg
 
Greetings and welcome,

This is a very good and comprehensive post; thank you, it is much appreciated.

First, I am not a master. I am only someone who likes to share whatever he learns, within his capabilities, because he cares about others. Besides me, many other contributors have contributed to that thread and still do.

Regarding your rig, please check the latest updated guide: the simple guide is on pages 2, 28, and 32, and the advanced guide for cards with a sensitive power curve is on page 15. The link attached below points to the guide on page 32:


After you update your cards with the new guide, test them and post an update whenever possible.
 
Regarding the BIOS switch: it is highly recommended to set it to the Silent vBIOS position, because the OC vBIOS may have a security lock that allows only one reflash and then locks out any further reflashing.
 
By the way, regarding the miner, I highly recommend TeamRedMiner, as it is optimized for mining on AMD cards.

Attached below are a couple of notes on choosing the right power supply to buy:


 
  1. TeamRedMiner (55.5 MH per XFX 5700 XT) produces less hash per card than PhoenixMiner (56.60 MH per XFX 5700 XT). Over 8 cards per rig, the rate went from about 453 MH to about 444 MH. However, if the loss in MH produces a more stable system, then that's okay in my book. If PhoenixMiner causes instability or a system reboot, you are not making money during the reboot process. If a system reboots a lot over 24 hours, you lose money each time it reboots.
  2. Average Core Temp Across 40 cards = 47.425
  3. Average Mem Temp Across 40 cards = 72.75
  4. Some cards produced errors (invalid shares, rejected shares) within 5 minutes of operation. For these cards, I lowered mem clock from 905 to 900, and left all other values alone. That seems to have helped.
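
The trade-off in point 1 can be put into numbers. The sketch below estimates how much downtime per day the faster miner could afford before its average output falls to the slower miner's rate. It assumes the slower miner has no downtime at all and that both reported hashrates are accurate; neither assumption comes from the thread, and the function name is hypothetical.

```python
def breakeven_downtime_minutes(stable_mh: float, fast_mh: float,
                               period_minutes: float = 24 * 60) -> float:
    """Downtime per period at which the faster miner's average rate
    drops to the stable miner's rate (stable miner assumed always up)."""
    if fast_mh <= stable_mh:
        return 0.0  # the "faster" miner is not faster; no downtime budget
    return period_minutes * (1.0 - stable_mh / fast_mh)


CARDS_PER_RIG = 8
TRM_MH, PHOENIX_MH = 55.5, 56.60  # per-card rates quoted in the post

print(f"TeamRedMiner rig: {CARDS_PER_RIG * TRM_MH:.1f} MH")
print(f"PhoenixMiner rig: {CARDS_PER_RIG * PHOENIX_MH:.1f} MH")
print(f"Break-even downtime: "
      f"{breakeven_downtime_minutes(TRM_MH, PHOENIX_MH):.1f} min/day")
```

Under these assumptions the budget is roughly 28 minutes per day, so a rig that loses more than that to reboots would do better on the slower but stable miner.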
Overclocking Template I use in HiveOS:
Core Clock = 1420
Core State = 2
Core Voltage = 790
Memory Controller Voltage = 780
Memory Clock = 905 ... (Adjusted down to 900 for cards that give errors within 5 mins of startup)
Mem State = blank
Memory Voltage = 1350
Fan % = blank
Power Limit = blank
Aggressive Undervolting = off
Amdmemtweak REF = blank

MPT:
Features Tab:
PPTable Features = Not Touched. Feature Control = Nothing Touched.
Overdrive Features = All boxes are ticked.

Overdrive Limits Tab:
GFX Maximum Clock = 1440
Memory Maximum Clock = 1000
Power Limit Maximum = 0
Power Limit Minimum = 0
Memory Timing Control = 1
Fan RPM Maximum = 3500
Fan RPM Minimum = 1100
Fan Acoustic Limit RPM Maximum = 3500
Fan Acoustic Limit RPM Minimum = 1100
Zero RPM Control = 0

Power and Voltage Tab:
Maximum Voltage GFX = 1050
Maximum Voltage SoC = 1050
Minimum Voltage GFX = 750
Minimum Voltage SoC = 750
Power Limit GPU = 140
TDC Limit GFX (A) = 120
TDC Limit SoC (A) = 12

Frequency Tab:
GFX Maximum = 1270
GFX Minimum = 300
SoC Maximum = 1267
SoC Minimum = 507
Memory DPM 0 = 100
Memory DPM 1 = 500
Memory DPM 2 = 625
Memory DPM 3 = 950

Curve Tab:
Nothing touched.
AVFS (GHz->V) Override - not ticked.
a = 0.017810
b = -0.047280
c = 0.054020
StaticVoltageOffset (GHz->V) = 0.000000

Fan Tab:
PWM Minimum = 15
Fan Acoustic Limit RPM = 1550
Fan Throttling RPM = 3200
Fan Maximum RPM = 3500
Fan Target Temperature = 85
Fan Target GFX Clock = 800
Zero RPM Enable = is ticked.
Stop Temperature = 60
Start Temperature = 70

RBE:
I am using the Apple straps.
 
Splendid work,

Regarding TeamRedMiner, it is more stable and reports a real hashrate, unlike PhoenixMiner.

If you want to increase the memory clock to 905 or 910 without invalid or rejected shares, increase VDDCI by 20 mV, from 780 to 800. If there are still invalid or rejected shares, set VDDCI to 850 mV.

Regarding the core state, it is better to set it to 1, and then you are done. After that, we can check your motherboard BIOS settings if you want.
 
By the way, which miner were the cards running that produced errors and rejected shares at a memory clock of 905 MHz: TeamRedMiner or PhoenixMiner?
 
Here is feedback from my experience with the XFX cards.

1) Follow the Power Table on Page 32.
2) Set the SoC maximum clock to 1093. The SoC will then drop to 950 when the memory runs at 910 MHz or less. I find this to have a significant impact on power consumption at the wall.
3) Cluster your cards based on memory, if possible.
4) The combination of Core 1380 and memory 912 (or 912) gives me the optimal speed/power
5) Adjust the VDD to 1.8. If the card is responsible for crashing the system, increase the VDD by 10 mV until the card is stable.
6) I am running the Micron memory MVDD at 1325 to reduce power at the wall. It gives me a low temperature for the memory.
7) For cards with a high memory temperature (e.g., above 88 °C), I drop the core/VDD to 1360/1.8 to reduce the heat. I try to deal with bad heat pads without opening the cards. Some users address this issue by fixing the heat conductivity of these bad cards.
8) Fan in Hive is on Auto. My target core temp is 50 °C and my target memory temp is 88 °C (because of the reduced core and memory speeds). The fans can take up to 5 W each when they kick in.
9) Maintain large headroom with the power supply. The risers can take up to 40 W for each GPU, so don't overload the 6pin SATA connections. I think 2 GPUs are within the PSU load capability.

I am getting 502 MH/s with 9 GPUs at 1340 W at the wall, which is a ratio of about 0.375 MH/s per watt. This figure can be used to compare the efficiency between rigs.

This information, in combination with the work you have done, could help others to set up stable rigs.
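
The ratio quoted above is simply hashrate divided by wall power. A minimal sketch, with a hypothetical function name, for comparing rigs on the same basis:

```python
def efficiency_mh_per_watt(hashrate_mh: float, wall_watts: float) -> float:
    """Rig efficiency as MH/s per watt measured at the wall."""
    if wall_watts <= 0:
        raise ValueError("wall_watts must be positive")
    return hashrate_mh / wall_watts


# Figures from the post: 502 MH/s across 9 GPUs at 1340 W at the wall.
print(round(efficiency_mh_per_watt(502, 1340), 3))  # prints 0.375
```

A higher value means more hash per watt. Both numbers should be taken over the same time window for the comparison between rigs to be fair.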

Regards, and thanks.

AlleyCat
 
I forgot to say: use TRM (TeamRedMiner). Don't use PhoenixMiner.
 
Quick points, given this kind of scale (80 cards):
1. TRM is the best miner of the lot. PhoenixMiner exaggerates hashrate. Test it yourself if you want to; run at least a 2-day test.
2. Try to clock the memory at 912, but go easy on it. 912 MHz on memory also clocks the SoC at 950, which saves power.
3. Keep tabs on GDDR6 memory temps.
 
Greetings and welcome,

Regarding a memory clock of 912 MHz while the SoC max clock is capped at 950 MHz: this may well lead to card failure and even a system crash because of clock overshoot. I explain what that means below.

The max memory clock for an SoC clock of 950 MHz is 1825 MHz (912 MHz). A set value is not truly fixed in practice, and an overshoot of at least 5 MHz occasionally happens, which gives 1825 + 5 = 1830 MHz (915 MHz). That makes the SoC max clock try to rise from 950 MHz to 1085 MHz, and because the SoC max clock was capped at 950 MHz, this occasionally leads to a card failure and a system crash. Therefore, for an SoC at 950 MHz, the max safe memory clock is 910 MHz.
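
The overshoot argument above can be written as a toy model. This is only a sketch of the reasoning in this reply, not verified hardware behavior: it assumes the effective clock is twice the reported memory clock (as the pairing 1830 MHz / 915 MHz suggests), that the 5 MHz overshoot applies to the effective clock, and that 1825 MHz is the last effective clock served by the 950 MHz SoC step. All names are hypothetical.

```python
SOC_LOW_MHZ = 950                # SoC step quoted for lower memory clocks
SOC_HIGH_MHZ = 1085              # SoC step quoted once the threshold is crossed
EFFECTIVE_THRESHOLD_MHZ = 1825   # max effective memory clock for the low step
OVERSHOOT_MHZ = 5                # minimum occasional overshoot from the reply


def worst_case_effective_clock(mem_clock_mhz: int) -> int:
    """Effective clock (assumed 2x the reported clock) plus overshoot."""
    return 2 * mem_clock_mhz + OVERSHOOT_MHZ


def soc_clock_requested(mem_clock_mhz: int) -> int:
    """SoC step the card would ask for in the worst case, under this model."""
    if worst_case_effective_clock(mem_clock_mhz) <= EFFECTIVE_THRESHOLD_MHZ:
        return SOC_LOW_MHZ
    return SOC_HIGH_MHZ


def max_safe_mem_clock(candidates=range(880, 961)) -> int:
    """Highest candidate clock that never requests the higher SoC step."""
    return max(m for m in candidates if soc_clock_requested(m) == SOC_LOW_MHZ)


print(max_safe_mem_clock())
print(soc_clock_requested(912))
```

Under these assumptions the model lands on the same 910 MHz limit stated above, while 912 MHz can occasionally ask for the higher SoC step.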
 
Let me clarify. I set the SoC to 1093 max, with the intent of not running the memory above 910-912 MHz. This should give enough headroom for the clock stepping to increase to 1083 dynamically. I have one card where the SoC runs at 1083 at 912 MHz. My options are to reduce the clock to 910 or lower, so that the GPU steps the SoC down to 950, or to let the SoC run at 1083.

I want to keep my Micron memory at or below 88 °C.
 
Unfortunately, I do not know the consequences of capping the SoC max clock; however, I do know the consequences of increasing the SoC min clock. Personally, I would not cap the SoC max clock until I know exactly what happens when doing so, as I already learned a lesson from altering the SoC min clock.
 
This hasn't happened to me over the last 3-4 months of running them at these clocks. I have had multiple crashes due to lower SoC voltages, higher mem clocks, and lower MVDDCI, but on the other hand it is difficult to tell whether a crash can happen due to the above. In my opinion it would not cause one: even if you set 950 as the max limit in MPT and clock your memory higher, it would still bump the SoC clock to 1093 MHz.
 
The SoC is a very important component of the GPU. I found that even the mobile versions (RX 5600M, 5500M, and 5300M) and the professional versions (Radeon Pro 5500M and 5300M), although they consume little power, still have an SoC max clock value of 1267 MHz. That is why I told those who limit the SoC max clock to 950 MHz that they do so at their own discretion, as the consequences of this limitation are unknown.
 
Have you checked what the SoC max clock value is for a memory clock of 960 MHz? Is it still 1085 MHz, or something else?
 
I don't clock the memory at 960 MHz. The most I do is 900 to 910.
I leave it to the power curve to select the SoC clock. At a memory speed of 900 MHz the SoC is mostly at 950 if I set the maximum to 1093.
I believe that, in the meantime, it is better to leave the SoC max clock at default until I get the information about it that I am waiting for. I will update as soon as I have it.
 
@Mini_Me, please check if my logic and procedure is good and correct and perhaps make recommendations to refine:

Tuning the Cards
  1. Goal #1: To have all cards run "stable" at MEM>=910, regardless of VDDCI value. (If Samsung Memory, MEM=910. Elseif Micron Memory, MEM=930).
  2. Goal #2: To fine tune the value of VDDCI to accomplish Goal #1.
Assumptions
  • All cards are flashed with the same Bios modifications.
  • All rigs use TeamRedMiner
Information: HiveOS name translations (Names in OC Template vs. Names on Status Screen)
  • Core Clock, Mhz = CORE
  • Core State, index = DPM
  • Core Voltage, mV = VDD
  • Memory Controller Voltage, mV = VDDCI
  • Memory Clock, Mhz = MEM
  • Memory State, index = MDPM
  • Memory Voltage, mV = MVDD
Starting values: HiveOS OC Template Constants:
  • Core=1420
  • DPM=1
  • VDD=790
  • VDDCI=850 (This is what we will try to determine for each individual card.)
  • MVDD=1350
  • IF SAMSUNG THEN MEM=910; ELSEIF MICRON THEN MEM=930.
  • MDPM=1
Errors we are looking for:
  1. Invalid or rejected shares on individual cards.
  2. Individual cards that "drop out" completely after a short time in operation.
  3. Cards that are missing as an empty red box in Hive instead of a solid green box.
  4. Individual cards that go from operating at > ~80 W to dropping out of operation at < ~30 W. (You will have to browse individual cards within each rig and notice the changed power values, especially in cases of random reboots when no invalid shares show. Before the rig reboots and refreshes, quickly observe the power values to determine which card is at fault.)
  5. Cards that cause a "reboot" of the rig without producing invalid or rejected shares.
During testing time, I recommend setting automatic refresh of status page for 30 seconds. For reboot or error, you will then need to browse individual cards in the rig in question and examine the cards in the rig for error, e.g., low power consumption, invalid shares.
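
The power-drop symptom from error 4 in the list above can be checked mechanically when you have two sets of per-card power readings. The sketch below uses the approximate 80 W and 30 W thresholds from that list; the function name and the sample readings are made up for illustration, and how you collect the readings from HiveOS is left open.

```python
RUNNING_MIN_W = 80   # a card hashing normally draws more than roughly this
DROPPED_MAX_W = 30   # a card that has dropped out draws less than roughly this


def dropped_cards(before_w, after_w):
    """Indexes of cards that went from normal operation to near-idle power."""
    return [
        index
        for index, (before, after) in enumerate(zip(before_w, after_w))
        if before > RUNNING_MIN_W and after < DROPPED_MAX_W
    ]


# Hypothetical readings for a 4-card example, in watts.
earlier = [95, 98, 92, 97]
later = [94, 12, 93, 96]
print(dropped_cards(earlier, later))  # prints [1]
```

Comparing two snapshots taken one refresh apart is enough to single out the card at fault before the rig reboots.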

I define stability as the individual card's performance of its hashing duties without producing errors OVER TIME. Notice that TIME is the most important variable here. Cards that are stable for 1 minute might show instability after 5 minutes of operation. Cards that are stable for 10 minutes might cause errors after 15 minutes. Cards that are good for 15 minutes might cause errors after 1 hour and so forth.

Therefore we begin with getting all cards and rigs stable for 5 minutes. If a card shows any kind of error within 5 minutes, especially after an immediate bootup and start, then we adjust the values of VDDCI. After we determine stability values at 5 minutes, we then see if the card remains stable at 10 minutes, 15 minutes, 30 minutes, 60 minutes and so forth.

Procedure:
  • We begin with (Samsung=910; Micron=930) and VDDCI=850 on all cards. If a card produces an error within 5 minutes, we add 25 to VDDCI, repeating until we reach VDDCI=900. If errors continue at VDDCI=900, we subtract 10 from MEM.
Expected Results:
  • Some cards will run rock stable at VDDCI=850 and MEM=(910 Samsung; 930 Micron). Others will need VDDCI=900 and MEM=(900 or 890 for Samsung; 930 or 920 for Micron).
Further tuning
After your rig has run without errors for a predetermined time of your choosing (1 hr, 6 hr, 12 hr, 24 hr, etc.), you can lower VDDCI by 25 for just those cards where MEM had to be reduced, until you find the point where a lower VDDCI causes new errors. If a lower VDDCI causes a new error, then increase VDDCI by 25 until there are no more errors.
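
The main procedure above (raise VDDCI first, then lower MEM) can be sketched as a small loop. The stability check is passed in as a callback because it stands for a real 5-minute observation of the card; `is_stable`, `tune_card`, and the `MEM_FLOOR` stop point are hypothetical and not part of the procedure as written.

```python
from typing import Callable, Tuple

START_MEM = {"samsung": 910, "micron": 930}   # starting MEM per memory type
VDDCI_START, VDDCI_STEP, VDDCI_MAX = 850, 25, 900
MEM_STEP = 10
MEM_FLOOR = 850  # assumption: give up below this; the post sets no floor


def tune_card(vendor: str,
              is_stable: Callable[[int, int], bool]) -> Tuple[int, int]:
    """Return (MEM, VDDCI) for the first setting that passes is_stable."""
    mem = START_MEM[vendor]
    vddci = VDDCI_START
    while not is_stable(mem, vddci):
        if vddci < VDDCI_MAX:
            # Errors within the test window: raise VDDCI first.
            vddci = min(vddci + VDDCI_STEP, VDDCI_MAX)
        elif mem - MEM_STEP >= MEM_FLOOR:
            # VDDCI is already at 900: lower the memory clock instead.
            mem -= MEM_STEP
        else:
            raise RuntimeError("no stable setting found above the memory floor")
    return mem, vddci


# Example with a made-up card that needs VDDCI=900 and MEM<=900 to be stable.
print(tune_card("samsung", lambda mem, vddci: vddci >= 900 and mem <= 900))
```

The further-tuning step, lowering VDDCI again on cards whose MEM was reduced, would be a second loop of the same shape run after the longer soak test.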
 