Question memory voltage increase

skyborg

Newbie
Member since: May 18, 2020
Posts: 3
Reaction points: 0
Points: 1
Hi

I am overclocking the RX560 GPU in a Dell Inspiron 5576 laptop. The VRAM becomes unstable above 1650MHz, and the whole system crashes at 1900MHz.

The VBIOS only allows core voltage tweaking in Wattman; MVDD and VDDCI changes do not take effect. I found that I can change VDDCI by manually editing the init value in the ASIC table, and tried a VDDCI of 1.10V (default was 0.875V), but this did not help with stability or memory errors.

So I took a DMM, measured the voltages on the board, and put everything in two pictures.

I'd like to hear opinions on how the GDDR5 memory is powered on this system and on a possible way to increase its voltage. These are SK Hynix modules rated 3GHz at 1.35V and 3.5GHz at 1.5V, running at 1.35V by default.

Attachments: Untitled.png, 2TG9Mb1.JPG
Update: the GPU phase voltage is on the first 3 capacitors from the left; the 4th one carries the VDDCI voltage.
Update: the intGFX voltage sensor is available in Linux; it reads 1.09V.
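
For anyone comparing the clocks in this post with the module rating, here is a rough arithmetic sketch. It assumes (this is not stated in the post) that the memory clock shown in Wattman is the GDDR5 command clock CK, that the "3GHz / 3.5GHz" rating refers to the write clock WCK (twice CK), and that GDDR5 transfers 4 bits per pin per CK cycle. The 128-bit bus width is taken from the driver log further down the thread. The function names are made up for illustration.

```python
# Assumptions (not confirmed by the post):
#  - Wattman memory clock = GDDR5 command clock CK
#  - module rating ("3GHz", "3.5GHz") = write clock WCK = 2 x CK
#  - data rate per pin = 4 bits per CK cycle (data on both edges of WCK)
BUS_WIDTH_BITS = 128  # "RAM width 128bits GDDR5" in the driver log


def data_rate_gbps(ck_mhz):
    """Per-pin data rate in Gbit/s for a given CK in MHz."""
    return ck_mhz * 4 / 1000


def bandwidth_gb_s(ck_mhz, bus_width_bits=BUS_WIDTH_BITS):
    """Theoretical peak bandwidth in GB/s."""
    return data_rate_gbps(ck_mhz) * bus_width_bits / 8


def ck_from_wck_ghz(wck_ghz):
    """Convert a WCK rating in GHz to the equivalent CK in MHz."""
    return wck_ghz * 1000 / 2


if __name__ == "__main__":
    for rating_ghz, volts in ((3.0, 1.35), (3.5, 1.5)):
        print(f"rated {rating_ghz} GHz @ {volts} V -> CK {ck_from_wck_ghz(rating_ghz):.0f} MHz")
    for ck in (1650, 1800, 1900):
        print(f"CK {ck} MHz -> {data_rate_gbps(ck):.1f} Gbps/pin, {bandwidth_gb_s(ck):.1f} GB/s")
```

If those assumptions hold, the clocks being tested here are already above what the modules are rated for at 1.35V, which would fit with the errors going away once the memory voltage is raised.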
 
The IC marked 5525 is actually a G5335 10A buck converter. Hard-modding its FB pin resistor allowed me to raise the voltage, which almost completely eliminated memory errors at frequencies up to 1800MHz.

The maximum voltage tested was 1.8V at idle, 1.75V under load.

At 1900MHz all seems fine in FurMark, then suddenly 250000 errors are counted, and sometimes the system freezes and crashes. It looks like this will need a replacement 15A converter.
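
The FB resistor mod described above can be sketched with the usual buck converter feedback divider relation, Vout = Vref * (1 + R_top / R_bottom). This is only an illustration: the 0.6V reference and the 10k top resistor are placeholder values, not taken from the G5335 datasheet or measured on this board, so check the real ones before soldering anything. The function names are hypothetical.

```python
# Hypothetical values for illustration only; NOT from the G5335 datasheet
# and NOT measured on the Inspiron 5576 board.
VREF = 0.6        # assumed feedback reference voltage, volts
R_TOP = 10_000.0  # assumed resistor from Vout to FB, ohms


def vout(vref, r_top, r_bottom):
    """Output voltage set by the feedback divider."""
    return vref * (1 + r_top / r_bottom)


def r_bottom_for(v_target, vref, r_top):
    """FB-to-ground resistor needed for a target output voltage."""
    return r_top / (v_target / vref - 1)


def parallel_for(r_existing, r_wanted):
    """Resistor to solder in parallel with r_existing to get r_wanted."""
    return 1 / (1 / r_wanted - 1 / r_existing)


if __name__ == "__main__":
    stock = r_bottom_for(1.35, VREF, R_TOP)   # divider for the default 1.35V
    modded = r_bottom_for(1.80, VREF, R_TOP)  # divider for the 1.8V tested above
    print(f"stock R_bottom  ~ {stock:.0f} ohm")
    print(f"modded R_bottom ~ {modded:.0f} ohm")
    print(f"parallel resistor to add ~ {parallel_for(stock, modded):.0f} ohm")
    # Droop reported in the post: 1.8V idle -> 1.75V under load
    print(f"droop under load ~ {(1.80 - 1.75) / 1.80 * 100:.1f} %")
```

Lowering the FB-to-ground resistance (for example by adding a resistor in parallel) raises the output, which is why a small trimmer in series with a fixed resistor is a common way to do this kind of mod without overshooting.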
 
Greetings and good day,

Have you achieved what you wanted? If so, I would be glad if you could share your findings and results.

Have a good day,

Regards.
 
Not really... Before the replacement I tried to change the on-time, but there was a failure soon after that, probably caps or memory.

I tried a lower switching frequency, which seemed to help with the crashes, as I was able to pass a few benchmarks at 1920 and 1940MHz while still at the default 1.35V output, but later the GPU came up with code 43...

For reference, this is the output on Linux:

Code:
[    1.876147] [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67EF 0x1028:0x07E2 0xC5).
[    1.883328] [drm] register mmio base: 0xFE800000
[    1.883340] [drm] register mmio size: 262144
[    1.883343] [drm] PCIE atomic ops is not supported
[    1.883350] [drm] add ip block number 0 <vi_common>
[    1.883351] [drm] add ip block number 1 <gmc_v8_0>
[    1.883351] [drm] add ip block number 2 <tonga_ih>
[    1.883352] [drm] add ip block number 3 <gfx_v8_0>
[    1.883354] [drm] add ip block number 4 <sdma_v3_0>
[    1.883354] [drm] add ip block number 5 <powerplay>
[    1.883355] [drm] add ip block number 6 <dm>
[    1.883356] [drm] add ip block number 7 <uvd_v6_0>
[    1.883357] [drm] add ip block number 8 <vce_v3_0>
[    2.001492] ATOM BIOS: SWBRT01017.001
[    2.001523] [drm] UVD is enabled in VM mode
[    2.001524] [drm] UVD ENC is enabled in VM mode
[    2.001526] [drm] VCE enabled in VM mode
[    2.001532] vga_switcheroo: enabled
[    2.001550] [drm] GPU posting now...
[    2.023519] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[    2.023610] amdgpu 0000:03:00.0: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
[    2.023611] amdgpu 0000:03:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[    2.023687] [drm] Detected VRAM RAM=4096M, BAR=256M
[    2.023688] [drm] RAM width 128bits GDDR5
[    2.023723] [drm] amdgpu: 4096M of VRAM memory ready
[    2.023726] [drm] amdgpu: 4096M of GTT memory ready.
[    2.023736] [drm] GART: num cpu pages 65536, num gpu pages 65536
[    2.024447] [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
[    2.024563] [drm] Chained IB support enabled!
[    2.025934] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
[    2.026098] [drm] Found UVD firmware Version: 1.79 Family ID: 16
[    2.026107] [drm] UVD ENC is disabled
[    2.026570] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
[    4.483778] amdgpu: [powerplay]
                failed to send message 254 ret is 0
[    6.941275] amdgpu: [powerplay] SMU load firmware failed
[    6.941303] amdgpu: [powerplay] fw load failed
[    6.941321] smu firmware loading failed
[    6.941340] amdgpu 0000:03:00.0: amdgpu_device_ip_init failed
[    6.941364] amdgpu 0000:03:00.0: Fatal error during GPU init
[    6.941399] [drm] amdgpu: finishing device.
[    7.089550] [drm] amdgpu: ttm finalized
[    7.089562] vga_switcheroo: disabled
[    7.089953] amdgpu: probe of 0000:03:00.0 failed with error -22

I've just replaced the VRM, but the problem remains, so as I already wrote, probably the caps or memory are bad.

At least the laptop boots with the new VRM, so I can confirm it is compatible. The part I used is the AOZ2264QI.
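
To find the first failing step in a log like the one in this post without reading it line by line, a small script along these lines could help. It is just a sketch: the "fail"/"fatal" keyword match is my own heuristic, not anything amdgpu defines, and the sample is a few lines copied from the log above.

```python
import re

# "[    4.483778] message" style kernel log lines
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
# Heuristic keyword match (an assumption, not an amdgpu-defined marker)
FAILURE_RE = re.compile(r"fail|fatal", re.IGNORECASE)


def parse_dmesg(text):
    """Return (timestamp, message) tuples, joining continuation lines."""
    entries = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw)
        if m:
            entries.append([float(m.group(1)), m.group(2).strip()])
        elif entries and raw.strip():
            # Line without a timestamp: continuation of the previous message
            entries[-1][1] = (entries[-1][1] + " " + raw.strip()).strip()
    return [(t, msg) for t, msg in entries]


def first_failure(text):
    """Return the first (timestamp, message) that looks like a failure."""
    for t, msg in parse_dmesg(text):
        if FAILURE_RE.search(msg):
            return t, msg
    return None


SAMPLE = """\
[    2.026570] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
[    4.483778] amdgpu: [powerplay]
                failed to send message 254 ret is 0
[    6.941275] amdgpu: [powerplay] SMU load firmware failed
[    6.941364] amdgpu 0000:03:00.0: Fatal error during GPU init
"""

if __name__ == "__main__":
    print(first_failure(SAMPLE))
```

On the log above this points at the powerplay message at 4.48s, which comes before the SMU firmware load failure and the fatal init error that follow it.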
 
Greetings and welcome back,

It might be that the memory has degraded somehow and can no longer operate at its default rated specification. How about lowering the clock to 1800 MHz or 1750 MHz to verify this?
 