FS#68396 - [mesa] 20.2.1-1 [amdgpu] RX580 hard crash and glitched screens
Attached to Project:
Arch Linux
Opened by Jarmo (JATothrim) - Friday, 23 October 2020, 17:57 GMT
Last edited by Andreas Radke (AndyRTR) - Friday, 04 December 2020, 06:48 GMT
Details
Description:
The following messages appear in dmesg, with both my screens totally screwed up:

    kernel: [drm:gfx_v8_0_priv_reg_irq [amdgpu]] *ERROR* Illegal register access in command stream
    kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=8, emitted seq=9
    kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1067 thread Xorg:cs0 pid 1068
    kernel: amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
    kernel: amdgpu: cp is busy, skip halt cp
    kernel: amdgpu: rlc is busy, skip halt rlc
    ....
    kernel: snd_hda_intel 0000:09:00.1: spurious response 0x0:0x0, last cmd=0x870600
    kernel: snd_hda_intel 0000:09:00.1: No response from codec, disabling MSI: last cmd=0x00820000
    kernel: amdgpu 0000:09:00.0: amdgpu: failed to suspend display audio
    kernel: amdgpu: cp is busy, skip halt cp
    kernel: amdgpu: rlc is busy, skip halt rlc
    kernel: amdgpu 0000:09:00.0: amdgpu: GPU BACO reset
    kernel: snd_hda_intel 0000:09:00.1: No response from codec, resetting bus: last cmd=0x00820000
    ....

Also, the system crashes instantly once Xorg or LightDM launches at boot. The system may survive if I switch to an empty VT as soon as the screen glitching occurs.

I have bisected the changes, and the crash is not related to a hardware or kernel issue. It was caused by the mesa-20.2.1-1-x86_64 upgrade. In fact, the system may crash immediately after pacman has finished upgrading the package!?

Additional info:

GPU:
glxinfo:
    Vendor: X.Org (0x1002)
    Device: Radeon RX 580 Series (POLARIS10, DRM 3.38.0, 5.8.13-7-rzen+, LLVM 10.0.1) (0x67df)
    Version: 20.1.8
    Accelerated: yes
    Video memory: 8192MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
lspci:
    09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)
    09:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]

CPU: AMD Ryzen 7 2700 Eight-Core Processor

Steps to reproduce:
1) Install mesa-20.2.1-1-x86_64 on the AMDGPU hardware above.
2) Launch any GPU accelerated program.
3) Enjoy pretty glitched screens.

I marked this as critical because I can't run 'pacman -Syu' without crashing the system.
Since I reported this problem, I did a full upgrade again and still had to revert to mesa-20.1.8-1 to get into the desktop.
I will test more with the Arch kernel (it did crash; I'm on a self-built v5.8.16 mainline now) and verify further that the trouble isn't my own fault.
Nevertheless, any ideas how to fix this?
Todo list for myself:
- boot into the -arch kernel
- check what the amdgpu module params are set to
- double check that the hardware is sane
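
For the module-parameter item on the todo list, one way to see what the running amdgpu module was actually loaded with is to read the standard sysfs directory /sys/module/amdgpu/parameters. The following is only a minimal sketch; the helper name read_module_params and its sysfs_root argument are made up for illustration and are not part of any tool mentioned in this report.

```python
from pathlib import Path


def read_module_params(module, sysfs_root="/sys/module"):
    """Return {param_name: value} for a loaded kernel module.

    Hypothetical helper: returns an empty dict if the module is not loaded
    or exposes no parameters directory.
    """
    params_dir = Path(sysfs_root) / module / "parameters"
    result = {}
    if not params_dir.is_dir():
        return result
    for entry in sorted(params_dir.iterdir()):
        try:
            result[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters are not readable by unprivileged users.
            result[entry.name] = "<unreadable>"
    return result


if __name__ == "__main__":
    for name, value in read_module_params("amdgpu").items():
        print(f"{name} = {value}")
```

Comparing this output with the values in the modprobe configuration shows whether a non-default option really took effect.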
Could you bisect the mesa package to locate the causal commit?
[1] https://wiki.archlinux.org/index.php/Arch_Linux_Archive
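
As a rough illustration of stepping between package versions with the Arch Linux Archive, the sketch below builds a download URL for a given mesa build. The URL layout (first letter, package name, then the file name) and the .pkg.tar.zst suffix are assumptions based on how the archive is commonly laid out, and archive_url is a hypothetical helper; verify the link in a browser before using it.

```python
ARCHIVE_BASE = "https://archive.archlinux.org/packages"


def archive_url(pkgname, version, arch="x86_64", suffix=".pkg.tar.zst"):
    """Build an Arch Linux Archive URL for one package build.

    Assumed layout: <base>/<first letter>/<pkgname>/<pkgname>-<version>-<arch><suffix>
    """
    filename = f"{pkgname}-{version}-{arch}{suffix}"
    return f"{ARCHIVE_BASE}/{pkgname[0]}/{pkgname}/{filename}"


if __name__ == "__main__":
    # The two versions named in this report: known-good and known-bad.
    for version in ("20.1.8-1", "20.2.1-1"):
        print("pacman -U", archive_url("mesa", version))
```

This only narrows the problem down to a package version. Locating the causal commit would still need a git bisect of the mesa source between the two releases.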
LightDM, Xorg and cinnamon.
The system also crashes with just LightDM starting, or when running "startxfce4" on a VT.
I haven't tested any Wayland desktops yet.
I have some non-default amdgpu module params:
"options amdgpu dc=1 gpu_recovery=1 send_sigterm=1 mcbp=1 mes=1 moverate=1024"
I'll try testing without these to see if they are the problem.
After that I will try to bisect the packages on 5.9.1-arch1-1...
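
To narrow down which of the non-default options matters, the options line quoted above can be rewritten with one parameter removed at a time. A minimal sketch follows; without_param is a hypothetical helper that only edits the text of the line, and it does not touch /etc/modprobe.d or reload the module.

```python
def without_param(options_line, name):
    """Return a modprobe 'options' line with one parameter dropped."""
    tokens = options_line.split()
    if len(tokens) < 2 or tokens[0] != "options":
        raise ValueError("expected a line of the form 'options <module> key=value ...'")
    # Keep every key=value token whose key differs from the one being removed.
    kept = [tok for tok in tokens[2:] if tok.partition("=")[0] != name]
    return " ".join(tokens[:2] + kept)


if __name__ == "__main__":
    line = "options amdgpu dc=1 gpu_recovery=1 send_sigterm=1 mcbp=1 mes=1 moverate=1024"
    # Print one candidate configuration per parameter, each missing a single option.
    for tok in line.split()[2:]:
        key = tok.partition("=")[0]
        print(f"# without {key}:")
        print(without_param(line, key))
```

Each candidate line would be tried with a reboot in between, since these options are read when the module loads.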
mesa 20.2.1-1 works fine if I boot without "mcbp=1". Ohff!
Now the question is: what changed between mesa-20.1.8 and mesa-20.2.x in relation to that module param?
This problem is solved now, so it can be closed.