FS#63872 - [linux-zen] amdgpu fails to initialize on RX 5700 XT

Attached to Project: Arch Linux
Opened by Ivan V (pcm720) - Saturday, 21 September 2019, 05:30 GMT
Last edited by Andreas Radke (AndyRTR) - Wednesday, 11 December 2019, 07:19 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To No-one
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

Description:

On the most recent Zen kernel (5.3.0-zen1-1-zen), the amdgpu module hits a general protection fault while initializing a Navi 10 GPU (RX 5700 XT), locking up the system (see the attached dmesg log).
The system is completely unresponsive and can't be accessed through SSH.
This doesn't happen with the latest mainline kernel or the default Arch Linux kernel.

Steps to reproduce:
1. Install the Zen kernel on a system with a Navi 10 GPU;
2. Try to boot.
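
Because the machine locks up, the fault can only be inspected after the fact, from a saved kernel log. Here is a minimal sketch of how such a log could be scanned for a general protection fault that occurs near a mention of amdgpu. The function name and the 40-line window are assumptions made for illustration and are not taken from the attached log.

```python
def find_gpf_near_amdgpu(lines, window=40):
    """Return (line_number, text) for every line containing
    'general protection fault' that also has 'amdgpu' on the same line
    or within the next `window` lines.

    `lines` is a list of log lines. `window` is an arbitrary guess at how
    far the module list or call trace may follow the fault line.
    """
    hits = []
    for i, line in enumerate(lines):
        if "general protection fault" in line.lower():
            nearby = lines[i:i + window + 1]
            if any("amdgpu" in entry.lower() for entry in nearby):
                hits.append((i + 1, line.rstrip("\n")))
    return hits


# Possible usage, assuming the kernel log was saved to a file first:
#   with open("dmesg.log") as f:
#       for number, text in find_gpf_near_amdgpu(f.readlines()):
#           print(f"{number}: {text}")
```

If the journal is persistent, the log of the failed boot can usually be obtained from a working kernel with `journalctl -k -b -1`. This only narrows down where to look; it does not prove that amdgpu caused the fault.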

Closed by Andreas Radke (AndyRTR)
Wednesday, 11 December 2019, 07:19 GMT
Reason for closing: Fixed
Additional comments about closing: Issue was fixed with Linux 5.4.1
Comment by TesX (tesfabpel) - Thursday, 03 October 2019, 11:44 GMT
It also happens to me with linux 5.3.1 but it seems a bit different...
After some tries I found a line in journalctl saying that radeonsi needs LLVM 9 or higher; this is also visible here: https://gitlab.freedesktop.org/mareko/mesa/commit/594010e366f911581ca0a4471a9d9fa68116514f

LLVM 9 is not in the Arch repos, though.
Comment by Ivan V (pcm720) - Thursday, 03 October 2019, 11:59 GMT
I have Git versions of LLVM 10 and Mesa 19.3 installed on my system, so this has nothing to do with Mesa: amdgpu initializes without issues on Linux 5.3.1-arch1-1.
tesfabpel, most likely your problem is caused by the lack of Navi support in LLVM 8.
Comment by Jiří Kuchyňka (Anty0) - Sunday, 27 October 2019, 19:00 GMT
I can confirm the issue still exists in linux-zen version 5.3.7.zen1-1.
Can't boot with linux-zen and RX 5700 XT, while mainline linux works.

I can also see in journalctl the same general protection fault as @pcm720.

One important thing I see is that we both use the same motherboard model (Gigabyte X570 AORUS PRO), which could (in combination with the Navi 10 GPU) be the real cause of the problem.
Comment by igo95862 (igo95862) - Friday, 22 November 2019, 09:34 GMT
I can confirm this on an old Z170 motherboard.

The normal kernel and the ones compiled from git work just fine. As soon as I try to boot the Zen kernel, the GPU fails to initialize.
Comment by igo95862 (igo95862) - Monday, 02 December 2019, 05:32 GMT
I tried booting linux-zen 5.4.1 from [testing] and it worked!

Can anyone else verify that it's fixed on 5.4.1?
Comment by Jiří Kuchyňka (Anty0) - Tuesday, 03 December 2019, 19:36 GMT
It's fixed for me as well! Tested with 5.4.1 from [stable].
Comment by Richard Alison (KerakTelor) - Tuesday, 03 December 2019, 19:38 GMT
Can verify, I can now boot with zen after updating to 5.4.1!
Comment by Ivan V (pcm720) - Monday, 09 December 2019, 01:23 GMT
Can confirm. System boots properly with 5.4.2-zen1-1-zen
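
The reports in this thread put the failing linux-zen builds in the 5.3.x series (5.3.0 and 5.3.7) and the working ones at 5.4.1 and later. Below is a minimal sketch for comparing a kernel release string against that threshold. The helper names are hypothetical, and the cut-off reflects only what was reported here, not a known fixing commit.

```python
import re

# First linux-zen version reported as working in this thread (assumption:
# everything at or above it is treated as fixed).
FIRST_REPORTED_FIXED = (5, 4, 1)


def kernel_tuple(release):
    """Parse the leading 'X.Y' or 'X.Y.Z' of a release string such as
    '5.3.0-zen1-1-zen' or '5.3.7.zen1-1' into a tuple of three ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch level (e.g. '5.4') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def reported_fixed(release):
    """True if the release is at or above the first version reported fixed."""
    return kernel_tuple(release) >= FIRST_REPORTED_FIXED
```

For example, reported_fixed("5.3.0-zen1-1-zen") is False and reported_fixed("5.4.2-zen1-1-zen") is True. Versions older than 5.3 were not discussed in this thread, so a False result for them says nothing about whether they are affected.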
