FS#66991 - [linux] Sudden freeze on kernel 5.7.2-arch1-1 with amdgpu driver

Attached to Project: Arch Linux
Opened by Edgar (Ryozuki) - Saturday, 13 June 2020, 14:32 GMT
Last edited by Jan Alexander Steffens (heftig) - Tuesday, 04 August 2020, 14:55 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Jan Alexander Steffens (heftig)
Levente Polyak (anthraxx)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 13
Private No

Details

Description:

The screen freezes completely: the image stops updating but sound continues to play. I think it's an issue with the amdgpu driver.

Additional info:
Linux kernel 5.7.2-arch1-1
extra/xf86-video-amdgpu 19.1.0-2
   log.txt (4.4 KiB)

Closed by  Jan Alexander Steffens (heftig)
Tuesday, 04 August 2020, 14:55 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 5.7.11.arch1-1
Comment by loqs (loqs) - Saturday, 13 June 2020, 17:53 GMT
If you blacklist the amdgpu module can you reproduce the issue?
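For anyone trying this test: one common way to blacklist a module is a `blacklist amdgpu` line in a file under /etc/modprobe.d/, and after rebooting it is worth confirming that the module really is absent before drawing conclusions. A minimal Python sketch of that check follows. The helper name `loaded_modules` and the sample text are illustrative assumptions, not part of this report; on a real system the text would come from /proc/modules.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each non-empty line starts with the module name, followed by
    size, reference count, dependencies, state and address.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


# Made-up sample in the /proc/modules layout (sizes and counts are not real).
SAMPLE = """\
amdgpu 5500000 30 - Live 0x0000000000000000
snd_hda_intel 57344 4 - Live 0x0000000000000000
"""

if "amdgpu" in loaded_modules(SAMPLE):
    print("amdgpu is loaded, so the blacklist did not take effect")
else:
    print("amdgpu is not loaded")
```

On an affected machine, `loaded_modules(open("/proc/modules").read())` would replace the sample text.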
Comment by Edgar (Ryozuki) - Saturday, 13 June 2020, 18:26 GMT
I used the linux-lts (Linux arch 5.4.46-1-lts) for some time and it doesn't freeze.
Comment by loqs (loqs) - Saturday, 13 June 2020, 18:40 GMT
Comment by nothanks (l330) - Sunday, 14 June 2020, 17:20 GMT
I've been having the same issue.
5.7.2-arch1-1
xf86-video-amdgpu-19.1.0-2
   log.txt (3.9 KiB)
Comment by Santiago Pastorino (spastorino) - Sunday, 14 June 2020, 20:17 GMT
I'm having an issue that looks similar to, or the same as, this one. In my case I'm not using amdgpu. Going back to LTS also makes things work. I've downgraded to 5.6.15. As noted above, the issue seems to be that the video is not able to initialize properly. Looking at the logs I see the following ...

Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (==) Matched intel as autoconfigured driver 0
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (==) Matched modesetting as autoconfigured driver 1
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (==) Matched fbdev as autoconfigured driver 2
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (==) Matched vesa as autoconfigured driver 3
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (==) Assigned the driver to the xf86ConfigLayout
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "intel"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (WW) Warning, couldn't open module intel
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (EE) Failed to load module "intel" (module does not exist, 0)
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "modesetting"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) Module modesetting: vendor="X.Org Foundation"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: compiled for 1.20.8, module version = 1.20.8
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: Module class: X.Org Video Driver
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: ABI class: X.Org Video Driver, version 24.1
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "fbdev"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (WW) Warning, couldn't open module fbdev
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (EE) Failed to load module "fbdev" (module does not exist, 0)
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "vesa"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (WW) Warning, couldn't open module vesa
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (EE) Failed to load module "vesa" (module does not exist, 0)
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) modesetting: Driver for Modesetting Kernel Drivers: kms
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) modeset(0): using drv /dev/dri/card0
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
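A log like the one above is easier to read once the LoadModule and "Failed to load module" lines are paired up. The following is a minimal Python sketch of that, not a tool used in this report: the function name `summarize_modules` is hypothetical, and the sample is a subset of the lines quoted above. It suggests that only modesetting loaded, while intel, fbdev and vesa failed because the modules are not installed.

```python
import re

# Patterns for the two Xorg message types quoted in the log above.
LOAD_RE = re.compile(r'\(II\) LoadModule: "([^"]+)"')
FAIL_RE = re.compile(r'\(EE\) Failed to load module "([^"]+)"')

# Subset of the log lines from this comment.
SAMPLE_LOG = """\
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "intel"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (EE) Failed to load module "intel" (module does not exist, 0)
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "modesetting"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "fbdev"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (EE) Failed to load module "fbdev" (module does not exist, 0)
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (II) LoadModule: "vesa"
Jun 14 14:44:33 galago /usr/lib/gdm-x-session[929]: (EE) Failed to load module "vesa" (module does not exist, 0)
"""


def summarize_modules(log_text):
    """Return (loaded, failed): module names in first-seen order.

    "loaded" means a LoadModule request with no matching failure line.
    """
    requested, failed = [], []
    for line in log_text.splitlines():
        match = LOAD_RE.search(line)
        if match and match.group(1) not in requested:
            requested.append(match.group(1))
        match = FAIL_RE.search(line)
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
    loaded = [name for name in requested if name not in failed]
    return loaded, failed


loaded, failed = summarize_modules(SAMPLE_LOG)
print("loaded:", ", ".join(loaded) or "none")
print("failed:", ", ".join(failed) or "none")
```

The same function can be fed a full journal excerpt instead of the sample.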
Comment by Maksim Kraev (maximka) - Sunday, 14 June 2020, 21:54 GMT
It also freezes on an Intel J4105 without external graphics or a monitor attached. I do not think amdgpu is to blame here.
Comment by Automne von Einzbern (automne) - Monday, 15 June 2020, 22:43 GMT
Same issue here, even without a display connected, on a Ryzen 3200G (with embedded Vega graphics).
I got this log before a crash happened.
Comment by Zorbik (zorbik) - Tuesday, 16 June 2020, 01:28 GMT
Same issue here. Getting a kernel fault randomly, sometimes after minutes, other times after hours. Soft locks the system and requires a hard shutdown to do anything. Attached is the kernel fault I'm getting.

Kernel: 5.7.2-arch1-1
Driver: xf86-video-amdgpu 19.1.0-2

I'd be happy to provide more information if that would be helpful.
Comment by Automne von Einzbern (automne) - Tuesday, 16 June 2020, 07:52 GMT
If you want to add your report to the kernel's Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208205
Comment by Vinicius (PhantomX) - Tuesday, 16 June 2020, 12:44 GMT
Comment by Automne von Einzbern (automne) - Wednesday, 17 June 2020, 07:51 GMT
I compiled Linux 5.8-rc1 yesterday and the issue seems to be solved.
Can some of you test and confirm?
Comment by Automne von Einzbern (automne) - Wednesday, 17 June 2020, 09:15 GMT
Nevermind, still hanging.
Comment by Magnus Boman (katt) - Thursday, 18 June 2020, 13:43 GMT
I seem to be affected by this as well, things were running fine for days but suddenly got a general protection fault while watching youtube. I'm technically running linux-zen but I doubt it's worth opening a separate issue.
Comment by Santiago Pastorino (spastorino) - Sunday, 21 June 2020, 19:35 GMT
After reading a bit here, my issue seems to be different, so I opened a new one: https://bugs.archlinux.org/task/67068
Comment by Loïc BLOT (Nerzhul) - Thursday, 25 June 2020, 10:07 GMT
I have exactly the same problem. I can only use the LTS and 5.6.x branches; the current 5.7 hangs randomly :(
Comment by Mario O. M. (marioortizmanero) - Thursday, 02 July 2020, 13:37 GMT
I think I have the same problem with kernel 5.7.6-arch1-1 and xf86-video-amdgpu 19.1.0-2 (RX 480). It freezes randomly and the only way out is to reboot: I can't open a new TTY or do anything at all, only listen to the system sounds. It has happened to me twice in the last two days, while an X server was running. I'm attaching my log in case it helps, too.
Comment by Vinicius (PhantomX) - Thursday, 02 July 2020, 21:07 GMT
Comment by loqs (loqs) - Thursday, 02 July 2020, 21:51 GMT
@PhantomX that has been applied to linux 5.7.7.arch1-1 and linux-lts 5.4.50-1 as the fix for  FS#67131 
Comment by Magnus Boman (katt) - Monday, 13 July 2020, 17:33 GMT
Got hit by this today again, very irregular to say the least (hadn't happened since my last comment)
5.7.8-zen2-1-zen / Vega 56
Comment by ketsui (ketsui) - Monday, 20 July 2020, 04:43 GMT
It seems the folks over here have found the culprit. Can you guys try reverting 3202fa62f, cbfc35a48 and 89b83f282?
https://bugzilla.kernel.org/show_bug.cgi?id=207383#c72
Comment by Loïc BLOT (Nerzhul) - Monday, 20 July 2020, 14:40 GMT
I hope it can be fixed. Being tied to 5.4 while a 5.8 LTS is coming that should have the same problem is a very big issue.
Comment by Mehmet Türk (mmturk) - Tuesday, 21 July 2020, 10:39 GMT
We should have a 5.7.9 build reverting these 3 commits (3202fa62f, cbfc35a48 and 89b83f282) until a fixed kernel is released.
Comment by Han Vinke (Gatenkaas) - Thursday, 23 July 2020, 08:22 GMT
After switching from NVIDIA to an AMD Navi 10 5600XT I noticed severe stability issues: BSODs with Windows 10 and freezes with Linux.
I first thought of driver issues as the cause, but in fact the problem was caused by a BIOS setting.
I use a multiboot rEFInd configuration with Windows 10, Xubuntu and Arch with currently 5.7.9-arch1-1.
Since all the 3 boot options had the same instability issues I had to look for another reason than the assumed driver issue.
After changing the PCIe Speed from [Auto] to [Gen3] (under Advanced\PCH Configuration\PCI Express) I have no problems anymore.
Maybe somehow this auto-switching is not working properly for gen4 (PCI Express 4.0) devices on older motherboards?
Comment by Babalu (Babalu) - Friday, 24 July 2020, 17:50 GMT
I was running the latest kernel until last week and I had this problem too. Then I moved to LTS, now at version "Linux archlinux 5.4.53-1-lts", and it has worked well for the last week.

My graphics card is an "ATI Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X]" with a dual-monitor setup.

Not every time, but I would say 3 out of 5 times when I was playing CS:GO, it froze with a complete stall. I had to press the reset button to start again.

I will wait for 5.8 to try the latest kernel again.
Comment by vladodriver (vladodriver) - Monday, 27 July 2020, 17:52 GMT
After applying the patch from https://lkml.org/lkml/2020/7/27/64 the problem is solved. I have tested it for a few hours and the patch is working. Performance does not seem to be affected. https://bugzilla.kernel.org/attachment.cgi?id=290591
Comment by vladodriver (vladodriver) - Monday, 27 July 2020, 18:50 GMT
Please apply this fix to the new linux package.
Comment by Mehmet Türk (mmturk) - Saturday, 01 August 2020, 07:34 GMT
This patch has been applied to the git tree. I think it's going to be included in 5.8-rc8 onwards.
Comment by loqs (loqs) - Saturday, 01 August 2020, 08:13 GMT
The patch has been included in 5.7.12.arch1-1 [1] currently in testing. Can you confirm this resolves the issue?

[1] https://git.archlinux.org/linux.git/commit/?h=v5.7.12-arch1&id=38116337d16e34b0282c047b3fb644adbb908bca
Comment by Jan Alexander Steffens (heftig) - Saturday, 01 August 2020, 10:34 GMT
5.7.11.arch1 also has it.
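To check whether a given 5.7 kernel predates the fix, the release string can be compared against 5.7.11, the version named in the closing note of this task. The following is a minimal Python sketch under that assumption. The function names are hypothetical, it only compares version numbers, and it says nothing reliable about other series or about non-Arch builds such as linux-zen, which may have picked up the patch at a different point.

```python
import re

# First Arch release carrying the fix, per the closing note: linux 5.7.11.arch1-1
FIXED_IN = (5, 7, 11)


def parse_release(release):
    """Extract (major, minor, patch) from a string such as '5.7.2-arch1-1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def predates_fix(release):
    """True only for 5.7.x releases older than 5.7.11."""
    version = parse_release(release)
    return version[:2] == FIXED_IN[:2] and version < FIXED_IN


# Release strings mentioned in this task.
for release in ("5.7.2-arch1-1", "5.7.9-arch1-1", "5.7.11-arch1-1", "5.4.53-1-lts"):
    status = "predates the fix" if predates_fix(release) else "not a pre-fix 5.7.x kernel"
    print(f"{release}: {status}")
```

The string to test on a running system is what `uname -r` prints.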
Comment by Edgar (Ryozuki) - Tuesday, 04 August 2020, 14:53 GMT
It looks fixed to me, I don't get the bug anymore.
