FS#61950 - Freeze after suspend/resume with kernel 5.0
Attached to Project:
Arch Linux
Opened by Simone (tigerjack) - Friday, 08 March 2019, 10:55 GMT
Last edited by Antonio Rojas (arojas) - Sunday, 08 September 2019, 09:19 GMT
Details
Description:
I am not entirely sure that the kernel update is the culprit here, but it seems the most likely candidate. Since the update a few days ago to 5.0.0-arch1-1-ARCH, every time I suspend the laptop, the screen is completely frozen on resume and I have to do a hard shutdown.

Steps to reproduce:
Two different ways.
A)
* start the X server with startx
* invoke systemctl suspend
* resume -> screen freezes
B)
* from a tty, invoke systemctl suspend
* resume
* invoke startx -> screen freezes

What I have tried:
1) Disabling systemd services, namely NetworkManager, wpa_supplicant, acpid, laptop-mode-tools, cpu-power, thermald
2) Completely removing the laptop-mode-tools package
3) Deleting user X-related files such as .Xresources, .xinitrc, .xprofile and starting with a fresh X environment
4) Using the root user to check whether the behavior depends on user configuration, and trying method B above (log attached)
5) Updating the AMD GPU driver to the latest xf86-video-amdgpu 19.0.0-1
6) Completely removing the battery and draining the capacitors

The problem still remains.
root_cli.log
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 647 thread Xorg:cs0 pid 648
[drm] IP block:gfx_v8_0 is hung!
[drm] GPU recovery disabled.
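For anyone comparing their own logs against the excerpt above, here is a minimal sketch of a scanner for the same two signatures (the ring timeout and the hung IP block). The function name `scan_amdgpu_hang` and both regular expressions are illustrative assumptions built only from the four lines quoted in this report; other kernels may word these messages differently.

```python
import re

# Patterns derived only from the log excerpt in this report (assumption:
# the message wording is the same on the kernel being checked).
TIMEOUT_RE = re.compile(
    r"\*ERROR\* ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
HUNG_RE = re.compile(r"IP block:(?P<block>\S+) is hung!")


def scan_amdgpu_hang(lines):
    """Collect ring timeouts and hung IP blocks from kernel log lines."""
    timeouts, hung_blocks = [], []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts.append((m["ring"], int(m["signaled"]), int(m["emitted"])))
        m = HUNG_RE.search(line)
        if m:
            hung_blocks.append(m["block"])
    return {"timeouts": timeouts, "hung_blocks": hung_blocks}


# The four lines from root_cli.log quoted above.
SAMPLE = [
    "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3",
    "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 647 thread Xorg:cs0 pid 648",
    "[drm] IP block:gfx_v8_0 is hung!",
    "[drm] GPU recovery disabled.",
]

result = scan_amdgpu_hang(SAMPLE)
print(result)
```

To check a real log, the lines of a saved file can be passed in place of `SAMPLE`, for example `scan_amdgpu_hang(open("root_cli.log"))`.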
Has anyone affected tried bisecting between 4.20 and 5.0 to find the causal commit?
By the way, I've attached the latest log related to the suspend/resume cycle: 09.22.xx are the messages appearing after `systemctl suspend`, 09.23.xx those after the resume.
[1] https://wiki.archlinux.org/index.php/Bisecting_bugs_with_Git
The second bug first appears in commit [262485a50fd4532a8d71165190adc7a0a19bcc9e] drm/amd/display: Expand dc to use 16.16 bit backlight. Log - blackscreen.log; bisect log - bisect-blackscreen.log
The first bug with amdgpu_job_timedout first appears in the commit [106c7d6148e5aadd394e6701f7e498df49b869d1] drm/amdgpu: abstract the function of enter/exit safe mode for RLC. Log - amdgpu_error.log
During the bisect searching for the first error, I went through the following stages sequentially: good (resuming from suspend was successful), error 2, good, error 2, error 1.
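For readers unfamiliar with bisecting, the idea behind the search described above can be sketched as a binary search over a history. This is a simplification, not how `git bisect` is implemented: it assumes a linear history in which every commit after the first bad one is also bad, and the commit names `c00` to `c15` and the helper `first_bad` are hypothetical. When a commit shows a different failure (as with the second bug here), the tester has to decide how to classify it, or skip it with `git bisect skip`.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit and the list of commits that were tested.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    history is linear (a simplification of what git bisect handles).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tested = []
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested.append(commits[mid])
        if is_bad(commits[mid]):
            hi = mid  # culprit is at mid or earlier
        else:
            lo = mid  # culprit is after mid
    return commits[hi], tested


# Hypothetical 16-commit history in which the regression lands in c11.
commits = [f"c{i:02d}" for i in range(16)]
culprit, tested = first_bad(commits, lambda c: c >= "c11")
print(culprit, tested)
```

In practice each `is_bad` call corresponds to building and booting that kernel and running a suspend/resume cycle, which is why the number of tested commits matters.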
Fix will be in kernel 5.3
If that fixes the issue, Arch could carry the patch until 5.3, since it is not marked for stable.