FS#78341 - Failed resume from hibernation - Kernel 6.2.13 : KDE Plasma 5.27.4 : AMDGPU

Attached to Project: Arch Linux
Opened by phonemic (phonemic) - Sunday, 30 April 2023, 18:19 GMT
Last edited by Toolybird (Toolybird) - Wednesday, 10 May 2023, 22:48 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture x86_64
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
Resuming from hibernation on 6.2.13-arch1-1 with KDE Plasma 5.27.4 results in a frozen machine. I can see the SDDM lock screen, but the keyboard and mouse do not respond. The screen intermittently goes black and then shows the frozen SDDM screen again. I cannot switch to another tty with Ctrl+Alt+F3. The only way to reboot safely is the SysRq + REISUB key sequence.

The issue seems to be related to amdgpu. See journalctl output below with results from a good resume and a failed resume.


Additional info:
DE: KDE Plasma 5.27.4 (Wayland Session)
Kernel: 6.2.13-arch1-1

## `journalctl` on failed resume from hibernation
```
Apr 30 09:29:21 devone kernel: [drm] kiq ring mec 2 pipe 1 q 0
Apr 30 09:29:21 devone kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Apr 30 09:29:21 devone kernel: [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
Apr 30 09:29:21 devone kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
Apr 30 09:29:21 devone kernel: amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
Apr 30 09:29:21 devone kernel: amdgpu 0000:03:00.0: PM: dpm_run_callback(): pci_pm_restore+0x0/0xe0 returns -110
Apr 30 09:29:21 devone kernel: amdgpu 0000:03:00.0: PM: failed to restore async: error -110
Apr 30 09:29:21 devone kernel: PM: hibernation: Basic memory bitmaps freed
```

## `journalctl` on successful resume from hibernation
```
Apr 30 12:49:15 devone kernel: [drm] kiq ring mec 2 pipe 1 q 0
Apr 30 12:49:15 devone kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Apr 30 12:49:15 devone kernel: [drm] JPEG decode initialized successfully.
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_low uses VM inv eng 1 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_high uses VM inv eng 4 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 5 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 6 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 7 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 8 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 9 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 10 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 11 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 12 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 13 on hub 0
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
Apr 30 12:49:15 devone kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
Apr 30 12:49:15 devone kernel: PM: hibernation: Basic memory bitmaps freed
```
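The two excerpts above differ mainly in the `*ERROR*` and `PM:` failure lines, all of which carry the code `-110`, which is `ETIMEDOUT` in the Linux errno table. As a minimal sketch (not part of the original report), the following Python helper pulls such lines out of saved `journalctl` text. The function name `find_resume_failures`, the regular expressions, and the small errno table are illustrative assumptions chosen to fit the lines shown here, not an established tool.

```python
import re

# Assumption: only the Linux errno value that appears in this report is listed.
ERRNO_NAMES = {110: "ETIMEDOUT"}

# Kernel lines tagged "*ERROR*" by DRM, PM failure messages, or "failed (-N)".
FAILURE_RE = re.compile(r"\*ERROR\*|PM: .*(?:failed|returns -\d+)|failed \(-\d+\)")
# A negative error code such as "-110"; the lookbehind skips hyphens inside words.
CODE_RE = re.compile(r"(?<![\w.])-(\d+)\b")


def find_resume_failures(log_text):
    """Return (line, errno_name_or_None) for each failure line in journal text."""
    results = []
    for line in log_text.splitlines():
        if not FAILURE_RE.search(line):
            continue
        match = CODE_RE.search(line)
        name = ERRNO_NAMES.get(int(match.group(1))) if match else None
        results.append((line.strip(), name))
    return results


if __name__ == "__main__":
    # Two lines copied from the failed-resume excerpt in this report.
    sample = (
        "Apr 30 09:29:21 devone kernel: [drm] kiq ring mec 2 pipe 1 q 0\n"
        "Apr 30 09:29:21 devone kernel: amdgpu 0000:03:00.0: amdgpu: "
        "amdgpu_device_ip_resume failed (-110).\n"
    )
    for line, name in find_resume_failures(sample):
        print(name, "|", line)
```

Run against the failed-resume excerpt, this should flag the `amdgpu` and `PM:` error lines and skip the informational ones; run against the successful excerpt, it should return nothing.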

Steps to reproduce:
Hibernate the machine while running KDE Plasma 5.27.4. Wait for it to power off, then resume from hibernation.

Closed by Toolybird (Toolybird) - Wednesday, 10 May 2023, 22:48 GMT
Reason for closing: Fixed
Additional comments about closing: linux 6.3.1.arch1-1
Comment by Toolybird (Toolybird) - Sunday, 30 April 2023, 21:37 GMT
Failing to resume is a very common kernel problem. Search our bug tracker and you'll find plenty of other reports. If this used to work with a previous kernel, it's a kernel regression. See [1] for general advice on debugging kernel regressions. If you need to report it upstream, amdgpu issues can be reported at [2]. Please let us know what you find out.

[1] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
[2] https://gitlab.freedesktop.org/drm/amd
Comment by phonemic (phonemic) - Sunday, 30 April 2023, 23:11 GMT
This might have already been fixed. If this is the same issue, does this mean the regression will be addressed in 6.3?

https://github.com/torvalds/linux/commit/2fec9dc8e0acc3dfb56d1389151bcf405f087b10
Comment by Toolybird (Toolybird) - Monday, 01 May 2023, 05:47 GMT
That patch is already included in the 6.2.x series. Your issue must be different.
Comment by phonemic (phonemic) - Monday, 01 May 2023, 19:05 GMT
Comment by phonemic (phonemic) - Monday, 01 May 2023, 19:44 GMT
Comment by loqs (loqs) - Monday, 01 May 2023, 21:04 GMT
$ git tag --contains ef3064e461e64cdd6647e7604cf06e3c0131b099
v6.2.10-arch1
v6.2.11-arch1
v6.2.12-arch1
v6.2.13-arch1

The Arch Linux package contains everything from the upstream stable tag it is based on.
Comment by loqs (loqs) - Wednesday, 10 May 2023, 15:31 GMT
So the issue has been fixed as of 6.3.1.arch1-1 [1]?

[1] https://gitlab.freedesktop.org/drm/amd/-/issues/2537#note_1898098
Comment by phonemic (phonemic) - Wednesday, 10 May 2023, 18:58 GMT
You can close the bug report. I cannot reproduce the issue in 6.3.1.
