FS#74285 - [linux] 5.17.1.arch1-1 crashes when waking from suspend

Attached to Project: Arch Linux
Opened by Ben Grant (190n) - Thursday, 31 March 2022, 03:51 GMT
Last edited by Jan Alexander Steffens (heftig) - Sunday, 10 April 2022, 11:45 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Jan Alexander Steffens (heftig)
David Runge (dvzrv)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 7
Private No

Details

Description:

After updating to Linux 5.17.1.arch1-1, my laptop seems to suspend normally, but it crashes when waking back up: the fans spin up and the power light comes on, but the screen remains off and nothing else indicates it is working. Only forcing shutdown can get it out of this state. Downgrading to Linux 5.16.16.arch1-1 and installing 5.15.32-1-lts have both fixed this problem. Another user with the same issue (https://redd.it/ts2m56) claims that this is a kernel panic, but I'm not sure how I could check that myself -- logs from the failing boot (journalctl -b -1) stop right after it successfully suspends, and even when I switch from GUI to a tty, the screen doesn't turn back on for me to see any console output.

Additional info:
* package version(s): linux 5.17.1.arch1-1, tp_smapi 0.43-382, acpi_call-dkms 1.2.2-1, v4l2loopback-dkms 0.12.5-2, virtualbox-host-dkms 6.1.32-2
* config and/or log files etc.: attached fail-suspend.log is log output from 5.17.1.arch1-1 from when I tell it to suspend until the end
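
One way to see where the failing boot's log stops is to dump the previous boot's journal to a file and look at the last kernel lines. This is only a sketch: it assumes a persistent journal (otherwise `journalctl -b -1` has nothing to show), and the file name `prev-boot.log` and the helper name `last_kernel_lines` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last N kernel lines from a saved journal dump.
# Produce the dump first (assumes a persistent journal), for example:
#   journalctl -b -1 --no-pager > prev-boot.log
last_kernel_lines() {
    local file="$1" count="${2:-20}"
    # Journal lines look like "<date> <host> kernel: <message>".
    grep -F ' kernel: ' "$file" | tail -n "$count"
}
```

Calling `last_kernel_lines prev-boot.log 5` would then show the final five kernel messages that reached the disk before the machine stopped logging.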

Hardware: Lenovo ThinkPad T495s, AMD Ryzen 5 PRO 3500U

Steps to reproduce:
* Boot the system
* Attempt to suspend (close the lid, short press on the power button, or systemctl suspend)
* Attempt to wake (open the lid or press the power button again)

Closed by  Jan Alexander Steffens (heftig)
Sunday, 10 April 2022, 11:45 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 5.17.2.arch3-1
Comment by Samuel Reddy (GamerTechUniverse) - Thursday, 31 March 2022, 09:42 GMT
On my laptop, the caps lock key light flickers when there is a kernel panic. Note that I am the OP of the Reddit post mentioned.
Comment by Ben Grant (190n) - Thursday, 31 March 2022, 18:58 GMT
Interesting. I don't see that on my ThinkPad. What laptop do you have?
Comment by Samuel Reddy (GamerTechUniverse) - Thursday, 31 March 2022, 20:42 GMT
I have a Dell Inspiron 3505.
Comment by Tim Sweet (tsweet64) - Thursday, 31 March 2022, 21:32 GMT
I can reproduce this on a Lenovo IdeaPad S340, also with the blinking caps lock and also with `linux-zen`. Always fails to resume when closing the lid to suspend, or when unplugging/plugging a monitor while suspended. It only sometimes happens when clicking suspend without touching lid or monitor. I slightly suspect `amdgpu` or something else graphics/KMS/DRM-related.
Comment by Tim Sweet (tsweet64) - Friday, 01 April 2022, 05:12 GMT
Possibly related: https://bugzilla.kernel.org/show_bug.cgi?id=215742

My laptop does have an nvme drive, what about y'all's laptops?
Comment by Ben Grant (190n) - Friday, 01 April 2022, 05:15 GMT
Yes, I actually have the exact same model as OP there (WD SN550). But it does work for me on 5.15.
Comment by dcard (DCard) - Friday, 01 April 2022, 11:59 GMT
I have the same laptop and the same happened to me. I confirm that downgrading to Linux 5.16.16.arch1-1 fixes it.
Comment by dcard (DCard) - Friday, 01 April 2022, 13:11 GMT
After a few hours, I can confidently say that downgrading fixed it.
Comment by Samuel Reddy (GamerTechUniverse) - Friday, 01 April 2022, 21:04 GMT
tsweet64, my laptop does have an NVMe SSD, so that bug report may be related to the issue.
Comment by Josefine Hofmarcher (josefine) - Saturday, 02 April 2022, 08:11 GMT
Hello, I have this problem as well. Starting the LTS kernel temporarily solves the problem.
Hopefully there will be a kernel 5.17 update soon...


CPU: 8-core AMD FX-8350 (-MT MCP-) speed/min/max: 2492/1400/4000 MHz
Kernel: 5.17 Up: 41m Mem: 6696.0/31998.6 MiB (20.9%)
Storage: 3.69 TiB (9.7% used) Procs: 383 Shell: Bash inxi: 3.3.14


Graphics:
Device-1: AMD Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X]
driver: amdgpu v: kernel
Device-2: Realtek Full HD Webcam type: USB driver: snd-usb-audio,uvcvideo
Display: x11 server: X.Org v: 1.21.1.3 driver: X: loaded: amdgpu
unloaded: fbdev,modesetting,radeon,vesa gpu: amdgpu resolution:
1: 1920x1080~60Hz 2: 1920x1080~60Hz 3: 1680x1050~60Hz 4: 1920x1080~60Hz
OpenGL: renderer: AMD Radeon RX 550 / 550 Series (polaris12 LLVM 13.0.1
DRM 3.42 5.15.32-1-lts)
v: 4.6 Mesa 22.0.0
Comment by Reik Keutterling (Spielkind) - Saturday, 02 April 2022, 22:10 GMT
Not sure if related, but I have had a similar issue on my X280 since 5.17, where the ThinkPad wakes up immediately after entering suspend.

https://bbs.archlinux.org/viewtopic.php?pid=2029159#p2029159

Downgrading to 5.16.16 "fixed" the issue for me; as reported, "Always-on USB" may also help.
Comment by Tim Sweet (tsweet64) - Sunday, 03 April 2022, 23:55 GMT
I git-bisected the kernel and identified a potentially responsible commit on my system (this one was [87ebbb8c612b1214f227ebb8f25442c6d163e802] ACPI: processor: idle: Only flush cache on entering C3). I have made a new upstream bug report for it: https://bugzilla.kernel.org/show_bug.cgi?id=215797
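
For readers who want to repeat this kind of search, a `git bisect` session in a kernel git checkout generally looks like the sketch below. The refs `v5.16` and `v5.17` are assumptions chosen to bracket the regression (the comment does not say which endpoints were used), the helper name `start_bisect` is hypothetical, and building, booting, and testing suspend at each step is left to the reader.

```shell
#!/usr/bin/env bash
# Sketch: start a bisect between a known-good and a known-bad ref.
# Run inside a kernel git checkout; both refs are placeholders.
start_bisect() {
    local good="$1" bad="$2"
    # "git bisect start <bad> <good>" marks both endpoints in one step.
    git bisect start "$bad" "$good"
}

# Example session (refs are assumptions, not taken from the report):
#   start_bisect v5.16 v5.17
#   ... build, boot, and test suspend/resume on the checked-out commit ...
#   git bisect good      # if resume works on this build
#   git bisect bad       # if resume hangs on this build
#   git bisect reset     # once git has named the first bad commit
```

Each `good` or `bad` verdict roughly halves the remaining range, so even a full release cycle of commits needs only a modest number of test boots.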
Comment by Hedy Ache (ache) - Tuesday, 05 April 2022, 12:23 GMT
A slightly stranger case.
I have a closely related problem on a ThinkPad X395.

Suspend works, but the machine crashes on the second wake-up.
So running `systemctl suspend` twice reproduces the bug.

journalctl isn't very verbose. I have a trace on 5.16.9 (where resume works!) and nothing in 5.17.1 (on the first suspend).

5.17, first suspend, it works
~~~plain
avril 03 01:17:12 law kernel: ACPI: PM: Restoring platform NVS memory
avril 03 01:17:12 law kernel: Enabling non-boot CPUs ...
~~~

5.16, after a downgrade.

~~~plain
avril 05 14:05:47 law kernel: ACPI: PM: Restoring platform NVS memory
avril 05 14:05:47 law kernel: ------------[ cut here ]------------
avril 05 14:05:47 law kernel: WARNING: CPU: 0 PID: 83056 at drivers/iommu/amd/init.c:841 amd_iommu_enable_interrupts+0x34d/0x420
avril 05 14:05:47 law kernel: Modules linked in: ppp_deflate bsd_comp sr_mod ppp_async ppp_generic slhc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc huawei_cdc_ncm cdc_wdm cdc_ncm cdc_ether uas option usbnet usb_storage mii usb_ww>
avril 05 14:05:47 law kernel: platform_profile i2c_piix4 typec_ucsi ipmi_devintf tpm libphy rfkill typec snd ipmi_msghandler rng_core roles soundcore wmi video mac_hid pinctrl_amd i2c_scmi acpi_cpufreq pkcs8_key_parser dm_multipath dm_mod i2c_dev cp210x sg crypto_user fus>
avril 05 14:05:47 law kernel: CPU: 0 PID: 83056 Comm: systemd-sleep Tainted: G W 5.16.9-arch1-1 #1 f76945f6a1bdef4447cac8f8524be2a356a0339d
avril 05 14:05:47 law kernel: Hardware name: LENOVO 20NLCTO1WW/20NLCTO1WW, BIOS R13ET49W(1.23 ) 11/24/2020
avril 05 14:05:47 law kernel: RIP: 0010:amd_iommu_enable_interrupts+0x34d/0x420
avril 05 14:05:47 law kernel: Code: ff ff 49 8b 7f 18 89 04 24 e8 df 04 ee ff 8b 04 24 e9 4b fd ff ff 0f 0b 4d 8b 3f 49 81 ff 30 0c 76 88 0f 85 05 fd ff ff eb 96 <0f> 0b 4d 8b 3f 49 81 ff 30 0c 76 88 0f 85 f1 fc ff ff eb 82 31 f6
avril 05 14:05:47 law kernel: RSP: 0018:ffffbeaa82e73c88 EFLAGS: 00010046
avril 05 14:05:47 law kernel: RAX: 000000010033dca9 RBX: 0000000000000000 RCX: 0000000000000000
avril 05 14:05:47 law kernel: RDX: 000000000000575d RSI: 00000000000051e2 RDI: 000000010033854c
avril 05 14:05:47 law kernel: RBP: 0000000080000000 R08: 0000000000000000 R09: 000000000000000f
avril 05 14:05:47 law kernel: R10: 0000000079726f6d R11: 000000006d656d20 R12: 000ffffffffffff8
avril 05 14:05:47 law kernel: R13: 0800000000000000 R14: ffffbeaa82e73c90 R15: ffff9b0dc004a800
avril 05 14:05:47 law kernel: FS: 00007f26dd8d6e80(0000) GS:ffff9b0eb7a00000(0000) knlGS:0000000000000000
avril 05 14:05:47 law kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
avril 05 14:05:47 law kernel: CR2: 00007f10b27c0024 CR3: 00000001009e8000 CR4: 00000000003506f0
avril 05 14:05:47 law kernel: Call Trace:
avril 05 14:05:47 law kernel: <TASK>
avril 05 14:05:47 law kernel: ? early_enable_iommus+0x1c5/0x300
avril 05 14:05:47 law kernel: ? enable_iommus_v2+0x8e/0x130
avril 05 14:05:47 law kernel: syscore_resume+0x48/0x160
avril 05 14:05:47 law kernel: suspend_devices_and_enter+0x6da/0x7e0
avril 05 14:05:47 law kernel: pm_suspend.cold+0x2fb/0x342
avril 05 14:05:47 law kernel: state_store+0x71/0xd0
avril 05 14:05:47 law kernel: kernfs_fop_write_iter+0x119/0x1b0
avril 05 14:05:47 law kernel: new_sync_write+0x159/0x1f0
avril 05 14:05:47 law kernel: vfs_write+0x1eb/0x280
avril 05 14:05:47 law kernel: ksys_write+0x67/0xe0
avril 05 14:05:47 law kernel: do_syscall_64+0x59/0x80
avril 05 14:05:47 law kernel: ? syscall_exit_to_user_mode+0x23/0x40
avril 05 14:05:47 law kernel: ? do_syscall_64+0x69/0x80
avril 05 14:05:47 law kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
avril 05 14:05:47 law kernel: RIP: 0033:0x7f26de2ca257
avril 05 14:05:47 law kernel: Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
avril 05 14:05:47 law kernel: RSP: 002b:00007ffdf75c0638 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
avril 05 14:05:47 law kernel: RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f26de2ca257
avril 05 14:05:47 law kernel: RDX: 0000000000000004 RSI: 00007ffdf75c0720 RDI: 0000000000000004
avril 05 14:05:47 law kernel: RBP: 00007ffdf75c0720 R08: 000055b0e57294f0 R09: 0000000000000000
avril 05 14:05:47 law kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
avril 05 14:05:47 law kernel: R13: 000055b0e57253c0 R14: 0000000000000004 R15: 00007f26de3c37a0
avril 05 14:05:47 law kernel: </TASK>
avril 05 14:05:47 law kernel: ---[ end trace 71b094c9afde75cf ]---
avril 05 14:05:47 law kernel: Enabling non-boot CPUs ...
~~~

On a failed resume (second suspend on 5.17.1), my log is very similar to the OP's:

~~~plain
avril 03 01:17:40 law systemd-logind[611]: Suspending...
avril 03 01:17:40 law systemd[1]: Reached target Sleep.
avril 03 01:17:40 law systemd[1]: Starting System Suspend...
avril 03 01:17:40 law systemd-sleep[4660]: Entering sleep state 'suspend'...
~~~
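
To pull blocks like the 5.16 warning above out of a longer journal dump, the markers the kernel prints around each trace can be used as a range. This is a minimal sketch, assuming the dump was saved to a file first; the helper name `kernel_traces` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every kernel trace block from a saved journal dump.
kernel_traces() {
    # awk range pattern: start at the "[ cut here ]" marker and
    # stop at the matching "[ end trace ... ]" line, both inclusive.
    awk '/\[ cut here \]/,/\[ end trace/' "$1"
}
```

An empty result only means no trace reached the journal; as the 5.17.1 logs here show, a hang on resume can leave nothing behind.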
Comment by Tim Sweet (tsweet64) - Sunday, 10 April 2022, 07:28 GMT
The problematic commit was reverted upstream in dfbba2518aac4204203b0697a894d3b2f80134d3. Could this commit be cherry-picked into the `linux` package, given the bug's severity in breaking suspend functionality on many devices?
