FS#70236 - [linux][linux-zen] 5.11 - Very slow boot process. Soft lockup.
Attached to Project:
Arch Linux
Opened by env (ENV25) - Tuesday, 30 March 2021, 19:33 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Thursday, 14 October 2021, 21:49 GMT
Details
Description:
Soft lockups during early boot. Takes very long to boot. Many stack traces in dmesg once finished booting (see attachment). This only happens in 5.11 kernels (both arch and zen); the 5.10 LTS kernel works fine.
The messages "watchdog: BUG: soft lockup - CPU#x stuck for 23s! [xxxxx/xx]" are very vague, I can't tell what exactly the problem is. I don't know how to interpret the stack traces either, help would be welcome.
Additional info:
* package version(s)
kernels 5.11.10.arch1-1 5.11.10.zen1-1
* config and/or log files etc.
Laptop: HP Pavilion Laptop 13-an0xxx
CPU: i3-8145U
Boot: UEFI sd-boot
Graphics: xorg w/ i915 (no nvidia or radeon)
DE: plasma & sddm
dmesg dump is attached.
Closed by Sven-Hendrik Haase (Svenstaro)
Thursday, 14 October 2021, 21:49 GMT
Reason for closing: Fixed
Additional comments about closing: 2021-10-12: A task closure has been requested. Reason for request: This doesn't happen anymore in the same way.
In mkinitcpio.conf I replaced udev with systemd and I have i915 module added. I use kernel-install to automatically concatenate microcode.
Same issues with default settings and separate microcode.
I have "root=PARTLABEL=ROOT rw resume=PARTLABEL-swap quiet splash" in my kernel command-line for quiet boot (no plymouth).
Same issue with "root=PARTLABEL=ROOT rw".
Also note the "soft lockup" messages happen before systemd's initramfs messages, so it probably happens before initramfs.
I also have some ACPI errors, but I've always had those.
[ 95.255064] Setting dangerous option enable_guc - tainting kernel
[ 95.255067] Setting dangerous option enable_fbc - tainting kernel
I've attached a new dmesg dump.
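For anyone going through a dump like this, a rough Python sketch for summarizing the "soft lockup" lines might help. It assumes the usual dmesg line layout ("[ timestamp] watchdog: BUG: soft lockup - CPU#N stuck for Ns! [task:pid]"); the function name summarize_soft_lockups is made up for illustration and is not part of any tool.

```python
import re
from collections import Counter

# Assumed line layout, based on the message quoted in the description:
# [   28.123456] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [task:pid]
LOCKUP_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! \[(?P<task>.+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(lines):
    """Return (timestamp of the earliest soft lockup, {cpu: count}).

    `lines` is any iterable of dmesg lines. The timestamp is None
    when no soft lockup line is found.
    """
    per_cpu = Counter()
    first_ts = None
    for line in lines:
        match = LOCKUP_RE.match(line.strip())
        if match is None:
            continue  # not a soft lockup line
        per_cpu[int(match["cpu"])] += 1
        ts = float(match["ts"])
        if first_ts is None or ts < first_ts:
            first_ts = ts
    return first_ts, dict(per_cpu)
```

Passing an open file works too, e.g. summarize_soft_lockups(open("dmesg.txt")), where dmesg.txt is a placeholder name for a saved dump. The earliest timestamp can then be compared against the first initramfs message to check whether the lockups really start before it.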
Yes, this still happens in 5.11.11. Compared to the previous version, it doesn't seem to happen consistently. Sometimes it boots correctly. 50% chance.
I did not know it was possible to update firmware. I'll look it up.
I tried to do sysrq 't' but it didn't work, I think. I'll try something else later.
https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
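If the keyboard combination does nothing (it can be restricted by the kernel.sysrq setting), the same task dump can be requested by writing 't' to /proc/sysrq-trigger as root, as described in the document linked above. A minimal Python sketch, assuming root privileges; the function name request_task_dump and the trigger_path parameter are only for illustration:

```python
def request_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel for a SysRq 't' task dump.

    Writes the single character 't' to the trigger file. On a real
    system this needs root, and the output ends up in the kernel log
    (readable with dmesg), not on stdout.
    """
    with open(trigger_path, "w") as trigger:
        trigger.write("t")
```

After calling request_task_dump() as root, the task list should appear at the end of the dmesg output.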
I've seen a few issues, in this bugtracker and elsewhere, that are similar to mine but not exactly the same. Is this kind of thing common?
The SSD firmware update might fix the ACPI errors. It seems there was an issue with Drive Self Test.
https://support.hp.com/us-en/drivers/selfservice/HP-Pavilion-13-an0000-Laptop-PC/23238359/
To update the firmware, I am planning to try using Windows PE:
https://wiki.archlinux.org/index.php/Windows_PE
BIOS update instructions:
https://gist.github.com/eNV25/c8001491dc0440656ff7b0ae18993ba1
What PKGBUILD should I use?
This issue happens in both linux 5.13.4.arch1-1 and linux-zen 5.13.4.zen1-1. It happens when rebooting but not after poweroff.
I wasn't able to find the commit. My laptop is too slow to compile or bisect linux.