FS#77811 - linux package v6.2.3.arch1-1 causes a kernel panic at shutdown

Attached to Project: Arch Linux
Opened by Robin Candau (Antiz) - Friday, 10 March 2023, 18:02 GMT
Last edited by Jan Alexander Steffens (heftig) - Saturday, 11 March 2023, 16:23 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

Description: The linux kernel package v6.2.3.arch1-1 currently in [testing] causes a kernel panic at shutdown.
This behavior has been reproduced and confirmed by another tester.

Additional info:
* Package version: linux 6.2.3.arch1-1
* Log files: https://pasteboard.co/RxRSXUj8t8qm.jpg I'm sorry, I'm unsure how to recover these logs after the next boot so that I can send them in a more readable form than a picture; I couldn't find them in `journalctl` or `dmesg`. If any other information is needed, I'd be happy to provide it :)

Steps to reproduce: Upgrade the system to get the linux package v6.2.3.arch1-1 from [testing], reboot into the new kernel, then shut down the system.

Closed by  Jan Alexander Steffens (heftig)
Saturday, 11 March 2023, 16:23 GMT
Reason for closing:  Fixed
Additional comments about closing:  6.2.3.arch2-1
Comment by Christian Heusel (gromit) - Friday, 10 March 2023, 18:32 GMT
I can confirm this bug as I was chatting with Antiz about it! Thanks for writing the report!
Comment by Robin Candau (Antiz) - Friday, 10 March 2023, 18:40 GMT
For what it's worth, some people seem to be able to shut down/reboot just fine with linux 6.2.3.arch1-1, so it looks like this behavior isn't systematic.
Comment by Luna Jernberg (bittin1) - Friday, 10 March 2023, 18:51 GMT
Downgraded to 6.2.2-arch2-1 just in case; I had that lying around on my M.2 SSD. Thanks for the heads-up @Antiz
Comment by loqs (loqs) - Friday, 10 March 2023, 19:33 GMT
Comment by Mike Cloaked (mcloaked) - Friday, 10 March 2023, 20:00 GMT
Comment by Mike Cloaked (mcloaked) - Friday, 10 March 2023, 20:45 GMT
I just tested 6.2.3 built reverting commit

bfe46d2efe46c5c952f982e2ca94fe2ec5e58e2a

and the kernel Oops no longer occurs for me - so that resolves the issue I reported.

I have also tested a second build with two commits reverted as below, and this also
gives no Oops following usb umount.

So reverting both

bfe46d2efe46c5c952f982e2ca94fe2ec5e58e2a
57a425badc05c2e87e9f25713e5c3c0298e4202c

has resolved the issue for me.
Comment by loqs (loqs) - Friday, 10 March 2023, 21:00 GMT
Your findings match upstream's suspicion [1]. What if you backported the 2nd patch from the series [2] that stable skipped?

[1] https://lore.kernel.org/all/ad021e89-c05c-f85a-2210-555837473734%40kernel.dk/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dfd6200a095440b663099d8d42f1efb0175a1ce3
Comment by Gene (GeneC) - Friday, 10 March 2023, 21:56 GMT
I had this as well (the 2 reverts fixed it).

For me:
- "desktop like" systems, by which I mean those with 1 or 2 disks and no additional connected storage, reboot fine.
- "server like" systems with storage (e.g. ext4 on LVM over mdadm RAID-6) crash and hang on shutdown with the same trace.
My btrfs RAID systems were never tested with 6.2.3, so I cannot comment on them.

Comment by Henrique Custódio (henriqueffc) - Friday, 10 March 2023, 21:57 GMT
I didn't have any problems with the update. Lenovo S145 - i7 8565U / Nvidia MX110. It must only affect some specific configurations.
Comment by Robin Candau (Antiz) - Friday, 10 March 2023, 21:59 GMT
FWIW, the laptop I'm having this issue with has 2 disks (one nvme SSD and one HDD) and no additional connected storage.
Comment by Gene (GeneC) - Friday, 10 March 2023, 22:04 GMT
The crashes I have seen all have (from block/foo.c)

throtl_pd_offline+0x40/0x70
blkcg_deactivate_policy+0xab/0x140

Perhaps 2 disks is enough if there's a race - the more disks the more likely perhaps.
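
For anyone comparing traces from different machines: each frame above follows the usual kernel `symbol+offset/size` notation, with both numbers in hex. Below is a minimal Python sketch, not part of the original report, for pulling those fields out of pasted trace lines. The `Frame` type and `parse_frame` helper are hypothetical names made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    """One stack frame in symbol+offset/size form (hypothetical helper type)."""
    symbol: str   # function name
    offset: int   # byte offset into the function
    size: int     # total size of the function in bytes


# Matches e.g. "throtl_pd_offline+0x40/0x70" anywhere in a line.
_FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return the first frame found in `line`, or None if there is none."""
    match = _FRAME_RE.search(line)
    if match is None:
        return None
    return Frame(match["symbol"], int(match["offset"], 16), int(match["size"], 16))


if __name__ == "__main__":
    # The two frames quoted in the comment above.
    for text in ("throtl_pd_offline+0x40/0x70", "blkcg_deactivate_policy+0xab/0x140"):
        print(parse_frame(text))
```

Matching on the symbol names rather than the raw offsets is probably the safer way to check whether two reports show the same crash, since offsets can differ between kernel builds.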
Comment by Gene (GeneC) - Friday, 10 March 2023, 22:07 GMT
The good news is that there's a simple fix :)
Comment by Geert Hendrickx (ghen) - Friday, 10 March 2023, 23:47 GMT
Here, two systems with just a single NVMe drive each consistently have the same issue.
Comment by Jan Alexander Steffens (heftig) - Saturday, 11 March 2023, 00:13 GMT
should be fixed in 6.2.3.arch2-1
Comment by Robin Candau (Antiz) - Saturday, 11 March 2023, 08:40 GMT
I confirm, 6.2.3.arch2-1 fixed it for me.
Comment by Andreas Radke (AndyRTR) - Saturday, 11 March 2023, 09:22 GMT
Can somebody affected by this issue please check whether the 6.1.16-1-lts kernel is also affected by this bug? Those commits are included, but I haven't seen any crashes here so far.
Comment by Christian Heusel (gromit) - Saturday, 11 March 2023, 09:47 GMT
The issue also exists for me in the latest lts kernel, I just checked.

For the linux package I can also confirm that `6.2.3.arch2-1` works!
Comment by Geert Hendrickx (ghen) - Saturday, 11 March 2023, 10:23 GMT
6.2.4 is out, containing just this fix: https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.2.4
Comment by Christian Heusel (gromit) - Saturday, 11 March 2023, 10:27 GMT
Comment by Mike Cloaked (mcloaked) - Saturday, 11 March 2023, 11:16 GMT
6.2.4 has been released upstream with the two bad commits reverted, and this version fixes the problem for me.
Comment by Andreas Radke (AndyRTR) - Saturday, 11 March 2023, 15:24 GMT
6.1.18-1-lts is in testing and should be fixed.
Comment by Luna Jernberg (bittin1) - Saturday, 11 March 2023, 15:39 GMT
and hopefully a new 6.2.x soon too, as it's released upstream
