FS#62008 - [linux-hardened] Kernel Panic with 4.20.16.a

Attached to Project: Arch Linux
Opened by freswa (frederik) - Thursday, 14 March 2019, 10:45 GMT
Last edited by Levente Polyak (anthraxx) - Wednesday, 20 March 2019, 09:23 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To No-one
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 5
Private No

Details

Description:
The kernel does not boot and nothing is shown on the monitor, but the Caps Lock LED blinks (IIRC that indicates a kernel panic).


Additional info:
* 4.20.15.a is good, 4.20.16.a is bad

Steps to reproduce:
Try to boot the hardened kernel.

Closed by  Levente Polyak (anthraxx)
Wednesday, 20 March 2019, 09:23 GMT
Reason for closing:  Fixed
Additional comments about closing:  4.20.17.a-1
Comment by Levente Polyak (anthraxx) - Thursday, 14 March 2019, 11:18 GMT
4.20.16.a boots fine here, so it seems related to your hardware.

1. Can you please try building non-hardened 4.20.16 from the Arch vanilla package, using its 4.20 configs as of this state:
https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/linux&id=b44aa57a4ff15e6c41d24429aff240d2e3980645

If you hit it on vanilla 4.20.16 as well, please bisect for the bad commit between v4.20.15 and v4.20.16 using a git checkout.
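
The bisection requested above is a binary search over the commits between the good and the bad tag. The following is only a minimal sketch of that idea, not the actual git workflow: the commit names `c0`..`c7` and the `is_bad` predicate are hypothetical stand-ins for "build this commit, boot it, see whether it panics".

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return commits[lo]


# Hypothetical history standing in for the commits after v4.20.15.
history = [f"c{i}" for i in range(8)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 5)
print(culprit)
```

Each round halves the remaining range, so even a long stable-release range only needs a handful of build-and-boot cycles.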
Comment by anonymous (Austaras) - Thursday, 14 March 2019, 12:46 GMT
Same here, but in a VPS on KVM.

I am also using nftables
Comment by Levente Polyak (anthraxx) - Thursday, 14 March 2019, 18:11 GMT
Same response: please do the above so we can get any further with this.
Comment by freswa (frederik) - Friday, 15 March 2019, 11:28 GMT
No panic with any of 4.20.16, 5.0.1, or 5.0.2 (all vanilla).
Comment by Levente Polyak (anthraxx) - Friday, 15 March 2019, 15:09 GMT
Can you try vanilla 4.20.16 with:
CONFIG_BUG_ON_DATA_CORRUPTION=y
CONFIG_REFCOUNT_FULL=y
CONFIG_PANIC_ON_OOPS=y

If that still works, please try all the Kconfig values from hardened in the vanilla package.
I know it's a lot of work and compiling. I'm sorry, but I can't reproduce it myself, and it would really move this issue forward if you could debug and test the problem.
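
For anyone repeating this test, flipping a few options in an existing kernel config can be scripted. The helper below is a rough sketch under assumptions: `apply_kconfig` is a hypothetical name, and the sample config text is made up for illustration. It only rewrites the text; it does not resolve Kconfig dependencies.

```python
import re
from typing import Dict


def apply_kconfig(config_text: str, overrides: Dict[str, str]) -> str:
    """Return config_text with each option in overrides set to its value.

    Existing "CONFIG_X=..." and "# CONFIG_X is not set" lines are replaced;
    options that do not appear at all are appended at the end.
    """
    seen = set()
    out = []
    for line in config_text.splitlines():
        m = re.match(r"^(?:# )?(CONFIG_\w+)(?:=| is not set)", line)
        if m and m.group(1) in overrides:
            name = m.group(1)
            out.append(f"{name}={overrides[name]}")
            seen.add(name)
        else:
            out.append(line)
    # Append options that were not present in the original text.
    out.extend(f"{n}={v}" for n, v in overrides.items() if n not in seen)
    return "\n".join(out) + "\n"


# Made-up config excerpt; a real one would come from the package's config file.
sample = (
    "# CONFIG_BUG_ON_DATA_CORRUPTION is not set\n"
    "# CONFIG_REFCOUNT_FULL is not set\n"
    "CONFIG_PANIC_TIMEOUT=0\n"
)
patched = apply_kconfig(sample, {
    "CONFIG_BUG_ON_DATA_CORRUPTION": "y",
    "CONFIG_REFCOUNT_FULL": "y",
    "CONFIG_PANIC_ON_OOPS": "y",
})
print(patched, end="")
```

After editing a config this way, running `make olddefconfig` in the kernel tree lets Kconfig fill in any dependent options before building.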
Comment by Thibaut Sautereau (thithib) - Friday, 15 March 2019, 15:59 GMT
Same issue with two different physical machines on linux-hardened-4.20.16.a-1.

I cannot debug right now but I was able to see a call to __nf_tables_abort() in my stack trace. Disabling the nftables systemd service allows my system to boot correctly. After looking for commits in v4.20.15..v4.20.16 touching __nf_tables_abort(), I'd bet on this one: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.20.y&id=6f9518c5bc88e5206ed68df1f911e47095414476
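
One way to run such a search is `git log -L`, which follows the history of a single function over a revision range. This is only a sketch and not necessarily how the commit above was found: the helper name `function_history_cmd` is hypothetical, and the file path is an assumption about where `__nf_tables_abort()` lives in the tree.

```python
import shlex
from typing import List


def function_history_cmd(func: str, path: str, good: str, bad: str) -> List[str]:
    """Build a `git log -L` command tracing one function over good..bad."""
    return ["git", "log", f"-L:{func}:{path}", f"{good}..{bad}"]


cmd = function_history_cmd(
    "__nf_tables_abort",
    "net/netfilter/nf_tables_api.c",  # assumed location of the function
    "v4.20.15",
    "v4.20.16",
)
# Print the command for copy and paste into a checkout of the stable tree.
print(shlex.join(cmd))
```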
Comment by freswa (frederik) - Friday, 15 March 2019, 16:00 GMT
I'll try to reproduce; for what it's worth, I am also using nftables.
Comment by loqs (loqs) - Friday, 15 March 2019, 16:57 GMT
If you remove CONFIG_PANIC_ON_OOPS=y and change CONFIG_BUG_ON_DATA_CORRUPTION=y to CONFIG_DEBUG_LIST=y
does that produce an OOPS in the journal rather than a panic?
Comment by freswa (frederik) - Friday, 15 March 2019, 17:30 GMT
I've just tested linux-hardened 4.20.16 with commit 6f9518c5bc88e5206ed68df1f911e47095414476 reverted, by applying the attached patch.
No panic anymore.
Comment by loqs (loqs) - Friday, 15 March 2019, 17:54 GMT
Comment by Eduard Toloza (edu4rdshl) - Saturday, 16 March 2019, 00:10 GMT
Not reproducible here (I'm not using nftables).
Comment by Levente Polyak (anthraxx) - Tuesday, 19 March 2019, 23:25 GMT
I have backported the fix; please try 4.20.17.a-1.
