FS#65869 - [linux-hardened] Kernel panic caused by non zeroed-free pages
Attached to Project:
Arch Linux
Opened by Filip Brygidyn (fbrygidyn) - Tuesday, 17 March 2020, 20:04 GMT
Last edited by freswa (frederik) - Sunday, 13 September 2020, 15:28 GMT
Opened by Filip Brygidyn (fbrygidyn) - Tuesday, 17 March 2020, 20:04 GMT
Last edited by freswa (frederik) - Sunday, 13 September 2020, 15:28 GMT
|
Details
The problem:
For some time now I am experiencing kernel panics on linux-hardnened kernel. The earliest captured traces I have are from 5.4.7 version and I am reproducing it on the latest 5.5 git branch. The details: I build a 5.5 version with symbols and you can see a symbolized trace in symbolized_panic.txt (build on this tree: https://github.com/anthraxx/linux-hardened/tree/b5d24fe9e7cf98359c2910e7444da3022983c3ed using the linux-hardened-git aur package) It essentially boils down to a check at https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/mm/page_alloc.c#L2193 ... which fails - the pages at this point are not always zeroed. I added a crude patch to see more verbose information (0001-verbose.patch) And with it applied you can see an output in after_verbose_patch.txt There are several pages that appear to have one or two 64 Byte blocks that are not zeroed. I also tried applying only this check on a vanilla kernel - same results. The patch consisted of - lines 2190-2194 from https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/mm/page_alloc.c - lines 218-223 from https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/include/linux/highmem.h And I tested on both existing arch installation as well as on fresh, clean install to eliminate any possible rouge services corrupting memory - if there is some then it is in some base package. Possible explanation: I see 3 things that can be happening here: 1. The assertion in the linux-hardware patches at https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/mm/page_alloc.c#L2193 is wrong. The logic seems to be correct. If the init-after-free flag is present then when we reclaim a page from a free list it should be zero-ed. But maybe the 'free list' does not only contain pages that were previously 'freed' (and initialized) but also some other ones. 2. This is an upstream bug - As explained above the vanilla kernel with just this check added resulted in the same panic. But this is assuming that that the check is valid. 3. This is some hardware/firmware-related bug - the 64 Byte blocks may be corrupted cache lines? @anthraxx or someone: Please check if the assert at https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/mm/page_alloc.c#L2193 is valid. I do not have enough knowledge about the mm subsystem to tell. Test hardware: I tested in on several configurations: the parts I had to swap around were as follows: CPUs: ryzen 2600, ryzen 2200g Mobo: x470D4U, B450M steel legend Ram: 4 sticks of ECC ram: 2x8GB + 2x16GB SSD: intel 760p (nvme), samsung 860 evo (sata) No gpu - all tested headless PSU: supermicro 1200W rack unit, seasonic 360W gold unit Cooling: wraith prism, wraith stealth Traces gathered over a serial port I mixed and matched those components around and ruled out any single component, I also went to town in the BIOS and tried disabling all features I could find + checked lower/higher memory frequencies aside from stock, flashed old and new BIOSes, tried different DIMM slots. The only 2 common hardware parts that I cannot rule out are that all configurations used a 2nd gen ryzen on an asrock mobo. Steps to reproduce: Install linux-hardware package - I tried several, also compiled my own - all had the same problem. I found ways to reproduce: 1. Easiest: install memtester package and while booted into linux-hardened kernel run a shell like this (done while 16GB of ram were installed): while true do memtester 12G & sleep 8 killall memtester done What it essentially does is to allow memtester enough time to allocate 12G and write something. The panic occurs during the allocation part. Sometimes the panic happens right away, and sometimes it can take a few minutes. Doing something in the background seem to help - for example launch kernel compilation in parallel. 2. Just reboot repeatedly - eventually the boot will fail - you can see some of the failed boot traces in the random_panics.txt. Sometimes it happens right away and sometimes I could reboot for 2 hours without any issues. |
This task depends upon
Closed by freswa (frederik)
Sunday, 13 September 2020, 15:28 GMT
Reason for closing: Upstream
Additional comments about closing: https://bugzilla.kernel.org/show_bug.cgi ?id=206963
Sunday, 13 September 2020, 15:28 GMT
Reason for closing: Upstream
Additional comments about closing: https://bugzilla.kernel.org/show_bug.cgi ?id=206963
If you have any suggestions about kernel config options or run parameters then please let me know - reproducing a crash on a clean upstream vanilla kernel would rule out hardening patches and I could go to an upstream bug tracker with this.
Also: what do I have to do to be able to edit my own task description/attachments? I would like to fix grammar mistakes and replace the broken patch.
The easiest way would be to find the faulty vanilla patch that behaves incorrectly. To do this easily, you seem to be aware of a range of versions where this was good and where it started to behave badly. Could you try a git bisect between both versions to figure out the vanilla linux commit that introduced this regression?
And that all non-hardened linux version worked without a panic.
So I do not have any hardened-linux version that worked.
What I can try for now is to apply a minimal patch:
- lines 2190-2194 from https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/mm/page_alloc.c
- lines 218-223 from https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/include/linux/highmem.h
on top of an random old vanilla kernel and check if I can reproduce a panic. If I am not then bisect from there.
Edit: I can start doing it in about ~6-7 hours after I finish work. Also I am wondering if I will be able to boot on an old kernel on a ryzen system. If I for example take a 4.19 or 3.14.
I did use linux-hardened package for about half a year with little to no crashes. But since the beginning I always had random failed boots and freezes. Back then I didn't know what was the cause. Now after enabling a serial console I see what those problems were.
Only after I started stressing the system I was able to reproduce panics more frequently.
Just now I finished checking the 4.19.17: https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/linux-hardened&id=3cd727170e52501509ac75aaab5d01493ea53a3e
and the panic log is attached.
Will try 4.15.18 now ( https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/linux-hardened&id=8784547f9b40bc1a9dc1a56250a88c6588f7b983 )
4.18 crashes the same way.
I also see on github that there is a 4.14 version but I cannot find a PKGBUILD/config for it. It was most likely back when linux-hardened was in AUR.
@anthraxx Do you know where can I find those old versions? If I'm to fix the broken compilation then I would like to go as far back as I can.
https://git.archlinux.org/svntogit/community.git/tree/linux-hardened/trunk?id=a623982bcdcf0cae6e841a90120bb705f7ec1deb
you could clone the tree and see how to reach the objects, its not a valid ref/branch anymore as it has been deleted in community.
You may indeed need to downgrade gcc as well in some virtual machine or such :/
I tried building older gcc but god it's infuriating - AUR has __a_lot__ of old gcc versions and pretty much nothing older than gcc8 works.
Anyway: 4.14 paniced the same way.
So at this point I do not think I can go any farther back in kernel versions.
@anthraxx Can you tell me if the check at https://github.com/anthraxx/linux-hardened/blob/b5d24fe9e7cf98359c2910e7444da3022983c3ed/mm/page_alloc.c#L2193 is valid?
Or point to someone who I could ask to verify this? I know it would be more than weird to find out that this check was invalid for years without causing issues but it just seems suspicious to me.
Another theory I have could be a problem with ECC ram on ryzen (no official validation from AMD) - disabling ECC in BIOS didn't help but maybe the support of ECC dimms in both ECC and non-ECC modes is somehow broken/doing something unexpected.
Unfortunately I do not have any non-ECC ddr4 sticks ATM. Will try to get some.
Just to make sure I will check on a second system again ('B450m steel legend + 2200g + 8GB stick' instead of 'x470d4u + 2600 + 16GB stick')
4.14-hardened-custom-abominat... (51.5 KiB)
I compiled a KASAN version and did more testing. Now I see 3 separate issues.
To make this all organized:
All the testing I did was on a linux-hardened 5.5.10.a-1:
https://git.archlinux.org/svntogit/packages.git/tree/trunk?id=67b915f73578de3d6874df3cf674404423619db2
with config file modified to enable KASAN and debug info.
You can find a script I used in attached kernel_build_script.tar.gz
This should be a corresponding kernel tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/?h=v5.5.10&id=7ee76f1601f39ab3941c8b1c9a19dfc58f7cea47
And also: I noticed my previous logs were not wrapping lines - so longer lines were being cut. I changed the minicom config and logs attached this time seem to be full.
First I booted the mentioned kernel on a second machine (B450M steel legend + 2200G + 8G ecc stick + 860 EVO)
Was able to reproduce the crash as expected but before I even logged in to launch memtester - KASAN logged some errors. It doesn't seem like a source of the original issue but well... It did report use-after-free related to amdgpu module (This board/cpu combo did not allow me to disable IGPU)
You can find the logs (raw and symbolized) in 2nd_machine_panic.tar.gz This may be related to https://bugs.archlinux.org/task/59463
Not all KASAN call-traces were symbolized by 'decode_stacktrace.sh' script. I do not know why. If you know how I can fix this the let me know.
Anyway, After that I went back to the 1st machine (X470D4U + 2600 + 16G ecc stick + 760p), booted the same kernel and got a panic as well. Logs (raw and symbolized) in 1st_machine_panic.tar.gz
No KASAN logs of any kind.
But then I think I found a way to stop the panics:
I tried booting with mem_encrypt=off and after _a_lot_ of reboots and memtester launches I was not able to reproduce a panic. Removing mem_encrypt=off option resulted in easy reproduction after at most few reboots/memtester launches.
***Edit: The thing about BIOS option is incorrect - it disables TSME, not SME***
Now: Maybe you noticed that in the bug description I mentioned about disabling all the BIOS features I could find. This included SME. But it turns out that that option does not do anything on my X470D4U board (I did not check B450M yet). Even with SME disabled in the bios I could see "AMD Secure Memory Encryption (SME) active" in the logs.
***end edit***
And there is one more thing that kinda points to SME:
When I look at the non-zeroed regions I dumped into after_verbose_patch.txt they do seem random. Maybe a coincidence but maybe they were encrypted/decrypted by a broken SME.
If it is indeed SME than I should be able to reproduce some panic on a linux-lts package with mem_encrypt=on. the config of both linux and linux-lts package has SME disabled by default.
from linux-lts config:
CONFIG_AMD_MEM_ENCRYPT=y
# CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is not set
Without the additional checks of linux-hardened I do not yet know how hard it could be to reproduce.
TL;DR:
I see 3 problems now:
1. KASAN points a seemingly unrelated use-after-free in amdgpu
2. ***Edit: this is incorrect, see the remark above*** Asrock BIOS is broken - SME disable switch doesn't disable SME
3. The original one - Those panics look like a broken SME - disabling it with mem_encrypt=off seems to have helped for now
2nd_machine_panic.tar.gz (50.9 KiB)
1st_machine_panic.tar.gz (43.4 KiB)
On the other hand the lts kernel _with a check for zeroed pages_ added behaves the same way as hardened - with mem_encrypt=on I can reproduce the issue easily and with mem_encrypt=off I can't.
At this point I think this is no longer a linux-hardened issue so I am planning to open a upstream bug on bugzilla.kernel.org (probably this weekend or when I have time)
For reference I am attaching a few things:
- lts+checks_build_script.tar.gz - contains a script that I used for building linux-lts package. It also includes a minimal zero-check patch from linux-hardened and config flags modifications needed to reproduce.
- lts+checks_log.tar.gz - lts kernel logs with non zeroed pages (raw and symbolized)
If you have anything else you would like me to try/check then please let me know
lts+checks_log.tar.gz (44.5 KiB)
Either way I think we should give up on amd's mem_encrypt, its poorly engineered with an incomplete and borked ecosystem around it.
Tom Lendacky <thomas.lendacky@amd.com>
You could also try the kernel mailing list:
linux-kernel@vger.kernel.org (open list:X86 MM)
Individuals you might want to cc on the list:
Dave Hansen <dave.hansen@linux.intel.com> (maintainer:X86 MM)
Andy Lutomirski <luto@kernel.org> (maintainer:X86 MM)
Peter Zijlstra <peterz@infradead.org> (maintainer:X86 MM)
Thomas Gleixner <tglx@linutronix.de> (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT))
Ingo Molnar <mingo@redhat.com> (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT))
Borislav Petkov <bp@alien8.de> (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT))
"H. Peter Anvin" <hpa@zytor.com> (reviewer:X86 ARCHITECTURE (32-BIT AND 64-BIT))