FS#79444 - [linux-hardened] 6.4.9+ breaks AVX enumeration
Attached to Project:
Arch Linux
Opened by CodingCellist (CodingCellist) - Tuesday, 22 August 2023, 17:03 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:19 GMT
Details
Description:
When using linux-hardened 6.4.9-1 or 6.4.10-1 (6.4.10-1 is the latest at time of writing) along with lightdm and lightdm-webkit2-greeter, the system fails to start the greeter due to a coredump caused by a missing/removed part of libatomic: host-config.h (see the logs for details). This makes logging in impossible, as the continuous crash-and-restart of the greeter (re-)focuses its tty. Downgrading to linux-hardened 6.4.7-2 fixes the problem; downgrading any of the lightdm or webkit2 packages does not seem to affect things.

Additional info:
* link to upstream bug report: https://github.com/anthraxx/linux-hardened/issues/85
* package version(s):
  - lightdm: 1:1.32.0-4
  - lightdm-webkit2-greeter: 2.2.5-7
  - webkit2gtk: 2.40.5-1
  - systemd: 254.1-1
* config and/or log files etc. attached (redacted for brevity, please let me know if I accidentally removed too much)

Steps to reproduce:
1. Set up an Arch machine running linux-hardened 6.4.7 (rel 1 or 2), along with lightdm and lightdm-webkit2-greeter. Booting and logging in should work at this stage.
2. Upgrade the kernel to linux-hardened 6.4.9-1 or 6.4.10-1 and reboot. The greeter should never appear, leaving only a flickering cursor.
3. Attempt to switch tty using Ctrl+Alt+F3 (for example). This should briefly work when repeatedly pressing the key, although the restarting greeter will force you back to its tty within a second.

At this point, the only recovery method I thought of was to go via the install ISO on a USB, mounting the system manually, chroot-ing, and then downgrading the kernel from there. If there is an easier one, I'd be grateful to know, although my system is working now. I'm happy to provide more information if need be.
Closed by Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:19 GMT
Reason for closing: Moved
Additional comments about closing: https://gitlab.archlinux.org/archlinux/packaging/packages/linux-hardened/issues/1
I suspect the debugger failing to find host-config.h is not relevant: the file is not used at run time, and upstream does not ship it because no public header references it.
Unfortunately, spec_rstack_overflow=off does not seem to change anything.
The issue does not repro on linux (non-hardened) 6.4.{9,10,11}. There was a slight oddity in that switching to 6.4.9 and 6.4.10 seemed to require 2 reboots for the greeter to work, but after the second reboot, it persistently worked. 6.4.11 worked out-of-the-box.
linux-6.4.10-first-broken.log (6.1 KiB)
linux-6.4.10-second-works.log (5.1 KiB)
linux-6.4.10-working-persists... (14.1 KiB)
linux-6.4.11-works-ootb.log (5.6 KiB)
I've swapped the configs for the Arch kernels and I'm currently rebuilding linux-hardened with the non-hardened config. It's been going for ~40 minutes; hopefully it'll be done soon.
Rebuilding the non-hardened package with the hardened config is proving difficult: the pgp-verification keeps failing on public key '3B94A80E50A477C7'. This is, according to the keyservers, heftig's key, although neither searching+importing the public key via GPG, nor importing it from the public key file found via "archlinux/people/developers" -> "heftig" -> "PGP Key" resolves the problem. I was missing anthraxx's key as well, but --search-keys resolved that without any issue.
I'm going to try to figure out the gpg public key issue to hopefully get the linux package to build with the linux-hardened config. Any pointers and/or ideas as to what might be wrong would be much appreciated : )
```
gpg --import <(pacman-key --export 3B94A80E50A477C7)
```
The resulting linux kernel+headers, compiled from the linux package source using the linux-hardened config, reproduces the issue (log attached). As mentioned earlier, the reverse (compiling the linux-hardened sources with the linux (non-hardened) config) does not reproduce it.
[2] Section 14.3, DETECTION OF INTEL® AVX INSTRUCTIONS, provides flow diagrams and pseudocode showing how to enumerate AVX support, and also the following note:
Note: It is unwise for an application to rely exclusively on CPUID.1:ECX.AVX[bit 28] or at all on CPUID.1:ECX.XSAVE[bit 26]: These indicate hardware support but not operating system support. If YMM state management is not enabled by an operating systems, Intel AVX instructions will #UD regardless of CPUID.1:ECX.AVX[bit 28]. “CPUID.1:ECX.XSAVE[bit 26] = 1” does not guarantee the OS actually uses the XSAVE process for state management.
[3] Provides C source for detecting AVX.
Edit:
Does building webkit2gtk with the attached patch have any effect?
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=553a5c03e90a6087e88f8ff878335ef0621536fb
[2] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
[3] https://www.intel.com/content/dam/develop/external/us/en/documents/intro-to-intel-avx-183287.pdf
[thomas@skidbladnir testc]$ ./gcc.out
__builtin_cpu_supports ("avx"):512
__builtin_cpu_supports ("avx2"):1024
[thomas@skidbladnir testc]$ ./clang.out
__builtin_cpu_supports ("avx"):1
__builtin_cpu_supports ("avx2"):1
In any event, the issue needs to be reported upstream to WebKit so it can be fixed there. I suspect wc -l will also be broken [1], as __builtin_cpu_supports ("avx2") is not returning 0.
[1] https://git.savannah.gnu.org/cgit/coreutils.git/commit/src/wc.c?id=91a74d361461494dd546467e83bc36c24185d6e7
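A note on the differing numbers in the gcc and clang runs above: GCC documents __builtin_cpu_supports as returning a positive integer when the feature is present and 0 otherwise, so 512 and 1024 carry the same meaning as clang's 1. The values look like raw feature bits (1 << 9 and 1 << 10), but that is an implementation detail and only an assumption here. The source of the test program is not shown in this thread; the following is a minimal sketch with made-up helper names (as_flag, cpu_reports_avx, cpu_reports_avx2) that collapses the result to 0 or 1 so that builds from either compiler print the same value.

```c
/* Hypothetical helpers, not the test program used in this thread. */

/* Collapse any nonzero builtin result (1, 512, 1024, ...) to 1. */
static int as_flag(int builtin_result)
{
    return builtin_result != 0;
}

/* 1 if the compiler's runtime check reports AVX, else 0.
 * Always 0 on non-x86 targets, where the "avx" feature name is unknown. */
static int cpu_reports_avx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return as_flag(__builtin_cpu_supports("avx"));
#else
    return 0;
#endif
}

/* Same as above, for AVX2. */
static int cpu_reports_avx2(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return as_flag(__builtin_cpu_supports("avx2"));
#else
    return 0;
#endif
}
```

Printing cpu_reports_avx() and cpu_reports_avx2() with %d should then give matching 0/1 output from gcc and clang builds. Only the zero versus nonzero distinction matters for the bug discussed here.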
----- normal -----
asm-clang.out
CPU supports AVX:0
asm-gcc.out
CPU supports AVX:0
code-clang.out
__builtin_cpu_supports ("avx"):0
__builtin_cpu_supports ("avx2"):0
code-gcc.out
__builtin_cpu_supports ("avx"):0
__builtin_cpu_supports ("avx2"):0
----- gds=off -----
asm-clang.out
CPU supports AVX:1
asm-gcc.out
CPU supports AVX:1
code-clang.out
__builtin_cpu_supports ("avx"):1
__builtin_cpu_supports ("avx2"):1
code-gcc.out
__builtin_cpu_supports ("avx"):512
__builtin_cpu_supports ("avx2"):1024
gds-mitigation-disabled.log (2.9 KiB)
[1] https://bbs.archlinux.org/viewtopic.php?id=288816
[2] https://bbs.archlinux.org/viewtopic.php?id=289037
FS#79828
From the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, Chapter 14.3 DETECTION OF INTEL® AVX INSTRUCTIONS:
1) Detect CPUID.1:ECX.OSXSAVE[bit 27] = 1 (XGETBV enabled for application use[1]).
2) Issue XGETBV and verify that XCR0[2:1] = ‘11b’ (XMM state and YMM state are enabled by OS).
3) detect CPUID.1:ECX.AVX[bit 28] = 1 (AVX instructions supported).
(Step 3 can be done in any order relative to 1 and 2.)
[1]: If CPUID.01H:ECX.OSXSAVE reports 1, it also indirectly implies the processor supports XSAVE, XRSTOR, XGETBV, processor extended state bit vector XCR0. Thus an application may streamline the checking of CPUID feature flags for XSAVE and OSXSAVE. XSETBV is a privileged instruction.
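For illustration, the three steps quoted above could be sketched in C roughly as follows. This is not the code from any of the affected packages, nor the source of the asm-*.out test programs; the function and macro names are made up for the example. It assumes a GCC- or clang-compatible compiler (for <cpuid.h> and inline asm). The decision logic is kept in a pure function so it can be checked against fixed register values without depending on the machine it runs on.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Bit positions taken from the SDM steps quoted above. */
#define CPUID1_ECX_OSXSAVE (1u << 27)
#define CPUID1_ECX_AVX     (1u << 28)
#define XCR0_XMM_YMM       0x6u   /* XCR0[2:1] = '11b' */

/* Pure decision logic on already-read register values (hypothetical name). */
static int avx_usable_from_regs(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))        /* step 1: XGETBV enabled */
        return 0;
    if ((xcr0 & XCR0_XMM_YMM) != XCR0_XMM_YMM)     /* step 2: OS enabled XMM+YMM state */
        return 0;
    return (cpuid1_ecx & CPUID1_ECX_AVX) != 0;     /* step 3: AVX instructions supported */
}

/* Reads the real registers; returns 0 on non-x86 targets. */
static int avx_usable(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    unsigned int xcr0_lo, xcr0_hi;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    /* XGETBV must not be executed unless OSXSAVE is set. */
    if (!(ecx & CPUID1_ECX_OSXSAVE))
        return 0;
    /* XGETBV with ECX = 0 reads XCR0 into EDX:EAX. */
    __asm__ volatile("xgetbv" : "=a"(xcr0_lo), "=d"(xcr0_hi) : "c"(0));
    return avx_usable_from_regs(ecx, ((uint64_t)xcr0_hi << 32) | xcr0_lo);
#else
    return 0;
#endif
}
```

The point of step 2 is visible in avx_usable_from_regs: if the OS clears the YMM bit in XCR0, the function returns 0 even though the CPUID AVX bit is still set. That is the case a check based only on CPUID.1:ECX.AVX[bit 28] would miss.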
Do you know of an alternative to using XGETBV? The report does not indicate if the potential issue was ever reported to Intel. All the affected packages include code that uses AVX or AVX2 instructions without calling XGETBV to check for OS support.
https://github.com/electron/electron/issues/40441