FS#68445 - >=packages/linux-5.9.arch1-1 commit 5c7d8268bbddfe5 breaks booting as Xen domU/pvgrub

Attached to Project: Arch Linux
Opened by Jason (jac299792458) - Tuesday, 27 October 2020, 21:27 GMT
Last edited by Toolybird (Toolybird) - Sunday, 04 June 2023, 03:22 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Christian Hesse (eworm)
Architecture x86_64
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

With commit 5c7d8268bbddfe5 [0], the Linux kernel configuration changed from compressing the kernel with XZ to compressing it with ZSTD. All fine and good, except when booting an Arch Linux DomU in Xen via pvgrub. Apparently GRUB v2.04, even though it doesn't decompress the kernel, fails to detect the file as a Linux kernel when it is compressed with ZSTD. I looked into the GRUB codebase enough to see that, as usual, GNU is building another operating system when all you need is a simple tool. :-/

It's clear from the GRUB code that ZSTD support is only used when interfacing with ZSTD-compressed BTRFS file systems. There is no obvious logic for handling ZSTD-compressed kernels or initramfs images. Regardless, GRUB shouldn't be decompressing them anyway.
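
For anyone who wants to check which compressor a given vmlinuz actually uses, here is a rough sketch. It assumes the x86 boot protocol header layout (setup_sects at 0x1F1, the "HdrS" signature at 0x202, payload_offset at 0x248, protocol 2.08 or later) and looks at the magic bytes at the start of the payload. The function names are made up for illustration, and this is not a claim about how GRUB itself validates the image.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical, offsets follow the x86 boot protocol.

# Print the byte at decimal offset $2 of file $1 as an unsigned number.
byte_at() {
    od -An -tu1 -j "$2" -N 1 "$1" | tr -d ' \n'
}

# Print the little-endian 32-bit unsigned integer at decimal offset $2 of file $1.
le32_at() {
    set -- $(od -An -tu1 -j "$2" -N 4 "$1")
    [ $# -eq 4 ] || return 1
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Print the payload compression of bzImage $1: zstd, xz, gzip, bzip2 or unknown.
kernel_compression() {
    img=$1
    # "HdrS" (48 64 72 53) at 0x202 marks a Linux x86 boot header.
    sig=$(od -An -tx1 -j 514 -N 4 "$img" | tr -d ' \n')
    if [ "$sig" != "48647253" ]; then
        echo "not-bzimage"
        return 1
    fi
    # setup_sects at 0x1F1; a value of 0 means 4 for historical reasons.
    setup_sects=$(byte_at "$img" 497)
    if [ "$setup_sects" -eq 0 ]; then
        setup_sects=4
    fi
    # payload_offset at 0x248 is relative to the protected-mode code,
    # which starts after the boot sector plus setup_sects sectors.
    payload_offset=$(le32_at "$img" 584) || { echo unknown; return 1; }
    start=$(( (setup_sects + 1) * 512 + payload_offset ))
    magic=$(od -An -tx1 -j "$start" -N 6 "$img" | tr -d ' \n')
    case $magic in
        28b52ffd*)    echo zstd ;;
        fd377a585a00) echo xz ;;
        1f8b08*)      echo gzip ;;
        425a68*)      echo bzip2 ;;
        *)            echo unknown ;;
    esac
}
```

Something like `kernel_compression /boot/vmlinuz-linux` should then report `zstd` for kernels built after the commit above and `xz` for older ones.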

So, my specific situation is that I have an up-to-date Gentoo dom0, on which I've installed:
* sys-boot/grub-2.04-r1 (GRUB_PLATFORMS="efi-64 pc xen")
* app-emulation/grub-xen-host-1.0

Which, up until yesterday, booted four different Arch Linux DomUs via 'kernel = "/usr/libexec/xen/bin/grub-x86_64-xen.bin"'

Now, after I updated the Arch Linux kernels inside the VM, I get the following on the console immediately after the GRUB menu:

```
Loading Linux linux ...
error: not xen image.
Loading initial ramdisk ...
error: you need to load the kernel first.

Press any key to continue...
```

Well shit. That's about as helpful as a turd sandwich. But after a few hours of searching and code-diving, it appears that when you use grub-x86_64-xen.bin, it wants to verify the file passed as the kernel. It's this verification process that fails, presumably due to the change in compression scheme. I've looked all through the GRUB2 manual for a switch to turn this off and say "trust me, it's a f*cking kernel, just do it!" without success. There *was* an option in GRUB1 [@option{--type="type"}], but I guess that's not necessary because GRUB2 is perfect </sarcasm>. :-)

Regardless, for now I'm just going to run `extract-vmlinux` and load an uncompressed Linux kernel. But damn, this is annoying.


[0] https://github.com/archlinux/svntogit-packages/commit/5c7d8268bbddfe55aba9e587658592069fdb24af#diff-3e341d2d9c67be01819b25b25d5e53ea3cdf3a38d28846cda85a195eb9b7203a

Closed by  Toolybird (Toolybird)
Sunday, 04 June 2023, 03:22 GMT
Reason for closing:  No response
Comment by Jason (jac299792458) - Tuesday, 27 October 2020, 21:37 GMT
Just a quick follow up.

I mounted the root partition from dom0, and did the following:

# /path/to/linux.git/scripts/extract-vmlinux /path/to/arch_root/boot/vmlinuz-linux >/path/to/arch_root/boot/vmlinuz-linux_uncompressed
# cd /path/to/arch_root/boot
# mv vmlinuz-linux vmlinuz-linux.orig
# mv vmlinuz-linux_uncompressed vmlinuz-linux
# cd ../../
# umount /path/to/arch_root
# xl create -c /etc/xen/config/arch_domu

and it came up no worries. So, just a little bit more circumstantial evidence that it was the change in compression scheme that fouled up GRUB. For no good reason, I might add.
Comment by Jason (jac299792458) - Wednesday, 21 April 2021, 12:15 GMT
Just saw that this was assigned.

I am still having this problem, and I've created a post-install hook in my Arch and Artix VMs to decompress their kernels if ZSTD is detected. It's a complete hack and just papers over the problem, but it's been working reliably with automatic updates and reboots for months.
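
This is not the actual hook, just a minimal sketch of the idea. It simplifies the "if ZSTD is detected" condition to "if the image is not already an uncompressed ELF", and it assumes a copy of the kernel tree's scripts/extract-vmlinux is available. The EXTRACT_VMLINUX variable and the function names are hypothetical.

```sh
#!/bin/sh
# Hypothetical path; point this at a real copy of the kernel tree's
# scripts/extract-vmlinux.
EXTRACT_VMLINUX=${EXTRACT_VMLINUX:-/path/to/linux.git/scripts/extract-vmlinux}

# Succeed if file $1 starts with the ELF magic (7f 45 4c 46), i.e. it
# is already an uncompressed vmlinux.
is_elf() {
    [ "$(od -An -tx1 -N 4 "$1" | tr -d ' \n')" = "7f454c46" ]
}

# Replace kernel image $1 with its uncompressed form, keeping the
# original next to it as "$1.orig".
decompress_kernel() {
    img=$1
    if is_elf "$img"; then
        return 0    # already uncompressed, nothing to do
    fi
    tmp=$(mktemp "$img.XXXXXX") || return 1
    if "$EXTRACT_VMLINUX" "$img" > "$tmp" && is_elf "$tmp"; then
        chmod 644 "$tmp"
        mv "$img" "$img.orig" && mv "$tmp" "$img"
    else
        rm -f "$tmp"
        echo "could not extract vmlinux from $img" >&2
        return 1
    fi
}
```

Calling `decompress_kernel /boot/vmlinuz-linux` after each kernel upgrade, for example from a pacman PostTransaction hook, would give roughly the behaviour described above. Running it twice is harmless because an image that is already ELF is left alone.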

Hopefully, someone with a better understanding of the guts of GRUB2 can create a proper fix.

FWIW, I also filed a bug on Gentoo's bug tracker, https://bugs.gentoo.org/show_bug.cgi?id=753728, which has likewise seen very little activity.
Comment by Jan Alexander Steffens (heftig) - Wednesday, 21 April 2021, 12:36 GMT
Is there an upstream GRUB bug for this?
