FS#77569 - [dracut] boot failure with systemd >= 253
Attached to Project:
Arch Linux
Opened by Toolybird (Toolybird) - Saturday, 18 February 2023, 23:21 GMT
Last edited by Toolybird (Toolybird) - Thursday, 13 July 2023, 00:35 GMT
Details
This is mainly a heads-up because I haven't figured it out
yet. It's the first time in yonks I've had to recover an
Arch system with an installation USB flash drive :(
- standard LVM setup (root volume -> no raid, thin, luks or anything like that, just basic)
- dracut
- system fails to boot when systemd-253 is present in the initrd

Removing "quiet" and adding "rd.debug" to kernel params shows it's udev related...seems to be hanging somewhere around "dm-pre-udev.sh"

Downgrading systemd and regenerating initrd works. Upgrading systemd but leaving initrd containing old systemd also works. Bit of a head-scratcher so far. I'm curious if anyone else is affected.
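For anyone wanting to reproduce the debug boot, the kernel parameter change mentioned in the report (drop "quiet", add "rd.debug") can be sketched as a plain string transformation. This is only an illustration: the function name `debug_cmdline` and the example `root=` value are made up, and how the resulting line ends up in the bootloader entry depends on the setup.

```python
def debug_cmdline(cmdline: str) -> str:
    """Return the command line without 'quiet' and with 'rd.debug' present once."""
    # Split on whitespace and drop every bare "quiet" token.
    params = [p for p in cmdline.split() if p != "quiet"]
    # Append rd.debug only if it is not already there.
    if "rd.debug" not in params:
        params.append("rd.debug")
    return " ".join(params)


if __name__ == "__main__":
    # Hypothetical command line; the root= value is an assumption.
    print(debug_cmdline("root=/dev/mapper/vg-root rw quiet"))
```

The sketch only edits the string; it does not touch any bootloader configuration.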
Closed by Toolybird (Toolybird)
Thursday, 13 July 2023, 00:35 GMT
Reason for closing: Fixed
Additional comments about closing: dracut 059-2
[1] https://github.com/systemd/systemd/commit/ca6ce62d2a437432082b5c6e5d4275d56055510f#diff-0d21c797b372d6ceff84bd203e533ad48b9ed4fee07c24e2b4bc64f4ee6bac0fR3766
FWIW, I also run LVM, though I do use LUKS on the individual logical volumes. Since this breaks even before cryptsetup comes into play, I'm probably hitting the same bug.
I've uploaded the /run/initramfs/rdsosreport.txt and journalctl output for systemd-253 with rd.debug enabled. While the logs mention linux-hardened, the same behavior can be seen with the plain linux kernel as well.
Note that `dmsetup` claims no devices exist, there are no `/dev/dm*` devices, and `/dev/disk/by-uuid` doesn't contain any mapped disks. `lvm` does list the physical and logical volumes as well as the volume groups, but nothing for the volume group `vgmain` (in my case) shows up under `/dev/vgmain/`.
From the log it does seem like lvm_scan is never run, maybe because it doesn't exist in the right place, because its hook is not registered, or because udev doesn't settle properly? I have no idea about the actual underlying workings unfortunately. I have uploaded a tarball of the contents of a faulty and a correctly generated initramfs to the linked GitHub issue (too big for Flyspray).
rdsosreport.txt (378.1 KiB)
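The device-node observations in the comment above (no `/dev/dm*` nodes, nothing mapped under `/dev/disk/by-uuid`) could be gathered with something like the following. It is a rough sketch, not part of dracut or systemd: the function name `mapped_device_report` and the `dev_root` parameter are assumptions added so the check can be pointed at another directory, and it presumes a Python interpreter is available, which is usually not the case in the emergency shell.

```python
import glob
import os


def mapped_device_report(dev_root: str = "/dev") -> dict:
    """List device-mapper nodes and by-uuid entries found under dev_root."""
    by_uuid = os.path.join(dev_root, "disk", "by-uuid")
    return {
        # Device-mapper kernel nodes are named dm-0, dm-1, ...
        "dm_nodes": sorted(glob.glob(os.path.join(dev_root, "dm-*"))),
        # The directory may be missing entirely if udev never populated it.
        "by_uuid": sorted(os.listdir(by_uuid)) if os.path.isdir(by_uuid) else [],
    }


if __name__ == "__main__":
    for key, entries in mapped_device_report().items():
        print(key, entries if entries else "(none)")
```

Both lists coming back empty would match the symptoms described here; it says nothing about why the nodes are missing.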
- STRV_MAKE("/sys", "/run", "/proc", "/dev/shm", "/tmp"));
+ STRV_MAKE("/sys", "/run", "/proc", "/dev", "/tmp", "/etc/", "/var", "/usr"));
but unfortunately no change.
Edit:
Skip the whole remount and see if the issue is still present.
Could you somehow place strace at the top of the two generators and find what call is getting blocked?
I would suggest using kernel audit rules but that would involve getting auditd installed and configured in the initrd.
Try asking upstream dracut?
[1] https://github.com/systemd/systemd/pull/26494
[1] https://src.fedoraproject.org/rpms/systemd/c/4bdd16eba5c409a5aa0afcc16f6e284f20793e06?branch=rawhide
https://github.com/aafeijoo-suse/dracut/commit/4bde75fabe31a5c048fd75e533b94e91c3faa83b
@eworm, please push, thanks.
@grazzolini, it appears you dropped the patch that fixed this. The patch is *not* included in 059 upstream.
[1] https://gitlab.archlinux.org/toolybird/dracut/-/commit/0f48723a277d8f1ee6e24cb21fd5c0ee01bd2846