FS#54958 - [systemd] systemd 234 Fails to start Switch Root

Attached to Project: Arch Linux
Opened by Eric Siegel (nticompass) - Friday, 28 July 2017, 15:04 GMT
Last edited by Christian Hesse (eworm) - Monday, 07 August 2017, 05:55 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Dave Reisner (falconindy)
Christian Hesse (eworm)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

I updated to systemd 234.11-1 (also systemd-sysvcompat) last night.

When I powered my system on this morning, I saw the following:

Failed to start Switch Root
See systemctl status initrd.switch-root.service for details

(Or something like that. I'm typing that from memory, I forget the *exact* error.)

I could not run 'systemctl' (or any command) as there is no console, the system just hangs there.

I had to use a Fedora USB (just what I happened to have lying around) to chroot into my Arch Linux system and downgrade to systemd 233.75-3 (as well as downgrading systemd-sysvcompat... I forgot about libsystemd, so I didn't downgrade that). I tried to run 'systemctl status initrd.switch-root.service' from the chroot, but it wouldn't let me and I couldn't find anything in 'journalctl'.

Downgrading to systemd 233.75-3 allowed the system to boot up correctly.

Closed by  Christian Hesse (eworm)
Monday, 07 August 2017, 05:55 GMT
Reason for closing:  Fixed
Additional comments about closing:  systemd 234.11-6
Comment by losynix (losynix) - Friday, 28 July 2017, 18:24 GMT
I'm having the exact same issue. I'll investigate and report here if I find the problem.
Comment by Doug Newgard (Scimmia) - Saturday, 29 July 2017, 02:28 GMT
Please do. There's no real information here; this ticket is pretty useless as-is.

At the very least post configs.
Comment by Rasmus Edgar (ashren) - Saturday, 29 July 2017, 17:55 GMT
I have the same problem. After downgrading to systemd-233.75-3-x86_64.pkg.tar.xz my server could boot again.

I run Arch Linux on a cloud-provided KVM instance with the following very non-standard setup, which looks the way it does because it was migrated from Xen.

/boot resides with most of the rest of root on /dev/sda, which, unlike the rest of the system, is not LVM.

/etc/fstab:
proc /proc proc defaults 0 0
/dev/sda / ext3 defaults,noatime 0 1
/dev/sdb none swap sw 0 0
/dev/mapper/vg-lvvar /var ext3 defaults,noatime 0 2
/dev/mapper/vg-lvtmp /tmp ext3 defaults,noatime 0 2
/dev/mapper/vg-lvhome /home ext3 defaults,noatime 0 2
/dev/mapper/vg-lvsrv /srv ext3 defaults,noatime 0 2
/dev/mapper/vg-lvbck /data/backup ext3 defaults,noatime 0 2
/dev/mapper/vg-lvusr /usr ext3 defaults,noatime 0 0
/dev/mapper/vg-lvopt /opt ext3 defaults,noatime 0

/etc/mkinitcpio.conf:
grep -v ^# /etc/mkinitcpio.conf
MODULES=""

BINARIES=""

FILES=""

HOOKS="systemd autodetect modconf fsck block sd-lvm2 filesystems usr"

/etc/default/grub:
grep -v ^# /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR="Arch"
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX="console=ttyS0,19200n8"
GRUB_SERIAL_COMMAND="serial --speed=19200 --unit=0 --word=8 --parity=no --stop=1"

GRUB_PRELOAD_MODULES="part_gpt part_msdos"


GRUB_TERMINAL_INPUT=console

GRUB_TERMINAL_OUTPUT=console

GRUB_GFXMODE=text

GRUB_GFXPAYLOAD_LINUX=keep

GRUB_DISABLE_LINUX_UUID=true

GRUB_DISABLE_RECOVERY=true

Comment by Ken Sinclair (kenbot) - Monday, 31 July 2017, 06:48 GMT
Have 2 separate btrfs on luks drives, both specified in kernel args. Set luks.options=timeout=5min to work around bug 54825. Debugging messages indicate that btrfs volumes were mounted.
Comment by Eric Siegel (nticompass) - Monday, 31 July 2017, 14:37 GMT
Apologies, I should've known to post more info in this ticket. What (more) info would you need from me? My system is a ThinkPad T430 and I am using rEFInd to boot Arch Linux.

Here's some info that could possibly be useful:

Drive Setup:
/dev/sda (500GB HDD)
- /dev/sda1 4.38G swap
- /dev/sda2 461.39G LVM

/dev/sdb Windows 10 installation

/dev/sdc (128GB SSD)
- /dev/sdc1 fat32 ESP (/boot/efi)
- /dev/sdc2 ext2 /boot
- /dev/sdc3 LVM (lvmcache)
- /dev/sdc4 bcache0 cache
- /dev/sdc5 Windows cache
- /dev/sdc6 ext4 /

Inside the LVM are 4 lvs: /usr, /var, /opt, /home. All are ext4 and all (except /home) are cached using lvmcache. /home is actually using bcache, long story, but for some reason I had issues booting with /home using lvmcache. Either way, this works and boots fine with systemd 233.75-3.

Here is my /etc/mkinitcpio.conf:
MODULES="i915"
HOOKS="base systemd sd-plymouth autodetect block sd-lvm2 bcache filesystems keyboard fsck"
COMPRESSION="xz"

And if it's needed, my /etc/fstab: https://pastebin.com/raw/dSZNQRtA
Comment by Ken Sinclair (kenbot) - Monday, 31 July 2017, 23:24 GMT
Downgraded to systemd{,-sysvcompat}-233.75-x86_64 and ran mkinitcpio. Now boots fine with no other changes.
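
For anyone repeating this workaround, the steps can be sketched roughly as below. This is only an illustration: the helper prints the commands instead of running them, and the cache path /var/cache/pacman/pkg, the inclusion of libsystemd, the `linux` preset name, and the function name itself are assumptions rather than details taken from this report.

```shell
# Hypothetical helper: print (not run) the commands for downgrading systemd
# from the local pacman cache and then rebuilding the initramfs.
# Assumes the old packages are still present in /var/cache/pacman/pkg.
print_downgrade_cmds() {
    ver=$1               # package version including pkgrel, e.g. 233.75-3
    arch=${2:-x86_64}    # architecture suffix used in the package file name
    cache=/var/cache/pacman/pkg
    cmd="pacman -U"
    for pkg in systemd libsystemd systemd-sysvcompat; do
        cmd="$cmd $cache/$pkg-$ver-$arch.pkg.tar.xz"
    done
    printf '%s\n' "$cmd"
    # With the systemd hook the initramfs carries its own copy of systemd,
    # so it has to be regenerated after the downgrade.
    printf '%s\n' "mkinitcpio -p linux"
}
```

Calling `print_downgrade_cmds 233.75-3` prints one `pacman -U` line and one `mkinitcpio` line, which can be reviewed and then run as root (from a chroot if the system does not boot).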

My mkinitcpio.conf has
```
MODULES="nvidia nvidia_modeset nvidia_uvm nvidia_drm"
HOOKS="base systemd autodetect modconf block sd-encrypt sd-vconsole sd-shutdown filesystems btrfs keyboard fsck"
COMPRESSION="lzma"
```

As an aside, despite having the base hook, I never get a rescue shell when boot fails. But that's off topic; maybe someone could point me in the right direction for that, though.
Comment by Asbjørn Apeland (aude) - Tuesday, 01 August 2017, 07:12 GMT
Comment by Ken Sinclair (kenbot) - Thursday, 03 August 2017, 11:48 GMT
I just upgraded my systemd to the latest version for my OS. When I booted an initcpio image that contained the new version, the issue persisted, but when I booted one that contained the older version, it came up without a hitch.
Comment by loqs (loqs) - Saturday, 05 August 2017, 21:55 GMT
Can you try with systemd 234.11-5, which fixed the lack of a shell under rd.emergency and rd.rescue?
It looks like `/usr/bin/systemctl --no-block switch-root /sysroot` is the failing command.
Please check what mount options were applied to /sysroot if you obtain a shell.
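
If a shell is available, a check along these lines might do. It is a minimal sketch that reads /proc/mounts (or a saved copy of it) and prints the mount point and options of everything at or below a given path; the function name and the optional file argument are made up for illustration.

```shell
# Hypothetical helper: list mount points and options at or below a target,
# reading a /proc/mounts-style file (fields: device mountpoint fstype options ...).
mounts_under() {
    target=$1
    file=${2:-/proc/mounts}
    # Match the target itself or any path nested beneath it.
    awk -v t="$target" '$2 == t || index($2, t "/") == 1 { print $2, $4 }' "$file"
}
```

Running `mounts_under /sysroot` lists each mount point with its options, which also makes it easy to spot a filesystem mounted at an unexpected path under /sysroot.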
Comment by Ken Sinclair (kenbot) - Sunday, 06 August 2017, 17:10 GMT
I could not obtain a shell, so I couldn't do that, but I did find some other interesting stuff.

After attempting to boot using systemd 234.11-5 (which fails), then downgrading to 233 and booting, I find the directory structure /sysroot/usr has been created, and is recreated if I rename or delete /sysroot and then reattempt booting using systemd 234.

I redirected journald to the console, where I could see that systemctl "Sent message type=method_call ... member=SwitchRoot ... " and then an error stating that "'sysroot' does not appear to be an OS tree. os-release file is missing." and no method_return message.
Comment by loqs (loqs) - Sunday, 06 August 2017, 18:30 GMT
Both your system and nticompass's use a split usr along with the systemd hook which may be a rare combination.
Does "os-release file is missing" mean /etc/os-release? I thought systemd should boot with an empty /etc and populate it from /usr/share/factory/.
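
As far as I can tell (this is an assumption about systemd's behaviour, not something confirmed in this ticket), the "OS tree" check accepts either etc/os-release or usr/lib/os-release under the new root. A rough approximation of that check, with a made-up function name:

```shell
# Rough approximation of the check behind "does not appear to be an OS tree".
# Assumption: either etc/os-release or usr/lib/os-release under the new root
# is enough for the check to pass.
looks_like_os_tree() {
    root=$1
    # -e follows symlinks, so a dangling etc/os-release symlink counts as missing.
    [ -e "$root/etc/os-release" ] || [ -e "$root/usr/lib/os-release" ]
}
```

If /etc/os-release is a symlink into /usr, as I believe it is on Arch, both paths resolve to nothing while the separate usr filesystem is not mounted at /sysroot/usr, and `looks_like_os_tree /sysroot` would fail.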
Comment by loqs (loqs) - Sunday, 06 August 2017, 18:40 GMT
Comment by Ken Sinclair (kenbot) - Sunday, 06 August 2017, 23:37 GMT
It seems that usr is being mounted to /sysroot/sysroot/usr, not /sysroot/usr. This is related to https://github.com/systemd/systemd/commit/98eda38aed6a10c4f6d6ad0cac6e5361e87de52b and will hopefully be resolved the next time the arch repo is updated...
Comment by Ken Sinclair (kenbot) - Sunday, 06 August 2017, 23:54 GMT
Hm. 234.11-5 is currently in test, and it pulls from upstream commit d52e2bb9c20216972754c054e8534bca28baab66, which doesn't have the fix in it yet. I expected to have to wait 2 rounds of updates to see the change in core, but I just installed 234.11-6 from test, and it solved my issue!
