Arch Linux

Please read this before reporting a bug:
https://wiki.archlinux.org/index.php/Reporting_Bug_Guidelines

Do NOT report bugs when a package is just outdated, or it is in Unsupported. Use the 'flag out of date' link on the package page, or the Mailing List.

REPEAT: Do NOT report bugs for outdated packages!
Tasklist

FS#63697 - systemd 243.0-1 poweroff/reboot umount errors

Attached to Project: Arch Linux
Opened by Scott K. (Pillgar) - Monday, 09 September 2019, 00:36 GMT
Last edited by freswa (frederik) - Saturday, 25 April 2020, 21:51 GMT
Task Type Bug Report
Category Packages: Core
Status Assigned   Reopened
Assigned To Dave Reisner (falconindy)
Christian Hesse (eworm)
Architecture x86_64
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 10
Private No

Details

Description: after upgrading from systemd 242.84-2 to 243.0-1 (lib32 included), I've noticed two errors during shutdown or reboot.

sd-umount[1636]: Failed to umount /oldroot: Device or resource busy
sd-umount[1637]: Failed to umount /oldroot/sys: Device or resource busy

Might be a regression from https://bbs.archlinux.org/viewtopic.php?id=233820&p=2. I vaguely remember seeing this in a previous version awhile ago.

Additional info:
* package version(s) systemd 243.0-1

Steps to reproduce:
upgrade to systemd 243.0-1 and reboot
This task depends upon

Comment by jari_45 (jari_45) - Wednesday, 11 September 2019, 08:59 GMT
I have noticed this bug too.
Comment by christian (damachine) - Thursday, 12 September 2019, 15:08 GMT
too
Comment by gamezelda (gamezelda) - Friday, 13 September 2019, 21:08 GMT
After upgrading to systemd 243.0-1 I started getting a new message when shutting down / rebooting:

printk: shutdown: 4 output lines suppressed due to ratelimiting

If I add the "debug" parameter to the kernel command line, I see the "Failed to umount /oldroot" message as well, as reported in this bug.

HOWEVER, If I downgrade to systemd 242.84-2, I no longer see the "lines suppressed due to ratelimiting" message, but I still see the "Failed to umount /oldroot" message. So, it appears that at least in my system, the "Failed to umount /oldroot" always appeared with the "debug" kernel parameter. But the "lines suppressed due to ratelimiting" message started appearing with this upgrade.

Now, if on 243.0-1 I run "sudo sysctl -w kernel.printk_devkmsg=on -w kernel.printk_ratelimit=0", the "lines suppressed due to ratelimiting" message disappears, with no apparent ill effect (I tried various debugging parameters like "debug", "printk_devkmsg=on", etc. and didn't find any further error).

My current hypothesis is:
* The "Failed to umount /oldroot" message is not a regression from 242.84-2 -> 243.0-1, it was there but I simply didn't see it because it only appears with the "debug" kernel parameter.
* The "lines suppressed due to ratelimiting" message is a regression in 243.0-1. But it's possible that it's simply that systemd is too chatty in the shutdown process and this triggers a ratelimiting message even though there is really no error condition.

Can someone else reproduce this behaviour as well?
Comment by Ales Katona (almindor) - Friday, 15 November 2019, 19:11 GMT
I'm getting same. One thing to note here is that I identified the "culprit" for failing the unmount in my case is RLS (rust language server, started by VS code) using lsof | grep ...
If I stop RLS myself and shutdown it works without errors.

It seems systemd is not following proper shutdown logic and isn't waiting for all processes to kill before trying the unmount.
Comment by Dennis Kadioglu (dkadioglu) - Wednesday, 11 December 2019, 14:45 GMT
I have the same problem - at least as far as I can see. Setting sudo sysctl -w kernel.printk_devkmsg=on -w kernel.printk_ratelimit=0 also made the ratelimiting message disappear. But still, I see the following messages:
sd-umount[1334]: Failed to unmount /oldroot: Device or resource busy
shutdown[1]: Failed to finalize file systems, ignoring.

I already asked upstream: https://github.com/systemd/systemd/issues/14298 and Lennart Poettering thinks, this must have something to do with Archs initrd.
I would like to investigate this further but, I don't know how. Debugging systemd during shutdown following https://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1 lead to the following log:https://pastebin.com/raw/Ws6RRmNM
At least I cannot identify anything problematic in the log.

As a sidenote, I don't have rust language server installed.
Comment by Leo P. (jpegxguy) - Friday, 13 December 2019, 11:07 GMT
No that you mention Arch's initrd, I think the message didn't happen for the short test time that I used dracut to generate the initramfs. Have you tried it?
Comment by Dennis Kadioglu (dkadioglu) - Friday, 13 December 2019, 16:13 GMT
I tried it with dracut now and the messages still appear. Thanks for that idea anyway.
Comment by Vizitor (Vizitor) - Wednesday, 26 February 2020, 17:12 GMT
systemd 244.3-1
[some_number] sd-umount[some_number]:Failed to unmount /oldroot: Device or resource busy
[some_number] sd-umount[some_number]:Failed to unmount /oldroot/sys: Device or resource busy
[some_number] shutdown[[1]: Could not detach DM /dev/dm-0: Device or resource busy
[some_number] shutdown[[1]: Failed to finalize file systems, DM devices, ignoring

I am using Simple dm-crypted root (so, NO LVM) +EFISTUB booting, and no swap or any other partitions mounted
Comment by Christian Hesse (eworm) - Wednesday, 26 February 2020, 19:56 GMT
What HOOKS are enabled in /etc/mkinitcpio.conf?
I guess you have 'shutdown' in there? Any side effects removing this?
Comment by Ales Katona (almindor) - Wednesday, 26 February 2020, 20:07 GMT
I still get this issue and I don't have shutdown in hooks,

my hooks are: `HOOKS=(base udev autodetect modconf block keyboard keymap encrypt resume filesystems fsck)`
Comment by Vizitor (Vizitor) - Wednesday, 26 February 2020, 22:17 GMT
HOOKS=(base udev autodetect modconf block encrypt filesystems keyboard fsck)
No impact on shutdown time, but I'm worried that filesystem may become corrupted if system is not shutdown properly.
Comment by gamezelda (gamezelda) - Friday, 28 February 2020, 02:16 GMT
I tried to do some printf's in systemd's shutdown process to print a lsof-like log and /proc/mounts but unfortunately there doesn't seem to be anything obviously wrong to block the unmount (screenshot attached). So it looks like it's something more obscure blocking it. If anyone has any idea of where to look, I can give it a try.

EDIT: Also, cwd=/ so that isn't the culprit it seems.

EDIT 2: Also, something that looks interesting as well is that if one boots into an emergency shell and reboots, no message happens. If one boots into a rescue shell and reboots, then the 'Failed to unmount /oldroot' message is printed. One can also boot into a rescue shell and then run 'systemctl emergency' to go into an emergency shell, in which case the message is also printed upon reboot. So, it looks like something in between emergency and rescue is getting stuck.
Comment by Ales Katona (almindor) - Friday, 28 February 2020, 19:46 GMT
For me it's definitely caused by RLS (Rust Language Server) when running via VSCode. If I uninstall the RLS plugin or shut down VSCode prior to restart I don't get the mount errors.

How can a normal user-level process cause this is another question I guess..
Comment by gamezelda (gamezelda) - Friday, 28 February 2020, 20:24 GMT
I managed to find the culprit (or culprits) in my case as well. It seems caused by or related to systemd-binfmt.service, but for me it only happens when I also have two concrete AUR packages installed, qemu-user-static-bin and binfmt-qemu-static (which deploy files to /lib/binfmt.d/).

* If I remove those two qemu AUR packages, the error goes away (even though I also have wine installed which also deploys files in /lib/binfmt.d/, so systemd-binfmt.service is still started).

* If I mask systemd-binfmt.service, the error also goes away. On an emergency shell the error doesn't happen normally, but it does if I start systemd-binfmt.service.

I also managed to replicate this in a virtual machine as well.
Comment by Dennis Kadioglu (dkadioglu) - Friday, 28 February 2020, 20:47 GMT
In my case, it seems to be the same culprit as it is for gamezelda. If I remove both packages, the error messages do not appear anymore.
Comment by gamezelda (gamezelda) - Friday, 28 February 2020, 21:14 GMT
I did some more experimentation and I found why qemu-user-static-bin+binfmt-qemu-static cause the problem but not wine. It seems related to the "F" flag in the last entry of the binfmt configuration files in /lib/binfmt.d/.

As a simple test case I create a dummy file in /lib/binfmt.d/ containing just:

:TestFormat:M::XX::/usr/bin/true:F

Then the problem starts to reproduce. Removing the last "F" in the line fixes it. The meaning of the "F" flag is explained here: https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html and seems to imply that it keeps a reference to the interpreter binary, so that's probably the file that's still open and causes the unmount to fail.

If I clean the entry from /proc/sys/fs/binfmt_misc/ manually by echoing -1 to it, it allows the unmount to work correctly. However systemd-binfmt does not do this cleanup itself: https://github.com/systemd/systemd/issues/13129 . And unmounting the binfmt_misc filesystem also doesn't make the kernel deregister the formats. So this should explain why the unmount is blocked, the interpreter binary is still registered and with a reference open.
Comment by loqs (loqs) - Friday, 28 February 2020, 22:54 GMT
I thought systemd skipped unmounting virtual filesystems such as binfmt_misc [1]
You could ask for additional special handling of binfmt_misc.

[1] https://github.com/systemd/systemd/blob/v244/src/shutdown/umount.c#L105
Comment by gamezelda (gamezelda) - Friday, 28 February 2020, 23:14 GMT
@loqs: Unmounting binfmt_misc or not does not fix the problem, the registered formats stay even after unmounting, it can only be done by explicitly deregistering them (by echoing "-1" to status before unmounting) which is what is not done and the systemd issue I linked seems to ask for. In case of 'F' flag unless they are explicitly deregistered it looks like it keeps a reference to the interpreter which later blocks the rootfs unmount.
Comment by Leo P. (jpegxguy) - Friday, 28 February 2020, 23:29 GMT
This has been bugging me for months. Really glad you guys found the issue.
In my machine it's the same qemu story.

So what'd the best way to fix this? It it systemd that needs to clean up the binfmt entries?
Comment by loqs (loqs) - Friday, 28 February 2020, 23:54 GMT
@gamezelda I was suggesting if upstream would not implement the feature into the shutdown of systemd-binfmt.service perhaps upstream would integrate deregistering types into systemd-shutdown.
Comment by gamezelda (gamezelda) - Friday, 28 February 2020, 23:56 GMT
@loqs: Oh, I misunderstood your comment then. I agree it would make sense to deregister binfmts at shutdown if systemd-binfmt does not do this on stop.

@jpegxguy: If you are experiencing this due to the binfmt problem I mentioned, this is what I'm testing right now and seems to work. Run 'systemctl edit systemd-binfmt' and paste this (then run systemctl daemon-reload or reboot for the changes to apply):

[Service]
# Workaround for FS#63697 / systemd GitHub issue #13129
# See: https://bugs.archlinux.org/task/63697, https://github.com/systemd/systemd/issues/13129
ExecStop=/usr/bin/bash -c "echo -1 > /proc/sys/fs/binfmt_misc/status"
Comment by Ales Katona (almindor) - Friday, 28 February 2020, 23:58 GMT
I don't know what RLS does but I doubt it's related to binfmt in my case...
Comment by Scott K. (Pillgar) - Saturday, 29 February 2020, 01:23 GMT
@gamezelda

I tried the ExecStop=/usr/bin/bash -c "echo -1 > /proc/sys/fs/binfmt_misc/status". Didn't remedy for me. This just creates an override.conf in /etc/systemd/system/systemd-binfmt.service.d??? Though ls -l /proc/sys/fs/binfmt_misc didn't show any qemu entries either. And I'm just not smart enough to find the culprit. At least it's getting narrowed down now. TY
Comment by gamezelda (gamezelda) - Saturday, 29 February 2020, 02:18 GMT
@Pillgar That override.conf should get automatically applied by systemd when systemd-binfmt.service is stopped during shutdown.

What files do you have on /proc/sys/fs/binfmt_misc other than 'register' and 'status'? (If you don't have any more, then this definitely won't help)

The problem with this bug is that anything that leaves a file open until shutdown can cause it. So it's possible your case hasn't anything to do with binfmt. But it's good that we try to find typical causes for it..
Comment by Scott K. (Pillgar) - Saturday, 29 February 2020, 03:19 GMT
I got:
CLR:
enabled
interpreter /usr/bin/mono
flags:
offset 0
magic 4d5a

and
DOSWin:
enabled
interpreter /usr/bin/wine
flags:
offset 0
magic 4d5a
Comment by gamezelda (gamezelda) - Saturday, 29 February 2020, 10:50 GMT
@Pillgar Then since none of those binfmt's have the 'F' flag I mentioned, it's likely that your message is due to a different cause and the override I posted won't help.

If you or someone else wants to try to find out what it is, I suggest disabling all autostart application, systemd services you can until it doesn't reproduce and you find the culprit. If you run out of things to disable, then try booting into the systemd rescue and/or emergency shells and see if you can reproduce it from those, and once you can't try to do a "gradual" bootup.

For example in my case where the problem reproduced in rescue but not emergency, I looked into 'systemctl list-dependencies rescue.target' and started services/targets/mounts one by one until I found the culprit. It's a bit tricky due to some services refusing to start unless some dependencies have already started and some auto-starting a whole chain of dependencies themselves (you can see this on journalctl, after starting systemd-journald). So expect it to require a lot of trial and error and be time-consuming, but once you find the exact service/application that's causing the problem, that narrows down things a lot.
Comment by gamezelda (gamezelda) - Saturday, 29 February 2020, 18:18 GMT
I have opened an issue at systemd's GitHub for the binfmt thing: https://github.com/systemd/systemd/issues/14981
Comment by Leo P. (jpegxguy) - Wednesday, 04 March 2020, 01:03 GMT
@gamezelda Your workaround is successful, thanks!
Comment by Martin K. Scherer (marscher) - Friday, 03 April 2020, 14:20 GMT
However my reboot/umount error is not due to binfmtfs (e.g. service masked/no config), the source of umount errors renders lvm caching on root completely useless. When the fs is marked dirty due to the unsuccessful umount, all cache blocks are also marked dirty for some reason. This leads to all block being rewritten to the cache - a disastrous situation for a SSD cache.
I first encounter that my rootfs is not cleanly unmounted, when I first enabled caching.

It seems like there is a dependency problem with the services (not every service stopped before shutdown.target). However I cannot spot it.

I can confirm the observation that the rescue.target of systemd leads to the same error, while the (very minimalistic) emergency.target does not.

Mr Poettering says, that systemd cannot unmount the rootfs if it is living on a LV (https://github.com/systemd/systemd/issues/15004) and suggest using a special initrd created with dracut.
Comment by Martin K. Scherer (marscher) - Sunday, 05 April 2020, 13:57 GMT
A note for those people who have tried out dracut to mitigate the problem. If you have systemd-sysvcompat installed,
dracut will execute the symlinked systemd-shutdown/reboot binary. This again will try to unmount things and we have the same situation again.
Without the compat package installed, one gets this error during dracut image generation:

dracut: *** Including module: shutdown ***
dracut-install: ERROR: installing 'poweroff'
dracut-install: ERROR: installing 'reboot'
dracut-install: ERROR: installing 'halt'
dracut: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.kR65CE/initramfs -a umount poweroff reboot halt losetup stat sleep

and one is being trapped in a plain emergency shell when doing a reboot,
BUT the rootfs was unmounted successfully.
However there is no way to properly reboot/shutdown anymore (only reset/power button will the job).

Does anyone know how to get the (presumably c-written) shutdown command of sysv back into play?
Comment by Zaidan (zaidan) - Thursday, 16 April 2020, 17:04 GMT
I get the following errors before the system hangs:
```
systemd[1]: Shutting down.
systemd-shutdown[1]: Syncing filesystems and block devices.
systemd-shutdown[1]: Sending SIGTERM to remaining processes...
systemd-shutdown[1]: Waiting for process: systemd-udevd, systemd-udevd
systemd-udevd[1250]: hdaudioC1D0: Worker [1317] processing SEQNUM=4814 killed
systemd-udevd[1250]: Worker [1317] terminated by signal 9 (KILL)
systemd-udevd[1250]: hdaudioC1D0: Worker [1317] failed
systemd-shutdown[1]: Sending SIGKILL to remaining processes...
systemd-shutdown[1]: Unmounting file systems.
[1592]: Remounting '/' read-only in with options '(null)'.
EXT4-fs (dm-2): re-mounted. Opts: (null)
systemd-shutdown[1]: All filesystems unmounted.
systemd-shutdown[1]: Deactivating swaps.
systemd-shutdown[1]: All swaps deactivated.
systemd-shutdown[1]: Detaching loop devices.
systemd-shutdown[1]: All loop devices detached.
systemd-shutdown[1]: Detaching DM devices.
systemd-shutdown[1]: Detaching DM /dev/dm-1 (254:1).
systemd-shutdown[1]: Detaching DM /dev/dm-0 (254:0).
systemd-shutdown[1]: Could not detach DM /dev/dm-0: Device or resource busy
systemd-shutdown[1]: Not all DM devices detached, 2 left.
systemd-shutdown[1]: Detaching DM devices.
systemd-shutdown[1]: Detaching DM /dev/dm-0 (254:0).
systemd-shutdown[1]: Could not detach DM /dev/dm-0: Device or resource busy
systemd-shutdown[1]: Not all DM devices detached, 2 left.
systemd-shutdown[1]: Can not finalize remaining DM devices, continuing
systemd-shutdown[1]: Successfully changed into root pivot.
systemd-shutdown[1]: Returning to initrd...
Unmounting all devices.
Detaching loop devices.
Disassembling stacked devices.
kvm: exiting hardware virtualization
```

As @gamezelda suggested I started dependencies from rescue.target in emergency mode.
When I boot into rescue mode the system produces the log above and does not shutdown properly.
In emergency mode there are no errors. When I start services until systemd-udevd / systemd-udev-trigger produces some udev device
detection output (especially NVIDIA stuff) I can reproduce the system hang.


Since the log contains hdaudioC1D0 I tried to disable the Audio device GM204
HDA (device ID: 0x0fbb, vendor ID: 0x10de). First I tried to blacklist the modules
snd_hda_codec_realtek and snd_hda_codec_generic with the following kernel parameters:
modprobe.blacklist=snd_hda_codec_realtek modprobe.blacklist=snd_hda_codec_generic

This did not work, since the device was still detected as snd_hda_codec_generic.

To disable the device I used `lspci | grep Audio` to determine the PCI device address.
With the command `echo 1 > /sys/bus/pci/devices/0000:01:00.1/remove` I removed the PCI device.
The first time this worked and I could power off without issues.
The second time, the ZSH shell got frozen and was not interuptable. The remove file was gone.
I tried to shutdown, but the system got a similar error log, but with ZSH.

The disable the device permanently I created a udev rule /etc/udev/rules.d/01-disable-nivida-hda.rules:
```
ACTION=="add", KERNEL=="0000:01:00.1", SUBSYSTEM=="pci", RUN+="/bin/sh -c 'echo 1 > /sys/bus/pci/devices/0000:01:00.1/remove'"
```

I tried 3 shutdowns and all shutdowns finished properly.
The only drawback of this workaround is, that the Nvidia HDA Audio device is removed.
Since my built in speaker is wired to the Intel audio device, I have no issues with playing audio.
Comment by Martin K. Scherer (marscher) - Thursday, 23 April 2020, 22:57 GMT
Mr Poettering pushed a fix for the binfmtfs issue to Github, eventually somebody is willing to test it? https://github.com/systemd/systemd/pull/15566
Comment by gamezelda (gamezelda) - Thursday, 23 April 2020, 22:58 GMT
@marscher: I just finished rebuilding systemd with this patch applied and it works for me. It applies cleanly over Arch's v245.5.

PS: Note that this fix won't do anything for your problem if it's not due to the binfmts issue, like it appears to be your case.
Comment by Ales Katona (almindor) - Saturday, 25 April 2020, 21:51 GMT
  • Field changed: Percent Complete (100% → 0%)
This is not fixed. I still get the same error with systemd@245.5-2

NOTE: the drive for me is NOT root and the issue is not caused by binfmts but by either RLS (rust language server) or rust analyzer, or possibly even just VSCode having a file open in the disk before reboot is requested.

In either case the issue persists.
Comment by loqs (loqs) - Saturday, 25 April 2020, 23:20 GMT
@almindor have you opened an issue upstream with systemd or do you believe this is a packaging issue?
Comment by Ales Katona (almindor) - Sunday, 26 April 2020, 02:19 GMT
I didn't open an issue upstream yet no, I don't think this is a packaging issue but haven't got the context to know where the error originates. All I know is that having some (graphical or background used by a front-end graphical app) open and pointing to files in the partition will cause this unmount error.

I know 100% that RLS (rust language server) causes it as well as rust-analyzer (another extension with similar functionality). I'm not sure if VSCode itself (the editor) would cause it too, or if ANY process that keeps a file open in that partition would cause it.

I can try reproducing with a custom script or something to that effect to see if I can get it to happen out of these specific cases.

Loading...