FS#63697 - [systemd] 243.0-1 poweroff/reboot umount errors
Attached to Project:
Arch Linux
Opened by Scott K. (Pillgar) - Monday, 09 September 2019, 00:36 GMT
Last edited by Toolybird (Toolybird) - Thursday, 15 June 2023, 07:38 GMT
Details
Description: after upgrading from systemd 242.84-2 to
243.0-1 (lib32 included), I've noticed two errors during
shutdown or reboot.
sd-umount[1636]: Failed to umount /oldroot: Device or resource busy
sd-umount[1637]: Failed to umount /oldroot/sys: Device or resource busy
Might be a regression from https://bbs.archlinux.org/viewtopic.php?id=233820&p=2. I vaguely remember seeing this in a previous version a while ago.
Additional info:
* package version(s): systemd 243.0-1
Steps to reproduce: upgrade to systemd 243.0-1 and reboot.
This task depends upon
Closed by Toolybird (Toolybird)
Thursday, 15 June 2023, 07:38 GMT
Reason for closing: Fixed
Additional comments about closing: Fixed upstream in systemd 252. See comments
printk: shutdown: 4 output lines suppressed due to ratelimiting
If I add the "debug" parameter to the kernel command line, I see the "Failed to umount /oldroot" message as well, as reported in this bug.
However, if I downgrade to systemd 242.84-2, I no longer see the "lines suppressed due to ratelimiting" message, but I still see the "Failed to umount /oldroot" message. So it appears that, at least on my system, the "Failed to umount /oldroot" message always appeared with the "debug" kernel parameter, while the "lines suppressed due to ratelimiting" message started appearing with this upgrade.
Now, if on 243.0-1 I run "sudo sysctl -w kernel.printk_devkmsg=on -w kernel.printk_ratelimit=0", the "lines suppressed due to ratelimiting" message disappears, with no apparent ill effect (I tried various debugging parameters like "debug", "printk_devkmsg=on", etc. and didn't find any further error).
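If one wants those two sysctl settings to survive a reboot, a sysctl.d fragment along these lines should do it (the filename is just my choice, nothing official):

```
# /etc/sysctl.d/99-printk-ratelimit.conf
kernel.printk_devkmsg = on
kernel.printk_ratelimit = 0
```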
My current hypothesis is:
* The "Failed to umount /oldroot" message is not a regression from 242.84-2 -> 243.0-1, it was there but I simply didn't see it because it only appears with the "debug" kernel parameter.
* The "lines suppressed due to ratelimiting" message is a regression in 243.0-1. But it's possible that it's simply that systemd is too chatty in the shutdown process and this triggers a ratelimiting message even though there is really no error condition.
Can someone else reproduce this behaviour as well?
If I stop RLS myself and then shut down, it works without errors.
It seems systemd is not following proper shutdown logic and isn't waiting for all processes to be killed before attempting the unmount.
sd-umount[1334]: Failed to unmount /oldroot: Device or resource busy
shutdown[1]: Failed to finalize file systems, ignoring.
I already asked upstream (https://github.com/systemd/systemd/issues/14298) and Lennart Poettering thinks this must have something to do with Arch's initrd.
I would like to investigate this further, but I don't know how. Debugging systemd during shutdown following https://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1 led to the following log: https://pastebin.com/raw/Ws6RRmNM
At least I cannot identify anything problematic in the log.
As a side note, I don't have the rust language server installed.
[some_number] sd-umount[some_number]: Failed to unmount /oldroot: Device or resource busy
[some_number] sd-umount[some_number]: Failed to unmount /oldroot/sys: Device or resource busy
[some_number] shutdown[1]: Could not detach DM /dev/dm-0: Device or resource busy
[some_number] shutdown[1]: Failed to finalize file systems, DM devices, ignoring
I am using a simple dm-crypt root (so, NO LVM) + EFISTUB booting, with no swap or any other partitions mounted.
I guess you have 'shutdown' in there? Any side effects removing this?
my hooks are: `HOOKS=(base udev autodetect modconf block keyboard keymap encrypt resume filesystems fsck)`
No impact on shutdown time, but I'm worried that the filesystem may become corrupted if the system is not shut down properly.
EDIT: Also, cwd=/ so that isn't the culprit it seems.
EDIT 2: Also, something that looks interesting as well is that if one boots into an emergency shell and reboots, no message happens. If one boots into a rescue shell and reboots, then the 'Failed to unmount /oldroot' message is printed. One can also boot into a rescue shell and then run 'systemctl emergency' to go into an emergency shell, in which case the message is also printed upon reboot. So, it looks like something in between emergency and rescue is getting stuck.
How a normal user-level process can cause this is another question, I guess.
* If I remove those two qemu AUR packages, the error goes away (even though I also have wine installed which also deploys files in /lib/binfmt.d/, so systemd-binfmt.service is still started).
* If I mask systemd-binfmt.service, the error also goes away. On an emergency shell the error doesn't happen normally, but it does if I start systemd-binfmt.service.
I managed to replicate this in a virtual machine as well.
As a simple test case, I created a dummy file in /lib/binfmt.d/ containing just:
:TestFormat:M::XX::/usr/bin/true:F
Then the problem starts to reproduce. Removing the last "F" in the line fixes it. The meaning of the "F" flag is explained here: https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html and seems to imply that it keeps a reference to the interpreter binary, so that's probably the file that's still open and causes the unmount to fail.
If I clean the entry from /proc/sys/fs/binfmt_misc/ manually by echoing -1 to it, it allows the unmount to work correctly. However systemd-binfmt does not do this cleanup itself: https://github.com/systemd/systemd/issues/13129 . And unmounting the binfmt_misc filesystem also doesn't make the kernel deregister the formats. So this should explain why the unmount is blocked, the interpreter binary is still registered and with a reference open.
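To illustrate, the cleanup that systemd-binfmt could do amounts to a small shell loop. This is only a sketch: it relies on the documented binfmt_misc behavior that writing -1 to an entry removes it, and the directory is passed as a parameter (my choice) so the loop can be dry-run against a mock directory without root; on a real system it would be /proc/sys/fs/binfmt_misc.

```shell
# Deregister every binfmt_misc entry so that held interpreter references are
# dropped before the final unmount. On the real binfmt_misc filesystem,
# writing -1 to an entry file removes that entry.
deregister_binfmts() {
  dir=$1
  for entry in "$dir"/*; do
    name=${entry##*/}
    # 'register' and 'status' are control files, not format entries
    case "$name" in register|status) continue ;; esac
    echo -1 > "$entry"
  done
}
```

Against a mock directory this just writes "-1" into each format file, which is enough to check that the control files are skipped.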
You could ask for additional special handling of binfmt_misc.
[1] https://github.com/systemd/systemd/blob/v244/src/shutdown/umount.c#L105
In my machine it's the same qemu story.
So what's the best way to fix this? Is it systemd that needs to clean up the binfmt entries?
@jpegxguy: If you are experiencing this due to the binfmt problem I mentioned, this is what I'm testing right now and seems to work. Run 'systemctl edit systemd-binfmt' and paste this (then run systemctl daemon-reload or reboot for the changes to apply):
[Service]
# Workaround for
# Workaround for FS#63697 / systemd GitHub issue #13129
# See: https://bugs.archlinux.org/task/63697, https://github.com/systemd/systemd/issues/13129
ExecStop=/usr/bin/bash -c "echo -1 > /proc/sys/fs/binfmt_misc/status"
I tried the ExecStop=/usr/bin/bash -c "echo -1 > /proc/sys/fs/binfmt_misc/status". It didn't remedy it for me. This just creates an override.conf in /etc/systemd/system/systemd-binfmt.service.d? Though ls -l /proc/sys/fs/binfmt_misc didn't show any qemu entries either. And I'm just not smart enough to find the culprit. At least it's getting narrowed down now. TY
What files do you have on /proc/sys/fs/binfmt_misc other than 'register' and 'status'? (If you don't have any more, then this definitely won't help)
The problem with this bug is that anything that leaves a file open until shutdown can cause it. So it's possible your case doesn't have anything to do with binfmt. But it's good that we try to find typical causes for it.
```
CLR:
enabled
interpreter /usr/bin/mono
flags:
offset 0
magic 4d5a
```
and
```
DOSWin:
enabled
interpreter /usr/bin/wine
flags:
offset 0
magic 4d5a
```
If you or someone else wants to try to find out what it is, I suggest disabling all autostart applications and systemd services you can, until it doesn't reproduce and you find the culprit. If you run out of things to disable, then try booting into the systemd rescue and/or emergency shells and see if you can reproduce it from those, and once you can't, try to do a "gradual" bootup.
For example in my case where the problem reproduced in rescue but not emergency, I looked into 'systemctl list-dependencies rescue.target' and started services/targets/mounts one by one until I found the culprit. It's a bit tricky due to some services refusing to start unless some dependencies have already started and some auto-starting a whole chain of dependencies themselves (you can see this on journalctl, after starting systemd-journald). So expect it to require a lot of trial and error and be time-consuming, but once you find the exact service/application that's causing the problem, that narrows down things a lot.
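The one-by-one search described above can be sketched as a tiny helper. The 'check' command here is an assumption/mock: on a real system, checking a unit means starting it and rebooting to see whether the message appears, which can't be automated in a single script, so the function only captures the search logic.

```shell
# Walk a list of candidate units and report the first one for which 'check'
# says the shutdown error reproduces. 'check' is any command taking a unit
# name and returning 0 when the problem shows up after starting that unit.
find_culprit() {
  check=$1; shift
  for unit in "$@"; do
    if "$check" "$unit"; then
      echo "$unit"
      return 0
    fi
  done
  return 1
}
```

In practice each 'check' invocation costs a reboot, so in the real procedure one would also recurse into `systemctl list-dependencies` of whatever this flags.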
I first encountered my rootfs not being cleanly unmounted when I enabled caching.
It seems like there is a dependency problem with the services (not every service is stopped before shutdown.target). However, I cannot spot it.
I can confirm the observation that the rescue.target of systemd leads to the same error, while the (very minimalistic) emergency.target does not.
Mr. Poettering says that systemd cannot unmount the rootfs if it is living on an LV (https://github.com/systemd/systemd/issues/15004) and suggests using a special initrd created with dracut.
dracut will execute the symlinked systemd-shutdown/reboot binary. This again will try to unmount things and we have the same situation again.
Without the compat package installed, one gets this error during dracut image generation:
dracut: *** Including module: shutdown ***
dracut-install: ERROR: installing 'poweroff'
dracut-install: ERROR: installing 'reboot'
dracut-install: ERROR: installing 'halt'
dracut: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.kR65CE/initramfs -a umount poweroff reboot halt losetup stat sleep
and one is trapped in a plain emergency shell when doing a reboot,
BUT the rootfs was unmounted successfully.
However, there is no way to properly reboot/shutdown anymore (only the reset/power button will do the job).
Does anyone know how to get the (presumably C-written) shutdown command of sysvinit back into play?
```
systemd[1]: Shutting down.
systemd-shutdown[1]: Syncing filesystems and block devices.
systemd-shutdown[1]: Sending SIGTERM to remaining processes...
systemd-shutdown[1]: Waiting for process: systemd-udevd, systemd-udevd
systemd-udevd[1250]: hdaudioC1D0: Worker [1317] processing SEQNUM=4814 killed
systemd-udevd[1250]: Worker [1317] terminated by signal 9 (KILL)
systemd-udevd[1250]: hdaudioC1D0: Worker [1317] failed
systemd-shutdown[1]: Sending SIGKILL to remaining processes...
systemd-shutdown[1]: Unmounting file systems.
[1592]: Remounting '/' read-only in with options '(null)'.
EXT4-fs (dm-2): re-mounted. Opts: (null)
systemd-shutdown[1]: All filesystems unmounted.
systemd-shutdown[1]: Deactivating swaps.
systemd-shutdown[1]: All swaps deactivated.
systemd-shutdown[1]: Detaching loop devices.
systemd-shutdown[1]: All loop devices detached.
systemd-shutdown[1]: Detaching DM devices.
systemd-shutdown[1]: Detaching DM /dev/dm-1 (254:1).
systemd-shutdown[1]: Detaching DM /dev/dm-0 (254:0).
systemd-shutdown[1]: Could not detach DM /dev/dm-0: Device or resource busy
systemd-shutdown[1]: Not all DM devices detached, 2 left.
systemd-shutdown[1]: Detaching DM devices.
systemd-shutdown[1]: Detaching DM /dev/dm-0 (254:0).
systemd-shutdown[1]: Could not detach DM /dev/dm-0: Device or resource busy
systemd-shutdown[1]: Not all DM devices detached, 2 left.
systemd-shutdown[1]: Can not finalize remaining DM devices, continuing
systemd-shutdown[1]: Successfully changed into root pivot.
systemd-shutdown[1]: Returning to initrd...
Unmounting all devices.
Detaching loop devices.
Disassembling stacked devices.
kvm: exiting hardware virtualization
```
As @gamezelda suggested, I started dependencies from rescue.target in emergency mode.
When I boot into rescue mode, the system produces the log above and does not shut down properly.
In emergency mode there are no errors. When I start services until systemd-udevd / systemd-udev-trigger produces some udev device
detection output (especially NVIDIA stuff), I can reproduce the system hang.
Since the log contains hdaudioC1D0 I tried to disable the Audio device GM204
HDA (device ID: 0x0fbb, vendor ID: 0x10de). First I tried to blacklist the modules
snd_hda_codec_realtek and snd_hda_codec_generic with the following kernel parameters:
modprobe.blacklist=snd_hda_codec_realtek modprobe.blacklist=snd_hda_codec_generic
This did not work, since the device was still detected as snd_hda_codec_generic.
To disable the device I used `lspci | grep Audio` to determine the PCI device address.
With the command `echo 1 > /sys/bus/pci/devices/0000:01:00.1/remove` I removed the PCI device.
The first time this worked and I could power off without issues.
The second time, the ZSH shell froze and was not interruptible. The remove file was gone.
I tried to shutdown, but the system got a similar error log, but with ZSH.
To disable the device permanently I created a udev rule /etc/udev/rules.d/01-disable-nvidia-hda.rules:
```
ACTION=="add", KERNEL=="0000:01:00.1", SUBSYSTEM=="pci", RUN+="/bin/sh -c 'echo 1 > /sys/bus/pci/devices/0000:01:00.1/remove'"
```
I tried 3 shutdowns and all shutdowns finished properly.
The only drawback of this workaround is that the NVIDIA HDA audio device is removed.
Since my built in speaker is wired to the Intel audio device, I have no issues with playing audio.
PS: Note that this fix won't do anything for your problem if it's not due to the binfmts issue, as appears to be your case.
NOTE: the drive for me is NOT root, and the issue is not caused by binfmts but by either RLS (rust language server) or rust-analyzer, or possibly even just VSCode having a file open on the disk before reboot is requested.
In either case the issue persists.
I know 100% that RLS (rust language server) causes it as well as rust-analyzer (another extension with similar functionality). I'm not sure if VSCode itself (the editor) would cause it too, or if ANY process that keeps a file open in that partition would cause it.
I can try reproducing with a custom script or something to that effect to see if I can get it to happen out of these specific cases.
Also tested the udev rule to disable the NVIDIA audio device (adapted the PCI path, of course), without success.
@gamezelda I would be delighted if you could elaborate a bit on your approach to incrementally starting services from the emergency target (i.e. your strategy); you would save me a lot of headache. Thanks a lot in advance!
The service responsible for unpacking the initrd to /run/initramfs starts /usr/lib/dracut/dracut-initramfs-restore.
The resulting unpacked environment in /run/initramfs cannot be chrooted to, e.g. chroot /run/initramfs fails with:
chroot: failed to run command ‘/bin/bash’: No such file or directory
If I extract the image via lsinitrd --unpack (which is also a script provided by dracut), I'm able to chroot manually and during the shutdown process my rootfs gets properly unmounted.
I posted this upstream: https://github.com/dracutdevs/dracut/issues/925
Is it true that mkinitcpio does not provide a shutdown procedure?
cd /run/initramfs && /usr/bin/lsinitrd --unpack
For some reason dracut-shutdown.service does not get executed at the right time, so systemd-shutdown cannot chroot into /run/initramfs. That is why I came up with lsinitrd workaround.
Anyway, if my memory serves me well, what I did was boot into the lowest mode where the message wasn't printed (in my case, emergency). Once there, I looked into which services are loaded in the mode where the problem happened (rescue) by doing 'systemctl list-dependencies rescue.target'. Then I picked one or some of those in the list, started them with 'systemctl start whatever', and tried to reboot. If the message wasn't printed, that service isn't causing the problem; if it was printed, the service or one of its dependencies is. In that case I did 'systemctl list-dependencies whatever', or looked at the journal with 'journalctl' (note that you need to start the journal service beforehand for this to work!), to figure out the dependencies, then rebooted again, started one or some of those dependencies, and checked whether the message appeared. If it did, I looked at the dependencies of the dependencies, and so on, until I narrowed the problem down to a single service (such that starting all its dependencies was fine, but starting the service itself made the message appear). All of this took many reboots and quite some time, unfortunately...
To add to the problem, I also remember some services not auto-starting their dependencies and thus failing to load; I don't remember too well what I did, probably just figured out what needed to be started beforehand by trial and error.
So unless you're confident you'll be able to figure out some way to mess around with systemd, and don't mind having to do a lot of trial and error, I don't recommend doing this... Hopefully someone with more systemd knowledge can explain the proper way to do this, and not this Pepe Silvia way...
Any updates on this? I've tried practically every solution offered here. With NVIDIA, I didn't have the modules in the file to begin with. With binfmt, I have DOSWin in /proc/sys/fs/binfmt_misc, but not the F flag (also, when trying to start systemd-binfmt.service in emergency mode I get an error, code=exited, with no log I could find). I don't seem to have the rust server in VSCode; I even tried uninstalling it, but that didn't help.
Also, at the start @gamezelda mentioned that he managed to hide the message prints without any ill effects, but it seems like on my laptop, if I try to shut down in any way besides a long press on the power button (unhealthy...), it causes the PC to seriously heat up over a few minutes, sometimes hours, and eventually completely discharge the battery, as if it's not completely off (currently, to avoid that, every time I shut down I have to reboot into Windows and shut down from there). Any idea what could cause this?
Also, for me the exact line points to "umount: /oldroot/dev/pts: target is busy", if that changes anything.
What does sudo systemctl poweroff do?
```
##
## Generated by mhwd - Manjaro Hardware Detection
##
nvidia
nvidia-drm
```
and modules.conf, which is empty.
When I tried to remove mhwd-gpu.conf, nothing happened, and I still got the umount print (also, it notes that it's generated by mhwd, but after reboot it didn't recreate it).
sudo systemctl poweroff did a regular shutdown (unless I missed some print in the shutdown process, but I can hardly follow it normally), and the message was displayed.
Solution:
1. Copy the `shutdown` hook files to `/etc/initcpio` and rename them to e.g. `my-shutdown` for clarity:
```
cp /usr/lib/initcpio/shutdown /etc/initcpio/my-shutdown
cp /usr/lib/initcpio/install/shutdown /etc/initcpio/install/my-shutdown
cp /usr/lib/initcpio/hooks/shutdown /etc/initcpio/hooks/my-shutdown
```
2. Edit `/etc/initcpio/install/my-shutdown`, change the `add_file` line to `add_file "/etc/initcpio/my-shutdown" "/shutdown" 755`
3. Edit `/etc/initcpio/my-shutdown` and insert the following after the line `cd /sys/class/block` (no need for the ZFS parts if you're not using ZFS, of course):
```
cd /sys/class/block
date
printf '%s\n' "Unloading NVIDIA modules."
rmmod nvidia_drm nvidia_uvm nvidia_modeset nvidia
date
printf '%s\n' "Unmounting all devices."
umount --recursive /oldroot
date
printf '%s\n' "Exporting ZFS pools."
zpool export -a
```
4. Add `my-shutdown` to `HOOKS` then mkinitcpio.
So now NVIDIA no longer holds /oldroot/sys and ZFS no longer holds /dev/dm-0.
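For step 4, the HOOKS edit can be sketched as a small function. This is an assumption-laden sketch: it assumes HOOKS is on a single line and that `my-shutdown` should simply be appended. It operates on a file passed as an argument so it can be dry-run on a copy; on the real system one would pass /etc/mkinitcpio.conf and then rebuild the images as root with `mkinitcpio -P`.

```shell
# Append 'my-shutdown' to the HOOKS array in a mkinitcpio.conf-style file,
# doing nothing if it is already present.
add_my_shutdown_hook() {
  conf=$1
  grep -q 'my-shutdown' "$conf" && return 0
  sed -i 's/^HOOKS=(\(.*\))/HOOKS=(\1 my-shutdown)/' "$conf"
}
```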
Debug:
Cases may vary. To help debug, you can first set up a non-encrypted partition (I just used `/boot`), then add something like the following in your `/etc/initcpio/my-shutdown` to store some logs:
```
# after umount, etc.
debug_log() {
mkdir /debug
mount /dev/sdz1 /debug
date | tee -a /debug/log.txt
mount 2>&1 | tee -a /debug/log.txt
fuser -vm "$1" 2>&1 | tee -a /debug/log.txt
umount /debug
}
if findmnt /oldroot/sys; then
debug_log /oldroot/sys
elif findmnt /oldroot; then
debug_log /oldroot
fi
```
I'll add these to https://wiki.archlinux.org/index.php/Mkinitcpio#Troubleshooting if you guys find it helpful.
(As usual, this is just _another_ thing that can make this problem happen. So, it's possible that you don't have the NVIDIA modules in mkinitcpio and it still happens to you).
That's my case.
Confirmed that after shutting down lightdm and running
modprobe -r nvidia_drm nvidia_modeset nvidia_uvm && modprobe -r nvidia
the issue disappeared.
It sounds like the nvidia module keeps something open on disk.
Create /usr/lib/systemd/system-shutdown/nvidia.shutdown:
```
#!/bin/sh
# remove nvidia modules
/usr/bin/modprobe -r nvidia_drm nvidia_modeset nvidia_uvm && /usr/bin/modprobe -r nvidia
```
and make it executable. This helped me resolve the issue.
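The drop-in above can be created with a few commands. In this sketch the target root is parameterized (my own addition) so the steps can be staged in a scratch directory without root; pass "" (and run as root) for a real install.

```shell
# Install the system-shutdown drop-in described above under a given root
# prefix, and mark it executable as systemd requires for these scripts.
install_nvidia_shutdown() {
  root=$1
  dir="$root/usr/lib/systemd/system-shutdown"
  mkdir -p "$dir"
  cat > "$dir/nvidia.shutdown" <<'EOF'
#!/bin/sh
# remove nvidia modules so nothing keeps /oldroot/sys busy
/usr/bin/modprobe -r nvidia_drm nvidia_modeset nvidia_uvm && /usr/bin/modprobe -r nvidia
EOF
  chmod 755 "$dir/nvidia.shutdown"
}
```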
MODULES=(intel_agp i915 vfat nls_ascii nls_cp437 nls_iso8859-1 f2fs)
(and the issue does appear to be caused by the nvidia modules: unloading them with a `/usr/lib/systemd/system-shutdown/*` script does prevent it).
If the module was loaded before the root fs, then, provided another instance of /sys was mounted on top of the root fs, that might avoid the issue.
A cleaner solution would be for NVIDIA to provide something that runs at shutdown, either removing the modules or notifying the module to close the open file.
As an alternative, systemd could modify its shutdown logic to move all pseudo-filesystems from on top of real filesystems to on top of the pivoted ram disk, so /oldroot is not blocked from being unmounted.
The workaround that Ivan Lyapunov (dront78) posted seems to work. I hope it gets fixed sometime. I did some Google searches, and it seems this bug has been around for quite a few years.
[1] https://github.com/systemd/systemd/commit/4acbe1f6cc7fbab22f7a016db58a035c80e6dd7a