FS#61808 - {arch-install-scripts} bind mount of /run causes systemd-tmpfiles to mess up uids in host /run

Attached to Project: Arch Linux
Opened by Axel Hinrichs (ahinrichs) - Tuesday, 19 February 2019, 13:54 GMT
Last edited by Morten Linderud (Foxboron) - Sunday, 03 July 2022, 12:50 GMT
Task Type Bug Report
Category Arch Projects
Status Closed
Assigned To David Runge (dvzrv)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description: pacstrap bind mounts host /run into chroot dir. If you install systemd in chroot the post install hook systemd-tmpfiles.hook is run with the chroot /etc/passd but host /run. On my host the uid of systemd-network differes from the new chroot and so /run/systemd/netif gets messed up.

Additional info:
* commit https://git.archlinux.org/arch-install-scripts.git/commit/?id=aea51ba9012ace489d10543f34ebea9c5aebd812
changed to bind mount the host dir /run in $newroot
* intention was to fix https://bugs.archlinux.org/task/61040

Steps to reproduce:
# ls -la /run/systemd/
[...]
drwxr-xr-x 5 systemd-network systemd-network 120 Feb 19 14:09 netif
[...]
# mkdir /tmp/chroot
# pacstrap -c /tmp/chroot systemd
# ls -la /run/systemd/
[...]
drwxr-xr-x 5 981 981 120 Feb 19 14:23 netif
[...]
This task depends upon

Closed by  Morten Linderud (Foxboron)
Sunday, 03 July 2022, 12:50 GMT
Reason for closing:  Fixed
Additional comments about closing:  25-1
Comment by Dave Reisner (falconindy) - Wednesday, 20 February 2019, 18:07 GMT
Ok, but what's the effect of this?
Comment by Axel Hinrichs (ahinrichs) - Wednesday, 20 February 2019, 22:00 GMT
(actually I searched for a method to change the task description - package name - and severity by myself and didn't succeed...)

I don't think the pacstrap run should have side effects on the hosts /run tree. In my case the effect was, that the host /run/systemd/netif({links/lease} folders weren't writable by the systemd-networkd anymore flooding log with "access denied". But I could think of any kind of problems.

$newroot/run used to be an empty tmpfs. To fix the  bug 61040  the chroot lvm install scripts should be able to talk to the running lvm sockets /run/lvm/lvmetad.socket. For that they bind mount the whole /run folder into $newroot. But any package script of the pacstrap run now operates on the host /run folder which I don't think is intended.

I could think of making $newroot/run again an empty tmpfs and just bind mount /run/lvm on $newroot/run/lvm. Or to fix the other side /usr/share/libalpm/scripts/systemd-hook should check systemd_live before "tmpfiles --create" too.
Comment by loqs (loqs) - Wednesday, 20 February 2019, 22:15 GMT
With lvm 2.03 lvmetad support is removed so there would be no /run/lvm/lvmetad.socket perhaps the changes for  FS#61040  could then be reverted.
Comment by Axel Hinrichs (ahinrichs) - Wednesday, 20 February 2019, 22:46 GMT
Probably. And I actually don't think that the behaviour of systemd is perfect here. But I *really* don't like the idea that any (misbehaving) package script of a pacstrap run could have side effects on the running host. It caused me some headaches to find the cause of the changed uid:gid in /run/systemd/netif.

Comment by tinywrkb (tinywrkb) - Thursday, 04 June 2020, 19:15 GMT
Please switch $chroot/run from /run bind mount to tmpfs.

When bind mounting /run, a docker daemon from host is somehow using $chroot/run/docker/netns/* as mount targets and pacstrap seems to fail (silently) on unmounting $chroot/run because of it.
I might be hitting a similar issue to this: https://github.com/weaveworks/weave/issues/1455
Patching pacstrap and arch-chroot to mount $chroot/run as tmpfs fix this for me.
Comment by loqs (loqs) - Friday, 05 June 2020, 18:53 GMT
@tinywrkb as you are reverting the fix for  FS#61040  does that reintroduce that issue?
If  FS#61040  is still an issue, is there an alternative solution that can be applied to grub?
Or what if $chroot/run is made a tmpfs again with /run/lvm being bind mounted to $chroot/run/lvm within it?
Comment by tinywrkb (tinywrkb) - Monday, 08 June 2020, 15:38 GMT
@loqs you means this? https://git.archlinux.org/arch-install-scripts.git/commit/?id=aea51ba9012ace489d10543f34ebea9c5aebd812
then yes, this is basically what I've done.

I don't use lvm so I don't know if the change ends with a regression for lvm users.
But the current behavior is also not unacceptable and it unexpectedly took down my X session as my Xauthority is set to $XDG_RUNTIME_DIR"/Xauthority.
Initially, I also made the mistake of deleting $chroot/run, thinking pacstrap unmounted everything correctly, and that ended with needing to use SysRq in order to reboot.
Comment by loqs (loqs) - Monday, 08 June 2020, 16:43 GMT
See attached patch for what I was attempting to describe. Not implemented is the creation of $chroot/run/lvm being contingent on the existence of /run/lvm

If you build arch-install-scripts with the applied patch does arch-chroot work as expected on your system?
Comment by Axel Hinrichs (ahinrichs) - Tuesday, 09 June 2020, 10:04 GMT
@loqs yes, this was also my suggestion in the first comment. To me this looks good and is a much better solution to fix  FS#61040  than the first fix. (TBH: I did not fully build the package for testing the patch but applied it by hand to arch-chroot)
Comment by tinywrkb (tinywrkb) - Wednesday, 10 June 2020, 21:42 GMT
@loqs yes, with the suggested patch pacstrap is working fine, it doesn't negatively affect the host, and the host's docker service doesn't use $chroot/run/docker/netns/* as mount targets.
Comment by tinywrkb (tinywrkb) - Friday, 28 May 2021, 11:28 GMT
@eschwartz or @falconindy is there a reason not to apply the suggested changes in the patch attached by @loqs and close this issue? It would be nice to be able to switch back to the official arch-install-scripts and not having to maintain my own package.
Comment by Tom Yan (tom.ty89) - Monday, 06 September 2021, 12:43 GMT
Why not just bind mount /run read-only? https://github.com/archlinux/arch-install-scripts/pull/13
Comment by tinywrkb (tinywrkb) - Monday, 06 September 2021, 21:40 GMT
@tom.ty89 some packages might (wrongly) be packaging some resources that are in `/run`. While the packaging should be fixed, it's better not to break the installation process.
For example, samba did this, packaging `/run/samba`, thankfully it was fixed, and my bug report wasn't ignored like the autofs one which I believe is still packaging `/run`, now in the AUR.
Comment by Tom Yan (tom.ty89) - Tuesday, 07 September 2021, 01:16 GMT
IIRC pacman does not even stop when some of the files cannot be updated / rewritten.
Comment by tinywrkb (tinywrkb) - Tuesday, 07 September 2021, 09:47 GMT Comment by Tom Yan (tom.ty89) - Tuesday, 07 September 2021, 13:37 GMT
Honestly, I don't really care (and my "IIRC" was based on some `chattr +i` workaround I practise before I notice NoExtract, in case you are "interested"). Often "showing" what is wrong / broken is the only way to get people motivated in fixing it, and to me it would never be a reason to avoid doing a right thing.

P.S. What I think does not really matter though. At the end of the day it's really just a matter of who's maintaining / who has the right to maintain it, and some Arch dev(s) (well, or just people in general) are known to have their own worldview on what's right or wrong / good or bad, without needing a rationale or proof to support it. (Luckily, it's just a script anyway.)
Comment by tinywrkb (tinywrkb) - Tuesday, 07 September 2021, 14:21 GMT
> Honestly, I don't really care

So you suggest a fix and then don't care about testing if it diverges from current behavior?

>Often "showing" what is wrong / broken is the only way to get people motivated in fixing it

I have bug reports about packaging issues that have been ignored for over a year without any response from the package maintainer, even when suggesting a fix, sometimes as an attached patch.
I think that you shouldn't create a problem that didn't exist and throw this at package maintainers.

> ... are known to have their own worldview on what's right or wrong / good or bad, without needing a rationale or proof to support it.

That's call religion, and I hope that's not a factor in software development related decisions.
Comment by Axel Hinrichs (ahinrichs) - Thursday, 27 January 2022, 10:45 GMT
@eschwartz or @falconindy: Could you please recheck this issue. I agree that arch-chroot is fine when run in the context of a live installer. The bug is only relevant when arch-chroot is used on a production server. Eg. when you "pacstrap --sysroot ..." to bootstrap a container image. It too uses arch-chroot and here the pacman scripts of the container root change the /run directory of a live and running server. And this has surely unwanted and even sometimes destructive side effects on the host.

The bind mount was introduced to mitigate an issue with lvm. I's been a tmpfs long time before. The proposed fix perfectly handles the /run/lvm and protects the host /run otherwise like it was before. I can't find any reason not to include this patch.
Comment by Tom Yan (tom.ty89) - Thursday, 27 January 2022, 14:08 GMT
> So you suggest a fix and then don't care about testing if it diverges from current behavior?
I mean I don't care if there will be hiccup when a package that consists of files in /run makes no sense given that we assume /run to have a tmpfs mounted on it. In other words, IMHO we should fix wrong-doing that prevents us from doing a right thing instead of avoiding doing the right thing because there have been some wrong-doing.
Comment by Axel Hinrichs (ahinrichs) - Thursday, 27 January 2022, 21:22 GMT
It is not about hiccups. It is about killing production servers with a simple "pacstrap --sysroot" ...

I wonder why reverting this breaking change and redo it more specific to the reasoning is such a big deal. /chroot/run used to be a tmpfs ever before. It was not changed because that tmpfs was a problem. It's been changed to a bind mound because the lvm package somehow tried to access /run/lvm. And the test.patch does exactly the right thing. Keep (or make again) /chroot/run a tmpfs and just bind mount /run/lvm to /chroot/run/lvm.
Comment by Tom Yan (tom.ty89) - Friday, 28 January 2022, 01:41 GMT
Wasn't your production server "killed" already? Isn't that why you are so eager to get the "the right thing" in?
No one (at least not me) wants to keep you server "dead". It's just neither is my focus getting your server or some other scenario "work". Getting things work (for some) ASAP is not and will never be my concern, because you can easily get it work for you with /usr/local/bin. Doing it right, whether there's a real (like technical or logical) reason to merely "revert and make exception" is my only concern (I didn't rule out that the need of not bind mounting btw. See the last paragraph of my PR's message. Bare in mind "need" does not refer to what can be and should be fixed though.)

But again, what I think doesn't matter. I just couldn't help when being addressed, sort of.
Comment by loqs (loqs) - Sunday, 03 July 2022, 12:24 GMT

Loading...