FS#22523 - [initscripts] Use mount -f to write mtab entry for /

Attached to Project: Arch Linux
Opened by Emil Renner Berthing (Esmil) - Thursday, 20 January 2011, 11:13 GMT
Last edited by Tom Gundersen (tomegun) - Saturday, 14 January 2012, 01:28 GMT
Task Type Feature Request
Category Initscripts
Status Closed
Assigned To Aaron Griffin (phrakture)
Thomas Bächler (brain0)
Tom Gundersen (tomegun)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:
In rc.sysinit we first remount the root filesystem read-only to safely run fsck and such. Eventually we remount the root writable, but of course this has to be done with mount -n, because mount can't write to /etc/mtab on the read-only filesystem. Immediately after this we fill in /etc/mtab, either using findmnt or by catting /proc/mounts into it.

My proposal is to use the mount command to write the entry in /etc/mtab for the root filesystem instead. mount has a -f option exactly for this purpose.
It tells mount not to do any actual mounting, but to do the bookkeeping it otherwise would, that is, write an entry to /etc/mtab.

This has the small advantage that the entry for / in the mtab gets the device specified in your /etc/fstab instead of /dev/root.
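
To make the proposed ordering concrete, here is a minimal sketch of the two mount calls. It is not the patch itself: the MOUNT variable is a hypothetical override added only so the sequence can be printed instead of executed, and the step that fills in the rest of /etc/mtab is left out.

```shell
#!/bin/sh
# Sketch of the proposed ordering. MOUNT is a hypothetical override: by
# default the commands are only printed. On a real system it would be set to
# /bin/mount and the function would have to run as root.
MOUNT=${MOUNT:-echo /bin/mount}

remount_root_rw() {
    # 1. Make / writable without touching /etc/mtab (-n), since the file
    #    cannot be written while / is still read-only.
    $MOUNT -n -o remount,rw / || return 1
    # 2. Now that / is writable, do only the bookkeeping: -f "fakes" the
    #    mount, so nothing is remounted but the mtab entry gets written.
    $MOUNT -f -o remount /
}

remount_root_rw
```

With the default setting this prints the two commands in the order they would run, which is the whole point of the change: the -f call comes after the root filesystem is writable.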

The real advantage, however, is that this is one of the two changes needed in order to use NILFS2 as your root filesystem. This filesystem needs a userspace daemon running in order to clean up garbage (free unused space). The man page for this daemon, nilfs_cleanerd, specifies that it should be started automatically by the mount command. However, when using mount -n the daemon is not started. The idea is that starting the daemon is delayed until you run mount -f to write the entry in /etc/mtab. The reason is that mount wants to write the PID of the daemon into the mtab entry, like so:
/dev/sda1 on / type nilfs2 (rw,noatime,gcpid=144)
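
As an illustration of how that entry can be inspected, the following sketch pulls the gcpid value out of a line in the format shown above. The helper name extract_gcpid is made up for this example; it only does string handling and does not talk to nilfs-utils in any way.

```shell
#!/bin/sh
# extract_gcpid is a hypothetical helper. It prints the gcpid value from a
# line in the "DEV on DIR type FS (OPTIONS)" format, or returns 1 if the
# option is absent.
extract_gcpid() {
    line=$1
    # Keep only the text between the final pair of parentheses.
    opts=${line##*\(}
    opts=${opts%%\)*}
    # Walk the comma-separated options looking for gcpid=<value>.
    old_ifs=$IFS
    IFS=,
    for opt in $opts; do
        case $opt in
            gcpid=*)
                IFS=$old_ifs
                printf '%s\n' "${opt#gcpid=}"
                return 0
                ;;
        esac
    done
    IFS=$old_ifs
    return 1
}

extract_gcpid '/dev/sda1 on / type nilfs2 (rw,noatime,gcpid=144)'
```

A non-zero return for the root entry would indicate the situation described here: the filesystem is mounted but no cleaner PID was recorded.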

As rc.sysinit is now, however, we never call mount -f to write the mtab entry, so the garbage cleaner will never be started. Sure, you can manually start the daemon later, but as stated in the man page of nilfs_cleanerd this isn't the recommended way to do it.

I know this NILFS2 scheme might seem, in falconindy's words, absurd, but I think it is very much in the spirit of Arch to run programs as upstream intended, and we can do that with a very small change:

diff --git a/rc.sysinit b/rc.sysinit
index 44238fa..daeff2c 100755
--- a/rc.sysinit
+++ b/rc.sysinit
@@ -260,10 +260,11 @@ fi
 stat_busy "Mounting Local Filesystems"
 /bin/mount -n -o remount,rw /
 if [ -x /bin/findmnt -a -e /proc/self/mountinfo ]; then
-	/bin/findmnt -rnu -o SOURCE,TARGET,FSTYPE,OPTIONS >| /etc/mtab
+	/bin/findmnt -rnu -o SOURCE,TARGET,FSTYPE,OPTIONS -i -T / >| /etc/mtab
 else
 	cat /proc/mounts >| /etc/mtab
 fi
+/bin/mount -f -o remount /
 run_hook sysinit_premount
 # now mount all the local filesystems
 /bin/mount -a -t $NETFS -O no_netdev

Additional info:
This patch is against the head of the git branch.

Btw. the other small change needed is to make udev create a /dev/root symlink.
A quick hack is to
ln -s sda1 /lib/udev/devices/root

Steps to reproduce:
Mm.. try running Arch Linux with a NILFS2 root without this change.
This task depends upon

Closed by  Tom Gundersen (tomegun)
Saturday, 14 January 2012, 01:28 GMT
Reason for closing:  Fixed
Additional comments about closing:  in git
Comment by Emil Renner Berthing (Esmil) - Thursday, 20 January 2011, 12:24 GMT
Here is the patch as an attachment. You can also pull it from git://github.com/esmil/initscripts.git
Comment by Dan McGee (toofishes) - Thursday, 20 January 2011, 14:17 GMT
The nilfs2 stuff is not absurd at all. Unfortunately, we've had bugs dead in the water on this, even with a patch attached to fix it: FS#20260
Comment by Tom Gundersen (tomegun) - Sunday, 27 March 2011, 17:06 GMT
@Emil: I agree that we should try to follow upstream, so I like this patch. However, I have two questions:

1) Why is the symlink not created upstream in udev? There wouldn't be much point in fixing this if we just push the problem onto the udev maintainer. I guess a bug should be filed upstream with udev. Do you agree?
2) I guess your patch will be unnecessary once /etc/mtab is a symlink to /proc/self/mounts? If we need to wait for udev to be fixed anyway, I think we might as well delay this issue to the next release of util-linux where everything will "just work". Do you agree?
Comment by Emil Renner Berthing (Esmil) - Monday, 28 March 2011, 18:48 GMT
Hi Tom

1) I've been looking a bit at the log of 'udevadm monitor --environment' when doing 'udevadm trigger', and I don't think there is enough info present there to reliably decide which device is the root device. Hence I'm no longer convinced this is a job for udev.
The only way I can think of right now to automatically create the /dev/root symlink is to do something like

ln -s $(sed 's/.*root=\([^ ]*\).*/\1/' /proc/cmdline) /dev/root

That is unless there is a tool to find the device of the root filesystem or some file under /sys or /proc I've overlooked.
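
For what it is worth, the sed one-liner above prints the whole command line unchanged when no root= parameter is present, because sed passes non-matching lines through. A slightly more defensive sketch is shown here; root_from_cmdline is a made-up helper name, and the command line in the example is invented for illustration.

```shell
#!/bin/sh
# root_from_cmdline is a hypothetical helper. It prints the value of the
# root= parameter from a kernel command line string, or returns 1 if there
# is none, so the caller can skip creating the symlink.
root_from_cmdline() {
    # Intentionally unquoted: split the command line on whitespace.
    for word in $1; do
        case $word in
            root=*)
                printf '%s\n' "${word#root=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Example with a made-up command line; on a live system one would pass
# "$(cat /proc/cmdline)" instead.
root_from_cmdline 'BOOT_IMAGE=/vmlinuz26 root=/dev/sda1 ro quiet'
```

Note that a value such as root=UUID=... or root=LABEL=... comes back as-is and is not a device path, so it could not be used directly as a symlink target.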

2) No, the whole point of this is to run the mount -f -o remount / command once the root is mounted r/w so that the nilfs_cleanerd will be started.
I don't really like Seb's solution ( FS#20260 ) of running the cleaner as any other daemon since the man page for it specifically says:

"This program can be invoked either automatically by mount.nilfs2(8) or manually by an administrator. However, users are recommended to invoke nilfs_cleanerd through mount.nilfs2(8) or mount(8) and shutdown it through umount.nilfs2(8) or umount(8) in order to avoid state inconsistencies among administration tools."
Comment by Tom Gundersen (tomegun) - Monday, 28 March 2011, 20:36 GMT
1) There used to be a script in the udev package that did exactly this (other distros did something similar). However, it was removed. I don't know why this was deemed no longer necessary. I think, however, that udev should be in charge of creating this link if it is needed. Btw, I don't really understand why this link is needed in the first place. Could you explain (or point to documentation)?

2) If /etc/mtab is just a symlink to /proc/self/mounts, then I think we don't need to call mount with the "-n" in the first place. Would this solve the problem? Would the above mentioned symlink still be needed?

Good point about the daemon. I think you are right. My intention with my other comment was that I prefer your fix, but failing that a daemon would be better than adding lots of code to rc.sysinit.
Comment by Emil Renner Berthing (Esmil) - Monday, 28 March 2011, 23:23 GMT
1) No, unfortunately I can't point to any documentation about this. All I can do is see that the nilfs_cleanerd refuses to start unless this symlink is present.
My nilfs2-using system is a bit out of date though. I haven't tried with a .38 kernel and nilfs-utils 2.0.21 yet, so there is a small chance this might have changed with the latest release.

2) I think you're right. As long as mount is run (without -n) on a writeable root filesystem it should start the cleaner daemon. The /dev/root symlink will still be needed though, as this seems to be needed by the daemon itself.
Comment by Tom Gundersen (tomegun) - Monday, 28 March 2011, 23:33 GMT
@Emil: Thanks for the clarification.

About the symlink: this is either a bug in nilfs (which should perhaps not rely on this link) or in udev (which should perhaps create this link). I suggest contacting those projects to hear what they have to say.

Once the symlink issue is taken care of, we just have to prepare rc.sysinit for dealing with mtab being a symlink. Patches welcome, otherwise it will be on my TODO. I don't think we should add any temporary workarounds though.
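
As a rough idea of what "dealing with mtab being a symlink" could look like, the sketch below only classifies the mtab path; what rc.sysinit would then do in each case is left open. The function name mtab_mode and its optional path argument are assumptions made for this example.

```shell
#!/bin/sh
# mtab_mode is a hypothetical helper. The optional argument lets the check
# be tried on a scratch path; it defaults to /etc/mtab.
mtab_mode() {
    mtab=${1:-/etc/mtab}
    if [ -L "$mtab" ]; then
        echo symlink    # kernel-backed view: nothing for mount to write
    elif [ -e "$mtab" ]; then
        echo file       # regular file: mount has to maintain it
    else
        echo missing
    fi
}

mtab_mode /etc/mtab
```

In the symlink case there is no file for mount to update, so the reason for passing -n when remounting the root goes away.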
Comment by Alexander Lam (lambchops468) - Friday, 13 January 2012, 12:43 GMT
Not fixed after symlink from /etc/mtab to /proc/self/mounts was made; tomegun requested reopening if this was the case.
Comment by Tom Gundersen (tomegun) - Friday, 13 January 2012, 12:57 GMT
@Alexander: thanks for following up on this.

The situation has changed significantly since this bug was first opened.

Could I ask you to try editing your rc.sysinit and removing the '-n' option from all the calls to mount? This should no longer be needed, as we never write to /etc/mtab anyway. If this fixes your problem I'll remove it from the next initscripts release.
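
For anyone wanting to try the same experiment, one way to drop the option is sketched below. It is not the patch that was eventually committed: strip_mount_n is a made-up helper that filters a script from stdin to stdout, and it should be run on a copy of the file and the result reviewed by hand, since it only handles a standalone -n on lines that invoke mount.

```shell
#!/bin/sh
# strip_mount_n is a hypothetical helper. It reads a script on stdin and
# writes it to stdout with the first standalone -n removed from each line
# that invokes mount (as "mount" or as a path ending in "/mount").
strip_mount_n() {
    sed -E '\#(^|[[:space:]/])mount[[:space:]]# s/[[:space:]]-n([[:space:]]|$)/\1/'
}

printf '%s\n' '/bin/mount -n -o remount,rw /' | strip_mount_n
```

Lines that call umount or that do not mention mount at all are passed through unchanged.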
Comment by Alexander Lam (lambchops468) - Saturday, 14 January 2012, 00:23 GMT
Yeah, that fixes it.
Attached is where I removed "-n" from mount
Comment by Tom Gundersen (tomegun) - Saturday, 14 January 2012, 01:27 GMT
Thanks for testing. I pushed a patch based on your test: <http://projects.archlinux.org/initscripts.git/commit/?id=5dd3fbaa93c157cfa37351324de06096f4377808>.
