FS#30134 - systemd fails to mount dmraid targets by label

Attached to Project: Arch Linux
Opened by Simeon (bladud) - Monday, 04 June 2012, 04:09 GMT
Last edited by Dave Reisner (falconindy) - Sunday, 05 August 2012, 00:07 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Dave Reisner (falconindy)
Tom Gundersen (tomegun)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:

systemd version 184-2

I have in my /etc/fstab a line like:

LABEL="Disc" /mnt/Disc ext4 defaults 0 0

where /dev/disk/by-label/Disc is a symlink to, say, /dev/dm-1

systemd generates a mount unit for this, mnt-Disc.mount, which is required by local-fs.target.
However, systemd cannot start this mount unit; the error is:

systemd[1]: Job dev-disk-by\x2dlabel-Disc.device/start timed out.
systemd[1]: Job mnt-Disc.mount/start failed with result 'dependency'.
systemd[1]: Job dev-disk-by\x2dlabel-Disc.device/start failed with result 'timeout'.
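
The unit names in the log are derived from paths: the leading slash is dropped, literal dashes become \x2d, and the remaining slashes become dashes. A minimal sketch of that mapping follows. The helper name unit_name_from_path is hypothetical, and the sketch only handles "/" and "-", not the other characters systemd escapes.

```shell
# Simplified sketch of systemd's path-to-unit-name escaping.
# Assumption: only "/" and "-" need handling; real systemd escapes more characters.
unit_name_from_path() {
    # $1: absolute path, $2: unit suffix such as "device" or "mount"
    escaped=$(printf '%s' "${1#/}" | sed -e 's/-/\\x2d/g' -e 's|/|-|g')
    printf '%s.%s\n' "$escaped" "$2"
}
```

Calling it with /dev/disk/by-label/Disc and the suffix "device" yields the device unit named in the log; /mnt/Disc with the suffix "mount" yields mnt-Disc.mount.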

There is an easy workaround: use the device name in fstab instead of the label to mount the
partition, e.g.,

/dev/dm-1 /mnt/Disc ext4 defaults 0 0

works fine.

I had dmraid in my mkinitcpio.conf, and dmraid.service enabled.

Let me know if you need any more information

Closed by Dave Reisner (falconindy)
Sunday, 05 August 2012, 00:07 GMT
Reason for closing: Upstream
Additional comments about closing: Not much for Arch to do here. See long stream of comments.
Comment by Dave Reisner (falconindy) - Monday, 04 June 2012, 09:56 GMT
Since mounting by device name works, have you checked whether the udev symlinks are being created? I'm not sure why you're adding dmraid to the initramfs, since you don't need it there.
Comment by Simeon (bladud) - Monday, 04 June 2012, 15:24 GMT
Turns out the "by device label" thing was a red herring - the real issue is "will not mount if the dmraid hook is in mkinitcpio.conf"

The funny thing is, doing "mount /mnt/Disc" in systemd's emergency shell will mount the disc just fine.
It's not timing based, because trying to mount a different dmraid device with the systemctl target after manually mounting the first will still fail.
Comment by Dave Reisner (falconindy) - Monday, 04 June 2012, 15:33 GMT
Ah, I think I might know what the problem is...

"/usr/lib/initcpio/udev/11-dm-initramfs.rules" is part of the device-mapper package, which you should have installed. If you add that to the image via the FILES= var in your config and regenerate, does systemd stop complaining?
Comment by Simeon (bladud) - Tuesday, 05 June 2012, 02:48 GMT
Hm. No, that didn't seem to make a difference.

Is there some way I can check what udev knows about the device when I'm in the emergency shell?
Comment by Dave Reisner (falconindy) - Tuesday, 05 June 2012, 02:51 GMT
Sure, 'systemctl -t device' is relevant. Can you confirm that when it fails by label that /dev/disk/by-label/Disc exists?
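
A small sketch of the symlink check asked for here, usable from the emergency shell. The function name check_label_link is hypothetical, and readlink -f is assumed to be available (it is in GNU coreutils).

```shell
# Report where a by-label symlink points, or that it does not exist.
check_label_link() {
    link="$1"
    if [ -L "$link" ]; then
        printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
    else
        printf '%s is missing\n' "$link"
        return 1
    fi
}

# Possible use alongside the device unit listing:
#   check_label_link /dev/disk/by-label/Disc
#   systemctl -t device --all
```

If the link resolves to a member disk such as /dev/sdb1 rather than a /dev/dm-N node, the label is pointing at the wrong device.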
Comment by Dave Reisner (falconindy) - Thursday, 07 June 2012, 12:56 GMT
Can you fix your /usr/lib/systemd/system/systemd-udev-settle.service according to the patch below?

http://cgit.freedesktop.org/systemd/systemd/commit/?id=a2368a3f37ede469d4359421c1e4ad304c682a07
Comment by Simeon (bladud) - Saturday, 09 June 2012, 05:58 GMT
Nope, no change with the fix to systemd-udev-settle.service.

I also confirmed that /dev/disk/by-label/Disc exists, and indeed "mount /dev/disk/by-label/Disc" works as expected.
Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 14:29 GMT
Please post your /etc/mkinitcpio.conf
Comment by Simeon (bladud) - Sunday, 10 June 2012, 17:09 GMT
Now running systemd 185-1.

The situation has changed a little. There is now a separate bug: mounting by label appears to succeed,
but actually mounts one of the member volumes of the dmraid array (even if dmraid is not in HOOKS in mkinitcpio.conf).
This is because /dev/disk/by-label/Disc points to /dev/sdb1 (or whatever the member volumes of the array are called). That in turn happens because
mnt-Disc.mount gets run just after local-fs-pre.target, while dmraid.service, which assembles the arrays, only gets run just before basic.target, which may be later. Changing fstab to mount /dev/dm-2 on /mnt/Disc fixes this problem, but the behaviour this bug is about is unchanged.
Comment by Simeon (bladud) - Sunday, 10 June 2012, 17:10 GMT
Problematic mkinitcpio.conf

To make it work, just delete "dmraid" from the HOOKS array.
Comment by Simeon (bladud) - Sunday, 10 June 2012, 17:12 GMT
The output of "systemctl show dev-disk-by\x2dlabel-Disc.device" at the emergency shell after mounting has failed.

Note I checked explicitly after doing this that /dev/disk/by-label/Disc did point to /dev/dm-2, and that doing a manual mount worked fine.
Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 17:19 GMT
"This is because /dev/disk/by-label/Disc points to /dev/sdb1"

This sounds really messed up. I'm really not familiar with dmraid, but if it's anything like other stacked filesystems, you should not be labeling the member devices (and if you are, you definitely shouldn't be giving them identical labels). Can you attach blkid output as well as your complete /etc/fstab after everything is assembled?

I'm still curious: why are you including dmraid in mkinitcpio config when assembly in early userspace isn't required?
Comment by Simeon (bladud) - Sunday, 10 June 2012, 17:22 GMT
Relevant bits of "systemctl -t device --all --full" at the same emergency shell as above.
The dmarray does have another partition on it, but I've commented out its mountpoint in fstab for debugging.

Also my real root is on an md array, whose various devices are not shown in the log.
Comment by Simeon (bladud) - Sunday, 10 June 2012, 17:41 GMT
"I'm still curious: why are you including dmraid in mkinitcpio config when assembly in early userspace isn't required?"

Simple ignorance, I fear. I thought, from my reading of the mkinitcpio help and the man page, that
the dmraid hook had to be included to mount dmraid arrays;
I didn't realise that it was enough to run the dmraid service later.

I guess now that the dmraid hook is just for a root on a dm device?

I'd be totally happy if you want to close the bug with "don't do that then" and maybe add a note to some documentation somewhere, except that now I'm kind of curious as to what goes on. Although if you have better things to do than assuage my curiosity, feel free to tell me to go away :)

Also I had no idea you weren't supposed to label member devices - I attach the output of blkid,
where you can see that each element in my dm arrays does indeed have a label, identical to the label of the /dev/mapper device, but the md devices don't. Although - my understanding was that unlike md each dm device is identical to each other, and there is no metadata block. So it makes sense that they would be labelled, no? There is nowhere else to store the label of the assembled device.
Comment by Simeon (bladud) - Sunday, 10 June 2012, 17:43 GMT
Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 17:55 GMT
> There is nowhere else to store the label of the assembled device.
Labels and UUIDs generally belong to the filesystem, not to the underlying physical devices.

mkinitcpio's only job is supposed to be setting up the root partition, mounting it, and exec'ing the real root. Setting up more than that should be supported, but there's no real need to do it. In the case of using device-mapper, there are always some extra gotchas because it doesn't play well with udev hotplug (i.e. it requires the udev-settle service).

The problem here is simply a case of the labels clashing, since you have multiple devices with the same label. My hypothesis is: a by-label symlink is created for every device (meaning that they'll be overwritten). In the case of the array being assembled in early userspace, the array already exists and the later userspace udev-trigger event recreates the symlinks in an unreliable order, so you end up with the symlink pointing to the wrong device, one that never shows up with a mountable filesystem. In the case of assembly only by systemd in later userspace, the 2 member devices appear, followed by the raid array being assembled (which conveniently overwrites the by-label symlink with a pointer to the real FS).

The short version is, there isn't really anything I can do to fix this on my side.

As I see it, you've got a few options, in no particular order:
- Differentiate the assembled device from its members, either by removing the label from the members, or changing the label on the assembled device. Again, I'm not familiar with the intricacies of dmraid, so I'm not sure if this will work.
- Remove dmraid from your initramfs config.
- Refer to the dmraid device by something other than its label. UUID will certainly be unique.
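
To see whether a system has the label clash described above, one rough approach is to look for labels that blkid reports on more than one device. This is a sketch only: the function name duplicate_labels is hypothetical, and it assumes blkid's default one-line-per-device output with a LABEL="..." field.

```shell
# Print every filesystem label that appears on more than one device.
# Reads blkid-style lines on stdin, e.g.:  blkid | duplicate_labels
duplicate_labels() {
    # The leading space keeps PARTLABEL="..." from matching.
    sed -n 's/.* LABEL="\([^"]*\)".*/\1/p' | sort | uniq -d
}
```

Any label it prints is one for which the /dev/disk/by-label symlink can only point at one of several candidate devices.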
Comment by Simeon (bladud) - Sunday, 10 June 2012, 18:33 GMT
I don't completely buy your hypothesis; it doesn't explain why changing my fstab to mount by device name, /dev/dm-1, doesn't work when the dmraid hook is enabled. That at least should be unique. Also the initcpio hook calls dmraid -ay -Z, and the -Z removes member devices from the tree, which I think is why member devices can have the same labels.
(I just tried running the initcpio hook without that option and there was no change)

Incidentally, dmraid.service should have the "-Z" option added to dmraid when it is called. This removes member volumes from blkid and ensures that once we end in multi-user with everything done, there is only one volume for the dm array, which prevents a lot of confusion.
It probably also fixes the bug where /dev/disk/by-label/Disc points to the wrong place.

I'm willing to accept the explanation that udev is somehow getting confused and there's nothing you can do about it.
Perhaps a note could be put in mkinitcpio.conf to say that users should only add the hooks they need to mount their root device, and that the dmraid hook may be problematic with systemd?
For me, I shall just remove dmraid from my mkinitcpio.conf.

Thanks for the help.
Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 18:35 GMT
Can you confirm that adding -Z to the dmraid.service fixes this when dmraid is in the initramfs hooks? I'd really like to understand what's breaking here. I don't have the hardware to test this myself.
Comment by Simeon (bladud) - Sunday, 10 June 2012, 18:35 GMT
Altered dmraid.service that fixes some unrelated bugs with the -Z option to dmraid.
Comment by Simeon (bladud) - Sunday, 10 June 2012, 18:55 GMT
Sorry, to summarize and be a bit clearer, there are now three overlapping bugs here:

1. Adding dmraid to the mkinitcpio HOOKS array causes mounting any dmraid array with systemd as part of the normal, post-initcpio boot process to fail, although they can still be mounted manually with the mount command. This can apparently only be fixed by not including dmraid in the mkinitcpio HOOKS array. I can do this because my root is not a dmraid array; I have no idea whether dmraid on root works or not.

2. Mounting a dm device by label in fstab has a tendency to mount the member volume of the dm array, because the member volumes and the assembled array have the same labels, and the systemd unit that corresponds to the mount may be run before dmraid.service assembles the array. This can be fixed by not mounting by label. If you mount by device name or uuid, systemd will wait until that device name or uuid exists before trying the mount. The label, however, will probably exist before the device is actually ready.

3. Once you have successfully mounted your dmraid array, file managers such as kde's dolphin, and also blkid, will show multiple volumes with the same label (that of the dmraid array). This is because when dmraid is being called it is assembling the array, but not removing the member devices. This can be fixed by adding the "-Z" option to dmraid in dmraid.service, which removes the member devices when the main array is assembled. Unfortunately it doesn't help to fix the other two bugs.
Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 19:22 GMT
1. root on dmraid should work just fine -- systemd won't be involved. I'd rather not just accept that assembly of accessory devices "doesn't work" if we can figure out a fix.

2. It's worth pointing out that the label "always" exists. The symlinks you see in /dev/disk/by-* are created by udev events when the devices appear. See /usr/lib/udev/rules.d/60-persistent-storage.rules or 13-dm-disk.rules in the same dir for examples.

3. Cool, that much is committed and will be shipped with the next dmraid package.

-----

We should probably look at the device nodes in question. systemd will always tag block devices so it can keep track of them. Can you post the output of 'udevadm info --query=all -n /dev/mapper/nvidia_bedfgeadp3' booting with both dmraid in the initramfs and without?
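
One way to make the two boots comparable is to save each udevadm dump to a file and diff only the property lines, which udevadm prefixes with "E:". A sketch under those assumptions follows; the function name udev_props_diff and the file names are hypothetical.

```shell
# Save one dump per boot configuration first, for example:
#   udevadm info --query=all -n /dev/mapper/nvidia_bedfgeadp3 > with-hook.txt
udev_props_diff() {
    # $1, $2: saved udevadm dumps; prints property lines that differ.
    a=$(mktemp)
    b=$(mktemp)
    grep '^E:' "$1" | sort > "$a"
    grep '^E:' "$2" | sort > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

A property that is present in one dump but missing from the other, such as the systemd tag, shows up as a single diff line.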
Comment by Simeon (bladud) - Sunday, 10 June 2012, 20:01 GMT
1. Ok! Let's debug! Good that dmraid root will probably work.

2. I guess the problem here is that systemd can't know that the symlink that appears first, to the label of the member volume, isn't the one to use, but that it should wait until dmraid is run to mount the real volume. The "systemd" way to fix this would, I guess, be to create an explicit .mount unit which has an explicit dependency on dmraid.service. But I can't think of any way to let fstab contain that information. I guess one thing that could be done is to make dmraid.service happen before local-fs-pre.target, rather than basic.target (so it runs before any mountpoints). Hopefully dmraid removes the old volumes before assembling the new ones, and udev's symlinks keep up, so by the time we get to mnt-Disc.mount, the symlink to the new volume is the only one there is. I can try that, I guess.

3. Great! One down, two to go.

I'll post the udev info in a second
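
The explicit .mount unit idea from point 2 could be sketched as below. This is untested against this setup and makes several assumptions: the helper name write_mount_unit is hypothetical, the device, mount point and filesystem type are taken from the fstab line in the report, and no [Install] section is given because how the unit should be pulled in depends on where dmraid.service sits in the boot order.

```shell
# Write a hand-made mount unit that waits for dmraid.service.
# $1: directory to write into (e.g. /etc/systemd/system on a real system).
write_mount_unit() {
    unit_dir="$1"
    cat > "$unit_dir/mnt-Disc.mount" <<'EOF'
[Unit]
Description=Mount /mnt/Disc once dmraid has assembled the array
Requires=dmraid.service
After=dmraid.service

[Mount]
What=/dev/disk/by-label/Disc
Where=/mnt/Disc
Type=ext4
Options=defaults
EOF
}
```

The unit file name has to match the mount point (mnt-Disc.mount for /mnt/Disc), and the corresponding fstab line would need to be removed so that the generated unit does not conflict with it.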
Comment by Simeon (bladud) - Sunday, 10 June 2012, 20:03 GMT
udevadm info --query=all -n /dev/mapper/nvidia_bedfgeadp3

without dmraid hook in mkinitcpio
Comment by Simeon (bladud) - Sunday, 10 June 2012, 20:46 GMT
udevadm info --query=all -n /dev/mapper/nvidia_bedfgeadp3

with dmraid hook in mkinitcpio
Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 20:48 GMT
Maybe worth comparing with this: https://bugzilla.redhat.com/show_bug.cgi?id=679321 ?
Comment by Simeon (bladud) - Sunday, 10 June 2012, 20:50 GMT
"I guess one thing that could be done is to make dmraid.service happen before local-fs-pre.target, rather than basic.target (so it runs before any mountpoints). Hopefully dmraid removes the old volumes before assembling the new ones, and udev's symlinks keep up, so by the time we get to mnt-Disc.mount, the symlink to the new volume is the only one there is."

This was, it turns out, a bad idea. udev could sometimes keep up, but not always; at one point the whole system hung on boot, I *think* because the symlink changed partway through the mount. That time I didn't even get an emergency shell. So let's just not be mounting these things by label.
Comment by Dave Reisner (falconindy) - Sunday, 05 August 2012, 00:07 GMT
There isn't much I can do about this. dmraid has no active upstream maintainer and systemd upstream doesn't care about dmraid.

Bottom line, don't try to activate these things from the initramfs if you aren't going to be mounting them there as well.
