FS#25132 - [mdadm] mkinitcpio hook does not contain any "mdadm" call

Attached to Project: Arch Linux
Opened by Phillip Keldenich (Zac) - Friday, 15 July 2011, 08:05 GMT
Last edited by Tobias Powalowski (tpowa) - Sunday, 22 January 2012, 10:52 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture All
Severity Very Low
Priority Low
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 5
Private No

Details

Description:
After updating from 0.7.1, my system no longer boots automatically and I am dropped to the recovery shell, because the mdadm hook fails to create my correctly specified and configured md0 device; cryptsetup then fails because /dev/md0, where it expects an encrypted root raid0, does not exist.
I can, however, boot the machine by executing:

mdadm --assemble md0
cryptsetup luksOpen /dev/md0 root
exit

So the mdadm.conf file works and so does cryptsetup.

Additional Information:

The mdadm hook file looks like this:

run_hook ()
{
    input="$(cat /proc/cmdline)"
    mdconfig="/etc/mdadm.conf"
    # for partitionable raid, we need to load md_mod first!
    modprobe md_mod 2>/dev/null
    # If md is specified on commandline, create config file from those parameters.
    if [ "$(echo $input | grep "md=")" ]; then
        # Create initial mdadm.conf
        # scan all devices in /proc/partitions
        echo DEVICE partitions > $mdconfig
        for i in $input; do
            case $i in
                # raid
                md=[0-9]*,/*)
                    device="$(echo "$i" | sed -e 's|,/.*||g' -e 's|=||g')"
                    array="$(echo $i | cut -d, -f2-)"
                    echo "ARRAY /dev/$device devices=$array" >> $mdconfig
                    ;;
                # partitionable raid
                md=d[0-9]*,/*)
                    device="$(echo "$i" | sed -e 's|,/.*||g' -e 's|=|_|g')"
                    array="$(echo $i | cut -d, -f2-)"
                    echo "ARRAY /dev/$device devices=$array" >> $mdconfig
                    ;;
                # raid UUID
                md=[0-9]*,[0-9,a-z]*)
                    device="$(echo "$i" | sed -e 's|,.*||g' -e 's|=||g')"
                    array="$(echo $i | cut -d, -f2-)"
                    echo "ARRAY /dev/$device UUID=$array" >> $mdconfig
                    ;;
                # partitionable raid UUID
                md=d[0-9]*,[0-9,a-z]*)
                    device="$(echo "$i" | sed -e 's|,.*||g' -e 's|=|_|g')"
                    array="$(echo $i | cut -d, -f2-)"
                    echo "ARRAY /dev/$device UUID=$array" >> $mdconfig
                    ;;
            esac
        done
        # If I add "mdadm --assemble md0" here, everything works fine for me...
    fi
}
---------------------------------------------------------------------------
Is it normal that this file does not even try to execute mdadm, and that it is a no-op
if /etc/mdadm.conf already exists or there are no md=... arguments?

---------------------------------------------------------------------------
Output of mkinitcpio -v:
`--> sudo mkinitcpio -v
==> Starting dry run: 2.6.39-ARCH
adding file: /lib/firmware/tigon/tg3_tso5.bin
adding file: /lib/firmware/tigon/tg3_tso.bin
adding file: /lib/firmware/tigon/tg3.bin
adding file: /lib/firmware/iwlwifi-5150-2.ucode
adding file: /lib/firmware/iwlwifi-5000-5.ucode
adding file: /lib/firmware/iwlwifi-6000g2b-5.ucode
adding file: /lib/firmware/iwlwifi-6000g2a-5.ucode
adding file: /lib/firmware/iwlwifi-6050-5.ucode
adding file: /lib/firmware/iwlwifi-6000-4.ucode
adding file: /lib/firmware/iwlwifi-100-5.ucode
adding file: /lib/firmware/iwlwifi-1000-5.ucode
-> Parsing hook: [base]
adding dir: /proc
adding dir: /sys
adding dir: /dev
adding dir: /run
adding dir: //usr/bin
adding dir: //usr/sbin
adding file: /bin/busybox
adding symlink: /lib/libc-2.14.so -> /lib/libc.so.6
adding file: /lib/libc-2.14.so
adding symlink: /lib/ld-2.14.so -> /lib/ld-linux-x86-64.so.2
adding file: /lib/ld-2.14.so
adding file: /sbin/modprobe
adding file: /sbin/blkid
adding symlink: /lib/libblkid.so.1.1.0 -> /lib/libblkid.so.1
adding file: /lib/libblkid.so.1.1.0
adding symlink: /lib/libuuid.so.1.3.0 -> /lib/libuuid.so.1
adding file: /lib/libuuid.so.1.3.0
adding file: /init_functions
adding file: /init
adding file: /etc/modprobe.d/usb-load-ehci-first.conf
-> Parsing hook: [udev]
adding file: /sbin/udevd
adding symlink: /lib/librt-2.14.so -> /lib/librt.so.1
adding file: /lib/librt-2.14.so
adding symlink: /lib/libpthread-2.14.so -> /lib/libpthread.so.0
adding file: /lib/libpthread-2.14.so
adding file: /sbin/udevadm
adding file: /lib/udev/rules.d/50-firmware.rules
adding file: /lib/udev/rules.d/50-udev-default.rules
adding file: /lib/udev/rules.d/60-persistent-storage.rules
adding file: /lib/udev/rules.d/80-drivers.rules
adding file: /lib/udev/firmware
adding file: /lib/udev/ata_id
adding file: /lib/udev/path_id
adding file: /lib/udev/scsi_id
adding file: /lib/udev/usb_id
adding file: /etc/udev/udev.conf
adding file: /hooks/udev
-> Parsing hook: [autodetect]
-> Parsing hook: [scsi]
-> Parsing hook: [usbinput]
-> Parsing hook: [keymap]
adding file: /hooks/keymap
-> Parsing hook: [mdadm]
Custom /etc/mdadm.conf file will be used in initramfs for assembling arrays.
adding file: /etc/mdadm.conf
adding file: /sbin/mdadm
adding file: /lib/udev/rules.d/64-md-raid.rules
adding file: /hooks/mdadm
-> Parsing hook: [encrypt]
adding file: /sbin/cryptsetup
adding symlink: /lib/libcryptsetup.so.1.2.0 -> /lib/libcryptsetup.so.1
adding file: /lib/libcryptsetup.so.1.2.0
adding symlink: /lib/libpopt.so.0.0.0 -> /lib/libpopt.so.0
adding file: /lib/libpopt.so.0.0.0
adding file: /lib/libdevmapper.so.1.02
adding symlink: /lib/libgcrypt.so.11.6.0 -> /lib/libgcrypt.so.11
adding file: /lib/libgcrypt.so.11.6.0
adding symlink: /lib/libgpg-error.so.0.7.0 -> /lib/libgpg-error.so.0
adding file: /lib/libgpg-error.so.0.7.0
adding symlink: /lib/libudev.so.0.11.5 -> /lib/libudev.so.0
adding file: /lib/libudev.so.0.11.5
adding file: /sbin/dmsetup
adding file: /lib/udev/rules.d/10-dm.rules
adding file: /lib/udev/rules.d/13-dm-disk.rules
adding file: /lib/udev/rules.d/95-dm-notify.rules
adding file: /lib/udev/rules.d/11-dm-initramfs.rules
adding file: /hooks/encrypt
-> Parsing hook: [filesystems]
==> Generating module dependencies
==> Dry run complete, use -g IMAGE to generate a real image
------------------------------------------------------------------------------------
The /etc/mdadm.conf file contains a valid configuration for /dev/md0.
------------------------------------------------------------------------------------
Finally, a shortened version of my /etc/mkinitcpio.conf:

MODULES=""
BINARIES=""
FILES=""
HOOKS="base udev autodetect scsi usbinput keymap mdadm encrypt filesystems"
COMPRESSION="xz"

Steps to reproduce:
I do not know; does everything work fine if you use an encrypted root on a raid0 device?
This task depends upon

Closed by  Tobias Powalowski (tpowa)
Sunday, 22 January 2012, 10:52 GMT
Reason for closing:  Fixed
Additional comments about closing:  if ARRAY is defined it's fixed!
Comment by Florian Pritz (bluewind) - Friday, 15 July 2011, 08:48 GMT
I think the rules in /lib/udev/rules.d/64-md-raid.rules are supposed to take care of that by calling mdadm -I for every new device.
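Roughly, the relevant part of that rules file has this shape (paraphrased from memory, not the verbatim shipped rule):

# any block device whose signature marks it as a raid member is handed
# to mdadm for incremental assembly
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"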
Comment by Tobias Powalowski (tpowa) - Saturday, 16 July 2011, 07:05 GMT
udev should activate everything.
Comment by Phillip Keldenich (Zac) - Saturday, 16 July 2011, 07:22 GMT
I agree with you that it should; but at least in my case, it does not. Is there anything needed in /etc/mdadm.conf except the two "DEVICE" entries for the devices (by-id) and the "ARRAY" entry?
Comment by Thomas Bächler (brain0) - Saturday, 16 July 2011, 18:11 GMT
If you report a problem, please report the actual problem and not what you think _might_ be the cause. The mdadm hook is not supposed to contain any runtime files, and works without any. It also works without any mdadm.conf file, only the 'md?' names will change then.

It is strange that the mdadm hook tries to do anything at all; all of that is unnecessary and will probably have no effect.
Comment by Thomas Bächler (brain0) - Saturday, 16 July 2011, 18:17 GMT
I just confirmed this on my test setup: without the mdadm runtime hook, and without mdadm.conf, the system still assembles the raid fine and boots. The only thing that changes is that md0 is renamed to md127.
Comment by Phillip Keldenich (Zac) - Monday, 18 July 2011, 14:18 GMT
In my case, if I remove the /etc/mdadm.conf file and then run mkinitcpio, I end up with no chance of booting my system from the generated initramfs. I tried this by renaming the file; I was not able to boot because the drivers for my chipset (ahci or something else) went missing from the initramfs and none of my hard drives were found.

Renaming the file was the only thing I changed.
Comment by Thomas Bächler (brain0) - Monday, 18 July 2011, 14:29 GMT
Ehm, sorry? If you don't know how to properly regenerate and test your initramfs, then go to the forums or wherever and get help. The bugtracker is not here to teach you how to do these simple tasks.

The fact that you end up without ahci after changing a file completely unrelated to mkinitcpio tells me you have no idea what you are doing, and your problem is most likely a case of PEBKAC. If it is not, please provide the correct debugging information.
Comment by Chris Bannister (Zariel) - Monday, 18 July 2011, 14:57 GMT
The same thing is happening to me: the arrays are being brought up as /dev/md12{3,4,5,6,7} (there are only 4; 123 is /dev/sd{a,b}). It drops me to the recovery shell, where I can stop the arrays and do a mdadm -A --scan, which brings up the arrays correctly as my boot line says, /dev/md{1,2,3,4}, and then I can boot fine.

This is just a standard mdadm raid1 setup, no lvm or luks.
Comment by Phillip Keldenich (Zac) - Monday, 18 July 2011, 15:19 GMT
I did not look for or require any help to regenerate my initramfs. I was just replying to your comment that, without /etc/mdadm.conf and the mdadm hook, your system assembles the md; mine does not.

My system is booting correctly with the changed hook - I do not care whether udev creates my root device automatically or my changed version of the mdadm hook file does.

This might also be something wrong in my udev configuration, though I did not touch any of it; as a user I was not able to figure out which component was failing. The only thing I noticed was that there was a mkinitcpio update, that the regenerated initramfs no longer booted correctly with an unchanged configuration, and that the md was not built by the time the mdadm hook was running, which it had been in the former version.
From this I concluded this might well be a bug in mkinitcpio, because none of the other components or the configuration changed.

I then changed the hook file to fix the annoying 20 second wait at startup and the need to bring up my md manually.
Comment by Thomas Bächler (brain0) - Monday, 18 July 2011, 15:29 GMT
@Zariel, the md= command line parameters cannot work in the current configuration and should not be used (in fact, the parsing code for the md= options should be removed). Either generate mdadm.conf and regenerate initramfs, or use labels/uuids if you don't care about the md device names.
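For reference, generating the config and rebuilding the image is roughly this (the preset name may differ on your system):

mdadm --examine --scan >> /etc/mdadm.conf   # append ARRAY lines for every detected array
mkinitcpio -p linux                         # rebuild the initramfs from the preset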

@Zac, your last comment stated that you lost ahci in your initramfs after moving mdadm.conf, which are entirely unrelated, so I am pretty sure you are doing something wrong.
Comment by Chris Bannister (Zariel) - Monday, 18 July 2011, 15:32 GMT
I've tried using both, and neither names the devices correctly.
Comment by Jens Adam (byte) - Tuesday, 19 July 2011, 15:34 GMT
I had to change my mdadm.conf after the initscripts/udev update in April, when mdadm support switched to udev alone.
Before I had a rather specific "DEVICE /dev/disk/by-id/ata-Hitachi*" line, which stopped working.
The default "DEVICE partitions" was OK.
Comment by Felix (thetrivialstuff) - Saturday, 13 August 2011, 04:24 GMT
I upgraded today and I too am experiencing this. I think the problem is that the udev rules aren't handling certain cases -- it looks to me as if /lib/udev/rules.d/64-md-raid.rules will examine the following:

- devices linked from /dev/disk/by-id/
- devices linked from /dev/md/
- "ordinary" devices, e.g. /dev/sdXN
- devices linked from /dev/disk/by-uuid/
- devices linked from /dev/disk/by-label/

(I could be wrong about this, though; I find udev rules cryptic and don't understand them very well...)

If udev tries "mdadm -I" on a device that is a RAID member, but isn't mentioned in a DEVICE line (say, because udev is calling "mdadm -I /dev/sda1" but mdadm.conf names that partition as /dev/disk/by-id/something-part1 instead), the call will fail and the array will not assemble.
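To illustrate with a made-up mdadm.conf fragment (the path and UUID are invented):

# udev calls "mdadm -I /dev/sda1", but /dev/sda1 does not match this DEVICE
# line, so incremental assembly rejects the device:
DEVICE /dev/disk/by-id/ata-SomeDisk_SERIAL1234-part1
ARRAY /dev/md0 UUID=01234567:89abcdef:01234567:89abcdef
# the permissive default, by contrast, accepts any partition the kernel knows about:
# DEVICE partitions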

Small rant:

Doing RAID assembly with udev when there is a custom mdadm.conf strikes me as overly complex -- the arrays could be assembled by just calling "mdadm --assemble" for each ARRAY named in mdadm.conf (as the original reporter is doing manually). That's a lot simpler and more transparent in case of problems than a big mess of udev rules that assemble the array if and only if udev happens to, more or less by brute force, try all of the devices that might be mentioned in DEVICE and ARRAY lines.

Sorry, but I'm getting tired of RAID arrays breaking on upgrades -- "persistent" device names aren't looking so persistent.

~Felix.

PS: Jens, the specific names you had were better, and you should try to get them working again. "partitions" makes it too easy to accidentally pull/replace the wrong device when a disk fails, because it's difficult to tell which one failed.
Comment by Tim O'Brien (timob) - Saturday, 13 August 2011, 04:33 GMT
On upgrade, the mdadm hook was taken out of my /etc/mkinitcpio.conf, meaning I had to reboot with a live CD and recreate my initramfs.

My mdadm.conf contains: DEVICE partitions
Comment by Dave Reisner (falconindy) - Saturday, 13 August 2011, 05:06 GMT
The udev rule is responsible for _creating_ those /dev/disk/by-* symlinks after assembling and starting each raid array.

udev reads attributes on block devices, which are created earlier in the ruleset. Anything marked with an ID_FS_TYPE of 'linux_raid_member' or 'isw_raid_member' invokes mdadm --incremental to try to start the array. If an array is started, the kernel creates an md* device, which triggers creation of the aforementioned symlinks.
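You can check what udev recorded for a member device with something like this (the device name is only an example):

udevadm info --query=property --name=/dev/sda1 | grep ^ID_FS_TYPE
# a raid member shows: ID_FS_TYPE=linux_raid_member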

I'm not sure I follow the griping about (lack of?) persistent naming. Label your filesystem. Boot with root=LABEL=mysweetfs or root=/dev/disk/by-label/mysweetfs. Furthermore, inclusion of an mdadm.conf gives the array an appropriate name on incremental assembly so you can identify by something like /dev/md/mysweetraid. You're far from out of options here. It's not really clear to me what's broken to the point that people are unable to boot...
Comment by Felix (thetrivialstuff) - Saturday, 13 August 2011, 05:35 GMT
Here's my long rant on why /dev/disk/by-path is the safest way to name members of RAID arrays and boot devices and why everybody should be using it instead of by-label and by-uuid:

https://bbs.archlinux.org/viewtopic.php?id=81674

Short version:

Relying on drives (specifically, the *data* on the drives) to identify themselves is dangerous, for a variety of reasons.

Examples of when using UUID's or labels for raid members breaks down:

- During a drive failure, the system will go looking for the remaining RAID members. It will blindly add *any* devices that it finds with matching UUID's -- even if they're in a ridiculous location like an external USB port. Were you trying to recover a failed drive plugged into USB temporarily? Oops, you just booted to it and probably corrupted all the data badly.

- Suppose you and a friend both label your root device "root". His power supply blows, so you plug his drive into your eSATA port to retrieve some files for him. Oops, your system just booted to his drive because it initializes faster than yours.

- Suppose one of your RAID members fails. Which drive was it that failed, and what is it connected to?

Telling the system to boot from a path you know to be "the first SATA port on the motherboard inside the machine" is safer and avoids these problems.

~Felix.
Comment by Tim O'Brien (timob) - Saturday, 13 August 2011, 05:48 GMT
Comment by Felix (thetrivialstuff) - Saturday, 13 August 2011, 06:28 GMT
Comment by Dave Reisner (falconindy): "udev reads attributes on block devices, which are created earlier in the ruleset. Anything marked with an ID_FS_TYPE of 'linux_raid_member' or 'isw_raid_member' invokes mdadm --incremental to try to start the array."

Wait... does this mean that there is now *no way* to persistently name RAID members? That we're down to the equivalent of the old RAID auto-detect/assemble and that's our only choice?
Comment by Felix (thetrivialstuff) - Saturday, 13 August 2011, 07:30 GMT
Proposed fix to /lib/initcpio/hooks/mdadm. Calling mdadm --assemble on an array that's already assembled has no effect, so this is harmless to any arrays that udev assembles first.

Conversely, calling mdadm -I on a device that's already in an assembled array is also harmless (though it does output a warning message about the device being busy). I think udev would be finished with its attempts to create arrays by the time this runs, though?
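In short, the idea is a run_hook along these lines (simplified sketch only; see the attachment for the actual patch):

run_hook ()
{
    # assemble every ARRAY named in /etc/mdadm.conf; re-assembling an
    # already-running array is a harmless no-op (naive field splitting here)
    [ -f /etc/mdadm.conf ] || return 0
    for array in $(grep '^ARRAY' /etc/mdadm.conf | cut -d' ' -f2); do
        mdadm --assemble "$array"
    done
}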

I've tested this on a virtual machine; will test it on my real server at the office later.

~Felix.
Comment by Felix (thetrivialstuff) - Friday, 19 August 2011, 04:03 GMT
OK, tested aforementioned patch on a production machine; it assembles the arrays cleanly and that machine now boots without help again.

~Felix.
Comment by Dave Reisner (falconindy) - Friday, 19 August 2011, 04:10 GMT
Your patch breaks if someone wants to assemble by label (omg, heresy!) and the label name has a space in it. I'm not sure this is correct, but...

grep ^ARRAY /etc/mdadm.conf | while read _ array; do
    mdadm --assemble "$array"
done

I'm not even sure we want something like this, but I think we can all agree that we don't want a broken patch.
Comment by Felix (thetrivialstuff) - Friday, 19 August 2011, 05:01 GMT
I thought the first argument to an ARRAY line had to be a /dev device name? Although I guess it is possible for those to have spaces in them too, yeah...

Well, I think there's an mdadm command line that'll cause mdadm to go through mdadm.conf, parse it all properly, and do the assembling. "mdadm --assemble --scan" or something?

There is (was) also a program called "mdassemble" that existed before raid assembly was moved over to udev rules, that did all of this quite nicely. But talking about mdassemble (which we're reinventing here, poorly) is basically asking for the whole "let's do this in udev" change to be completely thrown out, so I'm not sure that'll get us anywhere.

~Felix.
Comment by Thomas Bächler (brain0) - Friday, 19 August 2011, 07:15 GMT
While I'm not an expert on the mdadm architecture, the patch seems like something of a step back. If I understand this right, mdadm is supposed to have the capability to auto-assemble arrays once the underlying drives are detected. So instead of reverting to old behaviour, wouldn't it be better to find out why that doesn't work in some cases and fix it?

It would help a lot if anyone here would actually provide details about their raid setup - I cannot reproduce this problem, auto-assembly works fine for all my systems.
Comment by Tobias Powalowski (tpowa) - Friday, 19 August 2011, 07:29 GMT
As far as I can see here, some people use raid partitions.
Comment by Felix (thetrivialstuff) - Friday, 19 August 2011, 17:07 GMT
Here's my mdadm.conf . The problem is that the DEVICE names I'm using are symlinks created by udev. Since the new auto-assembly is based on udev rules, in order for udev to assemble my arrays it would need to trigger on its own symlinks. I'm not sure if this is possible.

As I understand the current rules' auto-assembly behaviour, it only looks at the actual device nodes -- which may not be persistently named -- and not at any symlinks to them. This doesn't matter if assembly is done by reading identifiers off each partition (UUIDs) and then matching them up into arrays; e.g. if the two pieces of a RAID-1 are /dev/sda1 and /dev/sdb1 on one boot but /dev/sdb1 and /dev/sdc1 on the next boot, it'll still assemble correctly, because udev will just pass those to mdadm -I one at a time and mdadm will figure out which array each new device should go into.

But if assembly is to be done by saying, "I want this partition on this physical disk and this partition on this physical disk in my array," (which I vastly prefer), this cannot be done by triggering on device names that are subject to change across boots. The conundrum is this:

- udev will only trigger on /dev/sdX names
- so, /dev/sdX names must be used in mdadm.conf; otherwise, mdadm will not know what to do with the argument given to -I
- but, /dev/sdX names are subject to change, and thus *cannot* be used in mdadm.conf

Where this becomes really apparent is when a hard drive fails -- even though /dev/sdX names are normally quite stable, if a disk is not detected or simply not present, everything after that disk will shift its name by one place. /dev/sdc will become /dev/sdb if the existing /dev/sdb dies. In the old days, the /dev/ names were stable -- /dev/hdb always meant "disk in the primary slave position" and it would stay /dev/hdb even if the primary master disk died. Using /dev/hdX naming, one *could* write a good mdadm.conf for these udev rules. But those days are gone, and if one wants to refer to disks based on where they're connected, in a way that remains stable even if earlier disks are removed, one needs to use /dev/disk/by-path .
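For example, a by-path based DEVICE line looks something like this (the exact path string depends on the controller; this one is invented):

DEVICE /dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0-part1 /dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:0:0-part1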

(Ultimately, I'm still of the opinion that removing the association between where a disk is connected and its default name in /dev/ was a mistake -- I would much prefer it if /dev/sda meant "SATA connector 1" instead of "the first disk that initializes, regardless of where it is" -- but that's neither here nor there :) )

~Felix.
Comment by Thomas Bächler (brain0) - Friday, 19 August 2011, 17:20 GMT
I was under the impression that mdadm *always* assembles by matching the UUIDs in the metadata. This is completely safe, so why do anything else?

In any case, I now finally know why this hook fails as it is (at least in your case) - using your configuration, you actually prevent mdadm from auto-assembling. You are right that udev calls mdadm before it creates the symlinks, thus your mdadm.conf is not working.

There is a fundamental problem with your approach though: The device creation is asynchronous, meaning it can take time until the devices show up - mkinitcpio might call the mdadm assembly before the devices are created. This problem is actually solved by udev auto-assembly.

I could think of this solution: split the mdadm functionality into two hooks:
1) mdadm - only add /sbin/mdadm, the udev rule and /etc/mdadm.conf, no runtime hook.
2) mdadm_manual - add the above, omit the udev rule and add the old runtime hook.
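Something along these lines, as a sketch only (exact install-hook details to be checked):

# /lib/initcpio/install/mdadm -- udev auto-assembly only, no runtime hook
install ()
{
    # sketch; exact helper usage to be verified against current mkinitcpio
    add_binary "/sbin/mdadm"
    [ -f /etc/mdadm.conf ] && add_file "/etc/mdadm.conf"
    add_file "/lib/udev/rules.d/64-md-raid.rules"
}

# /lib/initcpio/install/mdadm_manual -- no udev rule, old runtime hook instead
install ()
{
    add_binary "/sbin/mdadm"
    [ -f /etc/mdadm.conf ] && add_file "/etc/mdadm.conf"
    SCRIPT="mdadm_manual"
}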
Comment by Felix (thetrivialstuff) - Friday, 19 August 2011, 18:20 GMT
"mkinitcpio might call the mdadm assembly before the devices are created. This problem is actually solved by udev auto-assembly."

I've never run into that -- doesn't putting sata and pata before mdadm in HOOKS solve this? Or does initcpio not wait for hooks to finish running?

Anyway, based on earlier experiments I've done, I'm not convinced that matching UUID's is completely safe -- for instance, I've been able to cause bad RAID assembly, and in some cases data corruption by:

- dd'ing a partition onto an external USB drive -- when auto-assembling, mdadm doesn't care where something is and can pull it into the array. This is bad if the reason the disk was on USB was because you were in the middle of trying to recover it.

- plugging in the components of an array from a different system. all the components are there, so mdadm starts it -- maybe the reason I plugged those drives in was for data recovery, and the disks shouldn't be used except for dd'ing onto backups.

- maliciously creating a disk with the same UUID and superblock. mdadm will assume that such a device is clean and add it to the array, even though the data might differ completely from what's on the good disk. (Granted, attacks of this kind are quite unlikely, but possible if one has physical access to USB ports but not inside the machine's case. Where might this be true? Something like a "Userful" setup, commonly used in schools and libraries.) What bothers me about this is that an unprivileged user can (on their own machine) create a disk with the correct UUID and superblock, then plug it into a machine on which they do not have root, and have mdadm consider the disk on a level playing field with the internal SATA ports.

- marking a disk 'failed' -- apparently, doing this does not prevent the 'failed' disk from being a member of an array in future. e.g. try this:
1. make a RAID-1 array /dev/md0 of /dev/sda1 and /dev/sdb1
2. mdadm /dev/md0 --fail /dev/sdb1 ; mdadm /dev/md0 --remove /dev/sdb1
3. now stop /dev/md0 and then mdadm --assemble --scan
mdadm recognizes that the failed disk (sdb1) should not be paired with the good one (sda1) -- but it still makes it part of an array (in the example I just tried, it calls it /dev/md/0_0). So now there are two arrays:
/dev/md0 -- consisting of the good disk, /dev/sda1
/dev/md127 -- consisting of the failed disk, /dev/sdb1

All of these scenarios are, in my opinion, serious problems that can be avoided by being explicit about what is and is not part of each array. If we rely on mdadm to scan and dynamically assemble things based on their UUID, it is difficult (if not impossible, as in that last case) to stop a particular disk from being used -- either because it is known to be a bad disk, or because it doesn't belong in the running system.

~Felix.
Comment by Dave Reisner (falconindy) - Friday, 19 August 2011, 18:45 GMT
>> maliciously creating a disk with the same UUID and superblock.

physical access is root access. if we're going to consider this scenario and account for it, why not dispatch ravenous badgers to protect your computer and cover all sorts of other scenarios?

>> I've never run into that -- doesn't putting sata and pata before mdadm in HOOKS solve this? Or does initcpio not wait for hooks to finish running?

Not all items in HOOKS have a runtime component. Use "lsinitcpio -a /path/to/image". The last item in the output is a list of hooks which do have, and will run, a script for bootstrap.

>> - dd'ing a partition onto an external USB drive -- when auto-assembling, mdadm doesn't care where something is and can pull it into the array. This is bad if the reason the disk was on USB was because you were in the middle of trying to recover it.

Change the UUID of the newly created partition on the external.

>> marking a disk 'failed' -- apparently, doing this does not prevent the 'failed' disk from being a member of an array in future.

This is why you zero the superblock with mdadm after removing it from the array.
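i.e., for the device names used in your example:

mdadm --zero-superblock /dev/sdb1   # wipe the md metadata so the removed member is never auto-assembled again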

>> plugging in the components of an array from a different system. all the components are there, so mdadm starts it.

I don't even understand this point. If the disks are to be used as backup, why is there evidence of an array on it? Zero it out.
Comment by Felix (thetrivialstuff) - Friday, 19 August 2011, 19:12 GMT
">> maliciously creating a disk with the same UUID and superblock.

physical access is root access."

There are different levels of physical access -- keyboard, mouse, display, and USB port should not be equivalent to "complete access to the innards." A brief description of the "Userful" system I mentioned:

There is one large computer, which is in a locked cabinet or perhaps another room. Users never see this box. It is connected to multiple sets of keyboards, monitors, mice (i.e. it's running multi-headed X), each with their own USB port on the end of an extension cable or a USB hub. Users can plug in their own USB drives and work with files on them.


">> marking a disk 'failed' -- apparently, doing this does not prevent the 'failed' disk from being a member of an array in future.

This is why you zero the superblock with mdadm after removing it from the array."

That's possible when you're simulating a failure explicitly with mdadm, but disks don't zero out their own superblocks when they disappear from the array for another reason (temporary failure due to bad cabling, power supply issues, etc.). Their reappearance is quite likely to be at the next power cycle, so zeroing will have to be done by booting to alternate media. If the reboot happens unattended, you could suddenly have a system with twice as many arrays as it should have, which could be a mess to sort out. And if filesystems are mounted by UUID, who knows what would get mounted (since UUIDs would still be the same across array pieces). Actually, this sounds fun to simulate; I'm going to try it next time I'm at work and report back :)

Also, is it safe to zero the superblock if you might want to recover data off that piece of the array later?


">> plugging in the components of an array from a different system. all the components are there, so mdadm starts it.

I don't even understand this point. If the disks are to be used as backup, why is there evidence of an array on it? Zero it out."

I didn't mean that the array was being used for any regular purpose; this is a data recovery case. The system that owned the array originally died (say, fried motherboard or something) and the disks have been moved to a different system to check their health and possibly recover them. So, zeroing out the superblocks is not a good idea, since you might want to try starting the array again (but on your own terms -- not automatically, and not before you've had a chance to check the disks).

~Felix.
Comment by Mark (voidzero) - Monday, 22 August 2011, 18:03 GMT
For me there is a problem on one machine.
The setup is as follows:

/dev/md0 /dev/sd[abcd]1 (raid1)
/dev/md1 /dev/sd[abcd]2 (raid0)
/dev/md2 /dev/sd[ab]3 (raid0)
/dev/md3 /dev/sd[cd]3 (raid0)
/dev/md4 /dev/md[23] (raid1)

As you can see, I am using raid0+1 where two striping arrays are mirrored. I have just upgraded my system after a pretty long time, and I notice that md4 is not being built. The rest is ok. I will now test the patch suggestion.
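Written out as mdadm.conf ARRAY lines (the device lists here are just for illustration), the layering is:

ARRAY /dev/md2 devices=/dev/sda3,/dev/sdb3
ARRAY /dev/md3 devices=/dev/sdc3,/dev/sdd3
ARRAY /dev/md4 devices=/dev/md2,/dev/md3   # can only be assembled once md2 and md3 exist

so md4 can only come up after md2 and md3 themselves have been assembled, and that last step is the one that is not happening here.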

*Edit* Yes, the patch worked for me. Thanks.
Comment by tiny (tiny) - Thursday, 01 September 2011, 09:07 GMT
I'm also stuck after upgrade. Here's my mdadm.conf
# cat /etc/mdadm.conf
ARRAY /dev/md1 level=raid0 num-devices=3 metadata=0.90 UUID=fc6fcb91:68a75a29:b88acab5:adec4669
ARRAY /dev/md2 level=raid0 num-devices=3 metadata=0.90 UUID=de4e3121:f431727a:a6faa410:2054c963
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=0.90 UUID=708ad3f4:71151a9e:2886212e:c7b400f1
ARRAY /dev/md3 level=raid0 num-devices=3 metadata=0.90 UUID=8e83828a:f27d4277:52349015:4ece80f2
ARRAY /dev/md4 level=raid0 num-devices=3 metadata=0.90 UUID=24c0389f:5157f6ca:58a3bbf9:facd50e1
ARRAY /dev/md5 level=raid1 num-devices=2 metadata=0.90 UUID=b5ac45b4:5d186d3c:530cf6cd:548af09e
ARRAY /dev/md6 level=raid1 num-devices=2 metadata=0.90 UUID=78aa9709:15503fd7:35e7b8f4:67d09af7

I have a raid10 setup for / and /home .

With or without the mdadm hook, before or after the udev hook ... nothing works for me.
I get stuck at boot time while attempting to mount "md5". I can manually add md5 and md6, but the
system doesn't boot. I also tried patching the mdadm hook. No joy.

This box serves email, CVS and file sharing. Important box. Help appreciated.
Comment by tiny (tiny) - Thursday, 01 September 2011, 09:27 GMT
I can add some more information to the above report.
I managed to boot the system after a couple of resets. Well, sort of; manual intervention was needed.
(mdadm -A --scan) which didn't work on previous reboots.

The only thing I can think of is some kind of "race condition" between udev and mdadm hooks setting
up md arrays. When mdadm prevailed with "correct" mdx names for my setup, I could start it up with
(mdadm -A --scan).

Still doesn't boot automatically.

Q: How can the udev hook set up a raid array that has raid arrays as members? How do we tell it to use the names we want?
Comment by Felix (thetrivialstuff) - Thursday, 01 September 2011, 18:01 GMT
tiny: You're right I think; my patch probably is getting into a race condition with udev on your system. The patch assumes that udev isn't finding any of your arrays (Edit: or, if it is finding some arrays, that the arrays it does find end up with correct names).

To take udev out of the equation completely, delete or move the file /lib/udev/rules.d/64-md-raid.rules and re-run mkinitcpio. Then mdadm will be the only thing running the assembly, and it should happen in the correct order and bring all your arrays up.

~Felix.
Comment by Mark (voidzero) - Thursday, 01 September 2011, 23:34 GMT
Alternatively, write your own hooks and save yourself the thought cycles until we're past the udev woes.. :)

http://sprunge.us/KLZX

(or if someone comes here years from now, send me a message via the forums if it has expired and you want it.)

As an aside, I don't want to suggest that there should be no more attempts to fix this bug. There should, I suppose. But at the same time, udev sucks :) It used to be two steps forward from devfs, but now it is one or two steps backward from that. I recently saw it for the matured but untamed beast that it has become: it is too conflicting with other suddenly deprecated applications, too "dark" or obscure to understand, and has too many odd configuration files to entertain any casual or professional sysadmin; in general they would rather play with more interesting parts of Linux until udev itself becomes deprecated. I'll hang out the flag when that happens.. and meanwhile I'll just write speedy workarounds :)
Comment by tiny (tiny) - Friday, 02 September 2011, 10:57 GMT
In the end, I moved /lib/udev/rules.d/64-md-raid.rules and added my own simple mdadm hook:

cat /lib/initcpio/hooks/mdadm
# vim: set ft=sh:
run_hook ()
{
    input="$(cat /proc/cmdline)"
    mdconfig="/etc/mdadm.conf"
    [ -f $mdconfig ] && mdadm -A --scan
}

Default mdadm hook does not work.

EDIT:(replaced filename mdadm.simple with mdadm)
Comment by Vic Fryzel (vicfryzel) - Thursday, 08 September 2011, 20:09 GMT
How are the severity and priority of this issue so low? With one upgrade, Arch has broken RAID setups for a large number of users, myself included. Still, there is no upgrade-proof solution on this bug: another upgrade will overwrite these changes, and systems will revert to the broken state.
Comment by Tobias Powalowski (tpowa) - Wednesday, 14 September 2011, 09:55 GMT
mdadm 3.2.2-4 should fix this, please confirm.
Comment by Tim O'Brien (timob) - Sunday, 22 January 2012, 10:51 GMT
This is not fixed. The mdadm hooks do not configure the raid array, leaving the system unbootable.
