FS#28067 - [mdadm] Unable to boot on raid devices

Attached to Project: Arch Linux
Opened by Sébastien Luttringer (seblu) - Monday, 23 January 2012, 12:38 GMT
Last edited by Dave Reisner (falconindy) - Saturday, 24 March 2012, 18:33 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Dave Reisner (falconindy)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

The current version of mkinitcpio in core breaks system boot when the disks are RAID (mdadm) and /dev/disk/by-uuid is used as the root device path.

The boot process doesn't find /dev/disk/by-uuid/$myuuid. Replacing it with something like /dev/md1 is a quick workaround.

When udev starts in the initramfs, it doesn't have the rules files needed to create symlinks in /dev/disk/by-uuid/* (or even /dev/disk/by-id/md-uuid-*). As a consequence, the root fs is not found.

These symlinks are created once udev has all its rules files in the real root fs, but by then it's too late to boot from them.

Maybe we should ship /lib/udev/rules.d/64-md-raid.rules in initramfs to be able to boot.
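
A quick way to confirm whether the md RAID rule made it into an image is to filter the lsinitcpio listing for it. The sketch below is only an illustration: `has_md_rule` is a hypothetical helper name, and it only inspects a file listing passed on stdin.

```shell
# Hypothetical helper: reads an initramfs file listing on stdin and reports
# whether the md RAID udev rule (64-md-raid.rules) is part of it.
has_md_rule() {
    if grep -q 'udev/rules\.d/64-md-raid\.rules$'; then
        echo "present"
    else
        echo "missing"
    fi
}

# Usage on a real system (image path taken from this report):
#   lsinitcpio /boot/initramfs-linux.img | has_md_rule
```

Run against the listing shown in this report, it would report the rule as missing, since only the 50-, 60- and 80- rules appear there.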

$ cat /etc/mkinitcpio.conf|grep -v ^#|grep -v ^$
MODULES="radeon"
BINARIES=""
FILES=""
HOOKS="base udev autodetect usbinput sata mdadm filesystems"
COMPRESSION="gzip"


sluttrin /tmp $ lsinitcpio /boot/initramfs-linux.img|grep udev
./lib/udev
./lib/udev/udevd
./lib/udev/ata_id
./lib/udev/rules.d
./lib/udev/rules.d/50-udev-default.rules
./lib/udev/rules.d/60-persistent-storage.rules
./lib/udev/rules.d/80-drivers.rules
./lib/udev/scsi_id
./usr/bin/udevadm
./etc/udev
./etc/udev/udev.conf
./hooks/udev

Closed by  Dave Reisner (falconindy)
Saturday, 24 March 2012, 18:33 GMT
Reason for closing:  Won't fix
Additional comments about closing:  see comments
Comment by Sébastien Luttringer (seblu) - Monday, 23 January 2012, 12:44 GMT
OK, just forget it. mdadm_udev is the solution :/
Comment by Sébastien Luttringer (seblu) - Monday, 23 January 2012, 12:51 GMT
This is not a mkinitcpio problem; it comes from the mkinitcpio files shipped with mdadm.

A warning message during the update telling users to change their mdadm hook to mdadm_udev would avoid a failed reboot...

This commit introduced the change:
http://projects.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/mdadm&id=b16a6aa5ad4dd0b56298f3f8434c8ad93f640d15
Comment by Sébastien Luttringer (seblu) - Monday, 23 January 2012, 13:26 GMT
I'm still wondering why this change did not trigger a boot failure before today.
Comment by Dave Reisner (falconindy) - Monday, 23 January 2012, 19:23 GMT
The original mdadm hook did not use udev for assembly. Users who expected assembly via mdassemble needed to make zero changes to their initramfs setup ('mdadm' still uses mdassemble). If you wanted the new shiny udev assembly, you changed your hook. Therefore, I see no reason why we need to add a warning anywhere. The behavior is consistent unless you actively want to change what you're doing.

You'll need to figure out yourself why this magically worked for you.
Comment by Sébastien Luttringer (seblu) - Saturday, 28 January 2012, 11:36 GMT
  • Field changed: Percent Complete (100% → 0%)
The original mdadm hook *used* udev for assembly. The change was introduced by the commit given in the comments.
Comment by Dave Reisner (falconindy) - Sunday, 29 January 2012, 16:10 GMT
As we discussed in IRC:

- the original mdadm hook did NOT use udev for assembly. It only gained udev assembly when upstream modified the udev rule and we continued to ship it as is (not paying attention to the new mdadm -I calls).
- this caused extremely broken behavior -- udev would assemble (most) devices, and then mdassemble would be called, subsequently breaking down already assembled devices before reassembling.
- we moved _all_ udev based functionality to mdadm_udev.

There aren't a whole lot of options here:
- write our own rule based on the upstream 64-md-raid.rules which removes incremental assembly (ugly, adds maintenance)
- tell people to use mdadm_udev if you want udev naming. This is already documented in mkinitcpio's wiki page.
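
For the second option, the change amounts to swapping one hook name. A minimal sketch, reusing the HOOKS line from the original report; the other hooks are specific to that system, and the preset name `linux` in the comment is an assumption:

```shell
# /etc/mkinitcpio.conf
# HOOKS line from the report, with "mdadm" replaced by "mdadm_udev"
# so that the udev-based assembly and naming rules end up in the image.
HOOKS="base udev autodetect usbinput sata mdadm_udev filesystems"

# Afterwards the image has to be regenerated, for example (assuming the
# stock "linux" preset):
#   mkinitcpio -p linux
```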

I really dislike adding warnings everywhere and turning a package into a proverbial minefield. We're a bit late to be doing this anyways, and adding a warning now is just going to confuse simple-minded users.
Comment by Felix (thetrivialstuff) - Monday, 13 February 2012, 21:18 GMT
For what it's worth, I just upgraded a server of mine that has an unusual RAID configuration (which was previously broken by the udev approach; see https://bugs.archlinux.org/task/25132) and the current 'mdadm' hook worked perfectly. The distinction between mdadm and mdadm_udev appears to be working well.

I'm sympathetic to those who set up RAID systems during the time when the 'mdadm' hook did use udev, as there's history here that they weren't party to -- initially, mdadm was for manual assembly and used mdassemble, then for a while it used udev (which broke a lot of more exotic RAID boot configs), and now it's back to mdassemble and we have a separate mdadm_udev hook.

The "mdadm hook uses udev" phase was confusing and we might've exacerbated that by splitting it into two hooks, but ultimately I think this is best.
Comment by Dave Reisner (falconindy) - Saturday, 24 March 2012, 18:33 GMT
I'm inclined not to fix this, mostly because mkinitcpio is doing the right thing (and I'm not the mdadm maintainer). This bug is aging and there's really not been much in the way of activity on it (meaning users aren't affected). If this breaks your boot, then, my apologies. If you desperately want this fixed by adding a portion of the mdadm udev rules back to the mdadm hook, then take it up with the mdadm maintainer in a separate bug. That said, I do not recommend it. mdadm is meant to work without udev.

If you want persistent naming with the mdadm hook, use LABEL= or UUID= tags -- this has been supported in mkinitcpio for quite some time. Early init will resolve these via blkid or udev links, and this will work in all scenarios (there's currently a regression in mkinitcpio 0.8.4, but expect this to be fixed today or tomorrow).
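
As a rough illustration of the UUID= approach: the device /dev/md1 is the example from the report, and the UUID placeholder below is not a real value and must be replaced with whatever blkid prints on the system in question.

```shell
# Print the filesystem UUID of the assembled array (run as root).
# /dev/md1 is the example device mentioned in this report.
blkid -s UUID -o value /dev/md1

# Then use the tag form on the kernel command line in the bootloader
# configuration instead of a /dev/disk/by-uuid/... path:
#   root=UUID=<uuid-printed-by-blkid>
```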
