FS#19493 - [mkinitcpio] Add full persistent device naming support to the udev hook

Attached to Project: Arch Linux
Opened by Felix (thetrivialstuff) - Monday, 17 May 2010, 07:17 GMT
Last edited by Thomas Bächler (brain0) - Tuesday, 13 July 2010, 11:28 GMT
Task Type Feature Request
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
The current version of mkinitcpio and/or udev does not include several important binaries in the initial ramdisk image. Specifically, these:

/lib/udev/ata_id
/lib/udev/cdrom_id
/lib/udev/edd_id
/lib/udev/input_id
/lib/udev/path_id
/lib/udev/scsi_id
/lib/udev/usb_id
/lib/udev/v4l_id

At least some of those are needed in order for /dev/disk/by-path and other persistent device naming symlinks to exist at boot. This is very important for booting to RAID partitions, since everyone who uses RAID uses persistent device names to assemble the arrays (right? ;) ).

The result is that the system doesn't boot because the RAID arrays do not assemble.

Additional info:
* package version(s)
mkinitcpio 0.6.4-1
udev 151-3


Steps to reproduce:
Set up an mdadm.conf file describing the RAID array for the root filesystem using /dev/disk/by-path naming, thus:

[code]
#root
DEVICE /dev/disk/by-path/pci-0000:00:1f.1-scsi-0:0:0:0-part3
DEVICE /dev/disk/by-path/pci-0000:00:1f.1-scsi-0:0:1:0-part3
ARRAY /dev/md2 devices=/dev/disk/by-path/pci-0000:00:1f.1-scsi-0:0:0:0-part3,/dev/disk/by-path/pci-0000:00:1f.1-scsi-0:0:1:0-part3
[/code]

Then, add "mdadm" to the HOOKS line in /etc/mkinitcpio.conf and run mkinitcpio -p kernel26. Upon reboot, the mdadm hook does nothing: mdassemble cannot find any of the DEVICEs specified, because udev never created the by-path symlinks, and it never created them because the binaries listed above are missing from the ramdisk. The boot then drops to the recovery shell.

Workaround (if you're stuck at the recovery shell):
- edit /etc/mdadm.conf (yes, the copy in the ramdisk) to replace the /dev/disk/by-path names with what you think are good guesses -- often this'll be /dev/sda, /dev/sdb, etc.
- run mdassemble -- it should now be able to assemble your root partition.
- exit the recovery shell, and boot continues normally.

Workaround/fix (if you've booted successfully or haven't rebooted yet):
- add all of the above binaries (separated by spaces) to the BINARIES line in mkinitcpio.conf
- re-run mkinitcpio -p kernel26
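
As a rough illustration of this workaround, the relevant line in /etc/mkinitcpio.conf might end up looking like the fragment below. It assumes the BINARIES line was otherwise empty and that all eight helpers from the list at the top of this report are wanted; adjust if your file already has entries.

```shell
# /etc/mkinitcpio.conf (fragment)
# Assumption: no other entries were on the BINARIES line before.
BINARIES="/lib/udev/ata_id /lib/udev/cdrom_id /lib/udev/edd_id /lib/udev/input_id /lib/udev/path_id /lib/udev/scsi_id /lib/udev/usb_id /lib/udev/v4l_id"
```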

Ultimate fix:
- include all of the above binaries in the initrd image by referencing them with add_binary calls in /lib/initcpio/install/udev (which is presently owned by the mkinitcpio package).
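
A minimal sketch of what such a change could look like follows. The function name add_udev_helpers is made up for illustration, and the stub add_binary is only there so the snippet runs outside mkinitcpio; in a real install hook, mkinitcpio supplies add_binary itself and the loop would sit inside the hook's install function. This is not the actual contents of /lib/initcpio/install/udev.

```shell
# Stand-in for mkinitcpio's add_binary so the sketch runs on its own.
# Inside a real install hook, mkinitcpio provides this function.
if ! command -v add_binary >/dev/null 2>&1; then
    add_binary() { printf 'would add %s\n' "$1"; }
fi

# Hypothetical helper: queue every udev helper named in this report
# for inclusion in the initial ramdisk image.
add_udev_helpers() {
    for tool in ata_id cdrom_id edd_id input_id path_id scsi_id usb_id v4l_id; do
        add_binary "/lib/udev/${tool}"
    done
}
```

Calling add_udev_helpers with the stub in place just prints the eight paths that would be added.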

For more details on this bug and the fix, see this thread: http://bbs.archlinux.org/viewtopic.php?pid=759939

Closed by  Thomas Bächler (brain0)
Tuesday, 13 July 2010, 11:28 GMT
Reason for closing:  Implemented
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 10:57 GMT
I never even considered that by-path would be used anywhere. When writing the udev hook for mkinitcpio, I ensured by-uuid and by-label would work.

Question is, do we really want by-path to work? Is by-path even persistent?
Comment by Felix (thetrivialstuff) - Monday, 17 May 2010, 17:37 GMT
Yes. In fact, it's more persistent than by-label and by-uuid -- those can be fooled, by-path cannot (at least, not in the same way). For a longer discussion of what I mean, see: http://bbs.archlinux.org/viewtopic.php?id=81674

Summary:

- by-uuid and by-label make NO provision whatsoever for what bus a device is on. You do NOT want your system to consider devices on USB, firewire, PATA, SCSI, or eSATA buses if you know for a FACT that your boot device is on internal SATA connector #1. by-uuid and by-label provide no assurance of this.

- by-uuid and by-label can be spoofed (maliciously or accidentally), because they rely on data on the disk when determining which disk is which. Suppose as part of a filesystem recovery (say, from bad sectors or corruption), you've dd'd your bad disk to another one to work on it. Both have the same UUID and label -- which one will the system choose? It's random (I've tested this in VMs).

by-uuid and by-label are somewhat better than trying to rely on /dev/sd* names, but they're not as reliable as the old /dev/hd* names -- /dev/hda meant "master device on first PATA bus" and by-path is its exact equivalent.
Comment by Leonid Isaev (lisaev) - Thursday, 20 May 2010, 00:53 GMT
@Felix
Wait, something is missing. Which arch are you using? I am on i686, and I have mkinitcpio 0.6.4-1 and udev 151-3. All the files that you claim are missing are present on my system and owned by udev...
Comment by Felix (thetrivialstuff) - Thursday, 20 May 2010, 07:24 GMT
@Leonid Isaev:

Are you looking in /lib/udev/ on your fully booted system, or the initial ramdisk?

This bug is about the initial ramdisk environment; to check that you could use cpio on your /boot/kernel26.img, or deliberately cause a boot crash to drop you into the recovery shell while you're booting -- easy way to do that is to edit your grub kernel line and change root= to something that doesn't exist. Once you're in the recovery shell, then try ls /lib/udev.
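
For the cpio route, something like the following could do the check without rebooting. It is only a sketch: missing_udev_helpers is a made-up helper name, and the usage line in the comment assumes the image is a gzip-compressed cpio archive.

```shell
# Hypothetical helper: read an initramfs file listing on stdin (one path per
# line, as printed by `cpio -it`) and print each udev helper that is absent.
missing_udev_helpers() {
    listing=$(cat)
    for tool in ata_id cdrom_id edd_id input_id path_id scsi_id usb_id v4l_id; do
        # Listings may show paths with or without a leading "./" or "/".
        if ! printf '%s\n' "$listing" | grep -Eq "^(\./|/)?lib/udev/${tool}\$"; then
            echo "missing: /lib/udev/${tool}"
        fi
    done
}

# Possible usage (assumes a gzip-compressed image):
#   zcat /boot/kernel26.img | cpio -it 2>/dev/null | missing_udev_helpers
```

No output means all eight helpers are in the image.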

~Felix.
Comment by Leonid Isaev (lisaev) - Thursday, 20 May 2010, 16:58 GMT
Oops, you are right -- sorry for that. I can confirm the issue, although I don't use RAID...
Comment by Thomas Bächler (brain0) - Thursday, 03 June 2010, 12:18 GMT
I only added ata_id, path_id, scsi_id and usb_id. The rest are not used by the rule files in initramfs.

http://projects.archlinux.org/mkinitcpio.git/commit/?id=f302dbc67c8bd5c96ef3c570925b8beba88f6487
