FS#74888 - [linux] kernel 5.18.0: LVM volume on RAID10 based PV does not mount
Attached to Project:
Arch Linux
Opened by Sachin Garg (randompie) - Sunday, 29 May 2022, 02:16 GMT
Last edited by Jan Alexander Steffens (heftig) - Monday, 30 May 2022, 19:23 GMT
Details
Description:
I have 5 RAID devices:
* 3 RAID 10 devices that serve as PVs for an LVM VG.
* 2 RAID 0 devices.

Upon upgrading to kernel 5.18.0, the RAID 10 devices *no* longer get assembled. The RAID 0 devices do get assembled. RAID 10 assembly works fine with both 5.15.43-1-lts and 5.17.9-arch1-1 *without* any configuration changes.

Additional info:
* package version(s)
Linux Kernel Package: 5.18.0-arch1-1
LVM version: 2.03.16(2) (2022-05-18)
Library version: 1.02.185 (2022-05-18)
Driver version: 4.45.0
mdadm - v4.2 - 2021-12-30

Steps to reproduce (sketched with example commands below):
* Create partitions for use as Linux RAID devices
* Combine these partitions into a software RAID 10.
* Use this software RAID 10 as a PV for LVM - create a VG and an LV
* Set the LV to mount on boot (using the UUID scheme in /etc/fstab)
* Add the "mdadm_udev" and "lvm2" hooks in mkinitcpio.conf
* Install the linux 5.18.0-arch1-1 package and configure
* Reboot
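A rough command sketch of the steps above (device names, VG/LV names, and the mount point are placeholders, not the reporter's actual layout):

# Combine four example partitions into a software RAID 10
mdadm --create --verbose --level=10 --metadata=1.2 --raid-devices=4 /dev/md/data /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# Use the array as an LVM PV, then create a VG and an LV on it
pvcreate /dev/md/data
vgcreate datavg /dev/md/data
lvcreate -n datalv -l 100%FREE datavg
mkfs.ext4 /dev/datavg/datalv
# Mount the LV on boot via its UUID in /etc/fstab, e.g.:
#   UUID=<filesystem-uuid>   /data   ext4   defaults   0 2
# Add the "mdadm_udev" and "lvm2" hooks to /etc/mkinitcpio.conf, then regenerate:
mkinitcpio -P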
This task depends upon
Closed by Jan Alexander Steffens (heftig)
Monday, 30 May 2022, 19:23 GMT
Reason for closing: Fixed
Additional comments about closing: linux 5.18.1.arch1-1
Seems to affect both RAID 10 and RAID 1 but not RAID 0 or RAID 6. LVM2 does not appear to be a common factor.
Sticking with 5.17.9 for now.
My old ARRAY definition from /etc/mdadm.conf that worked prior to the 5.18 kernel update:
ARRAY /dev/userraid10 metadata=1.2 name=phenom:UserRAID10 UUID=bee9ca99:c9a86e5e:0d3e9c1a:c5473a21
Changing the definition to the following now works with 5.18:
ARRAY /dev/md127 metadata=1.2 name=phenom:UserRAID10 UUID=bee9ca99:c9a86e5e:0d3e9c1a:c5473a21
Hopefully this gives developers additional info to help pinpoint the root cause.
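For reference, the kernel-assigned node (/dev/md127 here) can be read off a running system; assuming the array is already assembled, either of these shows it:

cat /proc/mdstat
mdadm --detail --scan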
Likewise, I had to change
ARRAY /dev/md/hostname:home devices=/dev/sda1,/dev/sdb1 metadata=1.2 name=hostname:home UUID=...
into
ARRAY /dev/md127 devices=/dev/sda1,/dev/sdb1 metadata=1.2 name=hostname:home UUID=...
It looks like the array device node detected under 5.18 must match the identifier given in /etc/mdadm.conf.
These are the lines in my /etc/mdadm.conf - all were working before the upgrade to 5.18 and are working now (downgraded to 5.17.9):
## Commented out because mdadm_udev is present
#DEVICE partitions
##ARRAY /dev/md/sysresccd:home metadata=1.2 name=sysresccd:home UUID=60b0eaff:867d1d1d:3b630859:855507ac
##ARRAY /dev/md/triveni:124 metadata=1.2 name=triveni:124 UUID=114bf81d:3ee322d4:24f1733a:c7bab01b
##ARRAY /dev/md/sysresccd:1 metadata=1.2 name=sysresccd:1 UUID=cf9ab491:f2d0050c:0e585f82:31573738
##ARRAY /dev/md125 metadata=1.2 name=triveni.d.navankur.net:125 UUID=0bd12c43:640ea1be:7e7de184:91c6a7fb
A scan of the devices turns up these:
$ sudo mdadm -Esv
ARRAY /dev/md/125 level=raid0 metadata=1.2 num-devices=2 UUID=0bd12c43:640ea1be:7e7de184:91c6a7fb name=triveni.d.navankur.net:125
devices=/dev/sdd3,/dev/sdc3
ARRAY /dev/md/home level=raid10 metadata=1.2 num-devices=4 UUID=60b0eaff:867d1d1d:3b630859:855507ac name=sysresccd:home
devices=/dev/sde1,/dev/sdd1,/dev/sdc1,/dev/sdb1
ARRAY /dev/md/124 level=raid10 metadata=1.2 num-devices=4 UUID=114bf81d:3ee322d4:24f1733a:c7bab01b name=triveni:124
devices=/dev/sde3,/dev/sdd2,/dev/sdc2,/dev/sdb3
ARRAY /dev/md/123 level=raid0 metadata=1.2 num-devices=2 UUID=a8f77e85:46336115:dcbd2489:be540e35 name=triveni.d.navankur.net:123
devices=/dev/sde5,/dev/sdb5
ARRAY /dev/md/122 level=raid10 metadata=1.2 num-devices=4 UUID=3e0c80ef:fca94bda:76259412:b2049262 name=triveni:122
devices=/dev/sde7,/dev/sdd6,/dev/sdc6,/dev/sdb7
With 5.18.0, when I tried to assemble the array /dev/md/home I got the error "mdadm: unexpected failure opening /dev/md127". Replacing "/dev/md/home" with "/dev/md127", I was able to assemble and start the array - that is what pointed me in the direction of this being a potential kernel bug.
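A sketch of the two assemble attempts described above, with the member devices taken from the scan output (the exact invocation may have differed):

# Fails on 5.18.0 with "mdadm: unexpected failure opening /dev/md127"
mdadm --assemble /dev/md/home /dev/sde1 /dev/sdd1 /dev/sdc1 /dev/sdb1
# Works: address the array by the kernel-assigned node instead
mdadm --assemble /dev/md127 /dev/sde1 /dev/sdd1 /dev/sdc1 /dev/sdb1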
If it helps, this is my /etc/mdadm.conf:
# mdadm configuration file
#
# mdadm will function properly without the use of a configuration file,
# but this file is useful for keeping track of arrays and member disks.
# In general, a mdadm.conf file is created, and updated, after arrays
# are created. This is the opposite behavior of /etc/raidtab which is
# created prior to array construction.
#
#
# the config file takes two types of lines:
#
# DEVICE lines specify a list of devices of where to look for
# potential member disks
#
# ARRAY lines specify information about how to identify arrays so
# so that they can be activated
#
# You can have more than one device line and use wild cards. The first
# example includes SCSI the first partition of SCSI disks /dev/sdb,
# /dev/sdc, /dev/sdd, /dev/sdj, /dev/sdk, and /dev/sdl. The second
# line looks for array slices on IDE disks.
#
#DEVICE /dev/sd[bcdjkl]1
#DEVICE /dev/hda1 /dev/hdb1
#
# The designation "partitions" will scan all partitions found in
# /proc/partitions
DEVICE partitions
# ARRAY lines specify an array to assemble and a method of identification.
# Arrays can currently be identified by using a UUID, superblock minor number,
# or a listing of devices.
#
# super-minor is usually the minor number of the metadevice
# UUID is the Universally Unique Identifier for the array
# Each can be obtained using
#
# mdadm -D <md>
#
# To capture the UUIDs for all your RAID arrays to this file, run these:
# to get a list of running arrays:
# # mdadm -D --scan >>/etc/mdadm.conf
# to get a list from superblocks:
# # mdadm -E --scan >>/etc/mdadm.conf
#
#ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
#ARRAY /dev/md1 super-minor=1
#ARRAY /dev/md2 devices=/dev/hda1,/dev/hdb1
#
# ARRAY lines can also specify a "spare-group" for each array. mdadm --monitor
# will then move a spare between arrays in a spare-group if one array has a
# failed drive but no spare
#ARRAY /dev/md4 uuid=b23f3c6d:aec43a9f:fd65db85:369432df spare-group=group1
#ARRAY /dev/md5 uuid=19464854:03f71b1b:e0df2edd:246cc977 spare-group=group1
#
# When used in --follow (aka --monitor) mode, mdadm needs a
# mail address and/or a program. To start mdadm's monitor mode, enable
# mdadm.service in systemd.
#
# If the lines are not found, mdadm will exit quietly
#MAILADDR root@mydomain.tld
#PROGRAM /usr/sbin/handle-mdadm-events
ARRAY /dev/md/raid10 metadata=1.2 name=desktop:raid10 UUID=2cf5d240:d1576c3b:b59a2e7b:1eb89875
Cannot boot with the new kernel; everything is OK with previous ones.
Personalities : [raid0]
md127 : active raid0 nvme3n1p1[2] nvme1n1p1[0] nvme4n1p1[3] nvme2n1p1[1]
3906514944 blocks super 1.2 512k chunks
Adding
`ARRAY /dev/md127 metadata=1.2 name=renegade:media UUID=7f658d31:e75ec0d2:0ba31f5c:be9fccd1`
to the config file, as well as modifying `/etc/fstab`, solved the issue. Though this doesn't allow using the prettier `/dev/md/media` reference.
As suggested in the linked BBS post, adding `CONFIG_BLOCK_LEGACY_AUTOLOAD=Y` to the kernel config, as well as removing the `ARRAY` line from `/etc/mdadm.conf` got me back to the previous behaviour. Though it's not a viable long term solution as this kernel config option will disappear in 5.19.
So the question remains how to keep the pretty device name (i.e. `/dev/md/media` instead of `/dev/md127`) without using a soon-to-be-dropped config option.
Edit: I tested so many combinations of kernels and configs that I forgot about the simple ones. I confirm it indeed works with the vanilla Arch Linux 5.18 with a properly configured mdadm.conf. No pretty names, but it boots and I'm good with it.
I use the UUID in my fstab and that works correctly after fixing `/etc/mdadm.conf`.
So long story short, as long as you have the correct `ARRAY` line in your `/etc/mdadm.conf`, it boots just fine. From what I understand, the fallback kernel option `CONFIG_BLOCK_LEGACY_AUTOLOAD=y` will be removed in 5.19, so I'm not entirely sure this is a bug - sadly, it may just be a breaking change.
For now, the easiest fix is to (a command sketch follows the list):
- boot from an alternative kernel (or boot disk)
- run `mdadm --detail --scan`
- change the name of the device to `/dev/md127` or whatever the "ugly" name of your device is
- add that to `/etc/mdadm.conf`
- I ran `mkinitcpio -P` just for good measure and rebooted, and I could boot via UUID again (no need to use the "ugly" device name in any other config file).
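Put together, a sketch of the above, assuming the scan reports the array as /dev/md127 (substitute your own node name and UUID):

# From an older kernel or a rescue environment:
mdadm --detail --scan
# Take the reported ARRAY line, replace /dev/md/<name> with the kernel-assigned
# node, and append it to /etc/mdadm.conf, e.g.:
#   ARRAY /dev/md127 metadata=1.2 name=myhost:home UUID=<your-array-uuid>
# Regenerate the initramfs for good measure, then reboot:
mkinitcpio -P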
I tried creating a test array with:
mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md/myarray /dev/vdb1 /dev/vdc1
This failed. I tried again with:
mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md128 /dev/vdb1 /dev/vdc1
This succeeded. I then rebooted to see if `DEVICE partitions` would pick it up without adding an ARRAY line to mdadm.conf, and it did. `mdadm --detail --scan` revealed:
ARRAY /dev/md/128 metadata=1.2 name=myhostname:128 UUID=....
So it looks like autodetection DOES work, but you have to create the array without the initial name and just use a plain md device node instead.
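If you then want the array recorded in the config file anyway, the scan output can simply be appended afterwards (a sketch; review the appended line before rebooting):

mdadm --detail --scan >> /etc/mdadm.conf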