FS#34706 - [systemd] unable to boot from software raid (md,mdadm,raid1)

Attached to Project: Arch Linux
Opened by Daan van Rossum (drrossum) - Tuesday, 09 April 2013, 17:08 GMT
Last edited by Dave Reisner (falconindy) - Tuesday, 09 April 2013, 19:11 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To No-one
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
With systemd-200 I can't boot from software RAID (md RAID1). This setup has been working flawlessly since 2009. All packages are up to date.

Package versions:
systemd-200-1
linux-3.8.6-1
mdadm-3.2.6-3

/etc/mdadm.conf:
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=0.90 UUID=094804b7:ac53c08d:84363918:89da2e0d
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=1.2 UUID=e1c5207d:05c39527:254202b7:dbd37c8e
ARRAY /dev/md2 level=raid1 num-devices=2 metadata=1.2 UUID=4e0bd6c0:703ad2f7:c284b130:b0af6f4b


Boot messages:
----
:: running early hook [udev]
:: running hook [udev]
:: Triggering uevents...
Waiting 10 seconds for device /dev/md1 ...
:: performing fsck on '/dev/md1'
fsck.ext2: Invalid argument while trying to open /dev/md1
/dev/md1:
The superblock could not be read or does not describe a correct ext2
...
...
You are now being dropped into an emergency shell.
----

From that shell, `cat /proc/mdstat` shows all arrays correctly in place. Note that md1 is an ext4 file system, not ext2. My guess is that array assembly had not yet finished when fsck started.

Turning off fsck with the fastboot kernel parameter results in an error saying that /dev/md1 is not available (sorry, I'll capture the exact messages on the screen next time).

Reverting back to systemd-198-2 fixed the issue.

Closed by  Dave Reisner (falconindy)
Tuesday, 09 April 2013, 19:11 GMT
Reason for closing:  Not a bug
Additional comments about closing:  device node names aren't reliable.
Comment by Dave Reisner (falconindy) - Tuesday, 09 April 2013, 17:15 GMT
This has nothing to do with systemd. Your failure occurs before systemd even starts.

Please post your /etc/mkinitcpio.conf.
Comment by Daan van Rossum (drrossum) - Tuesday, 09 April 2013, 17:56 GMT
Thanks for your quick reply!

/etc/mkinitcpio.conf:
MODULES="ext4 raid1"
HOOKS="base udev autodetect modconf block mdadm_udev filesystems keyboard fsck"

It's hell to debug this thing, because it fails 9 times out of 10 and then suddenly works. But while trying to capture the output with fastboot I realized that reverting to systemd-198-2 doesn't fix the issue either...

Here is the output with fastboot:
----
:: skipping fsck on `/dev/md1`
:: mounting `/dev/md1`
[ 2.xxx] EXT4-fs (md1): unable to read superblock
[ 2.xxx] EXT4-fs (md1): unable to read superblock
[ 2.xxx] EXT4-fs (md1): unable to read superblock
mount: wrong fs type, bad option, bad superblock on /dev/md1,
missing codepage or ...
You are being dropped into an emergency shell
----

Then, from there I can simply do
$ mount /dev/md1 /new_root

Can this not be a udev issue, and is not udev integrated in systemd these days?
Comment by Dave Reisner (falconindy) - Tuesday, 09 April 2013, 18:08 GMT
Sounds more like a kernel problem. Do you get any joy if you use a label rather than a device node?

> Can this not be a udev issue, and is not udev integrated in systemd these days?
No. It isn't. Just because it builds from the same source tree doesn't mean it's "integrated". It's still very much (intentionally) a separate binary.
Comment by Daan van Rossum (drrossum) - Tuesday, 09 April 2013, 18:29 GMT
YES, that works! You're a genius!

Thanks for your help!

Could you please change the title from [systemd] to [udev] or [kernel], whatever you think is correct.
Comment by Dave Reisner (falconindy) - Tuesday, 09 April 2013, 19:10 GMT
> YES, that works! You're a genius!
Not particularly... Kernel device nodes have never been a reliable thing. This isn't a bug, and I'm a little surprised that you've not had any problems up until now.
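
Note for later readers: the change that resolved this was referring to the root file system by label instead of by the /dev/md1 device node. The thread does not show the exact commands, so the following is only a sketch. It assumes the root device is passed as root=/dev/md1 on the kernel command line, that the ext4 file system has been given the label `rootfs` (a made-up name), and it uses a hypothetical helper `rewrite_root` to show what the edited command line would look like.

```shell
#!/bin/sh
# Sketch only: "rootfs" and rewrite_root are illustrative names, not from the thread.
#
# Setting and checking a label needs root and a real device, so these stay commented out:
#   e2label /dev/md1 rootfs              # assign a label to the ext4 file system
#   blkid -s LABEL -o value /dev/md1     # should print the label that was set

# rewrite_root CMDLINE LABEL
# Prints CMDLINE with its root=... parameter replaced by root=LABEL=<LABEL>.
# LABEL must not contain '#', '&' or '\' because it is spliced into a sed expression.
rewrite_root() {
  printf '%s\n' "$1" | sed -E "s#(^|[[:space:]])root=[^[:space:]]+#\1root=LABEL=$2#"
}

# Example: show the command line the bootloader entry would need.
rewrite_root 'root=/dev/md1 ro quiet' rootfs
```

The resulting line goes wherever the bootloader stores the kernel parameters. A UUID (root=UUID=...) would serve the same purpose; the point is only to stop depending on the md device node name.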
