Arch Linux


FS#31236 - [systemd] and dmraid

Attached to Project: Arch Linux
Opened by Fred Verschueren (fvsc) - Wednesday, 22 August 2012, 08:17 GMT
Last edited by Dave Reisner (falconindy) - Sunday, 30 December 2012, 14:28 GMT
Task Type: Support Request
Category: Packages: Core
Status: Closed
Assigned To: Dave Reisner (falconindy), Tom Gundersen (tomegun)
Architecture: i686
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 7
Private: No

Details

Description:


I've been using Arch for several years now on a desktop and a laptop without serious problems.
A few days ago I switched to systemd on both the laptop and the desktop.
On the laptop everything went fine: I'm able to boot using systemd and the system is up and running as before.
The only difference between the laptop and the desktop is that the /home partition on the desktop is on a FakeRAID (BIOS RAID).

My problem:
When booting with init=/bin/systemd I see the following:

The job dev-mapper-nvidia_cjjcaiiep1.device/start times out, booting stops, and I end up in a rescue shell.
My /home partition is not mounted.

/dev/mapper/nvidia_cjjcaiiep1 exists.

systemctl start dev-mapper-nvidia_cjjcaiiep1.device gives the same result, a timeout,

and my /home is not mounted.

output of dmraid -t
/dev/sdc: nvidia, "nvidia_cjjcaiie", mirror, ok, 398297086 sectors, data@ 0
/dev/sdb: nvidia, "nvidia_cjjcaiie", mirror, ok, 398297086 sectors, data@ 0

Output of dmsetup status
nvidia_cjjcaiie: 0 398297086 mirror 2 8:16 8:32 3039/3039 1 AA 1 core
VolGroup00-lvolswap: 0 8388608 linear
VolGroup00-lvoltmp: 0 8388608 linear
VolGroup00-lvolroot: 0 31457280 linear
VolGroup00-lvolroot: 31457280 10485760 linear
VolGroup00-lvolweb: 0 41943040 linear
VolGroup00-lvolvar: 0 31457280 linear
nvidia_cjjcaiiep1: 0 398292992 linear
VolGroup00-lvolvirt: 0 83886080 linear


output of dmsetup deps
nvidia_cjjcaiie: 2 dependencies : (sdc) (sdb)
VolGroup00-lvolswap: 1 dependencies : (sda8)
VolGroup00-lvoltmp: 1 dependencies : (sda8)
VolGroup00-lvolroot: 1 dependencies : (sda8)
VolGroup00-lvolweb: 1 dependencies : (sda8)
VolGroup00-lvolvar: 1 dependencies : (sda8)
nvidia_cjjcaiiep1: 1 dependencies : (nvidia_cjjcaiie)
VolGroup00-lvolvirt: 1 dependencies : (sda8)

output of ls -al /dev/mapper
drwxr-xr-x 2 root root 220 Aug 22 10:01 .
drwxr-xr-x 19 root root 3.8K Aug 22 10:03 ..
lrwxrwxrwx 1 root root 7 Aug 22 10:01 VolGroup00-lvolroot -> ../dm-0
lrwxrwxrwx 1 root root 7 Aug 22 10:01 VolGroup00-lvolswap -> ../dm-4
lrwxrwxrwx 1 root root 7 Aug 22 10:01 VolGroup00-lvoltmp -> ../dm-2
lrwxrwxrwx 1 root root 7 Aug 22 10:01 VolGroup00-lvolvar -> ../dm-1
lrwxrwxrwx 1 root root 7 Aug 22 10:01 VolGroup00-lvolvirt -> ../dm-3
lrwxrwxrwx 1 root root 7 Aug 22 10:01 VolGroup00-lvolweb -> ../dm-5
crw------- 1 root root 10, 236 Aug 22 10:01 control
brw------- 1 root root 254, 6 Aug 22 09:58 nvidia_cjjcaiie
brw------- 1 root root 254, 7 Aug 22 09:58 nvidia_cjjcaiiep1


When I do a manual mount of the /home partition,
mount /dev/mapper/nvidia_cjjcaiiep1 /home
/home is mounted.

After
systemctl default

my system resumes booting and finally comes online.


I'm new to systemd, so if someone could be so kind as to give me some hints on how and where to start searching for the problem ... or maybe the problem is already known, or it is a bug ...

Thanks in advance.

Fred

Additional info:
* package version(s)

libsystemd 188-2
systemd 188-2
systemd-arch-units 20120704-2
systemd-tools 188-2
dmraid 1.0.0.rc16.3-7

* config and/or log files etc.
Please let me know what you need

Steps to reproduce:
/home partition on a BIOS RAID (mirror)
This task depends upon

Closed by  Dave Reisner (falconindy)
Sunday, 30 December 2012, 14:28 GMT
Reason for closing:  Fixed
Additional comments about closing:  dmraid 1.0.0.rc16.3-8
Comment by Dave Reisner (falconindy) - Thursday, 23 August 2012, 03:27 GMT
Feel free to read over  FS#30134 . I don't have anything more to add to that right now, and I have no ability to test this in any way.
Comment by Pierpaolo Valerio (gondsman) - Thursday, 23 August 2012, 13:09 GMT
Bug #31257 was marked as a duplicate of this one, so I'm going to comment on this one. Dave, you refer to the discussion on bug #30134 and you say there that there is no plan of supporting dmraid because it's not maintained upstream. I'm fine with that (I had already read that comment), that's why I moved to mdadm and I'm not using dmraid anymore. The fact that this issue is still there leads me to believe the problem lies somewhere else, as I don't even have dmraid installed on my system anymore.
Comment by Fred Verschueren (fvsc) - Friday, 24 August 2012, 08:50 GMT
I've installed Fedora fc7 (which uses systemd) on my system (spare disk), with the same /home partition (another user id).

1. It boots without any problem.

2. My /home partition is mounted and fully accessible.

3. dmraid is even used:
dmraid -r
/dev/sdc: nvidia, "nvidia_cjjcaiie", mirror, ok, 398297086 sectors, data@ 0
/dev/sdb: nvidia, "nvidia_cjjcaiie", mirror, ok, 398297086 sectors, data@ 0

dmsetup status
luks-9391e85a-6420-4074-b78e-2ba420efd864: 0 567976624 crypt
nvidia_cjjcaiie: 0 398297086 mirror 2 8:16 8:32 3039/3039 1 AA 1 core
VolGroup00-lvolswap: 0 8388608 linear
VolGroup00-lvoltmp: 0 8388608 linear
VolGroup00-lvolroot: 0 31457280 linear
VolGroup00-lvolroot: 31457280 10485760 linear
VolGroup00-lvolweb: 0 41943040 linear
VolGroup00-lvolvar: 0 31457280 linear
vgroup-lvol_bu_pc: 0 567967744 linear
nvidia_cjjcaiiep1: 0 398292992 linear
VolGroup00-lvolvirt: 0 83886080 linear

dmsetup deps
luks-9391e85a-6420-4074-b78e-2ba420efd864: 1 dependencies : (8, 50)
nvidia_cjjcaiie: 2 dependencies : (8, 32) (8, 16)
VolGroup00-lvolswap: 1 dependencies : (8, 8)
VolGroup00-lvoltmp: 1 dependencies : (8, 8)
VolGroup00-lvolroot: 1 dependencies : (8, 8)
VolGroup00-lvolweb: 1 dependencies : (8, 8)
VolGroup00-lvolvar: 1 dependencies : (8, 8)
vgroup-lvol_bu_pc: 1 dependencies : (253, 8)
nvidia_cjjcaiiep1: 1 dependencies : (253, 0)
VolGroup00-lvolvirt: 1 dependencies : (8, 8)

ls -al /dev/mapper
total 0
drwxr-xr-x. 2 root root 260 Aug 24 10:16 .
drwxr-xr-x. 21 root root 3880 Aug 24 08:16 ..
crw-------. 1 root root 10, 236 Aug 24 08:16 control
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 luks-9391e85a-6420-4074-b78e-2ba420efd864 -> ../dm-8
brw-rw----. 1 root disk 253, 0 Aug 24 10:16 nvidia_cjjcaiie
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 nvidia_cjjcaiiep1 -> ../dm-1
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 vgroup-lvol_bu_pc -> ../dm-9
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 VolGroup00-lvolroot -> ../dm-2
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 VolGroup00-lvolswap -> ../dm-6
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 VolGroup00-lvoltmp -> ../dm-4
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 VolGroup00-lvolvar -> ../dm-3
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 VolGroup00-lvolvirt -> ../dm-5
lrwxrwxrwx. 1 root root 7 Aug 24 08:16 VolGroup00-lvolweb -> ../dm-7

4. mdadm arrays are empty
cat /proc/mdstat

Personalities :
unused devices: <none>

I want to point out that this fedora configuration is rather complex:
The / is on an encrypted external firewire disk
The /boot is on the first internal disk
The /home is on a FakeRaid (bios raid) with two (sdb , sdc) disks

cat /etc/fstab
#
# /etc/fstab
/dev/mapper/vgroup-lvol_bu_pc / ext4 defaults 1 1
UUID=7bae55c2-a689-4829-8077-181afd23e0cd /boot ext3 defaults 1 2
UUID=76c4b013-fd10-4d39-ba65-e18c97a356fa swap swap defaults 0 0
/dev/mapper/nvidia_cjjcaiiep1

In my humble opinion this seems to indicate that the problem is Arch-specific and has nothing to do with upstream dmraid or systemd.

Please let me know if you need some more info
Comment by Pierpaolo Valerio (gondsman) - Friday, 24 August 2012, 09:47 GMT
Fred, do you have the dmraid module loaded in the initramfs? Can you post your /etc/mkinitcpio.conf file (both on Fedora and Arch)? From what I understand, if you don't assemble the devices in early userspace, systemd can mount them just fine. Unfortunately, if you have your root partition on the array (like I do) you can't do anything else to boot your kernel.
Comment by Tom Gundersen (tomegun) - Friday, 24 August 2012, 18:28 GMT
@fvsc: could you paste the output of "udevadm info /dev/mapper/nvidia_cjjcaiiep1" on both the working and broken system? My guess is that your Fedora system contains some udev rule that makes sure you have "TAGS=:systemd:", and that Arch does not. If the device is not tagged in this way, systemd will not know about it.

Could anyone else verify whether or not you have "TAGS=:systemd:" on your dmraid device under Arch?
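Tom's check can be scripted. A minimal sketch (the sample udevadm properties below are illustrative, not taken from this report; on a real system you would pipe "udevadm info /dev/mapper/nvidia_cjjcaiiep1" into the grep):

```shell
# Illustrative udevadm property output; real systems would use:
#   udevadm info /dev/mapper/nvidia_cjjcaiiep1
udevadm_output='E: DEVNAME=/dev/dm-7
E: DM_NAME=nvidia_cjjcaiiep1
E: TAGS=:systemd:'

# systemd only tracks devices carrying the "systemd" udev tag, so the
# presence of TAGS=:systemd: decides whether the .device unit can ever
# become active instead of timing out.
if printf '%s\n' "$udevadm_output" | grep -q 'TAGS=.*:systemd:'; then
    echo "device is tagged for systemd"
else
    echo "device is NOT tagged; the .device unit will time out"
fi
```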
Comment by Dave Reisner (falconindy) - Friday, 24 August 2012, 18:38 GMT
Tom, this sort of output is available in  FS#30134 .
Comment by Fred Verschueren (fvsc) - Saturday, 25 August 2012, 08:17 GMT
Included: mkinitcpio.conf from Arch,
dracut.conf from Fedora,

udevadm output from Fedora,
udevadm output from Arch when in the emergency shell (udevadm-arch-before),
udevadm output from Arch after a manual mount of /dev/mapper/nvidia_cjjcaiiep1 on /home (udevadm-arch-after).

My root partition is NOT on dmraid or mdraid; only /home is on dmraid.
Comment by Pierpaolo Valerio (gondsman) - Saturday, 25 August 2012, 11:18 GMT
I can confirm that TAGS=:systemd: is present on my system as well.
So the issue is present when using either dmraid or mdadm, and it doesn't depend on whether the array was assembled in early userspace or not. Still, the same setup works in Fedora fc7 (using systemd and dmraid) and Fedora 17 (using systemd and mdadm). Any idea?
Comment by Dave Reisner (falconindy) - Monday, 27 August 2012, 00:55 GMT
The crux of the problem is that /dev/disk/by-* symlinks are not created for these devices. Find out why.

This works for sysvinit because we don't care about those symlinks existing. We simply run 'mount -a' after some synchronization point (when we believe all storage devices are available) and let mount resolve the tags through libblkid.
Comment by Pierpaolo Valerio (gondsman) - Monday, 27 August 2012, 07:16 GMT
What should create those symlinks? I am fairly ignorant on the subject; the init sequence is one of those things a user doesn't delve into unless there's a problem. Is it systemd's task or the kernel's? When I boot (at least with sysVinit; I'll have to test in the emergency prompt with systemd) those links are there, as I would expect.
Comment by Dave Reisner (falconindy) - Monday, 27 August 2012, 09:53 GMT
udev creates these symlinks based on rules. In the case of dm-* nodes, it should be 13-dm-disk.rules, particularly this block:

IMPORT{builtin}="blkid"
ENV{DM_UDEV_LOW_PRIORITY_FLAG}=="1", OPTIONS="link_priority=-100"
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"
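The effect of that rule block can be sketched in plain shell: given blkid-derived properties (the values below are illustrative; the UUID is borrowed from the fstab quoted earlier in this report), this is the by-uuid symlink path the rule would create:

```shell
# Simulation of the SYMLINK+= logic in 13-dm-disk.rules: if the blkid
# builtin reported a filesystem with a UUID, udev adds a by-uuid link.
ID_FS_USAGE=filesystem
ID_FS_UUID_ENC=7bae55c2-a689-4829-8077-181afd23e0cd   # illustrative value

case "$ID_FS_USAGE" in
    filesystem|other|crypto)
        # mirrors: ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/..."
        if [ -n "$ID_FS_UUID_ENC" ]; then
            echo "disk/by-uuid/$ID_FS_UUID_ENC"
        fi
        ;;
esac
```

If the blkid import fails (or never runs) for the dm node, ID_FS_UUID_ENC stays empty and no symlink appears, which matches the missing /dev/disk/by-* links described above.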
Comment by Pierpaolo Valerio (gondsman) - Monday, 27 August 2012, 10:03 GMT
Ok, thanks. later today I'll test whether those links are created or not with every combination of dmraid/mdadm and sysVinit/systemd. Maybe it will help troubleshoot the problem.
Comment by Fred Verschueren (fvsc) - Monday, 27 August 2012, 15:21 GMT
output of ls -al /dev/disk/by-*/

with dmraid (I'm not using mdadm)

1. After a successful boot with sysvinit
2. After a failed boot with systemd (in the emergency shell)
3. After a successful boot with systemd (after mounting /home and running systemctl default in the emergency shell)

Hope this helps
Comment by Dave Reisner (falconindy) - Monday, 27 August 2012, 15:36 GMT
So what's the output of 'systemctl show' for the failed foo.mount on the dmraid units after a failed systemd boot?
Comment by Pierpaolo Valerio (gondsman) - Monday, 27 August 2012, 17:42 GMT
Meanwhile, I can add my outputs for ls -al /dev/disk/by-*/
(the one when I booted with systemd is the same before and after I mounted my home partition by hand).
/ -> /dev/md126p5
swap -> /dev/md126p5
/home -> /dev/md126p7
Comment by Fred Verschueren (fvsc) - Tuesday, 28 August 2012, 08:20 GMT
systemctl show home.mount (this is the dmraid partition)
systemctl show dev-mapper-nvidia_jjcaiie.mount (dmraid disk)
systemctl show dev-mapper-nvidia_jjcaiiep1.mount (dmraid disk partition)

For each show command I sent two files: one from when the mount of /home had failed, and one after a manual mount of /home (mount /dev/mapper/nvidia_jjcaiiep1 /home).

diff shows no difference for the dev-mapper-nvidia ... files (before and after).
The difference in the home file is the mount operation itself.
Comment by Dave Reisner (falconindy) - Tuesday, 28 August 2012, 09:56 GMT
I need to see dev-mapper-nvidia_cjjcaiiep1.device -- there is no .mount unit for a _device_.
Comment by Fred Verschueren (fvsc) - Tuesday, 28 August 2012, 15:22 GMT
systemctl show dev-mapper-nvidia_cjjcaiiep1.device

By the way, is there somewhere a document describing the flow of the boot process with systemd?
This would help to pinpoint a fault.
Comment by Pierpaolo Valerio (gondsman) - Tuesday, 28 August 2012, 18:04 GMT
Here is my output.
Comment by Fred Verschueren (fvsc) - Friday, 31 August 2012, 08:59 GMT
With systemd 189-3 the problem is still present.

By the way, is there somewhere a document describing the flow of the boot process with systemd?
This would help to pinpoint a fault.
Comment by Tom Gundersen (tomegun) - Friday, 31 August 2012, 09:16 GMT
@fvsc: there are some high-level explanations in the man pages; see bootup(7), udev(7), etc. I don't know if it will help in this particular case, but it's worth a try.
Comment by Alexander F Rødseth (xyproto) (trontonic) - Tuesday, 11 September 2012, 05:33 GMT
Did the suggestion from tomegun work? Is this still an issue? Thanks
Comment by Pierpaolo Valerio (gondsman) - Tuesday, 11 September 2012, 05:53 GMT
I'm still unable to boot with systemd, using either dmraid or mdadm.
Comment by Fred Verschueren (fvsc) - Tuesday, 11 September 2012, 21:29 GMT
Same situation.
I can only boot after manual mount. (see above)
Comment by Ralph (RWiggum) - Thursday, 13 September 2012, 07:44 GMT
As I am experiencing the same problem, I would like to contribute additional information that MAY assist in finding a cause and solution.

Intel "RAID" controller (fake type), 2 SSD RAID-0. (Asus P8Z77V Premium)

If booting off of it, or off another install on a different single disk, it will see all partitions listed in the fstab on the Intel RAID as "DEAD".
If running LUKS on another partition on the same disk, it will see the LUKS partition fine after it is unlocked (pre-boot or post-boot does not matter).
If a systemctl daemon-reload is performed, it will then see it as healthy and mount it properly.
When changing targets, it will see it as dead again and require another daemon-reload. (I think this eliminates mkinitcpio from the equation.)
If it is set as "noauto" in fstab, it will work properly when accessed after a daemon-reload after boot in the final target.
If "noauto" is not used in fstab, a daemon-reload is done, and data on the partition is then accessed for more than a few minutes, it will do an fsck and more or less lock the system; basically systemd crashes. I can issue commands, but no systemd, no init, so no logs, and no reboot.

I am using UUID, but path or label behave exactly the same.
An ls of /dev shows the partitions, as well as the UUID and label under /dev/disk/by-*, before and after daemon-reload.
blkid lists all of them before and after daemon-reload.
udevadm lists the disk and partitions before and after daemon-reload.

I would give up hope for a solution, but after doing a daemon-reload once a target has finished loading, and mounting afterwards, it works fine. This indicates it is capable of working; there is just a problem with load order, or something trivial that someone with more in-depth knowledge of systemd and udev can fix.

Systemd is the way things are going. I think it is an improvement, and am hopeful we can work past these little bumps in the road.
My system is still in a state where I can wipe everything and test. Single disk with 2 partitions; 2-disk fake RAID-0 with 2 partitions; SSDs or HDD.

I am using UEFI to boot with a EF00 boot partition and a LUKS root partition.
The problem is only with the boot partition of the drive on the Intel fake RAID.
Everything is md_raid, NOT dm_raid for these results.

/dev/md127 container
/dev/md126 inside the /dev/md127 container
/dev/md126p1 EF00 /boot
/dev/md126p2 LUKS /

This nested RAID is done by the Intel RAID OROM. It may be the heart of the cause. I would abandon it, but it is the only way the EFI will boot off of the RAID partition. Booting off of the EFI RAID partition works fine until the system tries to mount it according to fstab. The "noauto" option in fstab allows a completely successful boot in this configuration. After booting and logging in: a systemctl daemon-reload, ls /boot, and everything is perfect until a target change.

Please let us know if there is anything else we can contribute to assist in diagnosing the cause.

Thank you for your assistance.
Comment by Ralph (RWiggum) - Monday, 17 September 2012, 08:44 GMT
I did some more testing, and I am pretty sure it is something to do with RAID containers.
As I did not have enough available ports on the Intel controller, I built a DDF container. Pretty much the same results occurred.

Steps for anyone to duplicate:
mdadm -CR --verbose /dev/md0 -e ddf -l container -n 2 /dev/sde /dev/sdf
mdadm -CR --verbose /dev/md1 -l 0 -n 2 /dev/md0 -z 200G

gdisk /dev/md1
  o, y                        (new empty GPT)
  n, 1, Enter, +512M, ef00    (partition 1: 512M, type EF00)
  n, 2, Enter, Enter, Enter   (partition 2: rest of disk, defaults)
  w                           (write and exit)
mkfs.vfat -F32 -n testboot /dev/md1p1
mkfs.ext4 -L testroot /dev/md1p2
mkdir /ctr1
mkdir /ctr2
mount /dev/md1p1 /ctr1
mount /dev/md1p2 /ctr2
(copy some random data over to each)

blkid | grep md1p >> /etc/fstab (then edit to use UUID, mount point /ctr1 or /ctr2 respectively, and the filesystem)

mdadm --examine --scan >> /etc/mdadm.conf (and edited out the duplicates)
mkinitcpio -p linux

systemctl reboot

Starts up, times out
Maintenance mode
systemctl daemon-reload
mount /ctr1
mount /ctr2
ctrl-D
Login
ls /ctr1
(Never displays)
Ctrl-C
systemctl daemon-reload
ls /ctr1
*poof* lists contents
ls /ctr2
*poof* lists contents
We are good to go until we switch targets.


It looks like we just have a problem with RAIDs in a container.
cat /proc/mdstat looks the same before and after
blkid looks the same before and after
ls /dev looks the same before and after
lsmod looks the same before and after
udevadm looks the same before and after

Control group: a standalone disk (sdc1, GPT, ext4) using UUID in fstab, configured identically to md1p2, always works; no timeouts, no problems.

Hope this helps get closer to the resolution.
Not sure if it matters, but I am running in pure UEFI mode. (no BIOS or compatibility mode)
Please let me know if I can provide any additional information.

Thank you
Comment by Stéphane Travostino (eazy) - Wednesday, 26 September 2012, 16:29 GMT
I have the same problem as OP -- /home on dmraid

The difference is that the boot process never times out; it's stuck after "Assembling FakeRAID devices" and I cannot reboot (the reboot process starts and hangs as well), so I don't know how to debug this. Is any error written to some log file?

I removed dmraid from mkinitcpio.conf with no effect. Only commenting the dmraid devices out of /etc/fstab gets me to the emergency shell.
Comment by Tom Gundersen (tomegun) - Wednesday, 26 September 2012, 16:33 GMT
Can you still reproduce this with systemd-192 (currently in testing)?
Comment by Dave Reisner (falconindy) - Wednesday, 26 September 2012, 16:55 GMT
This should only be fixed in systemd-192 if you're using mdadm to assemble the array, which I'd strongly suggest moving to.
Comment by Pierpaolo Valerio (gondsman) - Wednesday, 26 September 2012, 18:11 GMT
It boots! Yay! Unfortunately this is still a mess. Now the machine boots fine but it doesn't shut down or reboot. The problem is exactly like this: https://bugzilla.redhat.com/show_bug.cgi?id=752593
Basically, from what I understood, in order to have the kernel boot I have to build my initramfs with BINARIES="/sbin/mdmon". When systemd tries to shut down, it kills this process and then tries to remount / read-only, failing miserably as mdmon is not running anymore. The bug is solved on Fedora, but they use dracut so I couldn't understand how they solved it. Can anyone help?
Comment by Tom Gundersen (tomegun) - Wednesday, 26 September 2012, 18:59 GMT
@gondsman: this seems to be a separate issue (not related to dmraid).

Adding mdmon to BINARIES by itself won't do anything (unless I'm missing something). It needs to be started in the initrd, and be passed the --offroot option so that it will not be killed by systemd. Did starting mdmon ever work with initscripts for you?
Comment by Pierpaolo Valerio (gondsman) - Wednesday, 26 September 2012, 20:43 GMT
Foreword: everything I'll say is using mdadm, NOT dmraid.
If I don't put mdmon in the BINARIES array, the kernel isn't able to mount the root partition correctly (it hangs before the init, complaining about an unknown filesystem or the lack of a helper program). If I put it in mkinitcpio.conf and run mkinitcpio -p, the kernel boots just fine. Both with initscripts and systemd the shutdown process hangs. With the initscripts it hangs at "unmounting non-API filesystems"; with systemd, while trying to remount / as read-only (the issue is exactly as in the Fedora bug I linked). Is there anything I can do to avoid mdmon being killed?
Comment by Dave Reisner (falconindy) - Wednesday, 26 September 2012, 20:55 GMT
Is this using mdadm or mdadm_udev hook?

I see in the code where mdadm/mdassemble can try to start mdmon, but mdadm needs to be called with --offroot in order for mdmon to be started with --offroot. Passing this option will prevent mdmon from being killed. Would be nice if mdadm was smart enough to look for /etc/initrd-release to autodetect being run from early userspace and add this flag automatically.
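The --offroot mechanism Dave describes can be illustrated. An mdmon started with --offroot rewrites its argv[0] to begin with '@', and systemd's shutdown code spares processes marked that way. A sketch (the sample command line below is illustrative; on a real system you would inspect /proc/$(pidof mdmon)/cmdline):

```shell
# Real-system check (commented out; requires a running mdmon):
#   tr '\0' ' ' < /proc/$(pidof mdmon)/cmdline
# Simulated here with a sample command line:
cmdline='@mdmon --offroot /dev/md0'

# systemd's shutdown killing spree skips processes whose argv[0]
# begins with '@' (the "root storage daemon" convention).
case "$cmdline" in
    @*) echo "argv[0] starts with '@': systemd spares this process at shutdown" ;;
    *)  echo "plain argv[0]: systemd will kill this process at shutdown" ;;
esac
```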
Comment by Pierpaolo Valerio (gondsman) - Wednesday, 26 September 2012, 22:02 GMT
I use mdadm_udev, I remember having tried mdadm without success. Is there any way to pass the --offroot option manually when building the initramfs?
Comment by Stéphane Travostino (eazy) - Thursday, 27 September 2012, 07:41 GMT
I may be misinformed, but isn't mdadm a Linux-only software RAID?

My problem is that I want to use the fake hardware RAID to be able to access it from Windows too.
Does mdadm provide this functionality?
Comment by Pierpaolo Valerio (gondsman) - Thursday, 27 September 2012, 07:53 GMT
@eazy: yes, it does. The dmraid package is in "maintenance mode" only; it won't be updated anymore. Most of its functionality is now part of mdadm, although the number of supported chipsets is somewhat smaller as far as I know. Anyway, I'm using mdadm to assemble my array (fakeRAID on an Intel chipset) and it does work, apart from what I wrote before. Other distributions like Fedora also use mdadm. Moreover, the Arch live CD uses mdadm; try to boot from it and see if it works.
Comment by Fred Verschueren (fvsc) - Thursday, 27 September 2012, 08:04 GMT
With systemd 192 my problem is still present.

Boot - emergency shell (/home on dmraid not mounted - time out) - manual mount of /home - systemctl default - boot continues and online

So, I cannot boot without manual intervention!!
Comment by Pierpaolo Valerio (gondsman) - Thursday, 27 September 2012, 08:58 GMT
@Fred: are you using mdadm or dmraid?
Comment by Fred Verschueren (fvsc) - Thursday, 27 September 2012, 15:07 GMT
I'm using dmraid.
Comment by Pierpaolo Valerio (gondsman) - Thursday, 27 September 2012, 15:33 GMT
As dave said before, the fix in systemd 192 only works if you assemble the array with mdadm, not dmraid.
Comment by Fred Verschueren (fvsc) - Friday, 28 September 2012, 08:52 GMT
@falconindy
Is there a plan to fix this also for dmraid?

@gondsman
How is the switch from dmraid to mdadm done?
The raid is also defined in the bios, must this be removed?

Comment by Dave Reisner (falconindy) - Friday, 28 September 2012, 09:50 GMT
There's no plans/interest to fix dmraid. I've mentioned several times before that upstream is dead.
Comment by Tom Gundersen (tomegun) - Friday, 28 September 2012, 10:01 GMT
While no one is currently working on dmraid, I'm sure it would be very much appreciated if someone with the right hardware would manage to figure out what's wrong so we can get it fixed...
Comment by Pierpaolo Valerio (gondsman) - Friday, 28 September 2012, 11:42 GMT
@fred: the switch to mdadm isn't difficult at all.
- Modify mkinitcpio.conf so that mdadm_udev is in the HOOKS line and /sbin/mdmon is in the BINARIES line (also, remove dmraid if any)
- Rebuild the initramfs with 'mkinitcpio -p linux'
- Update fstab and the bootloader config with the new device names (mdadm assembles partitions as /dev/md126pX)
Once the shutdown issue is solved I'm going to update the fakeRAID wiki page, as it's currently badly out of date.
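The steps above amount to a small /etc/mkinitcpio.conf change. A sketch (the HOOKS contents other than mdadm_udev are illustrative; adapt them to your own configuration rather than copying this line verbatim):

```shell
# /etc/mkinitcpio.conf fragment: mdadm_udev replaces any dmraid hook,
# and mdmon is needed in the image for container-based (fakeRAID) arrays.
HOOKS="base udev autodetect block mdadm_udev filesystems keyboard fsck"
BINARIES="/sbin/mdmon"

# Afterwards, rebuild the image and point fstab and the bootloader at
# the new device names (mdadm assembles the partitions as /dev/md126pX):
#   mkinitcpio -p linux
```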
Comment by Fred Verschueren (fvsc) - Sunday, 30 September 2012, 10:21 GMT
@tomegun
I have the HW, I'm willing to search for a solution but ... I need help.

@gondsman
The RAID is my /home partition, so there is no need to change or add anything to mkinitcpio.
Also, the switch from dmraid to mdadm is not that straightforward, because I expect the metadata to be different, so the data on the disks would not be accessible anymore.
Comment by Pierpaolo Valerio (gondsman) - Sunday, 30 September 2012, 12:15 GMT
@Fred: No, the metadata is the same. I switched back and forth between dmraid and mdadm many times without doing anything to the partitions, it's completely transparent.
Comment by Pierpaolo Valerio (gondsman) - Monday, 01 October 2012, 22:30 GMT
Let's try to give some more food for thought.
Suspecting the issue with the shutdown was in the initramfs (thanks to Tom's and Dave's suggestions), I compiled and installed dracut and created an initramfs with it. This way the system boots correctly and shuts down without errors. Unfortunately, after shutdown the array is flagged as "dirty", and booting into Windows (since the Intel utility to check the array is Windows-only) forces a rebuild, which takes several hours. I experienced the same behavior when I gave Fedora a try with the same setup.
I don't know if this can help troubleshooting the problem, let me know if I can do more tests.
Comment by Dave Reisner (falconindy) - Tuesday, 02 October 2012, 00:08 GMT
If you can post the dracut initramfs, that may be helpful. I'm not sure how soon I'd be able to take the time to look through it, though.
Comment by Pierpaolo Valerio (gondsman) - Tuesday, 02 October 2012, 09:07 GMT
Do you mean the .img file? Is it "human readable"? I'll post it when I get home tonight.
Comment by Pierpaolo Valerio (gondsman) - Tuesday, 02 October 2012, 22:08 GMT
Comment by Pierpaolo Valerio (gondsman) - Tuesday, 09 October 2012, 12:27 GMT
I don't know why, but my system was able to shut down correctly ONCE last night, without modifying anything. I wasn't able to reproduce it, now it doesn't shut down cleanly anymore. Anyway, when it did manage to unmount everything correctly, the array was flagged as "dirty" and it had to be checked and corrected at the next boot, exactly like when I boot with the dracut-built initramfs.
EDIT: also, this may be relevant (although I'm really not smart enough to figure out exactly how the shutdown process works...): http://board.issociate.de/thread/509251/mdadm-forces-resync-every-boot.html
Comment by Pierpaolo Valerio (gondsman) - Tuesday, 23 October 2012, 07:51 GMT
Since there were no new comments on this bug (I don't mean to sound rude, I know the devs are busy with all the changes going on in Arch these days), should I report the issue upstream? I'm guessing my bug is not Arch-specific, as the same issue is there with Fedora. If so, where should I report it? Systemd, mdadm or another project?
Thanks again.
EDIT: This may be relevant as well: https://bugzilla.redhat.com/show_bug.cgi?id=753335 . I'm going to try this solution, I'll report back.
Comment by Pierpaolo Valerio (gondsman) - Wednesday, 24 October 2012, 00:03 GMT
No luck with the solution linked in the previous post. :(
Comment by Pierpaolo Valerio (gondsman) - Wednesday, 31 October 2012, 13:58 GMT
Any update on this bug? I really don't want to rant here, but now that the latest updates force me to boot with systemd (hence I can't use dmraid, only mdadm), things are getting extremely annoying, because from time to time my PC does shut down "properly", forcing the array to be rebuilt, which takes several hours every time! I just can't use a system that takes 5+ hours to reboot; it's ridiculous. Not to mention people with RAID arrays not supported by mdadm (actually, ANY fakeraid other than Intel's), who are basically unable to boot without waiting five minutes at boot and manually mounting their /home partition.
Please note that this problem is not Arch-specific; it's basically due to systemd not supporting initramfs mounting of a root partition with dmraid. I have the same bug with Fedora 17. However, given the advertised "simplicity" of Arch (one of the main reasons I chose it), I would expect a "simple" way of checking what my system is doing at boot and during shutdown, and an equally "simple" way of modifying it, not something that automagically works most of the time and that I have no control over when it doesn't.
I'm sorry for this post as it doesn't add anything to the discussion, but I really don't want to leave Arch for another distro because of issues like this.
Comment by Dave Reisner (falconindy) - Thursday, 01 November 2012, 00:53 GMT
If there's no updates on this bug, it's because no one with the hardware, time, and motivation has taken the liberty of working with upstream on finding a solution.

I have:

- not the hardware to test this
- not the interest to fix this

And most of all, I don't have the time to read inane rants about how there's nothing happening with this bug.
Comment by Tom Gundersen (tomegun) - Thursday, 01 November 2012, 00:58 GMT
I keep intending to look into this, but it is just too much information to dig through. If any of you with the correct hardware and a vague grasp of how things _should_ work would poke me in #archlinux-projects one day, I could possibly help out by explaining any missing pieces so you can figure out exactly what the problem is (I really think it is necessary to have the hardware to be able to figure this out).
Comment by Pierpaolo Valerio (gondsman) - Thursday, 01 November 2012, 07:10 GMT
@dave: fair enough, I apologize for the tone of my previous post, but you can understand I was frustrated by having to wait 5 hours before using my pc. Anyway I do appreciate your work on the distro, I didn't mean to attack you. I'll keep the posting more "in topic" from now on.
@Tom: thanks, I don't know if I'm knowledgeable enough, but I should at least have an idea of what should be run in order to make it work, I just can't figure out how to do it. I'll drop by #archlinux-projects and see if I can help.
Comment by Fred Verschueren (fvsc) - Thursday, 01 November 2012, 10:06 GMT
I also still have the problem of not being able to boot without first being dropped into the shell, manually mounting my /home partition (on a dmraid array), and executing systemctl default.

@Tom: I'm also willing to help but don't have a clue where to start.

Question: where and how can I join #archlinux-projects
Comment by Pierpaolo Valerio (gondsman) - Thursday, 01 November 2012, 12:31 GMT
@fred: That is because systemd does not support dmraid correctly. As dmraid is not actively developed anymore, the systemd authors have no plans to implement such support.
However, mdadm does support SOME fakeraid implementations. You should check whether your RAID can be assembled with mdadm and, if so, use it instead of dmraid. If you want to try it without modifying your installation, use a recent Arch live CD, as it contains mdadm. In my case mdadm automatically assembled my device after booting from the live CD, but doing it manually isn't much more difficult; just check the mdadm manual. Then check /proc/mdstat to see if mdadm is working properly (you can also mount the partitions in the array; they should be assembled as /dev/md126p*).
Comment by Randy Terbush (RandyT) - Saturday, 03 November 2012, 22:00 GMT
Greetings, wanted to chime in here and share what I have learned about this issue.

I installed Arch Linux this week, coming from Gentoo. The hardware configuration I installed on has not changed in any way from the Gentoo config. I installed Arch Linux on my root drive and attempted to reconfigure the storage systems I had running prior to the Arch install.

I am running a 6-drive mdadm RAID10 array mounted on /raid, and my /home has been running on a dmraid RAID1 mirror using the "fakeRAID" hardware on my nvidia motherboard.

I can confirm all of the problems reported in this bug report. My fresh system has no trouble mounting the mdadm array, but it hangs attempting to complete the boot process and eventually drops me into the emergency shell. mount -a or mount /home mounts the dmraid array without any issues, and systemctl default continues the boot process.

It seems that mdadm does not support all "fakeRAID" chipsets, one of the unsupported ones being the nvidia chip. A shame, as I preferred having two different code paths supporting these two storage devices.

I've decided to stop beating my head against the wall here (it is getting soft...) and convert /home to mdadm software RAID.

Adding my $.02 here so that the higher powers recognize that this is not an unusual scenario.

Thanks. Loving Arch so far...
Comment by Pierpaolo Valerio (gondsman) - Sunday, 04 November 2012, 22:43 GMT
Small update. Since the last kernel update the system always shuts down "correctly". It still forces an array check every time, so the problem is not solved, but at least there is some progress.
My guess (I have no idea how to check) is that mdmon now gets launched with the --offroot flag, so systemd doesn't kill it during shutdown. As a result, the system unmounts the partitions correctly. However, when root is remounted read-only, the array isn't marked clean with
mdadm --wait-clean --scan
so a check is enforced at every reboot.
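One possible workaround for the unclean-array problem above (a sketch, not verified on this hardware): systemd-shutdown executes any programs it finds in /usr/lib/systemd/system-shutdown/ very late in shutdown, after root has been remounted read-only, so a small hook there could try to mark the arrays clean. The script name is hypothetical:

```shell
#!/bin/sh
# /usr/lib/systemd/system-shutdown/mdadm-clean (hypothetical name)
# systemd-shutdown invokes this with "halt", "poweroff" or "reboot" as $1.
# Wait until all arrays are marked clean so the next boot skips the check.
/sbin/mdadm --wait-clean --scan
```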
@tomegun: I tried dropping by #archlinux-projects but I couldn't find you. I'll try again shortly, thanks for your help.
Comment by Fred Verschueren (fvsc) - Monday, 05 November 2012, 15:43 GMT
@tomegun

I have found the reason why /home on dmraid is not booting without manual intervention.

No symlink is made for /dev/mapper/nvidia_cjjcaiiep1 because the symlink rule in 10-dm.rules, ENV{DM_UDEV_DISABLE_DM_RULES_FLAG}!="1", ENV{DM_NAME}=="?*", SYMLINK+="mapper/$env{DM_NAME}", never fires: DM_UDEV_DISABLE_DM_RULES_FLAG is set to "1", so the first match fails.

When I make my own 10-dm.rules in /etc/udev/rules.d with
ENV{DM_UDEV_DISABLE_DM_RULES_FLAG}!="0", ENV{DM_NAME}=="?*", SYMLINK+="mapper/$env{DM_NAME}", the match succeeds, the symlink is made, /home is mounted and the PC boots without any manual intervention.

So, the question is: where is ENV{DM_UDEV_DISABLE_DM_RULES_FLAG} set to '1', and is that correct?

I need some help to dig further.
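For reference, the workaround described above amounts to a local rules file like the following (a sketch of the override; a file with the same name in /etc/udev/rules.d shadows the packaged 10-dm.rules entirely, so this is a debugging aid rather than a fix):

```
# /etc/udev/rules.d/10-dm.rules -- local override shadowing the packaged rule.
# Inverting the flag test makes the mapper/<name> symlink appear even when
# DM_UDEV_DISABLE_DM_RULES_FLAG was set to "1" by whatever activated the device.
ENV{DM_UDEV_DISABLE_DM_RULES_FLAG}!="0", ENV{DM_NAME}=="?*", SYMLINK+="mapper/$env{DM_NAME}"
```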
Comment by Tom Gundersen (tomegun) - Monday, 05 November 2012, 16:58 GMT
@fvsc: That's useful, thanks!

Looking at the source code of libdm and dmraid, my guess is that a missing call to dm_task_set_cookie() in dmraid is the problem.

Please change your dmraid.service file to

"ExecStart=/sbin/dmraid --ignorelocking --activate y -Z -dddd"

Hopefully that should spit out the right debug info.
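For anyone following along, the edited unit would look roughly like this (a sketch; only the ExecStart line comes from this thread, and the remaining directives are assumptions about a typical oneshot activation unit of that era):

```
# dmraid.service with extra -dddd debugging, per the request above
[Unit]
Description=Activation of dmraid sets
DefaultDependencies=no
After=systemd-udev-settle.service
Before=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/dmraid --ignorelocking --activate y -Z -dddd
```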
Comment by Fred Verschueren (fvsc) - Monday, 05 November 2012, 19:46 GMT
@tomegun
-dddd added
PC rebooted
Partial output of journalctl -b: see attached file
   dmraid (3.8 KiB)
Comment by Pierpaolo Valerio (gondsman) - Friday, 30 November 2012, 12:24 GMT
I forgot to update this bug with recent news...
Thanks to Dave and the other developers on IRC, the problem is now solved for people who can use mdadm to assemble the array. This means (as far as I know) that fakeRAID with an Intel Matrix Storage chipset is now supported. There are still a few error messages thrown during shutdown, but everything works. I'm going to update the fakeRAID wiki page with info on how to set up mdadm for my chipset.
For users of other chipsets (nvidia, for example) the bug is still there; I can't really help with that.
Comment by Fred Verschueren (fvsc) - Sunday, 02 December 2012, 15:52 GMT
Thanks for the info.

Concerning my problem, I suppose it still exists, but I cannot test it because my processor went to CPU Valhalla.

I have upgraded with a new motherboard, a new processor and new RAM.

As a consequence, the dmraid array no longer worked with the new hardware, so I switched to mdadm RAID; the conversion went well without data loss.

Now I can boot without manual intervention.

Thanks for the help.
Comment by Dave Reisner (falconindy) - Saturday, 22 December 2012, 20:12 GMT
Users still stuck on dmraid should be able to use the dmraid package I pushed into [testing]. It's safe to cherry-pick. Make sure to regenerate your initramfs after installing the new package and before rebooting, with 'mkinitcpio -p linux'.
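The steps above amount to something like this (assuming the [testing] repository is enabled in /etc/pacman.conf and the commands are run as root):

```shell
# Cherry-pick just the updated dmraid package from [testing]
pacman -S testing/dmraid

# Regenerate the initramfs before the next reboot
mkinitcpio -p linux
```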
Comment by Dave Reisner (falconindy) - Friday, 28 December 2012, 19:27 GMT
Anyone... anyone... dmraid-1.0.0.rc16.3-8 is in [core]. Closing this bug, since no one seems interested in it anymore and I have someone else confirming that everything is fine.
