FS#23691 - [mdadm] 3.2.1 crashes

Attached to Project: Arch Linux
Opened by Ionut Biru (wonder) - Monday, 11 April 2011, 08:00 GMT
Last edited by Tobias Powalowski (tpowa) - Saturday, 23 April 2011, 19:32 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 6
Private No

Details

Description:

mdadm[1996]: segfault at 0 ip 0000000000421726 sp 00007fff58769d80 error 4 in mdadm[400000+5e000]
r8169 0000:06:00.0: eth0: link up
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
mdadm[1997]: segfault at 0 ip 0000000000421726 sp 00007fffa5a48380 error 4 in mdadm[400000+5e000]
eth0: no IPv6 routers present


This is from /etc/rc.d/mdadm crashing on line 11

/sbin/mdadm --monitor --scan -i /var/run/mdadm.pid -f

it seems that it crashes only when passing --scan
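
For reading the segfault lines in the log above: "segfault at 0" is the faulting address (a null pointer access), and "mdadm[400000+5e000]" gives the base and size of the mapping that contains the instruction pointer. A minimal sketch of the arithmetic, using only the values from the log; the helper name fault_offset is made up for illustration:

```shell
# Hypothetical helper: offset of the faulting instruction inside the
# mapping, i.e. instruction pointer minus mapping base.
# $1 = ip, $2 = mapping base, both written as hex literals with a 0x prefix.
fault_offset() {
    printf '0x%x\n' "$(( $1 - $2 ))"
}

# Values copied from the log line: ip 0000000000421726, mapping base 400000
fault_offset 0x421726 0x400000   # prints 0x21726
```

Both crashes report the same ip, so they hit the same instruction.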

Closed by Tobias Powalowski (tpowa)
Saturday, 23 April 2011, 19:32 GMT
Reason for closing: Fixed
Additional comments about closing: 3.2.1-3
Comment by Tobias Powalowski (tpowa) - Monday, 11 April 2011, 12:32 GMT
confirmed, latest git also has this bug :(
Comment by Jonathan Liu (net147) - Tuesday, 12 April 2011, 08:22 GMT
Patch to fix segmentation fault attached.
Comment by Jonathan Liu (net147) - Tuesday, 12 April 2011, 08:43 GMT
Posted patch for git version to linux-raid@vger.kernel.org.
Comment by Jonathan Liu (net147) - Tuesday, 12 April 2011, 08:47 GMT
After applying patch though, I get a message when starting mdadm:
mdadm: Only one autorebuild process allowed in scan mode, aborting

It is caused by: mdadm --monitor --oneshot --scan
Comment by Tobias Powalowski (tpowa) - Tuesday, 12 April 2011, 10:19 GMT
Any idea when this will be fixed?
Do I need to force a 3.1.5 downgrade?
Comment by Jonathan Liu (net147) - Wednesday, 13 April 2011, 06:00 GMT
Comment by Jonathan Liu (net147) - Wednesday, 13 April 2011, 06:06 GMT
Just need to fix the mdadm initscript.
Comment by Jonathan Liu (net147) - Wednesday, 13 April 2011, 06:07 GMT
Why is mdadm called again after starting with --oneshot?
Comment by Florian Pritz (bluewind) - Wednesday, 13 April 2011, 06:22 GMT
> Why is mdadm called again after starting with --oneshot?

https://bugs.archlinux.org/task/20937
Comment by Jonathan Liu (net147) - Wednesday, 13 April 2011, 06:25 GMT
Well, it can probably be run before starting mdadm. We should verify whether upstream has already fixed the issue for FS#20937.
Comment by Jonathan Liu (net147) - Wednesday, 13 April 2011, 11:04 GMT
I can't reproduce FS#20937 with mdadm 3.1.5 or 3.2.1.

Steps to test:
/etc/rc.d/mdadm stop
dd if=/dev/zero of=data1.bin bs=4096 count=$((32*1024*1024/4096))
dd if=/dev/zero of=data2.bin bs=4096 count=$((32*1024*1024/4096))
losetup /dev/loop0 data1.bin
losetup /dev/loop1 data2.bin
mdadm --create /dev/md/raidtest --metadata=default --level=raid1 --raid-devices=2 /dev/loop0 /dev/loop1
mdadm /dev/md/raidtest --fail /dev/loop1
mdadm --monitor --scan -i /var/run/mdadm.pid -f
mdadm --stop /dev/md/raidtest
losetup -d /dev/loop0
losetup -d /dev/loop1

It sends me an email as soon as I start the mdadm monitor with the "mdadm --monitor --scan -i /var/run/mdadm.pid -f" command.
Is it possible it was already fixed upstream since the original bug report was created?

If that's the case we can just remove "mdadm --monitor --oneshot --scan" from the initscript.
If I use "/etc/rc.d/mdadm start" instead of "mdadm --monitor --scan -i /var/run/mdadm.pid -f" with mdadm 3.1.5, I get sent two emails saying the same thing, which implies that "mdadm --monitor --oneshot --scan" is redundant.
Comment by Jonathan Liu (net147) - Saturday, 16 April 2011, 09:51 GMT
Tobias, can you remove "mdadm --monitor --oneshot --scan" from the initscript?
As far as I can tell it is redundant, shows an error message when starting mdadm, and doesn't do anything because mdadm is already running.
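
To check whether a monitor is already running before a second invocation is attempted, something like the following could be used. This is only a sketch, not the actual initscript: the function name monitor_running is made up, and it assumes the monitor was started with "-i /var/run/mdadm.pid" as in the command above.

```shell
# Hypothetical helper: succeed if the PID stored in the given file
# belongs to a live process. Run as root (as an initscript would be),
# since kill -0 also fails when the process exists but belongs to
# another user.
monitor_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

if monitor_running /var/run/mdadm.pid; then
    echo "mdadm monitor already running"
else
    echo "mdadm monitor not running"
fi
```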
