FS#20535 - [kernel26] Kernel update to 2.6.35.2 dmraid fails

Attached to Project: Arch Linux
Opened by Van Nguyen (kaizoku) - Friday, 20 August 2010, 10:25 GMT
Last edited by Tobias Powalowski (tpowa) - Friday, 26 August 2011, 14:52 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:
After updating to kernel 2.6.35.2, dmraid refuses to build the array.

Additional info:
*kernel26 2.6.35.2
ERROR: nvidia: wrong # of devices in RAID set "nvidia_dfdiifcb" [1/2] on /dev/sdb
ERROR: removing inconsistent RAID set "nvidia_dfdiifcb"
ERROR: no RAID set found

Steps to reproduce:
pacman -Syu
reboot

I had to downgrade the kernel back to 2.6.34.3 to fix this.
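
For anyone collecting these messages from several machines, here is a minimal Python sketch that pulls the set name and device counts out of the "wrong # of devices" line quoted above. The names `RaidSetMismatch` and `parse_mismatch` are hypothetical, not part of dmraid, and reading `[1/2]` as "devices found / devices expected" is an assumption based on the wording of the message.

```python
import re
from typing import NamedTuple, Optional


class RaidSetMismatch(NamedTuple):
    """Fields taken from one dmraid 'wrong # of devices' error line."""
    fmt: str        # metadata format handler, e.g. "nvidia"
    name: str       # RAID set name, e.g. "nvidia_dfdiifcb"
    found: int      # first number in the brackets (assumed: devices found)
    expected: int   # second number in the brackets (assumed: devices expected)
    device: str     # device the message was reported on


# Pattern modelled only on the single error line quoted in this report.
_PATTERN = re.compile(
    r'ERROR: (?P<fmt>\w+): wrong # of devices in RAID set '
    r'"(?P<name>[^"]+)" \[(?P<found>\d+)/(?P<expected>\d+)\] on (?P<device>\S+)'
)


def parse_mismatch(line: str) -> Optional[RaidSetMismatch]:
    """Return the parsed fields, or None if the line has a different shape."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return RaidSetMismatch(
        m["fmt"], m["name"], int(m["found"]), int(m["expected"]), m["device"]
    )


if __name__ == "__main__":
    sample = ('ERROR: nvidia: wrong # of devices in RAID set '
              '"nvidia_dfdiifcb" [1/2] on /dev/sdb')
    print(parse_mismatch(sample))
```

Lines such as "ERROR: no RAID set found" do not match the pattern and return None, so the function can be run over the whole output.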

Closed by  Tobias Powalowski (tpowa)
Friday, 26 August 2011, 14:52 GMT
Reason for closing:  Fixed
Comment by David C. Rankin (drankinatty) - Friday, 20 August 2010, 21:12 GMT
My experience with nvraid/dmraid on 2.6.34.3 and 2.6.35.2 was just the opposite of yours (see: http://bugs.archlinux.org/task/20499). 2.6.34.3 would not recognize the arrays at all, to the point that it would declare the array failed, requiring a power-off restart to re-sync before booting into LTS. 2.6.35.2 recognizes the arrays just fine, but the boot process dies waiting for udev devices and usbhid (the NUT driver). So there is something amiss in both kernels relating to dmraid and nvidia softraid.
Comment by Van Nguyen (kaizoku) - Monday, 23 August 2010, 03:32 GMT
Same thing for 2.6.35.3
Comment by Leonid Isaev (lisaev) - Wednesday, 25 August 2010, 17:37 GMT
I don't use dmraid myself, but have you seen this:
http://forums.fedoraforum.org/showthread.php?t=236404
Comment by Van Nguyen (kaizoku) - Thursday, 26 August 2010, 03:12 GMT
I have seen that, actually. It is an old thread, and they are not using the latest kernel. The user suggests deleting the metadata, but I think that will wipe all data.
Comment by N K (synackfin) - Sunday, 17 October 2010, 17:48 GMT
I've had the same problem. It had to do with an updated mdadm that shipped in Arch Linux at the same time as kernel 2.6.35.

The new mdadm for some reason doesn't support metadata version 1.0 on partitions properly. I had metadata 1.0 on /dev/sda1, /dev/sdb1, /dev/sdc1, etc. forming a 6 drive raid5 array with internal bitmap. The new mdadm for some reason incorrectly loaded the array as /dev/sda, /dev/sdb, /dev/sdc, etc. and displayed junk metadata stats (version 1.1, no internal bitmap). The new mdadm, because it saw junk metadata, started "syncing" the 6th disk, but in actuality was ruining the 6th disk.

I was able to downgrade my mdadm and keep the 2.6.35 kernel. I had to zero out the 6th disk and re-add it to the array so the old mdadm would properly rebuild it.
Comment by Van Nguyen (kaizoku) - Monday, 18 October 2010, 03:46 GMT
How do I obtain the binary for mdadm before 1.0? I can't find it anywhere. Wait, don't you mean dmraid?
Comment by Van Nguyen (kaizoku) - Sunday, 05 December 2010, 04:17 GMT
Okay I think I know what the problem is after some extensive research.

When Windows is installed, it puts a 100M partition on one of the drives. If all your drives are in the array, it creates the partition regardless, and this causes a mismatch in the number of blocks in your array.

The thing I don't understand is that this doesn't affect the older kernels, but the newer ones cannot correct this problem. I tried compiling my own kernel to fix the problem, but still no luck.

I think I'm going to have to back up over 2TB of data and start over again.
Comment by Jelle van der Waa (jelly) - Thursday, 14 April 2011, 21:08 GMT
Is this still an issue?
Comment by Van Nguyen (kaizoku) - Saturday, 23 April 2011, 03:08 GMT
Yes, it is, with the latest kernel.
Comment by Tom Gundersen (tomegun) - Monday, 20 June 2011, 16:57 GMT
Is this still a problem with mdadm 3.2.2 and kernel 2.6.39.1? If so, are there upstream bug reports about this? This really sounds like upstream bugs.
Comment by Van Nguyen (kaizoku) - Tuesday, 21 June 2011, 07:39 GMT
Again, it's dmraid, not mdadm...
Comment by Tom Gundersen (tomegun) - Tuesday, 21 June 2011, 09:43 GMT
Sorry, typo. So, does it still persist with dmraid 1.0.0.rc16.3 and kernel 2.6.39.1? And if so, have you reported this upstream?
Comment by Van Nguyen (kaizoku) - Tuesday, 21 June 2011, 10:47 GMT
Not sure. I backed up everything, which took an eternity, and started all over again.
Comment by Tom Gundersen (tomegun) - Tuesday, 21 June 2011, 11:20 GMT
Ok. In that case I think this bug should be closed, unless someone else can reproduce it?
