FS#17674 - Kernel panic with kernel26 2.6.32.2-2: No filesystem could mount root

Attached to Project: Arch Linux
Opened by Jerome Baker (ebirtaid) - Friday, 01 January 2010, 21:44 GMT
Last edited by Andrea Scarpino (BaSh) - Wednesday, 13 January 2010, 07:53 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To No-one
Architecture i686
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:
Updated to kernel26 2.6.32.2-2 from core; the system no longer boots and prints:

List of all partitions:
No filesystem could mount root,tried:
Kernel panic - not syncing: VFS: unable to mount root fs on unknown-block(0,0)
PID: 1, comm swapper Not tainted 2.6.32-ARCH #1

followed by a call trace, which I have a picture of if it is relevant. This occurs with the stock and fallback kernels and with kernel26-bfs, using either the UUID or /dev/sdx# in menu.lst. Reverting to 2.6.31.6-1 works. Other info follows.

fdisk -l:
/dev/sda2 1913 7865 47817472+ 83 Linux
/dev/sda3 7866 20023 97659135 5 Extended
/dev/sda5 7866 7896 248976 82 Linux swap / Solaris
/dev/sda6 7897 20023 97410096 83 Linux

fstab:
UUID=0bebcc5f-5749-408e-a668-1ce65b83995b swap swap defaults 0 0
UUID=7ab7cbcf-294e-40c1-bea3-e3b88adc85e5 / ext3 defaults 0 1
UUID=9e276556-b672-4ccb-8b31-d0b780e67b2d /home ext3 defaults 0 1

menu.lst:
# (1) Arch Linux
title Arch Linux
root (hd0,1)
kernel /boot/vmlinuz26 root=/dev/disk/by-uuid/7ab7cbcf-294e-40c1-bea3-e3b88adc85e5 ro edd=off vga=775
initrd /boot/kernel26.img

lspci:
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:04.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio Controller (rev a2)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev a2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev a3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev a3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
05:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9400 GT] (rev a1)

relevant lines of mkinitcpio.conf:
MODULES="pata_acpi pata_amd ata_generic scsi_mod sata_nv"
HOOKS="base udev autodetect pata scsi sata usb usbinput keymap filesystems"

The kernel itself builds fine and without error when installing from pacman. If any more information is required please let me know and I will do my best to provide it.
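
One thing that can be ruled out from the files above is a mismatch between the root= UUID in menu.lst and the / entry in fstab. The following is only a minimal sketch of that cross-check; the function names fstab_root_uuid and menu_root_uuid are hypothetical helpers, and the sample text is copied from this report.

```shell
#!/bin/sh
# Sketch: confirm that the root= UUID in menu.lst matches the / entry in fstab.
# Function names are hypothetical; the sample text is copied from this report.

# Print the UUID of the fstab entry whose mount point is "/" (fstab text on stdin).
fstab_root_uuid() {
    awk '$1 ~ /^UUID=/ && $2 == "/" { sub(/^UUID=/, "", $1); print $1 }'
}

# Print the UUID from a root=/dev/disk/by-uuid/... kernel parameter
# (menu.lst kernel line on stdin).
menu_root_uuid() {
    sed -n 's|.*root=/dev/disk/by-uuid/\([0-9a-fA-F-]*\).*|\1|p'
}

fstab_text='UUID=0bebcc5f-5749-408e-a668-1ce65b83995b swap swap defaults 0 0
UUID=7ab7cbcf-294e-40c1-bea3-e3b88adc85e5 / ext3 defaults 0 1
UUID=9e276556-b672-4ccb-8b31-d0b780e67b2d /home ext3 defaults 0 1'

kernel_line='kernel /boot/vmlinuz26 root=/dev/disk/by-uuid/7ab7cbcf-294e-40c1-bea3-e3b88adc85e5 ro edd=off vga=775'

fstab_uuid=$(printf '%s\n' "$fstab_text" | fstab_root_uuid)
menu_uuid=$(printf '%s\n' "$kernel_line" | menu_root_uuid)

if [ -n "$fstab_uuid" ] && [ "$fstab_uuid" = "$menu_uuid" ]; then
    echo "root UUIDs match: $fstab_uuid"
else
    echo "root UUID mismatch: fstab=$fstab_uuid menu.lst=$menu_uuid"
fi
```

On a live system the two functions could read /etc/fstab and /boot/grub/menu.lst instead of the inline text. Note that the sed expression also matches commented-out kernel lines, so the menu.lst input should be limited to the entry being booted.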

Closed by  Andrea Scarpino (BaSh)
Wednesday, 13 January 2010, 07:53 GMT
Reason for closing:  Fixed
Additional comments about closing:  kernel26 2.6.32.3-1
Comment by Vedant Kumar (vsk) - Sunday, 03 January 2010, 06:41 GMT
Given that the previous kernel works and that you're using ext3, the issue may have been introduced by this patch: http://www.kernel.org/diff/diffview.cgi?file=/pub/linux/kernel/v2.6/patch-2.6.32.2.bz2;z=119. It's the only modification to ext3 in the 2.6.32.2 release as far as I can tell.

In that patch, all calls to 'ext3_truncate' are replaced with 'ext3_truncate_failed_write', which truncates the inode pages and then just calls the original 'ext3_truncate' function. If possible, could you please comment out this line in /fs/ext3/inode.c: "truncate_inode_pages(inode->i_mapping, inode->i_size);" and test your kernel? That line is within the 'ext3_truncate_failed_write' function, around line 1150.

I may be completely wrong here -- I'm not 100% sure it's a kernel bug.. But if that change fixes things, then it'd be interesting :). Good luck!
Comment by Pall Gone (pallgone) - Sunday, 03 January 2010, 08:59 GMT
Hey... I have a similar issue on my HP mini. After upgrading the kernel via pacman -Syu the system does not boot anymore.
It halts after displaying "Waiting for /dev/sda3" (or UUID=xxx; I tried both) and cannot detect the root device.
Simply giving GRUB the old kernel image does detect the root device again, but then of course there are problems with the modules etc. I've decided to downgrade to kernel .31 and all works well again. Btw, I'm using ext4.

Here's my config:
fstab:
#
# /etc/fstab: static file system information
#
# <file system> <dir> <type> <options> <dump> <pass>
none /dev/pts devpts defaults 0 0
none /dev/shm tmpfs defaults 0 0

#/dev/cdrom /media/cd auto ro,user,noauto,unhide 0 0
#/dev/dvd /media/dvd auto ro,user,noauto,unhide 0 0
#/dev/fd0 /media/fl auto user,noauto 0 0

UUID=053ca061-b03e-46ef-89ed-ab37111fc935 / ext4 defaults 0 1
UUID=beb9babd-2f9b-4799-a7ba-765b71f49801 swap swap defaults 0 0
UUID=5bcd0543-ae50-476e-b75e-8954bafecaec /srv/nfs ext4 defaults 0 2
/dev/sda1 /boot ext4 defaults 0 0

menu.lst:
title Arch Linux - Xmonad
root (hd0,0)
kernel /vmlinuz26 root=/dev/disk/by-uuid/053ca061-b03e-46ef-89ed-ab37111fc935
initrd /kernel26.img

title Arch Linux - Framebuffer
root (hd0,0)
kernel /vmlinuz26 root=/dev/disk/by-uuid/053ca061-b03e-46ef-89ed-ab37111fc935 vga=786 3
initrd /kernel26.img

title Arch Linux - Textmode
root (hd0,0)
#kernel /vmlinuz26 root=/dev/disk/by-uuid/053ca061-b03e-46ef-89ed-ab37111fc935 3
kernel /vmlinuz26 root=/dev/sda3 3
initrd /kernel26.img

title Arch Linux Fallback
root (hd0,0)
kernel /vmlinuz26 root=/dev/disk/by-uuid/053ca061-b03e-46ef-89ed-ab37111fc935
initrd /kernel26-fallback.img

# (2) Windows
#title Windows
#rootnoverify (hd0,0)
#makeactive
#chainloader +1

lspci:
00:00.0 Host bridge: VIA Technologies, Inc. CN896/VN896/P4M900 Host Bridge
00:00.1 Host bridge: VIA Technologies, Inc. CN896/VN896/P4M900 Host Bridge
00:00.2 Host bridge: VIA Technologies, Inc. CN896/VN896/P4M900 Host Bridge
00:00.3 Host bridge: VIA Technologies, Inc. CN896/VN896/P4M900 Host Bridge
00:00.4 Host bridge: VIA Technologies, Inc. CN896/VN896/P4M900 Host Bridge
00:00.5 PIC: VIA Technologies, Inc. CN896/VN896/P4M900 I/O APIC Interrupt Controller
00:00.6 Host bridge: VIA Technologies, Inc. CN896/VN896/P4M900 Security Device
00:00.7 Host bridge: VIA Technologies, Inc. CN896/VN896/P4M900 Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
00:02.0 PCI bridge: VIA Technologies, Inc. CN896/VN896/P4M900 PCI to PCI Bridge Controller (rev 80)
00:03.0 PCI bridge: VIA Technologies, Inc. CN896/VN896/P4M900 PCI to PCI Bridge Controller (rev 80)
00:0f.0 IDE interface: VIA Technologies, Inc. Device 5372
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev b0)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev b0)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev b0)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237S PCI to ISA Bridge
00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller
00:13.0 Host bridge: VIA Technologies, Inc. VT8237A Host Bridge
00:13.1 PCI bridge: VIA Technologies, Inc. VT8237A PCI to PCI Bridge
01:00.0 VGA compatible controller: VIA Technologies, Inc. CN896/VN896/P4M900 [Chrome 9 HC] (rev 01)
02:00.0 Network controller: Broadcom Corporation BCM4312 802.11b/g (rev 01)
07:03.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5788 Gigabit Ethernet (rev 03)
80:01.0 Audio device: VIA Technologies, Inc. VT1708/A [Azalia HDAC] (VIA High Definition Audio Controller) (rev 10)
Comment by Vedant Kumar (vsk) - Sunday, 03 January 2010, 20:19 GMT
Hmm.. my ext3/inode bug theory is looking weaker now. There was significantly more work done on ext4 with this release, so any number of things could be causing this issue. That being said, I noticed the same pattern in the ext4/inode patch as I saw in the ext3 one; they've introduced "ext4_truncate_failed_write" which replaces "ext4_truncate" calls in order to truncate the "pagecache as well so that corresponding buffers get properly unmapped."

Maybe I'm looking in the wrong place (this is my first foray into kernel code =/).. I would appreciate it if you could comment out the line I pointed out earlier and test the 2.6.32.2 kernel, though. In /fs/ext4/inode.c, the line should be around line 1535.
Comment by Jerome Baker (ebirtaid) - Sunday, 03 January 2010, 21:29 GMT
With the aforementioned line commented out, kernel compilation does not complete. The error is:

fs/ext3/inode.c:1167: error: redefinition of ‘ext3_truncate_failed_write’
fs/ext3/inode.c:1158: note: previous definition of ‘ext3_truncate_failed_write’ was here
make[2]: *** [fs/ext3/inode.o] Error 1
make[1]: *** [fs/ext3] Error 2
make: *** [fs] Error 2
==> ERROR: Build Failed.
Comment by Vedant Kumar (vsk) - Monday, 04 January 2010, 07:25 GMT
I can't reproduce that. I built the vanilla 2.6.32.2 sources with my one-line change and it's working well enough for me to post this;
[quote]
...
CC fs/ext3/inode.o
...
Kernel: arch/x86/boot/bzImage is ready (#1)
[/quote]

Have you made any other modifications/applied any other patches? Can you check if inode.c:1158 looks like this;
static void ext3_truncate_failed_write(struct inode *inode) {
	// truncate_inode_pages(inode->i_mapping, inode->i_size);
	ext3_truncate(inode);
}

static int ext3_write_begin(struct file *file, struct address_space *mapping,
...
Comment by Jerome Baker (ebirtaid) - Monday, 04 January 2010, 19:41 GMT
Mine looks the same:

static void ext3_truncate_failed_write(struct inode *inode)
{
	// truncate_inode_pages(inode->i_mapping, inode->i_size);
	ext3_truncate(inode);
}

I synced ABS and copied the PKGBUILD, and applied no other patches. The only difference I see is that my CC line is different, compiling ext3 as a module:

...
CC [M] fs/ext3/inode.o
fs/ext3/inode.c:1168: error: redefinition of ‘ext3_truncate_failed_write’
fs/ext3/inode.c:1158: note: previous definition of ‘ext3_truncate_failed_write’ was here
make[2]: *** [fs/ext3/inode.o] Error 1
make[1]: *** [fs/ext3] Error 2
make: *** [fs] Error 2
==> ERROR: Build Failed.
Comment by Vedant Kumar (vsk) - Monday, 04 January 2010, 23:15 GMT
I'm puzzled.. Line 1168 (where this function is supposedly 'redefined') is simply this; "struct inode *inode = mapping->host;" -- and this kernel has been compiling correctly for you before.

The only thing I've done differently than you is setting up the compile. I had the 2.6.32.1 sources on disk, so I applied the 2.6.32.2 patch and everything was peachy. I did a 'make mrproper', 'menuconfig', 'bzImage', 'modules_install', and registered the thing with GRUB. Why would this work and the ABS method not?

Sorry I can't be of more help.
Comment by Heema (Heema) - Tuesday, 05 January 2010, 09:55 GMT
I also have the same problem.

If I restart a couple of times the root device gets detected, but then my secondary hard drive is not detected. It works without a problem with the fallback kernel.

Attached a screenshot.
Comment by John (graysky) - Tuesday, 05 January 2010, 12:17 GMT
Comment by Carsten Abele (Yonk) - Tuesday, 05 January 2010, 15:06 GMT
The 2.6.32 fallback kernel doesn't work for me.
I get no kernel panic.
I just can't mount the partitions for some reason:

[yonk@archlinux kernel]$ ls /dev/sdc*
brw-r--r-- 1 root root 8, 33 May 31 2009 /dev/sdc1
brw-r--r-- 1 root root 8, 34 May 31 2009 /dev/sdc2
brw-r--r-- 1 root root 8, 35 May 31 2009 /dev/sdc3
brw-r--r-- 1 root root 8, 37 May 31 2009 /dev/sdc5

[yonk@archlinux ~]$ sudo mount /dev/sdc5 test
mount: /dev/sdc5 is not a valid block device
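
A node under /dev can exist without the kernel having registered the partition behind it, which would be one possible reading of the error above. The following is a minimal sketch of a lookup in /proc/partitions-style text; partition_known is a hypothetical helper, and the two sample rows are made up for illustration (they are not taken from this machine).

```shell
#!/bin/sh
# Sketch: succeed if the named partition appears in /proc/partitions-style
# text read from stdin (columns: major minor #blocks name).
# partition_known is a hypothetical helper name.
partition_known() {
    awk -v name="$1" '$4 == name { found = 1 } END { exit !found }'
}

# Made-up sample rows in the /proc/partitions layout, for illustration only.
sample='major minor  #blocks  name

   8       32  100000000 sdc
   8       33   50000000 sdc1'

for part in sdc1 sdc5; do
    if printf '%s\n' "$sample" | partition_known "$part"; then
        echo "$part: listed by the kernel"
    else
        echo "$part: not listed by the kernel"
    fi
done
```

On the affected system the check would be run as `partition_known sdc5 < /proc/partitions`.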


Comment by Jerome Baker (ebirtaid) - Tuesday, 05 January 2010, 17:12 GMT
In response to vsk: yeah, I don't get it either. The other thing is that the -bfs package from AUR lacks that line completely (unless I am doing something wrong), so my guess is that it is not the issue.
Comment by Heema (Heema) - Tuesday, 05 January 2010, 20:31 GMT
Comment by Pall Gone (pallgone) - Wednesday, 06 January 2010, 11:57 GMT
I didn't mention that I am running Arch32 on this HP mini, which I "cloned" from a USB stick. I was curious whether I would still be able to access the hdd in question after updating the USB stick to .32. So I did the update and the stick ran fine. I booted the problem machine from it and had no problem mounting the root partition that is causing trouble. So I assume the error has something to do with what happens when the kernel tries to detect the root device on boot.

As the problem machine is my server I am not able to investigate much here...
Comment by Steen Meyer (slmeyer) - Thursday, 07 January 2010, 11:34 GMT
I have the same problem on a Dell Latitude D630: if I restart 10-15 times, it will suddenly find the partition. Looking forward to the incorporation of the patch.
Comment by Mauro Santos (R00KIE) - Thursday, 07 January 2010, 20:19 GMT
I may be completely wrong, but wouldn't you need to have at least 'ext3' (and/or 'ext4' for people using ext4) in the MODULES line in mkinitcpio.conf? At least I have that .... maybe it's just garbage/legacy stuff from an old install when it was still needed.
Comment by Jerome Baker (ebirtaid) - Thursday, 07 January 2010, 20:38 GMT
Apparently not, because once again it works with prior kernels, just not .32. This install has been going for ~1.5 years now without issue.
Comment by Mauro Santos (R00KIE) - Thursday, 07 January 2010, 20:48 GMT
Just asked because I have this:
MODULES="pata_atiixp ahci ehci-hcd ohci-hcd ext3 ext4"

I don't remember having changed it and things have been working very smoothly here. I guess that adding it there can't hurt anyway (just in case something doesn't go well with auto detection .... but the fallback should work anyway).
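
To compare the two MODULES lines quoted in this thread, a small check like the following could be used. It is only a sketch: modules_has is a hypothetical helper, not part of mkinitcpio, and it looks only at the MODULES line, not at what the hooks add to the image.

```shell
#!/bin/sh
# Sketch: check whether a module is named in a MODULES="..." line.
# modules_has is a hypothetical helper, not part of mkinitcpio.
modules_has() {
    # $1 = full MODULES="..." line, $2 = module name to look for
    list=$(printf '%s' "$1" | sed 's/^MODULES="//; s/"$//')
    case " $list " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Both lines are copied from comments in this report.
line_report='MODULES="pata_acpi pata_amd ata_generic scsi_mod sata_nv"'
line_r00kie='MODULES="pata_atiixp ahci ehci-hcd ohci-hcd ext3 ext4"'

for line in "$line_report" "$line_r00kie"; do
    if modules_has "$line" ext3; then
        echo "ext3 listed:     $line"
    else
        echo "ext3 not listed: $line"
    fi
done
```

Padding the list with spaces before matching keeps a name like ext3 from matching inside a longer module name.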
Comment by Heema (Heema) - Friday, 08 January 2010, 17:54 GMT
Seems the latest update (2.6.32.3-1) solved this bug.

I restarted twice and it booted fine.
Comment by Carsten Abele (Yonk) - Friday, 08 January 2010, 18:22 GMT
Didn't work for me ... on arch64 btw
Comment by Steen Meyer (slmeyer) - Friday, 08 January 2010, 21:50 GMT
The update seemed to fix it here on 64.
Comment by Daniel (Doehni) - Sunday, 10 January 2010, 17:53 GMT
It seems I have the same problem. (Also fallback doesn't start)

It always hangs after the [resume] hook. (Waiting 10 sec for partition...)

The problem wasn't fixed for me on i686...
Comment by Gun Onen (gun26) - Tuesday, 12 January 2010, 05:07 GMT
For those still getting a timeout waiting for the root filesystem even after updating to kernel 2.6.32.3: your problem may actually not be with libata retries but with udev, which is used early in the initrd. I fixed my own problem by downgrading udev to the udev-stable package in AUR (which provides udev 141).
