FS#19457 - mkinitcpio 0.6.4-1 / mkinitcpio-busybox 1.16.1-3 causes root not to be found

Attached to Project: Arch Linux
Opened by Nicky726 (Nicky726) - Friday, 14 May 2010, 20:07 GMT
Last edited by Thomas Bächler (brain0) - Monday, 17 May 2010, 19:03 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa), Aaron Griffin (phrakture), Thomas Bächler (brain0)
Architecture i686
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
After upgrading to the mentioned versions of the packages, my EEE 1005 H netbook fails to boot; more precisely, it fails to mount the root filesystem. It prints something like "can't find actual root filesystem, bailing out", and I end up in the ramfs shell.
Downgrading the packages back to 0.6.3-1 and 1.16.1-1, respectively, makes the system bootable again.

Additional info:
encrypted disk with LUKS (except for /boot)
LVM on top of encrypted partition
resume set up for swap partition
intel KMS early start

* package version(s)
mkinitcpio 0.6.4-1
mkinitcpio-busybox 1.16.1-3
* config and/or log files etc.
mkinitcpio hooks are: base, udev, autodetect, pata, scsi, sata, keymap, encrypt, lvm2, resume, filesystems
mkinitcpio modules are: intel_agp i915
modprobe.conf options are: i915 modeset=1 and usbcore autosuspend=1

Steps to reproduce:
Upgrade the mentioned packages on a box with the mentioned config and watch your root not being found on the next boot.

Closed by  Thomas Bächler (brain0)
Monday, 17 May 2010, 19:03 GMT
Reason for closing:  Not a bug
Additional comments about closing:  mkinitcpio should require util-linux-ng>=2.17 though.
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 06:48 GMT
Can't reproduce, please provide more information on the machine and boot process.
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 08:19 GMT
I'd be happy to; what exactly do you want to know?

The problem appears when Arch is printing "mounting root filesystem read only", or something like that. I can mount the partition from the ramfs manually, but when I type "exit" after that, I get a kernel panic.
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 08:25 GMT
You see, I can't use "or something like that", I need to know _exactly_ what's on the screen.
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 08:47 GMT
Sorry about that. The message just flashes by; I can't catch it any more clearly. Here is what remains on the screen:

...
Running Hook [lvm2]
....
5 logical volume(s) in volume group lvm_storage now active
Running Hook [resume]
Waiting 10 seconds for device /dev/mapper/lvm_storage-lvm_swap...
Waiting 10 seconds for device /dev/mapper/lvm_storage-lvm_root...
BusyBox v1.16.1 (...) multi-call binary.

Usage: mount ....

ERROR: Failed to mount the real root device.
Bailing out, you are on your own. Good luck.

/bin/sh: can't access tty; job control turned off
[ramfs/]

Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 08:55 GMT
Hmm, this may be interesting, from dmesg:
...
PM: Starting manual resume from disk
PM: Resume from partition 254:1
PM: Checking hibernation image.
PM: Error -22 checking image file
PM: Resume from disk failed

Though I didn't suspend; I just rebooted normally.
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 09:45 GMT
You'd expect to not find a hibernation image when you didn't hibernate, so nothing to worry about here.


Can you provide the contents of the kernel command line? It seems that everything is fine up to the point where we actually try to mount - this means that the root device is in fact found, only the 'mount' command is incorrect.
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 10:02 GMT
Here it is:
kernel /vmlinuz26 root=/dev/mapper/lvm_storage-lvm_root cryptdevice=/dev/sda2:luks_storage resume=/dev/mapper/lvm_storage-lvm_swap ro
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 10:06 GMT
What is your filesystem type? Does the fallback image boot?
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 10:34 GMT
The filesystem is ext4, and the fallback image doesn't boot either. The error is the same.
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 10:56 GMT
Okay, that rules out the obvious suspects. Before the:

BusyBox v1.16.1 (...) multi-call binary.

Usage: mount ....

There should be one line detailing what exactly is going wrong. If you could provide what this says, it would be great.
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 12:36 GMT
If it should be exactly before the "BusyBox..." line, then there isn't one. The last printed line is "Waiting 10 seconds for device /dev/mapper/lvm_storage-lvm_root...". Nothing else on the screen suggests that anything is going wrong.
Any suggestion how to get that line?
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 13:45 GMT
Then I don't see why it prints a mount usage message. I don't see what might be wrong with the mount command, considering it works fine here.

You could try the following, just to see what the script tries to do: The last command in /lib/initcpio/init_functions is the 'mount' command. Before that command, insert a new line, copy the command, put 'echo' in front of it. Then regenerate initramfs. At least then I will know which exact command fails.
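
A minimal sketch of this debugging step. The variable names and the shape of the mount line are assumptions for illustration only; the actual contents of /lib/initcpio/init_functions are not reproduced here.

```shell
#!/bin/sh
# Hypothetical stand-ins for the values the init script would have at this point.
fstype="ext4"
rwopt="ro"
root="/dev/mapper/lvm_storage-lvm_root"

# Debug line: the same command with 'echo' in front. The shell expands all
# variables, but prints the resulting command instead of running it.
debug_line=$(echo mount -t ${fstype} -o ${rwopt} "${root}" /new_root)
echo "${debug_line}"

# The real command would follow here unchanged (left commented out in this sketch):
# mount -t ${fstype} -o ${rwopt} "${root}" /new_root
```

Because echo receives the already expanded arguments, the printed line shows exactly what mount would have been given.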
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 14:35 GMT
OK, the echo prints:
mount -t af2414b5-3a57-40b2-86a0-9813d0b09f17 1.0 ext4 filesystem -o ro /dev/mapper/lvm_storage-lvm_root /new_root
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 16:30 GMT
Now we are in business. I don't know yet what happens exactly, but please confirm that appending "rootfstype=ext4" to the kernel command line works around the problem.
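
The workaround relies on an explicitly given type being used instead of the autodetected one. A minimal sketch of that idea, assuming a simple loop over the space-separated kernel parameters; this is an illustration, not mkinitcpio's actual parsing code, and the variable names are hypothetical.

```shell
#!/bin/sh
# Kernel command line from this report with the workaround appended.
cmdline="root=/dev/mapper/lvm_storage-lvm_root cryptdevice=/dev/sda2:luks_storage resume=/dev/mapper/lvm_storage-lvm_swap ro rootfstype=ext4"

fstype=""
# Hypothetical parser: walk the parameters and pick out rootfstype=<value>.
for arg in $cmdline; do
    case "$arg" in
        rootfstype=*) fstype="${arg#rootfstype=}" ;;
    esac
done

if [ -n "$fstype" ]; then
    echo "using filesystem type from command line: $fstype"
else
    echo "no rootfstype given, type would have to be autodetected"
fi
```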
Comment by Thomas Bächler (brain0) - Monday, 17 May 2010, 17:04 GMT
What's your util-linux-ng version?

mkinitcpio runs this to determine the file system type:

/sbin/blkid -u filesystem -o value -s TYPE -p /dev/mapper/lvm_storage-lvm_root

but your output looks like the one from:

/sbin/blkid -u filesystem -o value -p /dev/mapper/lvm_storage-lvm_root
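
A small sketch of why the extra fields matter. It does not call blkid; the "bad" value is the string from the echo output earlier in this report, and count_args is a hypothetical helper that only counts the arguments a command would receive.

```shell
#!/bin/sh
# Simulated results of the type detection (no live blkid call).
good_fstype="ext4"
bad_fstype="af2414b5-3a57-40b2-86a0-9813d0b09f17 1.0 ext4 filesystem"

# Hypothetical helper: print how many arguments it was called with.
count_args() { echo "$#"; }

# Unquoted expansion: the shell splits the value on whitespace, so a
# multi-field value turns into several separate arguments after -t.
good_count=$(count_args mount -t ${good_fstype} -o ro /dev/mapper/lvm_storage-lvm_root /new_root)
bad_count=$(count_args mount -t ${bad_fstype} -o ro /dev/mapper/lvm_storage-lvm_root /new_root)

echo "arguments with a single TYPE value: ${good_count}"
echo "arguments with all values:          ${bad_count}"
```

With the multi-field value, only the first field ends up as the argument of -t and the remaining fields are passed as extra positional arguments, which is consistent with mount printing its usage message.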
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 17:04 GMT
Yes it does.
Comment by Nicky726 (Nicky726) - Monday, 17 May 2010, 17:35 GMT
I used my own experimental SELinux-enabled build, version 2.16. I also had obsolete versions of SELinux-enabled coreutils, pam and flex. Switching to the [core] versions of these packages solves the issue -- how could I forget! :-/ OK, you can consider this solved. Sorry for the trouble.
