FS#8832 - [mkinitcpio] Kernel panic upon init=/bin/bash
Attached to Project:
Arch Linux
Opened by Bijan Camp (Abstracity) - Tuesday, 04 December 2007, 10:05 GMT
Last edited by Jan de Groot (JGC) - Sunday, 14 June 2009, 10:36 GMT
Details
Continued from forum thread:
http://bbs.archlinux.org/viewtopic.php?id=40478
Description: Either a kernel panic or other system error results from appending "init=/bin/bash" to the kernel parameters on GRUB's boot line.

After upgrading bash, filesystem, and hwdetect with a "pacman -Syu", I receive an error message stating something like, in phrakture's own words, "Can't find root device, failing to prompt. If your root device appears here, try adding rootdelay=X to your kernel boot parameters". This is with "bash 3.2.025-3", "filesystem 2007.11-3", and "hwdetect 0.9-1".

Before this upgrade, I was instead receiving this error message:

:: Running Hook [filesystems]
:: Loading root filesystem module ... jfs
JFS: nTxBlock=7842,nTxLock = 62739
Waiting for devices to settle... done
:: Initramfs completed - control passing to kinit
IP-Config: no devices to configure
Waiting 0 s before mounting root device...
kinit: Mounted root (jfs filesystem) readonly.
bash: root=/dev/sda3: No such file or directory
Kernel panic - not syncing: Attempted to kill in

Please review the original thread on the forum: http://bbs.archlinux.org/viewtopic.php?id=40478. That is where the initial discussion began between me and phrakture, and soon afterwards two other users confirmed this condition by posting in the thread. I wanted to file this bug report here for better organization for the developers.

The first error message above is after upgrading the bash, filesystem, and hwdetect packages, and before that I would simply get a kernel panic. Please read the noted thread to view the progress that has been made in narrowing the cause down. I would recommend sticking "init=/bin/bash" on the end of your kernel parameters to reproduce this.

Restating a message I posted on the thread, I have tested this on my laptop, a third-generation MacBook 2.16 GHz Intel Core 2 Duo (T7200 / T7400) with Serial ATA, as well as my desktop reiserfs system, which also boots from a Serial ATA drive. Both systems load "libata," "ata_generic," "ata_piix," "sr_mod," and "sd_mod" upon every boot.

Mkinitcpio version is the latest found in the core repository, 0.5.17-1. The parameter "init=/bin/bash" was also tested on mkinitcpio 0.5.15-2, which also produced a kernel panic during the initialization procedure.
Closed by Jan de Groot (JGC)
Sunday, 14 June 2009, 10:36 GMT
Reason for closing: Fixed
Additional comments about closing: Fixed in testing.
just kidding.
why don't you use init s?
My main concern was file system damage, but if the risk of corruption can be minimized to a negligible possibility by syncing the disk before a manual power off, you may want to consider changing the priority of this report to that of a lower one.
From what I read the suggestion is to boot into single user mode (init s?) then doing something, is that right?
Robin, I initially commented on the use of "sync", a command line tool to force a flush of the system buffer, available from the core/coreutils package, but I have since turned to a brighter solution that I felt would establish a higher chance of system-wide success: a read-only system. Mounting your root file system and all of its subfolders as read-only effectively halts any writes to the nominated device, reducing the chance of file corruption to an infinitesimal possibility. It turns out that the steps the minds behind Arch Linux take towards simplifying the file structure of the system make the process of setting up a read-only root astonishingly carefree.
To do this, you may want to follow the advice of an article from the openSUSE wiki as well as one from the Gentoo wiki; the sources are given below. Once you can positively ensure that you are running within a reliable, read-only file system, you may take yourself down to single-user mode via the "init s" command the two developers have suggested. If I were to do this over again from the beginning, I would probably ask for the assistance of one of the two knowledgeable developers who have already given their thoughts here, as the specifics they may be able to give you for converting a read/write root to a read-only root may prove to be critical to doing this safely.
With kind regards,
Bijan
1. http://en.opensuse.org/How-To_Make_the_root_filesystem_read-only
2. http://gentoo-wiki.com/HOWTO_Read-only_root_filesystem
I currently have a working system by booting from a live cd, mounting the partitions and then chrooting into the original system. If the boot readonly is just to get you to a situation where you can then fix something within the system I may be able to do it from where I am now.
My goal is also different from yours, in that the reason I am concerned with the reported irregularity of the booting process is that it outwardly prevents a simple way for me, as well as others, to test out various tasks from the command line that could lead to a system having to be turned off manually instead of through a proper shutdown procedure. That is the reason why Tobias had originally suggested the "init s" workaround, and that is the reason why the prospect of syncing the cached data with the written data on the disk seemed logical.
You will need to take a different approach towards this matter than the one I did. If you are looking into rebuilding your system, I would first concern myself with what exactly would be different after a new installation than from the build you are currently using. Find out what would be different, find out what would be changed, and try to initiate these changes without rebuilding your entire system from scratch--that's the key. Also, phrakture indicated from the original thread that the error message:
bash: root=/dev/sda3: No such file or directory
Kernel panic - not syncing: Attempted to kill init!
looked to be an "error from bash itself." Have you tried rolling back to a previous version of the bash shell?
Another thing I really would like to mention is that if you can boot into the Arch Install/Rescue CD then you _know_ that the CD is doing _something_ right. It has _some_ file, is employing _some_ altered technique, or is acting in just the right way for you to be able to boot into it. All you have to do is mimic what the Install CD is doing and make your normal system do exactly the same thing. I am looking forward to hearing what you decide to do.
Bijan
I thought your situation sounded different but I haven't had time to try to sit down and work out what it was.
I may try rolling back a few packages before doing a rebuild, but seeing as I have all the packages and config files backed up, I think it is likely to be just as quick to do a clear down and start again, then restore the config files, as it is to work out which package caused the problem in the first place. Especially as to test it I have to boot the CD, mount stuff, make changes, then reboot to test it. If it fails I have another reboot, remount etc.
If I do find anything with the bit of testing I have time for I'll report back.
I'll post this to the forum as well but here is what I did...
Boot a live cd
mount my root and boot partitions
chroot into the mounted root
rolled back bash, which forced me to roll back hwdetect and filesystem.
I reinstalled the kernel so that mkinitcpio was run, then rebooted and it all worked!
I don't know which package killed it, hwdetect, filesystem or bash, I'll have to see what happens next time one is available for an upgrade.
>Fixed it!
>I'll post this to the forum as well but here is what I did...
this does not work for me: what are the versions of the programs that you have installed?
regards
evellon
kernel26-2.6.23.8-1-i686.pkg.tar.gz
filesystem-2007.11-2-i686.pkg.tar.gz
bash-3.2.025-2-i686.pkg.tar.gz
hwdetect-0.8-11-i686.pkg.tar.gz
on one and the same on the other except that seems to work with the current kernel rather than this older one.
This bug report is titled "Kernel panic upon init=/bin/bash". You are not discussing this bug. You are discussing other tangentially related issues. Please stop it
http://projects.archlinux.org/git/?p=mkinitcpio.git;a=commit;h=41faecce468a243f1b0835cacdb373c8b4515204
WARNING: BE CAREFUL IF YOU DO THIS!
If anyone is willing to test, replacing /lib/initcpio/init with the init in that git checkout (chmod +x, of course) and rebuilding the mkinitcpio image should give you something testable. I would highly recommend building a NEW image (-g /boot/test-mkinitcpio.img)
I added an echo of the kinit command at the end of init. that gives:
with "kernel /boot/vmlinuz26 root=/dev/sda1" in grub:
/bin/kinit -- root=/dev/sda1 rootfstype=ext3 rootdelay=0
with "kernel /boot/vmlinuz26 root=/dev/sda1 init=/bin/bash" in grub:
/bin/kinit -- root=/dev/sda1 init=/bin/bash rootfstype=ext3 rootdelay=0
I did this at work without a compiler or any of that fun stuff. Would someone be willing to test this patch? I dunno if it even compiles, but it's a start, and enough to motivate me to fix this.
:: Loading Initramfs
/init: 36: replace: not found
export: 36: : bad variable name
Kernel panic - not syncing: Attempted to kill init!
Here is an outline of the steps I took:
1) I built a new klibc package by adding the command "patch -p1 -i ../0001-Properly-specify-arguments-passed-to-init-1.patch || return 1" to the build function of the PKGBUILD of klibc-1.5-5. I also changed the "_kver" field to "2.6.25-ARCH."
2) The patch seemed to allow a successful compilation through makepkg. The log for this step is attached.
3) I then removed the old klibc package and installed the new one:
/usr/bin/pacman -Rnsd klibc
/usr/bin/pacman -S custom/klibc
4) I then generated a new kernel image with the following command: /sbin/mkinitcpio -k 2.6.25-ARCH -c /etc/mkinitcpio.conf -g /boot/kernel26-new-klibc.img.
5) Finally, I attempted to boot the new kernel image with the "init=/bin/bash" parameter and received the aforementioned "kernel panic" message:
:: Loading Initramfs
/init: 36: replace: not found
export: 36: : bad variable name
Kernel panic - not syncing: Attempted to kill init!
Subsequently, I tried to boot the new kernel image without any extra kernel parameters and received the same foregoing "kernel panic" message.
Hope this helps.
[2008-05-27 18:49] removed klibc (1.5-5)
[2008-05-27 18:49] installed klibc (1.5-6)
Here is "pacman -Ql klibc-extras":
klibc-extras /usr/
klibc-extras /usr/lib/
klibc-extras /usr/lib/klibc/
klibc-extras /usr/lib/klibc/bin/
klibc-extras /usr/lib/klibc/bin/kill
klibc-extras /usr/lib/klibc/bin/lodel
klibc-extras /usr/lib/klibc/bin/losetup
klibc-extras /usr/lib/klibc/bin/mdassemble
klibc-extras /usr/lib/klibc/bin/mknod
klibc-extras /usr/lib/klibc/bin/moddeps
klibc-extras /usr/lib/klibc/bin/mv
klibc-extras /usr/lib/klibc/bin/parseblock
klibc-extras /usr/lib/klibc/bin/replace
The issue may be the result of applying the patch to usr/kinit/kinit.c. In other words, if I build klibc-1.5-5 from ABS without the additional patch, I am not able to boot with the "init=/bin/bash" parameter, but I am at least able to boot without the added parameter (just like how everyone else is); but if I build klibc-1.5-5 from ABS _with_ the additional patch, it will not boot in either case (with the added parameter or not) and relays the "kernel panic" given above.
:: Loading Initramfs
/init: 36: replace: not found
replace not found. That is "/usr/lib/klibc/bin/replace" on your system. For some reason it did not make it into the generated image. I am entirely unsure as to why, but it absolutely positively cannot be related to this patch. The patch modifies some C code. replace is added to the image by mkinitcpio and is in an entirely different package (klibc-extras).
Edit: I must have skipped reading about the patches - I didn't try any of those. It doesn't work on a stock system is all I was trying to get across.
@ Bijan Camp and others:
see FS#11757 for the "/init: 36: replace: not found" issue.
Now back to the original report - does anyone experience the same problem (NOT with "/init: 36: replace: not found")?
With a kernel command line formed like so:
kernel /boot/vmlinuz26 root=/dev/sda3 ro vga=733 init=/bin/sh -x
init should be started with all args *after* the init param. In this case, /bin/sh should be called with "-x".
instead, klibc doesn't do that at all. /bin/sh is called with the command line args "/boot/vmlinuz26 root=/dev/sda3 ro vga=733 init=/bin/sh -x", which we all know is completely invalid.
If we look at the original error:
bash: root=/dev/sda3: No such file or directory
We can see that bash is trying to "run" a file named "root=/dev/sda3" in the exact same way it would handle "bash myscript.sh"
Oddly enough, klibc's run-init code does this fine, but the changes were not migrated to the kinit code.
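The difference between the intended and the observed behaviour can be modelled in a few lines. This is only a Python sketch of the reasoning in this comment, not klibc code: the function names are made up for illustration, the "/sbin/init" default is an assumption, and the kernel image path is left out of the command line so that the first word matches the "root=/dev/sda3" error above.

```python
def expected_init_argv(cmdline):
    """Intended behaviour: init receives only the words that follow
    the init= parameter on the kernel command line."""
    words = cmdline.split()
    for pos, word in enumerate(words):
        if word.startswith("init="):
            init_path = word[len("init="):]
            return [init_path] + words[pos + 1:]
    return ["/sbin/init"]  # assumed default when init= is absent


def observed_init_argv(cmdline):
    """Reported bug: the whole command line is handed to init, so a
    shell treats the first word as the name of a script to run."""
    words = cmdline.split()
    for word in words:
        if word.startswith("init="):
            return [word[len("init="):]] + words
    return ["/sbin/init"] + words


# Command line from this comment, without the kernel image path.
cmdline = "root=/dev/sda3 ro vga=733 init=/bin/sh -x"
print(expected_init_argv(cmdline))
print(observed_init_argv(cmdline))
```

In the first case the shell is started as "/bin/sh -x"; in the second its first non-option argument is "root=/dev/sda3", which it tries to open as a script.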
with no init= argument kinit segfaults with: "kinit[1] segfault at 0 ip 080485a4 sp bfce1980 error 4 in kinit[8048000+8000]"
with "init=/bin/bash" or "init=/bin/bash -" on the end of my kernel arguments klibc errors with "kinit: /bin/bash: Bad address"
I'll try and rework the patch this weekend.
Is your comment confirming that you've tested and this is still a problem?
Kernel panic - not syncing: Attempted to kill init!
Tested on 3 different freshly "-Syu'ed" i686 boxes and it failed gloriously on all 3.
If anyone has time, can you test this - I'm not sure when I'll get around to it
http://code.phraktured.net/cgit.cgi/klibc/commit/?h=initargs&id=61098bf60486ab09760c40f4c5d0f0969c54a0be
with this latest patch "init=/bin/bash" now fails with "/bin/bash: rootfstype=ext3: No such file or directory" instead of the "root=/dev/sda2: No such file or directory"
ps.
I didn't recompile klibc-* packages, only klibc, but I included both the new and the old /lib/klibc-hash.so libraries. But since the patch is only to kinit I don't think it matters.
That is, a command line like so:
init=/bin/bash rootfstype=ext3 vga=791
will do the same as:
$ bash rootfstype=ext3 vga=791
kernel /boot/vmlinuz-2.6.29-dg \
root=/dev/sda3 ro \
resume=/dev/sda2 \
i915.modeset=1 init=/bin/bash
yes, it's at the end.
note that "rootfstype=ext3" is something that either the initcpio script or kinit adds.
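Why init= being "at the end" in GRUB is not enough can be shown with a toy model of the argument list that reaches kinit. The Python sketch below is an assumption-laden illustration: `compose_kinit_args` is a hypothetical helper that only mimics the order seen in the echoed kinit command earlier in this report (root= first, then the user's parameters, then rootfstype= and rootdelay= appended), and it is not the real init script.

```python
def compose_kinit_args(root, user_params, rootfstype="ext3", rootdelay=0):
    """Hypothetical model: root= first, then the user's remaining boot
    parameters, then the values that get appended automatically."""
    return (["root=" + root] + list(user_params)
            + ["rootfstype=" + rootfstype, "rootdelay=" + str(rootdelay)])


def words_after_init(args):
    """Words that follow init=, i.e. what a plain 'everything after
    init=' rule would hand to the shell as its arguments."""
    for pos, word in enumerate(args):
        if word.startswith("init="):
            return args[pos + 1:]
    return []


# The user put init=/bin/bash last, but two more words get appended.
args = compose_kinit_args("/dev/sda1", ["init=/bin/bash"])
print(" ".join(args))
print(words_after_init(args))
```

With this ordering the first word after init= is "rootfstype=ext3", which is consistent with the "/bin/bash: rootfstype=ext3: No such file or directory" failure reported for the latest patch.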
New patch: http://code.phraktured.net/cgit.cgi/klibc/commit/?h=initargs
Note the changed argc/argv to cmdc/cmdv if you're interested
Could you try that and report what the args are it dumps?
Thanks for debugging this for me - it's hard to get the time to do this, as I'm over SSH right now
http://softver.org.mk/damjan-files/kinit-fail.png
(I've not found how to copy/paste text from qemu/kvm directly)
I added a debug line that should output "init found, path=..." where ... is /bin/bash in this case.
If you don't see that, then maybe the patch didn't apply right.
New patch: http://code.phraktured.net/cgit.cgi/klibc/commit/?h=initargs&id=14aa4a62e5e9a158bbad93cb961369693cfbbef4
maybe it's better to get klibc from git then?
So.. got the patch (14aa4a62e5e9a158bbad93cb961369693cfbbef4), checked the source to see if it's really patched after makepkg.. it is. Installed package in VM, run mkinitcpio.. reboot the VM ..
Let's see now.. WTF .. see the new screenshot: http://softver.org.mk/damjan-files/kinit-fail.png
ps. it seems that that DEBUG macro is not expanded, dunno why .. I'll change it to printf and try again.
"init found, path=/bin/bash"
that argc also looks wrong, 12 when it should probably be 6 (or 5).
The disadvantage is that boot from network is not supported yet with those scripts, as i have no need for it. It should be easy to implement this with ipconfig and nfsmount.
Maybe this code can be useful in solving this bug.
filesystems (1.4 KiB)
(Random side note: mkinitcpio fails hard under bash 4.0)
@thomas: kinit does argument parsing wrong for passing args to init. No one has noticed because the major klibc user (debian) uses run-init and skips kinit entirely.
http://git.kernel.org/?p=libs/klibc/klibc.git;a=blob;f=usr/kinit/kinit.c#l199
and this is run_init:
http://git.kernel.org/?p=libs/klibc/klibc.git;a=blob;f=usr/kinit/run-init/run-init.c#l58
As far as I see, run_init is called as 'run_init [-c console_dev] /root /sbin/init [arguments to init]', while we call 'kinit -- "root=${root}" ${kinit_params} "${runlevel}"' where everything after the -- is treated as an element of /proc/cmdline and kinit passes all these arguments to the init process.
Should it even pass any arguments to init? Does the kernel pass any arguments to init? I guess what we want is to pass everything _before_ the -- to init and /sbin/init ignores this, while /bin/bash doesn't. If I am correct about everything here, the fix is to change line 261 from "i++" to "init_argv[i++] = NULL". So a one-line patch instead of reimplementing much of klibc's functionality in dash scripts, what do you think?
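The effect of the proposed one-line change can be illustrated with a toy model of a NULL-terminated argument vector. This is a Python sketch of the reasoning only: the actual layout of kinit.c around that line is not shown in this report, `None` stands in for C's NULL, and the slot contents and function names are made up for illustration (the "--" separator is left out for simplicity).

```python
def read_argv(slots):
    """Read a C-style argv: stop at the first None (stands in for NULL)."""
    out = []
    for slot in slots:
        if slot is None:
            break
        out.append(slot)
    return out


def build_init_argv(kinit_argv, init_path, terminate):
    """Reuse kinit's own argument slots as the argv for init."""
    init_argv = list(kinit_argv)
    i = 0
    init_argv[i] = init_path  # argv[0] becomes the init program
    i += 1
    if terminate:
        init_argv[i] = None   # models: init_argv[i++] = NULL
        i += 1
    # Without the terminator, the old slots are still readable.
    return read_argv(init_argv)


kinit_argv = ["kinit", "root=/dev/sda3", "init=/bin/bash", None]
print(build_init_argv(kinit_argv, "/bin/bash", terminate=False))
print(build_init_argv(kinit_argv, "/bin/bash", terminate=True))
```

Under these assumptions the unterminated vector still carries "root=/dev/sda3" as bash's first argument, while the terminated one starts bash with no arguments at all.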
What I don't understand is why we first try to modularize things with hooks, and then pass control to kinit which does most of the things already done in hooks, but does some of them not quite right or not in the way/order we want it?
sokoan65's patch seems to work fine, and maybe it's not good to wait until Arch's initramfs is reworked.
Long-term we should:
- move away from the crappy and mostly unmaintained klibc
- either use uclibc or glibc, combine it with busybox
- use a proper shell (busybox's ash seems okay)
- use busybox's switch_root
- create a proper net hook (what Jörg wrote above seems fine on a short glance)
Right now, I don't have the motivation to do that, and I have tons of other stuff pending for mkinitcpio, initscripts and related things, so I'll leave the quickfix in for now - at least kinit is behaving as expected now.