FS#15416 - [kernel26] 2.6.30.1-1 PC reboots immediately after start from Grub

Attached to Project: Arch Linux
Opened by Gerhard Brauer (GerBra) - Monday, 06 July 2009, 20:10 GMT
Last edited by Roman Kyrylych (Romashka) - Tuesday, 14 July 2009, 10:01 GMT
Task Type: Bug Report
Category: Kernel
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Pierre Schmitz (Pierre), Thomas Bächler (brain0)
Architecture: All
Severity: Critical
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 3
Private: No

Details

Description:
The new kernel26 package reboots my PC immediately after it is started from GRUB. I removed all quiet and framebuffer parameters, but I see no output before the reboot. Normally the first line should be "Loading initramfs...." (or similar).

I also tried different kernel parameters (acpi, lapic, ...) - no go.
Rebuilding the initrd doesn't solve it.

Other users have this problem too:
http://bbs.archlinux.org/viewtopic.php?pid=580687

The reported machines differ in CPU/vendor and CARCH (i686, x86_64), and no pattern is visible.
There are also plenty of posts where 2.6.30.1 works without problems.

In the thread above there was one hint to downgrade xz-utils and then rebuild the initrd, but this solved the problem for only one person...


Additional info:
* package version(s)
kernel26-2.6.30.1-1
xz-utils 4.999.8beta-4


Closed by  Roman Kyrylych (Romashka)
Tuesday, 14 July 2009, 10:01 GMT
Reason for closing:  Fixed
Additional comments about closing:  the kernel part of the problem is fixed in 2.6.30.1; for the grub part there are instructions posted on the forums plus a link to this report; everyone with the issue seems to have solved it by now
Comment by Andreas Radke (AndyRTR) - Monday, 06 July 2009, 20:36 GMT
are you running kms with an intel card? I had something similar. see dev list kernel signoff thread.
Comment by Gerhard Brauer (GerBra) - Monday, 06 July 2009, 20:38 GMT
No, it's an nvidia....
Comment by Pierre Schmitz (Pierre) - Monday, 06 July 2009, 20:43 GMT
adding Thomas (who built the kernel) and mysqlf (xz-utils)
Comment by Thomas Bächler (brain0) - Monday, 06 July 2009, 20:58 GMT
Could you please bring all of the affected users here so that they can write down their system configuration and details?
Comment by Gerhard Brauer (GerBra) - Monday, 06 July 2009, 21:09 GMT
Yes. Here is my system:
$ uname -a
Linux ws01 2.6.30-ARCH #1 SMP PREEMPT Fri Jun 19 20:44:03 UTC 2009 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ AuthenticAMD GNU/Linux
RAM: 4GB, DualChannel
GPU: nVidia Corporation D9M-20 [GeForce 9400 GT] (rev a1)
MB: Asus M3A76

Additional info: After grub starts the kernel I do see something: the grub messages about which disk, kernel line and initrd it used, and finally the line "Decompressing Linux kernel" (or similar, I don't remember it exactly).
After that the screen turns blank; I think I briefly see a cursor or similar in the upper left corner, then the PC reboots.
I also think I hear a "clack" from my CRT monitor, as if it were switching to another graphics mode (although I use no framebuffer for the test).

What I have done meanwhile:
a) Tried without the initrd line in grub - no different from above.
b) Built 2.6.30.1 from abs myself (with the xz-utils 4.999.8beta-3 package) - no better.

Comment by Gerhard Brauer (GerBra) - Monday, 06 July 2009, 21:32 GMT
Additional info:
I have jfs as filesystem (no separate /boot).
A user posted in the thread above that he also had problems with 2.6.31 (and 2.6.30.1) on jfs; he switched to ext4.
If this is of interest, I could clean my /tmp partition and use it (with ext2|3) as /boot.
Comment by Gerhard Brauer (GerBra) - Monday, 06 July 2009, 22:38 GMT
Ok, I could solve my problem. It was jfs on my / (no /boot) partition. After I used a free partition with ext2 as /boot I have no problem with 2.6.30.1.

Before that I also tested:
I took the 2.6.30.1 kernel source from kernel.org and built only the bzImage with our .config. Same effect as above.

So the key seems to be a problem with 2.6.30.1 when kernel+initrd live on a jfs FS. Tomorrow I will reinstall my grub in the MBR...
Nevertheless: there must be a "bug" from kernel.org between 2.6.30 and 2.6.30.1 that relates to jfs in this early stage of the whole boot and load process....

But this is mysterious enough; I bet not all posters with the same problem in the thread above have jfs as the /boot FS....
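
To check whether a machine matches this setup, one can look at which filesystem holds /boot. This is a minimal sketch, assuming a df that supports -T (GNU coreutils does). The function names fs_type_of and boot_on_jfs are made up for illustration and are not part of any package.

```shell
# Print the type of the filesystem that holds a path (default: /boot).
# With -P every filesystem is printed on one line, so the data row is
# always line 2 and the type is always column 2.
fs_type_of() {
    df -P -T "${1:-/boot}" | awk 'NR == 2 { print $2 }'
}

# Succeed only if the given path (default: /boot) lives on JFS.
boot_on_jfs() {
    [ "$(fs_type_of "${1:-/boot}")" = "jfs" ]
}

# Example:
#   fs_type_of /boot
#   boot_on_jfs /boot && echo "GRUB reads kernel and initrd from JFS"
```

Without a separate /boot partition, the reported type is that of the / filesystem.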


Comment by Olivier (litemotiv) - Monday, 06 July 2009, 22:47 GMT
i'm one of the affected people, JFS here too on / (no separate boot partition)

system: intel x86_64 with ati

---
this is the only jfs-related commit in the 2.6.30.1 changelog:

commit 206f0f05bdc291a9358ba59248e2bc44e8b3127d
Author: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Date: Tue Jun 16 13:43:22 2009 -0500

jfs: fix regression preventing coalescing of extents

commit f7c52fd17a7dda42fc9e88c2b2678403419bfe63 upstream.

Commit fec1878fe952b994125a3be7c94b1322db586f3b caused a regression in
which contiguous blocks being allocated to the end of an extent were
getting a new extent created. This typically results in files entirely
made up of 1-block extents even though the blocks are contiguous on
disk.

Apparently grub doesn't handle a jfs file being fragmented into too many
extents, since it refuses to boot a kernel from jfs that was created by
the 2.6.30 kernel.
Comment by Tiberiu Buta (zamolxis) - Tuesday, 07 July 2009, 08:13 GMT
The one that doesn't work here is:

i686 Intel(R) Pentium(R) 4 CPU 3.40GHz GenuineIntel GNU/Linux

File system is jfs.
Comment by Gerhard Brauer (GerBra) - Tuesday, 07 July 2009, 08:56 GMT
I went a step further...
IMHO I would now say: 2.6.30.1 (specifically commit 206f0f05bdc291a9358ba59248e2bc44e8b3127d) fixes a problem with jfs/grub which originally occurs under 2.6.30 (and maybe before).
After reading:
"Apparently grub doesn't handle a jfs file being fragmented into too many
extents, since it refuses to boot a kernel from jfs that was created by the 2.6.30 kernel."
I created a new initrd under the running 2.6.30.1 (after using an ext2 /boot). I unmounted my ext2 /boot and generated the initrd into my original (jfs) /boot dir.
This works; I can now also boot my normal system (/boot on jfs).
And so - AFAIK - it's clear that building new initrd images with 2.6.30 or 2.6.29/28/?? could not solve the problem, when it is first fixed in 2.6.30.1.
(This may affect not only the initrd but also the vmlinuz26 kernel file, if its extents are fragmented.)

So if you have the chance (like me) to boot a 2.6.30.1 kernel, you may solve it by building a new initrd image directly into your jfs /boot dir (maybe simply copying it would also solve it). And/or do a complete (re)install of 2.6.30.1 into your jfs system with: pacman -S kernel26.



Comment by Johannes Krampf (wuischke) - Tuesday, 07 July 2009, 09:55 GMT
I've the same problem after updating, but I want to post my stats as requested in the forums. (I'll try your solution asap.)

It's a FSC Amilo Pro v3205
C2D T5200 - 1,6Ghz DualCore
1024 MB Ram
Intel 945
/ on JFS
Comment by Johannes Krampf (wuischke) - Tuesday, 07 July 2009, 10:28 GMT
(How does one edit a comment?)

It works now after a simple pacman -S kernel26 after booting with Slackware's 2.6.27 kernel. (no KMS btw.)
Comment by Thomas Bächler (brain0) - Tuesday, 07 July 2009, 11:23 GMT
Okay, it seems there is not much we can do about it. So far, everyone who has this problem uses JFS. The solution is to rewrite your kernel and initrd to the filesystem using a kernel <2.6.30 or >=2.6.30.1. This can be done by reinstalling kernel26 or simply cp both files, rm the old ones and mv them to the right name. Am I correct in all this?

This is not even a bug in Linux, but in GRUB's poor filesystem support.
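
The copy-and-rename approach described above could be sketched roughly as follows. It only helps when run under a kernel <2.6.30 or >=2.6.30.1, so that the new copy is written with properly coalesced extents. The helper name rewrite_file and the temporary file suffix are made up for illustration. The initrd name /boot/kernel26.img in the example is an assumption; adjust it to the name in your grub menu.lst.

```shell
# Rewrite a file so that its blocks are freshly allocated by the
# currently running kernel. The body runs in a subshell, so the helper
# variables stay local and "exit" only ends the function.
rewrite_file() (
    f=$1
    tmp=$f.rewrite.$$
    # Make a full copy under a temporary name, keeping mode and times.
    cp -p -- "$f" "$tmp" || exit 1
    # Rename the copy over the original (rm + mv in one step).
    mv -f -- "$tmp" "$f"
)

# Example (as root):
#   rewrite_file /boot/vmlinuz26
#   rewrite_file /boot/kernel26.img   # assumed initrd name
#   sync
```

Reinstalling the kernel26 package has the same effect, since pacman writes both files anew.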
Comment by Gerhard Brauer (GerBra) - Tuesday, 07 July 2009, 11:47 GMT
Thomas: +1

I thought about reporting this on LKML, but I don't see what they can do (for us). They have fixed a problem; only a smooth transition from 2.6.30->2.6.30.1 was not provided....

So closing this report is ok for me, or maybe leave it for some days on Researching so it can be found by others...
Comment by Olivier (litemotiv) - Tuesday, 07 July 2009, 12:17 GMT
does this mean that 2.6.30.2 should be able to install ok from 2.6.30..?

i tried going back to .29 but this ended in a kernel panic, i can't afford to spend a day fixing this so for now i'm stuck with .30. i'm okay with that, as long as i can upgrade to 30.2 from here. i also don't have a means to create a separate non-jfs boot partition.
Comment by Gerhard Brauer (GerBra) - Tuesday, 07 July 2009, 12:24 GMT
@Olivier
No, the best way should be: boot from a medium (Arch install CD, other LiveCD) which has a kernel <2.6.30.
Mount your / (and your /boot partition inside this mount) from your Arch system, e.g.:
sda2 (/) -> /mnt/arch
sda1 (/boot) (if you have a separate /boot) -> /mnt/arch/boot
Do a chroot /mnt
and therein a:
pacman -U /var/cache/pacman/pkg/kernel26-2.6.30.1-1-x86_64.pkg.tar.gz (or i686)
This should solve the problem.
Comment by Gerhard Brauer (GerBra) - Tuesday, 07 July 2009, 12:29 GMT
chroot /mnt/arch
sorry
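
The recovery steps above, with the corrected chroot path, could be sketched as a small script. This is only a sketch under assumptions: the helper names run and reinstall_kernel and the DRY_RUN switch are made up for illustration, and the device names are the examples from the comment above. By default it only prints the commands. Set DRY_RUN=0 and run it as root from the live medium to execute them.

```shell
# Run a command, or only print it while DRY_RUN is 1 (the default).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# reinstall_kernel ROOT_DEV BOOT_DEV PKG
# Pass an empty BOOT_DEV when /boot lives on the root partition.
# PKG is the package path as seen from inside the chroot.
reinstall_kernel() (
    root_dev=$1
    boot_dev=$2
    pkg=$3
    target=/mnt/arch
    run mkdir -p "$target" || exit 1
    run mount "$root_dev" "$target" || exit 1
    if [ -n "$boot_dev" ]; then
        run mount "$boot_dev" "$target/boot" || exit 1
    fi
    # Reinstall the kernel package from the installed system's cache.
    run chroot "$target" pacman -U "$pkg"
)

# Example (JFS /, no separate /boot):
#   reinstall_kernel /dev/sda2 "" \
#       /var/cache/pacman/pkg/kernel26-2.6.30.1-1-x86_64.pkg.tar.gz
```

Reinstalling this way rewrites the kernel and initrd under the live medium's kernel, which is why that kernel must not be 2.6.30 itself.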
Comment by Olivier (litemotiv) - Tuesday, 07 July 2009, 12:58 GMT
that did the trick, thanks Gerhard :-)
Comment by Thomas Bächler (brain0) - Tuesday, 07 July 2009, 13:50 GMT
You cannot upgrade from 2.6.30 to anywhere, because the jfs files written by 2.6.30 will be unreadable by grub. A transition from 2.6.30.1 to 2.6.30.2 or so will be smooth again. To ensure compatibility with grub, you should use a small ext3 for /boot - or use grub2 which is much more capable (I would hope the jfs driver in grub2 is better).
Comment by Sam (rbwsam) - Friday, 10 July 2009, 00:57 GMT
I can confirm that the above method posted by GerBra got my system booting again.

"@Olivier
No, the best way should be: boot from a medium (Arch install CD, other LiveCD) which has a kernel <2.6.30.
Mount your / (and your /boot partition inside this mount) from your Arch system, e.g.:
sda2 (/) -> /mnt/arch
sda1 (/boot) (if you have a separate /boot) -> /mnt/arch/boot
Do a chroot /mnt/arch
and therein a:
pacman -U /var/cache/pacman/pkg/kernel26-2.6.30.1-1-x86_64.pkg.tar.gz (or i686)
This should solve the problem."

JFS '/', no separate '/boot' partition.
Comment by Jens Adam (byte) - Friday, 10 July 2009, 14:32 GMT
Yes, rebuilding the initrd worked.
Damn, this one hit me at the worst possible moment... but luckily only one of my five PCs runs without a dedicated /boot partition.
Comment by Jens Adam (byte) - Friday, 10 July 2009, 14:33 GMT
"Reinstalling the kernel", I meant.
Comment by Roman Kyrylych (Romashka) - Monday, 13 July 2009, 15:55 GMT
so, can this report be closed now and some announcement with instructions posted on mailing list and forums?
Comment by Gerhard Brauer (GerBra) - Monday, 13 July 2009, 16:52 GMT
I think, yes.
In the last few days I saw no more reports on this issue. And we have the threads in bbs where a link to this report is referenced.
