FS#34358 - [linux] 3.8.x - 3.9.x System Hangs Without Any Message at Boot

Attached to Project: Arch Linux
Opened by Sudhir Khanger (donniezazen) - Monday, 18 March 2013, 16:33 GMT
Last edited by Tobias Powalowski (tpowa) - Tuesday, 06 August 2013, 13:45 GMT
Task Type: Bug Report
Category: Kernel
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Thomas Bächler (brain0)
Architecture: All
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 14
Private: No

Details

Description: I did a clean installation of Arch Linux on a ThinkPad T420i using EFISTUB/rEFInd. The system hangs as soon as it tries to load the kernel. The issue exists with both linux-3.7.10-1 and linux-3.8.3-2 on my system. When I try to boot the fallback image, it takes me back to the rEFInd screen, and if I then select the stock kernel it boots fine. I have to try booting the fallback a few times for this to work. I followed the Beginners' Guide very closely and believe I followed the instructions properly.


Additional info:
* package version(s)
linux-3.7.10-1
linux-3.8.3-2

* config and/or log files etc.
Please let me know which log files I need to provide.

* forum discussion
https://bbs.archlinux.org/viewtopic.php?id=156670

Steps to reproduce:
Just try to boot stock kernels.

Thanks.

Closed by Tobias Powalowski (tpowa)
Tuesday, 06 August 2013, 13:45 GMT
Reason for closing: No response
Comment by Artem A Klevtsov (unikum) - Monday, 18 March 2013, 19:28 GMT
Comment by cfr (cfr42) - Wednesday, 20 March 2013, 23:08 GMT
I have the same issue except that I get the same hang with the fallback image - it does not take me back to the rEFInd screen. I can boot using grub from the rEFInd screen but I cannot boot the kernel using either image successfully.

I did not have the issue with linux-3.7.10-1. I first see the issue with linux-3.8.3-2.
Comment by Rod Smith (srs5694) - Thursday, 21 March 2013, 19:28 GMT
According to users in https://bbs.archlinux.org/viewtopic.php?pid=1247353#p1247353 (see posts 119-123, perhaps others), using a version of rEFInd compiled with GNU-EFI works around the problem. (Arch's rEFInd, and the official upstream rEFInd, are compiled with TianoCore EDK II.) Using the version 1 EFI shell also works for some users whereas the version 2 EFI shell fails. Some users say that gummiboot, which is built with GNU-EFI, fails, so it's not quite as simple as GNU-EFI being good and TianoCore being bad, but there does seem to be a correlation in at least some cases.
Comment by Vasya Pupkin (shahid) - Friday, 22 March 2013, 21:30 GMT
I have 99% the same problem as described, but:
- I have NO rEFInd. Just a plain old Toshiba laptop.
- I have NOT tested on 3.7.10. I updated 3.7.4 -> 3.8.4 just now and see the issue.
- The issue is NOT reproducible on the 3.7.4 and 3.6.10 kernels with mkinitcpio 0.13.0-1.
- The issue IS reproducible on the 3.8.4-ARCH kernel.

The "fallback" initrd does NOT help.
Rolling back the rootfs (via LVM snapshot) does NOT help.
Downgrading the kernel and initramfs helps 100%.
earlyprintk loglevel=7 shows nothing related.

------ My /etc/mkinitcpio.conf: -------
MODULES="reiserfs ext2 ext4 btrfs"
BINARIES=""
FILES=""
HOOKS="base udev autodetect modconf block keyboard encrypt lvm2 filesystems fsck"
COMPRESSION="xz"
----------------------------------

As you can see, there is a LUKS-encrypted LVM, and the LVM contains snapshots of rootfs, /var, etc.
I've tried to boot the OS in different configurations. Conclusions:
- According to the HDD LED and loglevel=7, boot silently stops after LVM2 activation and before "fsck". (If snapshots are enabled, activation takes some time (30-40 seconds), so I'm sure that LVM is activated before the hang.)
- 100% the same behaviour if I pass an INVALID initrd to the loaded kernel. For example, boot the 3.6.10-ARCH kernel with the 3.8.4-ARCH initramfs: it will ask for the password, it will activate LVM and it will silently hang.
- The kernel parameters "loglevel=7" and "debug" show nothing related to this problem (or nothing at all).
- The problem is NOT related to systemd/rootfs/etc. As said earlier, downgraded kernels boot everything OK.
- The problem is inside the initramfs and somehow related to the kernel version.

Is there any way to debug initrd hooks?
Comment by Vasya Pupkin (shahid) - Saturday, 23 March 2013, 08:35 GMT
OK, I'm able to boot 3.8.4-1-ARCH. My algorithm:
kernel params += "disablehooks=lvm2 break=y"
In busybox console:
$ mkdir /run/lvm
$ lvmetad
$ lvm
> vgchange -a y
> ^D
$ ^D
And system boots OK with some early warnings about lvmetad (already running).
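
The manual steps above can be condensed into one shell function. This is only a sketch, assuming a POSIX shell such as busybox ash. The names `run`, `activate_lvm` and `DRY_RUN` are hypothetical and do not come from mkinitcpio or LVM, and `mkdir -p` is used instead of plain `mkdir` so the function can be re-run safely.

```shell
#!/bin/sh
# Sketch of the manual LVM activation typed into the initramfs shell above.
# DRY_RUN=1 (hypothetical switch) prints each command instead of running it.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

activate_lvm() {
    run mkdir -p /run/lvm || return 1   # runtime directory that was created by hand
    run lvmetad || return 1             # start the LVM metadata daemon
    run lvm vgchange -a y               # activate all volume groups
}
```

Calling `DRY_RUN=1 activate_lvm` lists the three commands without touching the system, while a plain `activate_lvm` would execute them in order and stop at the first failure.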

I've tried kernel parameters like initcall_debug, debug, udev.loglevel=7, etc., but they show nothing interesting.

If I exit from busybox without activating LVM, /init will NOT find the root device and returns me to the busybox console after 10 seconds.

Ideas?
Comment by cfr (cfr42) - Saturday, 23 March 2013, 22:25 GMT
I am seeing this issue with 3.8.4 as well. I don't mean shahid's issue (that's a different problem) but the issue targeted in the initial bug report. So 3.8.* is looking bad for me, whereas 3.7.* all worked fine. I haven't yet tried the alternative binary for rEFInd but I will probably have a go with that shortly.

I would be happy to provide any information which might be helpful although I can't think what that might be.

@Vasya Pupkin (shahid)

That's not the same bug. It is a different problem (which is also discussed on the forums and reported, I think) related specifically to LVM.

The symptoms you are seeing are different. You are getting much further in the boot process. This bug report concerns an issue which occurs earlier in the process.

EDIT: shahid, see https://bugs.archlinux.org/task/33851.
Comment by cfr (cfr42) - Tuesday, 26 March 2013, 03:44 GMT
Comment by Matthias Kleemann (mkleemann) - Friday, 05 April 2013, 14:49 GMT
I think this may be the already solved rEFInd issue. With refind-efi 0.6.8-1 it works with linux 3.7.10-1 on my MacBook. No linux 3.8.x kernel works, though.
Comment by Vasya Pupkin (shahid) - Saturday, 06 April 2013, 20:37 GMT
(Just for information) My "similar" problem (without EFI) has gone away with today's update to kernel 3.8.5.
Comment by cfr (cfr42) - Monday, 08 April 2013, 02:44 GMT
No 3.8.* boots for me, either. (Not with the stub loader - grub works fine.)

Too bad this isn't assigned. This is probably linked to the rEFInd/gummiboot bugs as well, but it is likely something in the kernel. (The developer of rEFInd seems to think it has something to do with the way the kernel is being compiled, or with the compiler.)
Comment by cfr (cfr42) - Wednesday, 10 April 2013, 08:15 GMT
  • Field changed: Percent Complete (100% → 0%)
As noted in my last comment, this issue is not fixed for everyone. In fact, the issue only appeared for me with the 3.8.* kernels. It makes no difference whether I use Arch rEFInd or the alternate binary, no 3.8.* kernel boots for me this way. Boots fine with rEFInd + grub but not rEFInd + STUB loader.

The developer of rEFInd advised me to request this bug be reopened. https://bbs.archlinux.org/viewtopic.php?pid=1256680#p1256680
Comment by Matthias Kleemann (mkleemann) - Wednesday, 10 April 2013, 08:51 GMT
As mentioned above, same behaviour here with EFI v1 (MacBook 2007). Using the EFI stub no longer works with 3.8.x kernels.

EFI v2 (64bit Atom-Board) works fine with EFI-Stub and refind. No such issues there.
Comment by cfr (cfr42) - Wednesday, 10 April 2013, 23:38 GMT
Just thought I would note that I updated my BIOS to the latest available from Lenovo today but I still can't boot using rEFInd + the STUB loader. (I'm also getting a boot error but I think that is Lenovo's fault and it seems not to have affected the firmware upgrade at all.)

EDIT: Lenovo have fixed the boot error they were responsible for. The problem with rEFInd + STUB loader remains. rEFInd -> grub -> kernel works fine but rEFInd -> STUB loader fails every time.
Comment by Tobias Powalowski (tpowa) - Thursday, 23 May 2013, 19:55 GMT
Status on 3.9 kernels?
Comment by cfr (cfr42) - Thursday, 30 May 2013, 22:37 GMT
The issue affects all kernels in 3.8.* and 3.9.* which I have tried. (Basically, all kernels from Arch's stable repo.)

I can reproduce the issue with a direct EFI boot menu entry as well as using the EFI stub loader from a rEFInd menu. As usual, grub via rEFInd works fine.
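
For readers unfamiliar with a direct EFI boot menu entry, the sketch below prints an `efibootmgr` command that would register the kernel's EFI stub with the firmware. Every value in it is an assumption for illustration and not taken from this report: the ESP on `/dev/sda` partition 1, the root filesystem on `/dev/sda2`, the kernel and initramfs at the top level of the ESP, and the label text. The function name `efistub_entry_cmd` is hypothetical.

```shell
#!/bin/sh
# Hypothetical layout; override via the environment to match the real system.
ESP_DISK="${ESP_DISK:-/dev/sda}"    # disk holding the EFI System Partition
ESP_PART="${ESP_PART:-1}"           # partition number of the ESP
ROOT_DEV="${ROOT_DEV:-/dev/sda2}"   # root filesystem passed to the kernel

# Print (not run) the efibootmgr command that creates a direct EFISTUB entry.
efistub_entry_cmd() {
    printf '%s\n' "efibootmgr --create --disk $ESP_DISK --part $ESP_PART --label 'Arch Linux (EFISTUB)' --loader '\\vmlinuz-linux' --unicode 'root=$ROOT_DEV ro initrd=\\initramfs-linux.img'"
}
```

The function only prints the command so that it can be reviewed first; the printed line would then have to be run as root to actually write the boot entry.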
Comment by Tobias Powalowski (tpowa) - Tuesday, 30 July 2013, 10:35 GMT
Status on 3.10.x?
