FS#55651 - [linux] Boot freeze after "::running early hook [lvm2]" with linux>=4.13

Attached to Project: Arch Linux
Opened by Freya Gentz (zegentz) - Saturday, 16 September 2017, 19:50 GMT
Last edited by freswa (frederik) - Sunday, 13 September 2020, 14:02 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Jan Alexander Steffens (heftig)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
The boot has frozen on the following screen since I upgraded to linux 4.13:
"
::running early hook [udev]
starting version 234
::running early hook [lvm2]
_
"
The freezes continued after I upgraded to linux 4.13.1 and linux 4.13.2. It happens with both the linux and linux-hardened kernels.

No log is left behind. I've attached a boot log from linux-lts.
I've attached a copy of my partition data.
I've attached my fstab and crypttab.
I've attached the contents of the files in grub.d in order.
I've attached my mkinitcpio.

Additional info:
lib32-util-linux 2.30.1-1
libutil-linux 2.30.1-2
linux 4.13.2-1
linux-api-headers 4.12.7-1
linux-docs 4.13.2-1
linux-firmware 20170907.a61ac5c-1
linux-hardened 4.13.1.b-1
linux-hardened-docs 4.13.1.b-1
linux-hardened-headers 4.13.1.b-1
linux-headers 4.13.2-1
linux-lts 4.9.50-1
linux-lts-docs 4.9.50-1
linux-lts-headers 4.9.50-1
util-linux 2.30.1-2
lvm2 2.02.174-1

Steps to reproduce:
1- Start PC
2- Type in password to decrypt boot partition
3- Select correct kernel then hit enter
4- Watch as it gets stuck on that screen

Closed by  freswa (frederik)
Sunday, 13 September 2020, 14:02 GMT
Reason for closing:  Fixed
Comment by loqs (loqs) - Saturday, 16 September 2017, 20:16 GMT
Some similarity to  FS#55537 
Comment by loqs (loqs) - Saturday, 16 September 2017, 22:06 GMT
If you add the sysrq_always_enabled parameter to the boot options, can you use https://en.wikipedia.org/wiki/Magic_SysRq_key to reboot the system?
If you add the boot option loglevel=7 and remove the quiet option (if present), is any more output produced?
Other things to try: the boot option scsi_mod.use_blk_mq=0, switching to the systemd hooks (https://wiki.archlinux.org/index.php/Mkinitcpio#HOOKS), or bisecting between 4.12 and 4.13.
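
The command-line changes suggested above can be sketched as a small Python helper. This is only an illustration: the function name debug_cmdline is hypothetical, it works on a plain string rather than on any bootloader configuration, and the example command line is made up.

```python
def debug_cmdline(cmdline: str) -> str:
    """Return a kernel command line adjusted for debugging a boot hang.

    Drops "quiet" and any existing loglevel= option, then appends
    loglevel=7 and sysrq_always_enabled if they are not already there.
    """
    kept = [
        tok for tok in cmdline.split()
        if tok != "quiet" and not tok.startswith("loglevel=")
    ]
    for extra in ("loglevel=7", "sysrq_always_enabled"):
        if extra not in kept:
            kept.append(extra)
    return " ".join(kept)


if __name__ == "__main__":
    # Hypothetical example command line; real values differ per system.
    print(debug_cmdline("root=/dev/mapper/vg-root rw quiet loglevel=3"))
```

The result still has to be entered by hand, for example by editing the entry in the GRUB menu before booting.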
Comment by Freya Gentz (zegentz) - Sunday, 17 September 2017, 06:59 GMT
> If you add the sysrq_always_enabled parameter to boot options can you use https://en.wikipedia.org/wiki/Magic_SysRq_key to reboot the system?
Yes
> If you add the boot options loglevel=7 and remove the quiet option if present is any more output produced?

Yes, lots. The last line was "fb: switching to amdgpudrmfb from EFI VGA". My phone was dead, so I couldn't take a picture.
Interestingly, after I went from radeon to amdgpu, on 4.12 and lts I used to get a colorful line across the center of my screen before it changed resolution. However, on 4.13 my screen doesn't change resolution and I don't get that line.

> switch to the systemd hooks https://wiki.archlinux.org/index.php/Mkinitcpio#HOOKS
Doesn't help

I plan to do a git bisect sometime over the next week.
Comment by loqs (loqs) - Sunday, 17 September 2017, 12:34 GMT
If you add the boot parameter rd.debug does it show which system call the early lvm2 hook hangs on?
Comment by Freya Gentz (zegentz) - Sunday, 17 September 2017, 17:50 GMT
No, adding rd.debug=1 doesn't do anything. I've attached a picture of the screen it hangs on.
Comment by loqs (loqs) - Sunday, 17 September 2017, 18:59 GMT
You could try blacklisting both radeon and amdgpu modules in case that is the issue instead of lvm: modprobe.blacklist=amdgpu,radeon
Otherwise I am out of ideas apart from the bisection. When doing the bisection please check that you can boot the 4.12 kernel you build to avoid having a false good start point.
Comment by Freya Gentz (zegentz) - Wednesday, 20 September 2017, 04:32 GMT
Blacklisting amdgpu, unblacklisting radeon, replacing the amdgpu hook with radeon and setting Xorg to use radeon fixes this issue.

Sometime this week I will hopefully start bisecting the kernel, just need to find time.
Comment by Freya Gentz (zegentz) - Thursday, 21 September 2017, 01:57 GMT
Searching around got me to this post on the manjaro forums: https://forum.manjaro.org/t/solved-kernel-4-13-doesnt-boot-with-amdgpu/30770

The suggested fix is to add “radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1” to the kernel command line. Undoing the changes from my last comment and adding those options fixes the issue.

Mention of this was added to the wiki on Sept 12.
Comment by lukas svärdkvist (sleepyoh) - Friday, 29 September 2017, 15:00 GMT
I have the same bug; it freezes at the same spot. I blacklisted amdgpu and use radeon instead, which works fine. I saw errors from amdgpu in the journal, and I will post my logs here next time I have access to my computer.

Comment by Isa CIchon (nQL) - Monday, 09 October 2017, 12:17 GMT
I suspect that I am affected by the same bug. Since upgrading from linux-4.12.13-1-x86_64 to kernel 4.13.3, the boot hangs after "running early hook [lvm2]" and before asking for a password to decrypt the root partition. I get the same error with kernel 4.13.4. The GPU in my PC is an AMD R9 390. I have previously been using the ati (or radeon) drivers for it and did not have the amdgpu package installed until I found this bug report. Now I have both xf86-video-amdgpu 1.4.0-1 and xf86-video-ati 1:7.10.0-1 installed.

Blacklisting either or both of radeon and amdgpu on the kernel command line or adding “radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1” did not fix this issue.

Reverting to kernel 4.12.13, or running on the Intel integrated graphics by changing the default graphics adapter in my BIOS settings, works fine.

Note: I am using the 'Arch Linux (AMD graphics)' menuentry from my grub.cfg and modifying the command line in grub.
Comment by Freya Gentz (zegentz) - Sunday, 15 October 2017, 00:12 GMT
I don't think you have the same issue; yours is getting jammed on something related to vfio. You might have better luck filing a separate bug report.

Anyway, I would try to confirm that:
A) in 4.12.13 your PC is using amdgpu/radeon
B) if your GPU is SI or CIK, you've followed https://wiki.archlinux.org/index.php/AMDGPU#Enable_Southern_Islands_.28SI.29_and_Sea_Islands_.28CIK.29_support
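
One way to check point A is to look at the "Kernel driver in use" lines that lspci -k prints for the graphics device. A minimal Python sketch follows, assuming that output format. The function name drivers_in_use and the SAMPLE text are hypothetical illustrations, not output from any system in this report.

```python
import re

# Illustrative input only; device names here are placeholders.
SAMPLE = """\
01:00.0 VGA compatible controller: Example GPU vendor, example device
\tSubsystem: Example subsystem
\tKernel driver in use: radeon
\tKernel modules: radeon, amdgpu
02:00.0 Ethernet controller: Example NIC
\tKernel driver in use: e1000e
"""

GPU_CLASSES = re.compile(
    r"VGA compatible controller|Display controller|3D controller"
)


def drivers_in_use(lspci_k_output: str) -> list:
    """Collect the "Kernel driver in use" values for graphics devices."""
    drivers = []
    in_gpu = False
    for line in lspci_k_output.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new PCI device entry.
            in_gpu = bool(GPU_CLASSES.search(line))
        elif in_gpu:
            match = re.match(r"\s*Kernel driver in use:\s*(\S+)", line)
            if match:
                drivers.append(match.group(1))
    return drivers


if __name__ == "__main__":
    print(drivers_in_use(SAMPLE))
```

On a real system the input would be the text printed by lspci -k on the booted 4.12.13 kernel.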

Comment by Vu Duc Tung (tunggad) - Tuesday, 31 October 2017, 10:06 GMT
Adding “radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1” to the grub kernel command line helps!!!

my system:

kernel version: 4.13.9-1-ARCH
graphic card: Radeon HD 7850

But what do the above kernel settings do? Are they really the solution, or is there another long-term solution?
Comment by Vu Duc Tung (tunggad) - Tuesday, 31 October 2017, 11:06 GMT
This link explains why the kernel parameters are needed as of kernel version 4.13:

https://www.phoronix.com/scan.php?page=article&item=linux-413-gcn101&num=1

"Beginning with Linux 4.13, AMDGPU and Radeon GCN 1.0/1.1 support can co-exist nicer thanks to some new module options added. Even if blacklisting the Radeon DRM, AMDGPU doesn't have GCN 1.0/1.1 support by default but requires setting amdgpu.cik_support=1 for GCN 1.1 support and amdgpu.si_support=1 for GCN 1.0 support. To get Radeon DRM to not bind to these generations of GPUs, radeon.si_support=0 and radeon.cik_support=0 must be set. So basically if you want to get AMDGPU working for Sea Islands and Southern Islands GPUs on Linux 4.13+, you need to append "radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1" to your kernel command line when booting the system."
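
To see whether a given kernel command line already carries the four options quoted above, a check along these lines could be used. This is a rough Python sketch: the names REQUIRED and missing_options are hypothetical, and splitting on whitespace is a simplification that ignores quoted values.

```python
# The four options named in the quoted article, with the values it gives.
REQUIRED = {
    "radeon.si_support": "0",
    "radeon.cik_support": "0",
    "amdgpu.si_support": "1",
    "amdgpu.cik_support": "1",
}


def missing_options(cmdline: str) -> list:
    """Return the name=value options from REQUIRED that cmdline lacks.

    An option that is present with a different value counts as missing.
    """
    present = dict(
        tok.split("=", 1) for tok in cmdline.split() if "=" in tok
    )
    return [
        f"{name}={value}"
        for name, value in REQUIRED.items()
        if present.get(name) != value
    ]


if __name__ == "__main__":
    # Hypothetical command line; on a running system the text could be
    # read from /proc/cmdline instead.
    example = "root=/dev/sda1 rw radeon.si_support=0 amdgpu.si_support=1"
    print(missing_options(example))
```

An empty list means all four options are set as described. Which pair actually matters depends on whether the GPU is Southern Islands or Sea Islands.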
