FS#69757 - [linux] linux-5.11.1.arch1-1 Hangs on Boot of Dell Inspiron 3195 (AMD)

Attached to Project: Arch Linux
Opened by Albert Ferrero (aferrero) - Wednesday, 24 February 2021, 07:04 GMT
Last edited by Jan Alexander Steffens (heftig) - Saturday, 27 March 2021, 01:46 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Laptop is a Dell Inspiron 3195 with an AMD A9-9420e (Radeon R5 integrated graphics). The system uses systemd-boot and boots from an F2FS file system.

Upgraded the linux package (5.10.16.arch1-1 -> 5.11.1.arch1-1). On reboot, immediately after systemd-boot, the system appears to hang. No log entries are recorded by journalctl. I tried adding "debug ignore_loglevel earlyprintk=efi,keep log_buf_len=16M" to the systemd-boot kernel parameters, but the screen goes blank after the boot loader and nothing is displayed. This happens every time. The power button is still responsive: a quick press powers the computer off without needing a long press, which suggests that something in the system is still running.

Resolution, so far, has been to boot the system from the latest archlinux USB image, mount the file system, arch-chroot into it and downgrade the linux package (5.11.1.arch1-1 -> 5.10.16.arch1-1) using pacman. On reboot, the system works correctly. I ran "journalctl -k -b -1" to get the logs of the previous boot, but the only logs present are from before the kernel upgrade.

I can reproduce the issue every time I upgrade the linux package to 5.11.1.arch1-1, so I can collect additional logs if required. I'm not sure what more to provide, since nothing is logged during the error state.
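
The recovery described above can be sketched as a small shell helper. This is a minimal sketch under assumptions, not the reporter's exact commands: the device names, the EFI system partition being mounted at /boot, and the cached package path are hypothetical and need adjusting to the actual system. By default the helper only prints the commands; setting RUN=1 executes them from the live USB session.

```shell
# Print (or, with RUN=1, execute) the downgrade steps from a live USB session.
# All device names and the package file name below are hypothetical examples.
recover_downgrade() {
    root_dev=$1   # e.g. /dev/sda2, the F2FS root partition (assumed)
    esp_dev=$2    # e.g. /dev/sda1, the EFI system partition (assumed to live at /boot)
    pkg=$3        # path to the older kernel package, as seen inside the chroot

    # Echo the command unless RUN=1 is set, so the steps can be reviewed first.
    run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi; }

    run mount "$root_dev" /mnt &&
    run mount "$esp_dev" /mnt/boot &&
    run arch-chroot /mnt pacman -U "$pkg"
}
```

For example, `recover_downgrade /dev/sda2 /dev/sda1 /var/cache/pacman/pkg/linux-5.10.16.arch1-1-x86_64.pkg.tar.zst` prints the three commands for review. The package path assumes the old package is still in the pacman cache.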


Closed by  Jan Alexander Steffens (heftig)
Saturday, 27 March 2021, 01:46 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 5.11.10.arch1-1
Comment by loqs (loqs) - Wednesday, 24 February 2021, 14:41 GMT
Try blacklisting dell_wmi_sysman to see if it is FS#69702.
Comment by Albert Ferrero (aferrero) - Wednesday, 24 February 2021, 16:44 GMT
It does not appear to be related.

I upgraded the kernel and added module_blacklist=dell_wmi_sysman to the kernel command line but it did not make a difference in behavior. The system seems to hang immediately after systemd-boot, nothing is displayed on the screen (even after increasing logging levels on the kernel line), and nothing is recorded in the journal.

Looking at FS#69702, it seems like they are able to get some output on screen, which helped them narrow down the issue to that module. I seem to be hanging at a much earlier step in the boot process, immediately after systemd-boot but before the kernel loads. I suspect it's at the initramfs step but I'm not sure how to confirm this.
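
For repeated tests like the one above, a kernel parameter can be appended to the systemd-boot entry instead of being typed at every boot. The helper below is a minimal sketch: the function name is made up for illustration, and the entry path in the comment is an assumption (the actual file under loader/entries/ depends on the installation).

```shell
# Append one kernel parameter to the "options" line of a systemd-boot entry.
# The entry path is hypothetical; adjust it to the real file on the ESP.
add_kernel_param() {
    entry=$1   # e.g. /boot/loader/entries/arch.conf (assumed path)
    param=$2   # e.g. module_blacklist=dell_wmi_sysman
    tmp=$(mktemp) || return 1
    # Rewrite the file with the parameter added to every "options" line,
    # then copy the result back over the original entry.
    awk -v p="$param" '/^options[ \t]/ { print $0 " " p; next } { print }' \
        "$entry" > "$tmp" && cat "$tmp" > "$entry"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Usage would look like `add_kernel_param /boot/loader/entries/arch.conf module_blacklist=dell_wmi_sysman`. Keeping a copy of the entry file beforehand makes it easy to revert.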
Comment by loqs (loqs) - Wednesday, 24 February 2021, 16:56 GMT
Without any output you may have to bisect the kernel to locate the cause.
Comment by Konstantin Shalygin (k0ste) - Wednesday, 24 February 2021, 17:35 GMT
I think my XPS 13 is affected too. I blacklisted dell_wmi_sysman as suggested, with no effect.
Comment by loqs (loqs) - Wednesday, 24 February 2021, 20:30 GMT
If it helps with the bisection here are the first three kernels to test:
https://drive.google.com/file/d/1oQUdNTUAB3R4-o2GED2LYeWxDXmtejVb/view?usp=sharing linux-loqs-5.10-1-x86_64.pkg.tar.zst good?
https://drive.google.com/file/d/1vHLWttOAxtC6dIGNF11diAPbKmGiUiMm/view?usp=sharing linux-loqs-5.11-1-x86_64.pkg.tar.zst bad?
https://drive.google.com/file/d/1jp6strNz5J4vYKT5Lht9wN2FuPN1mqsf/view?usp=sharing linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst ??
Comment by bbo2adwuff (bbo2adwuff) - Wednesday, 24 February 2021, 20:50 GMT
EDIT: Sorry I might have been skimming the text above too quickly, I actually see some of the kernel output during booting. But then it seems to get stuck... So this is what I get with my XPS13 (9350). So not the same issue as in this bug here where there is no output to the console.

---------------------

There is a "simple workaround of switching to another terminal using Ctrl+Alt+F2 and switching back to GDM using Ctrl+Alt+F1 makes the login screen appear. It is just not displayed automatically after booting the system." (source: https://bugs.archlinux.org/task/69055)
Comment by Albert Ferrero (aferrero) - Thursday, 25 February 2021, 05:26 GMT
Thank you for the kernels to test with. I have some interesting results:

linux-loqs-5.10-1-x86_64.pkg.tar.zst - This one sort of works. I can see some activity after systemd-boot but then the screen turns blank and nothing I do brings anything back to the screen short of a reboot. The caps lock key still lights up, making me think it's a display issue. If I blacklist the amdgpu kernel module (kernel parameter module_blacklist=amdgpu) the system boots to a login prompt. The console works but Xorg does not. It seems like the screen blanks out when the amdgpu module loads, which is weird because I haven't had any problems with this on the current 5.10.16 kernel I'm running, or on any of the previous kernels (5.7.9 through 5.10.16).

The other two kernels behave exactly like the standard 5.11.1.arch1-1. Immediately after systemd-boot, the system becomes completely unresponsive (caps lock doesn't light up either, same behavior as standard 5.11.1.arch1-1):
linux-loqs-5.11-1-x86_64.pkg.tar.zst
linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst

Just out of curiosity, I tried blacklisting the amdgpu module on the 5.11.1 and 5.10.r7737* kernels but it seems to have no effect on behavior.
Comment by Albert Ferrero (aferrero) - Thursday, 25 February 2021, 05:30 GMT
Attached is a section of the journalctl output when booting linux-loqs-5.10-1-x86_64.pkg.tar.zst
Comment by loqs (loqs) - Thursday, 25 February 2021, 18:26 GMT
If you blacklist [1] amdgpu can you then reach a command prompt with linux-loqs-5.10-1-x86_64.pkg.tar.zst ?
Edit:
Does it make a difference to the other two kernels?

[1] https://wiki.archlinux.org/index.php/Kernel_module#Using_kernel_command_line
Comment by Albert Ferrero (aferrero) - Friday, 26 February 2021, 02:40 GMT
Correct. If I blacklist amdgpu on linux-loqs-5.10-1-x86_64.pkg.tar.zst I can get to a login prompt.

If I blacklist amdgpu on linux-loqs-5.11-1-x86_64.pkg.tar.zst, no difference.
If I blacklist amdgpu on linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst, no difference.
Comment by loqs (loqs) - Friday, 26 February 2021, 10:19 GMT
https://drive.google.com/file/d/1DE8Vxv9epXf0yBifqu14-uiNwIYVTP6O/view?usp=sharing linux-loqs-5.10.r3428.g15b447361794-1-x86_64.pkg.tar
https://drive.google.com/file/d/1276TscPknfNd1D1Xfy9cUnqi-7cAAuEo/view?usp=sharing linux-loqs-5.10rc3.r1774.gb10733527bfd-1-x86_64.pkg.tar
https://drive.google.com/file/d/1Kj7G4lSX_oRzr382oGsWEt7wf0sVdN4e/view?usp=sharing linux-loqs-5.10rc3.r887.g9713158cb2a9-1-x86_64.pkg.tar

Again with the amdgpu module blacklisted can any of these kernels reach a login prompt?
Comment by Albert Ferrero (aferrero) - Friday, 26 February 2021, 23:10 GMT
Thank you for providing these. Here are the results of my testing with the 3 new packages you provided:

linux-loqs-5.10.r3428.g15b447361794-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc3.r1774.gb10733527bfd-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc3.r887.g9713158cb2a9-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.

~~~~~~~~~~~~

Consolidating the list to all packages tested so far:

linux-loqs-5.10-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.11-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r3428.g15b447361794-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc3.r1774.gb10733527bfd-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc3.r887.g9713158cb2a9-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
Comment by loqs (loqs) - Friday, 26 February 2021, 23:45 GMT
https://drive.google.com/file/d/1fGT1CL1D50gWV4s8Zjq84AwZQ-jAM4c5/view?usp=sharing linux-loqs-5.10.r2584.g2c075f38a708-1-x86_64.pkg.tar

Same again please.
Comment by Albert Ferrero (aferrero) - Saturday, 27 February 2021, 01:53 GMT
Thank you for providing the package. Here is the result:

linux-loqs-5.10.r2584.g2c075f38a708-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.

~~~~~~~~~~~~

Consolidating the list to all packages tested so far:

linux-loqs-5.10-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.11-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r3428.g15b447361794-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc3.r1774.gb10733527bfd-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc3.r887.g9713158cb2a9-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r2584.g2c075f38a708-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
Comment by loqs (loqs) - Saturday, 27 February 2021, 08:43 GMT
https://drive.google.com/file/d/1pzt4li8hvMYgQ31Z6OwEWUCdTzgRpeof/view?usp=sharing linux-loqs-5.10.r3014.g76d4acf22b48-1-x86_64.pkg.tar

Same again please.
Comment by Albert Ferrero (aferrero) - Monday, 01 March 2021, 02:13 GMT
Thank you for providing the package. Here is the result:

linux-loqs-5.10.r3014.g76d4acf22b48-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.

~~~~~~~~~~~~

Consolidating the list to all packages tested so far:

linux-loqs-5.10-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.11-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r3428.g15b447361794-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc3.r1774.gb10733527bfd-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc3.r887.g9713158cb2a9-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r2584.g2c075f38a708-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r3014.g76d4acf22b48-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
Comment by Albert Ferrero (aferrero) - Monday, 01 March 2021, 02:25 GMT
One update: a new version of the kernel, 5.11.2.arch1-1, was released, but it does not correct the issue. It behaves the same as 5.11.1.arch1-1 and hangs.
Comment by loqs (loqs) - Monday, 01 March 2021, 08:50 GMT
Here is the next one:
https://drive.google.com/file/d/1fV6eWGztA541L8NA0y95IE1Bw0Lxr2kn/view?usp=sharing linux-loqs-5.10.r200.gdfefd226b0bf-1-x86_64.pkg.tar.zst
Comment by Jan Alexander Steffens (heftig) - Monday, 01 March 2021, 10:02 GMT
 FS#69767  might be a dupe.
Comment by Albert Ferrero (aferrero) - Monday, 01 March 2021, 22:12 GMT
Thank you for the last package. Here are the results:

linux-loqs-5.10.r200.gdfefd226b0bf-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.

~~~~~~~~~~~~

Consolidating the list to all packages tested so far:

linux-loqs-5.10-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.11-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r3428.g15b447361794-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc3.r1774.gb10733527bfd-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc3.r887.g9713158cb2a9-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r2584.g2c075f38a708-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r3014.g76d4acf22b48-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r200.gdfefd226b0bf-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
Comment by Albert Ferrero (aferrero) - Monday, 01 March 2021, 22:20 GMT
A note: I tried the amd_iommu=off parameter recommended in ticket FS#69767 with kernel 5.11.2.arch1-1 but it had no effect; it still hangs. Reading through that ticket, I also think we're seeing the same problem. The only difference is that I suspect the other ticket is running grub, which is why they get the "Loading initial ramdisk..." message, whereas I'm running systemd-boot, which has no equivalent message.
Comment by loqs (loqs) - Monday, 01 March 2021, 22:55 GMT
The bisection is matching up with  FS#69810 

The next one is
https://drive.google.com/file/d/12OIkV-7_vOG-Y7Z50fsVtHsCIdNmRrr5/view?usp=sharing linux-loqs-5.10.r3165.geb0ea74120e0-1-x86_64.pkg.tar.zst
Was found to be good
https://drive.google.com/file/d/1My36BKy4PhH-oInlH_y5z1zi3UI7oU35/view?usp=sharing linux-loqs-5.10rc7.r2279.g22f07b86d4e5-1-x86_64.pkg.tar.zst
Was found to be good
https://drive.google.com/file/d/1Zp7tS-pp964ztO5Yg6rcAhIKZzUq6Ecx/view?usp=sharing linux-loqs-5.10rc1.r42.g26ab12bb9d96-1-x86_64.pkg.tar.zst
Was found to be bad
https://drive.google.com/file/d/1PQBBrGHGptainsKV6htSVKp66KVtaugc/view?usp=sharing linux-loqs-5.10rc1.r21.g341b4a7211b6-1-x86_64.pkg.tar.zst
Was found to be good
https://drive.google.com/file/d/1I6zF2VqdNBRlmwM2DX_Jss8l8wDK64_6/view?usp=sharing linux-loqs-5.10rc1.r31.g79eb3581bcaa-1-x86_64.pkg.tar.zst

https://bugs.archlinux.org/task/69810#comment197237
Comment by Albert Ferrero (aferrero) - Monday, 01 March 2021, 23:57 GMT
Thank you for providing the packages. Below are the results of my testing:

linux-loqs-5.10.r3165.geb0ea74120e0-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc7.r2279.g22f07b86d4e5-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc1.r42.g26ab12bb9d96-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc1.r21.g341b4a7211b6-1-x86_64.pkg.tar.zst - Hangs *
linux-loqs-5.10rc1.r31.g79eb3581bcaa-1-x86_64.pkg.tar.zst - Hangs *

* On the last two, sometimes I would get the following message before hanging:
[153.217176] irq 3: nobody cared (try booting with the "irqpoll" option)
[153.217324] handlers:
[153.217327] [<000000001335e878>] i2c_dw_isr
[153.217330] Disabling IRQ #3

~~~~~~~~~~~~

Consolidating the list to all packages tested so far:

linux-loqs-5.10-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.11-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r7737.g538fcf57aaee-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10.r3428.g15b447361794-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc3.r1774.gb10733527bfd-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc3.r887.g9713158cb2a9-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r2584.g2c075f38a708-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r3014.g76d4acf22b48-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r200.gdfefd226b0bf-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10.r3165.geb0ea74120e0-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc7.r2279.g22f07b86d4e5-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
linux-loqs-5.10rc1.r42.g26ab12bb9d96-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc1.r21.g341b4a7211b6-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc1.r31.g79eb3581bcaa-1-x86_64.pkg.tar.zst - Hangs
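
Each line of the list above can be turned into a bisect verdict. The sketch below assumes the mapping that the bisect log in this thread appears to use: a kernel that reaches a login prompt with amdgpu blacklisted counts as good, and one that hangs counts as bad. The function names are made up for illustration, and the commit is taken from the `.g<hash>` part of the snapshot package name.

```shell
# Extract the abbreviated commit hash that follows ".g" in a snapshot package name.
pkg_commit() {
    printf '%s\n' "$1" | sed -n 's/.*\.g\([0-9a-f]\{12\}\)-.*/\1/p'
}

# Map one "package - result" line to the git bisect command it implies.
# Assumption: a login prompt with amdgpu blacklisted counts as good.
bisect_verdict() {
    line=$1
    sha=$(pkg_commit "${line%% *}")
    [ -n "$sha" ] || return 1          # release builds such as 5.10-1 carry no hash
    case $line in
        *Hangs*)   echo "git bisect bad $sha" ;;
        *Arrives*) echo "git bisect good $sha" ;;
        *)         return 1 ;;
    esac
}
```

For instance, the line for linux-loqs-5.10.r3428.g15b447361794 yields `git bisect bad 15b447361794`. Lines for the plain 5.10 and 5.11 release packages are skipped because their names carry no commit hash.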
Comment by loqs (loqs) - Tuesday, 02 March 2021, 00:47 GMT
git bisect good eb0ea74120e0f14a6d6454109153d1b4ccf210fc
Bisecting: 31 revisions left to test after this (roughly 5 steps)
[51130d21881d435fad5fa7f25bea77aa0ffc9a4e] x86/ioapic: Handle Extended Destination ID field in RTE

git bisect log
git bisect start
# bad: [f40ddce88593482919761f74910f42f4b84c004b] Linux 5.11
git bisect bad f40ddce88593482919761f74910f42f4b84c004b
# good: [2c85ebc57b3e1817b6ce1a6b703928e113a90442] Linux 5.10
git bisect good 2c85ebc57b3e1817b6ce1a6b703928e113a90442
# bad: [538fcf57aaee6ad78a05f52b69a99baa22b33418] Merge branches 'acpi-scan', 'acpi-pnp' and 'acpi-sleep'
git bisect bad 538fcf57aaee6ad78a05f52b69a99baa22b33418
# bad: [15b447361794271f4d03c04d82276a841fe06328] mm/lru: revise the comments of lru_lock
git bisect bad 15b447361794271f4d03c04d82276a841fe06328
# good: [b10733527bfd864605c33ab2e9a886eec317ec39] Merge tag 'amd-drm-next-5.11-2020-12-09' of git://people.freedesktop.org/~agd5f/linux into drm-next
git bisect good b10733527bfd864605c33ab2e9a886eec317ec39
# good: [2c075f38a708c578a752b738a45e8c26923eac2e] Merge branch 'radeon-fixes' (Radeon and amdgpu fixes)
git bisect good 2c075f38a708c578a752b738a45e8c26923eac2e
# good: [76d4acf22b4847f6c7b2f9042366fbdc3d20f578] Merge tag 'perf-kprobes-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 76d4acf22b4847f6c7b2f9042366fbdc3d20f578
# good: [dfefd226b0bf7c435a58d75a0ce2f9273b9825f6] mm: cleanup kstrto*() usage
git bisect good dfefd226b0bf7c435a58d75a0ce2f9273b9825f6
# good: [eb0ea74120e0f14a6d6454109153d1b4ccf210fc] Merge tag 'x86-fpu-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good eb0ea74120e0f14a6d6454109153d1b4ccf210fc

Your bisection point
https://drive.google.com/file/d/1YMZc5Ds9xTlo7kqw_44bbbiQc1E00aRW/view?usp=sharing linux-loqs-5.10rc1.r32.g51130d21881d-1-x86_64.pkg.tar.zst
 FS#69810 's bisection point
https://drive.google.com/file/d/1I6zF2VqdNBRlmwM2DX_Jss8l8wDK64_6/view?usp=sharing linux-loqs-5.10rc1.r31.g79eb3581bcaa-1-x86_64.pkg.tar.zst
Comment by Albert Ferrero (aferrero) - Tuesday, 02 March 2021, 05:31 GMT
Thank you. I did try linux-loqs-5.10rc1.r32.g51130d21881d-1-x86_64.pkg.tar.zst and it hangs on the system in question. It doesn't behave exactly like the 5.11 kernel in that I have to long press the power button to power off, but it does hang regardless.
Comment by loqs (loqs) - Tuesday, 02 March 2021, 05:59 GMT
git bisect bad
Bisecting: 15 revisions left to test after this (roughly 4 steps)
[e16c8058a10ba8e38d0d1ad0b64e444b245ffdbd] PCI: vmd: Use msi_msg shadow structs

https://drive.google.com/file/d/1o634_qOOEklBkCtdtXhuz2iFHBD4STge/view?usp=sharing linux-loqs-5.10rc1.r16.ge16c8058a10b-1-x86_64.pkg.tar.zst
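
The "roughly N steps" figures in the git bisect output follow from each test cutting the remaining range about in half. A minimal sketch of that arithmetic is below; it is an approximation for illustration, not git's own estimator, and the function name is made up.

```shell
# Rough number of bisect steps for n remaining revisions, assuming each test
# halves the candidate range. An approximation, not git's own estimator.
bisect_steps() {
    n=$(( $1 + 1 ))
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( n / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}
```

`bisect_steps 31` prints 5 and `bisect_steps 15` prints 4, which agrees with the two estimates quoted so far.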
Comment by Albert Ferrero (aferrero) - Wednesday, 03 March 2021, 03:36 GMT
Thanks
linux-loqs-5.10rc1.r16.ge16c8058a10b-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted.
Comment by Song L (songl) - Wednesday, 03 March 2021, 05:46 GMT
This seems critical not medium, since several different combinations from other bugs exist, not only amdgpu.
Comment by loqs (loqs) - Wednesday, 03 March 2021, 12:59 GMT
git bisect good
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[6452ea2a323b80868ce5e6d3030e4ccbeab9dc30] x86/apic: Add select() method on vector irqdomain

https://drive.google.com/file/d/1fId48GV_4UqFD2FI9KYH0vVBlrxnlcRt/view?usp=sharing linux-loqs-5.10rc1.r24.g6452ea2a323b-1-x86_64.pkg.tar.zst
Comment by Albert Ferrero (aferrero) - Wednesday, 03 March 2021, 15:53 GMT
Thanks
linux-loqs-5.10rc1.r24.g6452ea2a323b-1-x86_64.pkg.tar.zst - Hangs
Comment by loqs (loqs) - Wednesday, 03 March 2021, 16:35 GMT
git bisect bad
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[a27dca645d2c0f31abb7858aa0e10b2fa0f2f659] x86/io_apic: Cleanup trigger/polarity helpers

https://drive.google.com/file/d/12qlZOO9n_-GN7EmgYrMmB9K6oNxpRZKG/view?usp=sharing linux-loqs-5.10rc1.r20.ga27dca645d2c-1-x86_64.pkg.tar.zst
Edit:
If the above was good the next one to test is:
https://drive.google.com/file/d/1VU2z679VbF4puMsjMDSUp45BIYGo7uwa/view?usp=sharing linux-loqs-5.10rc1.r22.g5d5a97133887-1-x86_64.pkg.tar.zst
otherwise it is:
https://drive.google.com/file/d/144g6h6miYXHNaYEQwAfYri5f0_PrVdZl/view?usp=sharing linux-loqs-5.10rc1.r18.g41bb2115beec-1-x86_64.pkg.tar.zst
Comment by Albert Ferrero (aferrero) - Thursday, 04 March 2021, 00:45 GMT
Thanks

linux-loqs-5.10rc1.r20.ga27dca645d2c-1-x86_64.pkg.tar.zst - Hangs
linux-loqs-5.10rc1.r18.g41bb2115beec-1-x86_64.pkg.tar.zst - Arrives to login if amdgpu blacklisted
Comment by loqs (loqs) - Thursday, 04 March 2021, 02:07 GMT
git bisect bad
Bisecting: 1 revision left to test after this (roughly 1 step)
[41bb2115beec5e318095a89f5ad4a9c343cb21ad] x86/pci/xen: Use msi_msg shadow structs
git bisect good
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[0c1883c1eb9dfa3c72af6e00425eeb1eb171a03e] x86/msi: Remove msidef.h

https://drive.google.com/file/d/1KqQvjBDmazTmPoesgsiWHY0mW9m5OrCh/view?usp=sharing linux-loqs-5.10rc1.r19.g0c1883c1eb9d-1-x86_64.pkg.tar.zst

Assuming this one is good, as the removal of a file that is no longer used should not make a difference.

git bisect good
a27dca645d2c0f31abb7858aa0e10b2fa0f2f659 is the first bad commit
commit a27dca645d2c0f31abb7858aa0e10b2fa0f2f659
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Sat Oct 24 22:35:19 2020 +0100

x86/io_apic: Cleanup trigger/polarity helpers

'trigger' and 'polarity' are used throughout the I/O-APIC code for handling
the trigger type (edge/level) and the active low/high configuration. While
there are defines for initializing these variables and struct members, they
are not used consequently and the meaning of 'trigger' and 'polarity' is
opaque and confusing at best.

Rename them to 'is_level' and 'active_low' and make them boolean in various
structs so it's entirely clear what the meaning is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-20-dwmw2@infradead.org

arch/x86/include/asm/hw_irq.h | 6 +-
arch/x86/kernel/apic/io_apic.c | 244 +++++++++++++++++-------------------
arch/x86/pci/intel_mid_pci.c | 8 +-
drivers/iommu/amd/iommu.c | 10 +-
drivers/iommu/intel/irq_remapping.c | 9 +-
5 files changed, 130 insertions(+), 147 deletions(-)

git bisect log
git bisect start
# bad: [f40ddce88593482919761f74910f42f4b84c004b] Linux 5.11
git bisect bad f40ddce88593482919761f74910f42f4b84c004b
# good: [2c85ebc57b3e1817b6ce1a6b703928e113a90442] Linux 5.10
git bisect good 2c85ebc57b3e1817b6ce1a6b703928e113a90442
# bad: [538fcf57aaee6ad78a05f52b69a99baa22b33418] Merge branches 'acpi-scan', 'acpi-pnp' and 'acpi-sleep'
git bisect bad 538fcf57aaee6ad78a05f52b69a99baa22b33418
# bad: [15b447361794271f4d03c04d82276a841fe06328] mm/lru: revise the comments of lru_lock
git bisect bad 15b447361794271f4d03c04d82276a841fe06328
# good: [b10733527bfd864605c33ab2e9a886eec317ec39] Merge tag 'amd-drm-next-5.11-2020-12-09' of git://people.freedesktop.org/~agd5f/linux into drm-next
git bisect good b10733527bfd864605c33ab2e9a886eec317ec39
# good: [2c075f38a708c578a752b738a45e8c26923eac2e] Merge branch 'radeon-fixes' (Radeon and amdgpu fixes)
git bisect good 2c075f38a708c578a752b738a45e8c26923eac2e
# good: [76d4acf22b4847f6c7b2f9042366fbdc3d20f578] Merge tag 'perf-kprobes-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 76d4acf22b4847f6c7b2f9042366fbdc3d20f578
# good: [dfefd226b0bf7c435a58d75a0ce2f9273b9825f6] mm: cleanup kstrto*() usage
git bisect good dfefd226b0bf7c435a58d75a0ce2f9273b9825f6
# good: [eb0ea74120e0f14a6d6454109153d1b4ccf210fc] Merge tag 'x86-fpu-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good eb0ea74120e0f14a6d6454109153d1b4ccf210fc
# bad: [51130d21881d435fad5fa7f25bea77aa0ffc9a4e] x86/ioapic: Handle Extended Destination ID field in RTE
git bisect bad 51130d21881d435fad5fa7f25bea77aa0ffc9a4e
# good: [e16c8058a10ba8e38d0d1ad0b64e444b245ffdbd] PCI: vmd: Use msi_msg shadow structs
git bisect good e16c8058a10ba8e38d0d1ad0b64e444b245ffdbd
# bad: [6452ea2a323b80868ce5e6d3030e4ccbeab9dc30] x86/apic: Add select() method on vector irqdomain
git bisect bad 6452ea2a323b80868ce5e6d3030e4ccbeab9dc30
# bad: [a27dca645d2c0f31abb7858aa0e10b2fa0f2f659] x86/io_apic: Cleanup trigger/polarity helpers
git bisect bad a27dca645d2c0f31abb7858aa0e10b2fa0f2f659
# good: [41bb2115beec5e318095a89f5ad4a9c343cb21ad] x86/pci/xen: Use msi_msg shadow structs
git bisect good 41bb2115beec5e318095a89f5ad4a9c343cb21ad
# good: [0c1883c1eb9dfa3c72af6e00425eeb1eb171a03e] x86/msi: Remove msidef.h
git bisect good 0c1883c1eb9dfa3c72af6e00425eeb1eb171a03e
# first bad commit: [a27dca645d2c0f31abb7858aa0e10b2fa0f2f659] x86/io_apic: Cleanup trigger/polarity helpers
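
A saved log like the one above can be fed back to git to restore the bisection state in another checkout, using `git bisect replay`. The helper below is a minimal sketch; the function name and both paths are placeholders, and the log file should be given as an absolute path because `git -C` changes directory first.

```shell
# Restore a bisection from a saved "git bisect log" transcript.
# repo and logfile are placeholders for a kernel checkout and the saved log.
replay_bisect() {
    repo=$1      # path to a clone of the kernel tree (assumed)
    logfile=$2   # absolute path to a file holding the "git bisect log" output
    git -C "$repo" bisect replay "$logfile" &&
    git -C "$repo" bisect log | tail -n 1   # show the last recorded step
}
```

Afterwards `git bisect reset` in the same checkout returns the tree to the branch it was on before the bisection.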
Comment by Albert Ferrero (aferrero) - Thursday, 04 March 2021, 02:42 GMT
Assumption is correct.
linux-loqs-5.10rc1.r19.g0c1883c1eb9d-1-x86_64.pkg.tar.zst - Arrives to login prompt only if amdgpu is blacklisted
Comment by loqs (loqs) - Thursday, 04 March 2021, 02:56 GMT
I would suggest following the reply-to instructions from https://lore.kernel.org/kvm/20201024213535.443185-20-dwmw2%40infradead.org/ and documenting that the commit prevents booting on your system with 5.11.
Make sure to CC Thomas Gleixner <tglx@linutronix.de> and David Woodhouse <dwmw@amazon.co.uk>
Or open a bug on https://bugzilla.kernel.org under product drivers, component IOMMU.
Comment by Albert Ferrero (aferrero) - Friday, 05 March 2021, 08:00 GMT
Thanks, I have submitted a bug upstream: https://bugzilla.kernel.org/show_bug.cgi?id=212069

Also, reviewing the link from the email, I did try booting the 5.11.2 kernel with kernel parameters "iommu=off" and "acpi=off".

With "iommu=off" the system boots but is very sluggish and unstable. Networking (and possibly other systems) is unavailable, and the kernel panics on shutdown.

With "acpi=off" the system boots and seems stable, with no kernel panic on shutdown. However, the wireless network drivers refuse to load.
Comment by Konstantin Shalygin (k0ste) - Saturday, 06 March 2021, 19:02 GMT
My XPS 13 has no AMD graphics and was still stuck on 5.11.2. Thanks @aferrero, I found why: I disabled WiFi and Bluetooth in the BIOS and now the laptop boots fine. An ath10k issue?
Comment by Albert Ferrero (aferrero) - Tuesday, 09 March 2021, 03:14 GMT
Just tried 5.11.4 but the issue is still present on this specific computer. It still hangs on boot.

@k0ste, I saw in the email trail that @loqs linked that someone tried booting with those kernel parameters and could boot so I thought I would give it a try. It does work, but no wireless. This computer also uses the ath wifi drivers, but I don't think it's an issue with the driver, rather a consequence of booting with acpi=off. I observe the same behavior with the LTS kernel (5.10.21), meaning the LTS kernel works just fine but if I boot with acpi=off the wireless stops working. At the moment, I just have this one computer using the LTS kernel until upstream corrects the issue.
Comment by loqs (loqs) - Tuesday, 16 March 2021, 21:50 GMT
As pointed out in [1] a27dca645d2c0f31abb7858aa0e10b2fa0f2f659 was fixed by aec8da04e4d71afdd4ab3025ea34a6517435f363 which was added before linux 5.11-rc1, so the bisection found the wrong bug.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=212069
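
Whether a fix commit is already contained in a given release can be checked in a kernel git tree with `git merge-base --is-ancestor`. The helper below is a minimal sketch; the function name and the repository path are made up, and it assumes the clone carries the upstream release tags such as v5.11-rc1.

```shell
# Succeeds (exit status 0) if <commit> is an ancestor of <release>,
# i.e. the commit is already included in that release.
commit_in_release() {
    repo=$1      # path to a clone of the kernel tree (assumed)
    commit=$2    # e.g. aec8da04e4d71afdd4ab3025ea34a6517435f363
    release=$3   # e.g. v5.11-rc1 (assumes upstream tags are present)
    git -C "$repo" merge-base --is-ancestor "$commit" "$release"
}
```

Usage would look like `commit_in_release ~/src/linux aec8da04e4d71afdd4ab3025ea34a6517435f363 v5.11-rc1 && echo included`, where ~/src/linux is a hypothetical checkout location.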
Comment by Albert Ferrero (aferrero) - Monday, 22 March 2021, 01:29 GMT
Yeah, it seems like that's the case. I did try again with 5.11.7 as well and the problem persists. Does this mean that aec8da04e4d71afdd4ab3025ea34a6517435f363 should work and the problem is probably between aec8da04e4d71afdd4ab3025ea34a6517435f363 and 5.11?
Comment by Albert Ferrero (aferrero) - Saturday, 27 March 2021, 01:42 GMT
Good news, seems like 5.11.10 has corrected the issue. I'm able to boot the computer with this kernel.
