FS#76736 - After updating to kernel 6.0.11-arch1-1 the system hangs at booting

Attached to Project: Arch Linux
Opened by Radoslav Nenchovski (rado84) - Sunday, 04 December 2022, 02:53 GMT
Last edited by Toolybird (Toolybird) - Thursday, 12 January 2023, 04:42 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: No-one
Architecture: x86_64
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 2
Private: No

Details

Description:


Additional info:
* package version(s): 6.0.11-arch1-1

Steps to reproduce:
Update to 6.0.11-arch1-1 from 6.0.8-arch1-1, then reboot.
Observe that the system hangs and the screen starts flickering, which I recorded on video:
https://www.youtube.com/watch?v=Y6cchQFW274

P.S. This video's description includes nvidia as well because both nvidia and the latest kernel cause the same problem. I performed a separate update - once with linux kernel only and once with nvidia only. Each of them separately led to the same result.

Closed by Toolybird (Toolybird)
Thursday, 12 January 2023, 04:42 GMT
Reason for closing: Fixed
Additional comments about closing: See comments
Comment by Toolybird (Toolybird) - Sunday, 04 December 2022, 03:33 GMT
> I performed a separate update - once with linux kernel only and once with nvidia only

That's a partial upgrade i.e. unsupported.

kernel and nvidia drivers *must* be updated together.

Have you regenerated your initramfs as per FS#76708?
Comment by Radoslav Nenchovski (rado84) - Sunday, 04 December 2022, 06:08 GMT
They were updated together and the result was the same. That's why I performed the partial upgrade - I wanted to see which of the two causes the problem, but apparently it's both of them.
Comment by Nicholas Hadaller (hadallen) - Wednesday, 07 December 2022, 04:07 GMT
After upgrading to 6.0.11 today, I experienced very similar lockups after boot. I could boot and log in to either the graphical target or the command line, but the system would crash (usually within seconds) and become unresponsive. Sometimes the computer would restart itself. The linux-zen kernel seemed to crash faster than the linux kernel. I reinstalled Arch on a separate SSD with very few packages, and experienced the same crashes.

OS: Arch Linux x86_64
Host: X570 AORUS ELITE WIFI -CF
Kernel: 6.0.10-zen2-1-zen
CPU: Ryzen 7 5700G with Radeon Graphics (16) @ 3.8GHz
GPU: NVIDIA GeForce RTX 3060
Memory: 3858MiB / 64173MiB
Comment by Toolybird (Toolybird) - Wednesday, 07 December 2022, 05:25 GMT
We need log files. Unless proper information is forthcoming, nothing can be done here in the bug tracker. Best to take it to the proper support channels (forum/IRC/etc) to see if others can help debug it. At a guess, it's probably nvidia drivers. For "process of elimination" purposes, please try removing nvidia and boot only with nouveau and see what happens.
Comment by Nicholas Hadaller (hadallen) - Wednesday, 07 December 2022, 06:48 GMT
I have attached journal boot logs for 6.0.11/nouveau, 6.0.10/nouveau, and 6.0.11/nvidia-dkms. I experienced the same crash with 6.0.11/nouveau as I did with nvidia, however with 6.0.10 I was able to boot properly (to a command line, at least - I don't think nouveau works with the 3060). I don't have any Xorg logs from a crashing session - I should have time tomorrow to continue, let me know any specific things/logs I can retrieve to help narrow it down.
Comment by q rty (q234rty) - Thursday, 08 December 2022, 07:30 GMT
This might be a bit of a long shot, but could you guys try booting with module_blacklist=amd_pstate?
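
To check after a reboot that the parameter was actually picked up, the kernel command line can be parsed. The following is a minimal Python sketch, not something from the kernel or Arch tooling: the function name `blacklisted_modules` and the sample command line are made up for illustration, and it assumes that if `module_blacklist=` appears more than once, the last occurrence wins.

```python
def blacklisted_modules(cmdline: str) -> set:
    """Return the module names listed in module_blacklist= on a kernel command line."""
    modules = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "module_blacklist" and sep:
            # Assumption: a later module_blacklist= entry replaces an earlier one.
            modules = {name for name in value.split(",") if name}
    return modules


# Hypothetical command line; on a running system read /proc/cmdline instead.
sample = "root=/dev/sda2 rw module_blacklist=amd_pstate quiet"
print("amd_pstate" in blacklisted_modules(sample))
```

On the affected machine, `blacklisted_modules(open("/proc/cmdline").read())` should contain `amd_pstate` if the boot entry was edited correctly.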
Comment by Nicholas Hadaller (hadallen) - Thursday, 08 December 2022, 17:14 GMT
Hello, I tried that this morning and had the same system failure.
Comment by Toolybird (Toolybird) - Sunday, 11 December 2022, 05:15 GMT
Ok, not much in the way of clues in those logs. Still happening with linux-6.0.12.arch1-1 / nvidia-525.60.11-3?

FWIW, works fine for me in a VM with Nvidia card passed through.

Edit: Actually, in that latest log there are warnings:

"NVRM: Try unloading the conflicting kernel module (and/or"

Are you sure you regenerated the initramfs properly? If using mkinitcpio, point 5 in the wiki [1] now says "Remove kms from the HOOKS array in /etc/mkinitcpio.conf and regenerate the initramfs."

[1] https://wiki.archlinux.org/title/NVIDIA#Installation
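
The kms removal quoted from the wiki can be sketched in Python. This is only an illustration under assumptions: `remove_hook` is a made-up helper, not part of mkinitcpio; it only touches uncommented `HOOKS=(...)` lines and rewrites the array on a single line. The sample config text is hypothetical.

```python
import re


def remove_hook(conf_text: str, hook: str = "kms") -> str:
    """Return conf_text with `hook` dropped from every uncommented HOOKS=(...) array."""
    # Match only lines that start with HOOKS=( so commented-out examples are left alone.
    pattern = re.compile(r"^([ \t]*HOOKS=\()([^)]*)(\).*)$", re.MULTILINE)

    def strip_hook(match):
        kept = [name for name in match.group(2).split() if name != hook]
        return match.group(1) + " ".join(kept) + match.group(3)

    return pattern.sub(strip_hook, conf_text)


# Hypothetical config contents, standing in for /etc/mkinitcpio.conf.
sample = (
    "MODULES=()\n"
    "#HOOKS=(base kms)\n"
    "HOOKS=(base udev autodetect modconf kms keyboard keymap consolefont block filesystems fsck)\n"
)
print(remove_hook(sample))
```

After writing the result back to /etc/mkinitcpio.conf (as root), the initramfs still has to be regenerated, e.g. with `mkinitcpio -P`.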

Comment by Nicholas Hadaller (hadallen) - Sunday, 11 December 2022, 06:06 GMT
Hello, I believe those messages are because my second Nvidia GPU is passed through to a VM. I have now removed kms, regenerated the initramfs, and the messages still appear.

I just tried updating to linux-zen-6.0.12.zen1-1 and still got the same crash, seconds after booting. I have included an Xorg log from the same session.
Comment by Toolybird (Toolybird) - Tuesday, 10 January 2023, 21:43 GMT
Still happening with latest pkg updates?
Comment by Nicholas Hadaller (hadallen) - Tuesday, 10 January 2023, 21:47 GMT
I was able to update to 6.1.2.arch1-1 on Jan 2 - I was away from my desktop for the holidays so I am not sure if there was an earlier update that worked.

I have since updated to 6.1.3.arch1-1 and have experienced no issues.
