FS#75986 - Nvidia driver freezing since linux 5.19.9
Attached to Project:
Arch Linux
Opened by Michele (mikefender) - Friday, 23 September 2022, 11:11 GMT
Last edited by Toolybird (Toolybird) - Monday, 26 September 2022, 21:27 GMT
Details
Description:
Additional info:
* linux-5.19.9, nvidia-515.65.01-14 (and higher)
* reproduced on a PRIME system
* Dell XPS 9570
* NVIDIA GTX 1050 Ti
* Intel UHD Graphics 630
* Wayland compositor: Sway

Steps to reproduce:
On a PRIME system, turn the dGPU on, then use "prime-run" to run "vkcube" or any other application that uses the GPU.

The issue is not present with linux-5.19.8 and nvidia-515.65.01-13 or lower (linux-lts/nvidia-lts works too).

Attached kernel log showing the nvidia driver exceptions.
Closed by Toolybird (Toolybird)
Monday, 26 September 2022, 21:27 GMT
Reason for closing: Fixed
Additional comments about closing: linux 5.19.11.arch1-1
Basically, when running a 3D application on the discrete GPU, the application doesn't start, and the process hangs forever and cannot be killed even with SIGKILL. When attempting to reboot the system, systemd waits forever for that process to end, preventing shutdown/reboot.
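A process that survives SIGKILL is often stuck in uninterruptible sleep (state "D") inside a kernel or driver call. The report does not confirm that this is the state here, so treat that as an assumption. The sketch below shows one way to check: it reads the state field from /proc/<pid>/stat on Linux. The function names parse_stat_state and find_uninterruptible are made up for this illustration.

```python
from pathlib import Path


def parse_stat_state(stat_line: str) -> tuple[int, str, str]:
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_uninterruptible(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """List (pid, comm) for every process currently in state D."""
    hung = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a per-process directory
        try:
            pid, comm, state = parse_stat_state((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # process exited while scanning, or entry unreadable
        if state == "D":
            hung.append((pid, comm))
    return hung


if __name__ == "__main__":
    for pid, comm in find_uninterruptible():
        print(f"{pid}\t{comm}")
```

If the hung vkcube process shows up in this list, that would be consistent with it being blocked inside the driver rather than in user space, which is useful context to include when reporting the regression.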
I understand the scripts are not an official method, but they've been working fine for years, and the logic behind them is used in popular solutions like nvidia-xrun.
[1] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
I'll try reverting it to see if that fixes the issue. As for reporting, where should I report it? Should it be considered a kernel bug or an NVIDIA driver bug?
[1] https://github.com/NVIDIA/open-gpu-kernel-modules/issues
[2] https://forums.developer.nvidia.com/c/gpu-graphics/linux/148
Sep 24 19:32:45 jason kernel: DMAR: DRHD: handling fault status reg 2
Sep 24 19:32:45 jason kernel: DMAR: [INTR-REMAP] Request device [01:00.0] fault index 0x8000 [fault reason 0x25] Blocked a compatibility format interrupt request
Sep 24 19:32:47 jason kernel: nouveau 0000:01:00.0: sec2: cmdq: timeout waiting for queue ready
Sep 24 19:32:47 jason kernel: nouveau 0000:01:00.0: gr: init failed, -110
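When comparing logs across kernel versions, it can help to pull the DMAR fault fields out of the journal mechanically. The following is a minimal sketch whose pattern is derived only from the single fault line quoted above; other kernels may format the message differently. The name parse_dmar_fault is hypothetical.

```python
import re

# Pattern modelled on the one "[INTR-REMAP] Request device ..." line in the
# log above; it is an assumption that other fault lines share this layout.
DMAR_FAULT = re.compile(
    r"DMAR: \[(?P<kind>[A-Z-]+)\] "
    r"Request device \[(?P<device>[0-9a-fA-F:.]+)\] "
    r"fault index (?P<index>0x[0-9a-fA-F]+) "
    r"\[fault reason (?P<reason>0x[0-9a-fA-F]+)\] "
    r"(?P<message>.*)"
)


def parse_dmar_fault(line: str):
    """Return a dict of fault fields, or None if the line is not a DMAR fault."""
    match = DMAR_FAULT.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "Sep 24 19:32:45 jason kernel: DMAR: [INTR-REMAP] Request device "
        "[01:00.0] fault index 0x8000 [fault reason 0x25] "
        "Blocked a compatibility format interrupt request"
    )
    print(parse_dmar_fault(sample))
```

The device field (01:00.0 here) is the same PCI address that appears in the nouveau lines, which makes it easy to tie the interrupt-remapping fault to the discrete GPU.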
[1] https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.19.11
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4d8637f1d67242207410734844ca4b143ac5585e