FS#72267 - Panic in kernel 5.14.x-arch1-1

Attached to Project: Arch Linux
Opened by Infanta Xavier (xavier83) - Tuesday, 28 September 2021, 08:21 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Monday, 07 February 2022, 07:38 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:
After upgrading from 5.13.x-arch1-1 to 5.14.x-arch1-1, I get kernel panics on boot. It takes over 2.5 minutes to reach the graphical target; on older kernels it typically took 20 seconds.
Reverting the kernel fixes it.

Additional info:
* linux-(zen-)5.14.x-arch1-1

Steps to reproduce:
On boot:
[ 5.017782] amdgpu: ATOM BIOS: BR45416.001
[ 5.017812] [drm] GPU posting now...
[ 5.048455] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[ 5.048558] amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[ 5.048565] amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[ 5.048581] [drm] Detected VRAM RAM=2048M, BAR=256M
[ 5.048583] [drm] RAM width 64bits DDR3
[ 5.048612] [drm] amdgpu: 2048M of VRAM memory ready
[ 5.048615] [drm] amdgpu: 3072M of GTT memory ready.
[ 5.048621] [drm] GART: num cpu pages 65536, num gpu pages 65536
[ 5.049134] amdgpu 0000:01:00.0: amdgpu: PCIE GART of 256M enabled (table at 0x000000F400000000).
[ 5.049932] [drm] Internal thermal controller without fan control
[ 5.049951] [drm] amdgpu: dpm initialized
[ 5.264970] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 5.265039] #PF: supervisor read access in kernel mode
[ 5.265079] #PF: error_code(0x0000) - not-present page
[ 5.265118] PGD 0 P4D 0
[ 5.265144] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 5.265181] CPU: 3 PID: 135 Comm: modprobe Not tainted 5.14.8-arch1-1 #1 72b691689b70bfbb45bec8353a1f598ad6e355d8
[ 5.265259] Hardware name: LENOVO 80E3/Lancer 5B2, BIOS A2CN45WW(V2.13) 08/04/2016
[ 5.265314] RIP: 0010:si_dpm_set_power_state+0xe3d/0x1290 [amdgpu]
[ 5.265877] Code: ff e9 2c f5 ff ff 41 89 c1 48 c7 c7 c9 03 99 c0 44 89 4c 24 04 e8 f3 6b cc ff e9 13 f5 ff ff 45 31 c0 49 8b b4 24 10 0e 00 00 <0f> b7 0e 66 85 c9 0f 84 cf 03 00 00 83 e9 01 48 8d 46 14 48 8d 0c
[ 5.265997] RSP: 0018:ffffc1a88055f8b8 EFLAGS: 00010246
[ 5.266035] RAX: ffff9b658c3f9a00 RBX: 000000000000ffff RCX: 0000000000000000
[ 5.266083] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b658c700000
[ 5.266131] RBP: ffff9b658c700000 R08: 0000000000000000 R09: 00000000000007cd
[ 5.266178] R10: 00000000c0643d00 R11: 0000000000000000 R12: ffff9b658c3f8000
[ 5.266226] R13: ffff9b658c3f8000 R14: 0000000000000000 R15: ffff9b658c3f9988
[ 5.266274] FS: 00007f966bf6c740(0000) GS:ffff9b689fd80000(0000) knlGS:0000000000000000
[ 5.266329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.266369] CR2: 0000000000000000 CR3: 0000000100b14000 CR4: 00000000000406e0
[ 5.266417] Call Trace:
[ 5.266442] ? si_dpm_pre_set_power_state+0x525/0xaa0 [amdgpu f01a08a32db101afed581310fa496ec779498af3]
[ 5.266877] amdgpu_pm_compute_clocks.part.0+0x326/0x5d0 [amdgpu f01a08a32db101afed581310fa496ec779498af3]
[ 5.267313] si_dpm_hw_init+0x77/0x80 [amdgpu f01a08a32db101afed581310fa496ec779498af3]
[ 5.267738] amdgpu_device_init.cold+0x1670/0x1b14 [amdgpu f01a08a32db101afed581310fa496ec779498af3]
[ 5.268185] amdgpu_driver_load_kms+0x67/0x300 [amdgpu f01a08a32db101afed581310fa496ec779498af3]
[ 5.268563] amdgpu_pci_probe+0x110/0x1a0 [amdgpu f01a08a32db101afed581310fa496ec779498af3]
[ 5.268936] local_pci_probe+0x42/0x80
[ 5.268969] ? pci_match_device+0xd7/0x110
[ 5.269002] pci_device_probe+0xfa/0x1b0
[ 5.269034] really_probe+0x1f5/0x3f0
[ 5.269066] __driver_probe_device+0xfe/0x180
[ 5.269101] driver_probe_device+0x1e/0x90
[ 5.269135] __driver_attach+0xc0/0x1c0
[ 5.269165] ? __device_attach_driver+0xe0/0xe0
[ 5.269200] ? __device_attach_driver+0xe0/0xe0
[ 5.269235] bus_for_each_dev+0x89/0xd0
[ 5.269267] bus_add_driver+0x12b/0x1e0
[ 5.269298] driver_register+0x8f/0xe0
[ 5.269327] ? 0xffffffffc0bd4000
[ 5.269354] do_one_initcall+0x57/0x220
[ 5.269388] do_init_module+0x5c/0x270
[ 5.270930] load_module+0x2588/0x2790
[ 5.272394] ? __do_sys_init_module+0x12e/0x1b0
[ 5.273967] __do_sys_init_module+0x12e/0x1b0
[ 5.275437] do_syscall_64+0x5c/0x80
[ 5.276997] ? do_syscall_64+0x69/0x80
[ 5.278447] ? exc_page_fault+0x72/0x170
[ 5.279989] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 5.281432] RIP: 0033:0x7f966c09832e
[ 5.282948] Code: 48 8b 0d 45 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 12 0b 0c 00 f7 d8 64 89 01 48
[ 5.286124] RSP: 002b:00007ffe4ce14548 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[ 5.287706] RAX: ffffffffffffffda RBX: 00005559bc1ebc50 RCX: 00007f966c09832e
[ 5.289361] RDX: 00005559bc1eea90 RSI: 0000000000f052f1 RDI: 00007f966a63b010
[ 5.290905] RBP: 00007f966a63b010 R08: 00007f966ba1f000 R09: 0000000000000000
[ 5.292508] R10: 00005559bc45c5b0 R11: 0000000000000246 R12: 00005559bc1eea90
[ 5.293992] R13: 000000000000001b R14: 00005559bc1ebad0 R15: 00005559bc1ebc50
[ 5.295481] Modules linked in: amdgpu(+) usbhid gpu_sched i2c_algo_bit drm_ttm_helper ttm agpgart drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm
[ 5.297033] CR2: 0000000000000000
[ 5.298615] ---[ end trace acc5159af943c0b3 ]---
[ 5.300118] RIP: 0010:si_dpm_set_power_state+0xe3d/0x1290 [amdgpu]
[ 5.302016] Code: ff e9 2c f5 ff ff 41 89 c1 48 c7 c7 c9 03 99 c0 44 89 4c 24 04 e8 f3 6b cc ff e9 13 f5 ff ff 45 31 c0 49 8b b4 24 10 0e 00 00 <0f> b7 0e 66 85 c9 0f 84 cf 03 00 00 83 e9 01 48 8d 46 14 48 8d 0c
[ 5.305269] RSP: 0018:ffffc1a88055f8b8 EFLAGS: 00010246
[ 5.306900] RAX: ffff9b658c3f9a00 RBX: 000000000000ffff RCX: 0000000000000000
[ 5.308537] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b658c700000
[ 5.310164] RBP: ffff9b658c700000 R08: 0000000000000000 R09: 00000000000007cd
[ 5.311779] R10: 00000000c0643d00 R11: 0000000000000000 R12: ffff9b658c3f8000
[ 5.313367] R13: ffff9b658c3f8000 R14: 0000000000000000 R15: ffff9b658c3f9988
[ 5.314926] FS: 00007f966bf6c740(0000) GS:ffff9b689fd80000(0000) knlGS:0000000000000000
[ 5.316482] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.318013] CR2: 0000000000000000 CR3: 0000000100b14000 CR4: 00000000000406e0
[ 5.522043] i8042: PNP: PS/2 Controller [PNP0303:KBC0,PNP0f13:_MSS] at 0x60,0x64 irq 1,12
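
For readers triaging the trace above: the kernel-mode `RIP:` line names the faulting location as `symbol+offset/size [module]`, here `si_dpm_set_power_state+0xe3d/0x1290 [amdgpu]`. The following is only a minimal sketch of pulling those fields out of a log line; the function name `parse_rip` and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches kernel-mode RIP lines such as:
#   RIP: 0010:si_dpm_set_power_state+0xe3d/0x1290 [amdgpu]
# The pattern is an illustrative assumption, not an official format grammar.
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)[^\]]*\])?"
)


def parse_rip(line):
    """Return symbol, offset, size and module from an oops RIP line, or None."""
    m = RIP_RE.search(line)
    if m is None:
        # User-space RIP lines (a bare address, no symbol+offset) end up here.
        return None
    return {
        "symbol": m["symbol"],
        "offset": int(m["offset"], 16),  # byte offset into the function
        "size": int(m["size"], 16),      # total size of the function
        "module": m["module"],           # None for built-in kernel code
    }
```

Applied to line `[ 5.265314]` of the log, this yields the symbol `si_dpm_set_power_state` in module `amdgpu`, which matches the top of the call trace through `si_dpm_hw_init`.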

Closed by Sven-Hendrik Haase (Svenstaro)
Monday, 07 February 2022, 07:38 GMT
Reason for closing: Fixed
Additional comments about closing: 2022-01-08: A task closure has been requested. Reason for request: The person who opened this issue has commented that it has been fixed in the latest version of the kernel (5.15, see latest comment).
Comment by Jon L (JonBL) - Wednesday, 29 September 2021, 17:37 GMT
The autodetection in this release still needs work. Adding one of the following to the kernel's command line should help:
amdgpu.dpm=0 amdgpu.dc=0 amdgpu.gartsize=2048

This assumes the radeon module is either blacklisted or disabled with radeon.si_support=0 radeon.cik_support=0.
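
To confirm after a reboot that one of the parameters above actually reached the kernel, the booted command line can be read from `/proc/cmdline`. The snippet below is a rough sketch of that check, not part of the original advice; the helper names `active_workarounds` and `running_cmdline` are made up for illustration.

```python
# Parameters suggested in the comment above.
SUGGESTED = ("amdgpu.dpm=0", "amdgpu.dc=0", "amdgpu.gartsize=2048")


def active_workarounds(cmdline, suggested=SUGGESTED):
    """Return the suggested parameters that appear on a kernel command line."""
    tokens = set(cmdline.split())  # parameters are whitespace-separated
    return [p for p in suggested if p in tokens]


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read()
```

Calling `active_workarounds(running_cmdline())` on the affected machine lists which of the three parameters are in effect. An empty list means the bootloader configuration was not picked up.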
Comment by Infanta Xavier (xavier83) - Friday, 01 October 2021, 15:51 GMT
Thanks JonBL, will try that and update
Comment by Infanta Xavier (xavier83) - Sunday, 03 October 2021, 11:22 GMT
The workaround is working @JonBL!
Comment by Elvis Gorena Rosado (darklyn3r) - Saturday, 06 November 2021, 00:56 GMT
Hi, I'm not a very advanced Arch Linux user, so apologies in advance if I can't describe my problem in technical terms.

Yesterday I updated my system (linux-zen 5.14.16.zen1-1), and after I restarted it, it no longer responded. It did not show a TTY session or any kernel panic output, just a black screen with no response from the keyboard.

I had never had an error like this and found nothing in any forum, so I decided to reinstall my Arch Linux system. I downloaded the image archlinux-2021.11.01-x86_64.iso (01-Nov-2021 07:55), and upon installation I had the same problem.

I suspect the cause is related to the new kernel discussed in the most recently reported bugs.
I have resigned myself to waiting for the next Arch Linux ISO image, but could you tell me when this error will be fixed, or whether there is any alternative for my case?
Comment by Enzo Pacheco (Packss) - Tuesday, 09 November 2021, 00:58 GMT
@darklyn3r I have the same problem. After downgrading to 5.14.15 I was able to boot again; I added the package to IgnorePkg and am waiting for a new kernel for now.
I have never had an unbootable Arch system like this: it just sits on a black screen, with no logs, no TTY, and no display response.
Comment by Infanta Xavier (xavier83) - Wednesday, 17 November 2021, 09:04 GMT
This seems to be fixed in 5.15.x
