Arch Linux


FS#76354 - [linux] NULL pointer dereference since 6.0.5-arch1-1

Attached to Project: Arch Linux
Opened by Roland Ruckerbauer (ruabmbua) - Monday, 31 October 2022, 12:00 GMT
Last edited by Toolybird (Toolybird) - Tuesday, 15 November 2022, 05:35 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Jan Alexander Steffens (heftig)
David Runge (dvzrv)
Levente Polyak (anthraxx)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:

Since booting into the 6.0.5 and 6.0.6 kernels, my audio has stopped working. While investigating, I found
that the pipewire process is frozen and cannot be killed. Next I tried alsamixer: it did not start,
and its process is also frozen and cannot be killed.

Then I checked kernel logs, and found a NULL pointer dereference and a "supervisor write access in kernel mode".
The bug is easily reproducible in both 6.0.5 and 6.0.6 kernels. Kernel log is attached below.

Additional info:
* Bad versions: 6.0.5-arch1-1 6.0.6-arch1-1
* Good versions: 6.0.2-arch1-1

Steps to reproduce:

1) Boot into a 6.0.5 or newer kernel and check dmesg.

Closed by  Toolybird (Toolybird)
Tuesday, 15 November 2022, 05:35 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 6.0.8.arch1-1
Comment by Roland Ruckerbauer (ruabmbua) - Monday, 31 October 2022, 12:05 GMT
I just found out that it is always the same userspace process causing the fault: rasdaemon.
After disabling it, it seems that the crash and subsequent audio freeze is no longer triggered.
Comment by loqs (loqs) - Monday, 31 October 2022, 17:13 GMT
Comment by Roland Ruckerbauer (ruabmbua) - Monday, 31 October 2022, 17:30 GMT
Investigated a bit more and reported it upstream; it seems someone else had already reported the problem.

https://linkshortner.net/dyvLZ
Comment by Devin Cofer (Ranguvar) - Tuesday, 01 November 2022, 02:38 GMT
Also had this issue on an X570 motherboard with Zen 2 (Ryzen 3xxx desktop) CPU, rasdaemon also implicated.
First noticed Firefox crashing, and even on a new profile still crashes reliably when trying to right-click any tab.
Had issues shutting down -- only SysRq sync was processed and echoed but system never halted and SysRq+B was not respected.

6.0.3.arch3 still had issues, 6.0.2.arch1 is working well.
Even latest 5.15 LTS may have had issues, not certain, but I need to actually use my system now instead of testing again :)
Comment by loqs (loqs) - Tuesday, 01 November 2022, 03:11 GMT
Comment by Devin Cofer (Ranguvar) - Tuesday, 01 November 2022, 11:00 GMT
Confirming no issue with linux 6.0.6.arch1 and rasdaemon disabled.
OP opened a bug report upstream[1], thank you!

[1] https://github.com/mchehab/rasdaemon/issues/73
Comment by Jan Hradek (jan.hradek) - Tuesday, 01 November 2022, 19:49 GMT
My issues were that the system refused to power off (my primary issue, the same as stated above) and that grub-mkconfig hung during grub-probe. Both were resolved by disabling rasdaemon.

I've been having the same issues on the 6.0.6-zen1-1-zen and the "vanilla" 6.0.6.arch1-1 kernel (AFAIR).
Comment by Roland Ruckerbauer (ruabmbua) - Tuesday, 01 November 2022, 20:08 GMT
I am in contact with the responsible Linux kernel maintainer; we are currently debugging the root cause.
Comment by Roland Ruckerbauer (ruabmbua) - Tuesday, 01 November 2022, 23:09 GMT
A possible fix was found and tested successfully. It will probably be in the upstream kernel soon.

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 199759c73519..4ffcc6e33258 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -937,6 +937,9 @@ void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu)
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct rb_irq_work *rbwork;

+	if (!buffer)
+		return;
+
 	if (cpu == RING_BUFFER_ALL_CPUS) {

 		/* Wake up individual ones too. One level recursion */
@@ -945,7 +948,14 @@ void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu)

 		rbwork = &buffer->irq_work;
 	} else {
+		if (WARN_ON_ONCE(!buffer->buffers))
+			return;
+		if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
+			return;
 		cpu_buffer = buffer->buffers[cpu];
+		/* The CPU buffer may not have been initialized yet */
+		if (!cpu_buffer)
+			return;
 		rbwork = &cpu_buffer->irq_work;
 	}
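
For anyone who wants to see what the added checks do without reading the kernel source, the guards in the diff above can be modelled in user space. This is only an illustrative sketch, not kernel code: struct trace_buf, struct cpu_buf, ALL_CPUS and wake_waiters() are hypothetical stand-ins for the kernel's types and function, the nr_cpus field stands in for nr_cpu_ids, and a wakeup is modelled as a simple counter instead of queued irq_work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ALL_CPUS (-1) /* hypothetical stand-in for RING_BUFFER_ALL_CPUS */

/* Simplified per-CPU buffer: counts wakeups instead of queuing irq_work. */
struct cpu_buf {
	int wakeups;
};

/* Simplified trace buffer with an array of per-CPU slots (may be NULL). */
struct trace_buf {
	int nr_cpus;              /* models nr_cpu_ids */
	struct cpu_buf **buffers; /* a slot may be NULL if not initialized */
	int wakeups;              /* models the buffer-wide waiters */
};

/* Returns true if waiters were woken, false if a guard rejected the call. */
static bool wake_waiters(struct trace_buf *buffer, int cpu)
{
	struct cpu_buf *cb;

	if (!buffer)
		return false;
	assert(buffer->nr_cpus >= 0); /* invariant of this model only */

	if (cpu == ALL_CPUS) {
		/* Wake the individual CPUs too: one level of recursion. */
		for (int i = 0; i < buffer->nr_cpus; i++)
			wake_waiters(buffer, i);
		buffer->wakeups++;
		return true;
	}

	/* Reject a missing slot array or an out-of-range CPU index. */
	if (!buffer->buffers || cpu < 0 || cpu >= buffer->nr_cpus)
		return false;

	cb = buffer->buffers[cpu];
	if (!cb) /* the per-CPU buffer may not have been initialized yet */
		return false;

	cb->wakeups++;
	return true;
}
```

In this model, calling wake_waiters() with a NULL buffer, an out-of-range CPU, or a CPU whose slot is still NULL simply returns false instead of dereferencing a NULL pointer, which is the behaviour the patch aims for.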
Comment by Toolybird (Toolybird) - Wednesday, 02 November 2022, 02:53 GMT
Nice work @ruabmbua. Thanks for reporting it upstream and following through [1]

[1] https://lore.kernel.org/all/20221101191009.1e7378c8%40rorschach.local.home/

Comment by Devin Cofer (Ranguvar) - Friday, 04 November 2022, 12:27 GMT
Unfortunately, 6.0.7 lacks the patch, to save anyone the time of checking.
Comment by loqs (loqs) - Monday, 07 November 2022, 15:54 GMT
Comment by Devin Cofer (Ranguvar) - Tuesday, 15 November 2022, 05:31 GMT
Confirmed fixed for me with rasdaemon running while on 6.0.8.arch1-1.

Thank you all!
