Please read this before reporting a bug:
https://wiki.archlinux.org/title/Bug_reporting_guidelines
Do NOT report bugs when a package is just outdated, or it is in the AUR. Use the 'flag out of date' link on the package page, or the Mailing List.
REPEAT: Do NOT report bugs for outdated packages!
FS#55141 - [linux] "BUG: unable to handle kernel paging request" with 4.12.3, 4.12.5
Attached to Project:
Arch Linux
Opened by Jeff Cook (jeffcookio) - Monday, 14 August 2017, 16:35 GMT
Last edited by Toolybird (Toolybird) - Sunday, 28 May 2023, 06:07 GMT
Details
Description:
After 36-48 hours of heavy KVM usage, I get the attached output, which begins with:

Aug 14 06:30:28 kvm_master kernel: BUG: unable to handle kernel paging request at ffffffffc0955f96
kernel: IP: report_bug+0x94/0x120
kernel: PGD a67a0c067
kernel: P4D a67a0c067
kernel: PUD a67a0e067
kernel: PMD 495886067
kernel: PTE 800000049bbdd161

I've experienced this on 4.12.3 and 4.12.5 so far (I also experienced another KVM lockup on 4.12.3 which may have been resolved). I do *not* experience this on 4.11.9, the last 4.11 kernel that was packaged by Arch. Things are stable on 4.11.9.

Does not seem to be triggered by anything specific, just happens after some usage. I am making extensive use of PCI passthrough (USB controller + GPUs; 1 GPU to one VM, another GPU to another, both Win10) and have several VMs running on the host (4 Linux VMs + 2 Windows VMs).

On this error condition, other VMs and the host system initially remain responsive, but they too fail after issuing a few commands. Messages indicating a soft CPU lockup are emitted regularly:

kernel: INFO: rcu_preempt detected stalls on CPUs/tasks:
kernel: Tasks blocked on level-1 rcu_node (CPUs 0-15): P2837
kernel: (detected by 23, t=990092 jiffies, g=2933189, c=2933188, q=9740206)

See attached for full oops, /proc/cpuinfo, and /proc/meminfo.
Closed by Toolybird (Toolybird)
Sunday, 28 May 2023, 06:07 GMT
Reason for closing: No response
Additional comments about closing: Plus it's old and stale. If still an issue, please follow up with upstream.
Comment by Jeff Cook (jeffcookio) -
Monday, 14 August 2017, 16:41 GMT
Clarification: this crash doesn't trigger explicit "soft CPU lockup" messages in the journal (that's the other crash that I haven't observed on 4.12.5), so describing the failure as a "soft CPU lockup" was probably dumb. This crash _does_ trigger the pasted rcu_preempt message.
Comment by loqs (loqs) -
Monday, 14 August 2017, 18:18 GMT
Please try 4.12.6-1 (you might also want to try 4.13-rc5). If that does not resolve the issue, bisect the kernel to find the bad commit and report the issue upstream.
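For readers unfamiliar with bisection, the idea behind this suggestion can be sketched as a binary search over an ordered history with a known-good start (here 4.11.9) and a known-bad end (4.12.x). This is only an illustration under stated assumptions: the function name `first_bad`, the revision labels, and the regression point are all hypothetical, and on a real kernel tree `git bisect` does the bookkeeping. Each "is it bad?" check would mean building and booting that kernel and running the KVM workload long enough to trigger the crash.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, the first one is known
    good, the last one is known bad, and every revision after the first bad
    one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi]


# Hypothetical history: the labels are placeholders, not real commits.
history = ["v4.11", "c1", "c2", "c3", "c4", "c5", "v4.12"]
culprit = "c3"  # made-up regression point for the demo

found = first_bad(
    history,
    lambda rev: history.index(rev) >= history.index(culprit),
)
print(found)
```

The number of kernels to test grows only logarithmically with the number of commits between the good and bad versions, which matters when each test takes 36-48 hours to reproduce the crash.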
Comment by Lily (voidlily) -
Monday, 21 August 2017, 01:20 GMT
I'm also having this issue on 4.12.8-2-ck. I'm going to downgrade to 4.11-ck and see if that at least gets me running again for now. My oopses were the same as the reporter's.
Comment by Jeff Cook (jeffcookio) -
Monday, 21 August 2017, 05:38 GMT
Similar bug report upstream: https://bugzilla.kernel.org/show_bug.cgi?id=196685
Comment by mattia (nTia89) -
Sunday, 27 February 2022, 14:00 GMT
I cannot reproduce the issue. Is it still valid?
oops-only.txt