FS#54562 - [linux] kvm lapic oops

Attached to Project: Arch Linux
Opened by Wilken Gottwalt (Akiko) - Friday, 23 June 2017, 19:45 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Thursday, 03 March 2022, 11:54 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Architecture x86_64
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
For a while now, KVM has been triggering a kernel oops. It is not a crash, but it is annoying. It always looks like this:

------------[ cut here ]------------
[23413.974172] WARNING: CPU: 1 PID: 7071 at arch/x86/kvm/lapic.c:1529 kvm_lapic_expired_hv_timer+0xd2/0xf0 [kvm]
[23413.974172] Modules linked in: vhost_net vhost tap tun cfg80211 rfkill bridge stp llc uinput nct6775 hwmon_vid snd_virtuoso snd_oxygen_lib input_leds led_class snd_hda_intel iTCO_wdt ftdi_sio mousedev usbserial snd_hda_codec iTCO_vendor_support snd_mpu401_uart snd_hda_core snd_rawmidi intel_rapl snd_seq_device sb_edac evdev snd_hwdep snd_pcm mxm_wmi mac_hid edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_timer snd soundcore intel_cstate i2c_i801 lpc_ich intel_rapl_perf e1000e mei_me shpchp ptp mei pps_core tpm_tis wmi tpm_tis_core tpm button sch_fq_codel sg ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache algif_skcipher af_alg dm_crypt dm_mod raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy async_pq async_xor xor async_tx hid_generic usbhid hid raid6_pq
[23413.974200] md_mod sd_mod sr_mod cdrom crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc ahci xhci_pci libahci ehci_pci xhci_hcd ehci_hcd libata aesni_intel aes_x86_64 crypto_simd glue_helper usbcore cryptd scsi_mod usb_common serio amdkfd amd_iommu_v2 amdgpu i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
[23413.974215] CPU: 1 PID: 7071 Comm: qemu-system-x86 Tainted: G W 4.11.6-1-ARCH #1
[23413.974216] Hardware name: MSI MS-7885/X99A RAIDER (MS-7885), BIOS P.30 04/12/2016
[23413.974216] Call Trace:
[23413.974220] dump_stack+0x63/0x81
[23413.974223] __warn+0xcb/0xf0
[23413.974225] warn_slowpath_null+0x1d/0x20
[23413.974229] kvm_lapic_expired_hv_timer+0xd2/0xf0 [kvm]
[23413.974232] handle_preemption_timer+0xe/0x20 [kvm_intel]
[23413.974233] vmx_handle_exit+0xba/0x1460 [kvm_intel]
[23413.974235] ? atomic_switch_perf_msrs+0x6f/0xa0 [kvm_intel]
[23413.974236] ? vmx_vcpu_run+0x322/0x430 [kvm_intel]
[23413.974241] kvm_arch_vcpu_ioctl_run+0xca5/0x16a0 [kvm]
[23413.974246] ? kvm_arch_vcpu_load+0x6d/0x290 [kvm]
[23413.974247] ? __vmx_load_host_state.part.30+0x128/0x210 [kvm_intel]
[23413.974251] kvm_vcpu_ioctl+0x2a6/0x640 [kvm]
[23413.974254] ? kvm_vcpu_ioctl+0x2a6/0x640 [kvm]
[23413.974257] ? __schedule+0x236/0x8e0
[23413.974259] ? vfio_pci_rw+0x37/0x90 [vfio_pci]
[23413.974261] do_vfs_ioctl+0xa5/0x600
[23413.974262] ? retint_kernel+0x1b/0x1d
[23413.974264] ? __fget+0x77/0xb0
[23413.974264] SyS_ioctl+0x79/0x90
[23413.974265] entry_SYSCALL_64_fastpath+0x1a/0xa9
[23413.974266] RIP: 0033:0x7f05b4f2dcb7
[23413.974267] RSP: 002b:00007f05a53fe8e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[23413.974268] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f05b4f2dcb7
[23413.974268] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000018
[23413.974268] RBP: 00007f05a84d70c0 R08: 00005565248326d0 R09: 000000000000ffff
[23413.974269] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000000
[23413.974269] R13: 00007f05bbf62000 R14: 0000000000000000 R15: 00007f05a84d70c0
[23413.974270] ---[ end trace 167258acbebecbf2 ]---

Additional info:
* Linux 4.11.6-1-ARCH #1 SMP PREEMPT x86_64
* qemu 2.9.0

Steps to reproduce:
Just run a Windows or Linux guest with full hardware acceleration (-enable-kvm, -machine q35,accel=kvm,kernel_irqchip=on,mem-merge=off) on a current kernel (4.10.x/4.11.x) and qemu (2.8.x/2.9.x). The warning appears shortly after the VM comes under some load.

What happens?:
The affected function is kvm_lapic_expired_hv_timer() from arch/x86/kvm/lapic.c. Preemption can kick in here, but the function does not cope well with it. So an okay-ish solution is to disable preemption while the timer is being cancelled, which the patch does in cancel_hv_timer().

This patch should do the trick.

--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1495,8 +1495,10 @@ EXPORT_SYMBOL_GPL(kvm_lapic_hv_timer_in_use);

 static void cancel_hv_timer(struct kvm_lapic *apic)
 {
+	preempt_disable();
 	kvm_x86_ops->cancel_hv_timer(apic->vcpu);
 	apic->lapic_timer.hv_timer_in_use = false;
+	preempt_enable();
 }

 static bool start_hv_timer(struct kvm_lapic *apic)
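
The reasoning behind the patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the struct, every model_* name, and the idea that a reschedule re-arms the timer whenever the flag is still set are hypothetical simplifications for illustration, not the kernel's actual code paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins only; none of this is kernel code. */
struct model_timer {
	bool hw_armed;	/* models the hardware timer being armed */
	bool in_use;	/* models lapic_timer.hv_timer_in_use */
};

static int preempt_count;	/* > 0 means "preemption disabled" */

static void model_preempt_disable(void)
{
	preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(preempt_count > 0);	/* enable must pair with a disable */
	preempt_count--;
}

/* Assumed behaviour: after being scheduled back in, the timer is
 * re-armed if the flag still claims it is in use. */
static void model_reschedule(struct model_timer *t)
{
	if (t->in_use)
		t->hw_armed = true;
}

/* A point where the scheduler may preempt us, if it is allowed to. */
static void model_preemption_point(struct model_timer *t)
{
	if (preempt_count == 0)
		model_reschedule(t);
}

/* Two-step cancel with a worst-case preemption between the steps. */
static void model_cancel(struct model_timer *t, bool protect)
{
	if (protect)
		model_preempt_disable();
	t->hw_armed = false;		/* step 1: cancel the hardware timer */
	model_preemption_point(t);	/* preemption hits between the steps */
	t->in_use = false;		/* step 2: clear the bookkeeping flag */
	if (protect)
		model_preempt_enable();
}

/* The state is consistent when the flag matches the hardware state. */
static bool model_consistent(const struct model_timer *t)
{
	return t->hw_armed == t->in_use;
}
```

In this model, calling model_cancel() with protect set to false leaves the timer armed while the flag says it is unused, which is the kind of mismatch a warning could catch later. With protect set to true, the two steps cannot be interleaved and both fields end up cleared.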

Closed by Sven-Hendrik Haase (Svenstaro)
Thursday, 03 March 2022, 11:54 GMT
Reason for closing: Fixed
Additional comments about closing: 2022-02-27: A task closure has been requested. Reason for request: Fixed upstream
Comment by Tobias Powalowski (tpowa) - Saturday, 24 June 2017, 06:52 GMT
Please take this to the upstream bug tracker.
Comment by loqs (loqs) - Saturday, 24 June 2017, 08:53 GMT
Comment by Wilken Gottwalt (Akiko) - Sunday, 25 June 2017, 14:10 GMT
Yeah, it already went upstream, but that way I have to wait until 4.12 is done or use a custom kernel in Arch. I hoped I could get an official Arch kernel with that fix a bit earlier. ;-)
Comment by mattia (nTia89) - Sunday, 27 February 2022, 14:05 GMT
Is this still an open issue?
Comment by Wilken Gottwalt (Akiko) - Sunday, 27 February 2022, 14:15 GMT
No, as mentioned, it was fixed upstream and is now part of all newer kernels.
