FS#59515 - [qemu] 2.12.0-1 -> 2.12.0-2 breaks Windows 10 guest

Attached to Project: Arch Linux
Opened by Jimi Bove (Jimi-James) - Friday, 03 August 2018, 04:36 GMT
Last edited by Doug Newgard (Scimmia) - Friday, 03 August 2018, 04:58 GMT
Task Type: Bug Report
Category: Packages: Extra
Status: Assigned
Assigned To: Evangelos Foutras (foutrelis), Anatol Pomozov (anatolik)
Architecture: All
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 0
Private: No

Details

Relevant note:
I'm using the TianoCore BIOS for OVMF and passing thru a PCI USB card. I also passthru a GPU, but I've confirmed this issue happens without the GPU. I plan to test it without *any* PCI passthru (no GPU, no USB card) tomorrow.

Description:
Trying to boot my Windows 10 guest results in a hard freeze on the guest at some point while the Windows logo is loading. If I shut the VM down during said freeze, there's a good chance (not every single time) that it'll hard freeze my entire system with it. I tried restoring a backup of the VM's hard drive from 4 months ago. Same behavior. Downgrading qemu fixed it.

Steps to reproduce:
Upgrade qemu to 2.12.0-2

Comment by Jimi Bove (Jimi-James) - Friday, 03 August 2018, 16:59 GMT
Alright, this is getting super weird. Up until this moment, this issue has been 100% consistent and predictable over multiple tests. 2.12.0-2 would freeze the guest, and 2.12.0-1 would not. When I tested it without any PCI passthru (not the USB card either), 2.12.0-2 worked. So, I thought, OK, this is a PCI passthru issue, which makes sense because a commit related to that was between 2.12.0-1 and 2.12.0-2. But then I added the USB card back and tried again, with 2.12.0-2, and suddenly for the first time in days on 2.12.0-2, Windows booted fine. So now I don't know what to think. Maybe this issue only affects the first time the guest boots since the last time the host booted? I'll reboot and try again to confirm.
Comment by Anatol Pomozov (anatolik) - Friday, 03 August 2018, 17:06 GMT
Yeah, it is hard to debug such flaky issues. I would suggest posting your experience to the qemu-devel list; they might have a better explanation of what is going on.
Comment by Jimi Bove (Jimi-James) - Friday, 03 August 2018, 17:27 GMT
It seems the issue has completely disappeared now. Unless anything comes up later, I'm just going to assume that my guest needed to boot *after* the changes in 2.12.0-2, *without* any PCI passthru devices, just once, to permanently acclimate something in the Windows system itself.
Comment by Jimi Bove (Jimi-James) - Friday, 03 August 2018, 17:35 GMT
AHA! Never mind! I found what's going on. This confusion came from the fact that I've been simultaneously fixing an issue with a new AMD card where my system randomly freezes, and one of the things I've had to do to solve that was disable MSI interrupts. The reason it worked with the USB card on 2.12.0-2 just now was that, at the same time, I switched my kernel parameters from pci=nomsi to amdgpu.msi=0, i.e., I made it so that only my GPU, and no longer the USB card, was avoiding MSI interrupts. Now the VM has the same issue--only working on 2.12.0-1--with *just* the GPU, instead of both the GPU and the USB card.
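
To make the difference between the two kernel parameters concrete, here is a minimal Python sketch that reports which MSI-disabling parameters are present in a kernel command line. The function name msi_overrides and the result keys are hypothetical, chosen for illustration; only pci=nomsi and amdgpu.msi=0 come from the comment above.

```python
# Hypothetical helper, not part of the original report.
def msi_overrides(cmdline: str) -> dict:
    """Report which MSI-disabling kernel parameters appear in a command line."""
    params = cmdline.split()
    return {
        # pci=nomsi disables MSI for every PCI device on the host.
        # The pci= parameter takes a comma-separated option list.
        "all_pci_devices": any(
            p.startswith("pci=") and "nomsi" in p[len("pci="):].split(",")
            for p in params
        ),
        # amdgpu.msi=0 disables MSI only for devices driven by amdgpu.
        "amdgpu_only": "amdgpu.msi=0" in params,
    }
```

On a running host, the input could be the contents of /proc/cmdline. With pci=nomsi both the GPU and the USB card lose MSI; with amdgpu.msi=0 only the GPU does, which matches the change in behavior described here.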

So here are the exact 3 steps to reproduce this bug in 2.12.0-1 -> 2.12.0-2:
1. Have a Windows (10?) guest with a PCI card passed thru to it
2. Disable MSI interrupts for the PCI card in question on the host
3. Upgrade qemu to 2.12.0-2

Which means the more precise description is, "[qemu] 2.12.0-1 -> 2.12.0-2 breaks PCI passthru Windows 10 guest for PCI cards that aren't using MSI interrupts"
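
For anyone trying to confirm step 2 on their own host, the following is a rough Python sketch, assuming the usual "MSI: Enable+" / "MSI-X: Enable-" notation that lspci -vv prints in its capability lines. The function names msi_state and uses_msi are hypothetical, and this is only one way to check, not something from the report itself.

```python
import re

# Matches "MSI: Enable+" or "MSI-X: Enable-" as printed by `lspci -vv`.
_MSI_RE = re.compile(r"\b(MSI(?:-X)?): Enable([+-])")


def msi_state(lspci_vv_text: str) -> dict:
    """Map each MSI capability found ('MSI', 'MSI-X') to True if it is enabled."""
    return {cap: flag == "+" for cap, flag in _MSI_RE.findall(lspci_vv_text)}


def uses_msi(lspci_vv_text: str) -> bool:
    """True if the device currently has MSI or MSI-X enabled."""
    return any(msi_state(lspci_vv_text).values())
```

The text would come from running lspci -vv -s with the card's PCI address, typically as root so that the capability list is readable. A passed-thru card for which uses_msi returns False would match the condition in step 2.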
