FS#65956 - [linux] amdgpu: corrupt header after VM shutdown

Attached to Project: Arch Linux
Opened by Sebastian Münch (vfio_experte) - Tuesday, 24 March 2020, 09:48 GMT
Last edited by freswa (frederik) - Tuesday, 24 March 2020, 13:39 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
I have a problem with a corrupt header on my AMD RX VEGA 64 card after shutting down the VM.
The GPU is passed to a QEMU VM with vfio.
The bug occurs on my KVM server with the Arch Linux kernel 5.5.10 and with linux-lts 5.4.
After I downgrade the kernel to 5.3.5, the header is no longer corrupt.
I have also tested mesa beta 20.0.1 and archlinux 19.3.5, and the bug is not fixed.
See lspciv1.log (lspci -v > lspciv1.log) for the 5.3.5 kernel, captured after the VM was shut down.
See lspci_header_corupt.log (lspci -v > lspci_header_corupt.log) for the 5.4.26 or 5.5.10 kernel, captured after the VM was shut down.
See vfio_5.3.5.log (dmesg > vfio_5.3.5.log) for the 5.3.5 kernel, captured after the VM was shut down.
See vfio_5.4.26.log (dmesg > vfio_5.4.26.log) for the 5.4.26 or 5.5.10 kernel, captured after the VM was shut down.
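
The header check done by hand with lspci can also be scripted. The following is only a minimal Python sketch and not part of the original report. It assumes that a "corrupt header" means the first 64 bytes of the card's PCI configuration space read back as all 0xFF, which the report does not confirm. The function names are hypothetical.

```python
from pathlib import Path


def header_looks_corrupt(config: bytes) -> bool:
    """Return True if the 64-byte standard PCI header is all 0xFF.

    Assumption: an all-0xFF header is what "corrupt" means here.
    Input shorter than 64 bytes is treated as "cannot tell" (False).
    """
    header = config[:64]
    return len(header) == 64 and all(byte == 0xFF for byte in header)


def read_config(bdf: str) -> bytes:
    """Read the config space of a PCI device through sysfs.

    bdf is the full device address as printed by `lspci -D`.
    """
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()
```

On the host this could be called as header_looks_corrupt(read_config(address)) once before starting the VM and once after shutting it down, with address being the GPU's address from lspci -D.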

When the header is corrupt, the GPU is no longer cooled and the fan runs at 0 RPM.
After 30 to 40 minutes the GPU fan goes to 100%, and the PC must be disconnected from power and left for 15 minutes while the GPU cools down.



Additional info:
* linux linux-lts mesa 19.5.3
* config and/or log files etc.
* link to upstream bug report, if any

Steps to reproduce:
Start a QEMU q35 or QEMU std VM with the GPU added as a vfio device, then shut down the VM. The header is corrupt afterwards.
This task depends upon

Closed by  freswa (frederik)
Tuesday, 24 March 2020, 13:39 GMT
Reason for closing:  Upstream
Additional comments about closing:  This doesn't seem to be a packaging issue. Please report this upstream.
Comment by Sebastian Münch (vfio_experte) - Tuesday, 24 March 2020, 12:45 GMT
I have blacklisted amdgpu and radeon, and the header on the server is fine after poweroff or reboot.
The problem is that the GPU cannot be used while the amdgpu driver is blacklisted.
