FS#70713 - [libvirt][qemu] ZFS volume backed VMs cannot start

Attached to Project: Arch Linux
Opened by Uwe Sauter (UweSauter) - Wednesday, 05 May 2021, 06:39 GMT
Last edited by David Runge (dvzrv) - Wednesday, 05 May 2021, 09:02 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Tobias Powalowski (tpowa), Anatol Pomozov (anatolik), David Runge (dvzrv)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
Since qemu was upgraded from 5.2.0-4 to 6.0.0-2 this morning, I cannot start VMs that are backed by ZFS volumes. virt-manager shows:


Error starting domain: Internal error: process exited while connecting to monitor: 2021-05-05T06:28:58.437961Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/dev/zvol/VM/VPN_gateway","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: 'file' driver requires '/dev/zvol/VM/VPN_gateway' to be a regular file

Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in cb_wrapper
callback(asyncjob, *args, **kwargs)
File "/usr/share/virt-manager/virtManager/asyncjob.py", line 101, in tmpcb
callback(*args, **kwargs)
File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line 57, in newfn
ret = fn(self, *args, **kwargs)
File "/usr/share/virt-manager/virtManager/object/domain.py", line 1329, in startup
self._backend.create()
File "/usr/lib/python3.9/site-packages/libvirt.py", line 1353, in create
raise libvirtError('virDomainCreate() failed')
libvirt.libvirtError: Internal error: process exited while connecting to monitor: 2021-05-05T06:28:58.437961Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/dev/zvol/VM/HLRS_gateway","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: 'file' driver requires '/dev/zvol/VM/HLRS_gateway' to be a regular file



The upgrade that introduced this problem included the following packages:

[2021-05-05T08:01:59+0200] [PACMAN] Running 'pacman -Syyu'
[2021-05-05T08:01:59+0200] [PACMAN] synchronizing package lists
[2021-05-05T08:02:00+0200] [PACMAN] starting full system upgrade
[2021-05-05T08:02:07+0200] [ALPM] transaction started
[2021-05-05T08:02:07+0200] [ALPM] upgraded libmanette (0.2.6-1 -> 0.2.6-2)
[2021-05-05T08:02:07+0200] [ALPM] upgraded libtool (2.4.6+42+gb88cebd5-14 -> 2.4.6+42+gb88cebd5-15)
[2021-05-05T08:02:07+0200] [ALPM] upgraded mailcap (2.1.49-1 -> 2.1.53-1)
[2021-05-05T08:02:07+0200] [ALPM] upgraded mercurial (5.7.1-1 -> 5.8-1)
[2021-05-05T08:02:07+0200] [ALPM] upgraded mjpegtools (2.2.0beta-1 -> 2.2.0-1)
[2021-05-05T08:02:07+0200] [ALPM] upgraded mpfr (4.1.0-1 -> 4.1.0-2)
[2021-05-05T08:02:07+0200] [ALPM] upgraded oath-toolkit (2.6.6-2 -> 2.6.7-1)
[2021-05-05T08:02:07+0200] [ALPM] upgraded qemu (5.2.0-4 -> 6.0.0-2)
[2021-05-05T08:02:08+0200] [ALPM] upgraded qemu-arch-extra (5.2.0-4 -> 6.0.0-2)
[2021-05-05T08:02:09+0200] [ALPM] transaction completed
[2021-05-05T08:02:09+0200] [ALPM] running '30-systemd-udev-reload.hook'...
[2021-05-05T08:02:09+0200] [ALPM] running '30-systemd-update.hook'...
[2021-05-05T08:02:09+0200] [ALPM] running 'gtk-update-icon-cache.hook'...
[2021-05-05T08:02:09+0200] [ALPM] running 'texinfo-install.hook'...
[2021-05-05T08:02:09+0200] [ALPM] running 'update-desktop-database.hook'...


Installed package versions:
local/libvirt 1:7.1.0-3
local/libvirt-glib 4.0.0-1
local/libvirt-python 1:7.1.0-1
local/linux 5.11.16.arch1-1
local/qemu 6.0.0-2
local/qemu-arch-extra 6.0.0-2
local/virt-install 3.2.0-1
local/virt-manager 3.2.0-1
local/zfs-linux-git 2021.04.30.r6791.gc903a756a_5.11.16.arch1.1-1
local/zfs-utils-git 2021.04.30.r6791.gc903a756a-1



Downgrading back to qemu 5.2.0-4 brings back functionality.

Closed by  David Runge (dvzrv)
Wednesday, 05 May 2021, 09:02 GMT
Reason for closing:  Not a bug
Additional comments about closing:  User configuration issue (fixed in comment):

https://bugs.archlinux.org/task/70713#comment199457
Comment by Toolybird (Toolybird) - Wednesday, 05 May 2021, 07:35 GMT
> 'file' driver requires '/dev/zvol/VM/HLRS_gateway' to be a regular file

The 'file' driver seems incorrect for a zvol. Please post an xml snippet containing your <disk>...</disk> section.
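
For anyone wanting to confirm this on their own system, here is a minimal Python sketch (not part of the original report) that reports which libvirt disk type a given path calls for. The helper names `disk_type_for_mode` and `disk_type_for_path` are made up for illustration, and the sketch assumes only regular files and block devices are of interest.

```python
import os
import stat


def disk_type_for_mode(mode: int) -> str:
    """Map an st_mode value to the <disk type="..."> value it calls for."""
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISREG(mode):
        return "file"
    # Directories, character devices, etc. are out of scope for this sketch.
    raise ValueError(f"unsupported file type: {stat.filemode(mode)}")


def disk_type_for_path(path: str) -> str:
    # os.stat() follows symlinks, so a /dev/zvol/... entry that is a symlink
    # to a device node is classified by the node it points to.
    return disk_type_for_mode(os.stat(path).st_mode)
```

Calling `disk_type_for_path("/dev/zvol/VM/VPN_gateway")` on the reporter's machine would presumably return "block", which matches the error message: the path is not a regular file.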
Comment by Uwe Sauter (UweSauter) - Wednesday, 05 May 2021, 07:42 GMT
Disk configuration from VM:

<disk type="file" device="disk">
<driver name="qemu" type="raw"/>
<source file="/dev/zvol/VM/VPN_gateway" index="1"/>
<backingStore/>
<target dev="vdb" bus="virtio"/>
<alias name="virtio-disk1"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x09" function="0x0"/>
</disk>

Storage configuration:
<pool type="dir">
<name>dev_zvol_vm</name>
<uuid>e7cd2030-4187-4856-8edf-0d35782d20a9</uuid>
<capacity unit="bytes">16261496832</capacity>
<allocation unit="bytes">0</allocation>
<available unit="bytes">16261496832</available>
<source>
</source>
<target>
<path>/dev/zvol/VM</path>
<permissions>
<mode>0755</mode>
<owner>0</owner>
<group>0</group>
</permissions>
</target>
</pool>


This configuration is a leftover from the time when libvirt was unable to handle ZFS volumes and the only option was to configure /dev/zvol/<POOLNAME> as a directory containing raw images.



Comment by Uwe Sauter (UweSauter) - Wednesday, 05 May 2021, 07:50 GMT
Changing configuration to "native" ZFS doesn't change the outcome with qemu 6:

Selecting the Zvol from storage configuration
<pool type="zfs">
<name>VM</name>
<uuid>fae6e2c3-7fbb-4943-b3f6-16caf54c9f76</uuid>
<capacity unit="bytes">1992864825344</capacity>
<allocation unit="bytes">317974659072</allocation>
<available unit="bytes">1674890166272</available>
<source>
<name>VM</name>
</source>
<target>
<path>/dev/zvol/VM</path>
</target>
</pool>


results in a VM disk configuration

<disk type="file" device="disk">
<driver name="qemu" type="raw"/>
<source file="/dev/zvol/VM/VPN_gateway"/>
<target dev="vda" bus="virtio"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x07" function="0x0"/>
</disk>


and the error

Error starting domain: Internal error: process exited while connecting to monitor: 2021-05-05T07:48:53.330160Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/dev/zvol/VM/HLRS_gateway","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: 'file' driver requires '/dev/zvol/VM/HLRS_gateway' to be a regular file

Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in cb_wrapper
callback(asyncjob, *args, **kwargs)
File "/usr/share/virt-manager/virtManager/asyncjob.py", line 101, in tmpcb
callback(*args, **kwargs)
File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line 57, in newfn
ret = fn(self, *args, **kwargs)
File "/usr/share/virt-manager/virtManager/object/domain.py", line 1329, in startup
self._backend.create()
File "/usr/lib/python3.9/site-packages/libvirt.py", line 1353, in create
raise libvirtError('virDomainCreate() failed')
libvirt.libvirtError: Internal error: process exited while connecting to monitor: 2021-05-05T07:48:53.330160Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/dev/zvol/VM/HLRS_gateway","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: 'file' driver requires '/dev/zvol/VM/HLRS_gateway' to be a regular file



Sorry for the formatting, but Flyspray's markdown dialect differs from what I'm used to…
Comment by Toolybird (Toolybird) - Wednesday, 05 May 2021, 07:51 GMT
Try editing the XML to look like:

<disk type="block" device="disk">
<source dev="/dev/zvol/VM/VPN_gateway" index="1"/>

I don't use ZFS but I do use LVM volumes and the syntax should be similar.
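
If several VMs need the same change, the edit can be scripted. The following is only a rough Python sketch under the assumption that the `<disk>` element looks like the ones posted in this thread; the function name `file_disk_to_block` is hypothetical, and the result would still have to be applied by hand (for example via `virsh edit`).

```python
import xml.etree.ElementTree as ET


def file_disk_to_block(disk_xml: str) -> str:
    """Rewrite a <disk type="file"> element into the type="block" form."""
    disk = ET.fromstring(disk_xml)
    if disk.get("type") != "file":
        return disk_xml  # nothing to do for other disk types
    disk.set("type", "block")
    source = disk.find("source")
    if source is not None and "file" in source.attrib:
        # Block devices are referenced with 'dev' rather than 'file'.
        source.set("dev", source.attrib.pop("file"))
    return ET.tostring(disk, encoding="unicode")


example = """<disk type="file" device="disk">
<driver name="qemu" type="raw"/>
<source file="/dev/zvol/VM/VPN_gateway"/>
<target dev="vda" bus="virtio"/>
</disk>"""
print(file_disk_to_block(example))
```

The sketch only touches the `type` attribute and the `<source>` element; the driver, target and address elements are passed through unchanged.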
Comment by Uwe Sauter (UweSauter) - Wednesday, 05 May 2021, 08:01 GMT
Well, thanks.

index="1" seems unnecessary (in fact it gets removed automatically on boot) but the proposed change did the trick. Time to move all VMs to the ZFS storage config and remove the directory-backed pool.


As I don't seem to have the permissions to close my own ticket: admins, please go ahead and close.
