FS#76220 - [qemu] guest boot fail with "cache=none" when sector sizes differ

Attached to Project: Arch Linux
Opened by frederick_metzengerstein (metzengerstein) - Sunday, 16 October 2022, 17:50 GMT
Last edited by David Runge (dvzrv) - Thursday, 20 October 2022, 17:45 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Anatol Pomozov (anatolik)
David Runge (dvzrv)
Architecture x86_64
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
QEMU no longer boots from drives that use the option "-drive cache=none,...",
resulting in the following output after qemu starts:

"SeaBIOS (version Arch Linux 1.16.0-3-3)
Boot failed: could not read the boot disk"

It doesn't matter whether the boot medium is a disk or a cdrom (see below).
Setting cache to writeback, writethrough, etc. works.

Additional info:
System: Linux 6.0.1-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 13 Oct 2022 18:58:49 +0000 x86_64 GNU/Linux
package: qemu-desktop 7.1.0-9

Steps to reproduce (example lines):
$ qemu-system-x86_64 -enable-kvm -cpu qemu64 -m 2G -net none -drive file=disk.img,format=raw,cache=none
$ qemu-system-x86_64 -enable-kvm -cpu qemu64 -m 2G -net none -drive file=cd.iso,media=cdrom,cache=none -boot order=d
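
For context: cache=none makes qemu open the image with O_DIRECT, bypassing the host page cache. A quick way to check whether direct reads of the image file work at all, independent of qemu, is dd with iflag=direct. The following is only a minimal sketch, assuming GNU dd; the helper name probe_direct_read is made up for illustration, and whether a failing direct read is what breaks the boot here is an assumption, not something confirmed in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of qemu or coreutils):
# try a single O_DIRECT read of $2 bytes from file $1.
# Prints "bs=<size>: ok" or "bs=<size>: failed" and returns non-zero on failure.
probe_direct_read() {
    file=$1
    bs=$2
    if dd if="$file" of=/dev/null bs="$bs" count=1 iflag=direct 2>/dev/null; then
        echo "bs=$bs: ok"
    else
        echo "bs=$bs: failed"
        return 1
    fi
}

# Example usage: compare a 512-byte and a 4096-byte direct read of the image.
# probe_direct_read disk.img 512
# probe_direct_read disk.img 4096
```

If the two block sizes behave differently on the same file, that would hint at an alignment requirement imposed by the storage stack underneath the file.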

Last working package versions:
linux 5.19.13.arch1-1
qemu-desktop 7.1.0-8

Closed by  David Runge (dvzrv)
Thursday, 20 October 2022, 17:45 GMT
Reason for closing:  Fixed
Additional comments about closing:  Fixed with qemu 7.1.0-10
Comment by Toolybird (Toolybird) - Sunday, 16 October 2022, 20:09 GMT
Cannot repro. I just grabbed latest Arch Boxes qcow2, converted it to raw img, and it boots fine using your cmd. There must be something else going on with your setup. cache=none is a fairly common config so I doubt it's the problem. Admittedly, I have a local QEMU build, but it's pretty close to upstream Arch. It would be good if someone else can try to repro.
Comment by frederick_metzengerstein (metzengerstein) - Sunday, 16 October 2022, 21:34 GMT
Well, it's definitely the cache option. As I wrote, even trying to boot the arch iso with a simple cmd like the following doesn't work:
$ qemu-system-x86_64 -m 2G -drive file=archlinux-2022.06.01-x86_64.iso,media=cdrom,cache=none -boot order=d

Btw, if I enable the boot menu (with -boot menu=on) the disk/iso is listed there. So the bios "sees" it, but cannot boot it.


I haven't made any changes to the system lately, so it must be caused by the upgrade to linux 6.0.1 and/or qemu 7.1.0-9.
Looking at the boot log, the only warning that appeared after the linux 6.0.1 upgrade is: "amd_gpio AMDI0030:00: failed to get iomux index", but I don't think it's related.

HW: Asus B350-F Gaming, Ryzen 5600x
Comment by loqs (loqs) - Sunday, 16 October 2022, 21:39 GMT
Is the issue still present if you use linux-lts? It could not be replicated on 5.19 [1].

[1] https://bbs.archlinux.org/viewtopic.php?id=280493
Comment by frederick_metzengerstein (metzengerstein) - Sunday, 16 October 2022, 22:54 GMT
Just tried with linux-lts: It doesn't have this problem.
So linux 6.0 is the culprit here.
Comment by frederick_metzengerstein (metzengerstein) - Monday, 17 October 2022, 17:33 GMT
New weird findings:
The bug happens only if the file (disk image, iso) resides on a file system inside a crypto luks container!
(which is my setup: my system root and my vms are on an ext4 partition inside a luks container)
If the file is on a normal filesystem (ext4, ntfs, xfs whatever), the booting works without problem.

Steps to reproduce:
1. create an empty partition on a spare device (here /dev/sdc4)
2. make crypto container ($ cryptsetup luksFormat /dev/sdc4)
3. open container ($ cryptsetup open /dev/sdc4 testcrypto)
4. create a filesystem inside container ($ mkfs.ext4 /dev/mapper/testcrypto)
5. create mountpoint and mount test partition ($ mkdir /mnt/test && mount /dev/mapper/testcrypto /mnt/test)
6. cp an iso to /mnt/test ($ cp archlinux-2022.06.01-x86_64.iso /mnt/test)
7. run qemu cmd ($ qemu-system-x86_64 -m 2G -drive file=/mnt/test/archlinux-2022.06.01-x86_64.iso,media=cdrom,cache=none)
-> expected result: qemu fails to boot from the iso

Can someone please try to reproduce?

Btw: the bug also happens with uefi (so it's not related to seabios or anything like that)
Comment by frederick_metzengerstein (metzengerstein) - Monday, 17 October 2022, 22:37 GMT
OK, I've tried to reproduce this on a drive other than my internal SSDs (SATA AHCI), namely a USB drive:
-> It doesn't show this bug.
So I've compared the luks dump ($ cryptsetup luksDump /dev/sdx) and noticed the difference of the encryption sector sizes:
USB: 512 bytes, SSD: 4096 bytes
Thus, I recreated the container on the SSD with an encryption sector size of 512 (Step 2: $ cryptsetup --sector-size 512 luksFormat /dev/sdc4):
-> qemu boots without problem

So somehow the bug depends on the encryption sector size of the luks volume.

The fdisk output of my SSD (sdc) is: Sector size (logical/physical): 512 bytes / 4096 bytes
According to https://wiki.archlinux.org/title/Advanced_Format#dm-crypt, cryptsetup automatically sets the optimal encryption sector size,
which is 4096 bytes for my SSDs.
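
The comparison above can be scripted. This is a minimal sketch, assuming a LUKS2 header whose luksDump output contains a "sector: N [bytes]" line in the data segment; the helper names luks_sector_size and sector_sizes_differ are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `cryptsetup luksDump` output on stdin and print
# the encryption sector size in bytes, taken from the first "sector:" line
# (assumes the LUKS2 dump layout; LUKS1 dumps have no such line).
luks_sector_size() {
    awk '$1 == "sector:" { print $2; exit }'
}

# Hypothetical helper: succeed if the encryption sector size ($1) is larger
# than the logical sector size of the underlying device ($2).
sector_sizes_differ() {
    [ "$1" -gt "$2" ]
}

# Example usage (as root; /dev/sdc4 is the test partition from the steps above):
# luks=$(cryptsetup luksDump /dev/sdc4 | luks_sector_size)
# logical=$(blockdev --getss /dev/sdc4)
# sector_sizes_differ "$luks" "$logical" && echo "LUKS $luks > logical $logical"
```

On the SSD described here this would compare 4096 (LUKS) against 512 (logical); on the USB drive both values would be 512.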
Comment by Toolybird (Toolybird) - Thursday, 20 October 2022, 06:38 GMT
Comment by David Runge (dvzrv) - Thursday, 20 October 2022, 07:11 GMT
@metzengerstein: Thanks for the ticket and the tests!
@loqs: Thanks for the investigation on this and figuring out the related upstream commits fixing the problem!

Will apply in a pkgrel bump in [testing].
Comment by David Runge (dvzrv) - Thursday, 20 October 2022, 08:00 GMT
The fixes are applied in [testing] in qemu 7.1.0-10.
Please let me know if this works as expected.
Comment by frederick_metzengerstein (metzengerstein) - Thursday, 20 October 2022, 17:05 GMT
No more boot problems with qemu 7.1.0-10 from [testing]. :)
Comment by David Runge (dvzrv) - Thursday, 20 October 2022, 17:45 GMT
Cool, thanks for testing!