FS#55044 - [qemu] [jemalloc] ceph/rbd backed libvirt VMs experience I/O hang

Attached to Project: Arch Linux
Opened by Jamin Collins (jamincollins) - Saturday, 05 August 2017, 20:49 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Sunday, 04 July 2021, 21:16 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Tobias Powalowski (tpowa)
Sven-Hendrik Haase (Svenstaro)
Anatol Pomozov (anatolik)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
VMs using ceph/rbd backed volumes experience a complete I/O stall when attempting to access the ceph/rbd volume

Additional info:
* ceph-10.2.5-2

Steps to reproduce:
* configure libvirt for usage with ceph[1]
* configure at least one VM drive to be backed by ceph/rbd (see the sketch after this list)
* boot VM
* attempt to access ceph/rbd backed volume
- dd if=/dev/zero of=${rbd_backed_device} bs=1M oflag=sync status=progress
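
For the sketch referenced above, here is one minimal way to attach and exercise an rbd-backed disk, following the libvirt guide [1]; the pool name, image name, monitor host, secret UUID, and VM name are all placeholders to adapt to your cluster:

# Create a test image in an existing rbd pool (names are hypothetical).
rbd create libvirt-pool/test-image --size 10240

# Describe the disk; host and auth values must match your cluster setup.
cat > rbd-disk.xml <<'EOF'
<disk type='network' device='disk'>
  <source protocol='rbd' name='libvirt-pool/test-image'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <auth username='libvirt'>
    <secret type='ceph' uuid='REPLACE-WITH-SECRET-UUID'/>
  </auth>
  <target dev='vdb' bus='virtio'/>
</disk>
EOF

# Attach it persistently; inside the guest it appears as /dev/vdb.
virsh attach-device test-vm rbd-disk.xml --persistent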

Background:
I recently found that I could not restart any of my ceph/rbd backed VMs; they all appeared to hang during boot. Initially, I thought this was due to migrating them to a freshly installed host and possibly missing some configuration step. However, attempting to boot them on an existing host revealed similar hangs during boot. Eventually, I attempted to boot one of the VMs on a third host and found that it worked. Looking for differences between the hosts, I found that the working host was running my development aur/ceph-git package while the failing hosts were all running the extra/ceph 10.2.5-2 package. To test this theory, I installed my aur/ceph-git package on one of the failing hosts. After installation, VMs were able to start successfully. Reverting to the extra/ceph package resulted in the same VM boot hangs.

Using the extra/ceph PKGBUILD as a template, I've successfully compiled 10.2.6 and 10.2.7. Both versions exhibit the same I/O hang behavior. In all cases, replacing the extra/ceph package with the aur/ceph-git build resolves the issue.

The specific aur/ceph-git build used is:
ceph-git-1:12.1.0.1018.g171104cb93-1-x86_64.pkg.tar.xz

I'll continue digging into the differences between the two packages and update this report if I'm successful in finding a solution.



[1] - http://docs.ceph.com/docs/master/rbd/libvirt/

Closed by Sven-Hendrik Haase (Svenstaro)
Sunday, 04 July 2021, 21:16 GMT
Reason for closing:  Fixed
Comment by Jamin Collins (jamincollins) - Saturday, 05 August 2017, 20:51 GMT
I should probably note that previously running VMs do not appear to be impacted, as long as they are not stopped and restarted.
Comment by Eli Schwartz (eschwartz) - Sunday, 06 August 2017, 19:07 GMT
You can try git bisect to see where this got fixed in the development version.
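For reference, a sketch of what such a bisect could look like, using inverted bisect terms since we'd be hunting for a fix rather than a regression; the good commit hash is taken from the ceph-git package version above, the release/dev branches may need a common-ancestor check, and the build-and-test step at each iteration is elided:

git clone https://github.com/ceph/ceph.git && cd ceph
# Custom terms (git >= 2.7): old releases are "broken", the dev build is "fixed".
git bisect start --term-old=broken --term-new=fixed
git bisect broken v10.2.5
git bisect fixed 171104cb93   # from ceph-git 12.1.0.1018.g171104cb93
# At each step: build, install, rerun the dd reproducer, then mark the
# commit with `git bisect broken` or `git bisect fixed`.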
Comment by Jamin Collins (jamincollins) - Monday, 07 August 2017, 16:16 GMT
I'm not convinced this is a problem with the upstream code. Rather, it appears to be an issue with the Arch Linux package.

I say this for two reasons:
1) this use case (backing libvirt VMs with rbd devices) is common for ceph, and the 10.2.5 release has been out since December 2016; a bug like this would have been reported by now
2) Ubuntu 16.04 LTS ships ceph 10.2.7; I've installed and configured an Ubuntu-based virt host with it, and it does not experience this I/O hang

Comment by Jamin Collins (jamincollins) - Monday, 07 August 2017, 19:15 GMT
Attached is an rbd debug log from the host during a VM IO hang under ceph 10.2.7.
Comment by Jamin Collins (jamincollins) - Monday, 07 August 2017, 21:01 GMT
I also have additional rbd debug logs from the same host running:
- Ubuntu with ceph 10.2.7 where it does not hang
- Arch with ceph-git-1:12.1.0.1018.g171104cb93-1 where it does not hang

However, these logs are each over 15M compressed.
Comment by Jamin Collins (jamincollins) - Tuesday, 08 August 2017, 18:19 GMT
Working with a ceph developer, we've isolated the issue to Arch's qemu build. Specifically, the Arch qemu package build explicitly enables jemalloc, and the ceph logs show that several threads were hung inside a jemalloc request. Rebuilding the ceph package without jemalloc resolves this I/O hang and appears to improve overall RBD-related throughput.
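For anyone who wants to check their own binary, a quick way to confirm whether qemu is linked against jemalloc (the path below is the usual Arch location; adjust as needed):

# Prints a libjemalloc line only if qemu was built with jemalloc enabled.
ldd /usr/bin/qemu-system-x86_64 | grep -i jemalloc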

This report should probably be reassigned to the qemu package.
Comment by Eli Schwartz (eschwartz) - Tuesday, 08 August 2017, 19:24 GMT
Rebuilding the ceph package, or the qemu package???

And I wonder why ceph-git didn't have that problem, if it is a qemu issue.
Comment by Jamin Collins (jamincollins) - Tuesday, 08 August 2017, 19:33 GMT
Rebuilding only the qemu package, so that it does not use jemalloc. I'm currently deploying a custom qemu build to all my VM hosts to defuse this time bomb.
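For anyone wanting the same workaround, a rough sketch of the rebuild; I'm assuming the PKGBUILD passes --enable-jemalloc to configure (the exact flag may differ by qemu version), and asp is just one way to fetch the official build files:

# Fetch the qemu build files and flip the jemalloc configure option.
asp checkout qemu && cd qemu/trunk
sed -i 's/--enable-jemalloc/--disable-jemalloc/' PKGBUILD

# Build and install the patched package.
makepkg -si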

I suspect that quite a bit has changed architecturally between the ceph LTS and development branches. I know they use an entirely different build system.
Comment by Filipp Andjelo (scorp) - Friday, 15 September 2017, 09:25 GMT
Just a question: could this issue also cause similar behavior with storage backends other than ceph/rbd?
My CentOS 7 VM was completely broken by a sudden I/O stall during a system update. I tried to set up a new CentOS VM using virt-manager, but after a few minutes the VM hangs completely during installation, apparently due to some I/O problem. The qemu instance can then only be killed the hard way. This is absolutely reproducible on my machine here: just grab the latest ISO from CentOS and try to install. I'll try this later on another machine; I wonder if I'll run into the same problem there.
Comment by Filipp Andjelo (scorp) - Friday, 15 September 2017, 13:31 GMT
Just recompiled without jemalloc, but it didn't help. Seems to be a different problem.
Comment by loqs (loqs) - Monday, 07 September 2020, 02:56 GMT
Comment by Anatol Pomozov (anatolik) - Tuesday, 08 September 2020, 16:38 GMT
Indeed, jemalloc is not used with qemu anymore.

Jamin, could you please confirm if the issue still exists?
Comment by Jamin Collins (jamincollins) - Wednesday, 09 September 2020, 04:54 GMT
I haven't had the issue in quite a while. I did run a custom build for a time (as indicated previously in this report), but I have been running stock builds for a while now, currently:

ceph 14.2.8-1
ceph-libs 14.2.8-1
qemu 5.0.0-7
qemu-block-rbd 5.0.0-7
Comment by Sven-Hendrik Haase (Svenstaro) - Sunday, 04 July 2021, 21:15 GMT
It would appear that we can consider this fixed then for the time being. I'll close it. Request a reopen if it does reoccur in some fashion.
