FS#50420 - [lvm2] Remote possibility of data loss during shutdown using thin volumes

Attached to Project: Arch Linux
Opened by James Harvey (jamespharvey20) - Wednesday, 17 August 2016, 10:42 GMT
Last edited by Christian Hesse (eworm) - Wednesday, 10 February 2021, 09:35 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Christian Hesse (eworm)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 8
Private No

Details

See related dm-devel mailing list thread at:

https://www.mail-archive.com/dm-devel@redhat.com/msg02361.html


Description:

After systemd 231 (commit d4506129) raised the timeout between SIGTERM and SIGKILL from 10 seconds to 90 seconds, the delay I was seeing on reboots grew from 10 seconds to 90 seconds, which sent me digging. It turns out that when thin volumes are in use, the lvm2-lvmetad and dm-event services ignore systemd's SIGTERM and are eventually SIGKILL'ed by systemd. This has been happening for quite some time, months at least.

I had modified my systemd services to only give these services 3 seconds, but I got a response from a RedHat developer saying SIGKILL'ing dmeventd during a recovery operation and also lvmetad can "leave your system in a dizzy state (suspended devices) essentially useless".

If I'm interpreting this correctly, I think there's a possibility of data loss IF you happen to manually stop these services, shut down, or reboot while one of the services is modifying something, such as auto-extending a volume. Seems like a remote situation to me, but I wanted to pass it along.

Short of that, it's probably just irritating Arch users with thin volumes, who are left wondering why reboots are suddenly so slow.

He says dm-event ignores SIGTERM/SIGINT when a device is being monitored.

But after running:
==========
for lvDeviceName in $( lvs --noheadings -o dmpath ); do
lvchange --monitor n ${lvDeviceName}
done
for volumeGroup in $( vgs --noheadings -o vg_name); do
vgchange --monitor n ${volumeGroup}
done
==========

Then running systemctl stop dm-event still gives a 90 second timeout. So something else is going on.
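One way to check whether the loops above really turned monitoring off is to ask lvs for the monitoring status of each segment. This is only a rough sketch: the helper name still_monitored is made up for illustration, and it assumes the seg_monitor column reads exactly "monitored" for volumes dmeventd is still watching.

```shell
# Hypothetical helper: print the LVs that are still reported as monitored.
# Reads "name|status" lines on stdin, as produced by:
#   lvs --noheadings --separator '|' -o lv_full_name,seg_monitor
# Assumption: the status column is exactly "monitored" for watched volumes.
still_monitored() {
    awk -F'|' '{
        # Trim surrounding whitespace from both columns.
        gsub(/^[ \t]+|[ \t]+$/, "", $1)
        gsub(/^[ \t]+|[ \t]+$/, "", $2)
        if ($2 == "monitored") print $1
    }'
}
```

It would be used as `lvs --noheadings --separator '|' -o lv_full_name,seg_monitor | still_monitored`. Empty output would suggest that nothing is monitored any more, so the remaining delay has to come from somewhere else.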

At this point, I'm wondering if I can simply not use lvm2-lvmetad and dm-event. My understanding is that lvm2-lvmetad just caches the LVM information so the disks don't have to be scanned on commands like lvs. dm-event is a bit more vague: it's for library plugins, but I think it's mainly for things like auto-extend, which could be done manually, granted that doing it automatically could prevent a volume from unexpectedly becoming unwritable.


Additional info:
linux 4.7.0-1, lvm2 2.02.164-1

The RedHat developer also said: "Fedora should be doing it properly on reboot - switching to ramdisk and continuing with shutdown sequence from there. Unsure how other OS-es solves this."

I admit I don't know much about Arch's shutdown procedure, much more familiar with its booting procedure.

I might see if I can reproduce this using Arch in a VM, and if so, see if I can reproduce this using Fedora in a VM.


Steps to reproduce:

Here are the minimal steps I used to cause the problem, using one disk and EXT4.

If during the install I combine the 2 lvcreate commands into a single one without using thin pools, then lvmetad terminates pretty much immediately with SIGTERM.

===================

/dev/sda1 3.5G Linux filesystem
/dev/sda2 4.5TB Linux LVM

{ Setup LVM and filesystems }
# mkfs.ext4 -L boot /dev/sda1
# pvcreate /dev/sda2
# vgcreate disk1 /dev/sda2
{ Merging these 2 lvcreates, removing the thin volume usage makes lvm2-lvmetad properly terminate on SIGTERM }
# lvcreate --size 500G --thinpool disk1thin disk1
# lvcreate --virtualsize 100G --name root disk1/disk1thin
# mkfs.ext4 -L /mnt /dev/disk1/root
# mount /dev/disk1/root /mnt
# mkdir /mnt/boot
# mount /dev/sda1 /mnt/boot

{ Install Arch Linux }
# vi /etc/pacman.d/mirrorlist
# pacstrap -i /mnt base syslinux gptfdisk lvm2
# arch-chroot /mnt
# vi /etc/locale.gen
# locale-gen
# locale > /etc/locale.conf
# vi /etc/nsswitch.conf
# systemctl enable systemd-resolved systemd-networkd
# ln -s /usr/share/zoneinfo/America/Detroit /etc/localtime
# hwclock --utc --systohc
# passwd
{ Add lvm2 between block and filesystems }
# vi /etc/mkinitcpio.conf
# mkinitcpio -p linux
# echo hostname > /etc/hostname
# vi /etc/systemd/network/enp31s0.network
# syslinux-install_update -i -a -m
# vi /boot/syslinux/syslinux.cfg

{ After Reboot, any one of these shows the delay }
# systemctl stop dm-event
# systemctl poweroff
# systemctl reboot

Closed by  Christian Hesse (eworm)
Wednesday, 10 February 2021, 09:35 GMT
Reason for closing:  Implemented
Additional comments about closing:  lvm2 2.02.176-2
Comment by James Harvey (jamespharvey20) - Saturday, 03 September 2016, 03:21 GMT
The short answer is that lvm2-monitor.service needs to be enabled. It runs "vgchange --monitor n" before lvm2-lvmetad.service and dm-event.service are shut down, so those programs are no longer monitoring LVM volumes and terminate on SIGTERM rather than ignoring it.
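A minimal sketch of the workaround described above, assuming a systemd-based install where the lvm2 package ships lvm2-monitor.service. The run helper and the DRY_RUN variable are hypothetical additions for illustration, so the commands can be previewed before anything is changed.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Enable and start lvm2-monitor.service, then report its enablement state.
enable_lvm_monitoring() {
    run systemctl enable lvm2-monitor.service
    run systemctl start lvm2-monitor.service
    run systemctl is-enabled lvm2-monitor.service
}
```

Running `DRY_RUN=1 enable_lvm_monitoring` only prints the three systemctl commands. Running `enable_lvm_monitoring` as root applies them.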

The long answer is a little confusing to me, since it gets into the Arch philosophy of minimizing deviation from upstream.

The summary of my longer answer is that I think lvm2-monitor.service should be defaulted to be enabled. The rest is something I think someone higher up than me might want to consider, as there could be other issues in the way Arch configures/runs LVM2. I'm not necessarily advocating all the changes below be made, just saying I think it would be worth someone deciding.

If we consider the sourceforge releases, or even the fedora lvm2.git branch, as upstream, there isn't really a set of configuration parameters and systemd unit defaults. There's only a fairly useless INSTALL file that basically says configure, make, make install, and refer to man pages.

But lvm2 is certainly a redhat/fedora project, and I wonder if Arch's lvm2 package should treat fedora's lvm2 package as upstream, rather than the unconfigured/no settings source on sourceforge or the git branch. Especially when the redhat/fedora LVM2 developers seem to be saying that, in some cases, support requests are invalid if lvm2 is not run the way their redhat/fedora packages configure it. (As in this case of lvm2-monitor.service being disabled.)

Looking at Fedora 24's .spec file, Arch's PKGBUILD deviates from it a lot. IF I'm understanding and working through their .spec file properly:

Arch enables dm-event.socket and lvm2-lvmetad.socket. Fedora enables these, but also enables blk-availability.service, lvm2-lvmpolld.socket (which isn't even in Arch's package), and lvm2-monitor.service. Following Fedora's configuration here would have lvm2-monitor.service enabled, and avoid this shutdown delay and (remote) potential data loss.

For patches, Fedora 24 is using lvm2 v2.02.150, but Arch is using v2.02.164. So although Fedora uses 4 patches that Arch doesn't, I have a feeling (but haven't confirmed) these may be backported patches only needed by Fedora since they're 14 versions back, that may already be included in what Arch uses.

For configure options, Fedora 24 uses additional parameters that Arch doesn't: --with-usrlibdir=/usr/lib --enable-lvm1_fallback --enable-fsadm --with-pool=internal --enable-write_install --with-user= --with-group= --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --enable-blkid_wiping --enable-python2-bindings --enable-python3-bindings --with-cluster=internal --with-clvmd=corosync --with-cmirrord --with-udevdir=%{_prefix}/lib/udev/rules.d --enable-udev_sync --enable-lvmpolld --enable-lockd-dlm --enable-lockd-sanlock --enable-dbus-service

I'm not sure what all of those configure options do, and am not sure if any of them would be incompatible with Arch for any reason, or may not make a difference on Arch.

One that sticks out to me for sure is Arch's missing --enable-lvmpolld, as Fedora defaults to having lvmpolld.service enabled, but it and its executable aren't on Arch.

And, there are several Arch parameters used on configure that Fedora 24 doesn't use: --sysconfdir=/etc --prefix=/usr --with-systemdsystemunitdir=/usr/lib/systemd/system --with-udev-prefix=/usr --enable-readline --enable-udev_sync --enable-udev_rules --localstatedir=/var --sbindir=/usr/bin

Most of those just point to the proper directories, which may be in different locations on Fedora and so are fine, but some of the middle ones might alter behavior, or may already default to on. I'm not sure.
Comment by James Harvey (jamespharvey20) - Saturday, 03 September 2016, 03:24 GMT
I'll also copy in 3 questions I just added to the dm-devel mailing list, in the thread linked to in the original bugreport. (The link is screwed up, but copy/paste works.)

1) Should the lvm2-lvmetad, dm-event, and lvm2-monitor unit files be modified so they are never sent a SIGKILL? Even with lvm2-monitor.service enabled, even on Fedora, if systemd sees they haven't exited within 90 seconds of SIGTERM/SIGINT (90 seconds as of systemd v231, 10 seconds before), it sends them a SIGKILL. I think adding "SendSIGKILL=no" to the Service and Socket sections will do this, if I understand it correctly.

2) Should lvm2-lvmetad and dm-event systemd unit files want lvm2-monitor.service?

3) Could all LVM programs be changed so if they receive a SIGTERM/SIGINT and choose to ignore it, they give a warn/info/debug message? Not doing so invites thinking a SIGKILL is the proper thing to do.
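
For question 1, this is roughly what I have in mind, as an untested sketch rather than a recommendation. It writes a drop-in instead of editing the packaged unit file. The function name, the file name no-sigkill.conf and the optional base-directory argument are made up for illustration, and only the [Service] section is shown.

```shell
# Hypothetical helper: write a drop-in that sets SendSIGKILL=no for a unit.
# $1 = unit name (e.g. dm-event.service)
# $2 = base directory, defaults to /etc/systemd/system
write_no_sigkill_dropin() {
    unit="$1"
    base="${2:-/etc/systemd/system}"
    dir="$base/$unit.d"
    mkdir -p "$dir" || return 1
    # Drop-in file name is arbitrary; systemd reads every *.conf in the directory.
    cat > "$dir/no-sigkill.conf" <<'EOF'
[Service]
SendSIGKILL=no
EOF
}
```

Something like `write_no_sigkill_dropin dm-event.service && systemctl daemon-reload` as root would apply it. Passing a scratch directory as the second argument lets the generated file be inspected without touching /etc.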
Comment by James Harvey (jamespharvey20) - Wednesday, 07 September 2016, 10:09 GMT
FWIW, as viewable in the thread the original report links to, Zdenek Kabelac (highest committer in LVM at least recently) says having lvm2-monitor.service disabled by default is a "crazy idea" and "IMHO a thing to fix in Arch".

Discussion on questions 1-2 ongoing, question 3 needs some BugZilla entries by me.
Comment by Timofonic (timofonic) - Sunday, 12 November 2017, 16:52 GMT
What's the status of this bug? It's from September 2016! :O
Comment by loqs (loqs) - Sunday, 06 September 2020, 20:29 GMT
lvm2 2.02.176-2 enabled lvm2-lvmpolld.socket and lvm2-monitor.service by default.

Are there any remaining issues to be addressed?
