
FS#70830 - [systemd] 248-2 fails to decrypt multiple volumes with same passphrase

Attached to Project: Arch Linux
Opened by Federico (fedev) - Wednesday, 12 May 2021, 23:39 GMT
Last edited by Andreas Radke (AndyRTR) - Thursday, 13 May 2021, 08:23 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Assigned
Assigned To: Christian Hesse (eworm)
Architecture: All
Severity: Critical
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 0
Private: No

Details

Description:

This is very similar to https://bugs.archlinux.org/task/70272. The main difference from that ticket (which might not be so different, as I'll explain) is that this LUKS volume is encrypted with a password and not a key file.

The reason I say it might not be so different after all is that I have two volumes, and while the first is decrypted, the second isn't. The cause seems to be that systemd temporarily stores the passphrase as a file while it tries the same passphrase on different volumes. This can be seen in the logs:

May 13 07:19:18 archlinux systemd-tty-ask-password-agent[315]: Failed to send: No such file or directory
May 13 07:19:18 archlinux systemd-tty-ask-password-agent[315]: Invalid password file /run/systemd/ask-password/ask.RpMg9u
May 13 07:19:18 archlinux systemd-tty-ask-password-agent[315]: Failed to process password: No such file or directory

Additional info:
* package version(s): systemd 248.2-2 (I'll test with the previous version and report back)


Steps to reproduce:

- Power-up
- Provide LUKS passphrase
- Observe the drive failing to decrypt and mount

Comment by Federico (fedev) - Thursday, 13 May 2021, 00:39 GMT
Downgrading systemd (all systemd packages) to 248-5 didn't do the trick, even though it was working on that version before. Besides downgrading, I manually removed the files named in the messages below, which appeared after the downgrade, and forced mkinitcpio to run again:

Skipping "/boot/EFI/systemd/systemd-bootx64.efi", since a newer boot loader version exists already.
Skipping "/boot/EFI/BOOT/BOOTX64.EFI", since a newer boot loader version exists already.

What did work in the end was using snapper to revert to a previous snapshot (still with systemd at 248-5). Curiously, an error was still logged at that point:

May 13 08:24:07 archlinux systemd-tty-ask-password-agent[311]: Invalid password file /run/systemd/ask-password/ask.rJg049
May 13 08:24:07 archlinux systemd-tty-ask-password-agent[311]: Failed to process password: Bad message

Even so, the system was able to unlock the drive and boot normally. Note that the error is not the same as before.
Comment by Federico (fedev) - Friday, 14 May 2021, 02:41 GMT
Since my initial report, I tried upgrading back to 248.2-2 to test further. The error is logged consistently:

May 14 09:24:56 archlinux systemd-tty-ask-password-agent[320]: Failed to send: No such file or directory
May 14 09:24:56 archlinux systemd-tty-ask-password-agent[320]: Invalid password file /run/systemd/ask-password/ask.Uelawq
May 14 09:24:56 archlinux systemd-tty-ask-password-agent[320]: Failed to process password: No such file or directory

Despite the error, there are times when all volumes end up mounted as expected (archroot resides in a single PV; archhome is a VG that spans 2 PVs). As for the times when it did fail, the issue was apparently not that the volumes failed to decrypt, which is what I initially thought was going on. When it failed and dropped me to the emergency console, all I had to do to complete the boot was:

<code>
vgchange -ay
exit
</code>

That'd mean that the volumes were unlocked, just not activated.

So here is a comparison of when it fails against when it works:

FAILS:
May 14 09:20:43 archlinux lvm[473]: pvscan[473] PV /dev/mapper/nvme0p3crypt online, VG archhome incomplete (need 1).
May 14 09:20:48 archlinux lvm[819]: pvscan[819] PV /dev/mapper/nvme1p1crypt online, VG archhome is complete.
May 14 09:20:48 archlinux lvm[819]: pvscan[819] VG archhome run autoactivation.
May 14 09:20:48 archlinux lvm[819]: 0 logical volume(s) in volume group "archhome" now active
May 14 09:20:48 archlinux lvm[819]: archhome: autoactivation failed.

WORKS:
May 14 09:24:58 archlinux lvm[498]: pvscan[498] PV /dev/mapper/nvme0p3crypt online, VG archhome incomplete (need 1).
May 14 09:25:01 archlinux lvm[650]: pvscan[650] PV /dev/mapper/nvme1p1crypt online, VG archhome is complete.
May 14 09:25:01 archlinux lvm[650]: pvscan[650] VG archhome run autoactivation.
May 14 09:25:01 archlinux lvm[650]: 1 logical volume(s) in volume group "archhome" now active

When it does fail, I also see (these lines are not present when it works):

May 14 09:20:48 archlinux systemd[1]: lvm2-pvscan@254:4.service: Main process exited, code=exited, status=5/NOTINSTALLED
May 14 09:20:48 archlinux systemd[1]: lvm2-pvscan@254:4.service: Failed with result 'exit-code'.
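To tell the two cases apart quickly on a given boot, the LVM log lines quoted above can be filtered for the failure message. The following is only a minimal sketch: the function name failed_autoactivation is made up for illustration, and the pattern assumes the log format is exactly as shown in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of lvm2 or systemd): reads journal text on
# stdin and prints the name of each volume group whose autoactivation failed.
# Assumes lines of the form "... lvm[PID]: <vg>: autoactivation failed."
failed_autoactivation() {
    sed -n 's/^.* lvm\[[0-9]*]: \(.*\): autoactivation failed\.$/\1/p'
}

# Demo using lines copied from the failing boot above.
failed_autoactivation <<'EOF'
May 14 09:20:48 archlinux lvm[819]: pvscan[819] VG archhome run autoactivation.
May 14 09:20:48 archlinux lvm[819]: 0 logical volume(s) in volume group "archhome" now active
May 14 09:20:48 archlinux lvm[819]: archhome: autoactivation failed.
EOF
```

On a live system the input could come from something like `journalctl -b -t lvm | failed_autoactivation`, assuming the messages are logged under the "lvm" identifier as they are in the excerpts above.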


Let me know if there is anything else I could be looking at or help with.
Comment by Federico (fedev) - Thursday, 03 June 2021, 00:17 GMT
I made a change that seems to be working fine for me. I had been decrypting and activating all volumes by adding them to crypttab.initramfs (this worked fine until I hit the bug in question). With the current bug, only the logical volumes found in the same PV as root were decrypted and activated. As I don't really need the "archhome" VG to be available that early anyway, I moved the decryption of that volume group to crypttab. I don't get the error anymore and all volumes are decrypted and activated as expected.

I don't see the errors from "systemd-tty-ask-password-agent" anymore.

This does sound more like a workaround than a fix, though.
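For readers trying the same workaround, the split might look roughly like the sketch below. Everything in it is an assumption made for illustration: the mapping name rootcrypt and the UUID placeholders are hypothetical, and nvme0p3crypt and nvme1p1crypt are assumed to be the two archhome PVs because that is how they appear in the pvscan output earlier in this task. The script writes the two example files to a temporary directory rather than to /etc, and the helper crypttab_names (also hypothetical) lists which mappings each file would unlock.

```shell
#!/bin/sh
# Illustration only: nothing here touches /etc.
tmp=$(mktemp -d)

# Unlocked early, inside the initramfs: only what root needs.
# "rootcrypt" and the UUID placeholder are hypothetical.
cat > "$tmp/crypttab.initramfs" <<'EOF'
# <name>      <device>                      <password>
rootcrypt     UUID=<uuid-of-root-pv>        none
EOF

# Unlocked later in boot: the PVs assumed to back the archhome VG.
cat > "$tmp/crypttab" <<'EOF'
# <name>      <device>                      <password>
nvme0p3crypt  UUID=<uuid-of-first-home-pv>  none
nvme1p1crypt  UUID=<uuid-of-second-home-pv> none
EOF

# Print the mapping names defined in a crypttab-format file,
# skipping comment lines and blank lines.
crypttab_names() {
    awk '!/^[[:space:]]*(#|$)/ { print $1 }' "$1"
}

echo "unlocked in the initramfs:"
crypttab_names "$tmp/crypttab.initramfs"
echo "unlocked later in boot:"
crypttab_names "$tmp/crypttab"

rm -rf "$tmp"
```

The point of the split is simply that the initramfs only has to unlock one volume, so the second and third unlocks no longer happen at the stage where the password agent errors were seen.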
