FS#70272 - [systemd] 248-1 breaks LUKS with keyfile

Attached to Project: Arch Linux
Opened by Nikita Grabar (Slenderchat) - Saturday, 03 April 2021, 00:36 GMT
Last edited by Christian Hesse (eworm) - Wednesday, 14 July 2021, 18:58 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: Christian Hesse (eworm)
Architecture: x86_64
Severity: Critical
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 14
Private: No

Details

Description:
Suddenly, after a systemd upgrade, my server and laptop, both of which are LUKS encrypted and use systemd-cryptsetup with keys in the initramfs and root filesystem to auto-unlock, started booting to the emergency shell (a photo of the output will be attached). I investigated the problem and found that the culprit was the systemd-cryptsetup@root unit. The logs for that unit contained this error: "Failed to load key file "root.key": Argument list too long". After I extracted the keys from the initramfs file and chrooted into my broken systems, I downgraded systemd to 247.4-2. Once that was done and the initcpios were regenerated, both systems booted successfully. This was the only change made and it helped, so the culprit is systemd 248-1.  FS#70264  seems to be related and to have the same culprit.

Steps to reproduce:
1. Set up a system with LUKS full disk encryption that uses systemd-cryptsetup to auto-unlock partitions on boot with keyfiles in the initcpios;
2. Make sure that you've replaced the udev hook with the systemd hook in mkinitcpio.conf, added the sd-encrypt hook, and included /etc/cryptsetup-keys.d/root.key in the FILES array;
3. Make sure that you use systemd >= 248;
4. Try to boot.
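
The setup in the steps above can be checked with a small script. This is only a minimal sketch: the hook names and the keyfile path come from step 2, while the function names `parse_array` and `matches_repro_setup` and the simple regex-based parsing are illustrative assumptions, not part of mkinitcpio. It only understands unquoted or simply quoted items inside a parenthesized array.

```python
import re


def parse_array(conf_text, name):
    """Return the items of a bash-style array assignment such as HOOKS=(a b c)."""
    # Drop comments first so commented-out example lines are ignored.
    lines = [line.split("#", 1)[0] for line in conf_text.splitlines()]
    text = "\n".join(lines)
    pattern = r"^\s*" + re.escape(name) + r"=\(([^)]*)\)"
    matches = re.findall(pattern, text, re.MULTILINE)
    if not matches:
        return []
    # As in a shell, the last assignment wins.
    return [item.strip("\"'") for item in matches[-1].split()]


def matches_repro_setup(conf_text, keyfile="/etc/cryptsetup-keys.d/root.key"):
    """True if the config uses the systemd and sd-encrypt hooks and embeds the keyfile."""
    hooks = parse_array(conf_text, "HOOKS")
    files = parse_array(conf_text, "FILES")
    return (
        "systemd" in hooks
        and "udev" not in hooks
        and "sd-encrypt" in hooks
        and keyfile in files
    )


# Illustrative mkinitcpio.conf fragment (hypothetical, not taken from the report).
example = """
FILES=(/etc/cryptsetup-keys.d/root.key)
HOOKS=(base systemd autodetect modconf block sd-encrypt filesystems fsck)
"""
print(matches_repro_setup(example))
```

On a real system the text would be read from /etc/mkinitcpio.conf instead of the inline example string.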

P.S. Unfortunately, I only captured the output of the server, which has "quiet" in its kernel parameters. All other information was gathered on the laptop by removing the "quiet" option and by including /etc/shadow in the initcpio to get emergency shell access.

Closed by  Christian Hesse (eworm)
Wednesday, 14 July 2021, 18:58 GMT
Reason for closing:  Fixed
Additional comments about closing:  systemd 249
Comment by xgjmibzr (xgjmibzr) - Saturday, 03 April 2021, 02:12 GMT
I had a similar bug, but on a secondary disk rather than root.
Error was "Apr 02 17:37:32 arch-server systemd-cryptsetup[247]: Failed to activate with key file '/etc/keys/[UUID].key': Invalid argument."

Downgraded to 247.4-2 and boots without error.
Comment by Robert (robson) - Saturday, 03 April 2021, 11:32 GMT
And did you install libfido2, which is an optional dependency for systemd 248?
Comment by Frederick Zhang (FrederickZh) - Saturday, 03 April 2021, 12:07 GMT
Similar issue here. While my main disk is opened via SSH and hence unaffected, the additional ones all failed with

> Failed to activate with key file '/etc/crypttab.key': Invalid argument

I filed a ticket upstream: https://github.com/systemd/systemd/issues/19193
Comment by Vince (Heronymous) - Saturday, 03 April 2021, 17:59 GMT
I wanted to throw my hat in and confirm that I also have the same issue with systemd 248-1. Downgrading to systemd 247.3-1 fixed my issue (and only this downgrade was needed).
Comment by Nikita Grabar (Slenderchat) - Saturday, 03 April 2021, 21:27 GMT
@Robert Read the description for libfido2, it’s needed only for usb devices.
Comment by Robert (robson) - Saturday, 03 April 2021, 21:40 GMT
@Nikita Well, take a look at this https://i.postimg.cc/QMpWx4wz/screen.png
Comment by xgjmibzr (xgjmibzr) - Saturday, 03 April 2021, 22:14 GMT
@Robert, take a look at this https://developers.yubico.com/libfido2/

> libfido2 provides library functionality and command-line tools to communicate with a FIDO device over USB, and to verify attestation and assertion signatures.
Comment by Kawai Tomato (kawaitomato) - Sunday, 04 April 2021, 05:17 GMT
I got a similar problem with my computer, a Hasee Z6-CT5NA. I use LVM on LUKS to encrypt my disk, and use systemd-boot as the boot manager. After upgrading systemd from version 247 to version 248-3 (I hadn't upgraded for a few days), I couldn't boot my system. The screenshot: https://www.kawaitomato.com/files/problem.jpg
Comment by MMH (mmh) - Monday, 05 April 2021, 22:32 GMT
I get "Failed to load key file 'disk.key': Argument list too long" like the author but I don't see any log messages related to "Invalid argument".
Are those two different issues?
Comment by Aaron Maslen (microsoftenator) - Tuesday, 06 April 2021, 04:00 GMT
@mmh: I'd guess that's a separate issue.

The bug appears to be in systemd-cryptsetup, specifically the "Invalid Argument" error message appears to originate in the code that reads the keyfile from the disk - https://github.com/systemd/systemd/blob/34fde9f898f63096262d95c61d75db85dabe6fe4/src/cryptsetup/cryptsetup.c#L1222

My guess would be that the bug is triggered by the crypt_activate_by_passphrase function, but I don't know the codebase well enough to attempt to debug further than this.

I can confirm that this issue also occurs for me with systemd version 248 and that downgrading to 247.4-2 works.
Comment by Kevin (kp7) - Wednesday, 07 April 2021, 09:08 GMT
I have encountered the same bug.
My system is configured to unlock the encrypted /home partition (which is on a secondary drive) automatically with a keyfile - /etc/cryptsetup-keys.d/home.key.
After upgrading to systemd-248, the system boots in emergency mode. However, I am able to correctly mount my home partition using both the said keyfile and the passphrase.
The issue is not present anymore after downgrading to systemd-247
Comment by Simon Paul (smnpl) - Sunday, 11 April 2021, 18:24 GMT
Same here as Kevin. A couple of devices that should get unlocked by a keyfile. Broke with 248 and works with 247.
For now I have downgraded, but that is very unfortunate for a system's core component.

Has this already been filed as a bug in the systemd GitHub repo?
Comment by Hrvoje Hodak (tribly) - Tuesday, 13 April 2021, 14:33 GMT
Same here. I have 3 disks in /etc/crypttab and only one of them is getting decrypted. The other 2 are giving an "Invalid argument" error.

Downgrading to 247.4-2 solves that issue.
Comment by Random (random237849) - Friday, 16 April 2021, 20:42 GMT
Version 248-5 is still affected. Seems the fix for  FS#70264  (see initial report) did not solve this issue.
Also no indication that there will be a solution from upstream soon.
Comment by Cysioland (Cysioland) - Sunday, 02 May 2021, 18:26 GMT
I have the same issue. libfido2 is present on my system, but this doesn't help.

Relevant journal attached.
Comment by Hrvoje Hodak (tribly) - Sunday, 13 June 2021, 09:09 GMT
Looks like they fixed and merged it upstream https://github.com/systemd/systemd/pull/19878
Comment by Christian Hesse (eworm) - Sunday, 13 June 2021, 19:12 GMT
Watching this myself... Sadly this does not easily apply to v248...
Comment by Simon Paul (smnpl) - Saturday, 19 June 2021, 08:25 GMT
I don't understand enough of systemd here. Does that mean we'll need to wait for a new major release (v249)? If so, what is the usual release cycle? I'm getting more uncomfortable with every week I have systemd on my ignore list =)
At least that looks like progress!
Comment by Christian Hesse (eworm) - Monday, 21 June 2021, 18:52 GMT
The release process for v249 already started, version v249-rc1 is available. At some point (-rc2 or -rc3) I will push packages to [testing].
That being said... I could build packages for v249-rc1 and create a separate repository if you are interested.

If the fix is backported to the stable branch I will push an update to the repositories of course.
Comment by Christian Hesse (eworm) - Monday, 21 June 2021, 18:53 GMT
BTW, this is the milestone for v249:
https://github.com/systemd/systemd/milestone/26
Comment by Simon Paul (smnpl) - Monday, 21 June 2021, 19:40 GMT
Ah, thanks for the explanation and the offer. Looks like this is going its natural way and it's not too far from arriving. So from my point of view there's no need for extra effort on your end. :)
Comment by MMH (mmh) - Friday, 02 July 2021, 11:54 GMT
Updated to testing/249rc3-1 and it resolved the issue for me.
Comment by Simon Paul (smnpl) - Wednesday, 14 July 2021, 18:36 GMT
Yep, v249 fixes it for me as well. Glad this is over :)
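
The version reports in this thread can be summarized as a small check. This is a sketch based only on what commenters reported here (247.x working, 248-1 through 248-5 affected, 249rc3-1 and 249 working); the function names `upstream_major` and `reported_affected` are illustrative, and the rule is not an authoritative statement about every systemd build.

```python
import re


def upstream_major(pkgver):
    """Extract the leading systemd major version from an Arch package version string."""
    match = re.match(r"(\d+)", pkgver)
    if match is None:
        raise ValueError(f"unrecognized version: {pkgver!r}")
    return int(match.group(1))


def reported_affected(pkgver):
    # Assumption drawn only from this thread: every 248 build mentioned was
    # affected, while 247.x and 249 (including 249rc3-1) were reported working.
    return upstream_major(pkgver) == 248


for version in ["247.4-2", "248-1", "248-5", "249rc3-1"]:
    print(version, reported_affected(version))
```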
