FS#56771 - [cryptsetup] using luks2 produces an unbootable system

Attached to Project: Arch Linux
Opened by Christopher (_zeptoSteve) - Tuesday, 19 December 2017, 11:16 GMT
Last edited by Christian Hesse (eworm) - Monday, 08 January 2018, 19:12 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Christian Hesse (eworm)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 6
Private No

Details

recently installed Arch on a second SSD with cryptsetup-2.0.0-1 (from core). I'm using luks on both drives, but only one (with LUKS2) is unbootable.

`cryptsetup --cipher aes-xts-plain64 --iter-time 8192 --key-size 512 --type luks2 --verify-passphrase luksFormat /dev/sda2` is the exact command for creating the encryption. I rebooted and found that `cryptsetup isLuks /dev/sda2` fails because a lock cannot be obtained. error -5.

I tried: `cryptsetup --disable-locks --type luks open /dev/sda2 core`, but something complained (probably cryptsetup) about missing "libgcc_s.so". it may have been "libgcc_s.so.1", though. this file was apparently required for (I think) some pthread operation. (pthread_cancel, if I remember correctly.)

I realize this is probably hard to test, sorry about that. I am using the encrypt hook after block and before filesystem. I hope I'm not missing anything really obvious.

Closed by  Christian Hesse (eworm)
Monday, 08 January 2018, 19:12 GMT
Reason for closing:  Fixed
Additional comments about closing:  Latest cryptsetup and argon revisions.
Comment by Christopher (_zeptoSteve) - Tuesday, 19 December 2017, 19:28 GMT
managed to work around it, but I'm pretty sure it's not a very long-term solution.

* removed the "isLuks" check from /usr/lib/initcpio/hooks/encrypt, and placed "true" there instead (line 70)
* added --disable-locks to the "cryptsetup open" line of the same file (line 87)
* added 'add_file "libgcc_s.so.1"' before the add_runscript line of /usr/lib/initcpio/install/encrypt

not the greatest, because "isLuks" is kinda required sometimes. and I'm not sure what the locks even do in LUKS2, so removing them might be a bad idea.

aside from this, though, maybe the "isLuks" thing just doesn't work in busybox due to the lock? that would be an upstream issue I guess.
Comment by Christian Hesse (eworm) - Wednesday, 20 December 2017, 20:57 GMT
I cannot reproduce any of these issues. Tested within a running system, though - did not try to boot off LUKS2.

Do you see the locking issue in initramfs only? Would be nice if you could reproduce this from a running system as well.

I could not see a way that cryptsetup could load "libgcc_s.so.1". Can you try to get the details of what happens? Even more interesting: mkinitcpio adds all required libraries to initramfs, so no idea why this one can be missing when it is required.

I will try to install a system with LUKS2 any time soon... Not sure when I will have any spare time.

You could try to migrate your initramfs to a systemd-based one... Possibly that works.
Comment by Christopher (_zeptoSteve) - Wednesday, 20 December 2017, 21:24 GMT
attached image is potentially more concrete info.

actually, I can open it from an already running Arch; it's only inside initramfs that the locking issue occurs.

I tried the sd-encrypt hook. I'm not too familiar with the required hooks. I got an error then too, dropped into a console. it was just "check systemctl status <something>". can't remember the exact term used.

when I did that, it said something about "/usr/lib/systemd/systemd-<something> attach /dev/sda2", where <something> is "generator" or "cryptsetup", can't remember.

(I'm saying "/dev/sda2" because I can't remember the exact UUID of the disk; I'm on a different computer right now.)

obviously failed when I tried it. their error messages are not much help.

EDIT: the systemd command was "/usr/lib/systemd/systemd-cryptsetup attach absoCore /dev/sda2". the error returned was: "crypt_load() failed on device /dev/sda2".
Comment by Christopher (_zeptoSteve) - Wednesday, 20 December 2017, 23:28 GMT
at this point, I'm 90% confident that /run/lock/cryptsetup is required for isLuks to recognize luks2. there's no option to turn off locking in the isLuks command, for some reason.

I tried to see if I could boot after creating /run/lock/cryptsetup, and I got pretty close. I had to include /usr/bin/mkdir, because I guess including /run/lock/cryptsetup/dummy didn't persist. (I created the dummy file and added it to the FILES array of mkinitcpio.conf)

the only trouble I had was using the "switch_root" command, because it complained about not being booted in a systemd environment. so basically:

~~~
mkdir -p /run/lock/cryptsetup # works.
cryptsetup isLuks /dev/sda2 # works.
cryptsetup --type luks open /dev/sda2 absoCore # works.
mount /dev/mapper/absoCore /new_root # works.
switch_root /new_root /usr/bin/systemd/systemd # does not work.
~~~

I know it's not rendered as a code block, but whatever.
inspired by https://gitlab.com/cryptsetup/cryptsetup/issues/325
Comment by Christopher (_zeptoSteve) - Thursday, 21 December 2017, 00:04 GMT
this problem is kinda solved for me, but I still don't know why libgcc_s.so.1 is required. I'm certain it is, because I've tried without it a bunch of times. system only boots with it. below is another way to get it to work, without removing the isLuks line. this also keeps locks, which might be a good idea if upstream implemented it for a reason. (they did, I just don't know what it is.)

anyway, three changes.
* "/usr/lib/initcpio/install/encrypt" must also 'add_binary "mkdir"', because of the next thing.
* "/usr/lib/initcpio/hooks/encrypt" must "mkdir -p /run/lock/cryptsetup" **before** isLuks. anytime before is probably fine, but I chose the line immediately above.
* "/etc/mkinitcpio.conf" must have "/usr/lib/libgcc_s.so.1" in the FILES array.

there may be a simpler way to do this, but this works.
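
A minimal sketch of the changes listed above, assuming the layout mkinitcpio uses for hooks (an install file that calls helpers such as `add_binary` and `add_file`, and a runtime hook that runs in the initramfs shell). Only the additions are shown, not the rest of the packaged files. The names `add_luks2_extras` and `ensure_lock_dir` are hypothetical, and `add_file` with a full path is used here in place of the FILES entry.

```shell
# --- /usr/lib/initcpio/install/encrypt (additions only) ---
# Hypothetical wrapper so the additions can be read in isolation;
# in the packaged file these calls would sit inside build(),
# before the add_runscript line.
add_luks2_extras() {
    # mkdir is used by the runtime hook below
    add_binary "mkdir"
    # needed at runtime but evidently not picked up automatically
    add_file "/usr/lib/libgcc_s.so.1"
}

# --- /usr/lib/initcpio/hooks/encrypt (addition only) ---
# Create the LUKS2 lock directory before any cryptsetup call.
# The optional argument exists only to make the helper testable.
ensure_lock_dir() {
    mkdir -p "${1:-/run/lock/cryptsetup}"
}
```

In the runtime hook, `ensure_lock_dir` would be called on the line above the `cryptsetup isLuks` check, and the initramfs has to be regenerated afterwards.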
Comment by Arthur (pysen) - Thursday, 21 December 2017, 12:53 GMT
I also have this problem and can verify what _zeptosteve has experienced. To make my system ask for the LUKS2 password and not fail to recognize the LUKS2 volume I had to make the three changes from the previous comment but also include the 'crypto=' parameter in my /etc/default/grub so that GRUB_CMDLINE_LINUX contains "cryptdevice=UUID=$PARTITIONUUID crypto=$HASH:$CIPHER:$KEYSIZE::". Although I still can't boot my system, the password prompt now shows up but upon entering the password it drops into the emergency shell. Prior to these modifications I got the error message "Failed to open encryption mapping: The device UUID=$PARTITIONUUID is not a LUKS volume and the crypto= parameter was not specified."
Comment by leuko (leuko) - Saturday, 30 December 2017, 17:34 GMT
I have a similar problem with a slightly different output:

Even after applying the changes that Christopher mentioned, I get the following error:

The device /dev/* is not a LUKS volume and the crypto= parameter was not specified.

After this error I get dropped to the emergency shell. If I then try to open the partition manually using:

cryptsetup luksOpen /dev/* root

Then I get the following error:

Failed to acquire read lock on device /dev/*
Comment by leuko (leuko) - Saturday, 30 December 2017, 21:46 GMT
I found that LUKS2 tries to obtain a read lock on the device when it reads the LUKS2 header [1], which seems to fail in my case. Locking applies to all operations like 'isLuks, open, or luksOpen'. Fortunately, cryptsetup supplies the '--disable-locks' argument, which deactivates this check.

I reverted Christopher's changes and tried running cryptsetup using '--disable-locks':

/usr/lib/initcpio/hooks/encrypt: In my case I just had to change the following line with isLuks:

if cryptsetup isLuks ${resolved} ...

to

if cryptsetup isLuks --disable-locks ${resolved} ...

and the following line for opening the LUKS container:

while ! eval cryptsetup open --type luks ...

to

while ! eval cryptsetup open --disable-locks --type luks ...

After these changes I noticed that the global variable $root is not set by the encrypt hook if $DEPRECATED_CRYPT==0, so I had to change the following block:

if [ -e "/dev/mapper/${cryptname}" ]; then
    if [ ${DEPRECATED_CRYPT} -eq 1 ]; then
        export root="/dev/mapper/root"
    fi
else

to:

if [ -e "/dev/mapper/${cryptname}" ]; then
    if [ ${DEPRECATED_CRYPT} -eq 1 ]; then
        export root="/dev/mapper/root"
    fi
    export root=/dev/mapper/${cryptname}
else

A last change was needed due to the following error by cryptsetup:

libgcc_s.so.1 must be installed for pthread_cancel to work

So I added /usr/lib/libgcc_s.so.1 to the FILES array in /etc/mkinitcpio.conf like Christopher mentioned:

FILES=(/usr/lib/libgcc_s.so.1)

Then, I could finally boot!

[1] https://gitlab.com/cryptsetup/cryptsetup/blob/master/lib/luks2/luks2_json_metadata.c#L820
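
Since the library evidently is not pulled into the image automatically, it can be worth confirming that it really ended up there after regenerating the initramfs. A small sketch, assuming `lsinitcpio` (shipped with mkinitcpio) is available and that the image is at /boot/initramfs-linux.img, which is an assumption about the local setup. `image_has_file` is a hypothetical helper that reads a file listing on stdin.

```shell
# Hypothetical helper: succeeds if the listing on stdin contains a path
# whose last component is exactly the given file name.
image_has_file() {
    while IFS= read -r path; do
        case $path in
            */"$1" | "$1") return 0 ;;
        esac
    done
    return 1
}

# Example (image path is an assumption; adjust to the local setup):
#   mkinitcpio -P
#   lsinitcpio /boot/initramfs-linux.img | image_has_file libgcc_s.so.1 \
#       && echo "libgcc_s.so.1 is in the image"
```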
Comment by David McAdoo (geecroof) - Sunday, 31 December 2017, 11:22 GMT
Maybe it's worth asking upstream about this on gitlab or the mailing list http://www.saout.de/mailman/listinfo/dm-crypt ?
Comment by Zane (doublez13) - Tuesday, 02 January 2018, 22:47 GMT
Same issue for me.
Same boot errors as shown in Christopher's image.

From the cryptsetup 2.0.0 release notes:
NOTE: to operate correctly, LUKS2 requires locking of metadata.
Locking is performed by using flock() system call for images in file
and for block device by using a specific lock file in /run/lock/cryptsetup.

If your distro does not support tmpfiles.d directory, you have to create locking directory (/run/lock/cryptsetup) in cryptsetup package (or init scripts).

Let me know if there's any input that I can provide.
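
The effect of a missing lock directory can be illustrated with the `flock` utility from util-linux. This is only an analogy for what the release notes describe, not cryptsetup's actual code path: the helper name `try_read_lock` and the lock file name `demo` are hypothetical.

```shell
# Take a shared (read) lock on a lock file inside LOCK_DIR.
# Fails early if the directory does not exist, which is presumably
# similar to what happens in an initramfs that never created
# /run/lock/cryptsetup.
try_read_lock() {
    lock_dir=$1
    lock_name=$2
    if [ ! -d "$lock_dir" ]; then
        echo "lock directory $lock_dir is missing" >&2
        return 1
    fi
    # -s: shared lock, -n: fail instead of waiting for the lock
    flock -s -n "$lock_dir/$lock_name" true
}

# Example:
#   try_read_lock /run/lock/cryptsetup demo || mkdir -p /run/lock/cryptsetup
```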
Comment by loqs (loqs) - Tuesday, 02 January 2018, 23:16 GMT
It seems strange that some systems would still be able to open a LUKS2 root volume without encountering the issue.
Comment by Zane (doublez13) - Wednesday, 03 January 2018, 00:04 GMT
Opening a LUKS2 volume isn't the problem, it's initcpio reading it.

Here's my temporary fix.
1. Add "FILES=(/usr/lib/libgcc_s.so.1)" to /etc/mkinitcpio.conf
2. Add "mkdir -p /run/lock/cryptsetup/" to /usr/lib/initcpio/hooks/encrypt

Seems like the cleanest way for now...
Comment by loqs (loqs) - Wednesday, 03 January 2018, 00:19 GMT
Zane, that is what I meant. I just checked the volume I was testing against and it was LUKS1, which explains why I could not recreate the issue in either a systemd-based initrd or a shell-script-based initrd.
That still does not explain what is calling libgcc_s.so.1: cryptsetup is not linked against it, but clearly something is trying to use it.
Comment by loqs (loqs) - Wednesday, 03 January 2018, 00:47 GMT
Comment by Zane (doublez13) - Wednesday, 03 January 2018, 01:27 GMT
Good call!
Comment by David McAdoo (geecroof) - Wednesday, 03 January 2018, 10:55 GMT
So the libgcc_s.so.1 issue seems limited to the use of argon2 as PBKDF, which means that someone who updated a LUKS container in place from LUKS1 to LUKS2 shouldn't be affected, as updating an existing container doesn't support argon2 hashing.
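
Whether a given container uses argon2 can be checked from its header. A minimal sketch, assuming that `cryptsetup luksDump` prints a `PBKDF:` line for each LUKS2 keyslot; `pbkdf_types` is a hypothetical helper that reads the dump on stdin.

```shell
# Hypothetical helper: print the distinct PBKDF names found in
# luksDump output read from stdin, one per line.
pbkdf_types() {
    awk '$1 == "PBKDF:" { print $2 }' | sort -u
}

# Example (device name taken from the report above):
#   cryptsetup luksDump /dev/sda2 | pbkdf_types
```

If the output lists only pbkdf2, the container would not be expected to go through the argon2 code path at unlock time.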
Comment by loqs (loqs) - Wednesday, 03 January 2018, 12:08 GMT
@eworm cryptsetup 2.0.0-3 did not change the sd-encrypt hook. Will it not still be broken?
Comment by Christian Hesse (eworm) - Wednesday, 03 January 2018, 12:24 GMT
A systemd based initcpio does not use the cryptsetup binary but systemd-cryptsetup from systemd. Does anybody use systemd based initcpio with LUKS2?
Comment by loqs (loqs) - Wednesday, 03 January 2018, 12:35 GMT
It seems to be affected (https://bugs.archlinux.org/task/56771#comment164579): systemd-cryptsetup is linked to libcryptsetup.so.12
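
The linkage can be checked with `ldd`. A small sketch; `links_against` is a hypothetical helper name, and the binary path is the one quoted earlier in this thread.

```shell
# Succeeds if ldd reports that BINARY links against a library whose
# name contains LIBNAME.
links_against() {
    ldd "$1" 2>/dev/null | grep -q -- "$2"
}

# Example:
#   links_against /usr/lib/systemd/systemd-cryptsetup libcryptsetup.so \
#       && echo "systemd-cryptsetup uses libcryptsetup"
```

Note that `ldd` only lists libraries a binary is linked against. A library loaded at runtime with dlopen() does not appear in its output, which would be consistent with libgcc_s.so.1 going unnoticed.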
Comment by Christian Hesse (eworm) - Wednesday, 03 January 2018, 15:45 GMT
I added a workaround for systemd-based initcpio in cryptsetup 2.0.0-4.
Comment by Christopher (_zeptoSteve) - Sunday, 07 January 2018, 05:21 GMT
tried the 2.0.0-5 package. totally works. specifically, the non-systemd hook. didn't try anything else.
Comment by Mortan (Mortan1961) - Monday, 08 January 2018, 01:41 GMT
I tried booting using cryptsetup 2.0.0-5 from a LUKS2 partition created using the 2018.01.01 ISO and I get "Illegal instruction" whenever I enter a passphrase. I am guessing that is because the ISO has outdated packages. If that really is the issue, then I would appreciate it if release engineering produced a new ISO.
Comment by Christian Hesse (eworm) - Monday, 08 January 2018, 14:48 GMT
Mortan1961, please try argon2 20171227-3.
What CPU does your system have?
Comment by Mortan (Mortan1961) - Monday, 08 January 2018, 17:11 GMT
I can successfully boot with argon2 20171227-3. This was in VirtualBox on an Intel processor.
Comment by Mortan (Mortan1961) - Monday, 08 January 2018, 17:16 GMT
Also want to say that I am using the encrypt hook, not sd-encrypt.
Comment by Mortan (Mortan1961) - Monday, 08 January 2018, 17:54 GMT
The march (OPTTARGET) appears to default to native when unset; have you tried setting it to x86-64 instead of none?
https://github.com/P-H-C/phc-winner-argon2/blob/20171227/Makefile#L44
Comment by Christian Hesse (eworm) - Monday, 08 January 2018, 19:09 GMT
It uses 'x86-64' as a fallback if optimization checks fail.
I tried to be honest to our i686 and arm friends. ;)
