FS#73309 - [pambase] system-auth confuses systemd-homed

Attached to Project: Arch Linux
Opened by Hartmut Malzahn (hwm) - Sunday, 09 January 2022, 14:22 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:14 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Jan Alexander Steffens (heftig)
David Runge (dvzrv)
Levente Polyak (anthraxx)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:

Since upgrading pambase to 20211210-1, logging on as a user managed by systemd-homed sometimes fails. The log always shows a failed activation followed by a succesful one, like this:

systemd-homed[42543]: test: changing state inactive → activating-for-acquire
systemd-homework[43550]: None of the supplied plaintext passwords unlock the user record
systemd-homed[42543]: Activation failed: Required key not available
systemd-homed[42543]: test: changing state activating-for-acquire → inactive
systemd-homed[42543]: Got notification that all sessions of user test ended, deactivating
systemd-homed[42543]: Home test already deactivated, no automatic deactivation needed.
systemd-homed[42543]: test: changing state inactive → activating-for-acquire
systemd-homework[43551]: Provided password unlocks user record.

and homectl inspect test shows a "Bad Auth" for every "Good Auth". I'm able to reproduce this behaviour with a simple test user, eg "homectl create test --disk-size 512M --storage luks --fs-type btrfs".

On one in maybe 10-20 logins this leads to the effect that the login happens without the home path mounted, thus effectively being locked out and having to logoff and logon again. I was not able to reproduce this escalation reliably, though.

Downgrading the pambase and filesystem packages (for pam.d/system-auth and nsswitch.conf) eliminates the problem.

I did some experimenting with system-auth, to no effect.

Steps to reproduce:

create a user with homectl create (LUKS storage) and log on.
This task depends upon

Closed by  Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:14 GMT
Reason for closing:  Moved
Additional comments about closing:  https://gitlab.archlinux.org/archlinux/p ackaging/packages/pambase/issues/2
Comment by Jan Alexander Steffens (heftig) - Friday, 11 February 2022, 23:49 GMT
I'm not sure what to do here. I don't know PAM or systemd-home well enough.

FTR, the changes to pambase were for  FS#72967 .
Comment by Hugo Brugel (NeverTooLate) - Thursday, 09 February 2023, 22:46 GMT
I can confirm bug is still there with pambase 20221020-1 and systemd 252.5-1.
I'll try looking into it if I can but I'm no expert either.
Comment by LaserEyess (LaserEyess) - Friday, 31 March 2023, 23:15 GMT
I'm getting this too

Good Auth.: 1392
Last Good: Fri 2023-03-31 19:08:14 EDT
Bad Auth.: 1645
Last Bad: Fri 2023-03-31 19:08:14 EDT

I get more bad auths than good auths. I think the severity of this should be increased. It seems quite precarious and liable to lock someone out of their machine.
Comment by Hugo Brugel (NeverTooLate) - Sunday, 02 April 2023, 20:53 GMT
Here might be an explanation : https://github.com/systemd/systemd/issues/26722#issuecomment-1465940613
So this would be the normal behaviour.

For every good auth there is one bad auth. If you add bad auth that shows to be doubles, you end up with quite a lot of bad auth.
Comment by LaserEyess (LaserEyess) - Tuesday, 04 April 2023, 00:53 GMT
No I don't think that's the same one. First, I have noticeably more bad auths than good auths. Second, in the OP there is a mention of actual *failed* logins, that does not sound like expected behavior.

If I look at my logs I see things like:

Apr 03 20:46:33 homura systemd-homed[1385]: Authentication failed: Required key not available
Apr 03 20:46:33 homura systemd-homed[1385]: laser: changing state authenticating-for-acquire → active
Apr 03 20:46:33 homura swaylock[42799]: pam_systemd_home(swaylock:auth): Failed to acquire home for user laser: Password for home laser is incorrect or not sufficient for aut>
Apr 03 20:46:33 homura swaylock[42799]: pam_unix(swaylock:auth): authentication failure; logname= uid=60465 euid=60465 tty= ruser= rhost= user=laser
Apr 03 20:46:35 homura systemd-homed[1385]: laser: changing state active → authenticating-for-acquire
Apr 03 20:46:35 homura systemd-homework[42977]: None of the supplied plaintext passwords unlock the user record's hashed passwords.
Apr 03 20:46:35 homura systemd-homed[1385]: Authentication failed: Required key not available
Apr 03 20:46:35 homura systemd-homed[1385]: laser: changing state authenticating-for-acquire → active
Apr 03 20:46:35 homura systemd-homed[1385]: laser: changing state active → authenticating-for-acquire

What happens in swaylock is that I will enter my password (correctly!) multiple times and will be denied entry, and then if I enter an empty password, I am denied entry, but then eventually it will unlock. There's some sort of race condition that's either a bug in systemd-homed or a bug in the PAM configuration that is causing valid attempts to fail. This happens on another machine where I use encrypted partitions:

vps@vps password:
Last login: Mon Mar 27 12:50:48 2023 from 1.2.3.4
Could not chdir to home directory /home/vps: Permission denied
-bash: /home/vps/.bash_profile: Permission denied

The logs from this:

Apr 03 20:56:23 vps sshd[21273]: Accepted password for vps from 1.2.3.4 port 53023 ssh2
Apr 03 20:56:23 vps sshd[21273]: pam_unix(sshd:session): session opened for user vps(uid=60175) by (uid=0)
Apr 03 20:56:23 vps systemd-logind[316]: New session 3 of user vps.
Apr 03 20:56:23 vps systemd[1]: Created slice User Slice of UID 60175.
Apr 03 20:56:23 vps systemd[1]: Starting User Runtime Directory /run/user/60175...
Apr 03 20:56:23 vps systemd[1]: Finished User Runtime Directory /run/user/60175.
Apr 03 20:56:23 vps systemd[1]: Starting User Manager for UID 60175...
Apr 03 20:56:23 vps systemd-homed[317]: vps: changing state inactive → activating-for-acquire
Apr 03 20:56:23 vps systemd-homework[21319]: None of the supplied plaintext passwords unlock the user record's hashed passwords.
Apr 03 20:56:23 vps systemd-homed[317]: Activation failed: Required key not available
Apr 03 20:56:23 vps (systemd)[21318]: pam_systemd_home(systemd-user:account): conversation failed
Apr 03 20:56:23 vps (systemd)[21318]: pam_systemd_home(systemd-user:account): conversation failed
Apr 03 20:56:23 vps systemd-homed[317]: vps: changing state activating-for-acquire → inactive
Apr 03 20:56:23 vps (systemd)[21318]: pam_systemd_home(systemd-user:account): Failed to prompt for password/prompt.
Apr 03 20:56:23 vps (systemd)[21318]: pam_warn(systemd-user:setcred): function=[pam_sm_setcred] flags=0x8002 service=[systemd-user] terminal=[] user=[vps] ruser=[<unknown>] >
Apr 03 20:56:23 vps systemd-homed[317]: vps: changing state inactive → activating-for-acquire
Apr 03 20:56:23 vps systemd-homework[21320]: None of the supplied plaintext passwords unlock the user record's hashed passwords.
Apr 03 20:56:23 vps systemd-homed[317]: Activation failed: Required key not available
Apr 03 20:56:23 vps systemd-homed[317]: vps: changing state activating-for-acquire → inactive
Apr 03 20:56:23 vps (systemd)[21318]: pam_systemd_home(systemd-user:session): conversation failed
Apr 03 20:56:23 vps (systemd)[21318]: pam_systemd_home(systemd-user:session): conversation failed
Apr 03 20:56:23 vps (systemd)[21318]: pam_systemd_home(systemd-user:session): Failed to prompt for password/prompt.
Apr 03 20:56:23 vps (systemd)[21318]: pam_unix(systemd-user:session): session opened for user vps(uid=60175) by (uid=0)
Apr 03 20:56:23 vps systemd[21318]: Failed to open configuration file '/home/vps/.config/systemd/user.conf': Permission denied

So, something is wrong on PAM's end or homed's end.

And please don't tell me I'm entering my password wrong. That doesn't make sense because on the vps I can actually log in, and on my laptop I know my muscle memory and I can tell when I enter my password in wrong. I am entering it in correctly. I am certainly not getting it wrong 20% of the time, that's absurd.

Comment by Hugo Brugel (NeverTooLate) - Tuesday, 04 April 2023, 17:07 GMT
I have more bad auth than good auth too. That's to be expected...to have at least as many bad auths as good.
Here is the log with debug just for the sake of showing what's happening here.(I just hided my true password).

The explanation for this is homed does fido2 and stuff that doesn't require a password in the first place.
Pam_systemd_home doesn't have a way of knowing before hand how systemd-homework does manage it's stuff.

This explains why configuring pam system-auth is useless.
Actually the problems you're describing might be something else, I don't know.

If login doesn't AquireHome fast enought like said this can cause issue I guess.
And Sway is not typically managed by systemd. I don't know if that can be a thing.
Comment by LaserEyess (LaserEyess) - Tuesday, 04 April 2023, 18:01 GMT
No that's not right, let's do some math. Supposing there's 1 bad login for every good login, that means out of 1392 good logins I should have at least 1392 bad logins, so let's ignore those for a second. I have 1392 successful logins and (1645 - 1392 =) 253 failed logins. Supposing each failed login also fails twice, that's ~126 failed logins vs 1392 successful logins, or a failed percentage of ~9%. (18% if it doesn't fail twice per bad login, but let's be generous). Now I may mess up my password every once and a while, but I do not mess up my password ~1/10 times. Some percentage of valid logins are *failing* on my system, and I have the logs to prove it. That does not seem like intended behavior.

Sway has nothing to do with this btw, even if it isn't started by systemd. Swaylock is a program that uses PAM for authentication, just like openssh or any other program using libpam. libpam talks to pam_systemd_homed.so and that is where the issue happens.

As for the mount failure on my VPS via openssh. Openssh validates my password and it is accepted, then it gets handed off to pam_systemd_homed.so via PAM mechanisms, and it *fails* and cannot mount my drive. This is what the OP is describing, and what can be reproduced by the steps in the OP. I can reproduce it ~1/10 times as well (strangely seems to be the same percantage that my laptop fails, but that could be coincidence).

What I don't know is if this is a PAM configuration bug or a systemd-homed bug. Either way, even if it failing *once* is intended, failing *twice* is not. That is the bug here.

Loading...