FS#56828 - [systemd] Service units with User=nobody are unable to access config file owned by nobody
Attached to Project:
Arch Linux
Opened by Eric Wang (enihcam) - Saturday, 23 December 2017, 00:18 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:13 GMT
Details
Description:
Service units with User=nobody are unable to access config file owned by nobody,

"Most likely your distro builds systemd incorrectly, not matching the build configuration correctly to their /etc/passwd setup. They need to tell systemd right right UID as build-time, using "meson -D nobody_user=foobar -D nobody_group=foobar", and it has to be the name used on uid 65534. Please report this to your distribution. Thanks."

For more details, please open https://github.com/systemd/systemd/issues/7717#issuecomment-353639360

Additional info:
* package version(s)
v236
* config and/or log files etc.

```
$ cat /etc/nsswitch.conf
# Name Service Switch configuration file.
# See nsswitch.conf(5) for details.
passwd: files mymachines systemd
group: files mymachines systemd
shadow: files
publickey: files
hosts: files mymachines resolve [!UNAVAIL=return] dns myhostname
networks: files
protocols: files
services: files
ethers: files
rpc: files
netgroup: files
```

```
$ id nobody
uid=99(nobody) gid=99(nobody) groups=99(nobody)
```

```
$ cat /usr/lib/sysusers.d/basic.conf | grep nobody
# The nobody user for NFS file systems
u nobody 65534 "Nobody" -
```

Steps to reproduce:

```
$ cat /etc/systemd/system/test@.service
[Service]
Type=simple
User=nobody
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
ExecStart=/usr/bin/cat /tmp/%i.json
```

```
$ ls -l /tmp/test.json
-rw-r----- 1 nobody root 0 Dec 22 21:04 test.json
```

```
$ sudo systemctl status test@test
● test@test.service
Loaded: loaded (/etc/systemd/system/test@.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2017-12-22 21:06:42 CST; 3s ago
Process: 511 ExecStart=/usr/bin/cat /tmp/test.json (code=exited, status=1/FAILURE)
Main PID: 511 (code=exited, status=1/FAILURE)
Dec 22 21:06:42 cat[511]: /usr/bin/cat: /tmp/test.json: Permission denied
Dec 22 21:06:42 systemd[1]: test@test.service: Main process exited, code=exited, status=1/FAILURE
Dec 22 21:06:42 systemd[1]: test@test.service: Failed with result 'exit-code'.
```
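The mismatch in the report (passwd says 99, systemd treats 65534 as nobody) can be checked with a small helper. This is only a sketch: `check_nobody_uid` is a made-up name, not part of systemd or Arch, and it assumes passwd(5)-format lines on stdin.

```shell
# Hypothetical helper (not a systemd or Arch tool): compare the UID that
# passwd(5)-format input on stdin assigns to "nobody" with the UID systemd
# treats as nobody (65534, per the upstream comment quoted in the report).
check_nobody_uid() (
    expected="${1:-65534}"
    # Take the UID field of the first "nobody" entry, if any.
    actual="$(awk -F: '$1 == "nobody" { print $3; exit }')"
    if [ -z "$actual" ]; then
        echo "nobody: not present"
    elif [ "$actual" = "$expected" ]; then
        echo "nobody: uid $actual matches"
    else
        echo "nobody: uid $actual, expected $expected"
    fi
)
```

On an affected system, `check_nobody_uid < /etc/passwd` would be expected to report the 99 vs 65534 disagreement.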
Closed by Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:13 GMT
Reason for closing: Moved
Additional comments about closing: https://gitlab.archlinux.org/archlinux/packaging/packages/systemd/issues/1
$ sudo usermod -o -u 65534 nobody # the -o option is needed because nobody, as provided by nss-systemd, already uses 65534, so usermod may need to override the usual uniqueness check.
or
$ sudo userdel nobody
$ sudo systemd-sysusers /usr/lib/sysusers.d/basic.conf
Creating group nobody with gid 65534.
Creating user nobody (Nobody) with uid 65534 and gid 65534.
What we need is (again) to fix systemd, which overnight decided to assume a specific UID was assigned to user nobody.
filesystem provides nobody=99 as before
rebuild systemd with -D nobody_user=systemd_nobody -D nobody_group=systemd_nobody
systems where nobody=65534 need manual intervention
instead of
systems where nobody!=65534 need manual intervention
or something else?
If both this and FS#56818 result in needing manual intervention, could they please be scheduled at the same time?
IMO it's better to deal with this with a post-install script which changes the UID and chowns the relevant files. Or at least print info about what should be done manually.
It's also a more general issue. Systemd is incompatible with statically preassigned UID/GID numbers. We already had issues with qemu/libvirt, now cups, nobody... the list is growing. I'm afraid that Arch will have to abandon setting static UID/GID altogether and leave this to Systemd.
See also https://bugs.archlinux.org/task/56662
And now if I want to manually create a user in /etc/passwd, where do I look to see if the UID I want to use is already taken by systemd? Having to manage UID tables in different locations is just plain dumb.
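One way to answer the "is this UID already taken" question is to look the number up through NSS rather than reading /etc/passwd alone. A rough sketch, with `uid_taken` being a hypothetical helper name that only inspects passwd(5)-format lines given on stdin:

```shell
# Hypothetical helper: succeed (exit 0) if the given numeric UID appears in
# the UID field of passwd(5)-format input on stdin, fail (exit 1) otherwise.
uid_taken() (
    awk -F: -v uid="$1" '$3 == uid { found = 1 } END { exit !found }'
)
```

Feeding it `getent passwd 65534` (for example `getent passwd 65534 | uid_taken 65534 && echo taken`) asks every source listed in nsswitch.conf, so, assuming nss-systemd is configured as in the report, a synthesized entry should show up even though it is not in /etc/passwd.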
See https://github.com/systemd/systemd/commit/24eccc3414a29a14b319d639531bd23c158b20e1
Even with the feature turned off it would probably be a good idea to mark nobody as having both UID/GID 65534 and UID/GID 99
to avoid packages shipping any files using either ID until Arch decides on which ID nobody should have.
AFAICT the systemd devs have not even stated a reason for synthesizing this user, other than that it is somehow "legacy" -- but providing this flag file, according to that commit, works *flawlessly* on systemd (hence why they provide the thing to begin with).
They literally just said "we're trying to standardize the UID for this user, *because* we are trying to standardize it". This is rather circular logic...
As an additional component to this, we should also define nobody_user and nobody_group to 99 since AFAICT every Arch system since ever uses this as the UID/GID for the "nobody" user, *including* systems where systemd arbitrarily synthesizes the user to 65534 which only takes effect for processes started as a systemd service and which is therefore completely and utterly borked and doesn't work to begin with.
Therefore there is no manual intervention needed, we just need to tell systemd to use the same UID that everyone's system is already using in passwd (preferably by actually using passwd).
As filesystem no longer provides nobody (1) it is generated by sysusers.d from /usr/lib/sysusers.d/basic.conf with ID 65534
https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/filesystem&id=20928f58767d34ed6711befd6255f6a0b1706ae8
Edit:
If nobody has to have ID 99 on arch why not pass --nobody-user=systemd-nobody --nobody-group=systemd-nobody to systemd's configure
and have nobody added back to filesystem?
Okay, this is completely unmaintainable. It's now a choice between:
- pushing out a news post telling 99% of users to fix their ten-year-old perfectly working systems
- telling all users who deployed a new system in the past several months to fix their systems
- deploying this simple fix to turn off the systemd insanity that even systemd agrees doesn't really work and therefore provide a file flag to disable
And if either #2 or #3 is chosen, we should opt to respect the very much longstanding system of using UID=99 for the "nobody" user as there is absolutely no compelling reason to hardcode it to 65534 other than "well, we accidentally broke some users' systems for three months because we couldn't figure out what to do with this bug".
@eworm, this really seems like a simple fix. What is taking so long?
Also wtf I did not realize that systemd actually wanted to hardcode a different username to still have their arbitrary UID. I thought they would specify a different hardcoded UID for the same "nobody" user. So systemd has basically made it completely and utterly impossible to fix things "their way", without manual intervention across many systems.
So yes, this flag file is looking increasingly necessary. That will give us time to sort out this unholy mess. Which is only getting more unholy the longer it takes to consistently specify, say, the systemd-nobody user.
I'm now thoroughly confused -- what, exactly, does systemd actually use this *configurable* user for? The basic.conf only specifies NFS filesystems, but I don't see why that specifically has a need to avoid NSS lookups... and I don't think systemd handles that anyway.
And again, the existence of the flag file indicates the systemd devs agree that things will legitimately work fine with NSS lookup. As it has done for a long time already.
# touch file
# chown nobody:nobody file
# ls -n file
-rw-r--r-- 1 99 99 0 mar 11 13:45 file
# ls -l file
-rw-r--r-- 1 nobody nobody 0 mar 11 13:45 file
# chown 65534:65534 file
# ls -n file
-rw-r--r-- 1 65534 65534 0 mar 11 13:45 file
# ls -l file
-rw-r--r-- 1 nobody nobody 0 mar 11 13:45 file -> WTF!!!
IMHO if the standard UID for nobody is 65534 everywhere else, maybe Arch should not go counter-current.
in the kernel, and has this to say about it:
> 2. 65534 → The `nobody` UID, also called the "overflow" UID or similar. It's
> where various subsystems map unmappable users to, for example file systems
> only supporting 16bit UIDs, NFS or user namespacing. (The latter can be
> changed with a sysctl during runtime, but that's not supported on
> `systemd`. If you do change it you void your warranty.) Because Fedora is a
> bit confused the `nobody` user is called `nfsnobody` there (and they have a
> different `nobody` user at UID 99). I hope this will be corrected eventually
> though. (Also, some distributions call the `nobody` group `nogroup`. I wish
> they didn't.)
Once upon a time, the only way to get a user that the kernel couldn't
map to a UID was when doing name-based mapping of NFS users, hence the
folks at Fedora naming it "nfsnobody". With modern user namespaces,
anyone working with containers is likely to see the uid=65534 at some
point.
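For reference, the overflow UID the kernel currently uses can be read from procfs. A minimal sketch; `overflow_uid` is a hypothetical helper name, and the optional path argument exists only so it can be pointed at a test file:

```shell
# Hypothetical helper: print the kernel's overflow UID.
# $1 optionally overrides the procfs path (an assumption made for testing).
overflow_uid() (
    cat "${1:-/proc/sys/kernel/overflowuid}"
)
```

On a stock kernel this is expected to print 65534, the same number discussed above.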
Note that systemd's ./mkosi.build sets `-D nobody-user=nfsnobody -D
nobody-group=nfsnobody` if there's already a user/group named
"nobody" with a different UID/GID.
Now, why systemd needs to know what the overflow UID's name is... I'm
not sure.
@ogarcia: Arch no longer goes "counter-current" as you put it. Since
filesystem-2017.10 (which moved from [testing] to [core] on
2017-12-10), Arch has not had a uid=99 "nobody" user. If you have
that user, it is because it is left-over from an old version of the
package, as you did not remove it when merging `/etc/passwd.pacnew`.
Given that (1) many Arch users will have a left-over uid=99 "nobody"
user, and (2) the way that the uid=99 "nobody" user was used is
different than the way the overflow user is used, I believe that the
right decision is to set the overflow user's name to something other
than "nobody". Whether it mimics Fedora as "nfsnobody" (despite not
being particularly related to NFS), or is something more-sensible but
uncommon like "overflow", I have no strong opinion.
In the Arch repos, there are a few .service files that say User=nobody. Because uid=65534 is for unmappable users, it is inappropriate to run a service as uid=65534; when these services write User=nobody, they mean the uid=99 user, not the uid=65534 user.
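To see which installed units are affected, something like the following could be used. It is only a sketch: `units_running_as` is a hypothetical helper, and it only matches an exact `User=<name>` line in `*.service` files of one directory (no drop-ins, no surrounding whitespace).

```shell
# Hypothetical helper: list .service files in a directory that contain an
# exact "User=<name>" line.
# $1 = user name, $2 = unit directory (defaults to the packaged-unit directory).
units_running_as() (
    grep -l -x -F -e "User=$1" "${2:-/usr/lib/systemd/system}"/*.service 2>/dev/null
)
```

For example, `units_running_as nobody` would list the packaged units that ask for User=nobody.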
https://github.com/systemd/systemd/issues/7717#issuecomment-353420378
nobody:x:65534:65534:Nobody:/:/sbin/nologin
For me this was the better solution because I do not need another nobody user. But yes, there are two solutions: change the UID and GID from 99 to 65534, or rename the user.
If we want to fix this in Arch to avoid the conflict, we can rename the user that systemd maps UID and GID 65534 to via some compile-time options. Would anyone object to keeping nobody=99 in Arch and mapping UID/GID 65534 to something else like "overflow"?
Would it not be simpler for the systemd package to provide /etc/systemd/dont-synthesize-nobody, as eschwartz suggested? Then whatever values nobody has in /etc/passwd and /etc/group are the values systemd will use.
https://github.com/systemd/systemd/commit/24eccc3414a29a14b319d639531bd23c158b20e1
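Whether a given system already has that flag file is easy to test. A tiny sketch; `synthesis_disabled` is a hypothetical helper name, and the optional argument is only there to allow testing against another path:

```shell
# Hypothetical helper: succeed if the flag file named in the linked commit
# exists. $1 optionally overrides the path (an assumption made for testing).
synthesis_disabled() (
    [ -e "${1:-/etc/systemd/dont-synthesize-nobody}" ]
)
```

Usage would be along the lines of `synthesis_disabled && echo "nobody is taken from passwd/group only"`.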
However, AFAICT this does not conflict with also providing the flag file. We should end up with a standard overflow=65534 user, and a standard nobody=99 user, and users who have installed in the last year will have the "wrong" uids but a consistent user database as long as systemd does not try to synthesize a user that disagrees with the passwd database.
If we implement #1 but not #2, then I guess we need a news post about this, so users can finally fix their systems and have files which have a consistent ownership that they expect to have.
[1] https://github.com/systemd/systemd/blob/v246/NEWS#L106
$ sudo userdel nobody
$ sudo systemd-sysusers /usr/lib/sysusers.d/basic.conf
Plus I changed the owner on all my nobody files.
Seems to be the only reasonable solution to me. But I stumbled over this issue only because of issues with my NFS file server. I would expect that a distribution either provides a fix or communicates this properly.
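For the "changed the owner on all my nobody files" step, it may help to list the affected files first. A sketch under stated assumptions: `files_owned_by_uid` is a hypothetical helper, and it is given the numeric UID because the name nobody is ambiguous on affected systems.

```shell
# Hypothetical helper: print files owned by a numeric UID without crossing
# file system boundaries.
# $1 = numeric UID (e.g. 99), $2 = directory to search (defaults to /).
files_owned_by_uid() (
    find "${2:-/}" -xdev -user "$1" -print 2>/dev/null
)
```

After reviewing the output of `files_owned_by_uid 99`, the ownership change itself could be done as root with something like `find / -xdev -user 99 -exec chown -h 65534 {} +` (and similarly with `-group` and `chgrp` for the GID).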