FS#69563 - core/glibc 2.33 prevents Archlinux from running under systemd-nspawn
Attached to Project:
Arch Linux
Opened by Federico Cuello (fedux) - Saturday, 06 February 2021, 14:49 GMT
Last edited by Allan McRae (Allan) - Saturday, 13 February 2021, 23:48 GMT
Details
Description:
When running Archlinux under Ubuntu 20.10 using nspawn, upgrading glibc to version 2.33 breaks the execution:

Additional info:
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: systemd 247.3-1-arch running in system mode. (+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy>
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: Detected virtualization systemd-nspawn.
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: Detected architecture x86-64.
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]:
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: Welcome to Arch Linux!
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]:
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: Set hostname to <xxxxxx>.
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: Failed to create /init.scope control group: Operation not permitted
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: Failed to allocate manager object: Operation not permitted
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: [!!!!!!] Failed to allocate manager object.
Feb 06 11:15:54 kaoz-srv systemd-nspawn[1471]: Exiting PID 1...

Downgrading to:
core/gcc 10.2.0-4
core/gcc-libs 10.2.0-4
core/glibc 2.32-5
fixes the issue.

Steps to reproduce:
Boot Archlinux inside systemd-nspawn.
Closed by Allan McRae (Allan)
Saturday, 13 February 2021, 23:48 GMT
Reason for closing: Fixed
Additional comments about closing: glibc-2.33-4
```
...
Step 3/18 : RUN pacman -Syu --noconfirm --noprogressbar
---> Running in dc424abf9fce
:: Synchronizing package databases...
downloading core.db...
downloading extra.db...
downloading community.db...
:: Starting full system upgrade...
resolving dependencies...
looking for conflicting packages...
Packages (12) curl-7.75.0-1 e2fsprogs-1.46.0-1 findutils-4.8.0-1 gcc-libs-10.2.0-6 glib2-2.66.6-1 glibc-2.33-3 libldap-2.4.57-1
linux-api-headers-5.10.13-1 pacman-mirrorlist-20210206-1 systemd-247.3-1 systemd-libs-247.3-1 systemd-sysvcompat-247.3-1
Total Download Size: 54.81 MiB
Total Installed Size: 249.56 MiB
Net Upgrade Size: 0.04 MiB
:: Proceed with installation? [Y/n]
...
---> 738379fec923
Step 4/18 : RUN ls -l /var/lib/pacman
---> Running in 4ddf9edeb25f
total 9
drwxr-xr-x 110 root root 111 Feb 7 17:02 local
drwxr-xr-x 2 root root 5 Feb 7 17:02 sync
Removing intermediate container 4ddf9edeb25f
---> 3006db6e04d6
Step 5/18 : RUN pacman -Syu --noconfirm --noprogressbar
---> Running in 7d9ed43a3beb
error: failed to initialize alpm library
(could not find or read directory: /var/lib/pacman/)
```
Edit:
Just for an extra data point, rebuilding `pacman` doesn't help here - it looks more fundamental than that (e.g. `makepkg` can't find `/etc/makepkg.conf`). However, `pacman-static` does still work, so it might still point towards some other linking issue?
Edit 2:
The main difference I can see between pacman-5.2.2 and pacman-static-5.2.2 is this in an `strace -f pacman`:
```
newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0755, st_size=27, ...}, 0) = 0
newfstatat(AT_FDCWD, "/var/lib/pacman/", {st_mode=S_IFDIR|0755, st_size=4, ...}, 0) = 0
```
This output doesn't exist with `strace -f pacman-static`, instead it uses `stat`:
```
stat("/", {st_mode=S_IFDIR|0755, st_size=27, ...}) = 0
stat("/var/lib/pacman/", {st_mode=S_IFDIR|0755, st_size=4, ...}) = 0
```
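The comparison above can be reproduced mechanically. A minimal sketch, assuming each trace was saved as text; the two sample strings are copied from the excerpts in this comment, and the helper name `syscall_names` is made up for illustration:

```python
import re

# Matches the syscall name at the start of an strace line, optionally
# preceded by the "[pid N] " prefix that `strace -f` adds.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(")


def syscall_names(trace_text):
    """Return the set of syscall names that appear in strace output."""
    names = set()
    for line in trace_text.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


# Sample lines copied from the two traces quoted above.
DYNAMIC_TRACE = """
newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0755, st_size=27, ...}, 0) = 0
newfstatat(AT_FDCWD, "/var/lib/pacman/", {st_mode=S_IFDIR|0755, st_size=4, ...}, 0) = 0
"""
STATIC_TRACE = """
stat("/", {st_mode=S_IFDIR|0755, st_size=27, ...}) = 0
stat("/var/lib/pacman/", {st_mode=S_IFDIR|0755, st_size=4, ...}) = 0
"""

# Set differences show which calls are unique to each binary.
only_dynamic = syscall_names(DYNAMIC_TRACE) - syscall_names(STATIC_TRACE)
only_static = syscall_names(STATIC_TRACE) - syscall_names(DYNAMIC_TRACE)
print("only in pacman:", sorted(only_dynamic))
print("only in pacman-static:", sorted(only_static))
```

Running this over the full traces rather than two-line excerpts would list every syscall that only one of the binaries uses, which narrows down where to look.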
The latest images on Docker Hub are likewise broken ... (images: archlinux/archlinux:base and archlinux/archlinux:base-devel) ... the "archlinux:base-devel" from the Docker Library is not yet affected as it has glibc-2.32-5 (see below)
Installing glibc-2.33-3 breaks something big... I get wild permission errors when running something like "ldd /usr/sbin/pacman"...
Will try to find more but this needs to be pulled *now* as the *published* Docker images are *broken*.
---
glibc from docker library image:
Name : glibc
Version : 2.32-5
Description : GNU C Library
Architecture : x86_64
URL : https://www.gnu.org/software/libc
Licenses : GPL LGPL
Groups : None
Provides : None
Depends On : linux-api-headers>=4.10 tzdata filesystem
Optional Deps : gd: for memusagestat
Required By : argon2 attr audit base bash binutils bison bzip2 coreutils device-mapper diffutils expat fakeroot file findutils flex gawk gcc-libs gdbm gnupg grep gzip iproute2 json-c kbd keyutils kmod krb5 less libcap libcap-ng libffi libgpg-error libksba libmnl libnfnetlink libnghttp2 libnl libp11-kit libpcap libseccomp libtasn1 libtool libunistring libxcrypt lz4 m4 make ncurses openssl pacman pam patch pciutils perl pkgconf popt procps-ng readline sed sudo systemd-libs tar which zlib
Optional For : None
Conflicts With : None
Replaces : None
Installed Size : 46.16 MiB
Packager : Jelle van der Waa <jelle@archlinux.org>
Build Date : Wed 14 Oct 2020 05:00:17 PM UTC
Install Date : Sun 31 Jan 2021 12:20:10 AM UTC
Install Reason : Installed as a dependency for another package
Install Script : Yes
Validated By : Signature
https://bbs.archlinux.org/viewtopic.php?id=263376
https://bbs.archlinux.org/viewtopic.php?id=263379
https://github.com/soloturn/swift-aur/runs/1847303230?check_suite_focus=true
It also affects the Docker images from lopsided/archlinux:devel, which are multi-arch.
https://github.com/moby/moby/pull/41353/files
https://github.com/opencontainers/runc/issues/2151
https://docs.docker.com/engine/release-notes/
https://github.com/systemd/systemd/pull/16819/files
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=3d3ab573a5f3071992cbc4f57d50d1d29d55bde2
https://bugzilla.redhat.com/show_bug.cgi?id=1869030
```
$ apt-cache policy docker-ce
docker-ce:
Installed: 5:20.10.3~3-0~ubuntu-bionic
Candidate: 5:20.10.3~3-0~ubuntu-bionic
Version table:
*** 5:20.10.3~3-0~ubuntu-bionic 500
500 https://download.docker.com/linux/ubuntu bionic/stable amd64 Packages
100 /var/lib/dpkg/status
$ apt-cache policy systemd
systemd:
Installed: 237-3ubuntu10.44
Candidate: 237-3ubuntu10.44
Version table:
*** 237-3ubuntu10.44 500
500 https://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
100 /var/lib/dpkg/status
```
(It's "up-to-date" with available repo packages)
I'm not sure if it's the same issue with fedux's host system though...
https://github.com/sickcodes/Docker-OSX/issues/144
Another example:
https://github.com/johannesjo/super-productivity/issues/877
Building an Arch Linux docker image using hub.docker.com automated builds is affected.
However, building an Arch Linux docker image on an Arch Linux host is unaffected.
Likewise you should check docker for faccessat2 support.
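One way to run that check from inside a container is to issue the syscall directly and look at the error code. This is only a sketch: the syscall number 439 is an assumption that holds for x86-64 and the generic syscall table (some architectures use a different number), and `probe_faccessat2` is a made-up helper name.

```python
import ctypes
import errno
import os
import sys

SYS_FACCESSAT2 = 439  # assumed: x86-64 / asm-generic syscall number
AT_FDCWD = -100       # from <fcntl.h>
AT_EACCESS = 0x200    # from <fcntl.h>


def probe_faccessat2(path=b"/"):
    """Call faccessat2 directly and classify the outcome as a short string."""
    if not sys.platform.startswith("linux"):
        return "not-linux"
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    # syscall() is variadic, so every argument is passed as an explicit C type.
    ret = libc.syscall(
        ctypes.c_long(SYS_FACCESSAT2),
        ctypes.c_long(AT_FDCWD),
        ctypes.c_char_p(path),
        ctypes.c_long(os.F_OK),
        ctypes.c_long(AT_EACCESS),
    )
    if ret == 0:
        return "ok"
    err = ctypes.get_errno()
    if err == errno.ENOSYS:
        return "ENOSYS"
    if err == errno.EPERM:
        return "EPERM"
    return "other:" + errno.errorcode.get(err, str(err))


print(probe_faccessat2())
```

An `ENOSYS` result would mean the kernel simply lacks the call, while `EPERM` for a path as harmless as `/` would suggest that something between the process and the kernel (for example a seccomp filter) is rejecting it.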
I am using Arch. It works fine on my machine.
However, every build on hub.docker.com that uses archlinux:latest will fail if it involves glibc 2.33.
I will go ahead and submit a bug report to docker.com.
faccessat2(AT_FDCWD, "/var/lib/pacman/", F_OK, AT_EACCESS) = -1 EPERM (Operation not permitted)
faccessat2() was introduced in Linux 5.8, so glibc 2.33 won't work in containers on hosts that have older kernels.
It looks like this is the commit that was responsible: https://sourceware.org/git/?p=glibc.git;a=commit;h=3d3ab573a5f3071992cbc4f57d50d1d29d55bde2
EDIT: I think I read the patch wrong. It looks like it will try to fall back to faccessat() if faccessat2() returns ENOSYS. But in this case the seccomp policy is making it return EPERM even if the policy includes faccessat2().
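To make the ENOSYS versus EPERM distinction concrete, here is a small simulation of that fallback pattern. It is not glibc's code; the callables stand in for the two syscalls, and `access_with_fallback`, `failing` and `legacy` are hypothetical names.

```python
import errno
import os


def access_with_fallback(new_call, old_call):
    """Try the new call; use the old one only if the new one is missing (ENOSYS)."""
    try:
        return new_call()
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # any other error, EPERM included, is reported to the caller
    return old_call()


def failing(code):
    """Build a stand-in syscall that always fails with the given errno."""
    def call():
        raise OSError(code, os.strerror(code))
    return call


def legacy():
    """Stand-in for the older faccessat() path."""
    return "legacy path used"


# Kernel without the new syscall: the fallback kicks in.
print(access_with_fallback(failing(errno.ENOSYS), legacy))

# Filter answering EPERM: the error propagates and no fallback is attempted.
try:
    access_with_fallback(failing(errno.EPERM), legacy)
except PermissionError as exc:
    print("no fallback:", exc)
```

With this shape, a sandbox that answers EPERM for an unknown syscall looks to the caller like a genuine permission problem, which matches the pacman error reported earlier in the thread.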
FWIW, the docker host running into this issue has a kernel from debian/sid v5.8.10-1 ... so just having an up to date kernel isn't the only thing
https://github.com/opencontainers/runc/pull/2750
Will update when hub.docker.com states they've fixed it.
however the docker image could rely on a glibc-linux4 patched glibc package that allows compatibility with linux 4.x hosts:
--- a/sysdeps/unix/sysv/linux/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/kernel-features.h
@@ -213,7 +213,7 @@
/* The faccessat2 system call was introduced across all architectures
in Linux 5.8. */
#if __LINUX_KERNEL_VERSION >= 0x050800
-# define __ASSUME_FACCESSAT2 1
+# define __ASSUME_FACCESSAT2 0
#else
# define __ASSUME_FACCESSAT2 0
#endif
(there are also the time64 5.x syscalls, but you get the idea)
what do you think ?
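For anyone reading the diff: the constant `0x050800` appears to pack the kernel version as one byte each for major, minor and patch, so 5.8.0 becomes `0x050800`. A rough sketch of that comparison, assuming this encoding and assuming the minimum supported kernel chosen at build time is what ends up in `__LINUX_KERNEL_VERSION` (the function names are made up):

```python
def kernel_version_code(major, minor, patch=0):
    """Pack a kernel version the way the 0x050800 constant suggests."""
    return (major << 16) | (minor << 8) | patch


# Threshold used in the diff: faccessat2 arrived in Linux 5.8.
FACCESSAT2_MIN = kernel_version_code(5, 8)


def assumes_faccessat2(min_kernel):
    """Mirror the #if in the diff: would the build assume faccessat2 exists?"""
    return kernel_version_code(*min_kernel) >= FACCESSAT2_MIN


for minimum in [(4, 4), (5, 4), (5, 8), (5, 10)]:
    print(minimum, hex(kernel_version_code(*minimum)), assumes_faccessat2(minimum))
```

Under those assumptions, a build whose minimum kernel is below 5.8 takes the `#else` branch and keeps the runtime fallback, without editing the header.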
Apparently the fix will be included in the already "released" runc 1.0.0-rc93 (https://github.com/opencontainers/runc/milestone/10). However, runc is part of containerd (current stable 1.4.3), and so far there is no sign of a 1.4.4; this will probably be part of containerd 1.5 (currently in beta, https://github.com/containerd/containerd/releases), which has no planned release date (https://containerd.io/releases/).
Unless debian and others backport runc 1.0.0-rc93, this won't get fixed soon
And ubuntu... I mean, consider the number of Ubuntu LTS releases -- bionic (18.04 supported until 2023-04), focal (20.04 supported until 2025-04)...
focal-updates is using containerd 1.3.3... https://packages.ubuntu.com/focal-updates/containerd
Even the bleeding edge Ubuntu (hirsute) is on runc 1.0.0-rc92.. but the older releases are on runc 1.0.0-rc10! https://packages.ubuntu.com/hirsute/runc
On the debian front, it looks like runc 1.0.0-rc93 was pushed to testing(Sid) yesterday (2021-02-08)... https://tracker.debian.org/pkg/runc
However, whether this gets integrated into containerd .......
----
Ultimately: A *lot* of Docker hosts are running some version of Ubuntu or Debian (I don't know the exact number but I feel safe with a guesstimate of >25%). As it stands now, that means that Arch on Docker is *flat out broken* on ~>25% of hosts. (And it gets worse as packages start to be built with glibc 2.33 -- python 3.9 stopped working since the latest package is built against glibc 2.33).
One of the reasons I love Arch is that it *is* a rolling distribution. I get the predictability/stability that is offered by something like an Ubuntu LTS - but chasing down outdated packages is really a pain. Arch offers a (typically very) stable up-to-date experience. However, in this case, it's *super* broken. Like "why doesn't my server work anymore" massive broken.
And because of the rolling nature of upgrades, Arch on Docker is - for all but a few exceptional cases - completely broken.
I get the idea that "oh well, this isn't really Arch's fault, everyone else needs to upgrade" -- but the reality is that the *vast* majority of docker hosts aren't going to be upgraded anytime soon. And in that timeframe, a huge number of people will run into issues.
https://aur.archlinux.org/packages/glibc-linux4/
- a global seccomp filter for a faccessat2 -> faccessat transition/wrapper
Assuming that global filters are a thing, even then - this would be error prone due to semantic changes across the two functions.
- add runtime faccessat2 detection, falling back to faccessat as applicable - in the glibc library and/or headers (if the call is embedded into the users)
Should be more stable, albeit the overhead might be noticeable.
Syscall was introduced with linux 5.8 where our LTS is "only" 5.4.96
https://aur.archlinux.org/packages/glibc-linux4/
I've just tried passing https://github.com/moby/moby/blob/master/profiles/seccomp/default.json to my Ubuntu-hosted Docker (as per https://docs.docker.com/engine/security/seccomp/) without any change in the container's resulting behaviour. That default.json already (?) includes e.g. faccessat, faccessat2, but then I might be missing other syscalls... (and that's not an Arch issue).
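To double-check which syscalls a profile lists, something like the following can be run against the JSON. It is a sketch under assumptions: the sample below is a trimmed-down, hypothetical profile in the same layout as Docker's default.json (`syscalls` entries with `names`, `action` and optional `includes`/`excludes`/`args`), and `unconditionally_allowed` is a made-up helper.

```python
import json


def unconditionally_allowed(profile):
    """Collect syscall names allowed by rules that carry no extra conditions."""
    allowed = set()
    for rule in profile.get("syscalls", []):
        conditional = rule.get("includes") or rule.get("excludes") or rule.get("args")
        if rule.get("action") == "SCMP_ACT_ALLOW" and not conditional:
            allowed.update(rule.get("names", []))
    return allowed


# Hypothetical miniature profile, for illustration only.
SAMPLE_PROFILE = json.loads("""
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {"names": ["faccessat", "faccessat2"], "action": "SCMP_ACT_ALLOW"},
    {"names": ["mount"], "action": "SCMP_ACT_ALLOW",
     "includes": {"caps": ["CAP_SYS_ADMIN"]}}
  ]
}
""")

allowed = unconditionally_allowed(SAMPLE_PROFILE)
for name in ("faccessat", "faccessat2", "mount"):
    print(name, "listed" if name in allowed else "not listed unconditionally")
```

For a real profile, replace the sample with `json.load(open(path))`. Note that a syscall being listed in the JSON only shows what the profile asks for; as observed above, the container can still behave differently.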
So perhaps we want to:
- introduce linux-lts-api-headers
- update glibc to depend on ^^
- rebuild glibc/other affected software against linux-lts
[1] https://github.com/bminor/glibc/blob/3016596a819aeedfdc7d658435016be413a1fca7/sysdeps/unix/sysv/linux/faccessat.c#L27
else you want to completely avoid the new syscalls like in my patch to avoid EPERM
another problem are the time64 syscalls which appeared in kernel 5.1 (and ubuntu uses 4.x)
I installed *just* runc (1.0.0~rc93+ds1-1) from debian/sid (https://packages.debian.org/sid/runc)
It seems to have fixed the issue! Pacman works again... And the image is up-to-date. More testing required...
And this was running on Docker 19.03 / containerd 1.3.3 from Ubuntu focal on the host....
a) use linux-lts-api-headers and nuke linux-api-headers, or
b) nuke linux-lts
This issue will continue to happen - chances are - even more often, as kernel developers continue introducing new syscalls.
Downgrade to glibc 2.33-2 seems to work as well! After downgrade to glibc 2.32-5, pacman no longer works (version `GLIBC_2.33' not found (required by pacman))
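The `GLIBC_2.33' not found` message is what one would expect when a binary built against 2.33 meets an older libc. A quick way to see which glibc a running system reports, as a sketch: `os.confstr("CS_GNU_LIBC_VERSION")` is only available on glibc systems, and the helper names are made up.

```python
import os


def parse_glibc_version(text):
    """Turn a string such as 'glibc 2.33' into a tuple like (2, 33)."""
    name, _, version = text.partition(" ")
    if name != "glibc" or not version:
        return None
    try:
        return tuple(int(part) for part in version.split("."))
    except ValueError:
        return None


def runtime_glibc_version():
    """Return the running glibc version, or None when it cannot be determined."""
    try:
        text = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, OSError, ValueError):
        return None
    return parse_glibc_version(text) if text else None


def satisfies(runtime, required):
    """True when the running libc is at least the version a binary asks for."""
    return runtime is not None and runtime >= required


print("running glibc:", runtime_glibc_version())
print("2.32 satisfies a GLIBC_2.33 requirement:", satisfies((2, 32), (2, 33)))
```

This only compares version numbers; the dynamic loader's actual check is per versioned symbol, so treat it as a first approximation.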
If other distros are any indication that lts-api-headers should work - let's look at Debian:
- single kernel through release life (4.19.x for Buster)
- newer kernel is available in backports, yet linux/capability.h is _not_ updated
Looking through the mentioned projects:
- libcap - [1] _always_ uses internal copy of the UAPI linux/capability.h
- libcap-ng (is this used by Arch?) - [2] does not check the define against the procfs
- util-linux - [3] uses a combination of procfs and prctl(PR_CAPBSET_READ) to find the last cap
These examples seem perfectly fine with linux-lts-api-headers - perhaps they were different in the past.
Can you think of any other places where this might be an issue? I'm more than happy to check through the code.
[1]
https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/Make.Rules#n54
https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/libcap/include/uapi/linux/capability.h
[2]
https://github.com/stevegrubb/libcap-ng/blob/b6ff250a71a1f0a11b2917186155d2426080293d/utils/filecap.c#L114
https://github.com/stevegrubb/libcap-ng/blob/f15c2d318c1b50e29efcce3036d19ccd0d4d8b8b/src/cap-ng.c#L202
[3]
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/tree/lib/caputils.c
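The procfs-plus-prctl approach described for util-linux can be sketched as follows. This is loosely modeled on the description above, not a copy of util-linux's code; the function names and the probing limit of 64 are assumptions made for illustration.

```python
import ctypes
import sys

PR_CAPBSET_READ = 23  # from <linux/prctl.h>
CAP_LAST_CAP_PATH = "/proc/sys/kernel/cap_last_cap"


def last_cap_from_procfs(path=CAP_LAST_CAP_PATH):
    """Ask the running kernel for its highest capability number via procfs."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def last_cap_from_prctl(limit=64):
    """Probe capabilities one by one until the kernel rejects the number."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        prctl = ctypes.CDLL(None, use_errno=True).prctl
    except (OSError, AttributeError):
        return None
    last = None
    for cap in range(limit):
        # prctl() is variadic, so arguments are passed as explicit C types.
        ret = prctl(ctypes.c_int(PR_CAPBSET_READ), ctypes.c_ulong(cap),
                    ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
        if ret < 0:
            break  # unknown capability number, or prctl not permitted
        last = cap
    return last


def last_capability():
    """Prefer procfs; fall back to probing with prctl."""
    value = last_cap_from_procfs()
    return value if value is not None else last_cap_from_prctl()


print("last capability known to the running kernel:", last_capability())
```

The point of detecting this at runtime is that the answer comes from the running kernel rather than from whichever `linux/capability.h` was present at build time.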
FS#67781?
Edit:
https://github.com/karelzak/util-linux/commit/5d95818757941bc609e5aeec5e2218f7d35a6e19 is not in a release?
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/tree/sys-utils/setpriv.c?h=v2.36.1#n537
[1] https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/commit/?id=93de9f687d1640fff963f26b7db474eef3746532
Submitted util-linux MR backporting the changes to the next stable [1].
[1] https://github.com/karelzak/util-linux/pull/1248 - Merged \o/
Edit: linked backports MR
>As far I can see, the only difference between 2.33-2 and 2.33-3 is bz27343.patch. So this patch seems to be to blame here.
@skoehler I think the reason -2 works is because it might have been built using older linux headers and not because of the patch.
EDIT: BTW, 2.33-2 doesn't work for me on Ubuntu 20.10 + systemd-nspawn to run Arch
It must be built with linux-lts-headers, which are at version 5.4 right now. Otherwise, Arch maintainers risk that glibc won't work with linux-lts.
It is likely that glibc is not even built with linux-lts-headers to ensure compatibility.
As-is glibc is built against non LTS UAPI, thus the new system call is mandated. Which in turn is missing for the linux-lts kernel.
The non LTS UAPI was used due to util-linux bug FS#67781, which has been fixed upstream and I've requested an Arch backport in FS#69613.
So util-linux would work without the fixes?
@anthraxx are there other issues with leaving linux-api-headers on 5.10 when linux moves to 5.11?
Edit:
That would allow until 5.12 for util-linux to have a release with the fixes.
Okay, so, besides the point that linux-lts will soon be 5.10 (because 5.10 is an lts release), there's also that, unless I misread the thread, the problem comes down to some outdated systems not being able to containerize newer Archlinux because, well, they are outdated. That doesn't sound like an Archlinux bug to me.
When I read the patch description that added the `faccessat2` syscall¹, it sounds like it was added for a reason; I'm not sure it's a good idea to ignore it.
1: https://lwn.net/Articles/817916/
If other distributions do not work with our toolchain because they are outdated, it is not an Arch problem.
Arch is not fine.
1) whitelisting syscalls in non-Arch packages/software is not our problem
2) running against older kernels is sort-of our problem - I'd still argue that having old packages is not our problem, but it is often out of control of the user.
glibc-2.33-4 in [testing] should fix this. I re-added the --enable-kernel option, which was inexplicably removed from the PKGBUILD four years ago (surprised it has not hit us before). I set the minimum kernel version to 4.4, the oldest LTS still maintained upstream. Anything older is really not an Arch problem! All those patching glibc should really just have looked at the configure options!