FS#71021 - [glibc] [containerd] realpath and paths with trailing slashes will EPERM on older kernels

Attached to Project: Arch Linux
Opened by Makoto Mizukami (makotom) - Wednesday, 26 May 2021, 03:24 GMT
Last edited by Andreas Radke (AndyRTR) - Thursday, 27 May 2021, 18:20 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Assigned
Assigned To: Giancarlo Razzolini (grazzolini)
Architecture: All
Severity: Medium
Priority: Normal
Reported Version: 6.0.0
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 0
Private: No

Details

Summary and Info:

As I understand it [1], pacman's default value for `DBPATH` is `/var/lib/pacman/` (_**with** a trailing slash_).
However, due to several changes introduced in glibc 2.33, this trailing slash causes EPERM when `realpath(3)` is called on this path on older Linux kernels. (The function is indeed called by ALPM [2] [3].)
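
To check how `realpath(3)` treats the two spellings of a path in a given environment, a minimal sketch along these lines may help. It calls the C library's `realpath` directly through `ctypes` rather than Python's own `os.path.realpath`, which does not go through libc. The helper name `libc_realpath` and the 4096-byte buffer (the usual `PATH_MAX` on Linux) are assumptions made for this illustration and are not taken from pacman or ALPM.

```python
import ctypes
import errno
import os

# Load the C library already linked into the interpreter and keep errno.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.realpath.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
_libc.realpath.restype = ctypes.c_char_p


def libc_realpath(path):
    """Call realpath(3); return (resolved_path, None) or (None, errno_name)."""
    buf = ctypes.create_string_buffer(4096)  # assumed PATH_MAX on Linux
    ctypes.set_errno(0)
    result = _libc.realpath(os.fsencode(path), buf)
    if result is None:
        err = ctypes.get_errno()
        return None, errno.errorcode.get(err, str(err))
    return os.fsdecode(result), None


if __name__ == "__main__":
    # Compare the same directory with and without the trailing slash.
    for candidate in ("/var/lib/pacman", "/var/lib/pacman/"):
        print(candidate, "->", libc_realpath(candidate))
```

In an affected container, the second call would be expected to report `EPERM` while the first succeeds; in an unaffected environment both calls should return the same result.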

Steps to Reproduce:

1. Set up a Docker host running a somewhat older Linux kernel.
The author followed this repro procedure on Ubuntu 18.04 LTS, which runs Linux kernel 4.15.0.
Note that this Ubuntu version with this kernel version is still supported by Canonical [4].

2. Set up a Docker container with the official `archlinux` Docker image and start a shell inside the container.

3. Populate the targeted version of `pacman` accordingly.
The author tested this procedure with pacman v5.2.2 and v6.0.0.
At this point, you may find that you need the `-b` option to avoid the error discussed in this ticket. (See the analysis below.)

4. Run `pacman -S`

Expected behaviour:

The pacman command succeeds.

Actual behaviour:

The pacman command fails with an error message `could not find or read directory`.

Analysis:

The error can be suppressed by passing the command-line argument `-b /var/lib/pacman` (_**without** a trailing slash_).

This finding suggests that the issue can be fixed by removing the trailing slash (`/`) in the default value.
The author has not yet considered any side effects of removing the trailing slash from the default value.
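
For illustration only, the normalization suggested above could look like the following sketch. The helper `strip_trailing_slashes` is hypothetical and is not part of pacman or ALPM; it only shows what "removing the trailing slash" means for a path string, including the edge case of the root directory.

```python
def strip_trailing_slashes(path):
    """Drop trailing slashes from a path, but keep "/" itself intact."""
    if not path:
        return path  # leave an empty string untouched
    stripped = path.rstrip("/")
    # A path made only of slashes refers to the root directory.
    return stripped or "/"


if __name__ == "__main__":
    print(strip_trailing_slashes("/var/lib/pacman/"))  # /var/lib/pacman
```

Any consumer that compares or concatenates the configured value as a string would see a different result after such a change, so this is a sketch of the idea rather than a recommended patch.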

See [5] also.

[1] https://git.archlinux.org/pacman.git/tree/meson.build#n71
[2] https://git.archlinux.org/pacman.git/tree/lib/libalpm/alpm.c#n50
[3] https://git.archlinux.org/pacman.git/tree/lib/libalpm/handle.c#n418
[4] https://ubuntu.com/kernel/lifecycle#installation-18-04
[5] https://github.com/circle-makotom/demo-20210526-realpath

Comment by Allan McRae (Allan) - Wednesday, 26 May 2021, 03:34 GMT
Was this an intended change in glibc-2.33? Sounds like something they would want to fix.

Comment by Eli Schwartz (eschwartz) - Wednesday, 26 May 2021, 03:36 GMT
Given this only occurs when you upgrade to glibc 2.33 and you're *also* using docker, have you checked whether this is in fact actually a docker bug like the well-known FS#69563?

In fact, if glibc realpath() is actually EPERMing at you when it didn't use to, doesn't this seem obviously like a glibc bug or something else in the depths of your execution environment, which badly needs to be fixed rather than being papered over by changing applications to not use slashes?

I don't like the idea of removing this slash as it affects the output of "pacman-conf DBPath" and this already regressed in the meson port which actively broke third-party scripts. Like pacman-contrib.

Comment by Makoto Mizukami (makotom) - Wednesday, 26 May 2021, 15:04 GMT
Ahh, thanks for the good catch. I just realized that the issue doesn't happen if I directly execute (without Docker) a binary statically linked against glibc 2.33. That means it's Docker-dependent (and in fact I confirmed it's containerd-dependent).

Now I'm just curious whether Arch could take active measures against this, as I've heard reports that the issue bothers several users; you can even find some of them on Google. (Though it now sounds to me like the side effects of removing the trailing slash from the default value are considerable.)

Comment by Eli Schwartz (eschwartz) - Wednesday, 26 May 2021, 15:38 GMT
I don't believe there's anything we could do other than following Debian stable and not upgrading glibc until after some other distro like Fedora Rawhide exposes such containerd bugs first.

In any event, does this happen on virtualization platforms that provide the latest version of docker/containerd/etc. or did the system you tested this on still distribute a slightly older version of docker?

In other words, is this a new bug not solved by FS#69563?

Comment by Makoto Mizukami (makotom) - Wednesday, 26 May 2021, 17:23 GMT
> In any event, does this happen on virtualization platforms that provide the latest version of docker/containerd/etc. or did the system you tested this on still distribute a slightly older version of docker?
> In other words, is this a new bug not solved by FS#69563?

I was on a slightly older version of Docker and containerd (docker-ce_19.03.15 + containerd.io_1.4.3-1, distributed by Docker for Ubuntu).
I'm not fully confident that I understand the context of FS#69563, but I feel this is related to FS#69563 and not solved by it.

(Seemingly it was marked "won't fix" in FS#69563 because it's in the scope of the underlying infrastructure and not that of Arch...? Still, I believe the lives of container consumers would be much easier if it could be remedied with in-container approaches, e.g., patches in glibc.)