FS#73352 - [pacman] "pacman -Syu" has overwritten all system files with empty files

Attached to Project: Arch Linux
Opened by andrew (andrew-ld) - Thursday, 13 January 2022, 20:19 GMT
Last edited by Toolybird (Toolybird) - Thursday, 21 September 2023, 00:18 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Allan McRae (Allan), Levente Polyak (anthraxx)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 5
Private No

Details

Description:
Today, while running "pacman -Syu", I noticed that ldconfig started complaining that some shared libraries were empty files. Not paying much attention to this, I restarted the computer, and it could no longer load some system modules. I noticed that the initramfs had not been updated, so I tried to regenerate it manually, but the .preset file was empty. After some analysis I found that every single updated package had overwritten both its configuration files and its binaries with empty files.

Additional info:
package version: Pacman v6.0.1 - libalpm v13.0.1

- I honestly have no idea how something of this caliber could have happened. I have not modified the system configuration; I keep the system as standard as possible, with defaults.

- I tried to reinstall all system packages (I cleared the pacman cache first). After this massive reinstall the system binaries were restored. The system configuration files, on the other hand, remained empty.

- I don't have any third-party repos, I haven't updated any packages from the AUR, and I don't have any unofficial pacman hooks.

- I checked the system logs with journalctl and ran fsck on the system filesystem, but there is no sign of corruption that would indicate the problem existed before this update.

Steps to reproduce:
pacman -Syu

Closed by Toolybird (Toolybird)
Thursday, 21 September 2023, 00:18 GMT
Reason for closing: Upstream
Additional comments about closing: Please report "upstream" as per comments.
Comment by Jonathon (jonathon) - Thursday, 13 January 2022, 22:10 GMT
I've seen similar behaviour in two instances. One was when the disk filled up during the update process (i.e. after pacman had calculated that the necessary disk space was available); the other was (I think) when an NFS share that held /var/cache/pacman/pkg vanished during the update, leaving the process broken in a half-completed state.

Other adjacent issues were due to bad RAM causing filesystem corruption, and when /var/cache/pacman/pkg is a symlink and an update to pacman causes the package cache to vanish part-way through the update.
Comment by andrew (andrew-ld) - Friday, 14 January 2022, 18:29 GMT
Jonathon (jonathon), I don't have any of these setups; the filesystem holding the packages is the same filesystem I installed them on.
Comment by Allan McRae (Allan) - Tuesday, 18 January 2022, 00:03 GMT
Disk space issues would result in one package being empty, and even then it may be limited to a single file. Losing the pacman cache makes the current or next package install fail. Neither would explain multiple packages having empty files.

I'm not sure there is anything we can do here. This appears unique in millions of pacman updates, so without further information there is nothing we can do.
Comment by Alexander F. Rødseth (xyproto) - Friday, 21 January 2022, 22:41 GMT
I don't know if it's related or not, but I have an unusual issue with the gambit-c package, where /usr/bin/gsi is empty:

https://bugs.archlinux.org/task/73426

When building with `makepkg`, it's there, but when building with extra-x86_64-build from devtools 20211129-1 and installing the package with `pacman -U`, it's empty.

I tried different compilation flags for gcc, and stripping and not stripping, but I get the same result.

Could this be connected to devtools somehow?
Comment by Allan McRae (Allan) - Saturday, 22 January 2022, 01:52 GMT
That seems entirely unrelated.
Comment by Alexander F. Rødseth (xyproto) - Wednesday, 26 January 2022, 11:37 GMT
Which filesystems are involved here? Is it ext4 for the entire /?
Comment by andrew (andrew-ld) - Wednesday, 26 January 2022, 15:58 GMT
Alexander F. Rødseth (xyproto), it is a single XFS filesystem: no encryption, no RAID, no custom mount parameters.
Comment by Caleb Maclennan (alerque) - Friday, 04 February 2022, 23:25 GMT
Similar or same issue here last week on a BTRFS root file system (with dedup options enabled) that ran out of space during upgrade. It appeared to have space at the start of the operation but when it ran out pacman overwrote a couple hundred /usr/lib files with 0 size nothingness.
Comment by Joo Kia (Jookia) - Friday, 05 August 2022, 11:24 GMT
I hit this issue a few weeks ago. I had maybe 2 GB left on my btrfs root partition and ran an update, since pacman okayed the required size. Evidently something else on my system was eating space.

It did indeed overwrite multiple files with empty files; pacman does not stop with an error after it fails to write one file.
Comment by Yauhen (actionless) - Thursday, 24 November 2022, 19:57 GMT
(Sorry, I didn't notice that this is already an old thread.)
Comment by Tomas Mudrunka (harvie) - Friday, 03 February 2023, 21:33 GMT
As a long-time Arch user, I remember filing a bug for this exact issue back in the day.
When the disk filled up during a full system upgrade, pacman replaced lots of files with empty ones, rendering the system useless or unbootable.

Back then the solution was to check for free space before proceeding with the upgrade, which helped a lot at the time.
But it was also a workaround in some sense, because the operation is not atomic. The free space can disappear between the check and actual package upgrade.
Comment by - (xiota) - Tuesday, 07 February 2023, 21:41 GMT
> the operation is not atomic. the free space can disappear between the check and actual package upgrade.

What about adding additional free space checks?

I had a similar problem when I ran out of disk space sometime during an upgrade. I noticed the problem while `pacman` was running, so I killed the process before much damage was done. I ran out of space because `/var/cache/pacman` held numerous old packages and `/usr/lib/modules` had dozens of folders that were no longer used. I ran `pacman -Sc` and installed `kernel-modules-hook` to clean up the folders.
Comment by Tomas Mudrunka (harvie) - Wednesday, 08 February 2023, 08:46 GMT
> What about adding additional free space checks?

Same problem. I think pacman should reserve the space beforehand in some smart way.
E.g. unpack/decompress the files first (to the same filesystem/directory, with some .suffix!) and then mv them to their final location (overwriting the original files, and only after that removing the rest).
Note that this approach would require twice the free space, since every package has to exist unpacked twice for a little while, which is not such a great idea, especially for packages several GB in size...
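
For illustration, the "unpack next to the target, then mv" idea can be sketched per file roughly as below. This is only a sketch under assumptions, not pacman code: the function name `replace_atomically` and the `.part` suffix are made up for the example, and it handles a single file rather than a whole package.

```python
import os
import tempfile


def replace_atomically(dest: str, data: bytes) -> None:
    """Write data to a temporary file next to dest, then rename it over dest."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Same directory as dest, so the final rename never crosses filesystems.
    # Note: mkstemp creates the file with mode 0600; a real installer would
    # have to set the intended mode and ownership before renaming.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # surface write errors before the rename
        os.replace(tmp_path, dest)  # atomic on POSIX within one filesystem
    except BaseException:
        # On any failure (e.g. a full disk) dest still has its old contents.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this shape, a failed write leaves the old file in place instead of an empty one; the cost is the extra space for the temporary copy while both exist.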

BTW, how are Debian/apt and other distros solving this problem?
Comment by Caleb Maclennan (alerque) - Friday, 03 March 2023, 10:04 GMT
I've had this happen several times as well, always when I tried to do an -Syu on a BTRFS root file system that was out of space.

The problem with free space checks is that they are not accurate on all file systems. My systems frequently have compression settings enabled, and BTRFS and other modern file systems have de-duplication, sometimes COW, etc. This means free space calculations are only projected estimates, not hard values.
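
To make the point concrete, here is a minimal sketch of the kind of check being discussed, assuming it is built on `statvfs` (the helper names are hypothetical, and this is not taken from pacman's source):

```python
import os


def estimated_free_bytes(path: str) -> int:
    """Free space available to unprivileged users, as reported by statvfs."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def looks_sufficient(path: str, needed_bytes: int) -> bool:
    # Only a snapshot in block counts: it knows nothing about compression,
    # de-duplication or COW, and the answer can be stale a moment later.
    return estimated_free_bytes(path) >= needed_bytes
```

A `True` result here is a hint, not a reservation: nothing stops another process from using the space before the upgrade writes its files.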
Comment by andrew (andrew-ld) - Monday, 03 April 2023, 19:48 GMT
I would like to propose a possible solution to this problem.

- If the new file is larger than the file currently on the filesystem, the existing file should first be grown to the new size with fallocate.

- If the new file is smaller than the file currently on the filesystem, the file should be shrunk only after the new content has been written to it.

In either case the file to be overwritten should be opened in append mode (that is, without truncating it), so that it never becomes 0 bytes. Otherwise some other program may find the freed space on the disk and start writing new files, leaving no space available for the package being updated.

As an additional measure, either in combination with the previous one or as an alternative, pacman could stop all further operations after the first one that fails because of insufficient space.
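
A rough sketch of the two rules above, for a single file. The function name `overwrite_in_place` is hypothetical and this is not how pacman currently extracts files; it only illustrates "grow first with fallocate, shrink last, never truncate up front".

```python
import os


def overwrite_in_place(path: str, data: bytes) -> None:
    """Overwrite an existing file without truncating it first."""
    # No O_TRUNC: the old contents stay until they are actually overwritten.
    fd = os.open(path, os.O_WRONLY)
    try:
        old_size = os.fstat(fd).st_size
        if len(data) > old_size:
            # Reserve the space for the larger file up front. On a full disk
            # this raises OSError (ENOSPC) before any old byte is touched.
            os.posix_fallocate(fd, 0, len(data))
        written = 0
        while written < len(data):
            written += os.pwrite(fd, data[written:], written)
        # Shrink only after the new content is in place.
        os.ftruncate(fd, len(data))
        os.fsync(fd)
    finally:
        os.close(fd)
```

Two caveats with this sketch: overwriting in place is not atomic, so a crash mid-write leaves a mix of old and new content, and on copy-on-write filesystems the reservation may not guarantee that the overwrite itself succeeds.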
Comment by Stanislav (Stanislav_pythonist) - Friday, 21 April 2023, 16:45 GMT
Yet another idea for protecting against this: make pacman pause, ask the administrator to free up space, and continue the paused operations once confirmed.
In my opinion this is in any case better than breaking the system, as it gives a chance to manually restore normal operation.
It's also better than completely stopping the transaction in the middle, as that would require non-trivial (I think) actions from the admin to fix the system.

Behavior could be configurable of course, like:
on_nospace = stop-transaction | pause-to-ask | ignore
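
As a sketch of how such a setting might drive behaviour (purely illustrative: `on_nospace` is the option proposed above, not an existing pacman setting, and `run_step` and `ask` are made-up names):

```python
import enum
import errno
from typing import Callable


class OnNoSpace(enum.Enum):
    STOP_TRANSACTION = "stop-transaction"
    PAUSE_TO_ASK = "pause-to-ask"
    IGNORE = "ignore"


def run_step(step: Callable[[], None], policy: OnNoSpace,
             ask: Callable[[], bool] = lambda: False) -> bool:
    """Run one install step; return True if it completed."""
    while True:
        try:
            step()
            return True
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise  # unrelated errors are not covered by this policy
            if policy is OnNoSpace.IGNORE:
                return False  # carry on with the next step
            if policy is OnNoSpace.PAUSE_TO_ASK and ask():
                continue  # admin freed space and confirmed: retry the step
            raise  # stop the transaction
```

Here `ask` stands in for the interactive prompt; returning `False` from it falls back to stopping the transaction.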
Comment by Allan McRae (Allan) - Saturday, 02 September 2023, 22:57 GMT
I cannot replicate this on an ext4 filesystem. Pacman detects the failure and stops. It seems this may be a btrfs issue.
Comment by andrew (andrew-ld) - Monday, 04 September 2023, 11:58 GMT
@allan I originally encountered this issue on xfs
Comment by Toolybird (Toolybird) - Thursday, 21 September 2023, 00:18 GMT
It seems there are possibly multiple issues at play in this ticket. Either way, if pacman has a problem or can be improved in some way, it needs to be reported "upstream" at the home of pacman development [1]. But first of all, ideally someone should come up with a reliable test scenario to allow for proper reproduction and debugging. VMs are good for this kind of thing.

[1] https://gitlab.archlinux.org/pacman/pacman
