FS#65705 - CheckSpace logic doesn't take snapshots into account

Attached to Project: Pacman
Opened by Alexander Kobel (akobel) - Wednesday, 04 March 2020, 20:36 GMT
Task Type Feature Request
Category Backend/Core
Status Unconfirmed
Assigned To No-one
Architecture All
Severity Medium
Priority Normal
Reported Version 5.2.1
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 3
Private No


pacman estimates the disk space requirements under the assumption that deleting a file reclaims the space occupied by the file. That's perfectly sound, unless snapshots are taken. In a setup like the one used by snap-pac, which creates pre- and post-upgrade snapshots on btrfs partitions, deletion of a file does *not* free any space; this only happens once the snapshot is deleted.

This is not an extremely rare case anymore, and it will become even more frequent with the advent of recent filesystems and logical volume management.
FWIW, I experienced this with btrfs snapshots, but I would imagine that similar issues arise with LVM snapshots.

In an ideal world, pacman would be able to detect whether snapshots of the filesystem containing an updated file already exist (let's assume a single partition for the moment) or will be taken by pre-upgrade hooks. The latter case could be declared explicitly by setting an option like CheckSpaceLogic=Snapshot.

If so, the calculation of Net Upgrade Size should take into account which files remain unchanged during an upgrade.
For an upgrade of a package from v1 to v2, the "Effective Net Upgrade Size before snapshot cleanup" would be sum(size(f) for f in changed(v1,v2)), where the sum is taken over the files f in v2 that are actually replaced on disk.
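A minimal sketch of that sum, assuming each package version is available as a mapping from path to (size, digest). The function name and the sample data are made up for illustration; none of this is pacman code.

```python
def effective_net_upgrade_size(old_files, new_files):
    """Sum the sizes of files in v2 that are new or differ from v1.

    Both arguments map a path to a (size_in_bytes, digest) tuple.
    Unchanged files are assumed to stay shared with the snapshot,
    so they cost no extra space.
    """
    total = 0
    for path, (size, digest) in new_files.items():
        old = old_files.get(path)
        # Count the file if it did not exist in v1 or its content changed.
        if old is None or old[1] != digest:
            total += size
    return total


# Hypothetical package contents, for illustration only.
v1 = {"usr/bin/tool": (100, "aaa"), "usr/share/data": (5000, "bbb")}
v2 = {
    "usr/bin/tool": (120, "ccc"),     # changed
    "usr/share/data": (5000, "bbb"),  # unchanged
    "usr/share/doc": (30, "ddd"),     # new
}
print(effective_net_upgrade_size(v1, v2))  # 120 + 30 = 150
```

Note that files removed in v2 do not reduce the result, because their blocks are still held by the snapshot.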

I'm not an expert, but I guess that this data could be gathered with extremely high confidence from the .MTREE file. I don't know whether files with identical hashes are skipped during upgrades, but I could imagine that this is not done, to make sure that hash collisions don't affect anything. In this case, it might be necessary to avoid overwriting files with duplicates, and instead to unpack the new version, diff it against the old one, and only touch the old entry if something actually changed. Filesystems might offer more sophisticated ways to do so in an atomic operation; I'm pretty sure that btrfs does, for example.
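As a rough sketch of how sizes and digests might be pulled out of an mtree listing, assuming the text has already been decompressed and that entries carry size= and sha256digest= keywords. The parser is deliberately simplified (it ignores escaped characters in paths and /unset lines), and parse_mtree is a hypothetical helper, not part of pacman.

```python
def parse_mtree(text):
    """Return {path: (size, sha256digest)} for regular files in an mtree listing."""
    entries = {}
    defaults = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # blank line or comment such as "#mtree"
        if parts[0] == "/set":
            # "/set key=value ..." defines defaults for the following entries.
            defaults.update(kv.split("=", 1) for kv in parts[1:] if "=" in kv)
            continue
        if parts[0].startswith("/"):
            continue  # other special commands are ignored in this sketch
        attrs = dict(defaults)
        attrs.update(kv.split("=", 1) for kv in parts[1:] if "=" in kv)
        if attrs.get("type", "file") != "file":
            continue  # skip directories, symlinks, etc.
        entries[parts[0]] = (int(attrs.get("size", 0)), attrs.get("sha256digest"))
    return entries


# Made-up listing for illustration only.
sample = (
    "#mtree\n"
    "/set type=file uid=0 gid=0 mode=644\n"
    "./usr time=1.0 mode=755 type=dir\n"
    "./usr/bin/tool time=1.0 mode=755 size=120 sha256digest=abc\n"
)
print(parse_mtree(sample))
```

Two such mappings, one per package version, would be enough to decide which files actually changed.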

For a simpler measure (that does not have to explicitly compare all files in v1 and v2 individually), a conservative guess would be to simply pretend the worst case: sum(size(f) for all f in v2), that is, Effective Net Upgrade Size = Total Installed Size (of v2), assuming that no file is deleted at all.
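To illustrate the difference, here is a small sketch comparing the conservative estimate with a simplified model of the current one (new installed size minus old installed size). Both function names and the sample sizes are assumptions for illustration, not pacman's actual implementation.

```python
def classic_net_upgrade_size(old_sizes, new_sizes):
    """Simplified model: space of v2 minus space assumed reclaimed from v1."""
    return sum(new_sizes.values()) - sum(old_sizes.values())


def worst_case_upgrade_size(new_sizes):
    """Conservative estimate: nothing from v1 is reclaimed, so count all of v2."""
    return sum(new_sizes.values())


# Hypothetical huge dummy package whose single file is rewritten in each version.
v1_sizes = {"opt/dummy/blob": 1_000_000_000}
v2_sizes = {"opt/dummy/blob": 1_000_000_000}

print(classic_net_upgrade_size(v1_sizes, v2_sizes))  # 0
print(worst_case_upgrade_size(v2_sizes))             # 1000000000
```

With a snapshot holding the old blob, the second number is the one that matches the space actually consumed.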

Steps to Reproduce:

1. Have / on a snapshottable filesystem (e.g., btrfs).
2. Create some huge dummy package in a couple of versions, e.g. with the attached PKGBUILD.
3. Install versions of that package a couple of times, snapshotting / in between (either manually or using snap-pac hooks).
4. Realize that eventually you will run out of disk space, because pacman thinks the Net Upgrade Size is 0, but the old versions of the files are still kept in the snapshots.
5. Delete the snapshots to recover your system.

Also see the corresponding forum thread "Pacman/checkupdates: Net upgrade size / expected disk usage", which includes a detailed log: https://bbs.archlinux.org/viewtopic.php?pid=1890399.
   PKGBUILD (0.1 KiB)

Comment by Kenny MacDermid (kenmacd) - Friday, 01 May 2020, 20:26 GMT
Is there any update on this? When it happens it can lead to broken installs.

An option to simply not count any of the removed data (i.e., have calculate_removed_size() return 0) would prevent this issue.
Comment by Mikael Blomstrand (chawlindel) - Tuesday, 15 June 2021, 12:45 GMT
I frequently manage to break my installation because of this. It's extremely annoying. Any update on this at all?
Comment by Christian Kotte (ckotte) - Wednesday, 02 August 2023, 22:38 GMT
Any updates?