FS#50298 - /var/cache being symlink causes /var/cache/pacman to "disappear" mid upgrade

Attached to Project: Pacman
Opened by Ralph Corderoy (RalphCorderoy) - Sunday, 07 August 2016, 12:06 GMT
Last edited by Allan McRae (Allan) - Friday, 23 December 2022, 13:32 GMT
Task Type Bug Report
Category General
Status Closed
Assigned To Andrew Gregory (andrewgregory)
Architecture All
Severity Medium
Priority Normal
Reported Version 5.0.1
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 5
Private No

Details

pacman 5.0.1-4. I've two filesystems, / and /home. Overnight recently, a download-only of packages filled / so I created /home/var-cache, moved /var/cache's contents there, and made /var/cache a symlink to it. Everything continued working fine, including non-pacman users of /var/cache. The next evening, last night, four packages were downloaded into the /home filesystem:

:: Retrieving packages...
downloading fakeroot-1.21-2-x86_64.pkg.tar.xz...
downloading fontconfig-2.12.1-3-x86_64.pkg.tar.xz...
downloading mesa-12.0.1-6-x86_64.pkg.tar.xz...
downloading python-setuptools-1:25.1.6-1-any.pkg.tar.xz...

The problem appeared when I tried to install these:

$ sudo -i pacman -Su
[sudo] password for ralph:
:: Starting full system upgrade...
resolving dependencies...
looking for conflicting packages...

Packages (4) fakeroot-1.21-2 fontconfig-2.12.1-3 mesa-12.0.1-6
python-setuptools-1:25.1.6-1

Total Installed Size: 39.16 MiB
Net Upgrade Size: -0.10 MiB

:: Proceed with installation? [Y/n]
(4/4) checking keys in keyring [######################] 100%
(4/4) checking package integrity [######################] 100%
(4/4) loading package files [######################] 100%
(4/4) checking for file conflicts [######################] 100%
(4/4) checking available disk space [######################] 100%
:: Processing package changes...
(1/4) upgrading fakeroot [######################] 100%
error: cannot remove /var/cache/ (Not a directory)
(2/4) upgrading fontconfig [######################] 100%
updating font cache... done.
error: could not open file /var/cache/pacman/pkg/mesa-12.0.1-6-x86_64.pkg.tar.xz:
No such file or directory
error: could not commit transaction
error: failed to commit transaction (transaction aborted)
Errors occurred, no packages were upgraded.
$

The log file says:

[2016-08-07 10:57] [PACMAN] Running 'pacman -Su'
[2016-08-07 10:57] [PACMAN] starting full system upgrade
[2016-08-07 10:58] [ALPM] transaction started
[2016-08-07 10:58] [ALPM] upgraded fakeroot (1.21-1 -> 1.21-2)
[2016-08-07 10:58] [ALPM] error: cannot remove /var/cache/ (Not a directory)
[2016-08-07 10:58] [ALPM] upgraded fontconfig (2.12.1-1 -> 2.12.1-3)
[2016-08-07 10:58] [ALPM-SCRIPTLET] updating font cache... done.
[2016-08-07 10:58] [ALPM] transaction failed

It's the aftermath that's odd, and left me with broken packages.

/var/cache had been removed by something, despite the "cannot remove" above, and re-created as a directory. In it was just the fontconfig directory, containing a few files that were already in the "real" version sitting in /home. This meant /var/cache/pacman "disappeared" mid upgrade as far as pacman was concerned.

/var/lib/pacman/local/mesa-12.0.1-6/ was left with no files, though `pacman -Q mesa' said it was the installed version. Attempting `pacman -Su' detected this:

:: Starting full system upgrade...
resolving dependencies...
error: could not open file /var/lib/pacman/local/mesa-12.0.1-6/desc: No such file or directory
looking for conflicting packages...

Packages (1) python-setuptools-1:25.1.6-1

That only one package is left suggests it thinks the other three installed OK. `pacman -Qk' only complained about mesa.

At this point, trying to do things to investigate was hampered, e.g.

error while loading shared libraries: libEGL.so.1: cannot open shared object file:
No such file or directory

I resolved the problem by -Rdd mesa, followed by -S mesa. -Qk then complained about problems with nvidia-304xx-libgl so that was reinstalled with -S too. The output of -qQkk is now as expected.

Several thoughts...

The symlink shouldn't be removed, especially as it wasn't /var/cache/pacman but the directory above.

The error during fakeroot's upgrade, 1/4, didn't stop /var/cache being mkdir'd and /var/cache/fontconfig, 2/4, from being created.

The disappearance of /var/cache/pacman during the upgrade caused mesa, 3/4, to have no files in /var/lib/pacman/local/mesa-12.0.1-6.

The Linux FHS says "The application must always be able to recover from manual deletion of these files (generally because of a disk space shortage)." (http://www.pathname.com/fhs/pub/fhs-2.3.html#VARCACHEAPPLICATIONCACHEDATA) To me, that includes during an upgrade! :-) If it needs them to persist then perhaps they should be sat elsewhere.

Closed by Allan McRae (Allan)
Friday, 23 December 2022, 13:32 GMT
Reason for closing: Deferred
Additional comments about closing: Bug transferred to gitlab:
https://gitlab.archlinux.org/pacman/pacman/-/issues/5
Comment by Ralph Corderoy (RalphCorderoy) - Sunday, 07 August 2016, 12:23 GMT
The CacheDir pacman.conf option only applies to pacman, not /var/cache as a whole. I agree a bind mount should work, and I'll try it, but two problems still remain. First, if someone else attempts to use a symlink for /var/cache then pacman should bail out if it can't handle that rather than stumble on, leaving its data in a broken state. Second, the FHS makes clear /var/cache can disappear during program execution.
Comment by Ralph Corderoy (RalphCorderoy) - Friday, 12 August 2016, 12:00 GMT
As per last comment. (Sorry, new to Arch Linux and this bug system.)
Comment by Eli Schwartz (eschwartz) - Wednesday, 29 August 2018, 17:47 GMT
There are two issues here. First, as I suggested in FS#58804, pacman should be taught to abort with a file conflict error when it detects that a package upgrade will overwrite an on-disk symlink with a directory.
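
To illustrate the first point, here is a rough sketch of what such a pre-transaction check could look like. It is written in Python for brevity and is not pacman's code; the function name symlink_dir_conflicts and the package_dirs list are hypothetical, and the paths are recreated in a scratch directory rather than on a real system.

```python
import os
import tempfile

def symlink_dir_conflicts(root, package_dirs):
    """Return package directory entries that exist on disk as symlinks.

    `package_dirs` is a hypothetical list of directory paths, relative to
    `root`, that a package archive would create (e.g. "var/cache/").
    """
    conflicts = []
    for entry in package_dirs:
        on_disk = os.path.join(root, entry.rstrip("/"))
        # islink() does not follow the link, so a symlink that points at a
        # directory is still reported as a symlink here.
        if os.path.islink(on_disk):
            conflicts.append(entry)
    return conflicts

# Recreate the layout from the report inside a scratch directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "home", "var-cache"))
os.makedirs(os.path.join(root, "var", "lib"))
os.symlink(os.path.join(root, "home", "var-cache"),
           os.path.join(root, "var", "cache"))

conflicts = symlink_dir_conflicts(root, ["var/", "var/cache/", "var/lib/"])
print(conflicts)
```

A non-empty result would be the point at which the transaction aborts with a file conflict, before any package has been touched.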

Second, Ralph has a good point IMO. pacman could gain from holding open an fd for each package archive it is preparing to upgrade, thus ensuring that a cached file which disappears from disk mid-transaction does not result in aborting the transaction halfway through. This is actually independent of the cachedir, because it applies to -U operations as well, for example.

I think this latter case is something we should put on the TODO list to try to fix.
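
A small demonstration of why holding the fd open would help, assuming POSIX semantics (an unlinked file stays readable through descriptors that were opened beforehand). This is a Python sketch rather than anything from pacman, and the archive name and contents are made up for the example.

```python
import os
import tempfile

# Stand-in for a cached package archive; name and payload are invented.
cache_dir = tempfile.mkdtemp()
pkg_path = os.path.join(cache_dir, "example.pkg.tar.xz")
with open(pkg_path, "wb") as f:
    f.write(b"package payload")

# Open the archive before the transaction starts and keep the descriptor.
fd = os.open(pkg_path, os.O_RDONLY)

# Simulate the cache vanishing mid-transaction.
os.unlink(pkg_path)
os.rmdir(cache_dir)
path_gone = not os.path.exists(pkg_path)

# On POSIX systems the data is still readable through the open descriptor.
data = os.read(fd, 1024)
os.close(fd)
print(path_gone, data)
```

The same holds when the path stops resolving because a parent directory was replaced, as happened to /var/cache in this report, since the descriptor refers to the file itself and not to its path.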
