FS#61497 - Add a timestamp file to repos

Attached to Project: Pacman
Opened by Allan McRae (Allan) - Tuesday, 22 January 2019, 02:19 GMT
Last edited by Allan McRae (Allan) - Thursday, 20 May 2021, 12:45 GMT
Task Type Feature Request
Category General
Status Assigned
Assigned To Allan McRae (Allan)
Architecture All
Severity High
Priority Normal
Reported Version 5.1.2
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 2
Private No


Even with signed repos (come on Arch!), people could delay updates to keep vulnerabilities from being fixed.

Our repos should contain a .TIMESTAMP file that pacman reads, along with a config option that sets the maximum amount of time a repo is considered valid for.
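
As a rough illustration of the check being proposed, here is a minimal Python sketch. It is not pacman code. The thread does not define what a .TIMESTAMP file contains, so the sketch assumes Unix epoch seconds as decimal text, and `max_age_days` is a hypothetical stand-in for the proposed config option.

```python
import time


def repo_is_fresh(timestamp_text, max_age_days, now=None):
    """Return True if the repo timestamp is within the allowed age.

    timestamp_text: contents of the proposed .TIMESTAMP file, ASSUMED here
                    to be Unix epoch seconds written as decimal text.
    max_age_days:   hypothetical config value for the maximum repo age.
    now:            current time in epoch seconds (defaults to the clock).
    """
    created = int(timestamp_text.strip())
    if now is None:
        now = time.time()
    age_seconds = now - created
    return age_seconds <= max_age_days * 86400


# A database stamped 3 days ago passes a 10-day limit; one stamped 11 days ago fails.
now = 1_700_000_000
print(repo_is_fresh(str(now - 3 * 86400), 10, now=now))
print(repo_is_fresh(str(now - 11 * 86400), 10, now=now))
```

A repo that fails this check would be treated like one with a bad signature, as discussed in the comments below.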

Comment by Andrew Gregory (andrewgregory) - Tuesday, 22 January 2019, 02:35 GMT
What happens when the time has elapsed?
Comment by Allan McRae (Allan) - Tuesday, 22 January 2019, 02:44 GMT
Pacman can no longer use the repo. It is equivalent to a bad signature.
Comment by Andrew Gregory (andrewgregory) - Tuesday, 22 January 2019, 02:57 GMT
Are you expecting it to try other servers first, so the user can get the updated database? Just refusing to use the db still leaves them with outdated software.
Comment by Allan McRae (Allan) - Tuesday, 22 January 2019, 03:24 GMT
I'm assuming Arch would set the default timeout value at something like 10 days. A mirror that old can be dropped.

I have not considered whether pacman should automatically go to the next mirror, or whether the user should manually handle the removal of that mirror from the mirrorlist. Also up for question is whether an -Ss or -Si operation should succeed on an old database, with only installation or upgrades restricted.
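
Neither option is decided in this thread. Purely to illustrate the "automatically go to the next mirror" variant, here is a hedged Python sketch. The function name, the `get_timestamp` callback, and the example URLs are all hypothetical and are not part of pacman or libalpm.

```python
def pick_fresh_mirror(mirrors, get_timestamp, max_age_seconds, now):
    """Return the first mirror whose database timestamp is recent enough.

    mirrors:       mirror URLs in mirrorlist order.
    get_timestamp: hypothetical callback returning the database timestamp
                   (epoch seconds) for a mirror, or None if unavailable.
    Returns None when every mirror is stale or unreachable.
    """
    for url in mirrors:
        stamp = get_timestamp(url)
        if stamp is None:
            continue  # unreachable mirror, or no timestamp published
        if now - stamp <= max_age_seconds:
            return url
    return None


NOW = 1_700_000_000
DAY = 86400
# Made-up timestamps for made-up mirrors; "down.example" has no entry at all.
known = {
    "http://stale.example/": NOW - 12 * DAY,
    "https://fresh.example/": NOW - 2 * 3600,
}
mirrorlist = ["http://stale.example/", "https://down.example/", "https://fresh.example/"]
chosen = pick_fresh_mirror(mirrorlist, known.get, 10 * DAY, NOW)
print(chosen)
```

The other variant, where the user removes the stale mirror by hand, would amount to stopping with an error at the first stale entry instead of continuing the loop.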
Comment by Allan McRae (Allan) - Tuesday, 22 January 2019, 03:27 GMT
Comment by Andrew Gregory (andrewgregory) - Tuesday, 22 January 2019, 03:44 GMT
Hmmm, they don't specify the behavior when the time is expired.

My initial thought was that this should really only affect -Sy because until the user has actually tried to update the db there's no reason to suspect foul play. But, the config option would essentially mean that the user has requested pacman not use a database older than X, so it should probably apply to all operations. Applying that same logic to -F operations would probably also prevent confusion from people using old .files databases and not realizing that they're outdated.
Comment by Luca Bertozzi (ekardnam) - Thursday, 04 April 2019, 08:42 GMT
This requires a modification to both repo-add and pacman (or maybe libalpm; I'm not sure, I still don't know much about the codebase), right?
About the expiry, I would make it the default for any pacman operation and maybe provide a flag to ignore timestamps?
Comment by Luca Bertozzi (ekardnam) - Thursday, 04 April 2019, 12:34 GMT
And wouldn't a person be able to change the .TIMESTAMP without updating the packages? This would be an even more severe issue, as the user would trust the source.
Comment by Allan McRae (Allan) - Thursday, 04 April 2019, 12:44 GMT
You would not be able to change the .TIMESTAMP file if the repo was signed. We already have the ability to sign repos, so this avoids the holding back of old repositories.
Comment by Luca Bertozzi (ekardnam) - Thursday, 04 April 2019, 12:47 GMT
So .TIMESTAMP is valid only on signed repos?

Can I contact you via email to get a better idea of this? I will post a shorter log here to share with everyone.
Comment by Eli Schwartz (eschwartz) - Thursday, 04 April 2019, 18:28 GMT
It's not invalid on unsigned repos, but it is less useful.

A timestamp would still prevent issues related to people using broken mirrors without realizing it, and as Andrew said, it would help avoid the use of mismatched .files databases. It just would not enforce a strong security policy, because the security policy needs to come from signing the database file.
Comment by Luca Bertozzi (ekardnam) - Saturday, 06 April 2019, 07:23 GMT
right gotcha, thanks Eli;)
Comment by Ruben Kelevra (RubenKelevra) - Wednesday, 10 March 2021, 09:34 GMT
I mean, we don't need a new file. The mirrors already contain a lastsync and lastupdate file.

We basically just need to define where to find those files, like "they are assumed by default to be in the folder where $repo can be found, and otherwise need to be defined by a custom URL".

I feel like there should be tighter thresholds, like 1h for lastsync and 1d for lastupdate, at least by default. Waiting 10 days for a security update is an issue.

Comment by Allan McRae (Allan) - Wednesday, 10 March 2021, 09:48 GMT
Arch mirrors have those files. Pacman does not care what Arch does and needs a more general solution.
Comment by Eli Schwartz (eschwartz) - Wednesday, 10 March 2021, 11:37 GMT
The lastsync and lastupdate files are useful for rsync scripts to know when it's time to rsync using minimal bandwidth. They exist completely outside of the entire repository structure -- one timestamp covers ~12 repos.

They're useless for pacman, since pacman doesn't want to know when the last time "any repo got modified on the server", pacman wants to know on a per-db level and that information is already present in the last-modified date... but we need the information to be available in a format capable of being cryptographically signed, and which cannot get *accidentally* broken.
Comment by Ruben Kelevra (RubenKelevra) - Wednesday, 10 March 2021, 19:53 GMT
My point was more: It's working fine for arch, why not use it for pacman?

But true, a signed file would be much more reliable.

But the file doesn't really say anything about the rest of the repo: say only that one file got updated, but the database and the packages didn't.

I'm not sure where the blocker for database signing is, but it would probably be the most reliable way to implement a timestamp:
Add the database hash and the timestamp to a file and sign this one off.

Comment by Eli Schwartz (eschwartz) - Wednesday, 10 March 2021, 20:18 GMT
"it's working fine for arch".

Our use of django for the website is also working fine for arch, but we're not going to replace pacman with a django app because the archlinux.org website has nothing to do with pacman.

Likewise, we are not going to use the lastupdate file with pacman... Because it has nothing to do with pacman.

Even though the lastupdate file is a technology that works fine for arch, in the domain context of an ftp server distributed load balancing script running on a 60-second interval.

This has nothing to do with whether it's signed, but the lack of signing is the cherry on top of this "no".

Given we are inventing a new file, I cannot think of a single reason to be attached to the idea of "add the db hash and timestamp to another file and sign that", when we can reduce complexity by adding the timestamp only, stick the file in the db, and reuse existing infrastructure for signing the db.
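
To make the "stick the file in the db" idea concrete, here is a minimal Python sketch. It relies on the fact that a pacman sync database is a compressed tar archive. The .TIMESTAMP name comes from this task, but its content format (epoch seconds as text), the helper names, and the sample `desc` entry are assumptions for illustration only. This is not how repo-add is implemented.

```python
import io
import tarfile

TIMESTAMP_NAME = ".TIMESTAMP"  # name from this task; the content format is ASSUMED


def build_db(entries, stamp):
    """Build a gzip-compressed tar archive holding entries plus a .TIMESTAMP member."""
    files = dict(entries)
    files[TIMESTAMP_NAME] = f"{stamp}\n".encode()
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_timestamp(db_bytes):
    """Return the embedded timestamp as an int, or None if the db has none."""
    with tarfile.open(fileobj=io.BytesIO(db_bytes), mode="r:*") as tar:
        try:
            member = tar.getmember(TIMESTAMP_NAME)
        except KeyError:
            return None
        return int(tar.extractfile(member).read().decode().strip())


# Illustrative single-package database; the desc content is a made-up sample.
db = build_db({"foo-1.0-1/desc": b"%NAME%\nfoo\n"}, 1_700_000_000)
print(read_timestamp(db))
```

Because the timestamp lives inside the archive, a detached signature over the database file covers it as well, so no second signed file is needed.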
Comment by Eli Schwartz (eschwartz) - Wednesday, 10 March 2021, 20:20 GMT
There is no blocker for database signing in pacman, all my databases are signed already.

If you want to discuss blockers for database signing in archlinux, the topic is best discussed in:
- the tracker for the archlinux infrastructure
- the arch-general mailing list
Comment by neowutran (neowutran) - Friday, 25 June 2021, 11:37 GMT
I had an issue related to this and was about to open a new bug ticket.

I was building an unofficial archlinux template for QubesOS but the process aborted due to 404 errors when downloading some packages.
The mirror's database was not up to date:

$: wget https://mirrors.evowise.com/archlinux/extra/os/x86_64/extra.db
gives me an up-to-date database file
$: wget http://mirrors.evowise.com/archlinux/extra/os/x86_64/extra.db
gives me an outdated database file (multiple days old)

In this case, the "http" mirror is the first mirror in the default mirrorlist.

Problematic pacman behavior:
When trying to update (pacman -Syu), pacman will download the latest database (example: extra.db).
To do that, it will download it from the first server in the mirror list.

Some thoughts:
- HTTP mirrors should be forbidden. A signature is not enough for this issue. A malicious actor on the network could replace the DB file with an outdated DB file.
- How the database is selected among the mirrors could be changed. Instead of downloading it from 1 mirror, I guess pacman could download it from multiple mirrors, then choose the most likely correct database among them, for example by selecting the most common database file. Assuming there are more than 2 servers available in the mirrorlist and only 1 mirror is malicious, that should solve the issue. If there is only 1 mirror, nothing can be done. If equally many mirrors provide each candidate file, pacman could select by looking at some future internal/signed timestamp.

Enforcing DB file signature and adding a signed timestamp (inside or outside the DB file) would be nice, but I do not think it is enough for all cases.
If we imagine the case of a critical firefox vulnerability that needs to be fixed asap, a malicious mirror (and anyone on the network, for an "http" mirror) could delay the update by the X days before pacman flags the mirror as outdated.
So I think trying to download the database from multiple mirrors can be interesting.
Comment by Allan McRae (Allan) - Friday, 25 June 2021, 12:01 GMT
There are package updates all the time in Arch, so it becomes quite unlikely that any two mirrors ever have the same database.
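
A small Python sketch of why "pick the most common database" tends to fall apart under that condition. The `majority_db` helper is hypothetical, and the byte strings stand in for database files downloaded from different mirrors.

```python
from collections import Counter
import hashlib


def majority_db(dbs_by_mirror):
    """Return the SHA-256 digest served by a strict majority of mirrors, or None."""
    if not dbs_by_mirror:
        return None
    digests = Counter(hashlib.sha256(db).hexdigest() for db in dbs_by_mirror.values())
    digest, count = digests.most_common(1)[0]
    return digest if count * 2 > len(dbs_by_mirror) else None


# Three honest mirrors that synced at slightly different times serve three
# different databases, so no digest reaches a majority.
mirrors = {
    "mirror-a": b"db synced at 10:00",
    "mirror-b": b"db synced at 10:05",
    "mirror-c": b"db synced at 10:20",
}
print(majority_db(mirrors))
```

With frequent repository updates, the vote rarely has anything to agree on even when every mirror is behaving correctly.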
Comment by Eli Schwartz (eschwartz) - Friday, 25 June 2021, 13:34 GMT
We're not going to download anything multiple times for "consensus".

Anyway your suggestion would result in consensus to use outdated packages when the consensus is to be outdated, and completely fails to take into account people with only one mirror, and in any and all cases where your suggestion's codepaths *actually* run at all, it results in wasted bandwidth that hits certain people *very* hard.

The pacman developers are already in agreement; we have decided that your exact concern WILL be considered as solved by requiring distributors of pacman repos to choose a reasonable expiration policy, and to rebuild the databases to embed a new timestamp at least that often. Arch does so, up to hundreds of times a day, already.

No one cares if a malicious mirror is able to hold back updates for 6 hours. A *non* malicious mirror can hold back updates for 6 hours by only *syncing* once every few hours. If you're concerned that some malicious attacker is going to hold back updates for a few hours, then follow https://lists.archlinux.org/listinfo/arch-security for security advisories, which works even without timestamps in the files. Resolutions listed in security advisories always instruct one to run pacman -Syu "vulnerable_package>=fixed-version" so you're guaranteed to either get the fixed version or for pacman to error out if it cannot fix your system.