FS#24537 - [abs] Repository Update Policy

Attached to Project: Arch Linux
Opened by Andrej Podzimek (andrej) - Wednesday, 01 June 2011, 21:28 GMT
Last edited by Eric Belanger (Snowman) - Monday, 09 September 2013, 01:37 GMT
Task Type Feature Request
Category Arch Projects
Status Closed
Assigned To matt mooney (mfm)
Architecture All
Severity Very Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 9
Private No

Details

When updating my system with srcpac, usually more than 10% of packages do not get updated, despite the fact that they compile cleanly. This is due to stale records in the ABS tree with PKGBUILDs that just compile the old version.

Would it be possible to introduce a policy forcing the ABS tree entries to be updated *before* the binary packages? Obviously, these two updates often occur in the incorrect order, causing srcpac to malfunction. I really like using ArchLinux in a slightly Gentoo-like manner, with my options in makepkg.conf. But problems of this kind are sometimes annoying.
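To make the failure mode concrete, here is a minimal sketch of how stale ABS entries could be detected. It assumes you have already collected two name-to-version mappings, one from the PKGBUILDs in the ABS tree and one from the binary repository database; how those are gathered is not shown. The function name `find_stale` and the sample package names and versions are made up for illustration.

```python
def find_stale(abs_versions, repo_versions):
    """Return packages whose ABS version differs from the binary repo version.

    Both arguments map package name -> "pkgver-pkgrel" string.
    A package missing from the ABS tree is reported with None as its ABS version.
    """
    stale = {}
    for name, repo_ver in repo_versions.items():
        abs_ver = abs_versions.get(name)
        if abs_ver != repo_ver:
            # Building from this ABS entry would not produce the repo version.
            stale[name] = (abs_ver, repo_ver)
    return stale


if __name__ == "__main__":
    # Hypothetical sample data, not real package versions.
    abs_tree = {"foo": "1.0-1", "bar": "2.3-2"}
    binary_repo = {"foo": "1.1-1", "bar": "2.3-2"}
    for name, (abs_ver, repo_ver) in find_stale(abs_tree, binary_repo).items():
        print(f"{name}: ABS has {abs_ver}, repo has {repo_ver}")
```

This only checks for a mismatch; deciding which side is newer would need a proper version comparison, which is left out here.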

Closed by  Eric Belanger (Snowman)
Monday, 09 September 2013, 01:37 GMT
Reason for closing:  Won't implement
Additional comments about closing:  for reason see https://bugs.archlinux.org/task/24537#comment81649
Comment by Jan de Groot (JGC) - Thursday, 02 June 2011, 09:43 GMT
This can never be in sync, as ABS is pulled from the archlinux development server and packages come from a mirror. Sometimes the mirror is ahead, sometimes ABS is ahead. Besides that, regenerating the ABS repository on the development server causes considerable I/O load, which is not something we want to incur every time we update a package in the repositories.

As for updating your system with srcpac: you can end up with a completely broken system whenever we do a mass rebuild with soname bumps.
Comment by Andrej Podzimek (andrej) - Thursday, 02 June 2011, 10:47 GMT
AFAIK, most distros support building packages from source just fine. I do not see any reason why this approach should be discouraged in Arch Linux, famous for its "as you like it" approach.

Although ABS and the repositories cannot be synchronized atomically, it is certainly possible to guarantee that ABS is always ahead of the mirrors. This simple guarantee would make srcpac much more useful than it is now.

I/O load is not an argument. If you need a new server, then you can always ask the community to contribute. If the I/O load is not acceptable for some other reasons (unrelated to hardware or connectivity), it might mean that ABS is not designed correctly. Perhaps the ABS tree is not the right data source for srcpac. Introducing something equivalent of "source packages" would be nice. These archives would not contain the whole sources, but just (what is now called) the ABS entries. And they could be disseminated using the standard repositories and mirrors, without a central out-of-date server.
Comment by Karol Błażewicz (karol) - Thursday, 02 June 2011, 10:50 GMT
Comment by Ionut Biru (wonder) - Thursday, 02 June 2011, 10:53 GMT
can't srcpac use our svn? you can't get more instant than that :D
Comment by Andrej Podzimek (andrej) - Thursday, 02 June 2011, 21:15 GMT
> Arch isn't accepting donations atm: http://mailman.archlinux.org/pipermail/arch-general/2011-April/019799.html

I see... But this means that Arch has a critical problem that should be dealt with ASAP. When hardware limitations stop volunteers from doing useful work, it might do a lot of harm to the fragile ecosystem of a community distribution.

Perhaps Arch might also consider using some of the community fundraising facilities, such as Kickstarter (http://www.kickstarter.com/), to circumvent the problem. Of course, this cannot be a project of the form „we will just maintain something“. But I am pretty sure some clearly defined projects can be easily identified, e.g.:

1) Adding full IPv6 support to netcfg. And a working NetworkManager + KDE integration would be fine as well, when it comes to network.
2) It might be a good idea to merge pacman and srcpac and redesign the ABS mechanism.
3) Do we want one of those fancy parallel and dependency-aware startup mechanisms? Well, I bet we do, sooner or later...

I don't have time to work on any of these (shot-in-the-dark) suggestions, but perhaps someone could take the challenge. I would not hesitate to contribute, say, $100 to a project of this kind. And with some 50 more contributors, at least the most critical hardware requirements could be dealt with.

> can't srcpac use our svn? you can't get more instant than that :D

In that case it would have to be careful about the revisions it chooses... (Or at least let the user make the choice between release, testing, unstable, "HEAD" and the like.) I would expect 'pacman -S' and 'srcpac -Sb' to have exactly the same effect by default (with an unmodified makepkg.conf). Ideally, it would be nice to have the -b option integrated directly into pacman.
Comment by Karol Błażewicz (karol) - Thursday, 02 June 2011, 21:17 GMT
Comment by Ionut Biru (wonder) - Thursday, 02 June 2011, 21:43 GMT
i don't see how full ipv6 support in netcfg has something to do with this report and also with the donations.

@Andrej for 3) check out the latest sodeps thread from pacman-dev. also you don't have to be careful about the revisions since abs is getting the builds from package/repos/reponame-$arch.

to be more on-topic, instead of recreating the whole abs tree every time (once a day), we should add an svn hook to copy the newly touched files into the abs tree
Comment by Allan McRae (Allan) - Friday, 03 June 2011, 03:58 GMT
The ABS update script currently is absolutely horrible and in my opinion should only ever be used to regenerate the entire ABS tree (which is what it does... not an incremental update).

What we could do is have the repo db-scripts update the tree as packages are pushed to the repos. They already check out the relevant SVN files for all their checks anyway, so it should be simple to add. The full tree could still be regenerated daily if that was really needed.
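A rough sketch of what such an incremental update might look like, assuming the push script already has a directory containing the checked-out files for the package and that the ABS tree is laid out as `<abs_root>/<repo>/<pkgname>/`. The function name `update_abs_entry` and its parameters are hypothetical and not part of the actual db-scripts.

```python
import shutil
from pathlib import Path


def update_abs_entry(checkout_dir, abs_root, repo, pkgname):
    """Replace a single ABS entry with freshly checked-out files.

    Only <abs_root>/<repo>/<pkgname> is touched; the rest of the tree
    is left alone, so no full regeneration is needed.
    """
    src = Path(checkout_dir)
    dest = Path(abs_root) / repo / pkgname
    if dest.exists():
        # Drop the old entry first so files removed upstream do not linger.
        shutil.rmtree(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, dest)
    return dest
```

Removing the old directory before copying is a deliberate choice here: it keeps patches that were deleted upstream from surviving in the ABS entry. A daily full regeneration could still run as a safety net.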
Comment by Andrej Podzimek (andrej) - Saturday, 09 July 2011, 23:22 GMT
I came across an outdated kernel in ABS today... Now this is *very* *annoying*, to say the least. Is it really that hard to keep at least *core* in sync?

Not only does it make ArchLinux totally useless in some specific (benchmarking and research) scenarios where it could shine otherwise, but it also causes some of the machines I maintain to spend lots of time compiling outdated packages for no purpose.

Some people told me „man, if you want Gentoo, move to Gentoo“ and I'm really thinking about such a migration. There are, however, lots of things I really like about ArchLinux, which prevented me from taking this step so far. But as a matter of fact, the whole ecosystem around ABS stinks and it is just painful to use. When someone mentioned Portage on ArchLinux years ago (https://bbs.archlinux.org/viewtopic.php?id=12403), there were quite many angry (over)reactions. I have to say I don't understand why.

Let's face it: ArchLinux has no working equivalent of

* the "source RPM" packages (and their counterparts in Debian-based distros)
* the "ports" system from BSDs based on Makefiles
* the "portage" build system known from Gentoo

For instance, if you compare the number of build steps available in the "ports" or "portage" packaging systems, you will quickly see how immature ABS actually is.

I suggest that this task be marked "Critical". Sanity of the packaging system is closely related to the sanity of the whole distribution. Binary packages emerging "out of nowhere", without the corresponding PKGBUILDs readily available, should *not* be a common practice.
Comment by Sverd Johnsen (sjohnsen) - Sunday, 10 July 2011, 10:22 GMT
Arch is not a source distribution. If you want to rebuild official packages regularly (for whatever reason), you can do that quite reliably with a few little helper scripts of your own. Use the git repo (syncs every 2h from SVN) instead of the ABS tree. In case you want to rebuild something that didn't get synced to Git, grab the individual packages from SVN. It's not rocket science, get to work.

% expac %p | grep -ivc "Hardened"
52
Comment by Andrej Podzimek (andrej) - Tuesday, 16 August 2011, 12:18 GMT
1) If I wanted to hack scripts all the time, I would use Linux from scratch or the like. Providing a standard and *working* package management system is the main reason why distributions exist. I used to have my own update scripts for ArchLinux years ago, when srcpac was so buggy that one could not even build a single package. But I still hope those times are gone and srcpac will be fully usable one day. Homebrew update scripts are only a temporary workaround, not a reasonable solution to the problem.

2) I don't know what a "source distribution" is. This collocation is a piece of nonsense and only leads to confusion. All other major distributions have a working build system. You can download and build source RPMs in RPM-based distros. You can build packages from source in Debian and Ubuntu if you want. Not to mention Gentoo, right? That's why I don't really understand *why* ArchLinux users should be limited to binary packages compiled for ten-year-old machines for the sake of compatibility.

3) You say "It's not rocket science"... Have a look at the source code of Portage tools. Have a look at the current source code of srcpac. Do you really think that handling version upgrades and various temporary build-time dependencies correctly is something simple, something easy to do in one's free time? Well, I don't think so. In my opinion, these utilities should be provided by the distribution and maintained by people who *understand* the details. Users mostly have other work to do; hacking the distribution's tools is not their primary job.
Comment by Allan McRae (Allan) - Tuesday, 16 August 2011, 17:51 GMT
> Users mostly have other work to do; hacking the distribution's tools is not their primary job.

Developers of a "binary" distro have other things to do as well, primarily providing binary packages... If users want a source based solution, they need to step up and provide the tools.
Comment by Thomas Bächler (brain0) - Tuesday, 16 August 2011, 20:39 GMT
1) Always the latest PKGBUILDs (no delay): https://www.archlinux.org/svn/ - check out only the packages you need and it'll work fine.
2) Git clone, updated every two hours: https://projects.archlinux.org/svntogit/
3) ABS is updated once a day.

You want us to change our workflow because that is not enough for you. You seem to DEMAND we do it the way you want.

> Well, I don't think so. In my opinion, these utilities should be provided by the distribution and maintained by people
> who *understand* the details. Users mostly have other work to do; hacking the distribution's tools is not their primary job.

That is where you are wrong. We have no obligation to do what you want. It is our distribution, we maintain it for ourselves, we provide what we want to provide, and if you don't like what we give you, you can either deal with it or do it yourself and contribute.

We provide binary packages, we provide what I listed under points 1-3 above. If the 24 hour delay in ABS is a problem for you, then don't use it. That is all.
Comment by Andrej Podzimek (andrej) - Wednesday, 17 August 2011, 14:33 GMT
Let me cite something from the SVN page, just for fun: "DO NOT CHECK OUT THE ENTIRE SVN REPO Your address may be blocked." I have never heard about a repository with such limitations before.

On a more technical note, checking out packages one-by-one has obvious downsides. A trivial race condition is easy to imagine:
1) srcpac checks out xulrunner.
2) Both firefox and xulrunner are updated and committed by the maintainer(s).
3) srcpac checks out firefox.
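The three steps above can be reproduced with a toy model. This is only an in-memory illustration, not a real SVN client: `ToyRepo` and the version labels "1" and "2" are placeholders invented for the example.

```python
class ToyRepo:
    """In-memory stand-in for a PKGBUILD repository (not a real SVN client)."""

    def __init__(self, versions):
        self.versions = dict(versions)

    def commit(self, updates):
        # A maintainer commit that touches several packages at once.
        self.versions.update(updates)

    def checkout(self, name):
        # Per-package checkout: sees whatever is current at call time.
        return self.versions[name]

    def snapshot(self):
        # Whole-tree checkout taken at a single point in time.
        return dict(self.versions)


def racy_checkout():
    repo = ToyRepo({"xulrunner": "1", "firefox": "1"})
    got = {"xulrunner": repo.checkout("xulrunner")}   # step 1
    repo.commit({"xulrunner": "2", "firefox": "2"})   # step 2
    got["firefox"] = repo.checkout("firefox")         # step 3
    return got


def consistent_checkout():
    repo = ToyRepo({"xulrunner": "1", "firefox": "1"})
    got = repo.snapshot()                             # both packages at once
    repo.commit({"xulrunner": "2", "firefox": "2"})   # later commit is not seen
    return got


if __name__ == "__main__":
    print("one-by-one:", racy_checkout())
    print("snapshot:  ", consistent_checkout())
```

The one-by-one variant ends up with the old xulrunner next to the new firefox, while the snapshot variant gets a matching pair, even if it is the older one.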

Perhaps the Git repository helps a little bit, provided that it can be fully cloned (without blocking my address...) and subsequent pulls always lead to a consistent state with no version clashes... But how do I make sure I am checking out PKGBUILDs of *released* packages, rather than possibly unstable intermediate revisions? Furthermore, compiled updates often lead to corner cases (such as temporary build-time dependencies that need not or cannot be compiled) that could be theoretically handled by pacman gracefully. But the inconsistency between the ABS-like repositories and pacman's view of the binary repositories makes this virtually impossible due to frequent version conflicts.

That's why I am convinced that a combination of pacman and the repositories does not have the potential to solve the problems mentioned above, unless there is at least some very basic support from the distribution's maintainers.

As far as "providing binary packages" is concerned, it would be fine to have greater choice. When it comes to benchmarking, for example, code compiled to be compatible with a 10-year-old AMD Athlon might not be the most suitable choice for a Sandy Bridge class Intel processor. My final goal was to create multiple binary ArchLinux repositories optimized for various processors (at least for the latest Intel and latest AMD processors). I have access to both the computing power and the bandwidth to make this work. But unfortunately, my attempts to compile all (or at least most of) the packages cleanly and keep them up-to-date (in a mostly unattended manner) just failed. (The reasons have been explained...)

The statement that you don't have any obligation to do what I want is absolutely correct. Similarly, I did not have any obligation to support ArchLinux (which could be done easily through PayPal once upon a time), to report bugs, to create AUR packages etc. The reasons why people do these things are not a matter of obligation. People just do what they consider important or potentially beneficial to others. And believe it or not, some of them might complain when they feel something is going wrong.

I do not "demand" anything. I am actually just (more or less) explaining why I started planning to move away from ArchLinux after years of using it on numerous machines. Your "we have no obligation" statement made my decision-making much easier.
Comment by Karol Błażewicz (karol) - Wednesday, 17 August 2011, 15:16 GMT
Not sure if it helps, but have you tried vABS https://bbs.archlinux.org/viewtopic.php?id=123331 ? They have git access and - as far as I know - you can check out the whole repo.

If I understand it correctly, the problem is how to get a consistent state, so the packages build unattended, right?
Comment by Andrej Podzimek (andrej) - Wednesday, 17 August 2011, 15:37 GMT
Exactly. The key problem is actually finding something roughly equivalent to 'emerge --sync; emerge --update world' on ArchLinux. But there is no need for a direct counterpart of these commands. It would suffice to build a repository of packages and update that repository in an unattended manner (with a cron job rebuilding updated packages every hour or so). Then all the problems would be reduced to simply using pacman and choosing a non-default software repository.

Doing this on a machine that has all available packages installed might be quite straightforward. But unfortunately, some packages may have build dependencies conflicting with other packages, which would make the whole thing slightly more complicated. But I can imagine chroot environments inside Btrfs snapshots. Such COW snapshots could be used to build those conflicting packages in a (hopefully) automated manner (create snapshot - chroot into the snapshot - remove conflicting packages - install dependencies (from the local repo, of course) - build package - drop snapshot).
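The snapshot workflow in parentheses could be sketched roughly as follows. This only assembles the command lines and does not run anything. The paths, the package lists and the unprivileged `builder` user are assumptions supplied by the caller, and the sketch assumes that pacman inside the snapshot is already configured to use the local repository. The command sequence itself is untested.

```python
import shlex


def snapshot_build_plan(base, snap, pkgdir, conflicts, deps, user="builder"):
    """Return the commands for one throwaway-snapshot build, in order.

    Nothing is executed; the caller decides whether to run the plan.
    base/snap are Btrfs subvolume paths, pkgdir is the PKGBUILD directory
    as seen from inside the chroot. All of them are hypothetical inputs.
    """
    in_chroot = ["chroot", snap]
    plan = [("create snapshot",
             ["btrfs", "subvolume", "snapshot", base, snap])]
    if conflicts:
        plan.append(("remove conflicts",
                     in_chroot + ["pacman", "-R", "--noconfirm"] + list(conflicts)))
    if deps:
        plan.append(("install dependencies",
                     in_chroot + ["pacman", "-S", "--noconfirm", "--needed"] + list(deps)))
    # Build as an unprivileged user inside the chroot (assumed to exist there).
    build = "cd {} && makepkg".format(shlex.quote(pkgdir))
    plan.append(("build package",
                 ["chroot", "--userspec={0}:{0}".format(user), snap, "sh", "-c", build]))
    plan.append(("drop snapshot",
                 ["btrfs", "subvolume", "delete", snap]))
    return plan


if __name__ == "__main__":
    # Hypothetical paths and package names, for illustration only.
    for label, argv in snapshot_build_plan(
            "/mnt/base", "/mnt/snap", "/build/foo", ["bar"], ["baz"]):
        print(f"{label}: {' '.join(argv)}")
```

Returning a plan rather than running the commands keeps the sketch safe to try out. A real build machine would also need to copy the resulting package out of the snapshot before deleting it, which is not shown.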

Many thanks for the link, I'll definitely have a look at vABS.

As far as SVN and updates are concerned, this page https://wiki.archlinux.org/index.php?title=Getting_PKGBUILDS_From_SVN says it loud and clear: "Never use the public SVN for any sort of scripting." Yet another reason why the SVN repo is not an option for me.
Comment by Andrej Podzimek (andrej) - Wednesday, 17 August 2011, 20:34 GMT
OK, let's see what can be done. I created this new task: https://bugs.archlinux.org/task/25628

However, this is still too *far* from the ideal solution — the -b option should be integrated into pacman, which could then work with one single consistent view of the repositories and handle all the corner cases (especially temporary build dependencies and package replacements) gracefully.
Comment by Leonid Isaev (lisaev) - Monday, 29 August 2011, 15:12 GMT
The biggest problem of Gentoo is building everything as root, which is a security concern, since emerge not only calls make et al, but also executes scripts... Also, last time I checked, Fedora needed a lot of manual labor to build a .srpm. Please remember that ABS is not supposed to be used for system-wide rebuilds, but only for individual packages.

Perhaps what you really meant was opensuse build system: https://bbs.archlinux.org/viewtopic.php?pid=926560
Comment by Andrej Podzimek (andrej) - Saturday, 08 October 2011, 13:46 GMT
When planning to deploy a machine fully devoted to building packages, it is quite easy to imagine setting up chrooted build environments (or even one chroot for each supported (sub)architecture), in which case building as root wouldn't hurt that much.
Thank you for the link to the OpenSUSE build system. I'll have a look at it. (I did not know about it before.)
Comment by max compress (maxcompress) - Tuesday, 07 February 2012, 08:56 GMT
Is synchronizing ABS PKGBUILDs and repo packages really so hard?
A 100 MB package arrives before its 2 KB build script. Funny :)
If you can push a new package, you can push the tiny PKGBUILD at the same time. I think this is very easy.
Please, somebody fix this problem. I don't want to wait all day for the new PKGBUILD.
