FS#60909 - Package mpich in the repo
Attached to Project:
Community Packages
Opened by Bruno Pagani (ArchangeGabriel) - Sunday, 25 November 2018, 16:31 GMT
Last edited by Jelle van der Waa (jelly) - Sunday, 03 September 2023, 09:57 GMT
Details
This FR is here to discuss whether and how to package mpich in our repos.
Currently, we provide OpenMPI as the only MPI implementation in the repo. Some people would prefer we package mpich instead, but the real solution would be packaging both. It’s likely that most software would depend on the actual MPI implementation used during compilation, so the easiest option would be to package both as conflicting, and then duplicate our packages as $pkgname-openmpi and $pkgname-mpich. I’m not sure whether this is desirable; other options include packaging mpich co-installable with OpenMPI and making it available only for people’s own needs, not as a dependency of any package.
Closed by Jelle van der Waa (jelly)
Sunday, 03 September 2023, 09:57 GMT
Reason for closing: Deferred
Additional comments about closing: If someone is willing to implement it go ahead, but for now I'll close it.
We have roughly 4 choices:
1. Package MPICH in a separate namespace, keep all our stack on OpenMPI. That way people can easily use MPICH for their projects, though our stuff still depends on OpenMPI. This is probably not good, because if people want to depend on e.g. an MPI version of HDF5 or NetCDF, they would have issues with two different MPI stacks in use.
2. Reverse that, and use MPICH by default (it has been considered as a better implementation for a while by a large number of people), but concerns remain.
3. Package both stacks as conflicting, provide all or most of our software in both flavours.
4. Be Debian-esque, package both (https://packages.debian.org/sid/amd64/mpich/filelist and https://packages.debian.org/sid/amd64/openmpi-bin/filelist for instance) and rename all conflicting files of all software so that the implementation can be selected at runtime (e.g. https://packages.debian.org/sid/amd64/libhdf5-openmpi-dev/filelist and https://packages.debian.org/sid/amd64/libhdf5-mpich-dev/filelist).
4. is way too much work if you ask me. 3 is already a lot, but provides choice.
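To make option 3 a bit more concrete, here is a rough Python sketch of the metadata the two flavours of a package might carry. The field names mirror PKGBUILD variables (pkgname, provides, conflicts, depends), but the helper flavoured_package and the exact values are illustrative assumptions, not an agreed scheme.

```python
# Hypothetical sketch of option 3: one package, two conflicting MPI flavours.
MPI_IMPLEMENTATIONS = ("openmpi", "mpich")


def flavoured_package(pkgname, mpi):
    """Return assumed metadata for the `mpi` flavour of `pkgname`."""
    if mpi not in MPI_IMPLEMENTATIONS:
        raise ValueError(f"unknown MPI implementation: {mpi}")
    others = [impl for impl in MPI_IMPLEMENTATIONS if impl != mpi]
    return {
        "pkgname": f"{pkgname}-{mpi}",
        # Both flavours provide the plain name, so dependents need not care.
        "provides": [pkgname],
        # The flavours ship the same files, so they cannot be co-installed.
        "conflicts": [f"{pkgname}-{impl}" for impl in others],
        "depends": [mpi],
    }


hdf5_flavours = [flavoured_package("hdf5", mpi) for mpi in MPI_IMPLEMENTATIONS]
```

Under these assumptions the user picks one flavour per package, and anything that only needs "hdf5" is satisfied by either.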
We could even merge 1+2+3: provide both stacks in our repos, make them co-installable (maybe the Debian way here), but then don’t care about other software being co-installable and just let the user choose which stack they want to be used for all remaining software on their system.
Pros: 1) avoiding all package conflicts; 2) the choice of MPI isn't limited to MPICH and OpenMPI.
Cons: 1) some users can consider it as the breakage of the KISS principle; 2) may be tricky to implement.
#1) We hardly need a bugtracker ticket to decide whether one TU wishes to do this (at least to start with).
#2) If you want to convince other Devs/TUs to switch, this is best discussed to gain consensus on the mailing list, which people watch, rather than the bugtracker, which is only assigned to you and only observed by people who happen to have heard directly from someone else that the ticket exists.
#3) Sounds like a ton of work for everyone packaging something that links against mpi when we should just agree on a preferred implementation (rather like how we agree that glibc is the preferred C library implementation).
#4) Sounds misery-inducing, providing two versions of every package that links indirectly against mpi in order to provide option #3 except co-installable will eventually lead us to having multiple versions of every package in the repositories depending on which flavor-of-the-day random dependency we're waffling over.
Before starting a discussion on the ML, I wanted to gather feedback on available options.
I see your point on the libc. The idea is that MPI-depending packages are way less numerous, and my hope is that packages compiled against either MPI implementation stay ABI-compatible, so that only directly depending packages need to be provided twice. But I might be wrong on this, in which case you are right, this is not possible (and I would advocate that we should switch to MPICH as the default).
Now as you’ve pointed out on IRC, I should already be able to package mpich in the repo, since it is mostly co-installable with openmpi. Whether providing some packages in both flavours is doable and desirable can be decided later.
Anything that does not directly depend on mpi, should only depend on the binary interface provided by e.g. hdf5, but which version of mpi is used by hdf5 is a *private implementation detail* and as long as an ABI-compatible hdf5 package is available, hdf5-using software should not care how that hdf5 was compiled. (That software may also link directly to mpi, but that is a different story.)
Where this breaks down is trying to make hdf5-openmpi and hdf5-mpich co-installable, by performing magic sorcery and renaming libhdf5.so to libhdf5-openmpi.so, in which case all of a sudden hdf5-using software does care how hdf5 was compiled. This is nuts.
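The difference between the two situations can be sketched with a toy dependency graph in Python. The graph is a simplified assumption ("solver" and "viewer" are hypothetical packages), and the first function only holds if each library keeps the same file name and ABI whichever MPI it was built with.

```python
# Hypothetical link-time dependency graph: package -> libraries it links against.
LINKS = {
    "hdf5": {"mpi"},
    "netcdf": {"hdf5"},
    "solver": {"hdf5", "mpi"},  # hypothetical package that also calls MPI itself
    "viewer": {"netcdf"},       # hypothetical package with no direct MPI use
}


def needs_two_flavours(links, mpi="mpi"):
    """Packages linking directly against MPI: the only ones to build twice,
    assuming libraries keep one file name and ABI across both MPI builds."""
    return sorted(pkg for pkg, deps in links.items() if mpi in deps)


def affected_if_renamed(links, mpi="mpi"):
    """Packages to build twice once library file names encode the MPI flavour:
    everything that reaches MPI through any chain of dependencies."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for pkg, deps in links.items():
            if pkg not in affected and (mpi in deps or deps & affected):
                affected.add(pkg)
                changed = True
    return sorted(affected)


direct_only = needs_two_flavours(LINKS)
everything = affected_if_renamed(LINKS)
```

With this toy graph, direct_only contains hdf5 and solver, while everything also pulls in netcdf and viewer, which is the blow-up described above.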
> Before starting a discussion on the ML, I wanted to gather feedback on available options.
The mailing list is defined as "the place where we try to discuss options about things". The bugtracker is defined as "the place where we track actionable items for action". I really do feel this discussion would be a lot more *productive* on the mailing list.
If we go this way, we definitely don’t want to have both hdf5-openmpi and hdf5-mpich co-installable (that would have been option 4 above, which I’m not advocating for); we just want the user to be able to choose between the two of them.
As for bug tracker vs mailing list, the second is OK if we converge to a point quickly, but for a quite technical discussion like this one, I prefer to first have the discussion somewhere that allows keeping track of it over the longer term, and then move to the mailing list when I have a clear proposal (to reduce noise for uninterested people in particular). There are countless discussions on the ML that are stalled because messages are ancient and almost no-one kept track of them. And when you restart them, people sometimes have to search the archive over several months to find the start of the discussion. Here it is way easier. ;)