FS#75727 - [openpmix] openpmix-4.2.0-1 breaks openmpi at runtime

Attached to Project: Arch Linux
Opened by Jonathan Schilling (Jonathan9192) - Monday, 29 August 2022, 07:47 GMT
Last edited by David Runge (dvzrv) - Monday, 29 August 2022, 20:56 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To No-one
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description: openpmix-4.2.0-1 breaks openmpi at runtime

Additional info:
* package version(s): openpmix-4.2.0-1, openmpi-4.1.4-1
* config and/or log files etc.
After upgrading to openpmix-4.2.0-1, executing an MPI program I used to run leads to the following error message:

```
jonathan@Z820:/data/jonathan/work/code/eclipse_wksp/VMEC$ mpirun Debug/vmec -p 1 -v 1 resources/solovev.json
[Z820:39256] mca_base_component_repository_open: unable to open mca_pmix_ext3x: /usr/lib/openmpi/mca_pmix_ext3x.so: undefined symbol: pmix_value_load (ignored)
[Z820:39256] [[13185,0],0] ORTE_ERROR_LOG: Not found in file ess_hnp_module.c at line 320
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_pmix_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
```

* link to upstream bug report, if any

Steps to reproduce:
- pacman -Syu (this upgrades openpmix to (latest) version 4.2.0-1)
- run some MPI program using mpirun
--> immediately after startup, the program crashes with the above error message

A minimal crashing example is attached: hello_mpi.c
It is the "Hello world" MPI example from https://mpitutorial.com/tutorials/mpi-hello-world/
Compile it with `mpicc -o hello_mpi hello_mpi.c`.
Then run it with `mpirun ./hello_mpi` to reproduce this error.

Thanks for any support in this matter!

Closed by David Runge (dvzrv)
Monday, 29 August 2022, 20:56 GMT
Reason for closing: Fixed
Additional comments about closing: Fixed with openmpi 4.1.4-2
Comment by Jonathan Schilling (Jonathan9192) - Monday, 29 August 2022, 07:48 GMT
Note that downgrading openpmix to version 4.1.2-1 from the Arch Linux Archive (https://archive.archlinux.org/packages/o/openpmix/openpmix-4.1.2-1-x86_64.pkg.tar.zst) makes the above error message disappear.
Comment by Jakub Klinkovský (lahwaacz) - Monday, 29 August 2022, 11:30 GMT
You can reproduce the error with just the following command (i.e. you don't need to compile any C code):

mpirun -np 1 echo "Hello, world!"
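
For anyone who wants to script this check (for example after each upgrade), the one-liner above could be wrapped roughly as follows. This is only a sketch: the function name `mpirun_smoke_test` is made up for illustration, and it assumes that a healthy setup exits with status 0 and prints the echoed text on stdout.

```python
import shutil
import subprocess


def mpirun_smoke_test(launcher="mpirun"):
    """Run `<launcher> -np 1 echo "Hello, world!"` and report the outcome.

    Returns (ok, output). ok is None when the launcher is not installed,
    True when the run looks healthy, and False otherwise.
    """
    if shutil.which(launcher) is None:
        return None, f"{launcher} not found in PATH"

    proc = subprocess.run(
        [launcher, "-np", "1", "echo", "Hello, world!"],
        capture_output=True,
        text=True,
    )
    output = proc.stdout + proc.stderr
    # Assumption: success means a clean exit and the echoed text on stdout.
    ok = proc.returncode == 0 and "Hello, world!" in proc.stdout
    return ok, output
```

Calling `mpirun_smoke_test()` and printing the returned output on failure should show the same orte_init message as in the report, if the installation is affected.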
Comment by David Runge (dvzrv) - Monday, 29 August 2022, 16:05 GMT
Hm, I guess upstream should have introduced a soname change (but they didn't) when updating openpmix to 4.2.0.

I have rebuilt openmpi and 4.1.4-2 should fix this error.
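
The log suggests that the openmpi plugin built against the older openpmix could no longer resolve `pmix_value_load` in the new library. To check whether a given plugin still loads, one could ask the dynamic loader directly. The following is a rough sketch, not something from the packaging tooling; the helper name `try_load` is made up, and the plugin path in the usage note is simply the one from the error message.

```python
import ctypes
import os


def try_load(path):
    """Try to dlopen `path` with eager symbol resolution.

    Returns None on success, or the dynamic loader's error message on failure.
    """
    try:
        # RTLD_NOW asks the loader to resolve all symbols at load time,
        # so a missing symbol is reported here rather than at first use.
        ctypes.CDLL(path, mode=os.RTLD_NOW)
    except OSError as exc:
        return str(exc)
    return None
```

For example, `print(try_load("/usr/lib/openmpi/mca_pmix_ext3x.so"))` would presumably print an "undefined symbol" message on an affected system and `None` once openmpi has been rebuilt, although I have not verified this on both versions.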
