FS#78124 - [intel-oneapi-compiler-shared-opencl-cpu] causes hang in clGetPlatformIDs()

Attached to Project: Community Packages
Opened by Barnabás Pőcze (pobrn) - Wednesday, 05 April 2023, 14:05 GMT
Last edited by Torsten Keßler (tpkessler) - Thursday, 25 May 2023, 15:16 GMT
Task Type Bug Report
Category Packages
Status Closed
Assigned To Konstantin Gizdov (kgizdov)
Torsten Keßler (tpkessler)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:
`clinfo` hangs when intel-oneapi-compiler-shared-opencl-cpu is installed. (This can also affect other programs, such as LibreOffice.)


Additional info:
* intel-oneapi-compiler-shared-opencl-cpu 2023.0.0-7

Steps to reproduce:
1) install intel-oneapi-compiler-shared-opencl-cpu
2) start clinfo
3) observe that it hangs

---

gdb reveals the following stack trace:

Starting program: /usr/bin/clinfo -l
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
futex_wait (private=0, expected=1, futex_word=0x7ffff7fc3878) at ../sysdeps/nptl/futex-internal.h:146
146 int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
(gdb) bt
#0 futex_wait (private=0, expected=1, futex_word=0x7ffff7fc3878) at ../sysdeps/nptl/futex-internal.h:146
#1 futex_wait_simple (private=0, expected=1, futex_word=0x7ffff7fc3878) at ../sysdeps/nptl/futex-internal.h:177
#2 __pthread_once_slow (once_control=0x7ffff7fc3878, init_routine=0x7ffff7fc0940) at pthread_once.c:105
#3 0x00007ffff7fbd4a3 in clGetExtensionFunctionAddress () from /opt/intel/oneapi/compiler/2023.0.0/linux/lib/libOpenCL.so
#4 0x00007ffff7fbc6e7 in ?? () from /opt/intel/oneapi/compiler/2023.0.0/linux/lib/libOpenCL.so
#5 0x00007ffff7fc0bd9 in ?? () from /opt/intel/oneapi/compiler/2023.0.0/linux/lib/libOpenCL.so
#6 0x00007ffff7aa3b17 in __pthread_once_slow (once_control=0x7ffff7fc3878, init_routine=0x7ffff7fc0940) at pthread_once.c:116
#7 0x00007ffff7fbd4a3 in clGetExtensionFunctionAddress () from /opt/intel/oneapi/compiler/2023.0.0/linux/lib/libOpenCL.so
#8 0x00007ffff7c02fc4 in ?? () from /opt/cuda/lib64/libOpenCL.so.1
#9 0x00007ffff7aa3b17 in __pthread_once_slow (once_control=0x7ffff7e070d0, init_routine=0x7ffff7c02a60) at pthread_once.c:116
#10 0x00007ffff7c048df in clGetPlatformIDs () from /opt/cuda/lib64/libOpenCL.so.1
#11 0x000055555555b1e6 in ?? ()
#12 0x00007ffff7a3c790 in __libc_start_call_main (main=main@entry=0x55555555b020, argc=argc@entry=2, argv=argv@entry=0x7fffffffdf48) at ../sysdeps/nptl/libc_start_call_main.h:58
#13 0x00007ffff7a3c84a in __libc_start_main_impl (main=0x55555555b020, argc=2, argv=0x7fffffffdf48, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdf38) at ../csu/libc-start.c:360
#14 0x000055555555b8be in ?? ()

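which shows that the same pthread_once_t object (at 0x7ffff7fc3878) is initialized recursively: its init routine is still running in frame 6 when frame 2 calls pthread_once on it again from the same thread, so the nested call waits on the futex forever.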
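
To illustrate the pattern, here is a minimal sketch in C, not the actual libOpenCL.so code (which I have not seen). All names (loader_once, loader_init, loader_entry, run_demo) are hypothetical. It assumes the init routine calls back into an entry point that lazily initializes through the same pthread_once_t, as frames 6 and 2 suggest. A thread-local guard intercepts the nested call so the example terminates; without the guard, the nested pthread_once() would be expected to block the way the trace shows with glibc.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from libOpenCL.so. */
static pthread_once_t loader_once = PTHREAD_ONCE_INIT;
static _Thread_local bool inside_init = false; /* true while this thread runs loader_init */
static int init_runs = 0;        /* how many times the init routine ran */
static int reentry_attempts = 0; /* nested calls intercepted by the guard */

static void loader_entry(void);

/* Init routine that calls back into the entry point (assumed, cf. frames 6 -> 2). */
static void loader_init(void)
{
    assert(init_runs == 0); /* pthread_once runs the routine at most once */
    init_runs++;
    inside_init = true;
    loader_entry(); /* nested call on the same thread */
    inside_init = false;
}

/* Entry point that lazily initializes through loader_once. */
static void loader_entry(void)
{
    if (inside_init) {
        /* Without this guard, the pthread_once() call below would wait on
           loader_once while its own init routine is still on the stack. */
        reentry_attempts++;
        return;
    }
    pthread_once(&loader_once, loader_init);
}

/* Runs the scenario and returns the number of intercepted nested calls. */
static int run_demo(void)
{
    loader_entry(); /* first call: runs loader_init, which re-enters once */
    loader_entry(); /* later call: initialization is done, nothing re-runs */
    return reentry_attempts;
}
```

Calling run_demo() once makes the guard fire exactly one time. Whether the real library could be fixed with a guard like this, or needs its initialization order changed, is something I cannot tell from the binary.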

I have observed this on two machines; it happened even when this was the only OpenCL implementation installed (no NVIDIA, no `intel-compute-runtime`, etc.). I tried to find an upstream bug tracker, but the PKGBUILD just downloads binary packages. The closest thing I found on GitHub was https://github.com/intel/compute-runtime, but I am not sure that is the source, so I decided to report it here.

Closed by  Torsten Keßler (tpkessler)
Thursday, 25 May 2023, 15:16 GMT
Reason for closing:  Fixed
Additional comments about closing:  upstream reorganized package
Comment by Barnabás Pőcze (pobrn) - Wednesday, 05 April 2023, 14:09 GMT
