Community Packages


FS#68940 - [python-tensorflow] Needs rebuild against intel-mkl

Attached to Project: Community Packages
Opened by Chih-Hsuan Yen (yan12125) - Friday, 11 December 2020, 14:21 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Saturday, 12 December 2020, 16:50 GMT
Task Type: Bug Report
Category: Packages
Status: Closed
Assigned To: Sven-Hendrik Haase (Svenstaro), Konstantin Gizdov (kgizdov)
Architecture: All
Severity: Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description:

I need to use both tensorflow and pytorch from the same Python script, but this is not possible with the latest Arch packages. Importing them in either order fails:

```
$ python -c 'import tensorflow, torch'
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
WARNING:root:Limited tf.summary API due to missing TensorBoard installation.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.9/site-packages/torch/__init__.py", line 189, in <module>
_load_global_deps()
File "/usr/lib/python3.9/site-packages/torch/__init__.py", line 142, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.9/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /opt/intel/mkl/lib/intel64/libmkl_gnu_thread.so: undefined symbol: mkl_graph_mxm_gus_phase2_plus_second_fp32_def_i64_i32_fp32

$ python -c 'import torch, tensorflow'
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/tensorflow/python/pywrap_tensorflow.py", line 64, in <module>
from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: /usr/lib/python3.9/site-packages/tensorflow/python/../../_solib_k8/_U@mkl_Ulinux_S_S_Cmkl_Ulibs_Ulinux___Umkl_Ulinux_Slib/libmkl_intel_thread.so: undefined symbol: mkl_blas_zgemm_blk_info_hi_thr_bdz

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.9/site-packages/tensorflow/__init__.py", line 41, in <module>
from tensorflow.python.tools import module_util as _module_util
File "/usr/lib/python3.9/site-packages/tensorflow/python/__init__.py", line 40, in <module>
from tensorflow.python.eager import context
File "/usr/lib/python3.9/site-packages/tensorflow/python/eager/context.py", line 35, in <module>
from tensorflow.python import pywrap_tfe
File "/usr/lib/python3.9/site-packages/tensorflow/python/pywrap_tfe.py", line 28, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/lib/python3.9/site-packages/tensorflow/python/pywrap_tensorflow.py", line 83, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/tensorflow/python/pywrap_tensorflow.py", line 64, in <module>
from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: /usr/lib/python3.9/site-packages/tensorflow/python/../../_solib_k8/_U@mkl_Ulinux_S_S_Cmkl_Ulibs_Ulinux___Umkl_Ulinux_Slib/libmkl_intel_thread.so: undefined symbol: mkl_blas_zgemm_blk_info_hi_thr_bdz


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
```
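The two commands above differ only in import order. A minimal sketch of automating that check follows; the helper name `try_import_orders` is hypothetical and not part of the report. It runs each import order in a fresh interpreter so that libraries loaded by one attempt cannot affect the next.

```python
import subprocess
import sys
from itertools import permutations


def try_import_orders(modules):
    """Try every import order of `modules`, each in a fresh interpreter.

    Returns a dict mapping each order (a tuple of module names) to a pair
    (succeeded, stderr_text).
    """
    results = {}
    for order in permutations(modules):
        code = "import " + ", ".join(order)
        # A separate process per order keeps already-loaded shared
        # libraries from one attempt out of the next one.
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
        )
        results[order] = (proc.returncode == 0, proc.stderr)
    return results
```

Calling `try_import_orders(["tensorflow", "torch"])` would cover both orders shown in the report. The captured stderr still has to be read by hand, since tensorflow also writes warnings there on success.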

Apparently python-tensorflow bundles a copy of intel-mkl libraries:

```
$ pacman -Qo /usr/lib/python3.9/site-packages/_solib_k8/_U@mkl_Ulinux_S_S_Cmkl_Ulibs_Ulinux___Umkl_Ulinux_Slib
/usr/lib/python3.9/site-packages/_solib_k8/_U@mkl_Ulinux_S_S_Cmkl_Ulibs_Ulinux___Umkl_Ulinux_Slib/ is owned by python-tensorflow 2.3.1-7
$ ls /usr/lib/python3.9/site-packages/_solib_k8/_U@mkl_Ulinux_S_S_Cmkl_Ulibs_Ulinux___Umkl_Ulinux_Slib
libiomp5.so libmkl_core.so libmkl_intel_lp64.so libmkl_intel_thread.so libmkl_rt.so
```

So I suspect an ABI change in intel-mkl. After rebuilding tensorflow against the latest intel-mkl, my script works again. Here is the PKGBUILD I used: https://fars.ee/ligJ (I removed -cuda and -opt for a shorter build).
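One way to look for this kind of conflict is to list the shared libraries that exist both in the copy bundled with python-tensorflow and in the system intel-mkl directory. The sketch below is only an illustration: the function names are hypothetical, and a shared file name does not prove an ABI mismatch. It only shows that two copies of the same library could end up in one process.

```python
from pathlib import Path


def shared_library_names(directory):
    """Return the names of *.so* files located directly in `directory`."""
    return {p.name for p in Path(directory).glob("*.so*") if p.is_file()}


def duplicated_libraries(bundled_dir, system_dir):
    """Return the sorted library names that appear in both directories."""
    bundled = shared_library_names(bundled_dir)
    system = shared_library_names(system_dir)
    return sorted(bundled & system)
```

For this report, `bundled_dir` would be the `_solib_k8/...` directory owned by python-tensorflow and `system_dir` would be `/opt/intel/mkl/lib/intel64`, the path that appears in the torch error. Any name returned, such as `libmkl_core.so`, would be a candidate for the mismatch.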

Additional info:
python-tensorflow 2.3.1-7
python-pytorch 1.7.1rc2-1
intel-mkl 2020.4.304-1

Steps to reproduce:
See above

Closed by Sven-Hendrik Haase (Svenstaro)
Saturday, 12 December 2020, 16:50 GMT
Reason for closing: Fixed
Comment by Sven-Hendrik Haase (Svenstaro) - Saturday, 12 December 2020, 10:27 GMT
Check 2.4.0rc4 in repos.
Comment by Chih-Hsuan Yen (yan12125) - Saturday, 12 December 2020, 15:03 GMT
Thanks, it works!
