FS#74520 - [python-grpcio] Causes tensorboard 2.8.0-2 to segfault

Attached to Project: Community Packages
Opened by NextAlone (NextAlone) - Wednesday, 20 April 2022, 18:48 GMT
Last edited by Konstantin Gizdov (kgizdov) - Thursday, 21 April 2022, 10:08 GMT
Task Type Bug Report
Category Packages
Status Closed
Assigned To Massimiliano Torromeo (mtorromeo), Sven-Hendrik Haase (Svenstaro), Konstantin Gizdov (kgizdov)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:

Fatal Python error: Segmentation fault

Current thread 0x00007f693bfdc740 (most recent call first):
File "/usr/lib/python3.10/site-packages/grpc/_channel.py", line 926 in _blocking
File "/usr/lib/python3.10/site-packages/grpc/_channel.py", line 944 in __call__
File "/usr/lib/python3.10/site-packages/tensorboard/data/server_ingester.py", line 184 in start
File "/usr/lib/python3.10/site-packages/tensorboard/program.py", line 409 in _start_subprocess_data_ingester
File "/usr/lib/python3.10/site-packages/tensorboard/program.py", line 446 in _make_data_ingester
File "/usr/lib/python3.10/site-packages/tensorboard/program.py", line 463 in _make_data_provider
File "/usr/lib/python3.10/site-packages/tensorboard/program.py", line 475 in _make_server
File "/usr/lib/python3.10/site-packages/tensorboard/program.py", line 292 in _run_serve_subcommand
File "/usr/lib/python3.10/site-packages/tensorboard/program.py", line 276 in main
File "/usr/lib/python3.10/site-packages/absl/app.py", line 258 in _run_main
File "/usr/lib/python3.10/site-packages/absl/app.py", line 312 in run
File "/usr/lib/python3.10/site-packages/tensorboard/main.py", line 46 in run_main
File "/usr/bin/tensorboard", line 33 in <module>

Extension modules: grpc._cython.cygrpc, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, simplejson._speedups, _cffi_backend, google.protobuf.pyext._message, tensorflow.python.framework.fast_tensor_util, h5py._errors, h5py.defs, h5py._objects, h5py.h5, h5py.h5r, h5py.utils, h5py.h5s, h5py.h5ac, h5py.h5p, h5py.h5t, h5py._conv, h5py.h5z, h5py._proxy, h5py.h5a, h5py.h5d, h5py.h5ds, h5py.h5g, h5py.h5i, h5py.h5f, h5py.h5fd, h5py.h5pl, h5py.h5o, h5py.h5l, h5py._selector, scipy._lib._ccallback_c, scipy.sparse._sparsetools, scipy.sparse._csparsetools, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.strptime, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pandas._libs.ops, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse, pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, 
pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json, PIL._imaging, scipy.ndimage._nd_image, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg.cython_lapack, scipy.linalg._decomp_update, scipy.special._comb, scipy.special._ellip_harm_2, _ni_label, scipy.ndimage._ni_label (total: 111)
fish: Job 1, 'tensorboard --logdir ~/tf-logs/' terminated by signal SIGSEGV (Address boundary error)

Steps to reproduce:

pacman -S tensorboard
tensorboard --logdir ~/log

Closed by Konstantin Gizdov (kgizdov)
Thursday, 21 April 2022, 10:08 GMT
Reason for closing: Fixed
Additional comments about closing: grpc 1.45.2-2
Comment by Chih-Hsuan Yen (yan12125) - Thursday, 21 April 2022, 01:34 GMT
Sorry, in https://bugs.archlinux.org/task/74460 I only checked whether tensorboard starts, not whether it can actually load data.

The crash is caused by python-grpcio and can be fixed by the upstream patch https://github.com/grpc/grpc/commit/05af494b282542304c9fa60d19e8aa1b9f474621.patch, so I have assigned the maintainers of both tensorboard and python-grpcio to this ticket.
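
The attribution to python-grpcio matches the traceback in the description, whose most recent frames are in grpc/_channel.py. As a rough triage aid, the sketch below pulls the most recent frame out of such a dump and maps it to its top-level package. The helper names (`FRAME_RE`, `top_level_package`, `crashing_frame`) are hypothetical and not part of tensorboard or grpc. The regular expression assumes frame lines of the form shown in this report.

```python
import re

# Matches frame lines of the form seen in the report, e.g.:
#   File "/usr/lib/python3.10/site-packages/grpc/_channel.py", line 926 in _blocking
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+) in (?P<func>\S+)')


def top_level_package(path):
    """Return the first path component after site-packages, or None."""
    parts = path.split("/")
    if "site-packages" not in parts:
        return None
    idx = parts.index("site-packages")
    return parts[idx + 1] if idx + 1 < len(parts) else None


def crashing_frame(traceback_text):
    """Return (package, function, line) for the most recent frame, or None.

    The dump lists the most recent call first, so the first match is the
    frame that was executing when the signal arrived.
    """
    match = FRAME_RE.search(traceback_text)
    if match is None:
        return None
    return (top_level_package(match["path"]), match["func"], int(match["line"]))


# First lines of the traceback from this report.
sample = '''Current thread 0x00007f693bfdc740 (most recent call first):
File "/usr/lib/python3.10/site-packages/grpc/_channel.py", line 926 in _blocking
File "/usr/lib/python3.10/site-packages/grpc/_channel.py", line 944 in __call__
File "/usr/lib/python3.10/site-packages/tensorboard/data/server_ingester.py", line 184 in start
'''
print(crashing_frame(sample))
```

The top frame only shows where the Python side was when the crash happened. The fault itself lies in native code below it, so this narrows the search but does not prove the cause.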
Comment by Yuxin Wu (ppwwyyxx) - Thursday, 21 April 2022, 08:25 GMT
Downgrading to `python-grpcio-1.43.2-1` can make tensorboard work again.
Comment by Konstantin Gizdov (kgizdov) - Thursday, 21 April 2022, 08:50 GMT
Try grpc 1.45.2-2.
Comment by Chih-Hsuan Yen (yan12125) - Thursday, 21 April 2022, 09:37 GMT
I can confirm python-grpcio 1.45.2-2 brings tensorboard back. I can load tensorflow logs from a boring and small Keras model.
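
To check whether a machine already has the fixed package, the installed version can be compared against 1.45.2-2, the version named in the closing comment. This is a minimal sketch with hypothetical helper names. It assumes a version string of the form pkgver-pkgrel with no epoch, purely numeric dot-separated parts and an integer pkgrel, so it is a simplification and not a replacement for pacman's own version comparison.

```python
import subprocess


def parse_pkg_version(version):
    """Split 'pkgver-pkgrel' into comparable integer tuples.

    Simplification: assumes no epoch, numeric dot-separated pkgver,
    and an integer pkgrel.
    """
    pkgver, _, pkgrel = version.rpartition("-")
    return tuple(int(part) for part in pkgver.split(".")), int(pkgrel)


def has_fix(installed, fixed="1.45.2-2"):
    """True if `installed` is at least the version reported as fixed."""
    return parse_pkg_version(installed) >= parse_pkg_version(fixed)


def installed_version(package="python-grpcio"):
    """Ask pacman for the installed version.

    `pacman -Q <package>` prints '<name> <version>' on one line.
    """
    result = subprocess.run(
        ["pacman", "-Q", package],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()[1]


print(has_fix("1.45.2-2"))
print(has_fix("1.45.2-1"))
```

On an Arch system, `has_fix(installed_version())` combines the two. A False result only means the version predates the fix, not that it crashes: 1.43.2-1 is reported above as working.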
