FS#67028 - [python-pytorch-cuda] testing: needs to be rebuilt against cuda 11

Attached to Project: Community Packages
Opened by Benoit Brummer (trougnouf) - Wednesday, 17 June 2020, 14:23 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Monday, 13 July 2020, 06:14 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To Sven-Hendrik Haase (Svenstaro), Konstantin Gizdov (kgizdov)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

cuda 11.0.1-1 in Community-Testing is not compatible with python-tensorflow-*cuda 2.2.0-1 which is built against cuda 10.2

Steps to reproduce:
[trougnouf@benoit-intoPIXPC liujiaheng_compression]$ python
Python 3.8.3 (default, May 17 2020, 18:15:42)
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/site-packages/torch/__init__.py", line 135, in <module>
    _load_global_deps()
  File "/usr/lib/python3.8/site-packages/torch/__init__.py", line 93, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory
>>>
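
The failure above can be checked without importing torch. The following is a minimal sketch, not part of the original report: it asks the dynamic linker for a given CUDA runtime soname through the same ctypes.CDLL call that appears in the traceback. The helper name can_load and the libcudart.so.11.0 soname are assumptions made for illustration.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic linker can open the given shared library."""
    try:
        # Same call that torch's _load_global_deps() makes in the traceback.
        ctypes.CDLL(soname, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        # Raised when the library cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    # Sonames are assumptions: 10.2 is taken from the error message,
    # 11.0 is the runtime expected from the cuda 11 package.
    for soname in ("libcudart.so.10.2", "libcudart.so.11.0"):
        status = "found" if can_load(soname) else "missing"
        print(f"{soname}: {status}")
```

If only the 11.0 runtime is reported as found, the installed cuda package no longer provides the library that the torch build was linked against.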
Closed by  Sven-Hendrik Haase (Svenstaro)
Monday, 13 July 2020, 06:14 GMT
Reason for closing:  Fixed
Comment by Sven-Hendrik Haase (Svenstaro) - Thursday, 18 June 2020, 02:19 GMT
Yeah, we know. We're holding off the rebuild until nvidia has published the rest of the ecosystem against cuda 11.
Comment by Sven-Hendrik Haase (Svenstaro) - Thursday, 18 June 2020, 02:32 GMT
Just to add: You can think of cuda 11 in testing as a "user preview" currently, but it's likely that we'll also wait for the final non-rc release to do the full rebuild (we're also still waiting on a publicly accessible cudnn 8 with cuda 11 support). If you need cuda right now, go back to the non-testing version.
Comment by Benoit Brummer (trougnouf) - Friday, 10 July 2020, 19:40 GMT
  • Field changed: Percent Complete (100% → 0%)
Is testing supposed to be an unusable "user preview"? It's been a month and the testing repo can't be used with pytorch and likely other cuda applications. If it's normal to knowingly break dependencies here then I apologise for bugging you.
Comment by Sven-Hendrik Haase (Svenstaro) - Friday, 10 July 2020, 20:45 GMT
No, usually testing should not be broken. We usually keep things in staging if they are known to be broken. However, cuda is so complex and contains so many moving parts that we put it into testing this time around to be able to test it somewhat thoroughly. The problem with pytorch is that it's blocked on magma, and magma upstream is currently migrating to git, which apparently takes a lot of time.

Anyway, all that said, if you need the cuda ecosystem to work, just use stable packages for the time being and block the cuda upgrade in your pacman.conf. If you'd like to help, you can try to make a patch to get magma to build. It's essentially the last missing part.
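
For reference, blocking the upgrade might look like the fragment below. This is a sketch only: IgnorePkg is the pacman.conf option for holding packages back, but the package list shown here is an assumption and should match whatever cuda-related packages are actually installed.

```
# /etc/pacman.conf
[options]
# Hold back the CUDA stack until the rebuild lands.
# The package names listed here are an assumption; adjust as needed.
IgnorePkg = cuda cudnn
```

pacman then skips the listed packages during a system upgrade and prints a warning for each one it ignores.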
Comment by Eli Schwartz (eschwartz) - Sunday, 12 July 2020, 18:40 GMT
  • Field changed: Severity (Low → High)
> cuda is so complex and contains so many moving parts that we put it into testing this time around to be able to test it somewhat thoroughly

There's no need to test it if we know it doesn't work. It's incredibly discouraging to users who were told that [testing] is intended to work, to be told "well, we decided to break it this time, my advice is to do partial updates".

The inevitable consequence of this is that our users lose their faith in the testing repos and decide they're not interested in a hot mess of broken... stuff. So they stop using testing, we end up without testers, and the entire fundamental point of having a testing repository is rendered null.

Please move it from testing to staging until such time as it's actually ready to be tested. If people want to test completely broken stuff, they can do so by downloading the relevant packages manually, and installing them with pacman -U.

[testing] shall not be broken with deliberate intent, and shall not be left to exist in torment like that for months. [staging] is the correct technology for doing so.
