FS#68362 - [cuda] cudaGetDeviceCount returned 999: unknown error

Attached to Project: Community Packages
Opened by Maxim Terpilowski (maximtrp) - Wednesday, 21 October 2020, 19:20 GMT
Last edited by Doug Newgard (Scimmia) - Wednesday, 21 October 2020, 23:13 GMT
Task Type Bug Report
Category Packages
Status Closed
Assigned To No-one
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
After a full system update (including the NVIDIA driver and the kernel), CUDA no longer works: running TensorFlow, PyTorch, and even the deviceQuery sample program from the CUDA package fails with "unknown error" (see log).

lspci entry:
```
01:00.0 3D controller: NVIDIA Corporation GM108M [GeForce MX130] (rev a2)
```

The GPU is listed in the nvidia-smi output (nvidia-smi -L):
```
GPU 0: GeForce MX130 (UUID: GPU-4f408ed8-e9f2-8743-9d54-7668a08eafbd)
```

The driver modules are loaded (lsmod):
```
nvidia_drm 61440 2
nvidia_modeset 1216512 2 nvidia_drm
nvidia 27705344 72 nvidia_modeset
drm_kms_helper 266240 2 nvidia_drm,i915
drm 585728 11 drm_kms_helper,nvidia_drm,i915
```

Additional info:
* package version:
* config and/or log files etc: https://pastebin.com/yizP3TY1

Steps to reproduce:
1. Compile deviceQuery sample from CUDA package (/opt/cuda/samples/1_Utilities/deviceQuery)
2. See output.
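
For a quicker check than compiling the sample, the same runtime call can be made from Python through ctypes. This is only a minimal sketch: the helper name `query_device_count` is hypothetical, and it assumes the CUDA runtime library can be loaded as `libcudart.so` (if it is not on the loader path, a full path to the library under the CUDA install prefix has to be passed instead).

```python
import ctypes


def query_device_count(lib_name="libcudart.so"):
    """Call cudaGetDeviceCount from the CUDA runtime.

    Returns (error_code, device_count), or None if the library
    cannot be loaded. lib_name is an assumption; adjust it to the
    actual location of the CUDA runtime on the system.
    """
    try:
        cudart = ctypes.CDLL(lib_name)
    except OSError:
        return None  # CUDA runtime not found by the dynamic loader

    count = ctypes.c_int(0)
    # C signature: cudaError_t cudaGetDeviceCount(int *count)
    cudart.cudaGetDeviceCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
    cudart.cudaGetDeviceCount.restype = ctypes.c_int
    err = cudart.cudaGetDeviceCount(ctypes.byref(count))
    return err, count.value


if __name__ == "__main__":
    result = query_device_count()
    if result is None:
        print("could not load the CUDA runtime library")
    else:
        err, n = result
        print(f"cudaGetDeviceCount returned {err}, device count = {n}")
```

On an affected system this would be expected to report the same error code as in the task title; a return code of 0 means the call succeeded.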

Possible reason:
The cuda package PKGBUILD (https://github.com/archlinux/svntogit-community/blob/packages/cuda/trunk/PKGBUILD) currently specifies an older version of the NVIDIA driver. I have nvidia 455.28 installed, but CUDA is built against 455.23. I do not know whether this is related to the bug.
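
The mismatch described above can be expressed as a small comparison. This is an illustrative sketch only: the helper names are hypothetical, the two version strings are the ones quoted in this report, and a mismatch by itself does not show that it causes the error.

```python
def parse_version(text):
    """Turn '455.28' or '455.28-7' into a tuple of ints, ignoring the pkgrel."""
    upstream = text.split("-")[0]  # drop the package release suffix, if any
    return tuple(int(part) for part in upstream.split("."))


def versions_match(installed, built_against):
    """True if both strings name the same upstream driver version."""
    return parse_version(installed) == parse_version(built_against)


if __name__ == "__main__":
    installed = "455.28-7"    # nvidia package version from this report
    built_against = "455.23"  # driver version named in the cuda PKGBUILD
    if versions_match(installed, built_against):
        print("driver versions match")
    else:
        print(f"driver mismatch: installed {installed}, CUDA built against {built_against}")
```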

Closed by Doug Newgard (Scimmia)
Wednesday, 21 October 2020, 23:13 GMT
Reason for closing: Duplicate
Additional comments about closing: FS#68312
Comment by Maxim Terpilowski (maximtrp) - Wednesday, 21 October 2020, 19:23 GMT
Versions of the cuda, nvidia, and linux kernel packages installed on my system:
- cuda 11.1.0-2
- nvidia 455.28-7
- linux 5.9.1.arch1-1
Comment by David Thurstenson (thurstylark) - Wednesday, 21 October 2020, 19:26 GMT
