FS#68362 - [cuda] cudaGetDeviceCount returned 999: unknown error
Attached to Project:
Community Packages
Opened by Maxim Terpilowski (maximtrp) - Wednesday, 21 October 2020, 19:20 GMT
Last edited by Doug Newgard (Scimmia) - Wednesday, 21 October 2020, 23:13 GMT
Details
Description:
I am experiencing strange CUDA behaviour after updating the whole system (including drivers and kernel): running tensorflow, pytorch, and even the deviceQuery sample program from the CUDA package results in "unknown error" (see log).

lspci entry:
```
01:00.0 3D controller: NVIDIA Corporation GM108M [GeForce MX130] (rev a2)
```

The GPU is seen in the nvidia-smi output (nvidia-smi -L):
```
GPU 0: GeForce MX130 (UUID: GPU-4f408ed8-e9f2-8743-9d54-7668a08eafbd)
```

The driver is active (lsmod):
```
nvidia_drm 61440 2
nvidia_modeset 1216512 2 nvidia_drm
nvidia 27705344 72 nvidia_modeset
drm_kms_helper 266240 2 nvidia_drm,i915
drm 585728 11 drm_kms_helper,nvidia_drm,i915
```

Additional info:
* package version:
* config and/or log files etc: https://pastebin.com/yizP3TY1

Steps to reproduce:
1. Compile the deviceQuery sample from the CUDA package (/opt/cuda/samples/1_Utilities/deviceQuery)
2. See output.

Possible reason:
The cuda package PKGBUILD file (https://github.com/archlinux/svntogit-community/blob/packages/cuda/trunk/PKGBUILD) still specifies an older version of the nvidia driver. I have nvidia-455.28 installed, but CUDA is built using 455.23. I do not know whether this is related to the bug.
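The version comparison described under "Possible reason" can be checked mechanically. The following is only a minimal sketch, assuming the versions are plain dotted numbers with an optional Arch epoch and package release; the function names `upstream_version` and `driver_mismatch` are hypothetical, and a mismatch found this way is not confirmed to be the cause of the error.

```python
from typing import Tuple


def upstream_version(pkg_version: str) -> Tuple[int, ...]:
    """Turn a version such as '455.28-7' or '455.23' into a tuple of ints."""
    # Drop an optional epoch prefix ("1:") and the package release suffix ("-7").
    without_epoch = pkg_version.split(":", 1)[-1]
    pkgver = without_epoch.rsplit("-", 1)[0]
    return tuple(int(part) for part in pkgver.split("."))


def driver_mismatch(installed: str, built_against: str) -> bool:
    """Return True when the two upstream driver versions differ."""
    return upstream_version(installed) != upstream_version(built_against)


if __name__ == "__main__":
    # Values taken from this report: installed nvidia package vs. the
    # driver version named in the cuda PKGBUILD.
    installed = "455.28-7"
    built_against = "455.23"
    print(driver_mismatch(installed, built_against))  # True
```

Comparing tuples of integers rather than raw strings avoids false mismatches caused only by the package release suffix.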
Closed by Doug Newgard (Scimmia)
Wednesday, 21 October 2020, 23:13 GMT
Reason for closing: Duplicate
Additional comments about closing: FS#68312
Package versions:
- cuda 11.1.0-2
- nvidia 455.28-7
- linux 5.9.1.arch1-1