FS#69005 - [cuda] 11.2 incompatible with driver 455.45
Attached to Project:
Community Packages
Opened by Michael (ZeroBeat) - Wednesday, 16 December 2020, 16:42 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Thursday, 07 January 2021, 17:24 GMT
Details
CUDA 11.2 is incompatible with current driver:
$ pacman -Q | grep cuda
cuda 11.2.0-1
$ pacman -Q | grep nvidia
nvidia 455.45.01-7
nvidia-settings 455.45.01-1
nvidia-utils 455.45.01-1
opencl-nvidia 455.45.01-1
$ hashcat -m 22000 --benchmark
hashcat (v6.1.1-120-g15bf8b730) starting in benchmark mode...
CUDA API (CUDA 11.1)
Device #1: GeForce GTX 970, 3887/4039 MB, 13MCU
OpenCL API (OpenCL 1.2 CUDA 11.1.114) - Platform #1 [NVIDIA Corporation]
Device #2: GeForce GTX 970, skipped
Hashmode: 22000 - WPA-PBKDF2-PMKID+EAPOL (Iterations: 4095)
cuLinkAddData(): the provided PTX was compiled with an unsupported toolchain.
Device #1: Kernel /usr/share/hashcat/OpenCL/shared.cl link failed. Error Log:
ptxas application ptx input, line 9; fatal : Unsupported .version 7.2; current version is '7.1'
Device #1: Kernel /usr/share/hashcat/OpenCL/shared.cl build failed.
Started: Wed Dec 16 16:16:38 2020
Stopped: Wed Dec 16 16:16:40 2020
release notes:
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
CUDA 11.2.0 GA >=460.27.04 >=460.89
Solutions:
Inform users not to update to 11.2 until nvidia 460.27 is released
or
stop the update to cuda 11.2
or
provide the nvidia 460.27.04 (beta) driver
Stay healthy,
cheers
Mike
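The ptxas line in the log above names both sides of the mismatch: the PTX version the toolkit emitted (7.2) and the newest one the installed driver accepts (7.1). As a rough sketch only, the two versions could be pulled out of such a line like this. The function name ptx_versions and the regular expression are illustrative assumptions, not part of hashcat or CUDA.

```python
import re

# Pattern modelled on the single ptxas message quoted in this report;
# other ptxas versions may word the message differently (assumption).
PTX_ERROR = re.compile(
    r"Unsupported \.version (\d+)\.(\d+); current version is '(\d+)\.(\d+)'"
)

# The error line exactly as it appears in the hashcat output above.
SAMPLE_LINE = (
    "ptxas application ptx input, line 9; fatal : "
    "Unsupported .version 7.2; current version is '7.1'"
)


def ptx_versions(log_line):
    """Return ((requested), (supported)) PTX versions, or None if no match."""
    match = PTX_ERROR.search(log_line)
    if match is None:
        return None
    req_major, req_minor, sup_major, sup_minor = map(int, match.groups())
    return (req_major, req_minor), (sup_major, sup_minor)


if __name__ == "__main__":
    requested, supported = ptx_versions(SAMPLE_LINE)
    if requested > supported:
        print("toolkit emits PTX", requested, "but driver accepts up to", supported)
```

A requested version higher than the supported one means the toolkit is newer than the driver's JIT compiler, which is the situation described in this report.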
Closed by Sven-Hendrik Haase (Svenstaro)
Thursday, 07 January 2021, 17:24 GMT
Reason for closing: Fixed
https://docs.nvidia.com/cuda/parallel-thread-execution/#changes-in-ptx-isa-version-7-2
and the "basic functions" (e.g. querying a device) are still working.
This is the output of a small CUDA program (basic functions only) that queries my device:
$ ./dp
CUDA Device Query...
There are 1 CUDA devices.
CUDA Device #0
Major revision number: 5
Minor revision number: 2
Name: GeForce GTX 970
Total global memory: 4236115968
Total shared memory per block: 49152
Total registers per block: 65536
Warp size: 32
Maximum memory pitch: 2147483647
Maximum threads per block: 1024
Maximum dimension 0 of block: 1024
Maximum dimension 1 of block: 1024
Maximum dimension 2 of block: 64
Maximum dimension 0 of grid: 2147483647
Maximum dimension 1 of grid: 65535
Maximum dimension 2 of grid: 65535
Clock rate: 1228000
Total constant memory: 65536
Texture alignment: 512
Concurrent copy and execution: Yes
Number of multiprocessors: 13
Kernel execution timeout: No
I observed the same problem when moving from CUDA 11.0 to 11.1 while running an older driver. The release notes list these minimum driver versions (Linux first, then Windows):
CUDA 11.1 GA >= 455.23 >= 456.38
CUDA 11.0.3 Update 1 >= 450.51.06 >= 451.82
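The minimum driver versions quoted in this report can be turned into a quick check before updating the cuda package. This is only a sketch: the table holds just the three Linux minimums quoted here, and the names MIN_LINUX_DRIVER, parse_version and driver_supports are made up for illustration.

```python
# Minimum Linux driver per CUDA release, copied from the release-note
# lines quoted in this report (first ">=" value on each line).
MIN_LINUX_DRIVER = {
    "11.2.0": "460.27.04",
    "11.1": "455.23",
    "11.0.3": "450.51.06",
}


def parse_version(text):
    """Turn '455.45.01' into (455, 45, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(cuda_version, driver_version):
    """Return True if driver_version meets the quoted minimum for cuda_version."""
    minimum = MIN_LINUX_DRIVER[cuda_version]
    return parse_version(driver_version) >= parse_version(minimum)


if __name__ == "__main__":
    installed = "455.45.01"  # driver version from the pacman output in this report
    for cuda in ("11.1", "11.2.0"):
        status = "ok" if driver_supports(cuda, installed) else "driver too old"
        print("CUDA", cuda, "with nvidia", installed + ":", status)
```

With the installed 455.45.01 driver, the check passes for CUDA 11.1 and fails for 11.2.0, which matches the behaviour seen with hashcat.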
https://github.com/hashcat/hashcat/issues/2626
adds uvm kernel module support for kernels >= 5.9, which is re-enabled now, i.e. things like CUDA are working again with kernels >= 5.9
https://opensuse.pkgs.org/15.2/nvidia-x86_64/nvidia-computeG05-460.27.04-lp152.33.1.x86_64.rpm.html
That leads me to assume that we (Arch) are not the only ones running into this issue.
Stay healthy,
cheers
Mike
Tried that on 5.9.14-arch1-1. Maybe 460.27.04 works better with 5.10.1, but I'm not sure.
I will continue to test the driver in combination with 5.10.1.
Maybe I can figure out what went wrong.
BTW:
Your PKGBUILDs are excellent. They worked like a charm before.
It looks like the combination kernel 5.10.1 -> nvidia 460.27.04 is working much better than kernel 5.9.14 -> 460.27.04.
We can assume that the final driver, once released, will work fine with kernel 5.10, so you shouldn't waste your time trying to get it to work on 5.9.xx.
Unfortunately, there are still some issues with this combination on the notebooks I have to deal with:
ASUS (TUF gaming) notebook: AMD integrated GPU + NVIDIA PCIe card (GTX 1650)
ASUS notebook: Intel integrated GPU + NVIDIA PCIe card (M940)
After turning on the notebook, it sometimes takes more than 5 reboots until the NVIDIA card is detected and I no longer run into a black screen.
But I'm not sure whether this issue is really related to the beta driver or to my xorg configs (attached; maybe I made a mistake generating them and you have a better idea).
20-nvidia.conf.amd_nvidia:
Section "Device"
Identifier "nvidia"
Driver "nvidia"
BusID "PCI:1:0:0"
VendorName "NVIDIA Corporation"
Option "NoLogo" "1"
Option "Interactive" "0"
Option "Coolbits" "12"
Option "AllowEmptyInitialConfiguration"
EndSection
Section "Device"
Identifier "amd"
Driver "amdgpu"
BusID "PCI:5:0:0"
EndSection
Section "Screen"
Identifier "amd"
Device "amd"
EndSection
20-nvidia.conf.intel_nvidia:
Section "Device"
Identifier "nvidia"
Driver "nvidia"
BusID "PCI:1:0:0"
VendorName "NVIDIA Corporation"
Option "NoLogo" "1"
Option "Interactive" "0"
Option "Coolbits" "12"
Option "AllowEmptyInitialConfiguration"
EndSection
Section "Device"
Identifier "intel"
Driver "modesetting"
EndSection
Section "Screen"
Identifier "intel"
Device "intel"
EndSection
Stay healthy
cheers
Mike
At last I found the issue. Due to fast SSDs, my notebooks boot too fast, and I have to slow them down during boot to prevent systemd from starting the display manager before the NVIDIA driver has fully initialized.
After adding a udev rule, the combination of kernel 5.10.1 and nvidia 460.27.04 is working fine.
Now we can wait until the final nvidia 460.27 is released and 5.10 leaves testing.
We can close this report.
Thanks.
Happy new year,
cheers
Mike