FS#80156 - [cuda] 12.3.0 compilation errors

Attached to Project: Arch Linux
Opened by Marcin Rzeźnicki (mrzeznicki) - Saturday, 04 November 2023, 02:59 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:21 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Sven-Hendrik Haase (Svenstaro)
Felix Yan (felixonmars)
Konstantin Gizdov (kgizdov)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No



cuda does not support gcc 13.x yet. See https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#host-compiler-support-policy
If you try to compile almost anything, you will get errors such as:

/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
typedef __float128 _Float128;

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
typedef float _Float32;

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
typedef double _Float64;

Closed by  Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:21 GMT
Reason for closing:  Moved
Additional comments about closing:  https://gitlab.archlinux.org/archlinux/packaging/packages/cuda/issues/1
Comment by Jed Brown (jedbrown) - Saturday, 04 November 2023, 04:01 GMT
Note that `/opt/cuda/bin/g++` is a symlink to `g++-12`. One can manually `nvcc -ccbin /opt/cuda/bin/g++`, but I believe that is intended to be default behavior.
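To check where the symlink mentioned above actually points on a given machine, a minimal sketch like the following can help. The helper name `resolve_compiler` is hypothetical, and the path is the one from the comment, which only exists where the cuda package is installed.

```python
from pathlib import Path


def resolve_compiler(path):
    """Follow any symlinks and return the final target of a compiler path."""
    return Path(path).resolve()


if __name__ == "__main__":
    # Path taken from the comment above; on a system without the cuda
    # package this simply prints the path unchanged.
    print(resolve_compiler("/opt/cuda/bin/g++"))
```

If the output ends in `g++-12`, passing that path via `-ccbin` selects GCC 12 as the host compiler.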
Comment by Sven-Hendrik Haase (Svenstaro) - Saturday, 04 November 2023, 12:03 GMT
cuda correctly uses gcc12. If there's a specific packaging issue you want to point out, please first verify it's not a problem with the build system configuration. nvcc should by default use the symlinked gcc12 compiler.
Comment by Marcin Rzeźnicki (mrzeznicki) - Saturday, 04 November 2023, 12:05 GMT
I see, I did not notice that the package already depends on gcc12.
Comment by Jed Brown (jedbrown) - Tuesday, 07 November 2023, 06:56 GMT
  • Field changed: Percent Complete (100% → 0%)
It is not fixed in cuda-12.3.0-4 in extra-testing. That has exactly the same behavior as in my previous message.
Comment by Toolybird (Toolybird) - Tuesday, 07 November 2023, 06:59 GMT
I just tried compiling the "deviceQuery" sample as per the Wiki and that worked. But most of the other samples are failing as per above so this is possibly still not right...
Comment by loqs (loqs) - Tuesday, 07 November 2023, 08:37 GMT
The cuda-samples Makefiles specify the host compiler using -ccbin $(HOST_COMPILER) [1]. As per the documentation you need to use `make HOST_COMPILER=/opt/cuda/bin/g++` or `make -O HOST_COMPILER=/opt/cuda/bin` [2].

[1]: https://github.com/NVIDIA/cuda-samples/blob/v12.3/Samples/1_Utilities/deviceQuery/Makefile#L163-L164
[2]: https://github.com/NVIDIA/cuda-samples/blob/v12.3/README.md#linux
Comment by loqs (loqs) - Tuesday, 07 November 2023, 08:49 GMT
If NVCC_PREPEND_FLAGS were switched to NVCC_APPEND_FLAGS, so that -ccbin /opt/cuda/bin is added as the last argument, would that allow all the samples to work as expected?
The above change allowed all but two samples to build:
Comment by Sven-Hendrik Haase (Svenstaro) - Tuesday, 07 November 2023, 15:14 GMT
The whole idea of having it be NVCC_PREPEND_FLAGS is to allow users to easily overwrite this during build processes. It's a bit of a pity that the CUDA samples overwrite this in the wrong way.
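The ordering trade-off discussed here can be illustrated with a small model. This is only a sketch, not nvcc's real option parser: it assumes that NVCC_PREPEND_FLAGS are placed before the command-line arguments, NVCC_APPEND_FLAGS after them, and that for a repeated `-ccbin` the last value is the one used. The function name `effective_ccbin` is hypothetical, and only the space-separated form of the option is handled.

```python
import shlex


def effective_ccbin(cmdline_args, prepend_flags="", append_flags=""):
    """Return the -ccbin value a last-one-wins parser would pick, or None."""
    # Assumed model: prepend flags + user command line + append flags.
    args = shlex.split(prepend_flags) + list(cmdline_args) + shlex.split(append_flags)
    chosen = None
    i = 0
    while i < len(args):
        if args[i] in ("-ccbin", "--compiler-bindir") and i + 1 < len(args):
            chosen = args[i + 1]  # a later occurrence replaces an earlier one
            i += 2
        else:
            i += 1
    return chosen


# Default in the prepend position: an explicit -ccbin from the user wins.
print(effective_ccbin(["-ccbin", "clang++", "x.cu"],
                      prepend_flags="-ccbin /opt/cuda/bin"))
# Default in the append position: it overrides the user's explicit choice.
print(effective_ccbin(["-ccbin", "clang++", "x.cu"],
                      append_flags="-ccbin /opt/cuda/bin"))
```

Under this model the samples' own `-ccbin $(HOST_COMPILER)` also wins over a prepended default, which matches the failures reported above unless HOST_COMPILER is set explicitly.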
Comment by loqs (loqs) - Tuesday, 07 November 2023, 15:32 GMT
Close as 'Won't Fix' and expect users to follow the README.md for cuda-samples?
Comment by Toolybird (Toolybird) - Tuesday, 07 November 2023, 21:04 GMT
So it's a docs issue? It's not obvious and very unclear for noobs (i.e. me :) so someone should add a note to the Wiki and/or the .install file. Is  FS#80157  related?
Comment by loqs (loqs) - Tuesday, 07 November 2023, 21:32 GMT
> Is  FS#80157  related?
The second issue mentioned in  FS#80157  was caused by gcc 13 being used but is fixed in cuda-12.3.0-3/cuda 12.3.0-4. The first issue from  FS#80157  is unrelated and needs an upstream patch or pkgver update.
Comment by Sven-Hendrik Haase (Svenstaro) - Sunday, 12 November 2023, 03:36 GMT
Ideally, we could get rid of the profile thing again. Jakub reported a bug to nvidia to restore the original behavior. Let's see whether anything comes out of that.
Comment by Jed Brown (jedbrown) - Sunday, 12 November 2023, 04:08 GMT
Yeah, note that with cuda-12.3.0-4 (with the profile flags), when the user runs, for example, `nvcc -ccbin clang++`, they get the warning below. It is undesirable noise and a bit confusing to users, but otherwise "harmless".

nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
Comment by Jakub Klinkovský (lahwaacz) - Sunday, 12 November 2023, 19:58 GMT
Sorry for getting here late, this report slipped while I communicated with Sven by other channels...

I reported this to Nvidia's private bug tracker as "CUDA 12.3.0 breaks default GCC configuration by symlinks in $CUDA_ROOT/bin/". We ping-ponged about 10 messages with their "bug wrangler", but the engineering team should now look at it and maybe fix it in the next release (or some later release).

Some core points from the discussion:

- Nvidia does not support GCC 13 as a host compiler and probably does not officially support a Linux distro which uses GCC 13 as default, so it was difficult to prove our case
- nvcc has several components (gcc/cudafe++/cicc/fatbinary…) and only cudafe++ was looking into /opt/cuda/bin/g++ in previous releases (you can check by running `nvcc --verbose something.cu` and notice `--gnu_version=130201` being passed around)
- our hack which removes the host compiler checks from the headers [1] did not help, I had to rebuild two CUDA versions (as I was arguing about a regression) without this change to get the "officially correct" error messages

[1] https://gitlab.archlinux.org/archlinux/packaging/packages/cuda/-/blob/main/PKGBUILD?ref_type=heads#L126-131
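For reading the `--gnu_version=130201` value mentioned in the list above, here is a minimal sketch. It assumes the number is encoded as major*10000 + minor*100 + patch, the usual way GCC version numbers are packed into one integer; the thread itself does not spell out the encoding, and `decode_gnu_version` is a hypothetical helper name.

```python
def decode_gnu_version(value):
    """Split an integer such as 130201 into (major, minor, patch)."""
    # Assumed encoding: major * 10000 + minor * 100 + patch.
    major, rest = divmod(int(value), 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


print(decode_gnu_version("130201"))  # (13, 2, 1)
```

Under that assumption, 130201 means the system GCC 13.2.1 was detected rather than the GCC 12 behind the symlink.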