FS#79620 - [python-pytorch-opt-rocm] Add new APU to the build

Attached to Project: Arch Linux
Opened by João (JotaFan) - Friday, 08 September 2023, 17:48 GMT
Last edited by Torsten Keßler (tpkessler) - Tuesday, 12 September 2023, 07:51 GMT
Task Type Feature Request
Category Packages: Extra
Status Closed
Assigned To Sven-Hendrik Haase (Svenstaro)
Konstantin Gizdov (kgizdov)
Torsten Keßler (tpkessler)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
Could you add gfx90c as one of the default GPU targets in the build of this package?
It took me a long time to build this package, and I had to tweak a lot.


Closed by  Torsten Keßler (tpkessler)
Tuesday, 12 September 2023, 07:51 GMT
Reason for closing:  Won't implement
Additional comments about closing:  APUs are not supported by ROCm
Comment by Christian Heusel (gromit) - Friday, 08 September 2023, 18:09 GMT
This is missing a lot of context:

- Which package are you talking about?
- When you tweaked the PKGBUILD, could you share it?
- What problem are you trying to solve?

Future low quality bug reports like this will most likely just be closed, so please show some effort!
Comment by João (JotaFan) - Friday, 08 September 2023, 18:44 GMT
Yes. I apologise.
This is my first report, and I thought I was filing it directly against the package.

package: python-pytorch-opt-rocm
PKGBUILD:
added on line 275:
export PYTORCH_ROCM_ARCH="gfx90c"

The package was originally built for other GPUs, so the PYTORCH_ROCM_ARCH variable lists a few GPUs, but mine is not among them.
Since the build is huge and always breaks due to OOM, I thought of asking to have that GPU added to the list for the next build.
Comment by loqs (loqs) - Sunday, 10 September 2023, 22:38 GMT
The current value of PYTORCH_ROCM_ARCH [1] does not include any APUs [2]? Perhaps a comment could be added explaining the selection?

[1] https://gitlab.archlinux.org/archlinux/packaging/packages/python-pytorch/-/blob/79b2507736b38bf87cf47a2c3a3e4aafbf7f2cd0/PKGBUILD#L275
[2] https://llvm.org/docs/AMDGPUUsage.html
Comment by João (JotaFan) - Monday, 11 September 2023, 15:28 GMT
The specific APU I am asking for:
gfx90c

So when I install the package and try to run pytorch I get:
2023-09-11 16:26:05.415264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2015] Ignoring visible gpu device (device: 0, name: AMD Radeon Graphics, pci bus id: 0000:04:00.0) with AMDGPU version : gfx90c. The supported AMDGPU versions are gfx1030, gfx900, gfx906, gfx908, gfx90a, gfx940, gfx941, gfx942.


So I would guess that if the build also included that target, I would be able to run it.
But I cannot build it myself, as I run out of memory (OOM) when I try. That is why I came here to ask for a package that is also built with that flag.
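
For reference, the log message above can be checked mechanically. The following is only a minimal Python sketch that parses that one line and confirms gfx90c is missing from the supported list; the function name parse_ignored_device is hypothetical, and the regular expression assumes the exact wording of the message quoted above.

```python
import re

# The message quoted above, with the timestamp and source location dropped.
LOG_LINE = (
    "Ignoring visible gpu device (device: 0, name: AMD Radeon Graphics, "
    "pci bus id: 0000:04:00.0) with AMDGPU version : gfx90c. "
    "The supported AMDGPU versions are gfx1030, gfx900, gfx906, gfx908, "
    "gfx90a, gfx940, gfx941, gfx942."
)


def parse_ignored_device(line):
    """Return (device_target, supported_targets), or None if the line does not match.

    Hypothetical helper: the pattern assumes the wording of the message above.
    """
    match = re.search(
        r"AMDGPU version : (gfx[0-9a-f]+)\. "
        r"The supported AMDGPU versions are (.+?)\.$",
        line,
    )
    if match is None:
        return None
    device = match.group(1)
    supported = [name.strip() for name in match.group(2).split(",")]
    return device, supported


if __name__ == "__main__":
    device, supported = parse_ignored_device(LOG_LINE)
    print(device, "in supported list:", device in supported)
```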
Comment by Torsten Keßler (tpkessler) - Tuesday, 12 September 2023, 07:51 GMT
It's not as simple as just adding the targets to the list. Since APUs are not supported by ROCm, any low-level GPU library will crash when used from pytorch. Try

env HSA_OVERRIDE_GFX_VERSION=9.0.0 python ...

instead. This forces the GPU libraries to follow the code paths of the closest supported ISA, gfx900.
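
The same override can also be applied from a small Python launcher. This is only a sketch: the helper names gfx_to_override and run_with_override are hypothetical, and the conversion from a target name to a version string assumes the last two characters of the name are the minor and stepping hex digits (so gfx900 becomes 9.0.0, as in the command above). It does not make the APU supported; it only sets the variable for the child process.

```python
import os
import subprocess
import sys


def gfx_to_override(target):
    """Convert a target name such as 'gfx900' to a version string such as '9.0.0'.

    Assumption: the last two characters are the minor and stepping hex digits,
    and everything between 'gfx' and those is the decimal major version.
    """
    if not target.startswith("gfx") or len(target) < 6:
        raise ValueError(f"unexpected target name: {target!r}")
    body = target[3:]
    major = int(body[:-2])
    minor = int(body[-2], 16)
    stepping = int(body[-1], 16)
    return f"{major}.{minor}.{stepping}"


def run_with_override(argv, target="gfx900"):
    """Run argv with HSA_OVERRIDE_GFX_VERSION set in the child's environment."""
    env = dict(os.environ, HSA_OVERRIDE_GFX_VERSION=gfx_to_override(target))
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Probe: show what the child process actually sees.
    probe = "import os; print(os.environ['HSA_OVERRIDE_GFX_VERSION'])"
    print(run_with_override([sys.executable, "-c", probe]).stdout.strip())
```

Replacing the probe with the actual script path gives the same effect as the env command above.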