FS#74023 - [mac] Build with optimisation flags (and optionally AVX2)

Attached to Project: Community Packages
Opened by Valérian Sibille (Dakeryas) - Friday, 04 March 2022, 00:09 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:08 GMT
Task Type Feature Request
Category Packages
Status Closed
Assigned To George Rawlinson (rawlinsong)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description: Compression with the current package is four times slower than with an AVX2 build on my machine (i7-7700HQ). The original author already added AVX2 instructions in version 6.50, so I guess it would make sense to at least add `-march=native` to the build flags, so that AVX2 would be enabled on any machine that supports it. Even just including the `-O2` flag (which is not included in the author's Makefile, although the Makefile uses CXXFLAGS correctly if set) shrinks the compression time by a factor of three.

Steps to reproduce: Compare the WAV to APE conversion runtime of a manual build using `export CXXFLAGS="${CXXFLAGS} -O3 -march=native"` and GCC 11 with the runtime of the executable from the package.


Closed by  Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:08 GMT
Reason for closing:  Moved
Additional comments about closing:  https://gitlab.archlinux.org/archlinux/packaging/packages/mac/issues/1
Comment by Doug Newgard (Scimmia) - Friday, 04 March 2022, 00:23 GMT
On machines that support it? That's not how binary distros work; this would make it so the package doesn't run on a bunch of machines that Arch supports. AVX2 would have to wait until the x86_64v3 repos are a thing.

Arch's default CFLAGS and CXXFLAGS already include -O2.
Comment by loqs (loqs) - Friday, 04 March 2022, 00:47 GMT
It looks like mac does support runtime AVX2 detection, but it is guarded by `#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))`, so it does not work with gcc.
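For reference, GCC and Clang expose their own runtime CPU feature check, `__builtin_cpu_supports`. The following is a minimal sketch of how an AVX2 check could look outside MSVC. `HasAvx2` is a hypothetical helper name, not a function from mac's source, and this is not the patch attached to this task.

```cpp
// Sketch only: runtime AVX2 detection for GCC/Clang on x86.
// HasAvx2 is a hypothetical helper, not part of mac's code.
static bool HasAvx2()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // Initialise the compiler's CPU feature data; calling it more than once is harmless.
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    // Other compilers or architectures: report no AVX2 in this sketch.
    return false;
#endif
}
```

A caller would evaluate `HasAvx2()` once and select the AVX2 or the plain code path based on the result.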
Comment by Valérian Sibille (Dakeryas) - Friday, 04 March 2022, 00:58 GMT
@Doug Newgard

You have a fair point, then I guess it would entail making two packages (mac and mac-avx2) and two builds, i.e. add a build with an explicit "-mavx2" flag for distribution.
Comment by Levente Polyak (anthraxx) - Friday, 04 March 2022, 18:18 GMT
@Valérian: That's not something we do; this will need to wait for x86_64v3 if it doesn't support runtime detection for gcc.
Comment by loqs (loqs) - Saturday, 05 March 2022, 00:01 GMT
@Dakeryas If you apply the attached patch (the patch is against 7.38), is runtime detection with gcc then enabled?
Comment by Valérian Sibille (Dakeryas) - Monday, 07 March 2022, 20:52 GMT
@loqs

I have git-applied your patch, but the compression speeds are the same as with only the `-O2` flag. In other words, AVX2 is not used. I built the code with:
make -C Source/Projects/NonWindows clean all CXXFLAGS='-O2'

Looking further at the code it seems that the GetMMXAvailable function is only called in the APE decompression code (file Source/MACLib/Old/APEDecompressCore.cpp), so I assume this is expected...


Comment by loqs (loqs) - Monday, 07 March 2022, 21:32 GMT
AVX2 is used in Source/MACLib/NNFilter.cpp. I missed that ENABLE_AVX_ASSEMBLY has a condition check in All.h.
Please try the updated test.patch attached to this comment.
Comment by Valérian Sibille (Dakeryas) - Saturday, 12 March 2022, 21:09 GMT
@loqs Sorry for the late reply. I have tried the new patch, but I still get the same slower compression time...
Comment by loqs (loqs) - Sunday, 13 March 2022, 00:03 GMT
The project contains two All.h files: Shared/All.h, which I patched but which is not used, and Source/Shared/All.h, which is used but which I failed to patch.
Patching that as well then revealed a much more significant issue. GCC and Clang only make the wrapper functions for AVX and SSE2 intrinsics available when the relevant feature switch is passed.
MSVC allows their use even when the architecture does not support it. So the build then fails. This looks like it is impossible to fix before x86_64v3.
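As an illustration of the compiler behaviour described above: GCC and Clang also accept a per-function `target` attribute, which makes the AVX2 intrinsics usable inside that one function without passing `-mavx2` for the whole build. The sketch below combines it with a runtime check. `AddAvx2`, `AddScalar` and `Add` are hypothetical names and the kernel is a toy element-wise addition, not mac's NNFilter code; whether this approach fits mac's sources is not established in this thread.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_X86 1
#include <immintrin.h>
#endif

// Portable fallback: plain element-wise addition.
static void AddScalar(const int32_t* a, const int32_t* b, int32_t* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

#ifdef SKETCH_X86
// Only this function is compiled for AVX2, so the intrinsics are
// available here even though the build itself does not use -mavx2.
__attribute__((target("avx2")))
static void AddAvx2(const int32_t* a, const int32_t* b, int32_t* out, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
        __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i), _mm256_add_epi32(va, vb));
    }
    for (; i < n; ++i)  // remaining elements that do not fill a full vector
        out[i] = a[i] + b[i];
}
#endif

// Dispatcher: takes the AVX2 path only when the running CPU reports support.
void Add(const int32_t* a, const int32_t* b, int32_t* out, std::size_t n)
{
#ifdef SKETCH_X86
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        AddAvx2(a, b, out, n);
        return;
    }
#endif
    AddScalar(a, b, out, n);
}
```

With this layout the resulting binary still starts on CPUs without AVX2, because the AVX2 instructions are confined to a function that is only called after the runtime check succeeds.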
Comment by Valérian Sibille (Dakeryas) - Sunday, 13 March 2022, 04:08 GMT
I see, thanks for trying! I still don't see the problem with having a second package function/version in the same PKGBUILD that builds explicitly with the `-mavx2` flag.
Comment by Toolybird (Toolybird) - Wednesday, 17 May 2023, 07:15 GMT
This could be solved *right now* by leveraging "hwcaps" support in glibc [1]. No need to wait around for an x86-64-v3 port, and no need for a second package. openSUSE is already on board [2].

[1] https://bbs.archlinux.org/viewtopic.php?id=263371
[2] https://www.phoronix.com/news/openSUSE-TW-x86-64-v3-HWCAPS
