FS#74023 - [mac] Build with optimisation flags (and optionally AVX2)
Attached to Project:
Community Packages
Opened by Valérian Sibille (Dakeryas) - Friday, 04 March 2022, 00:09 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:08 GMT
Details
Description: The packaged build compresses four times slower
than an AVX2 build on my machine (i7-7700HQ). The original
author already added AVX2 instructions in version 6.50, so I
guess it would make sense to at least add `-march=native` to
the build flags, so that AVX2 would be enabled on any
machine that supports it. Even just including the -O2 flag
(which is not included in the author's Makefile, although it
uses CXXFLAGS correctly if set) shrinks the compression time
by a factor of three.
Steps to reproduce: Compare the WAV to APE conversion runtime of a manual build using `export CXXFLAGS="${CXXFLAGS} -O3 -march=native"` and GCC 11 with the runtime of the executable from the package.
Closed by Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:08 GMT
Reason for closing: Moved
Additional comments about closing: https://gitlab.archlinux.org/archlinux/packaging/packages/mac/issues/1
Arch's default CFLAGS and CXXFLAGS already include -O2
You have a fair point. I guess it would then entail making two packages (mac and mac-avx2) and two builds, i.e. adding a build with an explicit "-mavx2" flag for distribution.
I have git-applied your patch, but the compression speeds are the same as with only the O2 flag. In other words, AVX2 is not used. I built the code with:
make -C Source/Projects/NonWindows clean all CXXFLAGS='-O2'
Looking further at the code, it seems that the GetMMXAvailable function is only called in the APE decompression code (file Source/MACLib/Old/APEDecompressCore.cpp), so I assume this is expected...
Please try the updated test.patch attached to this comment.
Patching that as well then revealed a much more significant issue. GCC and Clang only make the wrapper functions for AVX and SSE2 intrinsics available when the relevant feature switch is passed.
MSVC allows their use even when the target architecture does not enable them, so the code compiles there but the build fails with GCC and Clang. This looks like it is impossible to fix before x86_64v3.
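To illustrate the constraint, GCC and Clang also accept the feature switch per function, via a `target` attribute, and the result can be combined with a runtime check. The sketch below is an assumption-laden example and not the MAC code: `sum_scalar`, `sum_avx2`, `sum_dispatch` and the `HAVE_X86_DISPATCH` macro are hypothetical, and the thread does not say whether this approach was considered or would fit the upstream sources.

```cpp
#include <cstddef>
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1  // hypothetical macro, used only in this sketch
#endif

// Portable baseline; unsigned arithmetic wraps instead of overflowing.
std::uint32_t sum_scalar(const std::uint32_t* data, std::size_t n) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

#ifdef HAVE_X86_DISPATCH
// The attribute enables AVX2 for this one function, so the intrinsics
// compile even though the translation unit is built without -mavx2.
__attribute__((target("avx2")))
std::uint32_t sum_avx2(const std::uint32_t* data, std::size_t n) {
    __m256i acc = _mm256_setzero_si256();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i v = _mm256_loadu_si256(
            reinterpret_cast<const __m256i*>(data + i));
        acc = _mm256_add_epi32(acc, v);  // eight 32-bit adds per step
    }
    alignas(32) std::uint32_t lanes[8];
    _mm256_store_si256(reinterpret_cast<__m256i*>(lanes), acc);
    std::uint32_t total = 0;
    for (std::uint32_t lane : lanes) total += lane;
    for (; i < n; ++i) total += data[i];  // leftover elements
    return total;
}
#endif

// Picks the AVX2 routine only when the running CPU supports it.
std::uint32_t sum_dispatch(const std::uint32_t* data, std::size_t n) {
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) return sum_avx2(data, n);
#endif
    return sum_scalar(data, n);
}
```

Without the attribute (or `-mavx2`), GCC and Clang reject the `_mm256_*` calls at compile time, which matches the build failure described above.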
[1] https://bbs.archlinux.org/viewtopic.php?id=263371
[2] https://www.phoronix.com/news/openSUSE-TW-x86-64-v3-HWCAPS