FS#79112 - [hyperscan] need to pass C(XX)FLAGS to build to avoid the -march=native

Attached to Project: Arch Linux
Opened by John (graysky) - Saturday, 15 July 2023, 18:12 GMT
Last edited by Toolybird (Toolybird) - Monday, 17 July 2023, 00:02 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To No-one
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Upstream has hardcoded[1] the use of -march=native, so the instruction set of whichever CPU builds the package gets baked into the package. You can pass the following to force a specific feature level and override that behavior.

--- a/PKGBUILD
+++ b/PKGBUILD
@@ -34,6 +34,8 @@ build() {
     -DCMAKE_INSTALL_PREFIX=/usr \
     -DCMAKE_INSTALL_LIBDIR=lib \
     -DBUILD_SHARED_LIBS=ON \
+    -DCMAKE_C_FLAGS="-march=x86-64-v2" \
+    -DCMAKE_CXX_FLAGS="-march=x86-64-v2" \
     -Wno-dev
   cmake --build build
 }

The plus side is that it should support the widest range of CPUs. The downside is that AVX and AVX2 code paths will not be built into the package. Do you have an opinion about which to go with: more supported hardware or better theoretical performance? Note that a minimum of SSSE3 is required for hyperscan.

For reference:
-march=x86-64-v2: (close to Nehalem) CMPXCHG16B, LAHF-SAHF, POPCNT, SSE3, SSE4.1, SSE4.2, SSSE3
-march=x86-64-v3: (close to Haswell) AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE

Source: https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels
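
As a rough sketch of how the two feature lists above map onto a given machine, the snippet below compares a CPU flag list against them. The function names (has_all_flags, x86_64_level) are made up for illustration, the flag spellings are the ones Linux uses in /proc/cpuinfo (pni for SSE3, abm for LZCNT), and the lists only cover the features named in this report, so treat the result as an approximation rather than a full psABI check.

```shell
# Return 0 if every flag named in $2 appears in the flag list $1.
has_all_flags() {
    for want in $2; do
        case " $1 " in
            *" $want "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Flag names as spelled in /proc/cpuinfo (pni = SSE3, abm = LZCNT).
# These mirror the lists quoted in this report, nothing more.
V2_FLAGS="cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3"
V3_FLAGS="avx avx2 bmi1 bmi2 f16c fma abm movbe xsave"

# Print the highest level (baseline, v2 or v3) that flag list $1 satisfies.
# Assumes the machine is x86-64 to begin with.
x86_64_level() {
    if ! has_all_flags "$1" "$V2_FLAGS"; then
        echo x86-64
    elif ! has_all_flags "$1" "$V3_FLAGS"; then
        echo x86-64-v2
    else
        echo x86-64-v3
    fi
}

# Read the flags of the first CPU; empty if /proc/cpuinfo is unavailable.
cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)
x86_64_level "$cpu_flags"
```

Run on the build host and on a target machine, this gives a quick idea of whether a -march=x86-64-v2 or -v3 build would be usable on both.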

1. https://github.com/intel/hyperscan/blob/master/doc/dev-reference/getting_started.rst

Closed by Toolybird (Toolybird)
Monday, 17 July 2023, 00:02 GMT
Reason for closing: Not a bug
Additional comments about closing: See comments
Comment by loqs (loqs) - Saturday, 15 July 2023, 18:46 GMT
If you add --verbose to cmake --build build are you seeing -march=native being passed to the compiler? If you set FAT_RUNTIME=OFF does cmake configuration then fail with "A minimum of SSSE3 compiler support is required" while with the default FAT_RUNTIME=ON there is no error?
Edit:
With FAT_RUNTIME=OFF does adding the following allow the build to continue?
CFLAGS=${CFLAGS/x86-64/x86-64-v2}
CXXFLAGS=${CXXFLAGS/x86-64/x86-64-v2}
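
For anyone unfamiliar with the syntax, the two lines above use bash pattern substitution. A minimal sketch of what it does, assuming a starting CFLAGS value that merely resembles a makepkg.conf default (the real value on a given system may differ):

```shell
# Hypothetical starting value; the real CFLAGS come from makepkg.conf.
CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe"

# bash: ${var/pattern/replacement} rewrites only the first match of pattern.
CFLAGS=${CFLAGS/x86-64/x86-64-v2}

echo "$CFLAGS"
```

The match is a plain substring match, so a value that already contains x86-64-v3 would turn into x86-64-v2-v3. This is fine for a quick test but worth guarding against in a real PKGBUILD.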
Comment by John (graysky) - Saturday, 15 July 2023, 19:40 GMT
Condition 1. Addition of --verbose
I do not see the native value for my CPU, but I do see several values getting passed, including:
-march=x86-64 and -march=core-avx2, the latter appearing later and thus overriding the standard value; see attachment = hyperscan-5.4.2-1-x86_64-build.log.1.gz
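
The override works because GCC honors the last -march it sees on a command line. To check a line copied from the verbose build log, something like the following could be used; effective_march is a made-up helper name and the sample arguments are illustrative, not taken from the attached log.

```shell
# Print the value of the last -march= option among the arguments,
# which is the one GCC ends up using. Prints an empty line if none is given.
effective_march() {
    last=""
    for arg in "$@"; do
        case $arg in
            -march=*) last=${arg#-march=} ;;
        esac
    done
    printf '%s\n' "$last"
}

# Illustrative compiler arguments with two competing -march values.
effective_march -O2 -march=x86-64 -mtune=generic -march=core-avx2
```

For the sample arguments this prints core-avx2, the later of the two values.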

Condition 2. Addition of -DFAT_RUNTIME=OFF \
Yes, build fails:
-- Building without AVX512VBMI support
CMake Error at cmake/arch.cmake:108 (message):
A minimum of SSSE3 compiler support is required
Call Stack (most recent call first):
CMakeLists.txt:340 (include)

Condition 3. Condition 2 + export CFLAGS=${CFLAGS/x86-64/x86-64-v2} and export CXXFLAGS=${CXXFLAGS/x86-64/x86-64-v2} and --verbose
Yes, build completes. See attachment =
Comment by loqs (loqs) - Saturday, 15 July 2023, 20:17 GMT
I think the additional -march values are set for the optimized library variants [1][2][3], which match the values posted in your first log.
Do you have a system without SSSE3 to test what the current repo package does when the listed minimum support is not present? I no longer have such a system (at least not a working one).
If SSSE3 support is a hard requirement, then the question is whether such a package is allowed.
With FAT_RUNTIME=ON there does not seem to be any benefit to changing the CFLAGS/CXXFLAGS.

[1] https://github.com/intel/hyperscan/blob/v5.4.2/CMakeLists.txt#L1295
[2] https://github.com/intel/hyperscan/blob/v5.4.2/CMakeLists.txt#L1302
[3] https://github.com/intel/hyperscan/blob/v5.4.2/CMakeLists.txt#L1309
Comment by John (graysky) - Saturday, 15 July 2023, 21:26 GMT
Loqs - I don't have older hardware for testing.
Comment by Toolybird (Toolybird) - Saturday, 15 July 2023, 22:46 GMT
Sidenote: It's very easy to emulate older CPUs with QEMU. Docs here [1]

$ qemu-system-x86_64 -cpu help

[1] https://qemu.readthedocs.io/en/latest/system/qemu-cpu-models.html#abi-compatibility-levels-for-cpu-models
Comment by loqs (loqs) - Sunday, 16 July 2023, 01:50 GMT
The hyperscan package only provides a library. The only user of that library is rspamd, which will not use the library unless the CPU supports SSSE3 [1][2].

[1] https://github.com/rspamd/rspamd/blob/3.5/src/libutil/multipattern.c#L74
[2] https://github.com/intel/hyperscan/blob/v5.4.2/src/hs_valid_platform.c#L39
Comment by Toolybird (Toolybird) - Monday, 17 July 2023, 00:02 GMT
Thanks for the research, @loqs. -march=native is definitely not used in our build, so there is no bug. The docs mention that "ifunc" is leveraged for the Fat Runtime, which is perfect for this kind of thing.
