FS#75104 - [zstd] Arch build of zstd does not scale with thread count due to the build system it uses
Attached to Project:
Arch Linux
Opened by Arvid Norlander (VorpalGun) - Saturday, 18 June 2022, 07:13 GMT
Last edited by Levente Polyak (anthraxx) - Tuesday, 21 February 2023, 18:53 GMT
Details
Description:
After reading a recent Phoronix benchmark (see the reference at the bottom), I decided to investigate why Arch Linux was so much slower (10-20x) for zstd performance. Here is what I found, and hopefully this can help improve the performance in the future!

Zstd appears to have more than one build system supported by upstream. Relevant to Linux are:
* Plain Makefile (what Ubuntu uses to build)
* CMake (what Arch uses to build)
* Meson (not sure who uses this, but I tested it for completeness)

It turns out that with the CMake and Meson build systems (both without any options and with the options Arch uses) there is *negative* scaling between -T1 (one thread) and -T6 (the number of cores my computer has). With the plain Makefile, however, the expected positive scaling with the number of threads exists.

I included the full performance analysis in the upstream bug report linked below, but depending on the outcome of that bug, I suggest Arch might want to change which build system it uses.

Additional info:
* package version(s): 1.5.2-7
* link to upstream bug report: https://github.com/facebook/zstd/issues/3163

Steps to benchmark:
* To benchmark, use path/to/zstd/binary -T<num threads> -b4 <path to large file>
* For the large file I used the FreeBSD USB stick image, as this is what Phoronix uses. Phoronix uses an older version, for which I could not find the download link, but the same general pattern can be reproduced with the current version.

References:
* https://www.phoronix.com/scan.php?page=article&item=hp-devone-linux&num=3 (a bit down the page)
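The benchmark steps above can be sketched as a small shell helper that runs the same -b4 benchmark once per thread count. This is only an illustration, not the script used for the report: the function name bench_scaling, its argument order, and the DRY_RUN switch are all made up here. Only the -T<num threads> and -b4 options come from the description.

```shell
#!/bin/sh
# Sketch only: bench_scaling and DRY_RUN are hypothetical names,
# not part of zstd or of the original report.
# Usage: bench_scaling <zstd binary> <large file> <max threads>
bench_scaling() {
    zstd_bin=$1
    file=$2
    max_threads=$3
    t=1
    while [ "$t" -le "$max_threads" ]; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Only print the command that would be run.
            echo "$zstd_bin -T$t -b4 $file"
        else
            echo "== threads: $t =="
            # -T sets the number of worker threads, -b4 benchmarks level 4.
            "$zstd_bin" "-T$t" -b4 "$file"
        fi
        t=$((t + 1))
    done
}
```

For example, bench_scaling /usr/bin/zstd path/to/large/file 6 would run the benchmark for 1 through 6 threads. Setting DRY_RUN=1 first prints the commands without running them, which is handy for checking the loop before a long run.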
Closed by Levente Polyak (anthraxx)
Tuesday, 21 February 2023, 18:53 GMT
Reason for closing: Fixed
Additional comments about closing: 1.5.2-8
https://github.com/facebook/zstd/blob/dev/build/cmake/CMakeModules/AddZstdCompilationFlags.cmake#L28
Removing it makes the cmake-build speed similar to the plain-make one (which is still faster with -T1 vs -T6 on my machine)
I will confirm that information then append it to the upstream bug report.
I don't know why make would be slow for you though. I literally just downloaded the upstream release tarball and built it with make, no CFLAGS/CXXFLAGS set. When talking on IRC with another person who had a 10 core CPU they said it only scaled up to 7 threads for them and then started going down again, so there may be memory bandwidth and/or cache effects to take into account. Consider checking if you have a "peak" like that.
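One way to check for such a peak, assuming you have already collected one "threads speed" pair per line (this input format is an assumption for illustration, not zstd's own output format), is a small awk filter. The function name find_peak is hypothetical.

```shell
#!/bin/sh
# Sketch only: find_peak is a hypothetical helper.
# Reads lines of the form "<threads> <speed>" on stdin and prints
# the thread count with the highest speed, followed by that speed.
# On ties, the first (lowest) thread count seen wins.
find_peak() {
    awk 'NF >= 2 && (!seen || $2 + 0 > best) {
             best = $2 + 0; peak = $1; seen = 1
         }
         END { if (seen) print peak " " best }'
}
```

If the reported thread count is lower than the number of cores, throughput stopped scaling before all cores were in use, which would match the behaviour described above.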
I am the aforementioned "another person".
CPU: i7-12700KF
RAM: 64GB DDR4 3200 MT/s, dual channel, 4 DIMMs
CPU pinned to 5GHz
Governor set to performance
Each test was run twice, with the first run discarded for thermal reasons
Test file: archlinux-2022.04.01-x86_64.iso
zstd packages:
core/zstd (cmake) v1.5.2
upstream zstd (make) v1.5.2 (built with makepkg; I can provide the PKGBUILD if wanted)
See the attached results file. You can search for specific test results in the format
NumThreads:CompressionLevel:Binary
Test script:
#!/usr/bin/zsh
Cores=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20)
Levels=(1 2 3 4 5 6 7 8)
Bins=(/usr/bin/zstd ~/git/zstd/pkg/zstd/usr/bin/zstd)
echo "Cores:Level:Bin"
for Level in $Levels
do
    for Core in $Cores
    do
        for Bin in $Bins
        do
            echo "$Core:$Level:$Bin"
            time $Bin -T$Core -b$Level archlinux-2022.04.01-x86_64.iso
            echo ""
        done
    done
done
If Arch maintainers don't want to constantly debug issues that nobody else is paying attention to, then maybe they should switch back to something actually supported.
[1] https://github.com/facebook/zstd/issues/3163#issuecomment-1159627324
[2] https://github.com/facebook/zstd/issues/2261
2. If pcsx2 does not support pkg-config, isn't that worth filing an upstream feature request about? I know for sure that cmake does have support for pkg-config (I have used that myself).