FS#78892 - [linux-firmware] Please switch to zstd compressed firmware

Attached to Project: Arch Linux
Opened by Emil (xexaxo) - Monday, 26 June 2023, 12:17 GMT
Last edited by Laurent Carlier (lordheavy) - Sunday, 09 July 2023, 09:56 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Laurent Carlier (lordheavy)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:

With v20230625 my firmware compression patches have landed. Now we have upstream support for xz and zstd compressed firmware and can drop our local patch.

The kernel has supported zstd compressed firmware since 5.19, and all the kernels supported in Arch are much newer.

In terms of numbers, zstd default vs xz:
- size has gone up slightly - 408M vs 358M
- compression speed (in case we care) - 27.27s user 7.31s system 104% cpu 33.218 total, vs 214.19s user 26.09s system 99% cpu 4:00.43 total
- decompression speed - should be much faster, but I didn't measure it

If the increased size is a concern, we can use ZSTD_CLEVEL=19. Then the numbers change to:
- size is down to 378M
- compression speed is up (but still faster than xz) to 166.09s user 14.83s system 100% cpu 3:00.03 total

Note: the env. variable does not support ultra levels & I'm not sure if the kernel supports those.
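
To put the figures above side by side, here is a small Python sketch of the arithmetic. It assumes that 408 and 358 are in the same unit as the 378M figure (the report does not say so explicitly) and it uses the wall-clock "total" times quoted above. The function names are only illustrative.

```python
# Relative size and wall-clock compression time, using the figures quoted above.
# Assumption: all three sizes share one unit (the report only labels 378 as "M").

def percent_larger(size, baseline):
    """Return how much larger `size` is than `baseline`, in percent."""
    return (size / baseline - 1.0) * 100.0


def speedup(baseline_seconds, seconds):
    """Return how many times faster `seconds` is than `baseline_seconds`."""
    return baseline_seconds / seconds


XZ_SIZE, ZSTD_DEFAULT_SIZE, ZSTD_19_SIZE = 358, 408, 378

# Wall-clock totals; 4:00.43 and 3:00.03 are converted to seconds.
XZ_TIME = 4 * 60 + 0.43
ZSTD_DEFAULT_TIME = 33.218
ZSTD_19_TIME = 3 * 60 + 0.03

if __name__ == "__main__":
    for label, size, seconds in (
        ("zstd default", ZSTD_DEFAULT_SIZE, ZSTD_DEFAULT_TIME),
        ("zstd level 19", ZSTD_19_SIZE, ZSTD_19_TIME),
    ):
        print(
            f"{label}: {percent_larger(size, XZ_SIZE):.1f}% larger than xz, "
            f"{speedup(XZ_TIME, seconds):.1f}x faster to compress"
        )
```

On these numbers the default level costs roughly 14% in size for a compression run about 7x shorter, while level 19 narrows the size gap to under 6% and is still somewhat faster than xz.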


Additional info:
* package version(s) 20230404.2e92a49f-1

Steps to reproduce:
- update the PKGBUILD to the v20230625 release or later, drop our xz patch
- update the PKGBUILD to switch to zst compression
- mkinitcpio -P
- enjoy the faster boot times - lolz much speed, in practise a few us

Closed by  Laurent Carlier (lordheavy)
Sunday, 09 July 2023, 09:56 GMT
Reason for closing:  Implemented
Additional comments about closing:  linux-firmware-20230625.ee91452d-4
Comment by Piotr Górski (sir_lucjan) - Friday, 30 June 2023, 10:50 GMT
I can confirm that it works quite well.
Comment by Laurent Carlier (lordheavy) - Friday, 30 June 2023, 12:24 GMT
linux-firmware-20230625.ee91452d-2 is now in testing
Comment by Andreas Radke (AndyRTR) - Friday, 30 June 2023, 18:56 GMT
-2 version fails to boot linux-lts 6.1.36 here. Going back to -1 and rebuilding the initcpios makes it boot again. So something in the -lts kernel seems to be missing. Mainline kernel boots well with -2 here. Tested using booster and mkinitcpio here.
Comment by freswa (frederik) - Saturday, 01 July 2023, 12:25 GMT
Same for the default arch kernel. Fails to init (AMD) GPU on boot. Thus X and Wayland fail to start.
Rolling back to -1 fixes the issue.

```
amdgpu 0000:04:00.0: amdgpu: Fetched VBIOS from VFCT
amdgpu: ATOM BIOS: 113-REMBRANDT-X37
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/yellow_carp_toc.bin failed with error -2
[drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <psp> failed -19
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/yellow_carp_dmcub.bin failed with error -2
[drm:dm_early_init [amdgpu]] *ERROR* DMUB firmware loading failed: -19
[drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <dm> failed -19
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/yellow_carp_pfp.bin failed with error -2
[drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <gfx_v10_0> failed -19
[drm] VCN(0) decode is enabled in VM mode
[drm] VCN(0) encode is enabled in VM mode
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/yellow_carp_vcn.bin failed with error -2
[drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <vcn_v3_0> failed -19
[drm] JPEG decode is enabled in VM mode
amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
amdgpu 0000:04:00.0: amdgpu: amdgpu: finishing device.
```
Comment by Laurent Carlier (lordheavy) - Saturday, 01 July 2023, 13:03 GMT
linux-firmware-20230625.ee91452d-3 is now in testing with compression reverted to xz
Comment by Laurent Carlier (lordheavy) - Saturday, 01 July 2023, 14:49 GMT
This bug could be related to zstd support missing in our dracut package:

https://github.com/dracutdevs/dracut/commit/9d8387ed803dfc3e8b97d2e415a15083774d7ac6
Comment by Mike Cloaked (mcloaked) - Saturday, 01 July 2023, 15:09 GMT
With amd-ucode 20230625.ee91452d-3 as well as the -3 linux-firmware packages released today, my issue at https://bugs.archlinux.org/task/78939 is now resolved. Yes I am using the arch dracut package, so the comment about missing zstd support is relevant. It is also noted that the dracut package in [extra] is a rather old version (056) and the current upstream version (059) does have zstd support.
Comment by loqs (loqs) - Saturday, 01 July 2023, 15:52 GMT
@mcloaked if you rebuild dracut with the attached diff applied does that add zstd firmware support?
Comment by Mike Cloaked (mcloaked) - Saturday, 01 July 2023, 16:07 GMT
Thanks, loqs, though I have access to a privately built version of dracut that does have zstd support at version 059, which will be the easier route for me to be up to date with dracut.
Comment by loqs (loqs) - Saturday, 01 July 2023, 16:12 GMT
@mcloaked unfortunately Arch can not update to 059 as it is not signed. So with respect to Arch a backport is required.
Edit:
Should also note booster appears to only support xz compression https://github.com/anatol/booster/blob/0.10/generator/generator.go#L309
Comment by Mike Cloaked (mcloaked) - Saturday, 01 July 2023, 16:29 GMT
Thanks - and yes, shame about upstream not signing dracut ( https://github.com/dracutdevs/dracut/issues/1850 )
Comment by Laurent Carlier (lordheavy) - Saturday, 01 July 2023, 17:53 GMT
@loqs It should be a one line patch for booster to support zstd
Comment by Toolybird (Toolybird) - Saturday, 01 July 2023, 22:56 GMT
Lowering severity now that revert is implemented.
Comment by Emil (xexaxo) - Sunday, 02 July 2023, 10:31 GMT
Hmm, something very strange is happening. Let me add some background information, for posterity's sake:

The kernel can directly decompress zstd firmware since v5.19, so mkinitcpio/dracut/booster do not need to decompress it. At a glance none of them do; they just need to suffix every `modinfo -F firmware amdgpu` entry with .zst when adding it to the initrd. There is one caveat - the AMD CPU microcode cannot be decompressed when loaded early (most likely a kernel bug), but it can be when loaded "late". As such a) those files are not compressed and b) at least for mkinitcpio, we use a separate "early" initrd image.
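
As a rough illustration of the suffixing described above, here is a minimal Python sketch of how an initrd tool could map firmware names (as reported by `modinfo -F firmware <module>`) to the files actually present on disk. This is not the code of mkinitcpio, dracut or booster; the helper names and the suffix order are assumptions made for the example.

```python
import os
import tempfile

# Assumed lookup order for this sketch: plain file first, then compressed
# variants. A real initrd tool may use a different order or set of suffixes.
SUFFIXES = ("", ".zst", ".xz")


def resolve_firmware(name, firmware_dir):
    """Return the on-disk path for a firmware name, or None if it is absent."""
    for suffix in SUFFIXES:
        candidate = os.path.join(firmware_dir, name + suffix)
        if os.path.isfile(candidate):
            return candidate
    return None


def collect_for_initrd(names, firmware_dir):
    """Split firmware names into (paths to copy as-is, names not found)."""
    found, missing = [], []
    for name in names:
        path = resolve_firmware(name, firmware_dir)
        if path is None:
            missing.append(name)
        else:
            found.append(path)
    return found, missing


if __name__ == "__main__":
    # Demo against a throwaway directory instead of /usr/lib/firmware.
    with tempfile.TemporaryDirectory() as fw_dir:
        os.makedirs(os.path.join(fw_dir, "amdgpu"))
        with open(os.path.join(fw_dir, "amdgpu", "yellow_carp_toc.bin.zst"), "wb"):
            pass
        found, missing = collect_for_initrd(
            ["amdgpu/yellow_carp_toc.bin", "amdgpu/yellow_carp_pfp.bin"], fw_dir
        )
        print("copy:", found)
        print("missing:", missing)
```

A tool that only knows about the `.xz` suffix would put the `.zst` file in the "missing" list and leave it out of the initrd, which would match the "file is missing" errors reported in this task.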

Looking at the `failed with error -2` messages, those are "file is missing". Although if you can share more details (below) that would be appreciated:

- kernel package and version (FS78939 shows 6.4.0-stable-1 which is not an Arch pkg IIRC)
- initrd tool package and version - mkinitcpio, dracut, booster, other
- are you seeing other non "failed with error -2" firmware messages - please attach full journal/dmesg
- drop drm/amdgpu from the initrd, do things work - for mkinitcpio omit the kms hook, for others see their documentation
- usually you don't need drm modules in early boot - the UEFI splash is displayed, until X/Wayland session kicks in
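
The `failed with error -2` lines can also be decoded mechanically. A minimal Python sketch, assuming Linux errno numbering (where 2 is ENOENT and 19 is ENODEV) and the message format seen in the log earlier in this task; the function name is made up for the example.

```python
import errno
import os
import re

# Matches the firmware loader message format seen in the log in this task.
FW_FAIL = re.compile(r"Direct firmware load for (\S+) failed with error (-?\d+)")


def missing_firmware(log_text):
    """Return a list of (firmware name, errno symbol, description) tuples."""
    results = []
    for match in FW_FAIL.finditer(log_text):
        code = abs(int(match.group(2)))  # the kernel logs negative errno values
        symbol = errno.errorcode.get(code, "UNKNOWN")
        results.append((match.group(1), symbol, os.strerror(code)))
    return results


if __name__ == "__main__":
    sample = (
        "amdgpu 0000:04:00.0: Direct firmware load for "
        "amdgpu/yellow_carp_toc.bin failed with error -2\n"
        "amdgpu 0000:04:00.0: Direct firmware load for "
        "amdgpu/yellow_carp_dmcub.bin failed with error -2\n"
    )
    for name, symbol, text in missing_firmware(sample):
        print(f"{name}: {symbol} ({text})")
```

Feeding it a full `dmesg` or journal dump gives the list of firmware files the kernel could not find, which can then be compared against the contents of the initrd.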

Thanks in advance
Comment by Alex (BS86) - Sunday, 02 July 2023, 10:41 GMT
@loqs The dracut devs say it is signed and have closed the bug report:
https://github.com/dracutdevs/dracut/issues/1850#issuecomment-1364554967
https://github.com/dracutdevs/dracut/issues/1850#issuecomment-1364692255

It apparently is not the key that Arch expects.
https://github.com/dracutdevs/dracut/issues/1850#issuecomment-1428280936

Arch is waiting for a closed bug to be dealt with - which in my experience won't happen because it is closed.
https://github.com/dracutdevs/dracut/pull/2101
Comment by Emil (xexaxo) - Sunday, 02 July 2023, 11:13 GMT
Polishing the dracut development process is nice, but outside of scope, and outside of our power AFAICT.

That said, Laurent has already backported the patch to dracut 056-3. Thank you

For booster - I've opened an upstream MR and an Arch tracking bug https://bugs.archlinux.org/task/78947. Please give it a test and comment.
Comment by Gene (GeneC) - Sunday, 02 July 2023, 13:54 GMT
Question - how useful is each compression? I made a test to help answer the question - attached here in case of interest.
To run give it a list of files or a directory.


The tradeoff, as usual, is disk space vs speed. If compression is significant enough, then reading compressed files and decompressing them (in memory) can even be faster than reading an uncompressed file. The latter is the win/win scenario. I did not see this latter case in my tests.

The numbers speak for themselves. It seems clear that zst is a win for both kernel modules and firmware. When there are a few files, like firmware, then the size benefit seems to outweigh the time penalty for xz as well. When loading many files zst still looks decent. We're already using zst for kernel modules.

Here are some results from the test program, comparing the time to read, or to read plus decompress when required:

Test 1:
--------
single file - iwlwifi-ty-a0-gf-a0-81.ucode
This is only 1 file and 1 run, so the numbers are less precise of course:

Summary
no : 0.001 secs 1.55 MB
xz : 0.037 secs 522.74 KB
zst : 0.002 secs 632.96 KB

Test 2:
--------
Copy all files from /usr/lib/firmware and repeat test on all files (I have about 250 files)

Summary
no : 0.050 secs 324.46 MB
xz : 3.471 secs 50.13 MB
zst : 0.187 secs 60.00 MB
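
For a quick comparison of the Test 2 summary, the ratios can be worked out with a few lines of Python. This only rearranges the numbers quoted above (with "no" meaning no compression); the `summarize` helper is hypothetical and not part of the attached test program.

```python
# Figures from the "Test 2" summary above:
# (seconds to read, plus decompress where needed; total size in MB).
RESULTS = {
    "no": (0.050, 324.46),
    "xz": (3.471, 50.13),
    "zst": (0.187, 60.00),
}


def summarize(results, baseline="no"):
    """Express each entry's size and time relative to the uncompressed case."""
    base_time, base_size = results[baseline]
    summary = {}
    for name, (seconds, size_mb) in results.items():
        summary[name] = {
            "size_pct": size_mb / base_size * 100.0,  # percent of original size
            "time_x": seconds / base_time,  # multiple of uncompressed read time
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(RESULTS).items():
        print(f"{name:>3}: {row['size_pct']:5.1f}% of original size, "
              f"{row['time_x']:5.1f}x the uncompressed read time")
```

On these figures zst keeps most of the size reduction of xz (roughly 18% vs 15% of the original size) while taking about 4x the uncompressed read time, where xz takes about 70x.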


Comment by Emil (xexaxo) - Sunday, 02 July 2023, 18:23 GMT
Thanks for the numbers, Gene, although I don't think (m)any maintainers doubt the zstd decision. They all know the benefits based on the zstd packages proposal a while ago [1].

As an executive summary - at the highest levels, zstd compressed files are within a 1-2% margin of xz compressed ones, while being 4x-18x faster to decompress.

Some notable caveats, that not many people are aware of:
- the decompression happens in the kernel - plus I'm involved in some WIP work, to also handle the modules
- the kernel supports levels up to 19, no ultra levels - size is 5-7% larger than xz, decompression speed is unchanged for all levels, decompression memory is the same as for xz
- the in-kernel decompression is not identical to the userspace one - although the userspace numbers are a close enough representation
- drivers issue multiple _sequential_ _blocking_ firmware requests, directly affecting the driver load and ultimately boot times - amdgpu, nouveau and intel GPU drivers are the most prominent examples

[1] https://lists.archlinux.org/pipermail/arch-dev-public/2019-March/029542.html
Comment by Alf (fractalf) - Sunday, 02 July 2023, 19:29 GMT
Hey guys, just also wanted to report that "linux-firmware-20230625.ee91452d-1-any.pkg.tar.zst" and everything after (-2 and -3) still breaks here.
I have to revert to "linux-firmware-20230404.2e92a49f-1-any.pkg.tar.zst" to get X loading.

I'm on EndeavourOS with kernel 6.4.1-arch1-1 (but 6.3.9 has the same problem).

Comment by loqs (loqs) - Sunday, 02 July 2023, 19:54 GMT
@fractalf that is not related to this issue please see https://gitlab.freedesktop.org/drm/amd/-/issues/2666
Comment by Alf (fractalf) - Sunday, 02 July 2023, 20:22 GMT
@loqs ah sorry, and thank you for pointing me to the correct issue!
Comment by Laurent Carlier (lordheavy) - Saturday, 08 July 2023, 09:51 GMT
linux-firmware-20230625.ee91452d-4 is now in testing
Comment by Gene (GeneC) - Saturday, 08 July 2023, 12:13 GMT
With:
- intel cpu and gpu (i.e. not using amdgpu) - meaning the revert from git head doesn't impact me
- dracut 059 (thanks for getting this one out @grazzolini :) )

The zst compressed firmware is working fine.
Added ack to testing signoffs.

Thanks!
Comment by Andreas Radke (AndyRTR) - Sunday, 09 July 2023, 08:15 GMT
Works for me on two systems with amdgpu and booster. Thx.
