FS#80297 - [libxml2] 2.12.0-1 causes shared-mime-info hook flakiness
Attached to Project:
Arch Linux
Opened by Toolybird (Toolybird) - Monday, 20 November 2023, 08:00 GMT
Last edited by Toolybird (Toolybird) - Thursday, 23 November 2023, 21:54 GMT
Details
I'm seeing some intermittent flakiness when running fresh
installs. I can reproduce it quite easily inside a
systemd-nspawn container as per below:
( 4/13) Updating the MIME type database...
Error in type 'application/x-core' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
Error in type 'image/jp2' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
Error in type 'image/jpx' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
Error in type 'image/jpm' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
Error in type 'video/mj2' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
Error in type 'image/vnd.adobe.photoshop' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
If I downgrade the container to libxml2-2.11.5-1, it doesn't happen.
Steps to reproduce:
$ mkdir /tmp/foo
$ sudo pacstrap -cK /tmp/foo
$ sudo systemd-nspawn -D /tmp/foo
(now inside the container)
# pacman -S gtk3
(if it doesn't repro first go - rinse, repeat until it does)
# pacman -Rs gtk3
# pacman -S gtk3
I haven't reported upstream (yet... not sure where to start..)
Any ideas? Can you repro?
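The "rinse, repeat" step could be automated along these lines. This is only a sketch, assuming it is run as root inside the container; the helper names (hook_failed, count_failures, reinstall_gtk3) are made up for illustration, and --noconfirm is added so that pacman does not prompt.

```python
import subprocess
from typing import Callable

# Substring taken from the hook output quoted in the report.
ERROR_MARKER = "Mask is longer than value"


def hook_failed(output: str) -> bool:
    """Return True if the MIME database hook printed the error."""
    return ERROR_MARKER in output


def count_failures(run_once: Callable[[], str], attempts: int) -> int:
    """Call run_once() `attempts` times and count the runs showing the error."""
    return sum(hook_failed(run_once()) for _ in range(attempts))


def reinstall_gtk3() -> str:
    """Remove and reinstall gtk3, returning pacman's combined output.

    Assumption: executed as root inside the systemd-nspawn container.
    """
    output = ""
    for operation in ("-Rs", "-S"):
        result = subprocess.run(
            ["pacman", operation, "--noconfirm", "gtk3"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        output += result.stdout
    return output
```

Calling count_failures(reinstall_gtk3, 20) would then report how many of 20 reinstall cycles showed the error.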
Closed by Toolybird (Toolybird)
Thursday, 23 November 2023, 21:54 GMT
Reason for closing: Fixed
Additional comments about closing: libxml2-2.12.1-1 in [core-testing]
$ git bisect good
4a513d5667d7690998f01b9048c56c4f1f50f6a5 is the first bad commit
commit 4a513d5667d7690998f01b9048c56c4f1f50f6a5
Author: Nick Wellnhofer <wellnhofer@aevum.de>
Date: Sat Sep 16 19:12:25 2023 +0200
hash: Rewrite hash table code
This is a complete rewrite of the code in hash.c
Move from a chained hash table implementation to open addressing with
Robin Hood probing. This allows to increase the maximum fill factor and
further reduce the growth factor, saving considerable amounts of memory
without sacrificing performance.
To make this work, hash values are now cached in the table entry
also avoiding many key comparisons.
Tables are created lazily with a smaller minimum size.
Insertion functions now report an error if growing the table resulted in
a memory allocation failure.
Some string comparisons were optimized to call directly into libc
instead of using the xmlstring API.
The length of inserted keys is computed along with the hash improving
allocation performance.
Bounds checking was made more robust.
In dictionary-based mode, unneeded interning of strings is avoided.
Copyright | 2 +-
hash.c | 1699 ++++++++++++++++++++++++++++++-------------------------------
2 files changed, 830 insertions(+), 871 deletions(-)
$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [5e9b167dce73bd6a804ab107ae4c4b95e6849597] Release v2.12.0
git bisect bad 5e9b167dce73bd6a804ab107ae4c4b95e6849597
# status: waiting for good commit(s), bad commit known
# good: [2b998a4ffbdfea04fc6a620721abc690a15743af] Release v2.11.5
git bisect good 2b998a4ffbdfea04fc6a620721abc690a15743af
# good: [f296934ade688baab79caf1c62a82149ad78accf] Release v2.11.0
git bisect good f296934ade688baab79caf1c62a82149ad78accf
# good: [4e1c13ebfd1514e066bc1e816fa8f9c2125bce58] debug: Remove debugging code
git bisect good 4e1c13ebfd1514e066bc1e816fa8f9c2125bce58
# good: [42a0bc6d96ea2d2861178ebec98123a94008b94e] tests: Add ATTRIBUTE_NO_SANITIZE_INTEGER macro
git bisect good 42a0bc6d96ea2d2861178ebec98123a94008b94e
# bad: [36374bc9fcf6e670dc9521ac032474066521858b] parser: Fix error handling in xmlLoadEntityContent
git bisect bad 36374bc9fcf6e670dc9521ac032474066521858b
# bad: [61e29b6949c8878fbc20e46248d631b336e8bcc1] malloc-fail: Grow hash tables before making allocations
git bisect bad 61e29b6949c8878fbc20e46248d631b336e8bcc1
# bad: [a873191cd259a0e0c16c82026c1b4b11ed968966] parser: Introduce xmlParseQNameHashed
git bisect bad a873191cd259a0e0c16c82026c1b4b11ed968966
# bad: [4a513d5667d7690998f01b9048c56c4f1f50f6a5] hash: Rewrite hash table code
git bisect bad 4a513d5667d7690998f01b9048c56c4f1f50f6a5
# good: [4f221a774896fcb5a9dd5c270c5de52b2ba0a45a] hash: Add hash table tests
git bisect good 4f221a774896fcb5a9dd5c270c5de52b2ba0a45a
# first bad commit: [4a513d5667d7690998f01b9048c56c4f1f50f6a5] hash: Rewrite hash table code
I have no idea whether this is a bug in libxml2 or whether it revealed an underlying issue in shared-mime-info.
Did you take that into account with the bisection?
That does make testing very problematic. With libxml2-2.11.0+r188+g4a513d56-1 I can reproduce the issue perhaps one time in five. With libxml2-2.11.0+r187+g4f221a77-1 I have tried fifty times and still not triggered it.
I cannot cleanly revert 4a513d5667d7690998f01b9048c56c4f1f50f6a5 to cross-check. The error message, though, seems to be correct, as the mask is longer than the value [1][2][3][4][5][6].
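To put the fifty clean runs in perspective: if the r187 build had the same one-in-five failure rate and the runs were independent (both of which are assumptions, not measurements), the chance of seeing no failure at all would be tiny. A quick sketch of that arithmetic, with a made-up function name:

```python
def prob_no_failures(failure_rate: float, attempts: int) -> float:
    """Probability that `attempts` independent runs all succeed."""
    return (1.0 - failure_rate) ** attempts


# One failure in five, fifty attempts: 0.8 ** 50
p = prob_no_failures(1 / 5, 50)
print(f"{p:.2e}")  # about 1.43e-05
```

So fifty clean runs are strong evidence that r187 is genuinely good, rather than just lucky.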
[1]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/data/freedesktop.org.xml.in?ref_type=tags#L1878
[2]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/data/freedesktop.org.xml.in?ref_type=tags#L5224
[3]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/data/freedesktop.org.xml.in?ref_type=tags#L5234
[4]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/data/freedesktop.org.xml.in?ref_type=tags#L5244
[5]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/data/freedesktop.org.xml.in?ref_type=tags#L5254
[6]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/data/freedesktop.org.xml.in?ref_type=tags#L5938
I've just arbitrarily checked 'image/jp2', the 2nd of the 6 'Error in type' messages, which is reported on my laptop as well (error output [1]),
against the source of freedesktop.org.xml, linked by both @loqs and me here as [2].
EDIT (minor corr.): Both the mask and the value are 24 octets (bytes) long,
both in the gitlab.freedesktop.org/xdg/shared-mime-info repo [2] and locally on my computer.
Therefore the 'match' element of <mime-type type="image/jp2"> is sound.
Based on this, I can only conclude that this is a regression in the libxml2 2.12.0 release compared to the earlier 2.11.5 tag.
[1]:
> (2/6) Updating the MIME type database...
> Error in type 'application/x-core' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
> Error in type 'image/jp2' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
> Error in type 'image/jpx' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
> Error in type 'image/jpm' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
> Error in type 'video/mj2' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
> Error in type 'image/vnd.adobe.photoshop' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask is longer than value.
[2]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/data/freedesktop.org.xml.in?ref_type=tags#L5224
Error in type 'application/x-core' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask length 12 is longer than value length 12.
Error in type 'image/jp2' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask length 34 is longer than value length 34.
Error in type 'image/jpx' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask length 34 is longer than value length 34.
Error in type 'image/jpm' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask length 34 is longer than value length 34.
Error in type 'video/mj2' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask length 34 is longer than value length 34.
Error in type 'image/vnd.adobe.photoshop' (in /usr/share/mime/packages/freedesktop.org.xml): Error in <match> element: Mask length 18 is longer than value length 18.
The check [1] tests whether the mask length is greater than or equal to the value length, whereas the error message implies the check should only be for greater than?
This does not explain why the libxml2 change introduced this issue or why it is non-deterministic.
Edit:
That only causes the loop to read one more byte.
[1]: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/2.4/src/update-mime-database.cpp?ref_type=tags#L1317
the actual 'mask' and decoded 'value' field lengths inside each 'match' tag are:
- 'application/x-core': 17 chars (17 octets), not 12
- 'image/vnd.adobe.photoshop': 10 chars (10 octets), not 18
- and for all the other implicated types: 24 chars (octets), not 34.
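For anyone wanting to re-count these lengths, a rough sketch follows. It assumes the 'value' attribute uses C-style escapes and the 'mask' attribute is a 0x-prefixed hex string; the sample strings are invented for illustration and are not copied from freedesktop.org.xml, and Python's escape rules may differ in corner cases from the parser in update-mime-database.

```python
def decoded_value_length(raw: str) -> int:
    """Length in octets of a string value after decoding C-style escapes."""
    decoded = raw.encode("latin-1").decode("unicode_escape")
    return len(decoded.encode("latin-1"))


def mask_length(mask: str) -> int:
    """Length in octets of a 0x-prefixed hexadecimal mask."""
    if not mask.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex mask")
    return len(bytes.fromhex(mask[2:]))


# Hypothetical sample data, not taken from the real MIME database.
sample_value = r"AB\x00\x0cXY\r\n"
sample_mask = "0xffff0000ffffffff"

print(len(sample_value))                   # 16 characters as written in the XML
print(decoded_value_length(sample_value))  # 8 octets once escapes are decoded
print(mask_length(sample_mask))            # 8 octets
```

The point is that the raw attribute text is longer than the decoded value, so lengths have to be compared after decoding.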
The trouble is, I'm not seeing an easy reproducer for upstream to digest in a bug report. Although, I had the idea to rebuild "shared-mime-info" against the latest libxml2...but now its test suite fails with a similar looking issue...so I guess "s-m-i" is now technically FTBFS...
https://gitlab.gnome.org/GNOME/libxml2/-/issues/626
Just to submit the outcome of one more check I did, based on a wild (and uninformed) guess:
I've rebuilt libxml2 2.12.0 without the new '--with-legacy' configure option.
Result: no effect - the flaky sometimes-errors still happen, with the exact same mime-types.
> Then possibly a hash collision resulting in the wrong entry being returned. Although that still does not explain why the issue is indeterminate.
I think that might be because each hash table has its own random seed: https://gitlab.gnome.org/GNOME/libxml2/-/blob/aca37d8c77cd66cc628f1b748b8f55434a4168aa/hash.c#L53
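A toy illustration of why a per-table random seed would make a collision-dependent bug intermittent. This is not libxml2's hash function and does not reproduce the bug; the hash, the bucket count and the keys are all made up. It only shows that whether two given keys land in the same bucket depends on the seed.

```python
def seeded_hash(key: str, seed: int) -> int:
    """Toy multiplicative string hash mixed with a seed (hypothetical)."""
    h = seed
    for byte in key.encode("utf-8"):
        h = (h * 31 + byte) & 0xFFFFFFFF
    return h


def bucket(key: str, seed: int, buckets: int = 7) -> int:
    """Bucket index of `key` in a table with `buckets` slots."""
    return seeded_hash(key, seed) % buckets


def colliding_seeds(a: str, b: str, seeds, buckets: int = 7) -> list:
    """Return the seeds for which keys a and b share a bucket."""
    return [s for s in seeds if bucket(a, s, buckets) == bucket(b, s, buckets)]


# The same two keys collide for only some seeds.
print(colliding_seeds("a", "ab", range(14)))  # [5, 12]
```

With a fresh seed per table, code that only misbehaves when particular keys collide would fail on some runs and not on others, which would match the roughly one-in-five reproduction rate seen here.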
[1]: https://gitlab.gnome.org/GNOME/libxml2/-/commit/a2b5c90a442295d2b75ae60af854b3c4a43aa0ff
[2]: https://gitlab.gnome.org/GNOME/libxml2/-/issues/626#note_561