FS#74864 - [systemd] >= 251 breaks devtools' locale

Attached to Project: Arch Linux
Opened by Felix Yan (felixonmars) - Friday, 27 May 2022, 02:31 GMT
Last edited by freswa (frederik) - Monday, 06 June 2022, 09:47 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Christian Hesse (eworm)
Giancarlo Razzolini (grazzolini)
freswa (frederik)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

Description:
systemd >= 251 forces LANG to be set to C.UTF-8 which overrides our devtools settings at https://gitlab.archlinux.org/archlinux/devtools/-/blob/master/mkarchroot.in#L88

This effectively breaks packages relying on a UTF-8 locale to build.

Relevant commit is https://github.com/systemd/systemd/commit/b626f6959bcee11d966f96bd29a00502f4aa2ce4

Example of failures due to this: https://paste.xinu.at/4oGr9k/

Additional info:
* package version(s)
systemd 251-1 & 251.1-1

Closed by freswa (frederik)
Monday, 06 June 2022, 09:47 GMT
Reason for closing: Implemented
Additional comments about closing: glibc-2.35-6
Comment by Felix Yan (felixonmars) - Friday, 27 May 2022, 02:37 GMT
Adding glibc maintainers too because systemd upstream want us to generate C.UTF-8 by default: https://github.com/systemd/systemd/pull/23252#issuecomment-1115825144
Comment by Christian Hesse (eworm) - Saturday, 28 May 2022, 21:17 GMT
So does this work if C.UTF-8 locale is generated and available?
Comment by Morten Linderud (Foxboron) - Saturday, 28 May 2022, 21:24 GMT
Yes. But we currently do not enforce such a thing on user systems.
Comment by Toolybird (Toolybird) - Wednesday, 01 June 2022, 06:25 GMT
So, this issue seems to be causing all kinds of build mayhem...

Researching the back story [1] indicates that upstream ultimately want to build C.UTF-8 into glibc itself (exactly how the "C" locale is currently handled). Even though this didn't quite happen in the 2.35 release, maybe we should simply include it with our glibc by default anyway? This is what Debian / Fedora appear to be doing. I haven't looked in detail yet, but hopefully it's a simple matter of crafting a suitable "localedef" command.

Failing that, someone should just whack C.UTF-8 into devtools and get a release out ASAP. Thanks!

[1] https://sourceware.org/glibc/wiki/Proposals/C.UTF-8
Comment by freswa (frederik) - Wednesday, 01 June 2022, 11:35 GMT
As long as glibc doesn't ship C.UTF-8 by default, I'd suggest removing the '#' in front of C.UTF-8 in locale.gen by default and issuing a localedef command from the install script (on install only). Upgrades would be handled by pacdiff or a manual merge. So anyone installing a fresh Arch system will have the locale enabled, and existing users will see the change in the pacnew file.

Any objections?
Comment by Christian Hesse (eworm) - Wednesday, 01 June 2022, 11:50 GMT
Sounds good to me, so +1...

Currently the install script has `locale-gen` in `post_upgrade()` only... Is there a reason for that? How are locales generated on initial installation?
Comment by nl6720 (nl6720) - Thursday, 02 June 2022, 05:52 GMT
Generating locales on install has been rejected before: FS#67464
Comment by freswa (frederik) - Thursday, 02 June 2022, 14:42 GMT
I think we have a good reason now, as systemd is expecting us to include C.UTF-8 and glibc will probably include it in the future similar to the C locale.
Comment by Jonas Witschel (diabonas) - Friday, 03 June 2022, 11:37 GMT
I don't think enabling C.UTF-8 in /etc/locale.gen is the best option due to the manual intervention required. Instead I suggest pregenerating C.UTF-8 using localedef and shipping it in /usr/lib/locale/C.UTF-8/ as part of glibc. This is what Debian does:

https://salsa.debian.org/glibc-team/glibc/-/blob/83fb5553addb34aa12820b93a5344866e6a92196/debian/rules.d/build.mk#L372
https://salsa.debian.org/glibc-team/glibc/-/blob/83fb5553addb34aa12820b93a5344866e6a92196/debian/debhelper.in/libc-bin.install#L21

For this to work, our custom locale-gen script needs to be patched to only remove /usr/lib/locale/locale-archive instead of all the contents of the /usr/lib/locale/ folder. This has been patched in Debian by

https://salsa.debian.org/glibc-team/glibc/-/commit/0c4e4734074fd8a45c5ac8a4100b083037fa449c
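The effect of that cleanup change can be sketched as follows. This is only an illustration run against a scratch directory, not the actual locale-gen script or the Debian patch: the directory layout and the "old behaviour" line are assumptions based on the description above.

```shell
#!/bin/sh
# Scratch directory standing in for /usr/lib/locale (hypothetical layout).
localedir=$(mktemp -d)
mkdir -p "$localedir/C.UTF-8"
: > "$localedir/C.UTF-8/LC_CTYPE"   # placeholder for a pregenerated locale file
: > "$localedir/locale-archive"     # placeholder for the archive locale-gen rebuilds

# Old behaviour (assumed): wipe the whole folder, which would also delete
# the C.UTF-8 directory shipped by the package:
#   rm -rf "$localedir"/*
# Proposed behaviour: drop only the generated archive.
rm -f "$localedir/locale-archive"

# The pregenerated C.UTF-8 directory is still present afterwards.
ls "$localedir"
```

With the narrower removal, anything the package installs under the locale folder survives a regeneration run, and only the archive is rebuilt.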
Comment by Jonas Witschel (diabonas) - Friday, 03 June 2022, 14:38 GMT
Attached to this comment is a proposed patch to the glibc PKGBUILD that implements the above suggestion. I tested that the following works:

- After upgrading a system that did not have C.UTF-8 enabled in /etc/locale.gen to this new glibc package, the C.UTF-8 locale is available (tested with "locale -a" and "LANG=C.UTF-8 perl") without any user intervention.
- After upgrading a system that had C.UTF-8 already enabled, nothing changes, C.UTF-8 continues to work as before.
- Running locale-gen (directly or through an alpm hook) preserves /usr/lib/locale/C.UTF-8/ so that the locale continues to work, whether enabled in /etc/locale.gen or not.
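A check along the lines of the "locale -a" test above could look like the sketch below. The helper name has_c_utf8 is made up for illustration, and sample output is piped in so that the result does not depend on the host; glibc typically lists the locale as "C.utf8", so the match is case-insensitive and tolerates the missing hyphen.

```shell
#!/bin/sh
# has_c_utf8 is a hypothetical helper: it reads `locale -a`-style output on
# stdin and succeeds if a C.UTF-8 entry (in either spelling) is present.
has_c_utf8() {
    grep -qiE '^C\.utf-?8$'
}

# On a real system the check would be:  locale -a | has_c_utf8
# Here sample output is fed in instead.
if printf 'C\nC.utf8\nPOSIX\n' | has_c_utf8; then
    result=available
else
    result=missing
fi
echo "C.UTF-8 is $result"
```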
Comment by nl6720 (nl6720) - Friday, 03 June 2022, 14:59 GMT
If C.UTF-8 is going to be shipped in the glibc package, it should probably be removed from /etc/locale.gen. Otherwise it'll just be confusing.
Comment by Jonas Witschel (diabonas) - Friday, 03 June 2022, 15:27 GMT
Sure, that could be achieved using

sed -i '/#C\.UTF-8 /d' "$pkgdir/etc/locale.gen"

See the updated patch attached to this comment.
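For illustration, here is the same sed expression applied to a scratch file instead of "$pkgdir/etc/locale.gen". The sample file content is made up, and GNU sed is assumed for the -i option.

```shell
#!/bin/sh
# Scratch file standing in for "$pkgdir/etc/locale.gen" (sample content is made up).
locale_gen=$(mktemp)
cat > "$locale_gen" <<'EOF'
#C.UTF-8 UTF-8
#de_DE.UTF-8 UTF-8
#en_US.UTF-8 UTF-8
EOF

# Delete the commented-out C.UTF-8 entry; the other entries are left untouched.
sed -i '/#C\.UTF-8 /d' "$locale_gen"

cat "$locale_gen"
```

The trailing space in the pattern keeps the match anchored to the locale name itself, so an entry such as "#C.UTF-8x ..." would not be removed.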
Comment by Toolybird (Toolybird) - Friday, 03 June 2022, 22:16 GMT
@diabonas, this looks like an excellent solution! (/me tries to find a like button)

My only comment would be that both Debian and Fedora go out of their way to fully ensure the just-built glibc is used to do the actual work:

elf/ld.so --library-path blah localedef blah ...

Just calling the "locale/localedef" binary directly might not cut the mustard. There must be a reason why they do it. It probably matters when upgrading to a new major Glibc version. If we are bootstrapping a toolchain where we build Glibc twice then maybe it doesn't matter much. Anyway, just raising it as a potential issue.
Comment by freswa (frederik) - Monday, 06 June 2022, 09:47 GMT
I hope glibc will include the locale by default. But let's keep this in mind in case we run into errors. Thanks @diabonas and @Toolybird
