FS#30228 - [initscripts] 2012.06.1-1 possible encoding error

Attached to Project: Arch Linux
Opened by Xyne (Xyne) - Saturday, 09 June 2012, 23:42 GMT
Last edited by Tom Gundersen (tomegun) - Sunday, 04 November 2012, 16:26 GMT
Task Type: Bug Report
Category: Arch Projects
Status: Closed
Assigned To: Tom Gundersen (tomegun)
Architecture: All
Severity: Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 12
Private: No

Details

When I logged in on tty1 today, the box-drawing characters in my Bash prompt (e.g. "┌─") were rendered as several accented ASCII characters. The same rendering occurs when viewing files with vim, less, cat, etc. This occurs on tty1-7.

Downgrading to initscripts 2012.05.1-3 seems to have solved the problem.

I diff'd the packages to see what had changed and noticed that several chunks of code for configuring UTF-8 locales have been removed.



I am using the en_XX.UTF-8@POSIX locale if it matters.

Closed by Tom Gundersen (tomegun)
Sunday, 04 November 2012, 16:26 GMT
Reason for closing: Won't fix
Additional comments about closing: Please open a bug against systemd if this is also broken there.
Comment by Dave Reisner (falconindy) - Saturday, 09 June 2012, 23:56 GMT
Most of the locale business was handed off to systemd-vconsole-setup. With regard to locale, it understands any locale-related var in /etc/locale.conf as well as the LOCALE var in /etc/rc.conf. In that regard, it should be identical to the old behavior. Where are you setting this?

And what exactly is "en_XX.UTF-8@POSIX"?
Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 00:11 GMT
We completely changed our vconsole handling, so the UTF-8 detection might be stricter now than it used to be.

What is the output of "locale charmap"? It should be "UTF-8".
Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 00:33 GMT
I installed http://xyne.archlinux.ca/projects/locale-en_XX/ and tried it out. I can reproduce this, and confirm that "locale charmap" is "UTF-8" as expected.

Weird thing is that everything works as it should on my standard locale: en_US.UTF-8. Have to dig deeper, I guess.

Xyne: you probably know this stuff better than me, having written your own locale stuff, so any hints would be appreciated :-)
Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 00:46 GMT
I can only reproduce this with specific programs. tree, for example, falls back on ascii line drawing characters because of the way it does its unicode detection. If I fix tree to do unicode detection like the rest of the world, it has no problems drawing unicode under your locale.
Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 00:56 GMT
Additional comment: it seems the utf-8 detection/setting is not to blame. I can reproduce the problem by just changing my LANG env var, without resetting utf-8 mode in-between.

Further to Dave's comment: I would assume that "tree" (and similarly broken programs) never worked for your locale (or any other locale with an "@modifier" part)?

Any chance of a minimal test case that works with the previous version and not with the new version?

If you want to run some tests, you can call "/usr/lib/systemd/systemd-vconsole-setup" manually, and it will do the same as we do on boot.
Comment by Xyne (Xyne) - Sunday, 10 June 2012, 03:11 GMT
Until now I have used LOCALE="en_XX.UTF-8@POSIX" in /etc/rc.conf to set the locale. While I was trying to debug this, I began using LANG=en_XX.UTF-8@POSIX and LC_CTYPE=en_US.UTF-8 in /etc/locale.conf (both with and without setting LOCALE in rc.conf). No combination of those settings resolves the problem.

I have LC_CTYPE set to en_US.UTF-8 because some applications throw unsupported errors otherwise, even though en_XX specifies exactly the same LC_CTYPE that en_US does (that was the source). This should not matter though, because I have done that for over a year using a script in /etc/profile.d/ to export LC_CTYPE without issue.

I still have the previous version of initscripts installed. "tree" displays the file hierarchy correctly with box-drawing characters. I can't reboot right now but I will try to test the new initscripts package again tomorrow with "tree". I expect it to just spit out a jumbled mess of ASCII characters.

I really don't know much about initscripts or locale handling (even if I did write my own locale, I only spent a few hours on it a year and a half ago and haven't really touched it since). I'm not even sure how to use /usr/lib/systemd/systemd-vconsole-setup.

Please suggest specific tests that I should run and I will post the results.

As mentioned, I noticed this because of my Bash prompt. You can reproduce it with the following (colors make no difference):

PS1="
┌─[\u@\h \w]
└─> "

It displays correctly upon login with the previous version of initscripts but not the newer one.

Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 03:16 GMT
As I pointed out, tree is going to fail because of the way it detects UTF-8 capable locales, from the source:

patmatch(setlocale(LC_CTYPE,NULL), "*[Uu][Tt][Ff]-8") == 1

Note that it's right-anchored. I've sent a patch to the author to use the more common nl_langinfo(CODESET) call, which returns exactly "UTF-8" for utf-8 capable locales.
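
To illustrate why a right-anchored match misses a locale with an "@modifier", here is a minimal shell sketch. Both function names are hypothetical. The glob is the same one quoted from tree's source above. The second helper only parses the locale name to show what the codeset part looks like; the real nl_langinfo(CODESET) call asks libc instead of parsing strings.

```shell
#!/bin/sh
# Hypothetical helper: the same right-anchored glob as quoted from tree above.
matches_right_anchored() {
    case $1 in
        *[Uu][Tt][Ff]-8) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: extract the codeset from "lang_TERRITORY.codeset@modifier".
# Pure string parsing for illustration only; nl_langinfo(CODESET) queries libc.
codeset_of() {
    name=${1%%@*}                      # drop "@modifier" if present
    case $name in
        *.*) printf '%s\n' "${name#*.}" ;;
        *)   printf '\n' ;;            # no codeset in the name (e.g. "C")
    esac
}

for loc in en_US.UTF-8 en_XX.UTF-8@POSIX; do
    if matches_right_anchored "$loc"; then
        echo "$loc: right-anchored glob matches"
    else
        echo "$loc: right-anchored glob does NOT match"
    fi
    echo "$loc: codeset part is $(codeset_of "$loc")"
done
```

The glob fails for en_XX.UTF-8@POSIX only because "@POSIX" follows the codeset, even though the codeset itself is UTF-8.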

I can't reproduce your prompt failure: http://sprunge.us/MLBG

Does the 'locale' command actually show your desired locale settings?
Comment by Dave Reisner (falconindy) - Sunday, 10 June 2012, 03:43 GMT
Just for some little educational bits on how locale is actually set, systemd-vconsole-setup isn't actually important for purposes of setting locale in _user_ sessions. We still rely on /etc/profile.d/locale.sh to properly set your locale on login. It draws from /etc/locale.conf (reading LANG and LC_*, but not LC_ALL) falling back on /etc/rc.conf (setting only LANG=$LOCALE). If this isn't working, it would likely indicate a problem with your shell config or with /etc/profile.d/locale.sh itself.
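
The lookup order described above can be sketched roughly like this. It is not the real /etc/profile.d/locale.sh: the function name is hypothetical, the file paths are passed in so it can be tried on scratch files, and it assumes the fallback to rc.conf happens only when locale.conf is not readable.

```shell
#!/bin/sh
# Sketch only, NOT the shipped /etc/profile.d/locale.sh.
# resolve_locale LOCALE_CONF RC_CONF prints the assignments a login would pick up.
resolve_locale() {
    locale_conf=$1
    rc_conf=$2
    if [ -r "$locale_conf" ]; then
        # Take LANG and LC_* from locale.conf, but skip LC_ALL.
        grep -E '^(LANG|LC_[A-Z]+)=' "$locale_conf" | grep -v '^LC_ALL='
    elif [ -r "$rc_conf" ]; then
        # Assumed fallback: only LANG is set, from rc.conf's LOCALE variable.
        sed -n 's/^LOCALE="\{0,1\}\([^"]*\)"\{0,1\}$/LANG=\1/p' "$rc_conf"
    fi
}

# Try it on scratch files rather than the real /etc.
tmp=$(mktemp -d)
printf 'LANG=en_XX.UTF-8@POSIX\nLC_CTYPE=en_US.UTF-8\nLC_ALL=C\n' > "$tmp/locale.conf"
printf 'LOCALE="en_US.UTF-8"\n' > "$tmp/rc.conf"
resolve_locale "$tmp/locale.conf" "$tmp/rc.conf"   # LANG and LC_CTYPE, no LC_ALL
resolve_locale "$tmp/missing" "$tmp/rc.conf"       # LANG=en_US.UTF-8
rm -rf "$tmp"
```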
Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 11:43 GMT
@Xyne: are you using framebuffer or VGA console? After discussing this with Thomas, it seems like the old-fashioned VGA text console does not default to utf-8. Systemd (and I!) assumed it did, so it will only change away from utf-8 if you have a non-utf-8 locale, not force your tty to be utf-8 if you DO have a utf-8 locale (old initscripts would, though for the wrong reason IMHO).

Using loadkeys should give you a warning, which would verify that your console is not in utf-8 mode. You can force it into utf-8 mode with "kbd_mode -u" and then check whether that solves the problem.
Comment by Xyne (Xyne) - Sunday, 10 June 2012, 17:10 GMT
Here's the output of various "locale", "loadkeys" and "kbd_mode" commands using the current initscripts package:
http://xyne.archlinux.ca/tmp/initscripts_bug/output.txt

My kernel line from Grub's menu.lst:
kernel /vmlinuz-linux root=/dev/mapper/foo-root ro vga=873

More information:
After the first reboot with the current initscripts package, only tty1 is affected. The characters are correctly displayed on the other terminals. After the second reboot all terminals are affected.

I use "--noclear" on tty1 in /etc/inittab as described in the wiki. Removing it does not change anything.

edit: fixed typo
Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 18:42 GMT
This patch should in principle solve the problem: http://lists.freedesktop.org/archives/systemd-devel/2012-June/005426.html

@Xyne: if you are on x86_64, could you replace your /usr/lib/systemd/systemd-vconsole-setup with http://dev.archlinux.org/~tomegun/systemd-vconsole-setup and let us know if it solves the problem?
Comment by Xyne (Xyne) - Sunday, 10 June 2012, 19:06 GMT
Tom's patched version of systemd-vconsole-setup seems to have solved the problem. Can it be included in the package while waiting for upstream?

Thanks for the help and quick solution!


*edit*
Looking at the patch and the message that accompanies it, I still don't understand what part of my setup caused the problem. The message indicates that it is due to using "the old VGA console" instead of a framebuffer. Forgive my ignorance, but I thought that "vga=873" in Grub's kernel line enabled the framebuffer (although I do realize that the line itself is "vga"). The Grub wiki page only talks about "framebuffer resolution". Is the framebuffer not the same thing as "framebuffer resolution"? I can't find anything that clarifies this or how to set it up if it is.

I don't see anything in the patch that pertains to the "@modifier".

Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 21:38 GMT
@Xyne: as far as I can tell the bug has nothing to do with your locale and its modifier. That is causing some unrelated problems with other programs (such as 'tree'), but that should be the same even with old versions of initscripts.

The issue seems to be that for some reason your console does not default to utf-8, which, as far as I can tell can happen if you use vga= in your kernel command line. I realize now that I don't understand this stuff as well as I thought, so I'm not 100% certain on WHY the vga= parameter causes this.
Comment by Tom Gundersen (tomegun) - Sunday, 10 June 2012, 22:53 GMT
I can now reproduce. Blacklisting my graphics driver ("i915" fwiw) was all it took. Thanks for reporting. Hopefully this will be accepted upstream soon so we can backport.
Comment by Leonid Isaev (lisaev) - Monday, 18 June 2012, 14:30 GMT
I may be way off here (sorry for the noise in that case), but I think the problem is bigger than just ttys not defaulting to utf with vga=xxx.

After update to initscripts 2012.06.1-1 (from 2012.05.1-1) and with i915 inside the ramdisk (early KMS) I have garbage instead of line drawing symbols in pstree (and pstree --unicode but not with --vt100 or --ascii), findmnt and alsamixer in tty's (not in pts like gnome-terminal). Relevant vars in rc.conf: LOCALE="en_US.UTF-8" and DAEMON_LOCALE="no".

Running "LANG=C findmnt" works OK (as well as unsetting LOCALE, and, yes, "locale charmap" returns utf8). I also noticed that /sys/module/vt/parameters/default_utf8 is 0 on a sysV machine (used to be 1 with old initscripts) but is 1 on a pure systemd setup (same en_US.UTF-8 locale).
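
For anyone who wants to compare machines the same way, here is a small sketch for checking that kernel parameter. The function name is hypothetical and the path argument exists only so the helper can be pointed at a scratch file; by default it reads the sysfs file mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: report what the vt module's default_utf8 parameter says.
vt_default_utf8() {
    param=${1:-/sys/module/vt/parameters/default_utf8}
    if [ ! -r "$param" ]; then
        echo "unknown"
        return 1
    fi
    case $(cat "$param") in
        1) echo "utf8" ;;
        0) echo "non-utf8" ;;
        *) echo "unknown" ;;
    esac
}

# Prints "unknown" on systems without that sysfs file.
vt_default_utf8 || true
```

On a real tty, running "kbd_mode" without arguments additionally reports the current keyboard mode, and "kbd_mode -u" switches it to Unicode.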
Comment by Tom Gundersen (tomegun) - Monday, 18 June 2012, 15:03 GMT
Thanks for the extra info. It is useful, and fits with my findings. There are indeed two bugs here: one is my fault and the other (I think) is systemd's fault. Expect a new initscripts very soon.
Comment by Tom Gundersen (tomegun) - Saturday, 23 June 2012, 00:48 GMT
Please try with new systemd-tools and new initscripts in testing. Does this solve all your problems?
Comment by Mark Kusch (groover) - Sunday, 24 June 2012, 09:32 GMT
I've got this issue on two nodes, both using kernel.org mirrors, both updated within half an hour of each other.

Installed software versions:
- systemd-tools 185-2
- initscripts 2012.06.1-1
- filesystem 2012.6-4
- linux 3.4.4-1

Software configuration which may be interesting:
- agetty -8 -s 38400 ttyN linux, --noissue --noclear on tty1.
LOCALE="en_US.UTF-8"
DAEMON_LOCALE="no"

$ locale charmap
UTF-8

$ LANG=C findmnt # does not work (see comment from lisaev above)


Version A:
Nvidia board (GT200b [GeForce GTX 285] (rev a1)), using VGA=865.
All ttys affected.

Version B:
Intel board (Core integrated), using early KMS.
All ttys affected.
Comment by Tom Gundersen (tomegun) - Sunday, 24 June 2012, 11:09 GMT
Could you please try again with initscripts from testing? It should be fixed.
Comment by Dave Reisner (falconindy) - Sunday, 24 June 2012, 11:19 GMT
This will never be "fixed" if you expect to see Unicode characters with a C locale.
Comment by Tom Gundersen (tomegun) - Sunday, 24 June 2012, 11:37 GMT
Ah, good catch Dave, I didn't read that bit. No, this will not work. Either everything has to be in a utf8 locale, or everything has to be in a non-utf8 locale.
Comment by Mark Kusch (groover) - Sunday, 24 June 2012, 12:31 GMT
initscripts 2012.06.2-1 (testing) fixes the issue for me at least for the nvidia box. Thanks!
Comment by Mark Kusch (groover) - Sunday, 24 June 2012, 12:37 GMT
Confirmed for intel/KMS. Again thanks!
Comment by Mark Kusch (groover) - Sunday, 29 July 2012, 07:44 GMT
Problem reoccurred with filesystem 2012.7-1.
I've set up /etc/environment and /etc/locale.conf for LANG="en_US.UTF-8".
Comment by Pierre Schmitz (Pierre) - Sunday, 29 July 2012, 08:16 GMT
I can confirm that the terminal now gets corrupted once initscripts (or systemd) tries to set up the vconsole: http://paste.xinu.at/ETYH/
Comment by Tom Gundersen (tomegun) - Thursday, 02 August 2012, 18:27 GMT
This is the situation:

If DAEMON_LOCALE="no", then vconsole is set up with the default locale, LANG=C. That means the vconsole is configured to be in non-utf-8 mode. When this kicks in, then whatever was printed in utf-8 mode before gets garbled.

At least I think that's what's happening.

I can see a few ways of working around this, but I'd like the solution to be that DAEMON_LOCALE is always set to "yes", but that it can be overridden on a user-by-user basis in $HOME/.config/locale.conf, then people can set this up as they like.
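
The mechanism suspected above can be sketched like this. It is an illustration only, not what initscripts or systemd-vconsole-setup actually run, and both function names are hypothetical. It relies on the Linux console escape sequences ESC % G (select UTF-8) and ESC % @ (select the default ISO 8859-1 character set).

```shell
#!/bin/sh
# Hypothetical helper: decide the console mode from a charmap string,
# i.e. the kind of value "locale charmap" prints.
console_mode_for_charmap() {
    case $1 in
        UTF-8|utf-8|UTF8|utf8) echo "utf8" ;;
        *) echo "non-utf8" ;;
    esac
}

# Hypothetical helper: emit the console escape sequence for a mode.
# ESC % G selects UTF-8, ESC % @ selects the default (ISO 8859-1) charset.
console_escape_for_mode() {
    if [ "$1" = "utf8" ]; then
        printf '\033%%G'
    else
        printf '\033%%@'
    fi
}

# glibc reports ANSI_X3.4-1968 as the charmap of the C locale, so a console
# set up under LANG=C ends up in non-UTF-8 mode.
console_mode_for_charmap "ANSI_X3.4-1968"
console_mode_for_charmap "UTF-8"
```

Anything already printed as UTF-8 before the switch to non-UTF-8 mode is then reinterpreted byte by byte, which would explain the garbled box-drawing characters.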
Comment by Milos Kaurin (Kaurin) - Friday, 03 August 2012, 02:26 GMT
Hey tomegun, seeing that DAEMON_LOCALE is no longer an option that works in rc.conf, do you know of a workaround for this issue?

P.S. +1 on the confirm list that the bug is indeed back
Comment by Karol Błażewicz (karol) - Friday, 03 August 2012, 12:11 GMT
Maybe I'm doing something wrong, but:
1. I'd like to get all output (early boot, daemons, commands etc.) in English so I can just post it if I run into problems. Translating terse error messages is tricky.
2. I need utf in the console because otherwise even 'rm: remove regular file ‘Xorg.0.log’?' gets garbled (the quotes around Xorg.0.log).
3. I need to be able to read and write in a non-English locale (Polish). Files looking like garbage or full of '??' are not an option.

Is there - or will there be - a setup that does that?
Comment by Justin (velusip) - Sunday, 05 August 2012, 02:58 GMT
Hello, I was just forwarded here. I didn't notice the bug report and posted in the forum about it: https://bbs.archlinux.org/viewtopic.php?pid=1141780#p1141780

Very vanilla system, no framebuffer or vga args, just default grub-legacy.
cat /etc/locale.conf
LANG=en_CA.UTF-8
Comment by Tom Gundersen (tomegun) - Tuesday, 28 August 2012, 21:35 GMT
Hi guys,

With the new filesystem++ from testing some of these problems should be solved: you can now set one system-wide locale in /etc/locale.conf and a user-specific one in $HOME/.config/locale.conf [0].

If you still see problems with your font during boot, please try setting a different font (even no font at all) in /etc/vconsole.conf [1].

If there are still problems, please let me know (or reopen as I'll close this bug if I don't hear anything for a little while).

[0]: https://plus.google.com/114015603831160344127/posts/2zKCcnTWDpa
[1]: https://plus.google.com/114015603831160344127/posts/PYgPjektqsY
Comment by Simon Perry (pezz) - Tuesday, 28 August 2012, 22:13 GMT
Is there anything else from testing I need to update?

I don't have the testing repo enabled, but I downloaded and installed 2012.8-1 manually and the issue still occurs just after the "Waiting for udev" message.

% cat /etc/locale.conf
LOCALE=en_AU.UTF-8
LANG=en_AU.UTF-8

% cat /etc/vconsole.conf
KEYMAP=us
Comment by Leonid Isaev (lisaev) - Tuesday, 28 August 2012, 22:28 GMT
Thanks for the update!

But since you presented the links, I have questions regarding them:
1. If currently /etc/locale.conf is LANG=en_US.UTF-8 \\ LC_COLLATE=C, will setting LANG=C system-wide and LANG=en_US.UTF-8 per-user have the same effect?
2. My /etc/vconsole.conf doesn't contain a FONT variable, and I see no artefacts at boot. Is it because I use KMS as opposed to vga=xxx, and which font am I using: arch or kernel?
Comment by Tom Gundersen (tomegun) - Thursday, 30 August 2012, 19:21 GMT
@pezz: yeah, there were a few packages that need to go together. However, they are all in core now, so if you still have this problem please reopen.

@lisaev: 1. No. If the per-user file exists, only that matters (even if it is empty) and the system-wide one will be ignored. 2. Sorry, I don't know; there are too many things going on, so I don't have a 100% overview of what does and what does not cause the bug. If you don't have any entry at all you'd be using LatArCyrHeb (the Arch default); if you set FONT="" you'd get the kernel default.
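
The precedence rule from point 1 can be sketched as follows. The function name is hypothetical and this is not the shipped locale.sh; it only shows that the mere existence of the per-user file decides which file is read.

```shell
#!/bin/sh
# Hypothetical helper, not the shipped locale.sh: pick which locale.conf applies.
# If the per-user file exists (even if empty), it wins and the system-wide
# file is ignored completely.
pick_locale_conf() {
    user_conf=$1
    system_conf=$2
    if [ -e "$user_conf" ]; then
        printf '%s\n' "$user_conf"
    elif [ -e "$system_conf" ]; then
        printf '%s\n' "$system_conf"
    fi
}

pick_locale_conf "$HOME/.config/locale.conf" /etc/locale.conf
```

Under that rule, a per-user file containing only LANG=en_US.UTF-8 would not inherit an LC_COLLATE=C line from the system-wide file.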
Comment by Simon Perry (pezz) - Friday, 31 August 2012, 08:21 GMT
Fully up to date and it still happens for me, just rebooted now: http://i.imgur.com/T3drW.jpg
Comment by Simon Perry (pezz) - Friday, 31 August 2012, 08:30 GMT
Also, is there a file missing from /etc/profile.d ?

When I log in to my box now, I don't have a LANG env variable being set. Shouldn't it be set to what I have configured in /etc/locale.conf (assuming I want to use the global system setting)?
Comment by Tom Gundersen (tomegun) - Friday, 31 August 2012, 08:35 GMT
@pezz: set FONT="" in vconsole.conf, and try again.

/etc/profile.d/locale.sh should be owned by filesystem and be responsible for setting LANG.
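
For the FONT="" suggestion, here is a minimal sketch of making that change with a script. The helper name is hypothetical; it takes the file to edit as an argument, so it can be tried on a scratch copy before touching /etc/vconsole.conf.

```shell
#!/bin/sh
# Hypothetical helper: make sure a vconsole.conf-style file contains FONT="".
set_empty_font() {
    conf=$1
    if grep -q '^FONT=' "$conf" 2>/dev/null; then
        # Replace an existing FONT= line, going through a temporary file.
        tmp=$(mktemp) || return 1
        sed 's/^FONT=.*/FONT=""/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
        rm -f "$tmp"
    else
        # No FONT= line yet: append one.
        printf 'FONT=""\n' >> "$conf"
    fi
}

# Example on a scratch file that starts with only KEYMAP=us:
scratch=$(mktemp)
printf 'KEYMAP=us\n' > "$scratch"
set_empty_font "$scratch"
cat "$scratch"
rm -f "$scratch"
```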
Comment by Simon Perry (pezz) - Friday, 31 August 2012, 08:39 GMT
Never mind about profile.d - I removed the filesystem package I downloaded from testing the other night from the pacman cache and re-installed.

Still seeing the issue, though. :(
Comment by Simon Perry (pezz) - Friday, 31 August 2012, 08:42 GMT
Beautiful - FONT="" did it Tom, you're a champion.

Confirmed fixed. :)
Comment by Tom Gundersen (tomegun) - Friday, 31 August 2012, 08:45 GMT
@pezz: great, this will be the default soon, I hope.
Comment by Martín Cigorraga (msx) - Friday, 21 September 2012, 06:22 GMT
Hi everyone, I want to confirm I'm suffering this bug on a fully updated x86_64 install using initscripts - however I fixed it with the DAEMON_LOCALE="yes" workaround: https://bbs.archlinux.org/viewtopic.php?pid=1164283#p1164283

Note: my system is fully en_US.UTF-8, see above link for further details.
