FS#8877 - Setting LESSCHARSET in /etc/profile is broken and just plain wrong
Attached to Project:
Arch Linux
Opened by Dan McGee (toofishes) - Sunday, 09 December 2007, 08:07 GMT
Last edited by Roman Kyrylych (Romashka) - Saturday, 09 February 2008, 09:18 GMT
Details
Can anyone explain why we still have settings in
/etc/profile that shouldn't be there? This particular report
is being filed for the LESSCHARSET="latin1" setting, which
breaks a whole bunch of stuff in my pager, including the
example below:
Before:
commit e360bebf713b6b03768c62de8b94ddf9350b0953
Author: <81><97><82><89><81><84><81><97><81>�<81>�<81><93> <email@example.com>
Date: Wed Dec 5 18:24:26 2007 +0900

After:
commit e360bebf713b6b03768c62de8b94ddf9350b0953
Author: しらいしななこ <email@example.com>
Date: Wed Dec 5 18:24:26 2007 +0900

When the majority of people probably want to use a UTF-8 locale, this is broken behavior.
if echo "$LANG" | grep -qiE 'utf-?8'; then export LESSCHARSET="utf-8"; else export LESSCHARSET="latin1"; fi
I am sure there are better solutions, so please treat this only as an example.
If neither LESSCHARSET nor LESSCHARDEF is set, but any of the strings
"UTF-8", "UTF8", "utf-8" or "utf8" is found in the LC_ALL, LC_CTYPE or
LANG environment variables, then the default character set is utf-8.
If that string is not found, but your system supports the setlocale
interface, less will use setlocale to determine the character set.
setlocale is controlled by setting the LANG or LC_CTYPE environment
variables.
Finally, if the setlocale interface is also not available, the default
character set is latin1.
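The lookup order quoted above can be sketched as a small shell function. This only illustrates the documented order and is not how less itself is implemented. The function name guess_less_charset and the placeholder outputs "chardef" and "setlocale" are made up for this example, empty variables are treated as unset, and LC_CTYPE is used as the locale category variable.

```shell
# Hypothetical helper: print the charset the documented lookup order would
# select, or a placeholder when the result cannot be read from the environment.
guess_less_charset() {
    # An explicitly set LESSCHARSET is used as-is.
    if [ -n "${LESSCHARSET:-}" ]; then
        printf '%s\n' "$LESSCHARSET"
        return 0
    fi
    # LESSCHARDEF describes a custom character set, so nothing is guessed.
    if [ -n "${LESSCHARDEF:-}" ]; then
        printf 'chardef\n'
        return 0
    fi
    # Look for the four UTF-8 spellings in the locale variables.
    for value in "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"; do
        case $value in
            *UTF-8*|*UTF8*|*utf-8*|*utf8*)
                printf 'utf-8\n'
                return 0
                ;;
        esac
    done
    # Otherwise the answer comes from setlocale(), or latin1 without it.
    printf 'setlocale\n'
}
```

With LANG=en_US.utf8 and nothing else set, this prints utf-8. With LESSCHARSET=latin1 set, as /etc/profile does, it prints latin1 no matter what the locale says.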
It seems it really isn't necessary to set it at all, as less/more can detect it on their own. Your solution above would also be a problem if the default system locale is one language but a user's locale (set after the if magic) is a different one.
For me, a text file saved as utf8 displays fine in less with the following settings:
LANG=en_US.utf8
LESSCHARSET not set
And as a second check, if "export LESSCHARSET=latin1" is done, then it shows up broken as in your example above. So my guess is that the text file you ran your tests with was not UTF-8 encoded.
In either case, enforcing a default latin1 charset seems silly when the setlocale() interfaces are available to less.
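One way to test the guess that a test file was not UTF-8 encoded is to ask iconv to decode it as UTF-8. This is a minimal sketch, assuming an iconv command is installed; the function name is_utf8 is hypothetical.

```shell
# Hypothetical helper: succeed only if the file decodes cleanly as UTF-8.
# iconv exits non-zero when it hits a byte sequence that is not valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}
```

It could be used as "is_utf8 somefile.txt && echo valid". Note that a pure ASCII file also passes, since ASCII is a subset of UTF-8.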