FS#14877 - [grep] -i does not work for non-latin characters

Attached to Project: Arch Linux
Opened by serge (xchllataa) - Saturday, 30 May 2009, 11:10 GMT
Last edited by Allan McRae (Allan) - Wednesday, 17 June 2009, 15:17 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Allan McRae (Allan)
Architecture x86_64
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
grep does not properly ignore the case of non-Latin characters.

Additional info:
* grep 2.5.4-1
* my locale is ru_RU.UTF-8

Steps to reproduce:
/tmp $ cat txt
АБВ
абв

/tmp $ grep АБВ txt
АБВ
/tmp $ grep абв txt
абв
/tmp $ grep -i АБВ txt
абв
/tmp $ grep -i абв txt
абв
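
The steps above can be turned into a small scripted check. This is only a sketch: the helper name `count_icase_matches` and the temporary file are illustrative assumptions, not part of the original report, and the result depends on a UTF-8 locale such as ru_RU.UTF-8 being active.

```shell
#!/bin/sh
# Count how many lines of FILE match PATTERN case-insensitively.
# Usage: count_icase_matches PATTERN FILE [GREP_BINARY]
# (hypothetical helper; GREP_BINARY defaults to whatever "grep" is in PATH)
count_icase_matches() {
    pattern=$1
    file=$2
    grep_bin=${3:-grep}
    # -i ignores case, -c prints the number of matching lines.
    "$grep_bin" -i -c -- "$pattern" "$file"
}

# Recreate the two-line file from the report.
tmpfile=$(mktemp)
printf 'АБВ\nабв\n' > "$tmpfile"

# Under a UTF-8 locale a working grep should count both lines for either
# pattern; the affected build counts only one.
count_icase_matches 'АБВ' "$tmpfile"
count_icase_matches 'абв' "$tmpfile"

rm -f "$tmpfile"
```

Passing a third argument lets the same check be run against another binary, for example a locally built grep such as /tmp/grep/bin/grep, to compare it with the packaged one.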

Closed by Allan McRae (Allan)
Wednesday, 17 June 2009, 15:17 GMT
Reason for closing: Fixed
Additional comments about closing: grep-2.5.4-2
Comment by Gerardo Exequiel Pozzi (djgera) - Saturday, 30 May 2009, 18:08 GMT
grep has many issues, yes. This seems to be an upstream issue; can you report it there?
http://lists.gnu.org/mailman/listinfo/bug-grep
Comment by serge (xchllataa) - Saturday, 30 May 2009, 20:30 GMT
This issue is not reproducible with upstream grep.

/tmp $ /bin/grep -i абв txt
абв
/tmp $ /tmp/grep/bin/grep -i абв txt
АБВ
абв
/tmp $ /tmp/grep/bin/grep --version
GNU grep 2.5.4
Comment by serge (xchllataa) - Saturday, 30 May 2009, 20:40 GMT
I think 64-egf-speedup.patch is the root cause of this issue. grep compiled without the patch works just fine.
Comment by Allan McRae (Allan) - Saturday, 06 June 2009, 10:37 GMT
I'm not sure how to deal with this... removing that patch results in massive slow-downs on UTF-8 locales (see FS#7141 and links therein). The additional patches used by Fedora and Debian do not seem to help either.
Comment by serge (xchllataa) - Sunday, 07 June 2009, 08:37 GMT
I checked how the Ubuntu people build their grep (http://packages.ubuntu.com/source/karmic/grep).

To resolve this bug, the following are needed:
* 65-dfa-optional.patch
* 66-match_icase.patch
* an explicit --without-included-regex option

Please note that without an explicit --without-included-regex option these patches don't work, even though that option is supposed to be the default. It might be a bug in the configure script.
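
A rough sketch of what those build steps might look like is given here. The patch directory, the `-Np1` strip level, the `--prefix=/usr` setting, and the helper names `run` and `build_grep` are all assumptions for illustration; only the two patch names and the --without-included-regex option come from this report. By default the script just prints the commands; set RUN=1 to execute them from inside an unpacked grep 2.5.4 source tree.

```shell
#!/bin/sh
# Print each command by default; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# build_grep PATCH_DIR
# Assumes the current directory is the grep source tree and that the
# patches apply at strip level 1 (both are assumptions).
build_grep() {
    patch_dir=${1:-..}
    run patch -Np1 -i "$patch_dir/65-dfa-optional.patch" &&
    run patch -Np1 -i "$patch_dir/66-match_icase.patch" &&
    # Pass the option explicitly: per the report, the patches had no
    # effect when it was left to the configure default.
    run ./configure --prefix=/usr --without-included-regex &&
    run make
}

build_grep ..
```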

