FS#28093 - ETQW segfault & missing in-game letters after glibc-2.15-3 update

Attached to Project: Arch Linux
Opened by Sam (smudge) - Wednesday, 25 January 2012, 01:48 GMT
Last edited by Allan McRae (Allan) - Thursday, 02 February 2012, 00:49 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: No-one
Architecture: All
Severity: Very Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description:

ETQW = Enemy Territory Quake Wars. Closed source game so not sure if this is the right place but the gnu website said glibc bugs should go to the distro first :)

After glibc-2.15-3 (& lib32-glibc 2.15-3.1) update there is one missing letter at the end of some in-game text including chat and a segfault when exiting the game.
This segfault leaves a message in the logs such as:

etqw.x86[1060]: segfault at fffff7fd ip b74805fb sp accc80c0 error 4 in libgcc_s.so.1[b747b000+a000]

or:

etqw.x86[1269]: segfault at fffff3d9 ip 00000000f74075fb sp 00000000ec57b0c0 error 4 in libgcc_s.so.1[f7402000+a000]

Apart from that the game runs fine.

Problems solved by downgrading to glibc 2.14.1-4 or by copying libc.so.6, libpthread.so.0 and librt.so.1 from glibc 2.14.1-4 into game dir (on 32bit).

Please have a look at the thread that caused this bug report:

https://bbs.archlinux.org/viewtopic.php?id=133922


Steps to reproduce:
Run etqw or etqw-rthread

Closed by Allan McRae (Allan)
Thursday, 02 February 2012, 00:49 GMT
Reason for closing: Not a bug
Additional comments about closing: Seems to be an ETQW issue
Comment by Allan McRae (Allan) - Wednesday, 25 January 2012, 04:04 GMT
Can you give a gdb backtrace?
Comment by Sam (smudge) - Thursday, 26 January 2012, 05:22 GMT
Haven't done this before so don't really know what I'm doing :)

gdb etqw.x86

"exited normally", doesn't seem to want to produce the segfault (text bug still there) but it did mention:

"Reading symbols from etqw.x86...(no debugging symbols found)...done."


I tried ulimit -c 65536 to get a core file:

gdb etqw.x86 core

Reading symbols from etqw.x86...(no debugging symbols found)...done.

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `./etqw.x86'.
Program terminated with signal 11, Segmentation fault.
#0 x86_fallback_frame_state (fs=<optimized out>, context=<optimized out>)
at ../../gcc-4.1.2/gcc/config/i386/linux-unwind.h:128
128 ../../gcc-4.1.2/gcc/config/i386/linux-unwind.h: No such file or directory.
(gdb) bt
#0 x86_fallback_frame_state (fs=<optimized out>, context=<optimized out>)
at ../../gcc-4.1.2/gcc/config/i386/linux-unwind.h:128
#1 uw_frame_state_for (context=0xec57b22c, fs=0xec57b140) at ../../gcc-4.1.2/gcc/unwind-dw2.c:984
#2 0xf7421382 in _Unwind_ForcedUnwind_Phase2 (exc=0xec57bd90, context=0xec57b22c) at ../../gcc-4.1.2/gcc/unwind.inc:159
#3 0xf742166b in _Unwind_ForcedUnwind (exc=0xec57bd90, stop=0xf76c2410 <unwind_stop>, stop_argument=0xec57b424)
at ../../gcc-4.1.2/gcc/unwind.inc:211
#4 0xf76c4b02 in _Unwind_ForcedUnwind () from /usr/lib32/libpthread.so.0
#5 0xf76c2581 in __pthread_unwind () from /usr/lib32/libpthread.so.0
#6 0xf76c0e84 in pthread_testcancel () from /usr/lib32/libpthread.so.0
#7 0x082e69ba in ?? ()
#8 0xf76bbd4c in start_thread () from /usr/lib32/libpthread.so.0
#9 0xf736268e in clone () from /usr/lib32/libc.so.6


Thought I'd try and rebuild lib32-glibc 2.15-3.1 with debugging symbols. Not sure if I did it right but wound up with this:

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `./etqw.x86'.
Program terminated with signal 11, Segmentation fault.
#0 x86_fallback_frame_state (fs=<optimized out>, context=<optimized out>) at ../../gcc-4.1.2/gcc/config/i386/linux-unwind.h:128
128 ../../gcc-4.1.2/gcc/config/i386/linux-unwind.h: No such file or directory.
(gdb) bt full
#0 x86_fallback_frame_state (fs=<optimized out>, context=<optimized out>) at ../../gcc-4.1.2/gcc/config/i386/linux-unwind.h:128
pc = 0xfffffa79 <Address 0xfffffa79 out of bounds>
sc = <optimized out>
new_cfa = <optimized out>
#1 uw_frame_state_for (context=0xec4f322c, fs=0xec4f3140) at ../../gcc-4.1.2/gcc/unwind-dw2.c:984
fde = 0x0
cie = <optimized out>
aug = <optimized out>
insn = <optimized out>
#2 0xf73a7382 in _Unwind_ForcedUnwind_Phase2 (exc=0xec4f3d90, context=0xec4f322c) at ../../gcc-4.1.2/gcc/unwind.inc:159
fs = {regs = {reg = {{loc = {reg = 0, offset = 0, exp = 0x0}, how = REG_UNSAVED} <repeats 18 times>}, prev = 0x0}, cfa_offset = 0, cfa_reg = 0, cfa_exp = 0x0,
cfa_how = CFA_UNSET, pc = 0x0, personality = 0, data_align = 0, code_align = 0, retaddr_column = 0, fde_encoding = 0 '\000', lsda_encoding = 0 '\000',
saw_z = 0 '\000', eh_ptr = 0x0}
action = <optimized out>
stop = 0xf7648420 <unwind_stop>
stop_argument = 0xec4f3424
code = _URC_NO_REASON
stop_code = <optimized out>
#3 0xf73a766b in _Unwind_ForcedUnwind (exc=0xec4f3d90, stop=0xf7648420 <unwind_stop>, stop_argument=0xec4f3424) at ../../gcc-4.1.2/gcc/unwind.inc:211
this_context = {reg = {0xec4f32f4, 0x0, 0xec4f32f8, 0xec4f32fc, 0x0, 0xec4f3308, 0xec4f3300, 0xec4f3304, 0xec4f330c, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
cfa = 0xec4f3310, ra = 0xf764ab12, lsda = 0x0, bases = {tbase = 0x0, dbase = 0xf73ab858, func = 0xf73a75f0}, args_size = 0}
cur_context = {reg = {0xec4f32f4, 0x0, 0xec4f32f8, 0xec4f3344, 0x0, 0xec4f3308, 0xec4f3348, 0xec4f3304, 0xec4f3350, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
cfa = 0xec4f3354, ra = 0xfffffa79, lsda = 0x0, bases = {tbase = 0x0, dbase = 0xf7652ff4, func = 0xf7646e60}, args_size = 0}
code = <optimized out>
#4 0xf764ab12 in _Unwind_ForcedUnwind (exc=0xec4f3d90, stop=0xf7648420 <unwind_stop>, stop_argument=0xec4f3424) at ../nptl/sysdeps/pthread/unwind-forcedunwind.c:138
forcedunwind = <optimized out>
#5 0xf7648591 in __GI___pthread_unwind (buf=<optimized out>) at unwind.c:130
ibuf = <optimized out>
self = 0xec4f3b40
#6 0xf7646e94 in __do_cancel () at pthreadP.h:265
No locals.
#7 pthread_testcancel () at pthread_testcancel.c:27
cancelhandling = <optimized out>
#8 0x082e69ba in ?? ()
No symbol table info available.
#9 0xf7641d3c in start_thread (arg=0xec4f3b40) at pthread_create.c:305
__res = <optimized out>
pd = 0xec4f3b40
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-144363532, 0, 4001536, -330353560, -460692491, -1294500925}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0},
data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#10 0xf72e893e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
No locals.

etqw.x86 will never have debug symbols of course so no idea if this useful.
Thanks.
Comment by Sam (smudge) - Friday, 27 January 2012, 05:53 GMT
The game has its own libgcc_s.so.1 in its dir, renaming it stops the segfault. So I suppose the above is nonsense :)
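
Some context on why the bundled libgcc_s.so.1 matters: frames #4 to #7 of the backtrace above are glibc's thread cancellation path. pthread_testcancel() acts on a pending cancel request by force-unwinding the thread's stack, and glibc does that through _Unwind_ForcedUnwind from whichever libgcc_s.so.1 gets loaded. The sketch below only exercises that same path in a tiny program. The names on_cancel, worker and run_cancel_demo are made up for illustration and nothing here is taken from ETQW.

```c
#include <pthread.h>
#include <stddef.h>

/* Set by the cleanup handler so the caller can see that unwinding happened. */
static int cleanup_ran = 0;

/* Cleanup handler: invoked while the cancelled thread's stack is unwound. */
static void on_cancel(void *arg)
{
    (void)arg;
    cleanup_ran = 1;
}

/* Worker thread: spins on an explicit cancellation point. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_cleanup_push(on_cancel, NULL);
    for (;;) {
        /* If a cancel request is pending, this call does not return:
         * the thread is unwound and the cleanup handlers run. */
        pthread_testcancel();
    }
    pthread_cleanup_pop(0); /* never reached; required to pair with push */
    return NULL;
}

/* Returns 1 if the thread was cancelled and its cleanup handler ran,
 * 0 if not, and -1 if a pthread call failed. */
int run_cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    cleanup_ran = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return cleanup_ran && result == PTHREAD_CANCELED;
}
```

Call run_cancel_demo() from a main() and build with gcc -pthread. On a healthy system it should return 1. It does not reproduce the crash by itself, since that needs the game's old libgcc_s.so.1 to be the one that gets picked up.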
Comment by Allan McRae (Allan) - Friday, 27 January 2012, 06:02 GMT
Does that also fix the missing text?
Comment by Simon (JinxterX) - Friday, 27 January 2012, 15:11 GMT
No it doesn't fix the missing text if you rename libgcc_s.so.1 in the game dir.

Good news, I've tracked it down, there's a bug in sysdeps/i386/i686/multiarch/wcslen-sse2.S resulting in a miscalculated string length.

The original commit is here: http://sourceware.org/git/?p=glibc.git;a=commit;h=fc2ee42abe595bbf6b8bbf0637648ad8b5d4faab


Comment by Simon (JinxterX) - Saturday, 28 January 2012, 05:56 GMT
@smudge, I've made a patch to remove only the offending commit, could you apply it and confirm this quick fix?
Comment by Allan McRae (Allan) - Saturday, 28 January 2012, 08:28 GMT
Can anyone replicate this using a simple program that calls wcslen? I can not...
Comment by Simon (JinxterX) - Saturday, 28 January 2012, 20:50 GMT
Woohoo, my first ever attempt at C coding :D This is fun :D

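#include <stdio.h>
#include <wchar.h>

int main(void)
{
    /* Wide string literal; wcslen() counts wchar_t units up to the terminator. */
    const wchar_t *wstr = L"They flee in terror!";
    size_t len = wcslen(wstr); /* wcslen() returns size_t, not int */
    wprintf(L"%ls %zu\n", wstr, len);
    return 0;
}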

$ ./a.out
They flee in terror! 20

Behaves as expected with correct string length returned. So all I can think of is that in ETQW they use special characters in their strings
which throw a spanner in the works of wcslen + SSE2 optimisation? Because if you compile glibc 2.15 *without* the wcslen SSE2 patches
from that commit above, the truncated string problem in ETQW is fixed. I made sure to revert the strlen SSE2 patches first to see if that
was the cause but it had no effect (i.e. text chars still missing), so it's wcslen + SSE2, definitely, maybe ;)

I'll keep digging :P
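
A single literal only hits one length and one starting position, so a slightly broader check might be worth having. The sketch below compares wcslen() against a plain loop for a range of string lengths and start offsets inside a 16-byte aligned buffer, so the string begins at different positions within an SSE2-sized block. The function names and the ranges (lengths 0 to 64, offsets 0 to 7 wchar_t units) are arbitrary choices for illustration. Every pointer here is still properly aligned for wchar_t, and it has not been run against glibc 2.15.

```c
#include <stddef.h>
#include <wchar.h>

/* Reference implementation: a plain loop with no SIMD tricks. */
static size_t naive_wcslen(const wchar_t *s)
{
    size_t n = 0;
    while (s[n] != L'\0')
        n++;
    return n;
}

/* Compares wcslen() with the naive loop for every combination of
 * start offset and string length. Returns the number of mismatches. */
size_t count_wcslen_mismatches(void)
{
    enum { MAX_LEN = 64, MAX_OFF = 8, BUF_LEN = MAX_OFF + MAX_LEN + 1 };
    /* 16-byte alignment makes each offset map to a known position
     * inside a 16-byte block. */
    _Alignas(16) wchar_t buf[BUF_LEN];
    size_t mismatches = 0;

    for (size_t off = 0; off < MAX_OFF; off++) {
        for (size_t len = 0; len <= MAX_LEN; len++) {
            /* Fill with non-zero characters, then place one terminator. */
            for (size_t i = 0; i < BUF_LEN; i++)
                buf[i] = L'x';
            buf[off + len] = L'\0';

            if (wcslen(buf + off) != naive_wcslen(buf + off))
                mismatches++;
        }
    }
    return mismatches;
}
```

Printing count_wcslen_mismatches() from a small main() gives a quick pass or fail. A result of 0 would suggest wcslen() is fine for well-aligned input and point back at how the game lays out its strings.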


Comment by Sam (smudge) - Sunday, 29 January 2012, 04:25 GMT
I can confirm the quick fix. Rebuilt lib32-glibc with your patch, copied libc-2.15.so into etqw dir and made a libc.so.6 link, no more missing letters :)
Comment by Simon (JinxterX) - Sunday, 29 January 2012, 20:37 GMT
Thanks smudge ;)

Allan, I've done some proper debugging and nailed down the problem, will post gdb logs soon so you can see exactly what's going on.
Comment by Simon (JinxterX) - Monday, 30 January 2012, 01:46 GMT
OK, here are two short gdb logs, best viewed side by side, with a few specific register dumps sprinkled in. I targeted a string in the limbo menu of ETQW,
the word "Spectate", because it has a really obvious "e" missing off the end when using glibc 2.15-3, and compared it to my own C program using the same
string.

What do you reckon? :) Something weird in ETQW or bug in wcslen-sse2.S ?
Comment by Simon (JinxterX) - Wednesday, 01 February 2012, 20:20 GMT
Arrrgh! It's a memory boundary alignment issue in ETQW I think. Some strings not padded correctly so the eax register is
pointing to an odd address before going into the string length routine. Does that make sense? __wcslen_sse2 ends up using
the wrong offset for the termination (null) character. Sorry for my noobish explanation ;)
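
To make the alignment point concrete, here is a rough sketch that assumes "odd address" is meant literally, i.e. a wide string that does not start on a wchar_t boundary. A routine that loads whole wchar_t units or 16-byte blocks may assume that boundary, and handing it such a pointer is undefined behaviour in C. The helper names below are hypothetical and this is not how ETQW or glibc do it. It only shows how to detect the condition and how to measure such a string without any misaligned access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Returns non-zero if p is suitably aligned for wchar_t access. */
int is_wchar_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(wchar_t)) == 0;
}

/* Length, in wchar_t units, of a wide string that may start at any
 * byte address. Each unit is copied out with memcpy(), so no
 * misaligned wchar_t load is ever performed. */
size_t wcslen_any_alignment(const void *p)
{
    const unsigned char *bytes = p;
    size_t n = 0;

    for (;;) {
        wchar_t c;
        memcpy(&c, bytes + n * sizeof c, sizeof c);
        if (c == L'\0')
            return n;
        n++;
    }
}
```

For example, copying L"Spectate" into a byte buffer at offset 1 and passing that address to wcslen_any_alignment() should give 8, while is_wchar_aligned() reports the same address as misaligned.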
Comment by Allan McRae (Allan) - Thursday, 02 February 2012, 00:49 GMT
All this leads me to believe that it is an ETQW issue and not a glibc one, so I am closing this bug.
