FS#24615 - [glibc] segfault in __libc_res_nquery

Attached to Project: Arch Linux
Opened by Thomas Dziedzic (tomd123) - Tuesday, 07 June 2011, 13:38 GMT
Last edited by Allan McRae (Allan) - Sunday, 30 October 2011, 06:51 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To Allan McRae (Allan)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 17
Private No

Details

Upgrading to glibc 2.14-1 causes the attached segfaults in ruby and pacman
Downgrading to glibc 2.13-5 resolves the segfaults

Additional info:
* package version(s)
glibc 2.14-1

Fedora is also experiencing this:
https://bugzilla.redhat.com/buglist.cgi?quicksearch=__libc_res_nquery

Closed by  Allan McRae (Allan)
Sunday, 30 October 2011, 06:51 GMT
Reason for closing:  Fixed
Additional comments about closing:  glibc-2.14.1-1
Also, use a better DNS server..
Comment by Thomas Dziedzic (tomd123) - Tuesday, 07 June 2011, 14:55 GMT
Just tried the attached patch at http://sourceware.org/bugzilla/show_bug.cgi?id=12684
It didn't fix the issue, so I guess it's not the same issue as this.
Comment by Florian Zeitz (Florob) - Tuesday, 07 June 2011, 17:07 GMT
Not sure it's helpful, but for me this segfault occurs only if I have no network cable attached.
Comment by Anonymous Submitter - Wednesday, 08 June 2011, 07:09 GMT
I have

[ 28.503211] ntpd[1814]: segfault at 3 ip 00007f9d2c7eea65 sp 00007ffff1785e10 error 4 in libresolv-2.14.so[7f9d2c7e7000+13000]
[ 206.490454] gogoc[2750]: segfault at 3 ip 00007f884f110a65 sp 00007fff1538bba0 error 4 in libresolv-2.14.so[7f884f109000+13000]

for dmesg | grep -i segfault

Is this related to this bug?
Comment by Allan McRae (Allan) - Wednesday, 08 June 2011, 07:33 GMT
That is likely to be the same issue.
Comment by Marti (intgr) - Wednesday, 08 June 2011, 09:18 GMT
The bug seems to be in broken error handling in getaddrinfo() when sending DNS packets.

I can 100% reliably reproduce it by creating an iptables reject rule for DNS packets:
# iptables -A OUTPUT -p udp --dport 53 -j REJECT --reject-with icmp-admin-prohibited
# wget http://google.com/
--2011-06-08 12:13:58-- http://google.com/
Resolving google.com... zsh: segmentation fault (core dumped)

# ltrace wget http://google.com/
[...]
getaddrinfo("google.com", NULL, 0x7fffa31bf830, 0x7fffa31bf888 <unfinished ...>
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

# strace wget http://google.com/
[...]
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.0.6")}, 16) = 0
poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "\216A\1\0\0\1\0\0\0\0\0\0\6google\3com\0\0\1\0\1", 28, MSG_NOSIGNAL, NULL, 0) = -1 EPERM (Operation not permitted)
close(3) = 0
--- {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x3} (Segmentation fault) ---
+++ killed by SIGSEGV (core dumped) +++

To remove the iptables rule, just run:
# iptables -D OUTPUT -p udp --dport 53 -j REJECT --reject-with icmp-admin-prohibited
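
For testing without wget, the same lookup can be triggered from a few lines of Python. This is only a minimal sketch: it assumes CPython on Linux, where socket.getaddrinfo() wraps the C library's getaddrinfo(), and the helper name try_resolve is made up for illustration. On a healthy resolver a rejected lookup should come back as an error tuple; on an affected glibc the interpreter would presumably die with SIGSEGV instead, as wget does above.

```python
import socket


def try_resolve(host):
    """Call getaddrinfo() and report the outcome instead of raising.

    try_resolve is a hypothetical helper, not part of any library.
    """
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror as exc:
        # A failed lookup should end up here on a working resolver.
        return ("error", exc.errno)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string. Collect the distinct ones.
    return ("ok", sorted({info[4][0] for info in infos}))
```

With the iptables rule in place, running print(try_resolve("google.com")) should exercise the same getaddrinfo() path as the wget call.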
Comment by Sven-Hendrik Haase (Svenstaro) - Wednesday, 08 June 2011, 18:10 GMT
I can reproduce:

$ dmesg | grep -i segfault
[ 85.520794] ntpd[1012]: segfault at 3 ip 00007f576d7bda65 sp 00007fff31cb5710 error 4 in libresolv-2.14.so[7f576d7b6000+13000]
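
One way to check whether these dmesg reports point at the same code is to subtract the library's load address (the value in square brackets) from the instruction pointer. The sketch below does that for the three libresolv lines posted so far; the regular expression and the function name fault_offset are illustrative assumptions, not part of any tool.

```python
import re

# Matches the kernel's "segfault at ... ip ... sp ... error ... in lib[base+size]" line.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library, ip - load base) for a dmesg segfault line, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return m["lib"], int(m["ip"], 16) - int(m["base"], 16)


# The three libresolv reports from this thread.
REPORTS = [
    "[ 28.503211] ntpd[1814]: segfault at 3 ip 00007f9d2c7eea65 sp 00007ffff1785e10 error 4 in libresolv-2.14.so[7f9d2c7e7000+13000]",
    "[ 206.490454] gogoc[2750]: segfault at 3 ip 00007f884f110a65 sp 00007fff1538bba0 error 4 in libresolv-2.14.so[7f884f109000+13000]",
    "[ 85.520794] ntpd[1012]: segfault at 3 ip 00007f576d7bda65 sp 00007fff31cb5710 error 4 in libresolv-2.14.so[7f576d7b6000+13000]",
]

for report in REPORTS:
    lib, offset = fault_offset(report)
    print(lib, hex(offset))
```

All three lines come out as libresolv-2.14.so 0x7a65, with the same fault address (3), which is consistent with these being the same crash.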
Comment by Mathieu Pasquet (mathieui) - Wednesday, 08 June 2011, 22:22 GMT
Same issue with firefox (see attached file)
Comment by Sander Jansen (GogglesGuy) - Friday, 10 June 2011, 18:55 GMT
I guess every software that does a (failed) dns lookup will be affected.
Comment by Rémy Oudompheng (remyoudompheng) - Friday, 10 June 2011, 19:02 GMT
Confirmed here with firefox.
Comment by Sander Jansen (GogglesGuy) - Friday, 10 June 2011, 22:06 GMT
I guess every software that does a (failed) dns lookup will be affected.
Comment by Garry Borman (Emae101) - Saturday, 11 June 2011, 08:33 GMT
>Architecture x86_64
I'm using x86 and experienced this bug.
It's nasty.

Comment by Hussam Al-Tayeb (hussam) - Saturday, 18 June 2011, 15:20 GMT
the exact crash is back in glibc-2.14-3
Comment by Thomas Dziedzic (tomd123) - Saturday, 18 June 2011, 15:33 GMT
OP here, I can't reproduce the ruby or pacman segfaults with 2.14-3
Comment by Hussam Al-Tayeb (hussam) - Saturday, 18 June 2011, 15:44 GMT
I'm getting the firefox crash.
Comment by Allan McRae (Allan) - Sunday, 19 June 2011, 04:29 GMT
Can you give a gdb trace? I can not replicate at all.
Comment by Hussam Al-Tayeb (hussam) - Sunday, 19 June 2011, 14:30 GMT
#0 0xb7fde424 in __kernel_vsyscall ()
#1 0xb7d27b8f in raise () from /lib/libc.so.6
#2 0xb7d29515 in abort () from /lib/libc.so.6
#3 0xb7d20655 in ?? () from /lib/libc.so.6
#4 0xb7d20707 in __assert_fail () from /lib/libc.so.6
#5 0xb509eda7 in __libc_res_nquery () from /lib/libresolv.so.2
#6 0xb509ef35 in ?? () from /lib/libresolv.so.2
#7 0xb509f5df in __libc_res_nsearch () from /lib/libresolv.so.2
#8 0xb3002509 in _nss_dns_gethostbyname4_r () from /lib/libnss_dns.so.2
#9 0xb7db303b in ?? () from /lib/libc.so.6
#10 0xb7db4e4d in getaddrinfo () from /lib/libc.so.6
#11 0xb7bad45b in PR_GetAddrInfoByName () from /usr/lib/libnspr4.so
#12 0xb6441011 in ?? () from /usr/lib/xulrunner-2.0/libxul.so
#13 0xb7bba221 in ?? () from /usr/lib/libnspr4.so
#14 0xb7fa3c77 in start_thread () from /lib/libpthread.so.0
#15 0xb7dcc43e in clone () from /lib/libc.so.6


pacman -Qi glibc
Name : glibc
Version : 2.14-3

Comment by Hussam Al-Tayeb (hussam) - Monday, 20 June 2011, 23:35 GMT
For what it is worth, glibc 2.14-2 did not show the crash. Only glibc 2.14-1 and 2.14-3 show the crash.
Comment by Hussam Al-Tayeb (hussam) - Wednesday, 22 June 2011, 12:31 GMT
firefox: res_query.c:251: __libc_res_nquery: Assertion `hp != hp2' failed.
Aborted

is what it shows in the terminal when it crashes
Comment by Enjolras (enjolras) - Wednesday, 22 June 2011, 13:17 GMT
I can reproduce.
Comment by Plex (plexor) - Thursday, 23 June 2011, 04:31 GMT
I also am having troubles.

Many apps segfault either right away or 'randomly' including: nautilus, epiphany, totem, firefox and gnome-settings.

My gdb output for firefox:
firefox-bin: res_query.c:251: __libc_res_nquery: Assertion `hp != hp2' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffcf5ff700 (LWP 15142)]
0x00007ffff53de795 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00007ffff53de795 in raise () from /lib/libc.so.6
#1 0x00007ffff53dfc0b in abort () from /lib/libc.so.6
#2 0x00007ffff53d753e in ?? () from /lib/libc.so.6
#3 0x00007ffff53d75e2 in __assert_fail () from /lib/libc.so.6
#4 0x00007fffecbf8e5a in __libc_res_nquery () from /lib/libresolv.so.2
#5 0x00007fffecbf8f5e in ?? () from /lib/libresolv.so.2
#6 0x00007fffecbf9545 in __libc_res_nsearch () from /lib/libresolv.so.2
#7 0x00007fffd07fb947 in _nss_dns_gethostbyname4_r () from /lib/libnss_dns.so.2
#8 0x00007ffff5469ab1 in ?? () from /lib/libc.so.6
#9 0x00007ffff546bb50 in getaddrinfo () from /lib/libc.so.6
#10 0x00007ffff57273b1 in PR_GetAddrInfoByName () from /usr/lib/libnspr4.so
#11 0x00007ffff63547f7 in ?? () from /usr/lib/firefox-5.0/libxul.so
#12 0x00007ffff5732f33 in ?? () from /usr/lib/libnspr4.so
#13 0x00007ffff7bc8d60 in start_thread () from /lib/libpthread.so.0
#14 0x00007ffff547de2d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

The segfault messages in my system log:
[ 99.362182] nautilus[1540]: segfault at 7f7842d22003 ip 00007f783a03cd86 sp 00007fff7edc96a8 error 4 in libc-2.14.so[7f7839fc2000+157000]
[ 377.092076] gnome-settings-[1288]: segfault at 7f70bb379210 ip 00007f70bb379210 sp 00007fffe5fc6918 error 14 in libdbus-1.so.3.5.5[7f70bb5bb000+42000]
[ 3754.461120] gnome-settings-[2299]: segfault at 7f0b20f01210 ip 00007f0b20f01210 sp 00007fff45f79ee8 error 14 in libdbus-1.so.3.5.5[7f0b21143000+42000]
[ 4115.005514] totem[2755]: segfault at 7f6874d72000 ip 00007f67d5dd8d86 sp 00007fff7b61dd78 error 4 in libc-2.14.so[7f67d5d5e000+157000]
[ 4272.177949] nautilus[3022]: segfault at 7f4166c0e003 ip 00007f415df2dd86 sp 00007fff3cc6fa58 error 4 in libc-2.14.so[7f415deb3000+157000]
[ 6322.581224] nautilus[6950]: segfault at 7f6047f37000 ip 00007f603f258024 sp 00007fff482a11b8 error 4 in libc-2.14.so[7f603f1dc000+157000]
Comment by Allan McRae (Allan) - Thursday, 23 June 2011, 08:53 GMT
Hmm... the assert failure is a different bug which appears to be caused by the same patch. It is possibly router specific, as people tend to observe it on all computers on their network.
Comment by Hussam Al-Tayeb (hussam) - Thursday, 23 June 2011, 18:28 GMT
I don't have a router. direct cable from ISP. But it is a bad connection. The ISP uses transparent caching using a squid server. That could explain it.
Comment by Pham Ngoc Hai (phamngochai) - Friday, 24 June 2011, 05:55 GMT
Me too, my firefox crashes with:
=========
firefox-bin: res_query.c:251: __libc_res_nquery: Assertion `hp != hp2' failed.
Aborted
=========
Running glibc 2.14-3
Comment by Justin Gottula (jgottula) - Friday, 24 June 2011, 07:25 GMT
I experienced fairly consistent segfaults in Firefox and KTorrent under 2.14-1, and they seemed to go away with 2.14-2. Now with 2.14-3, however, I'm getting the assertion failure (and its SIGABRT counterpart) coming from __libc_res_nquery under the same applications on a much more sporadic basis.


firefox: res_query.c:251: __libc_res_nquery: Assertion `hp != hp2' failed.
Aborted

(no backtrace on this one, as it spitefully refused to happen when running in gdb)


Application: KTorrent (ktorrent), signal: Aborted
[Current thread is 1 (Thread 0x7ffff7f9a760 (LWP 3316))]

Thread 1 (Thread 0x7ffff7f9a760 (LWP 3316)):
#0 0x00007ffff5470ac4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x00007ffff56f659b in QWaitCondition::wait(QMutex*, unsigned long) () from /usr/lib/libQtCore.so.4
#2 0x00007ffff56ea741 in ?? () from /usr/lib/libQtCore.so.4
#3 0x00007ffff56ebc2f in QThreadPool::~QThreadPool() () from /usr/lib/libQtCore.so.4
#4 0x00007ffff56ebc69 in QThreadPool::~QThreadPool() () from /usr/lib/libQtCore.so.4
#5 0x00007ffff56ebc95 in ?? () from /usr/lib/libQtCore.so.4
#6 0x00007ffff3d1a311 in ?? () from /lib/libc.so.6
#7 0x00007ffff3d1a395 in exit () from /lib/libc.so.6
#8 0x00007ffff4a08f28 in ?? () from /usr/lib/libQtGui.so.4
#9 0x00007ffff62232c8 in KApplication::xioErrhandler(_XDisplay*) () from /usr/lib/libkdeui.so.5
#10 0x00007ffff19fdfae in _XIOError () from /usr/lib/libX11.so.6
#11 0x00007ffff19fb8ad in _XEventsQueued () from /usr/lib/libX11.so.6
#12 0x00007ffff19ec22f in XEventsQueued () from /usr/lib/libX11.so.6
#13 0x00007ffff4a3ffec in ?? () from /usr/lib/libQtGui.so.4
#14 0x00007fffeef89f24 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#15 0x00007fffeef8a7f2 in ?? () from /usr/lib/libglib-2.0.so.0
#16 0x00007fffeef8ad09 in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#17 0x00007ffff5807876 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#18 0x00007ffff4a401be in ?? () from /usr/lib/libQtGui.so.4
#19 0x00007ffff57dbdb2 in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#20 0x00007ffff57dbfb7 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#21 0x00007ffff57e01ab in QCoreApplication::exec() () from /usr/lib/libQtCore.so.4
#22 0x0000000000428292 in ?? ()
#23 0x00007ffff3d0417d in __libc_start_main () from /lib/libc.so.6
#24 0x0000000000429461 in _start ()
Comment by Allan McRae (Allan) - Saturday, 25 June 2011, 12:03 GMT
Try glibc-2.14-4 and let me know if all is good.
Comment by vicencb (vicencb) - Sunday, 25 September 2011, 17:39 GMT
With version glibc-2.14-6 I get this backtrace:
firefox-bin: res_query.c:258: __libc_res_nquery: Assertion `hp != hp2' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffd89fe700 (LWP 3022)]
0x00007ffff53ce735 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00007ffff53ce735 in raise () from /lib/libc.so.6
#1 0x00007ffff53cfbab in abort () from /lib/libc.so.6
#2 0x00007ffff53c757e in ?? () from /lib/libc.so.6
#3 0x00007ffff53c7622 in __assert_fail () from /lib/libc.so.6
#4 0x00007fffec9cbd52 in __libc_res_nquery () from /lib/libresolv.so.2
#5 0x00007fffec9cbed5 in ?? () from /lib/libresolv.so.2
#6 0x00007fffec9cc48f in __libc_res_nsearch () from /lib/libresolv.so.2
#7 0x00007fffdcdfa987 in _nss_dns_gethostbyname4_r () from /lib/libnss_dns.so.2
#8 0x00007ffff5459401 in ?? () from /lib/libc.so.6
#9 0x00007ffff545b560 in getaddrinfo () from /lib/libc.so.6
#10 0x00007ffff5716bbf in PR_GetAddrInfoByName () from /usr/lib/libnspr4.so
#11 0x00007ffff63771c6 in ?? () from /usr/lib/firefox-6.0.2/libxul.so
#12 0x00007ffff5722853 in ?? () from /usr/lib/libnspr4.so
#13 0x00007ffff7bc8da0 in start_thread () from /lib/libpthread.so.0
#14 0x00007ffff546d7dd in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
(gdb) cont

Program received signal SIGABRT, Aborted.
0x00007ffff7bd06cb in raise () from /lib/libpthread.so.0
(gdb) cont
...
Program terminated with signal SIGABRT, Aborted.
Comment by Allan McRae (Allan) - Tuesday, 27 September 2011, 05:05 GMT
Comment by vicencb (vicencb) - Tuesday, 27 September 2011, 19:36 GMT
Couldn't reproduce with glibc-2.14-6.1-x86_64.
I've visited the same web page that produced the problem and worked fine.
(the page was: http://wiki.openmoko.org/wiki/Manually_using_GPRS)
Comment by Cyker Way (cyker) - Wednesday, 12 October 2011, 12:34 GMT
Seems http://allanmcrae.com/tmp/glibc-2.14-6.1-i686.pkg.tar.xz fixed the problem.

I opened the page that previously caused the crash with Firefox, and no crash any more. Good job!
Comment by Allan McRae (Allan) - Tuesday, 25 October 2011, 09:42 GMT
I went back to the crappy workaround from glibc-2.14-4 again in glibc-2.14.1-1, as neither of the patches appears to fully fix the issue. This is all due to crappy DNS servers, so everyone who was affected should look into changing theirs, because there is no guarantee that the workaround will stay once more commits get made to that area of the code...
Comment by Allan McRae (Allan) - Tuesday, 25 October 2011, 09:42 GMT
Also, please confirm glibc-2.14.1-1 fixes the issues for you.
Comment by vicencb (vicencb) - Tuesday, 25 October 2011, 18:42 GMT
Downgraded to glibc-2.14-6-x86_64 to check the mentioned web page still triggers the issue: SIGABRT.
Upgraded to glibc-2.14.1-1-x86_64 and retried: no problem detected.
So it's confirmed: glibc-2.14.1-1-x86_64 fixes the issues (at least for me).

Thanks.
