FS#11166 - DNS over IPv6 does not work any more

Attached to Project: Arch Linux
Opened by Andrej Podzimek (andrej) - Saturday, 09 August 2008, 21:49 GMT
Last edited by Kevin Piche (kpiche) - Saturday, 28 February 2009, 18:17 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: Kevin Piche (kpiche)
Architecture: All
Severity: Critical
Priority: Normal
Reported Version: None
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 4
Private: No

Details

Description:

Since the last update of bind and dnsutils, DNS over IPv6 no longer works. Dig says „can't find IPv6 networking“ and Bind says „no IPv6 interfaces found“.

IPv6 interfaces and radvd are up and running, of course.

Additional info:
* package version(s)
* config and/or log files etc.

dnsutils 9.5.0-4
bind 9.5.0-4

Steps to reproduce:

On an up-to-date Arch installation, just try to make a DNS query with Bind through IPv6.

Closed by  Kevin Piche (kpiche)
Saturday, 28 February 2009, 18:17 GMT
Reason for closing:  Fixed
Additional comments about closing:  Fixed in bind 9.6.0-P1-1.
Comment by Andrej Podzimek (andrej) - Saturday, 09 August 2008, 22:02 GMT
Well, after a bit of googling, I see this could be a bug in Bind. Version 9.4.2 worked...
Comment by Andrej Podzimek (andrej) - Saturday, 09 August 2008, 23:21 GMT
A bit more info. I tried each of these configure options:

--enable-getifaddr=no
--enable-getifaddr=glibc
--disable-getifaddr

All of them lead to the same result: IPv6 does not work.

What's even more interesting: Even version 9.4.2 does not work with IPv6 any more. Something must have changed either in glibc or in the kernel.
Comment by Andrej Podzimek (andrej) - Sunday, 10 August 2008, 02:42 GMT
Sorry for my multiposting, but... One of my computers *does* handle IPv6 DNS requests. It hasn't been updated for some time. And this is the package diff:

bash 3.2.033-2 3.2.039-2
bind 9.4.2-1 9.5.0-4
digikam 0.9.4-1 0.9.4-2
dirmngr 0.9.7-1 1.0.2-1
dnsutils 9.4.2-1 9.5.0-4
faad2 2.6-1 2.6.1-1
fakeroot 1.9.3-1 1.9.5-1
ffmpeg 20080715-2 20080715-3
firefox 3.0.1-1 3.0.1-2
gcc 4.3.1-2 4.3.1-3
gcc-libs 4.3.1-2 4.3.1-3
ghostscript 8.62-2 8.63-2
git 1.5.6.4-1 1.5.6.5-1
gnokii 0.6.22-3 0.6.26-1
gnupg2 2.0.8-1 2.0.9-1
hunspell 1.2.4-1 1.2.6-1
intltool 0.40.1-1 0.40.3-1
kdebase 3.5.9-5 4.1.0-2
kdelibs 3.5.9-6 4.1.0-4
kdepim 3.5.9-3 4.1.0-2
libassuan 1.0.4-2 1.0.5-1
libkdcraw 0.1.4-1 0.1.4-2
libkexiv2 0.1.7-1 0.1.7-2
libkipi 0.1.6-1 0.1.6-2
libksba 1.0.2-1 1.0.3-1
libmal 0.42-1 0.44-1
man-pages 3.03-1 3.05-1
pacman 3.1.4-1 3.2.0-1
pciutils 2.2.8-3 3.0.0-2
pm-utils 1.1.2.2-1 1.1.2.4-1
psi 0.12-1 0.12-2
qt 4.4.0-6 4.4.1-1
raptor 1.4.16-1 1.4.18-1
rarian 0.8.0-1 0.8.0-2
redland 1.0.7-2 1.0.8-1
ruby 1.8.7_p22-2 1.8.7_p71-1
soprano 2.1-1 2.1.1-1
sqlite3 3.5.9-2 3.6.1-1
thunderbird 2.0.0.14-2 2.0.0.16-1
ttf-dejavu 2.25-1 2.26-1
xine-lib 1.1.14-1 1.1.14-2
xkeyboard-config 1.2-1 1.3-1
xulrunner 1.9.0.1-1 1.9.0.1-2
zip 2.32-1 3.0-1

Both machines use exactly the same kernel, and both are i686. (I am using vanilla kernel 2.6.25.15.) Which of these packages could have caused such trouble? Any ideas?
Comment by Andrej Podzimek (andrej) - Sunday, 10 August 2008, 02:54 GMT
Sorry again, but I've found a SOLUTION. Here it is:

http://bugs.gentoo.org/show_bug.cgi?id=227333
Comment by Andrej Podzimek (andrej) - Sunday, 10 August 2008, 05:00 GMT
The Gentoo patch did not work. This is a desperate user's dirty hack. Yes, it's terrible. Yes, it might cause unexpected behaviour. But it works for me. After 24 hours of (human) uptime, this is the best solution I could think of. My IPv6 network is up and running again.

Packages outside Bind did not cause this problem. My assumption above was completely wrong.

The problem is pretty simple, but hard to diagnose: __USE_GNU must be defined when netinet/in.h is parsed. (The structure in6_pktinfo, required by configure scripts, is inside #ifdef.) This enables IPv6 support.
Comment by Greg (dolby) - Sunday, 10 August 2008, 16:53 GMT
File a bug report upstream
Comment by Gilles Bedel (gillux) - Tuesday, 12 August 2008, 16:11 GMT
Hello,

I just want to warn anyone who is going to send that upstream. I found this bug some time ago and sent a bug report to the BIND developers, but I couldn't find any way to make them understand their bug. They think it's a glibc regression... The latest BIND version still doesn't include the fix. I attached my mail conversation.

Anyway, thanks for fixing it in ArchLinux :)
Comment by Andrej Podzimek (andrej) - Tuesday, 12 August 2008, 16:32 GMT
Well, this is why most of the „bigger“ distros like Gentoo, Ubuntu or Debian usually do not wait until the upstream bug is resolved, but introduce their own fix immediately, just to make it work. (On the other hand, this approach can sometimes lead to big trouble, such as the recent RSA key bug in Debian...)

My dirty hack (see the PKGBUILDs above) is not completely correct. A much better approach would be to replace

#include <netinet/in.h>

with

#ifndef __USE_GNU
#define ARCHLINUX_UNDEF_GNU
#define __USE_GNU
#endif

#include <netinet/in.h>

#ifdef ARCHLINUX_UNDEF_GNU
#undef __USE_GNU
#undef ARCHLINUX_UNDEF_GNU
#endif

This switches __USE_GNU on only while netinet/in.h is parsed and leaves its value unchanged in the rest of the code. (To be honest, I don't know whether this would compile or run. My hacked PKGBUILDs included above compile and run just fine.)

Well, it's up to the package maintainer to make a final decision...
Comment by Gilles Bedel (gillux) - Tuesday, 12 August 2008, 16:50 GMT
Andrej,

as I explained in my e-mail, the only thing that has to be done is to add -D_GNU_SOURCE to CFLAGS, at least for the ./configure test and for the code that includes <netinet/in.h>. Gentoo did it very well: http://bugs.gentoo.org/attachment.cgi?id=157021

IMHO, the BIND (which means Berkeley Internet Name Domain, read BSD) developers may just not want to add a flag that "says" that the source code is GNU. Or I really don't understand their way of thinking.

See http://www.gnu.org/software/libc/FAQ.html#s-3.1 (and my previous attachment)
Comment by Andrej Podzimek (andrej) - Tuesday, 12 August 2008, 17:15 GMT
For some reason or other, the Gentoo patch did not work at all for me. After applying the patch, I could still see „Disabling runtime IPv6 support“ in the ./configure output. The reason was (again) the lack of in6_pktinfo. That's why I tried my stupid hack...

IMHO, it would be nice to fix the ArchLinux package first, using any possible (harmless) approach. There is always enough time for discussion with BIND developers on philosophical topics like GNU source. A package that can cause an IPv6 DNS blackout for IPv6 sites and LANs is in the 'extra' repository right *now*.

Anyway, this is somewhat weird. I have never had such a problem with BIND.
Comment by Gilles Bedel (gillux) - Tuesday, 12 August 2008, 18:36 GMT
Basically, you only need to set _GNU_SOURCE in your CFLAGS:

$ CFLAGS=-D_GNU_SOURCE ./configure && make # works here.

Gentoo's patch only patches the configure.in, so you have to run autoconf to regenerate the configure script.

But you are right indeed, that's not enough. After applying the patch, ./configure successfully detects IPv6, but the build fails. The Gentoo patch defines _GNU_SOURCE in config.h, but config.h is only included in lib/isc/httpd.c, whereas there are plenty of files that include <netinet/in.h>. IMHO, we can't just #include <config.h> in all the files that require <netinet/in.h>; we must set -D_GNU_SOURCE in CFLAGS. I feel that make/rules.in is the right place for that: I guess it generates make/rules, which is included by Makefile.in and configure.in through the BIND9_MAKE_RULES line. But I really don't know much about the automake/autoconf/aclocal things (isn't there supposed to be a Makefile.am?). Maybe someone could tell us.

Also, you know, you are never supposed to define __USE_GNU yourself, because it's an internal glibc thing. Use _GNU_SOURCE instead.
Comment by Kristoffer Jan-Olov Tångfelt (revellion) - Sunday, 12 October 2008, 22:46 GMT
I added export CFLAGS="-D_GNU_SOURCE" to a PKGBUILD from testing/bind using abs and it seems to work fine. Dunno how "clean" this addition is, though, but it made both dnsutils and bind build with IPv6 support.
Comment by Gilles Bedel (gillux) - Wednesday, 15 October 2008, 10:49 GMT
Upstream fixed it, finally.

The bug (RT #18388) I opened two and a half months ago at the ISC is now closed. Their bug report archives and their working repository are not publicly available, so you may not see it until the next bind release.

However, someone talked about it here: http://www.ietf.org/mail-archive/web/ipv6/current/msg09860.html
Comment by Andrej Podzimek (andrej) - Wednesday, 15 October 2008, 12:23 GMT
Wow! That's great, we won't have to worry about bind upgrades any more.

I'd suggest removing the --disable-threads from PKGBUILD. It doesn't make sense in the era of dual-core CPUs. BIND will use just one if this option stays there.
Comment by Glenn Matthys (RedShift) - Friday, 05 December 2008, 10:35 GMT
The fix for ipv6 support is in the 9.6.0 beta 1 release notes (http://oldwww.isc.org/sw/bind/view/?release=9.6.0b1&noframes=1)
