FS#41345 - [glibc] getaddrinfo() always prefers IPv4 over IPv6 on a dual-stack system, ignoring /etc/gai.conf
Attached to Project:
Arch Linux
Opened by Andrej Podzimek (andrej) - Friday, 25 July 2014, 21:54 GMT
Last edited by Allan McRae (Allan) - Monday, 28 July 2014, 13:27 GMT
Details
Description:
Since one of the recent updates, getaddrinfo() prefers IPv4 over IPv6 on a dual-stack system. All applications using getaddrinfo() are affected (psi-plus, Thunderbird, Chromium, OpenSSH, ...). You can inspect what getaddrinfo() returns by installing perl-socket-getaddrinfo from the AUR, for example.

Additional info:
* package version(s)
The failing package version: glibc 2.19-5
The old and working package version: glibc 2.19-4
* config and/or log files etc.
As for config files, I tried the default /etc/gai.conf, no /etc/gai.conf at all, and also my tweaked /etc/gai.conf (which favors even 6to4 over IPv4). Nothing helps, i.e., IPv4 always wins.

After installing perl-socket-getaddrinfo from the AUR, you can easily have a look at this:

$ getaddrinfo www.google.com
Resolved host 'www.google.com', service '0'
socket(AF_INET , SOCK_STREAM, IPPROTO_TCP) + '74.125.232.49:0'
socket(AF_INET , SOCK_DGRAM , IPPROTO_UDP) + '74.125.232.49:0'
socket(AF_INET , SOCK_RAW   , IPPROTO_IP ) + '74.125.232.49:0'
[...]
socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) + '[2a00:1450:400d:802::1012]:0'
socket(AF_INET6, SOCK_DGRAM , IPPROTO_UDP) + '[2a00:1450:400d:802::1012]:0'
socket(AF_INET6, SOCK_RAW   , IPPROTO_IP ) + '[2a00:1450:400d:802::1012]:0'

The expected result would be the other way round. On an old (not updated for more than a month) Arch Linux system with glibc 2.19-4, I get the correct and expected outcome (with *all* the /etc/gai.conf variants mentioned above):

$ getaddrinfo www.google.com
Resolved host 'www.google.com', service '0'
socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) + '[2a00:1450:4001:807::1013]:0'
socket(AF_INET6, SOCK_DGRAM , IPPROTO_UDP) + '[2a00:1450:4001:807::1013]:0'
socket(AF_INET6, SOCK_RAW   , IPPROTO_IP ) + '[2a00:1450:4001:807::1013]:0'
[...]
socket(AF_INET , SOCK_STREAM, IPPROTO_TCP) + '173.194.112.240:0'
socket(AF_INET , SOCK_DGRAM , IPPROTO_UDP) + '173.194.112.240:0'
socket(AF_INET , SOCK_RAW   , IPPROTO_IP ) + '173.194.112.240:0'

Steps to reproduce:
Try to connect to a dual-stack server from a dual-stack client. IPv4 will be preferred over IPv6.
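By the way, the ordering can also be checked without the AUR package: glibc's own getent tool goes through getaddrinfo() internally, so a quick sanity check looks like this (www.google.com is just an example host):

$ getent ahosts www.google.com      # AF_UNSPEC lookup; addresses print in the order getaddrinfo() returned them
$ getent ahostsv6 www.google.com    # forces AF_INET6, for comparison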
Closed by Allan McRae (Allan)
Monday, 28 July 2014, 13:27 GMT
Reason for closing: None
Additional comments about closing: It seems that the bug is related to the strongswan AUR package rather than to glibc and getaddrinfo(). StrongSwan started setting 'preferred_lft 0' on IPv6 IPSec addresses (but not on IPv4), which confuses getaddrinfo() and causes it to prefer IPv4 (assuming there is no outbound IPv6 connectivity).
IPv6 is "de facto" disabled in ArchLinux at the moment.
However, whenever my machine's IPv6 addresses are obtained using IPSec (StrongSwan) (which is the most common case for my laptop -- it's a "road warrior" connecting to various IPv4 networks behind NAT), getaddrinfo() prefers native IPv4 addresses, though it used to prefer IPv6 in this case before. Surprisingly, when I use both an IPv6 address from IPSec and a 6to4 address from a local router, connections are made preferably from the 6to4 address rather than from the "native" IPv6 addresses configured by IPSec. (But getaddrinfo() does prefer IPv6 in this mixed case.)
This is really strange. Until recently, getaddrinfo() treated all IPv6 addresses equally, no matter how they were configured, but something must have changed, not necessarily in getaddrinfo() itself.
There's a difference I have noticed: the slightly outdated machine where getaddrinfo() still works as expected doesn't have any "metric" on its IPv4 routes when I inspect them with 'ip route show table all'. The machine with the incorrect address ordering does have a "metric" set on some of its IPv4 routes. Yet changing the metric manually (as sketched below) seems to have no effect, and I have no firm evidence that getaddrinfo() actually consults route metrics when ordering its results.
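To be concrete, this is the kind of manual change I mean; the gateway, device, and metric value below are placeholders, not my actual configuration:

# ip route change default via 192.0.2.1 dev eth0 metric 100   # set an explicit metric on the default route
# ip route show table all | grep metric                       # check which routes carry a metric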
Anyway, getaddrinfo() behaves as if it could somehow determine that a machine's IPv6 connectivity leads through an IPSec tunnel (capable of 'default' routing) and then prefers IPv4. (Yet IPv6 works perfectly fine with 'ssh -6' or when connecting to IPv6-only machines.) I can't see a way to tell getaddrinfo() that using the IPSec IPv6 tunnel by default was my *intention* and that it should just avoid IPv4 whenever possible. (In general, preferring the IPSec tunnel (for IPv4 and IPv6) has the additional benefit of preserving most TCP connections when switching between unrelated Wi-Fi networks or between Wi-Fi and Ethernet. Preferring native addresses automatically will break this.)
Anyway, I'm no longer convinced that this has to be a glibc bug... I have no idea what went wrong or why. Perhaps I should post all this as a forum question instead.
On the other hand, I cannot identify the package that causes the problem. I have a working machine with glibc 2.19-4 where everything works exactly as expected. But downgrading to glibc 2.19-4 on my up-to-date machine doesn't help at all; the issue is still there. So it's probably not a glibc problem. I have no idea what could be causing this.
Edit: test-ipv6.com says "Your browser is avoiding IPv6.", confirming what I already observed. But it's not the browser's fault; it's getaddrinfo() that has gone wrong.
The problem is that some version of StrongSwan (5.2 and later, I guess) started configuring 'preferred_lft 0' on the IPv6 IPSec tunnel addresses, yet the IPv4 tunnel addresses still get 'preferred_lft forever' for some reason. This must be a bug.
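This is easy to verify with iproute2: an address whose preferred lifetime is zero is flagged as deprecated. The output looks roughly like this (the interface name and address here are made up for illustration):

# ip -6 addr show dev eth0
    inet6 2001:db8::2/128 scope global deprecated
       valid_lft forever preferred_lft 0sec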
The meaning of 'preferred_lft 0' is explained here: http://www.davidc.net/networking/ipv6-source-address-selection-linux In my case, the IPSec tunnel's IPv6 address is usually the machine's *only* usable IPv6 address, so avoiding it is simply not an option. It can and must serve as a source address.
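For illustration, the workaround is to reset the preferred lifetime on the tunnel address; the address and interface below are placeholders for the actual IPSec tunnel endpoint:

# ip addr change 2001:db8::2/128 dev eth0 preferred_lft forever   # mark the tunnel address as preferred again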
Manually setting 'preferred_lft forever' (as in the command above) gives getaddrinfo() a hint that there is indeed a usable IPv6 source address. Consequently, getaddrinfo() prefers IPv6, exactly as desired and as it did before. With 'preferred_lft 0', getaddrinfo() assumes there is no outbound IPv6 connectivity and orders IPv4 addresses first.
OK, it seems that the mystery has been solved. I think I should report this to StrongSwan.