Arch Linux

FS#21153 - {archweb} mirrorstatus and mirrorlist: check for IPv6

Attached to Project: Arch Linux
Opened by PyroPeter (pyropeter) - Friday, 08 October 2010, 15:21 GMT
Last edited by Dan McGee (toofishes) - Monday, 25 April 2011, 22:47 GMT
Task Type: Feature Request
Category: Web Sites
Status: Closed
Assigned To: Dan McGee (toofishes)
Architecture: All
Severity: Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

It would be nice if the Mirrorstatus/Mirrorlist service checked whether a mirror is reachable via IPv4, IPv6, or both.

I am unsure how this should be checked. The easiest way would be to handle it like the country property, which seems to be checked manually when the mirror is added to the list.
But there are also reasons for checking this on a regular basis:
Some mirrors might enable or disable IPv6 at some point without an announcement.
Some mirrors could also have connectivity issues confined to one protocol, that is, issues that only occur when one of the two protocols is used.

Closed by  Dan McGee (toofishes)
Monday, 25 April 2011, 22:47 GMT
Reason for closing:  Implemented
Comment by Dan McGee (toofishes) - Friday, 08 October 2010, 16:44 GMT
We have a 4to6 tunnel set up on the server doing the checking, so we can actually see IPv6-only mirrors, and we also end up using IPv6 for any mirror that supports it.

These pages are new and not very filled out yet:
http://www.archlinux.org/mirrors/bjtu.edu.cn/

But I can envision them showing a lot more information than what is there now. We can pull the status data in; do the required DNS lookups on the hostnames and show the A/AAAA records (which will show whether IPv4/IPv6 support is there), etc.
Comment by Dan McGee (toofishes) - Friday, 08 October 2010, 16:48 GMT
If you know Python and/or Django, you are more than welcome to chip in here. Steps I can think of:

1. Add a migration to add has_ipv4, has_ipv6 columns (or something like that) to the MirrorUrl model.
2. Add another management command we could run like once a day that does DNS lookups on all of the mirror URLs, and then updates the columns accordingly based on what info came back.
3. Possibly persist this DNS information in another table, or just retrieve it on the fly.
4. Display DNS info and more on the mirror details page.
5. Allow filtering of the generated mirrorlist based on IPv4/IPv6 status (not much different than the FTP/HTTP filter).
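
Steps 1 and 2 could be sketched roughly as follows. This is only an illustration in plain Python 3, not the archweb code: the MirrorUrl dataclass is a hypothetical stand-in for the Django model, update_ip_support() is an invented name, and the resolver parameter exists only so the lookup can be replaced in tests. Failed lookups are collected rather than raised, on the assumption that a daily cron run should not abort on one bad hostname.

```python
import socket
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class MirrorUrl:
    """Hypothetical stand-in for the Django model; only the fields used here."""
    url: str
    has_ipv4: bool = False
    has_ipv6: bool = False


def update_ip_support(mirror_urls, resolver=socket.getaddrinfo):
    """Set has_ipv4/has_ipv6 on each record based on a DNS lookup.

    Returns the records whose lookup failed; their flags are left unchanged.
    """
    failed = []
    for mirror_url in mirror_urls:
        hostname = urlparse(mirror_url.url).hostname
        if hostname is None:
            failed.append(mirror_url)
            continue
        try:
            # Each result tuple starts with the address family.
            families = {info[0] for info in resolver(hostname, None)}
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(mirror_url)
            continue
        mirror_url.has_ipv4 = socket.AF_INET in families
        mirror_url.has_ipv6 = socket.AF_INET6 in families
    return failed
```

In a Django project the loop would presumably live in a management command and save each record after updating it; that part is left out here.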
Comment by PyroPeter (pyropeter) - Friday, 08 October 2010, 21:51 GMT
After looking at the code, I get the impression that the DNS lookups could easily be added to mirrorcheck.py (urllib looks them up anyway). Saving the results of the lookup in MirrorLog would also be sensible.
What do you mean by "DNS information" in step 3? Are you planning to save the IP addresses in a table?
Comment by Dan McGee (toofishes) - Friday, 08 October 2010, 22:06 GMT
That was a thought, that way we can implement #4 without much hassle.
Comment by Dan McGee (toofishes) - Friday, 08 October 2010, 22:43 GMT
This works pretty well for getting IP addresses:
In [15]: [i[4] for i in socket.getaddrinfo('testmyipv6.com', 80, 0, socket.SOCK_STREAM)]
Out[15]: [('24.73.171.238', 80), ('2001:4830:2502:8002::ac10:a', 80, 0, 0)]

Not sure how much we are going to have to do, but this might be helpful:
http://google-opensource.blogspot.com/2008/10/ipaddrpy-flexible-and-easy-python-ip.html
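
For reference, the ipaddr library linked above later became the basis of the ipaddress module in the Python 3 standard library. A minimal sketch of how the getaddrinfo() results could be grouped by IP version for display follows; the function name addresses_by_version() and the resolver parameter are made up for this example.

```python
import ipaddress
import socket


def addresses_by_version(hostname, resolver=socket.getaddrinfo):
    """Return {4: [...], 6: [...]} with the distinct addresses for hostname."""
    found = {4: [], 6: []}
    # Restricting to SOCK_STREAM avoids one duplicate entry per socket type.
    for info in resolver(hostname, None, 0, socket.SOCK_STREAM):
        address = info[4][0]  # sockaddr is (host, port) or (host, port, flow, scope)
        version = ipaddress.ip_address(address).version
        if address not in found[version]:
            found[version].append(address)
    return found
```

A host with a non-empty list under key 6 has at least one AAAA record, which is the information the mirror details page would show.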
Comment by PyroPeter (pyropeter) - Friday, 08 October 2010, 23:25 GMT
How about this:

>>> import socket
>>> from urlparse import urlparse
>>> url = "http://archlinux.de/"
>>> families = [x[0] for x in socket.getaddrinfo(urlparse(url).hostname, None)]
>>> socket.AF_INET in families
True
>>> socket.AF_INET6 in families
True

EDIT:
The attached patch implements #1 to #4, but is work in progress.
I did not use threads for mirrorresolv.py. Do long-running manage.py commands block the server? If not, I see no reason to add threading.

Comment by PyroPeter (pyropeter) - Saturday, 09 October 2010, 13:04 GMT
I now also implemented #5. (Patch attached)

In my opinion the handling of the Country-input in /mirrorlist/ is not user friendly at the moment:
Choosing 'Any' should result in a list of all mirrors, and not in a list of all mirrors with unspecified ('Any') country-property.
I suggest fixing that the way I did in the patch:
If the country input is 'Any', disable filtering by country.

To make the 'Any' MultipleChoiceField-entry independent from the countries actually used by the Mirrors, it is now manually added to the form if it was not added with make_choice().
Comment by PyroPeter (pyropeter) - Saturday, 09 October 2010, 15:50 GMT
Added logging to mirrorresolv.py and fixed sorting in /mirrors/status.
Comment by Dan McGee (toofishes) - Sunday, 10 October 2010, 14:08 GMT
I disagree with the 'Any' changes completely. I can no longer choose "Any" and "Canada", for instance, and get what I was looking for. This was the idea behind the top part of the mirrorlist update page. Unfortunately, we don't get to make any other selections in that part of the page, such as IP version or being able to sort by status.

If we want to do what you propose, don't reuse a value: we should either add a checkbox to the form that indicates we want to use all mirrors, or add an "All" option to the list.

Other comments:
* there is some trailing whitespace in the patch, make sure that is as clean as possible.
* "mirrorUrls" -> "mirror URLs" in log messages, no need to introduce camelcase that doesn't exist elsewhere.
* You're mixing strings and integers in your choice field- it ends up working, but it is probably better to always use strings- ('4', 'IPv4') since you compare on strings later anyway.
* "ipv4_only" is misleading as a function argument name, and the way this works seems wrong/backwards. I would expect the method to take as input "supports IPv4" or "supports IPv6", and you would pass both as true if you could support either. Passing both as false should also return no mirrors.
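
The semantics described in the last point could look roughly like this. It is a sketch over plain Python objects, not the actual find_mirrors() from archweb: the MirrorUrl dataclass and the argument names supports_ipv4/supports_ipv6 are assumptions made for illustration, and a Django version would filter a QuerySet instead of a list.

```python
from dataclasses import dataclass


@dataclass
class MirrorUrl:
    """Hypothetical stand-in for the Django model; only the fields used here."""
    url: str
    has_ipv4: bool
    has_ipv6: bool


def find_mirrors(mirror_urls, supports_ipv4=True, supports_ipv6=True):
    """Return the mirror URLs reachable with the IP versions the client supports.

    Passing both flags as False yields an empty list.
    """
    return [
        mirror_url for mirror_url in mirror_urls
        if (supports_ipv4 and mirror_url.has_ipv4)
        or (supports_ipv6 and mirror_url.has_ipv6)
    ]
```

A client that can use either protocol passes both flags as true and gets every mirror that has at least one address family.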
Comment by PyroPeter (pyropeter) - Sunday, 10 October 2010, 17:57 GMT
I did the things you suggested and generated the attached commits:

99bccbb Add 'All' choice and make it the default
145c462 Add has_ipv{4,6} to MirrorUrl
c2763c4 Add mirrorresolv manage.py command
5b941e5 Allow filtering of mirrorlist by IP-version
1699dbe Display information about supported IP-versions

"Passing both as false should also return no mirrors."
That would be nice, but it would make things more complex without adding any value.
People with neither IPv4 nor IPv6 won't need mirrorlists anyway.
At the moment there is no way to call find_mirrors with both values set to False, because the form validation prevents it.
Comment by Dan McGee (toofishes) - Wednesday, 13 October 2010, 13:47 GMT
Do you have a github account or something at any other git hosting provider? It might make it easier for me to pull your branch and check it out. I do like this being split into a few patches though, so thanks for that.

You are being shortsighted in your logic with find_mirrors(). Of course I realize that there is only one way it is currently called, but the fact that this stuff didn't even exist until 2 months ago also tells me there is a lot that can be done in the future with it, and I think your argument as to why you shouldn't fix it is a bit silly.
Comment by PyroPeter (pyropeter) - Wednesday, 13 October 2010, 17:09 GMT
You convinced me to return an empty QuerySet when called without any IP support at all.
I created a repository at github and pushed my changes:

Github project page: http://github.com/pyropeter/archweb
Git-url for pulling: git://github.com/pyropeter/archweb.git
Comment by Dan McGee (toofishes) - Thursday, 14 October 2010, 00:33 GMT
OK, this is mostly implemented and now live on the main site. You have an email from me about one patch, and I made a few changes along the way to make sure this thing can run in cron without needing to be watched for failures and such. Let me know what you think.
Comment by Dan McGee (toofishes) - Monday, 25 April 2011, 22:46 GMT
