FS#8745 - failing WPA2 authentication after update to 2.6.23.8 kernel

Attached to Project: Arch Linux
Opened by Michal (broch) - Friday, 23 November 2007, 16:05 GMT
Last edited by Tobias Powalowski (tpowa) - Friday, 07 December 2007, 09:47 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture i686
Severity Critical
Priority Normal
Reported Version 2007.08-2
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
serious problems with WPA2 authentication with kernel 2.6.23.8-x (where x is internal Arch version). System updated to the latest package versions.
WPA2 connection requires several network restarts.

Possible explanation:
either cfs is exposing an unknown bug in WPA2 authentication or cfs is responsible for failing WPA2 authentication.

Additional info:
* package version(s)
kernel 2.6.23.8
* config and/or log files etc.


Steps to reproduce:
1) install kernel 2.6.23.1, run wireless with WPA2. This setup will connect each time without any problems
2) install kernel 2.6.23.8 run wireless with WPA2. This setup will "eventually" connect but requires several network restarts (up to 20 in my case)
3) install custom kernel 2.6.23.8 patched only with sched-cfs-devel.patch
from
http://kamikaze.waninkoko.info/patches/2.6.23/kamikaze2/broken-out/
again WPA2 connection is restored and wireless network will work without authentication issues
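
To put a number on "requires several network restarts" in the steps above, here is a minimal sketch in Python. It is not part of the original report. `try_connect` is a hypothetical callable that you would supply yourself (for example, a wrapper that restarts the network and returns True once the WPA2 connection is up); the default limit of 20 simply mirrors the "up to 20" figure mentioned in step 2.

```python
from typing import Callable, Optional


def restarts_until_connected(
    try_connect: Callable[[], bool], max_restarts: int = 20
) -> Optional[int]:
    """Return how many restarts were needed before try_connect() succeeded.

    0 means the very first attempt worked. None means no attempt
    succeeded within the initial attempt plus max_restarts restarts.
    try_connect is a hypothetical, user-supplied callable.
    """
    for restarts in range(max_restarts + 1):
        if try_connect():
            return restarts
    return None
```

Running this once per kernel under test gives one comparable number per boot instead of an impression.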

Bug was confirmed here:
http://bbs.archlinux.org/viewtopic.php?pid=301994#p301994
by users with different hardware setup but exactly the same kernel version: 2.6.23/2.6.23.1 working and 2.6.23.8 failing
wireless NICs tested and having problems with WPA2 authentication:
Broadcom 4311
ipw2200
ipw3945
users report systems updated to the latest package versions


More info (personal experience).
I don't think that this is an Arch issue; I see this problem with customized vanilla kernels too.
2.6.22.x without cfs never had wpa2 authentication issues
2.6.22.x with different versions of cfs had intermittent authentication problems. Intermittent means that some versions of cfs caused authentication issues some did not.
I never had this issue with SD cpu scheduler.

Possible resolution: test one cfs version and confirm that it does not interfere with WPA2 authentication. Use working cfs version with Arch kernels?

Closed by  Tobias Powalowski (tpowa)
Friday, 07 December 2007, 09:47 GMT
Reason for closing:  Fixed
Comment by Manuel C. (ekerazha) - Friday, 23 November 2007, 21:19 GMT
I confirm authentication issues (WPA "1" TKIP) after the last updates (kernel 2.6.23.8 was one of these updates, so it can be related).
Comment by Robert Fortune (RobF) - Saturday, 24 November 2007, 18:58 GMT
I confirm this problem, too, for the Broadcom 4311 chipset, using ndiswrapper v.1.49-3 and the Windows driver bcmwl5 v.4.10.40.0 (11/02/05) and WPA v.1 encryption (TKIP) with wpa_supplicant v.0.5.8-2. For me, the problem occurs with vanilla kernels 2.6.23.1 and 2.6.23.8 (the only 2.6.23 kernels that I tried); it didn't with 2.6.22.9. Note that this is different from the bug reporter's experience for whom kernel 2.6.23.1 worked (for WPA2). Altogether, 5 individuals have reported this problem after recent kernel upgrades, see http://bbs.archlinux.org/viewtopic.php?id=40145.

A sample output from my system follows:

iwconfig wlan0 essid networkname channel 11 key ....WPA key in hex here.....
wpa_supplicant -i wlan0 -c /etc/wpa_supplicant.conf
Trying to associate with xx:xx:xx:xx:xx:xx (SSID='MyAP' freq=2462 MHz)
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Authentication with xx:xx:xx:xx:xx:xx timed out.
CTRL-EVENT-DISCONNECTED - Disconnect event - remove keys
Trying to associate with xx:xx:xx:xx:xx:xx (SSID='MyAp' freq=2462 MHz)
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Authentication with xx:xx:xx:xx:xx:xx timed out.
CTRL-EVENT-DISCONNECTED - Disconnect event - remove keys
Trying to associate with xx:xx:xx:xx:xx:xx (SSID='MyAp' freq=2462 MHz)
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
Associated with xx:xx:xx:xx:xx:xx
WPA: Key negotiation completed with xx:xx:xx:xx:xx:xx [PTK=TKIP GTK=TKIP]
CTRL-EVENT-CONNECTED - Connection to xx:xx:xx:xx:xx:xx completed (auth) [id=0 id_str=]

I generally experience dozens of failed authentication attempts before it finally succeeds.
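
Output like the sample above can be summarized mechanically. The following is a rough sketch in Python, not something used in this thread: it counts "Associated with" lines and authentication timeouts up to the first CTRL-EVENT-CONNECTED line. The matched strings are taken from the sample output above; other wpa_supplicant versions may word these messages differently, so treat the patterns as assumptions.

```python
def summarize_wpa_log(lines):
    """Summarize wpa_supplicant console output up to the first connection.

    Counts association messages and authentication timeouts, and records
    whether a CTRL-EVENT-CONNECTED line was seen. The message prefixes
    are copied from the sample output in this report.
    """
    summary = {"associated": 0, "timeouts": 0, "connected": False}
    for raw in lines:
        line = raw.strip()
        if line.startswith("Associated with"):
            summary["associated"] += 1
        elif line.startswith("Authentication with") and "timed out" in line:
            summary["timeouts"] += 1
        elif line.startswith("CTRL-EVENT-CONNECTED"):
            summary["connected"] = True
            break  # stop at the first successful connection
    return summary
```

Feeding it the lines of a saved wpa_supplicant session (for example, `summarize_wpa_log(open(path))` with a path of your choosing) yields the number of timeouts before authentication succeeded.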
Comment by Robert Fortune (RobF) - Saturday, 24 November 2007, 19:05 GMT
In the forum thread referred to above, delphiki wrote today:

On my other laptop, a Lenovo 3000 N100, I updated everything again today and have not gotten any problems using wireless with the iwlwifi drivers (not the ipw3945 drivers). I use wpa_supplicant with WPA2-EAP, so you might want to try out your wireless again sometime soon, and possibly using the iwlwifi drivers because the ipw3945 drivers take 20+ iterations of failed authentications before it obtains authentication successfully.
Comment by Tobias Powalowski (tpowa) - Sunday, 25 November 2007, 16:01 GMT
http://www.archlinux.org/~tpowa/2.6.23/
does this kernel solve your issue?
Comment by Tobias Powalowski (tpowa) - Sunday, 25 November 2007, 18:21 GMT
please also use modules provided in this directory, else your system might freeze
Comment by Michal (broch) - Sunday, 25 November 2007, 19:59 GMT
Hi Tobias,
just installed kernel and ipw3945 from your ftp dir
it works
Thank you very much

It would be good to know whether this also works with other wireless NICs/WPA2 authentication.



Can you please tell me what went wrong? As it seems that this is a vanilla kernel issue, I would like to know how I can fix this in the future when using non-Arch kernels.

Thank you again.
Comment by Tobias Powalowski (tpowa) - Sunday, 25 November 2007, 20:55 GMT
this kernel includes Ingo Molnar's backport of the cfs scheduler to .23
and the pre-release of Greg Kroah-Hartman's 23.9 kernel
Comment by Robert Fortune (RobF) - Tuesday, 27 November 2007, 03:22 GMT
http://www.archlinux.org/~tpowa/2.6.23/ had the test kernel plus modules in it yesterday but it's empty as of today. I haven't had a chance to download either.
Comment by Tobias Powalowski (tpowa) - Tuesday, 27 November 2007, 06:13 GMT
it's in testing now
Comment by Michal (broch) - Tuesday, 27 November 2007, 14:59 GMT
so that was scheduler...
It would be better if cfs would be optional along with SD. At least for some time.
I had other issues with cfs too (recently: touchpad slow like molasses when playing AmaroK/CD (but not from disk), though mouse worked). Well seems to be too late now.
Comment by André Prata (nDray) - Tuesday, 27 November 2007, 22:34 GMT
i'm experiencing the same problem with iwlwifi (3945) from current. Am i expected to see this solved with the update as well?
Comment by Manuel C. (ekerazha) - Wednesday, 28 November 2007, 10:59 GMT
kernel26 2.6.23.9-1 from [testing] doesn't fix my issues (is this really a kernel problem?).
Comment by Michal (broch) - Wednesday, 28 November 2007, 15:02 GMT
@nDray
unless you try 2.6.23.9 you will never know.

@ekerazha
In my case this is a kernel problem. I explained why I think so. I also used vanilla kernels. To make it clear:
one vanilla kernel version e.g. 2.6.23 with different versions of cfs (default or devel).
Arch kernel 2.6.23.1, Arch kernel 2.6.23.8 Arch kernel 2.6.23.9 (from testing with backported cfs)
vanilla kernel without cfs e.g 2.6.22.6 and for comparison vanilla kernel 2.6.22.6 with cfs (different versions).
vanilla kernel 2.6.24-rc1 or rc3 without

changing cfs only changes WPA2 authentication performance.

As I suggested, build your own custom kernel, patch with different versions of cfs. See if this fixes your issue.
I am not implying that problems with WPA2 authentication are related only to cfs. However, I provided specific information, so this was easy to fix, while you have nothing except some doubts. Because there are a lot of possibilities, either you get lucky and "something" eventually gets fixed for you shortly, or you may wait a relatively long time, as nobody knows what is wrong with your setup.

If you suspect that the NIC driver, wpa_supplicant or something else is at fault, then isolate the issue and a fix will appear soon; otherwise, wait for an unspecified period of time.

tpowa provided also some wireless NIC drivers.
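
The isolation approach described in this comment (same hardware, different kernel and cfs combinations) is easier to judge with numbers. Below is a small sketch in Python, offered only as an illustration: it ranks setups by the median number of failed authentication attempts per trial. The setup labels and the idea of recording one count per trial are assumptions; nothing like this was run in the thread.

```python
from statistics import median


def rank_setups(results):
    """Rank kernel/scheduler setups from fewest to most failed attempts.

    results maps a free-form setup label (hypothetical, e.g. a kernel
    version plus the cfs patch applied) to a list of failed-attempt
    counts, one per trial. Setups with no trials are skipped.
    Returns a list of (label, median_failures) tuples.
    """
    ranked = [
        (label, median(counts))
        for label, counts in results.items()
        if counts
    ]
    return sorted(ranked, key=lambda item: item[1])
```

A setup whose median stays near zero over several boots is a reasonable candidate for "this cfs version does not interfere", while a single lucky boot is not.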
Comment by André Prata (nDray) - Wednesday, 28 November 2007, 15:11 GMT
the update kind of solved it... It still fails sometimes, but usually takes fewer iterations... That is one part of the problem, for sure... I'm testing with a somewhat unstable connection. Many people are having trouble, since there were recently changes in the network providers...

Personal thanks to tpowa and all the other developers contributing.
Comment by Manuel C. (ekerazha) - Wednesday, 28 November 2007, 15:27 GMT
@broch
My production PC wouldn't be a "testing PC". I have *many* things to do every day and you should know I'm not the kernel26 maintainer or the wpa_supplicant maintainer. I just reported that it always worked fine until some well known updates (just look at the chronological list) and that the new kernel package doesn't fix the issue on my systems. Maybe one day I'll have the time to do some tests but there is one *sure* thing here: broch, I know what to do and I really don't need your help and/or suggestions. Please don't speak to me because I don't like you and I think you are not a qualified person. Thank you and bye.
Comment by Michal (broch) - Wednesday, 28 November 2007, 17:39 GMT
If you feel that your problem is different, why not open a new bug report? That is definitely more efficient than trying to prove some doubtful point.

Currently this issue, as reported in the first post, is resolved.

No idea what "production" really means for you. Bleeding edge is quite the opposite of stability. By definition Arch is a "testing" distro.
Comment by Manuel C. (ekerazha) - Wednesday, 28 November 2007, 17:56 GMT
I'm not trying to prove anything, I'm just reporting my experience, I think you have a persecution complex... and *using* a "distro" is quite different from maintaining its packages: an operating system is the base to install applications and do other things, never forget this. End of the story (I hope).
Comment by Robert Fortune (RobF) - Wednesday, 28 November 2007, 22:23 GMT
I upgraded kernel26 (2.6.23.8-1 -> 2.6.23.9-1) and ndiswrapper (1.49-3 -> 1.49-4) from testing. That doesn't seem to have made any major difference. The first time I rebooted after the upgrade, wlan0 actually started up fine. This was the first time that had happened in dozens of reboots over the past three weeks. But repeated cycling of the network daemon (stop->start) as well as trying to bring up wlan0 manually as well as rebooting shows that starting up the wireless network still is very flaky. My system still goes through dozens of WPA-PSK (TKIP) authentication attempts (failing at key negotiation?) and timeouts before authentication finally succeeds.
Comment by Robert Fortune (RobF) - Friday, 30 November 2007, 01:24 GMT
I installed wicd 1.3.1-10 [extra] and added wicd to the startup daemons. I also commented out NET_PROFILES=(....) in rc.conf, relying entirely on wicd to set up the wlan interface (BTW, in the Wicd Manager Preferences, I picked wext as the WPA Supplicant driver - the same setting I had used in network profiles). With these changes, wlan0 now is brought up automatically again at bootup and works just fine.
Comment by Alessio Bolognino (mOLOk) - Tuesday, 04 December 2007, 17:58 GMT
I think this is the same bug, so I report it here: with the 2.6.23.8 kernel, wpa_supplicant goes crazy and eats 100% of the cpu. This happens with madwifi and ndiswrapper drivers.
Comment by Alessio Bolognino (mOLOk) - Tuesday, 04 December 2007, 18:03 GMT
s/ndiswrapper/rt73/
Comment by Alessio Bolognino (mOLOk) - Tuesday, 04 December 2007, 18:28 GMT
with the kernel in testing (kernel26-2.6.23.9) everything works again.
