FS#20796 - [kernel26] Losing wifi connection followed by severe crash on Intel WiFi Link 6000

Attached to Project: Arch Linux
Opened by Alphazo (alphazo) - Thursday, 09 September 2010, 21:51 GMT
Last edited by Tobias Powalowski (tpowa) - Wednesday, 15 February 2012, 08:03 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 6
Private No

Details

After a few hours connected to a WNDR3700 AP running DD-WRT, I no longer had internet access. I then decided to disconnect from the AP using nm-applet and reconnect to it. I was prompted for the password, which was already filled in, but after pressing OK I was unable to reconnect. The kernel.log shows a lot of
"kernel: iwlagn 0000:02:00.0: Received BA when not expected"

and toward the end of the file there is a pretty severe crash with Microcode SW error followed by a system trace.

This happened to me twice today.

Closed by  Tobias Powalowski (tpowa)
Wednesday, 15 February 2012, 08:03 GMT
Reason for closing:  Upstream
Comment by Alphazo (alphazo) - Friday, 10 September 2010, 06:33 GMT
Forgot to mention that the problem is not limited to the WNDR3700 AP. I have had the problem with other brands/models of wifi access points. Furthermore, it takes much more time to crash when connected to a 54G AP compared to a 54N AP. Finally, if using OpenWRT on the WNDR3700 rather than DD-WRT, the crash shows up within minutes rather than a few hours.
Comment by Thomas Bächler (brain0) - Friday, 10 September 2010, 09:18 GMT
These kinds of bugs are common to iwl3945 and iwlagn, and have occurred with different kinds of hardware over time. You could check whether the bug is gone when you use compat-wireless (http://wireless.kernel.org/en/users/Download). If not, this bug must be reported to Intel, as they need to fix either the driver or their firmware.
Comment by Alphazo (alphazo) - Friday, 10 September 2010, 19:39 GMT
I switched to compat-wireless this morning and rebooted. I got a lot of "Microcode SW error detected" during the day but no disconnect or crash. Back home, when I connect to the WNDR3700, I don't see any "Microcode SW error" (yet) but I do get a lot of "iwlagn 0000:02:00.0: Received BA when not expected".
I'm attaching my kernel log regarding the "Microcode SW error" that happened during the day when connected to a 54G AP. I will post another log regarding the home connection if it fails.

Are those errors critical?
Comment by Thomas Bächler (brain0) - Friday, 10 September 2010, 20:14 GMT
The SW errors are critical, but not as bad as you losing the connection - and there are no kernel traces now, so the driver seems to deal with the SW errors properly. I don't know what the 'Received BA' messages mean.

You should open a bug at bugzilla.kernel.org and include the information about both issues (with and without compat-wireless), so that the issue gets completely fixed.
Comment by Alphazo (alphazo) - Friday, 10 September 2010, 20:18 GMT
Dead again with WNDR3700 and very quickly. Just downloaded an Ubuntu CD then connection was lost (but shows as associated) and I had to reboot (could not associate with another AP).
Comment by Alphazo (alphazo) - Friday, 10 September 2010, 21:01 GMT
Posted a bug on kernel.org bugtracker under https://bugzilla.kernel.org/show_bug.cgi?id=18222
Comment by Cody Carey (codycarey) - Monday, 13 September 2010, 02:09 GMT
Alphazo, more information on this can be found on the following bug report.

http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2214

Apparently it's a problem with the firmware for some devices, and a fix has been in the works for quite a while now it seems.
Comment by sht0rm (sht0rm) - Sunday, 19 September 2010, 16:53 GMT
Same for me with iwlwifi

lspci
0c:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection (rev 61)

Wifi connection with router Asus RT-N16

dmesg:
iwlagn 0000:0c:00.0: Received BA when not expected
iwlagn 0000:0c:00.0: Received BA when not expected
Comment by Andrej Podzimek (andrej) - Saturday, 02 October 2010, 19:08 GMT
Exactly the same problem here:
03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)

Temporary workaround: ping6 -q -i0 <your-lan-router-ip>
(Or just 'ping' if you are on a legacy IPv4 LAN.)

This restores a network connection in a second or so. (But in some cases it gets so bad that constant ping flooding is necessary.)
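A minimal sketch of the workaround above as a shell helper. The function name `keepalive_cmd` is hypothetical; it only prints the command from this comment, choosing `ping6` when the address contains a colon and `ping` otherwise. A hostname that resolves only to IPv6 would need `ping6` chosen by hand.

```shell
# Hypothetical helper: print the keep-alive command for a given router address.
# It does not send any traffic itself; it only builds the command line.
keepalive_cmd() {
    addr=$1
    case $addr in
        *:*) printf 'ping6 -q -i0 %s\n' "$addr" ;;  # IPv6 literal
        *)   printf 'ping -q -i0 %s\n' "$addr" ;;   # IPv4 address or hostname
    esac
}
```

The printed command can then be run by hand. An interval of 0 (flood-like pinging) usually requires root privileges and puts noticeable load on the router, so this is a stopgap rather than a fix.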
Comment by Andrej Podzimek (andrej) - Saturday, 04 December 2010, 17:44 GMT
The problem is getting worse. Wireless networking on Linux has recently become so unreliable that it is virtually useless. (Everything works perfectly on Illumos, but Illumos doesn't support the EAP authentication I need.)

A couple of minutes after associating and authenticating, I can see hundreds of these lines in dmesg:
iwlagn 0000:03:00.0: Received BA when not expected

Once in about half an hour, something like this occurs:
iwlagn 0000:03:00.0: low ack count detected, restart firmware
iwlagn 0000:03:00.0: On demand firmware reload
iwlagn 0000:03:00.0: Stopping AGG while state not ON or starting
iwlagn 0000:03:00.0: queue number out of range: 0, must be 10 to 19
iwlagn 0000:03:00.0: iwlagn_tx_agg_start on ra = **:**:**:**:**:** tid = 0

Has this problem been reported upstream? Where should I report it? The state of the current Linux WiFi stack has been disastrous since the advent of the new WiFi infrastructure. Resolving this is simply a must.
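When filing an upstream report, it may help to state how often each message occurs. The following is a rough sketch, not an official tool: the function name `summarize_iwlagn` and the output format are made up for illustration, and the patterns are simply the message fragments quoted in this thread.

```shell
# Hypothetical helper: tally iwlagn messages of interest from kernel log text on stdin.
summarize_iwlagn() {
    awk '
        /iwlagn/ && /Received BA when not expected/ { ba++ }
        /iwlagn/ && /restart firmware/              { restart++ }
        /iwlagn/ && /Microcode SW error/            { sw++ }
        END {
            # Counters that never matched print as 0.
            printf "unexpected_ba=%d firmware_restart=%d microcode_sw_error=%d\n", ba, restart, sw
        }
    '
}
```

It can be fed from `dmesg | summarize_iwlagn` or from a saved kernel log file redirected to stdin.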
Comment by Jelle van der Waa (jelly) - Thursday, 14 April 2011, 21:42 GMT
You should report it at the Linux kernel bug tracker. Btw, is this still around with .38?
Comment by voltaic (voltaic) - Saturday, 30 April 2011, 14:23 GMT
Yes, the bug is still around with 2.6.38.4. I'm using an Intel 5300 AGN. On my network the bug only seems to affect wireless N connections.
Comment by Jelle van der Waa (jelly) - Thursday, 16 June 2011, 10:25 GMT
And how is it with .39?
Comment by Leo von Klenze (lepokle) - Thursday, 16 June 2011, 21:41 GMT
For me it works with 2.6.38.8-1 but not with 2.6.39.1-1


description: Wireless interface
product: Ultimate N WiFi Link 5300
vendor: Intel Corporation
physical id: 0
bus info: pci@0000:0c:00.0
logical name: wlan0
version: 00
serial: 00:21:6a:53:3c:d0
width: 64 bits
clock: 33MHz
capabilities: bus_master cap_list ethernet physical wireless
configuration: broadcast=yes driver=iwlagn driverversion=2.6.38-ARCH firmware=8.83.5.1 build 33692 ip=172.17.2.141 latency=0 multicast=yes wireless=IEEE 802.11abgn
resources: irq:47 memory:f69fe000-f69fffff
Comment by Torstein S. Skulbru (serrghi) - Sunday, 19 June 2011, 14:01 GMT
After I upgraded to .39, both my computers using an Intel 5100 now crash my router in seconds.
Comment by Leo von Klenze (lepokle) - Wednesday, 29 June 2011, 07:04 GMT
Problem still exists with kernel 2.6.39.2
Comment by Leo von Klenze (lepokle) - Monday, 08 August 2011, 17:20 GMT
Is there any progress?
Kernel and router are still crashing with 3.0! The last working version for me was 2.6.38.8-1.

This is really annoying. Can I do something? Is there already a kernel bug?

Thank you!
Comment by Jelle van der Waa (jelly) - Tuesday, 16 August 2011, 09:10 GMT
File a bug report at the kernel bugzilla.
Comment by Nicklas Overgaard (nicklas) - Wednesday, 17 August 2011, 12:06 GMT
According to this blog post: http://adeadhamster.blogspot.com/2010/10/linux-iwlagn-problems.html

The issue can be mitigated by adding either
options iwlagn 11n_disable=1

or
options iwlagn 11n_disable50=1

to /etc/modprobe.d/modprobe.conf. Please be aware that adding an incorrect value will result in the module not loading properly... It has solved my issues temporarily; however, it also removes N support.
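Since a wrong option name can keep the module from loading, a small guard may be useful. This is only a sketch: `write_iwlagn_option` is a hypothetical helper that accepts just the two parameter names quoted above and appends the line only if it is not already present.

```shell
# Hypothetical helper: add one iwlagn option line to a modprobe config file.
# Only the two parameter names quoted in this thread are accepted.
write_iwlagn_option() {
    conf=$1
    param=$2
    case $param in
        11n_disable|11n_disable50) ;;
        *) echo "unknown parameter: $param" >&2; return 1 ;;
    esac
    line="options iwlagn $param=1"
    # Append only when the exact line is missing, so repeated runs are harmless.
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}
```

For example, `write_iwlagn_option /etc/modprobe.d/modprobe.conf 11n_disable` run as root, followed by reloading the module with `modprobe -r iwlagn` and `modprobe iwlagn`. Running `modinfo -p iwlagn` beforehand lists the parameters the installed module actually accepts.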
Comment by voltaic (voltaic) - Thursday, 18 August 2011, 07:02 GMT
As Cody mentioned earlier, this is a firmware problem with some Intel wireless cards. Please report this bug directly to Intel instead of going to the kernel bugzilla:

http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2214

They have been working on this for 14 months now... Please make them aware that there are more people affected by this issue.
