FS#10984 - [iwlwifi-4965-ucode] 228.57.2.21-1-i686 upgrade causes hard freeze

Attached to Project: Arch Linux
Opened by rob miller (vasdee) - Tuesday, 22 July 2008, 18:18 GMT
Last edited by Allan McRae (Allan) - Saturday, 06 February 2010, 02:14 GMT
Task Type: Bug Report
Category: Upstream Bugs
Status: Closed
Assigned To: Thomas Bächler (brain0)
Architecture: i686
Severity: Medium
Priority: Normal
Reported Version: None
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 6
Private: No

Details

Description:

The latest iwlwifi-4965-ucode-228.57.2.21-1-i686 upgrade causes the system to hard freeze. A rollback to the previous version fixes the problem. I'm using the latest kernel from CORE.

/var/log/errors.log messages appear the same between both versions.

Additional info:

Others who have the problem: http://bbs.archlinux.org/post.php?tid=52078

Snippet from errors.log

Jul 21 19:12:10 hackbook usb 1-1: device not accepting address 2, error -71
Jul 21 19:12:10 hackbook hub 1-0:1.0: unable to enumerate USB device on port 1
Jul 21 19:12:10 hackbook hub 5-0:1.0: unable to enumerate USB device on port 2
Jul 21 19:12:10 hackbook hub 6-0:1.0: unable to enumerate USB device on port 6
Jul 21 19:12:16 hackbook hcid[4529]: Parsing /etc/bluetooth/main.conf failed: No such file or directory
Jul 21 19:12:17 hackbook hcid[4529]: Parsing /etc/bluetooth/input.conf failed: No such file or directory
Jul 21 19:12:17 hackbook pan0: Dropping NETIF_F_UFO since no NETIF_F_HW_CSUM feature.
Jul 21 19:12:43 hackbook dhclient: wmaster0: unknown hardware address type 801
Jul 21 19:12:43 hackbook dhclient: wmaster0: unknown hardware address type 801
** CRASH ( I THINK ) **
Jul 21 19:17:19 hackbook hub 1-0:1.0: unable to enumerate USB device on port 1
Jul 21 19:17:19 hackbook hub 3-0:1.0: unable to enumerate USB device on port 2
Jul 21 19:17:19 hackbook hub 6-0:1.0: unable to enumerate USB device on port 6
Jul 21 19:17:23 hackbook hcid[4538]: Parsing /etc/bluetooth/main.conf failed: No such file or directory
Jul 21 19:17:23 hackbook hcid[4538]: Parsing /etc/bluetooth/input.conf failed: No such file or directory
Jul 21 19:17:23 hackbook pan0: Dropping NETIF_F_UFO since no NETIF_F_HW_CSUM feature.


Steps to reproduce:

1. upgrade to latest package
2. wait for freeze

Closed by: Allan McRae (Allan)
Saturday, 06 February 2010, 02:14 GMT
Reason for closing: Upstream
Additional comments about closing: Not much we can do with binary blobs. Issue should be tracked upstream.
Comment by Matt Runion (mrunion) - Tuesday, 22 July 2008, 19:18 GMT
I'm one of the "others" reporting in the post mentioned above. I'm popping in here for monitoring purposes. Let me know if there's any other information I may be able to provide.
Comment by Greg (dolby) - Tuesday, 22 July 2008, 20:02 GMT
Comment by Thomas Bächler (brain0) - Tuesday, 22 July 2008, 20:39 GMT
The new ucode actually contains 2 versions:

228.57.1.21 with the old ABI version and 228.57.2.21 with the new ABI. The driver in 2.6.25 (probably 2.6.26 as well) uses the old ABI. I cannot find anything about a bug on Intel's wireless mailing list.
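Going by the two versions named above, the third dot-separated field looks like the API/ABI number (1 for the old one, 2 for the new one). A minimal Python sketch of that reading follows; the helper name `ucode_api_version` is made up for illustration, and the field layout is an assumption drawn only from these two version strings, not from Intel documentation.

```python
def ucode_api_version(version: str) -> int:
    """Return the third dot-separated field of a ucode version string.

    Assumption: in versions such as 228.57.2.21 the third field is the
    API/ABI number, as the two versions in this report suggest.
    """
    fields = version.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields: {version!r}")
    return int(fields[2])


if __name__ == "__main__":
    for v in ("228.57.1.21", "228.57.2.21"):
        print(v, "-> API", ucode_api_version(v))
```

This only reads the version string; it says nothing about which API a given kernel driver actually requests.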
Comment by Thomas Bächler (brain0) - Wednesday, 23 July 2008, 20:36 GMT
Are you guys all using i686? I just spoke to Damir and he says he's been using the new ucode fine for days now, but on x86_64. I checked the package and the ucode is identical for i686 and x86_64, as it should be.

Did any of you try with 2.6.26 from testing already?
Comment by Matt Runion (mrunion) - Wednesday, 23 July 2008, 20:49 GMT
I'm using i686 and have NOT tried 2.6.26 from [testing]. It's indeed odd that I can install the new version and just sit and watch my machine after logging in and it will just "freeze" up -- CAPS lock light flashing and everything.

Is there any hardware information that I need to give you guys that may help?
Comment by Thomas Bächler (brain0) - Wednesday, 23 July 2008, 22:52 GMT
The flashing means that this is a kernel panic. So, to get some info, you have to run stuff from a terminal instead of X and wait for it to freeze. It should print the panic and you can write it all down. It won't help me much, but possibly the guys at Intel.
Comment by rob miller (vasdee) - Wednesday, 23 July 2008, 23:11 GMT
I'm running i686 and NOT the 2.6.26 kernel. Exact same thing happens for me, flashing keys and all that.
Comment by Dan McGee (toofishes) - Thursday, 24 July 2008, 02:38 GMT
Someone testing with 2.6.26 would be helpful here.
Comment by josep (jpatufet) - Friday, 25 July 2008, 11:49 GMT
It works! With testing/kernel26 2.6.26-2 and the new iwl.

I must say that I had the same problem before, when I upgraded the kernel 24->25, so maybe they are not related.
Comment by Nicolas Bigaouette (big_gie) - Friday, 25 July 2008, 15:43 GMT
I have the same problem. I was reporting it on the forum but got a panic halfway through the post!!! ARRGGG

I'm running x86_64. I tried the default 2.6.25 and a 2.6.26 that I compiled myself, without any improvement. I get the panic with both.

I downgraded ucode to 4.44.1.20 (revision 356 from svn). I'll reboot and check it out....
Comment by Greg (dolby) - Saturday, 26 July 2008, 05:33 GMT
Did you try the 2.6.26 kernel from testing though?
Comment by Nicolas Bigaouette (big_gie) - Saturday, 26 July 2008, 14:53 GMT
No I did not. When I compiled 2.6.26 it was not available in testing. I could try it, but then after downgrading and using the computer for 24 hours, I did not have any panic.
Comment by Matt Runion (mrunion) - Saturday, 26 July 2008, 15:07 GMT
If I get time this weekend I will try the 2.6.26 from testing. I'm "wary" of that, though, because this is my work machine as well. I'm thinking that worst case I can just downgrade the kernel again. Oh well, I'll report back what I find if I get the chance to install the new kernel.
Comment by Matt Runion (mrunion) - Saturday, 26 July 2008, 17:19 GMT
In order to install the kernel from [testing] I am going to have to remove some other stuff from my machine. I'm honestly not willing to do this at the present time because I do have to finish some work things this weekend. One person has said it worked with 2.6.26 from [testing], so hopefully we just need the new kernel to get it working. I am going to either have to wait on the new kernel to come out of testing or wait until later in the week when I have more time to mess around with packages. My apologies, but I can't put the required effort into this for the next few days.
Comment by Greg (dolby) - Saturday, 26 July 2008, 18:17 GMT
No problem. Just remember to comment here when you upgrade.
Comment by Matt Runion (mrunion) - Monday, 11 August 2008, 17:06 GMT
I have upgraded to the 2.6.26 kernel and iwl driver version 228.57.2.21 and all seems to be working just fine. On 2.6.25 I could not run more than 5 minutes without a crash, and I've been up almost 3 hours this time. I think this is resolved.
Comment by Matt Runion (mrunion) - Monday, 11 August 2008, 17:14 GMT
Ok, no sooner had I typed the message than the lockup returned. CAPS lock flashed and this time the WiFi light (which never worked with the older driver, BTW) was flashing instead of being blue.

I have rebooted and downgraded the driver as before -- but I've not rebooted after downgrading it. I want to see if it goes longer without locking up -- maybe it was a fluke? Anyway, it's looking as if this one might not be solved after all. I'll post back with results.
Comment by Matt Runion (mrunion) - Monday, 11 August 2008, 17:27 GMT
If it's any help, I have gotten some of these in "everything.log" after I upgraded to the new kernel and iwl driver:

Aug 11 13:06:24 highvoltage wlan0 (WE) : Wireless Event too big (320)

I don't know if that means anything. Any other information I can provide to you guys?

Is any of this from "kernel.log" pertinent?

Boot BEFORE upgrade from today:

Aug 11 08:24:54 highvoltage wlan0: Initial auth_alg=0
Aug 11 08:24:54 highvoltage wlan0: authenticate with AP 00:19:e3:fc:63:22
Aug 11 08:24:54 highvoltage wlan0: RX authentication from 00:19:e3:fc:63:22 (alg=0 transaction=2 status=0)
Aug 11 08:24:54 highvoltage wlan0: authenticated
Aug 11 08:24:54 highvoltage wlan0: associate with AP 00:19:e3:fc:63:22
Aug 11 08:24:54 highvoltage wlan0: RX AssocResp from 00:19:e3:fc:63:22 (capab=0x431 status=0 aid=3)
Aug 11 08:24:54 highvoltage wlan0: associated
Aug 11 08:24:54 highvoltage wlan0: switched to short barker preamble (BSSID=00:19:e3:fc:63:22)
Aug 11 08:24:54 highvoltage wlan0 (WE) : Wireless Event too big (320)
Aug 11 08:24:54 highvoltage wlan0: WMM queue=2 aci=0 acm=0 aifs=3 cWmin=15 cWmax=1023 burst=0
Aug 11 08:24:54 highvoltage wlan0: WMM queue=3 aci=1 acm=0 aifs=7 cWmin=15 cWmax=1023 burst=0
Aug 11 08:24:54 highvoltage wlan0: WMM queue=1 aci=2 acm=0 aifs=2 cWmin=7 cWmax=15 burst=30
Aug 11 08:24:54 highvoltage wlan0: WMM queue=0 aci=3 acm=0 aifs=2 cWmin=3 cWmax=7 burst=15
Aug 11 08:24:54 highvoltage padlock: VIA PadLock not detected.
Aug 11 08:24:57 highvoltage r8169: eth0: link down

Boot AFTER upgrade today:

Aug 11 13:06:24 highvoltage wlan0: Initial auth_alg=0
Aug 11 13:06:24 highvoltage wlan0: authenticate with AP 00:19:e3:fc:63:22
Aug 11 13:06:24 highvoltage wlan0: RX authentication from 00:19:e3:fc:63:22 (alg=0 transaction=2 status=0)
Aug 11 13:06:24 highvoltage wlan0: authenticated
Aug 11 13:06:24 highvoltage wlan0: associate with AP 00:19:e3:fc:63:22
Aug 11 13:06:24 highvoltage wlan0: RX AssocResp from 00:19:e3:fc:63:22 (capab=0x431 status=0 aid=2)
Aug 11 13:06:24 highvoltage wlan0: associated
Aug 11 13:06:24 highvoltage wlan0: switched to short barker preamble (BSSID=00:19:e3:fc:63:22)
Aug 11 13:06:24 highvoltage wlan0 (WE) : Wireless Event too big (320)
Aug 11 13:06:25 highvoltage padlock: VIA PadLock not detected.
Aug 11 13:06:27 highvoltage r8169: eth0: link down
Comment by Matt Runion (mrunion) - Tuesday, 12 August 2008, 12:53 GMT
Here's something interesting...

At home last night I re-upgraded to the new driver and left the machine running, browsed the net, etc. No lock-ups. I also found an upgraded BIOS for my laptop. I had F.53 and there was an F.58, AND in F.58 support was added for 5007ABN wireless stuff. "Good," I thought. I got the BIOS update and things still ran fine with the new driver. I ran the laptop for 3-4 hours last night and it never locked up. I even got to thinking that at work we use WPA, so I set my wireless up at home with WPA -- still fine! No crashes, etc.

Now, I get to work this morning and within 15 minutes I get a lockup. Could it be the environment? Here at work we're sitting behind a wireless router that is an Apple AIRPORT. At home I use a LinkSys. Could there be something there? Also, at work the network seems to go up and down a couple of times a day. It's usually down less than a minute, but could that be causing an issue? I'll gladly try to help out debugging this as it's a very annoying problem. The 4.44.xxx version of iwlwifi DOESN'T crash, but the new one does. Can I try something that helps out?
Comment by Matt Runion (mrunion) - Tuesday, 12 August 2008, 18:00 GMT
I hope I'm not beating a dead horse, but if any of you are still experiencing this, can you chime in here:

http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1726
Comment by Matt Runion (mrunion) - Wednesday, 13 August 2008, 14:27 GMT
Another user with the same issue: http://bbs.archlinux.org/viewtopic.php?id=53331

(I added this to show it's affecting more than just me.)
Comment by Matt Runion (mrunion) - Thursday, 14 August 2008, 12:51 GMT
I have continued messing around with this and it SEEMS (though not completely confirmed) that it only occurs when using WPA on the wireless router. If I make the router "open", it seems to not crash the system. Does that help?
Comment by CJ Fleck (cujo) - Monday, 18 August 2008, 13:44 GMT
I also have this problem. I basically have all the same issues to report, including the fact that WPA on the router breaks it, but no encryption doesn't seem to cause the lockup. I can't verify WEP.
Comment by Andrew Tarzwell (atarzwell) - Sunday, 31 August 2008, 10:11 GMT
Thought I should pipe up here as well. I am going to try to downgrade the ucode.

I have noticed that this bug is related to other computers on the wireless network.

If I shut off the wife's laptop (Vista, Wireless N) I can work for a long time with no crash, but when I turn it on and start a large file transfer it locks up almost instantly.

Network is Wireless N DLink DIR-615, WPA.
Comment by Andrew Tarzwell (atarzwell) - Sunday, 31 August 2008, 10:27 GMT
Alright! Updated, and transferred 2 gigs across the network with no problems, while at the same time the wife's laptop grabbed a huge file off the network!
Comment by Noah (print) - Monday, 29 September 2008, 14:29 GMT
Seems like this is still broken. Anyone working on a fix? Thread here:
http://bbs.archlinux.org/viewtopic.php?pid=426553#p426553
Comment by sphetr2 (sphetr2) - Sunday, 19 October 2008, 00:18 GMT
I can confirm that this is still an issue...if it's relevant, I'm running 64-bit.
Comment by Thomas Bächler (brain0) - Sunday, 19 October 2008, 00:28 GMT
2.6.27 won't work with any other ucode!
Comment by ponto (ponto) - Sunday, 19 October 2008, 16:47 GMT
The rollback solution stopped working on kernel 2.6.27 because an error occurs when loading the ucode. I tried the newer ucode again, and the kernel panics continue, but with this kernel they take longer to happen.

So what can I do? I don't like kernel panics at all :p
Comment by Nicolas Bigaouette (big_gie) - Sunday, 19 October 2008, 18:32 GMT
I upgraded the kernel last week and my connection stopped working: I needed to upgrade the ucode too...

So I upgraded the ucode too, without problem to my surprise. Until today...

I now cannot use my connection anymore.

What has changed is the wireless connection. Both connections are WPA. I don't have access to the working one (from a D-Link DI-624 http://support.dlink.com/products/view.asp?productid=DI-624_revC ) but here is the info from the non-working one:
DLink DIR-615 (http://www.dlink.com/products/?pid=565) firmware 2.23
WPA2 Only (Personal)
Channel 6
Mixed n, g and b
TKIP and AES cipher (needed AES for a PS3 to connect)

Nothing appears in the log. I could try to take a picture of the panic in a virtual terminal...

This is REALLY annoying. My computer is NOT usable with that problem.
Comment by Thomas Bächler (brain0) - Sunday, 19 October 2008, 23:23 GMT
The iwlagn driver in 2.6.27 requires the API version 2 ucode, thus older versions won't work anymore. You all have had this problem for a long time, has anyone reported this to Intel?
Comment by ponto (ponto) - Sunday, 19 October 2008, 23:43 GMT
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/272015 This bug report seems to describe the same bug.

Yes, Intel has this: http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1703 and that's the same problem, if I am not mistaken.
Comment by Thomas Bächler (brain0) - Thursday, 11 December 2008, 22:39 GMT
I put 228.57.2.23 to testing, is this better?
Comment by Nicolas Bigaouette (big_gie) - Thursday, 11 December 2008, 22:44 GMT
I'll test it this weekend with a 802.11n network.
I've been using the "debug firmware" from comment #52 of http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1703 for maybe a month. Before that I was still on 2.6.26 because of the freeze. I don't get the freezes anymore. It could be because of the firmware, or maybe because a kernel newer than 2.6.27.1 fixed the issue.
Comment by Matt Runion (mrunion) - Monday, 15 December 2008, 13:53 GMT
I have run the version listed (though it moved to extra, I think), and it worked most of Friday. I will let it continue to run today. So far it is working fine for me.
Comment by Thomas Bächler (brain0) - Monday, 15 December 2008, 14:12 GMT
I'll keep this open to maybe get more reports in, but it seems this issue is finally resolved.
Comment by Robson Roberto Souza Peixoto (robsonpeixoto) - Tuesday, 03 March 2009, 15:30 GMT
I am having this problem:
# tail -F /var/log/errors.log
Mar 1 17:33:26 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 1 17:33:27 robinho iwlagn: Can't stop Rx DMA.
Mar 1 17:33:27 robinho iwlagn: No space for Tx
Mar 1 17:33:27 robinho iwlagn: Error sending SENSITIVITY_CMD: enqueue_hcmd failed: -28
Mar 1 17:33:27 robinho iwlagn: SENSITIVITY_CMD failed

http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1918
Comment by Thomas Bächler (brain0) - Tuesday, 03 March 2009, 16:58 GMT
And again. Does this happen a lot? Any other users with this?

(I had a microcode error with iwl3945 after a lot of uploading, but only once. I fear this will come up more often as soon as iwl3945 is merged into iwlagn.)
Comment by Robson Roberto Souza Peixoto (robsonpeixoto) - Tuesday, 03 March 2009, 17:26 GMT
# grep "^Mar 3" /var/log/errors.log | grep "Microcode SW error detected"
Mar 3 00:01:32 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 07:33:36 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 08:17:18 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 08:35:10 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 09:01:24 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 09:45:03 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 10:20:05 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 12:05:38 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 12:21:13 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 12:42:34 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:03:49 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:17:47 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:29:27 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:38:16 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 14:01:24 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 14:12:22 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 14:19:09 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
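
To get a feel for how often the restarts occur, the grep output above can be bucketed by hour. This is a minimal Python sketch, not part of the original report; it assumes the "Mon D HH:MM:SS host ..." line layout shown here, and the function name `restarts_per_hour` is invented for illustration.

```python
from collections import Counter

MARKER = "Microcode SW error detected"


def restarts_per_hour(lines):
    """Count marker lines per (month, day, hour).

    Assumes each line starts with 'Mon D HH:MM:SS', as in the
    errors.log excerpts in this report.
    """
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a timestamped log line
        month, day, clock = fields[:3]
        counts[(month, day, clock[:2])] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the listing above.
    sample = [
        "Mar 3 13:03:49 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.",
        "Mar 3 13:17:47 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.",
        "Mar 3 13:29:27 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.",
        "Mar 3 13:38:16 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.",
        "Mar 3 14:01:24 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.",
    ]
    for key, n in sorted(restarts_per_hour(sample).items()):
        print(*key, n)
```

To run it on a real log, pass an open file object instead of the sample list, for example `restarts_per_hour(open("/var/log/errors.log"))`.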
Comment by Robson Roberto Souza Peixoto (robsonpeixoto) - Tuesday, 03 March 2009, 17:29 GMT