FS#27955 - [linux] Frequent Freezes while using wifi

Attached to Project: Arch Linux
Opened by Ben Mehne (ben0mega) - Monday, 16 January 2012, 03:29 GMT
Last edited by Tobias Powalowski (tpowa) - Saturday, 28 April 2012, 07:55 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:
Whenever I open up chromium and it fetches the pages I had open last, the monitor goes blank, the keyboard becomes unresponsive, and any music that is playing loops. This happens more often when not plugged in (laptop is a MacBookPro6,2), and could be related to power management issues. It has happened with all 3.1.X kernels. It does not happen when a single tab is loaded, only when multiple tabs are loaded. Within the past hour my laptop has frozen and had to be restarted 2 times (while looking for similar bugs). If the wifi is off, I have not seen any instability (I have not tried loading pages over wired internet). The laptop is all but unusable unless plugged in, and not always usable when plugged in (freezes are more frequent with 3.1.9 than 3.1.8 while plugged in). Nothing shows up in /var/log (no unusual messages, followed directly by the boot-up messages). I lose any files I have open during the freeze - could be ext4 "recovering" them. Using the in-kernel brcmsmac module (in staging). May be 64-bit only (I use a 32-bit install rarely and have not noticed problems). Attached is my everything.log, and I would like any other suggestions for debugging. See 21:38:40 for the one crash. Could be related to https://bugs.archlinux.org/task/26847 but I don't have the rcu_preempt_state.


Additional info:
* all packages up to date.


Steps to reproduce:
Boot up, open multiple pages in browser.

Closed by  Tobias Powalowski (tpowa)
Saturday, 28 April 2012, 07:55 GMT
Reason for closing:  Upstream
Comment by Tobias Powalowski (tpowa) - Thursday, 19 January 2012, 07:15 GMT
Please try 3.2.1
Comment by Ben Mehne (ben0mega) - Thursday, 19 January 2012, 19:55 GMT
Only one crash, and it was after long usage - I have yet to experience the problem with the same regularity. Once I am either convinced it does not exist or I can reproduce it, I will report.

Edit:
Then five minutes later, crash. Attached is my everything.log, with crashes at Jan 19 14:55 and 15:31 (by my watch). Both happened when I had opened chromium with 12 tabs reloading from the previous session and was opening another tab.
Comment by WTFCoder (wtfcoder) - Friday, 20 January 2012, 06:17 GMT
Seeing a similar thing in the last week or two; it seems to have started with 3.1.8 (or perhaps 3.1.7). Complete freeze, message logs show nothing of interest. Acer Aspire 4830G laptop.


00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b4)
00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b4)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b4)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation HM65 Express Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
02:00.0 Ethernet controller: Atheros Communications AR8151 v2.0 Gigabit Ethernet (rev c0)
03:00.0 Network controller: Atheros Communications Inc. AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
04:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5116 PCI Express Card Reader (rev 01)
05:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04)
Comment by Steve F (frank419) - Tuesday, 24 January 2012, 01:55 GMT
I am on an Asus 1015PEM with a Broadcom BCM4313 wlan card. I swapped the brcmsmac for the brcmfmac and I am no longer having this issue.

Edit: I am also experiencing the 30sec udev hangup on boot with the above config. I manually added the brcmfmac module to rc.conf but this did not solve the 30sec udev hangup. I tried adding brcmsmac to rc.conf and this solved both the wifi freezing issues and the 30sec hangup.
Comment by Ben Mehne (ben0mega) - Tuesday, 24 January 2012, 02:53 GMT
I have been using "rmmod brcmsmac" and my computer runs fine - I get

[ 4476.203787] ERROR @wl_cfg80211_get_station : Could not get rssi (-1)
[ 4476.203819] ERROR @wl_cfg80211_get_station : Could not get rate (-1)
[ 4476.203821] ERROR @wl_cfg80211_get_station : Could not get rssi (-1)
[ 4477.201207] ERROR @wl_cfg80211_get_station : Could not get rate (-1)

in dmesg, but no problems using wifi.
Comment by Ben Mehne (ben0mega) - Wednesday, 25 January 2012, 01:15 GMT
No crashes since 3.2.1-2
Comment by WTFCoder (wtfcoder) - Wednesday, 25 January 2012, 04:56 GMT
Same here, no crashes since 3.2.1-2. It would be good to know what changed (in regards to this area) so we don't have any regressions.
Comment by WTFCoder (wtfcoder) - Wednesday, 25 January 2012, 17:56 GMT
Arrghh a crash.. nothing in logs
Comment by Ben Mehne (ben0mega) - Wednesday, 25 January 2012, 18:06 GMT
Same here but
ACPI: EC: GPE storm detected, transactions will use polling mode

was in the log a bit before, along with a few unrelated python segfaults.
Comment by Ben Mehne (ben0mega) - Monday, 13 February 2012, 01:56 GMT
I have ceased to get crashes due to this issue after adding brcmsmac to the MODULES line in rc.conf (this got rid of a udev issue as well).

I instead have the dubious honor of spontaneous crashes with ACPI: EC: GPE storm detected, transactions will use polling mode
Comment by Ben Mehne (ben0mega) - Wednesday, 07 March 2012, 20:32 GMT
So an update on my computer: I get around 6 freezes a day (including one while writing this), with

ACPI: EC: GPE storm detected, transactions will use polling mode

which is sometimes preceded by

Mar 7 15:29:35 SilverLeaf kernel: [ 121.156919] WARNING: at drivers/net/wireless/brcm80211/brcmsmac/main.c:8234 brcms_c_wait_for_tx_completion+0x99/0xb0 [brcmsmac]()
Mar 7 15:29:35 SilverLeaf kernel: [ 121.156924] Hardware name: MacBookPro6,2
Mar 7 15:29:35 SilverLeaf kernel: [ 121.156926] Modules linked in: aesni_intel aes_x86_64 cryptd aes_generic rfcomm hidp bnep snd_hda_codec_hdmi snd_hda_codec_cirrus snd_hda_intel snd_hda_codec nvidia(P) snd_hwdep snd_pcm coretemp uvcvideo acpi_cpufreq mperf freq_table videodev applesmc usb_storage firewire_ohci joydev snd_page_alloc v4l2_compat_ioctl32 snd_timer btusb bluetooth uas bcm5974 media snd arc4 soundcore firewire_core intel_agp tg3 i2c_i801 evdev i2c_core intel_ips input_polldev battery button iTCO_wdt pcspkr iTCO_vendor_support libphy ac video apple_bl processor intel_gtt crc_itu_t brcmsmac cordic crc8 brcmutil mac80211 cfg80211 rfkill fuse ext4 crc16 jbd2 mbcache sr_mod sd_mod cdrom pata_acpi hid_apple usbhid hid ata_piix libata scsi_mod uhci_hcd ehci_hcd usbcore usb_common
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157009] Pid: 5, comm: kworker/u:0 Tainted: P O 3.2.8-1-ARCH #1
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157012] Call Trace:
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157025] [<ffffffff8106609f>] warn_slowpath_common+0x7f/0xc0
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157030] [<ffffffff810660fa>] warn_slowpath_null+0x1a/0x20
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157038] [<ffffffffa02bf179>] brcms_c_wait_for_tx_completion+0x99/0xb0 [brcmsmac]
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157042] [<ffffffffa02b11bb>] brcms_ops_flush+0x3b/0x60 [brcmsmac]
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157049] [<ffffffffa026193e>] ieee80211_scan_work+0x11e/0x580 [mac80211]
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157055] [<ffffffffa0261820>] ? ieee80211_scan_rx+0x190/0x190 [mac80211]
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157059] [<ffffffff81082a56>] process_one_work+0x116/0x4d0
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157061] [<ffffffff810833ee>] worker_thread+0x15e/0x350
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157063] [<ffffffff81083290>] ? manage_workers.isra.29+0x230/0x230
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157065] [<ffffffff8108842c>] kthread+0x8c/0xa0
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157070] [<ffffffff8145f674>] kernel_thread_helper+0x4/0x10
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157071] [<ffffffff810883a0>] ? kthread_worker_fn+0x190/0x190
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157073] [<ffffffff8145f670>] ? gs_change+0x13/0x13
Mar 7 15:29:35 SilverLeaf kernel: [ 121.157074] ---[ end trace 12e7222d68c8904c ]---
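
For anyone reading the trace above: each frame line has the form [<address>] symbol+0xoffset/0xsize [module], and a leading ? marks a frame that the kernel found on the stack but could not confirm as part of the call chain. The following is a minimal Python sketch for pulling out which modules show up in the confirmed frames. It is an illustration added for readers, not something from the original comment, and the names Frame, parse_frames and modules_in_trace are made up for this example.

```python
import re
from typing import NamedTuple, Optional

# Matches a frame such as:
#   [<ffffffffa02bf179>] brcms_c_wait_for_tx_completion+0x99/0xb0 [brcmsmac]
#   [<ffffffff81083290>] ? manage_workers.isra.29+0x230/0x230
# The "[ 121.157038]" timestamp is not matched because it lacks the "<".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


class Frame(NamedTuple):
    symbol: str
    module: Optional[str]  # None when the symbol is in the core kernel
    reliable: bool         # False when the frame is prefixed with "?"


def parse_frames(lines):
    """Return one Frame per log line that contains a call-trace entry."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                Frame(m.group("symbol"), m.group("module"), m.group("guess") is None)
            )
    return frames


def modules_in_trace(lines):
    """List loadable modules seen in reliable frames, in order of first appearance."""
    seen = []
    for frame in parse_frames(lines):
        if frame.reliable and frame.module and frame.module not in seen:
            seen.append(frame.module)
    return seen
```

Fed the trace above, modules_in_trace should report brcmsmac first and mac80211 second, which matches reading the trace by eye: the warning fires in brcmsmac while mac80211 is running scan work.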
Comment by Ben Mehne (ben0mega) - Monday, 12 March 2012, 21:41 GMT
So this bug is hard to reproduce on demand - it happens frequently a few seconds after startup, but once the computer is running for a minute or so it occurs very rarely (typically while browsing new sites or ones with lots of ads/ajax requests) for me. Does anyone have a reliable method for freezing the system sometime past boot up?
Comment by Jingcheng Liu (liuexp) - Tuesday, 20 March 2012, 04:08 GMT
With ipv6 enabled, plugging and unplugging the network (several times) and waking up from sleep (on a laptop) are other major causes of kernel freezes for me.
Comment by Ben Mehne (ben0mega) - Wednesday, 21 March 2012, 17:37 GMT
Ok - so I have now separated this problem (for me) into two separate bugs. The first bug is in the Nvidia driver and gives

Mar 21 13:04:39 SilverLeaf kernel: [ 3396.045929] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Mar 21 13:04:39 SilverLeaf kernel: [ 3396.045965] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Mar 21 13:18:45 SilverLeaf kernel: [ 482.714706] NVRM: GPU at 0000:01:00.0 has fallen off the bus.

in the kernel.log - if persistence mode is enabled (/usr/bin/nvidia-smi -pm 1) the screen turns off, but the system is otherwise usable (ssh or by keyboard)

The other bug is a true freeze and makes the system inoperable. It is typically immediately preceded by some sort of NetworkManager message, and not immediately preceded by

ACPI: EC: GPE storm detected, transactions will use polling mode
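
To tell these failure modes apart after a reboot, a log scan along the following lines may help. This is a minimal sketch added for readers, not part of the original comment. The signature strings are copied from the messages quoted in this task; the labels and the function name count_signatures are made up for illustration, and the log path depends on your syslog setup (this task mentions kernel.log and everything.log).

```python
from collections import Counter

# Each label (hypothetical, chosen for this sketch) maps to substrings that
# must ALL appear in a line. The substrings are taken from the log excerpts
# quoted in this task.
SIGNATURES = {
    "nvidia_gpu_lost": ("NVRM:", "has fallen off the bus"),
    "acpi_gpe_storm": ("ACPI: EC: GPE storm detected",),
    # Require "WARNING:" so the matching call-trace frame is not counted twice.
    "brcmsmac_tx_warning": ("WARNING:", "brcms_c_wait_for_tx_completion"),
}


def count_signatures(lines):
    """Count how many log lines match each known signature."""
    counts = Counter({label: 0 for label in SIGNATURES})
    for line in lines:
        for label, needles in SIGNATURES.items():
            if all(needle in line for needle in needles):
                counts[label] += 1
    return counts
```

One way to use it is to pass an open file, for example count_signatures(open("/var/log/kernel.log", errors="replace")), and compare the counts against the times of the freezes. A count of zero for every label around a freeze would suggest the freeze left no trace in the log, as reported earlier in this task.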
Comment by Ben Mehne (ben0mega) - Monday, 02 April 2012, 17:42 GMT
Ok, so after switching to the nouveau driver, all problems for me have ceased. I have not had a freeze or crash in over a week of use (where before I was getting at least one a day). I believe the bug I was experiencing is in the nvidia driver.
Comment by Tobias Powalowski (tpowa) - Saturday, 28 April 2012, 07:55 GMT
So this can be closed, we cannot fix anything nvidia related.
