FS#41906 - [linux] kernel 3.16 iwlwifi crash

Attached to Project: Arch Linux
Opened by adrin jalali (adrin) - Wednesday, 10 September 2014, 08:51 GMT
Last edited by Doug Newgard (Scimmia) - Friday, 10 June 2016, 23:24 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 12
Private No

Details

The system connects to the wireless network; after some time, I get a crash in the iwlwifi module.

# uname -a
Linux k-247.eduroam.uni.wroc.pl 3.16.2-1-ARCH #1 SMP PREEMPT Sat Sep 6 13:12:51 CEST 2014 x86_64 GNU/Linux

dmesg cut:

[ 225.197003] ------------[ cut here ]------------
[ 225.197053] WARNING: CPU: 2 PID: 253 at net/wireless/reg.c:1806 reg_process_hint+0x2d1/0x460 [cfg80211]()
[ 225.197062] Modules linked in: ctr ccm fuse uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev media joydev mousedev coretemp hwmon arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm iwldvm mac80211 iTCO_wdt crct10dif_pclmul iTCO_vendor_support crc32_pclmul crc32c_intel ghash_clmulni_intel samsung_laptop aesni_intel aes_x86_64 lrw gf128mul led_class glue_helper ablk_helper cryptd evdev psmouse microcode mac_hid iwlwifi snd_hda_codec_hdmi serio_raw pcspkr snd_hda_codec_realtek cfg80211 snd_hda_codec_generic i2c_i801 r8169 rfkill i915 fan thermal mii snd_hda_intel snd_hda_controller snd_hda_codec drm_kms_helper battery snd_hwdep drm snd_pcm tpm_tis tpm snd_timer intel_gtt snd i2c_algo_bit wmi i2c_core ac soundcore video mei_me mei shpchp lpc_ich button
[ 225.197159] processor nfs lockd sunrpc fscache ext4 crc16 mbcache jbd2 uas usb_storage sd_mod crc_t10dif crct10dif_common atkbd libps2 ahci ehci_pci libahci xhci_hcd ehci_hcd libata scsi_mod usbcore usb_common i8042 serio
[ 225.197198] CPU: 2 PID: 253 Comm: kworker/2:4 Not tainted 3.16.2-1-ARCH #1
[ 225.197203] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 900X3C/900X4C/900X4D/SAMSUNG_NP1234567890, BIOS P02AAC 06/01/2012
[ 225.197219] Workqueue: events reg_todo [cfg80211]
[ 225.197223] 0000000000000000 00000000f4cfb330 ffff8800c644fd50 ffffffff8152afec
[ 225.197231] 0000000000000000 ffff8800c644fd88 ffffffff8106e45d ffff8800bce2f180
[ 225.197238] ffff880118490260 ffff8800bce2f19c ffff88011f298000 0ffff88011f29800
[ 225.197244] Call Trace:
[ 225.197256] [<ffffffff8152afec>] dump_stack+0x4d/0x6f
[ 225.197266] [<ffffffff8106e45d>] warn_slowpath_common+0x7d/0xa0
[ 225.197273] [<ffffffff8106e58a>] warn_slowpath_null+0x1a/0x20
[ 225.197288] [<ffffffffa05103c1>] reg_process_hint+0x2d1/0x460 [cfg80211]
[ 225.197304] [<ffffffffa05105c9>] reg_todo+0x79/0x1a0 [cfg80211]
[ 225.197316] [<ffffffff8108afa8>] process_one_work+0x168/0x450
[ 225.197324] [<ffffffff8108b5db>] worker_thread+0x6b/0x550
[ 225.197332] [<ffffffff8108b570>] ? init_pwq.part.22+0x10/0x10
[ 225.197340] [<ffffffff81091cea>] kthread+0xea/0x100
[ 225.197349] [<ffffffff811c0000>] ? vfs_truncate+0x130/0x1a0
[ 225.197356] [<ffffffff81091c00>] ? kthread_create_on_node+0x1b0/0x1b0
[ 225.197365] [<ffffffff81530cbc>] ret_from_fork+0x7c/0xb0
[ 225.197372] [<ffffffff81091c00>] ? kthread_create_on_node+0x1b0/0x1b0
[ 225.197377] ---[ end trace 74202afde656227d ]---
   dmesg.txt (61.3 KiB)

Closed by  Doug Newgard (Scimmia)
Friday, 10 June 2016, 23:24 GMT
Reason for closing:  Fixed
Comment by Andrea Fagiani (Hador) - Tuesday, 16 September 2014, 08:44 GMT
Same issue (on the same hardware platform, apparently).
However, it only happens on resume from suspend/hibernate.
Comment by Andrea Fagiani (Hador) - Friday, 19 September 2014, 08:09 GMT
Still crashing on 3.17-rc5.

However, unloading the module before suspend (and reloading it afterwards) does not trigger the issue.
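
The unload-before-suspend workaround described above could be automated roughly as follows. This is only a sketch under assumptions that are not part of the report: that the system uses systemd (which runs executables placed in /usr/lib/systemd/system-sleep/ with "pre" or "post" as the first argument), and that the loaded modules are iwldvm and iwlwifi as in the original module list. Newer cards use iwlmvm instead. The helper name commands_for is made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a systemd sleep hook: unload the Wi-Fi driver before sleep, reload after."""
import subprocess
import sys

# Assumption: iwldvm is the op-mode module in use (as in the original report).
# Adjust to iwlmvm for newer cards.
MODULES = ["iwldvm", "iwlwifi"]


def commands_for(phase, modules=MODULES):
    """Return the modprobe commands for a sleep phase ("pre" or "post")."""
    if phase == "pre":
        # Unload the dependent module first, then the core driver.
        return [["modprobe", "-r", m] for m in modules]
    if phase == "post":
        # Reload in the opposite order.
        return [["modprobe", m] for m in reversed(modules)]
    return []


def main(argv):
    phase = argv[1] if len(argv) > 1 else ""
    for cmd in commands_for(phase):
        # check=False: a module that is already unloaded/loaded is not fatal here.
        subprocess.run(cmd, check=False)


if __name__ == "__main__":
    main(sys.argv)
```

Saved as an executable file in the system-sleep directory, the script would need to run as root, which systemd does for sleep hooks. Whether this avoids the warning on other hardware is untested.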
Comment by Benjamin Campbell (benjica) - Monday, 29 September 2014, 22:42 GMT
I am getting the same crash on different hardware whenever I use a network with WPA2.
Comment by Tigran G (tigrang) - Wednesday, 29 October 2014, 18:43 GMT
Same message, different hardware. Lenovo Z50-70.
Kernel 3.18rc2 (happened on 3.17.X too)

[67999.224512] WARNING: CPU: 3 PID: 18812 at net/wireless/reg.c:1846 reg_process_hint+0x2d1/0x460 [cfg80211]()
[67999.224514] Modules linked in: ehci_pci ehci_hcd thinkpad_acpi nvram psmouse msr cpufreq_stats ctr ccm fuse bbswitch(O) coretemp hwmon intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel snd_hda_codec_hdmi ghash_clmulni_intel aesni_intel joydev mousedev aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd rtsx_usb_ms uvcvideo rtsx_usb_sdmmc memstick rtsx_usb ecb videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common btusb videodev media bluetooth ppdev arc4 serio_raw evdev iTCO_wdt mac_hid iTCO_vendor_support i915 iwlmvm mac80211 i2c_hid r8169 mii hid drm_kms_helper iwlwifi drm dw_dmac cfg80211 rfkill spi_pxa2xx_platform dw_dmac_core i2c_designware_platform parport_pc battery video gpio_lynxpoint parport intel_gtt i2c_algo_bit i2c_designware_core
[67999.224537] i2c_i801 shpchp 8250_dw snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel button mei_me mei i2c_core lpc_ich snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer snd processor soundcore ac wmi ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod atkbd libps2 ahci libahci libata xhci_pci xhci_hcd scsi_mod usbcore usb_common i8042 serio sdhci_acpi sdhci led_class mmc_core [last unloaded: ehci_hcd]
[67999.224556] CPU: 3 PID: 18812 Comm: kworker/3:2 Tainted: G W O 3.18.0-1-mainline #1
[67999.224557] Hardware name: LENOVO 20354/Lancer 5A5, BIOS 9BCN25WW 04/10/2014
[67999.224561] Workqueue: events reg_todo [cfg80211]
[67999.224562] 0000000000000000 00000000ea22d11e ffff88010facbd28 ffffffff8154fc66
[67999.224563] 0000000000000000 0000000000000000 ffff88010facbd68 ffffffff81071b81
[67999.224565] ffff88010facbd58 ffff88012dcee940 ffff88023d11c260 ffff88012dcee95c
[67999.224567] Call Trace:
[67999.224571] [<ffffffff8154fc66>] dump_stack+0x4e/0x71
[67999.224574] [<ffffffff81071b81>] warn_slowpath_common+0x81/0xa0
[67999.224576] [<ffffffff81071c9a>] warn_slowpath_null+0x1a/0x20
[67999.224580] [<ffffffffa03694d1>] reg_process_hint+0x2d1/0x460 [cfg80211]
[67999.224584] [<ffffffffa03696d9>] reg_todo+0x79/0x1a0 [cfg80211]
[67999.224588] [<ffffffff8108a715>] process_one_work+0x145/0x400
[67999.224590] [<ffffffff8108acdb>] worker_thread+0x6b/0x4a0
[67999.224592] [<ffffffff8108ac70>] ? init_pwq.part.22+0x10/0x10
[67999.224594] [<ffffffff8108fd4a>] kthread+0xea/0x100
[67999.224596] [<ffffffff8108fc60>] ? kthread_create_on_node+0x1c0/0x1c0
[67999.224598] [<ffffffff8155573c>] ret_from_fork+0x7c/0xb0
[67999.224600] [<ffffffff8108fc60>] ? kthread_create_on_node+0x1c0/0x1c0
Comment by P.H. (Vain) - Thursday, 13 November 2014, 18:58 GMT
Does anybody know whether upstream is aware of this issue? Or is this an Arch-only bug?
Comment by Tigran G (tigrang) - Thursday, 13 November 2014, 19:02 GMT
I can't find an upstream report.
Comment by karoneun (karoneun) - Thursday, 13 November 2014, 20:31 GMT
At least on my machine (TP x220, Intel Centrino Advanced-N 6205), I sometimes get "Fail to flush Tx queue" iwlwifi crashes in addition to the crashes mentioned here. Because of this correlation, maybe this bug could be related? https://bugzilla.kernel.org/show_bug.cgi?id=56581
Comment by Aliaksandr Stelmachonak (ava1ar) - Friday, 14 November 2014, 06:21 GMT
Reproducible for me on a System76 Galago UltraPro with Atheros AR9462. After hibernating, I get:

[ 276.195519] WARNING: CPU: 0 PID: 4 at net/wireless/reg.c:1806 reg_process_hint+0x2d1/0x460 [cfg80211]()
[ 276.195521] Modules linked in: hid_generic hidp hid ctr ccm rfcomm md5 pci_stub vboxpci(O) vboxnetadp(O) ec_sys joydev mousedev sr_mod cdrom bnep zram lz4_compress ip6t_REJECT ath3k ecb btusb bluetooth xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT iTCO_wdt iTCO_vendor_support xt_limit xt_tcpudp arc4 nvram ath9k ath9k_common ath9k_hw ath led_class xt_addrtype mac80211 coretemp hwmon x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm cfg80211 crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel rfkill aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse pcspkr rtc_efi rtsx_pci_ms serio_raw i2c_i801 memstick lpc_ich thermal tpm_infineon tpm_tis wmi evdev battery mac_hid ac nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack snd_hda_codec_via snd_hda_codec_generic
[ 276.195564] ip6table_filter ip6_tables iptable_filter ip_tables x_tables snd_hda_codec_hdmi snd_hda_intel e1000e snd_hda_controller mei_me sch_fq_codel snd_hda_codec mei ptp shpchp pps_core snd_hwdep vboxnetflt(O) vboxdrv(O) snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore nfs lockd sunrpc fscache fuse exfat(O) ecryptfs cbc sha256_ssse3 sha256_generic encrypted_keys sha1_ssse3 sha1_generic hmac trusted tpm cpufreq_powersave cpufreq_userspace cpufreq_conservative processor vhba(O) ext4 crc16 mbcache jbd2 sd_mod crc_t10dif crct10dif_common rtsx_pci_sdmmc mmc_core atkbd libps2 xhci_hcd ahci libahci ehci_pci ehci_hcd libata usbcore scsi_mod rtsx_pci usb_common i8042 serio i915 button intel_gtt i2c_algo_bit video drm_kms_helper drm i2c_core
[ 276.195608] CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: G W O 3.17.2-1-ARCH #1
[ 276.195610] Hardware name: Notebook W740SU /W740SU , BIOS 4.6.5 04/21/2014
[ 276.195616] Workqueue: events reg_todo [cfg80211]
[ 276.195618] 0000000000000000 00000000f32fff3f ffff8804095e7d58 ffffffff815367d0
[ 276.195621] 0000000000000000 ffff8804095e7d90 ffffffff8107054d ffff8803e13b98c0
[ 276.195623] ffff8800dae50260 ffff8803e13b98dc 0000000000000000 ffff88041fa18000
[ 276.195625] Call Trace:
[ 276.195631] [<ffffffff815367d0>] dump_stack+0x4d/0x6f
[ 276.195635] [<ffffffff8107054d>] warn_slowpath_common+0x7d/0xa0
[ 276.195638] [<ffffffff8107067a>] warn_slowpath_null+0x1a/0x20
[ 276.195644] [<ffffffffa079b3b1>] reg_process_hint+0x2d1/0x460 [cfg80211]
[ 276.195649] [<ffffffffa079b5b9>] reg_todo+0x79/0x1a0 [cfg80211]
[ 276.195655] [<ffffffff81088b85>] process_one_work+0x145/0x400
[ 276.195657] [<ffffffff8108914b>] worker_thread+0x6b/0x4a0
[ 276.195660] [<ffffffff810890e0>] ? init_pwq.part.22+0x10/0x10
[ 276.195663] [<ffffffff8108e06a>] kthread+0xea/0x100
[ 276.195665] [<ffffffff8108df80>] ? kthread_create_on_node+0x1b0/0x1b0
[ 276.195669] [<ffffffff8153c6fc>] ret_from_fork+0x7c/0xb0
[ 276.195671] [<ffffffff8108df80>] ? kthread_create_on_node+0x1b0/0x1b0
Comment by Alfred Krohmer (devkid) - Friday, 14 November 2014, 18:23 GMT
Same problem here with a Thinkpad T440s 20ARS0XL00. Exact same stack traces as above. This happens about twice shortly after boot and then every once in a while, which is kind of annoying. It happens independently of hibernate / suspend. Doing a manual disconnect and reconnect in NetworkManager works, though.
Comment by Jay (GSF1200S) - Friday, 14 November 2014, 18:47 GMT
Like karoneun above, I have both the problem in this bug report (with a stack trace in journalctl) and the issues described in:
https://bugzilla.kernel.org/show_bug.cgi?id=56581

Lenovo T530:
03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 3e)

It doesn't seem that the stack trace occurring really changes anything, as my network continues to work. However, once I get the "iwlwifi 0000:03:00.0: Q 5 is active and mapped to fifo" crap, my network slows to a crawl and eventually becomes useless.
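
Since two different symptoms are being reported in this thread, a small script can help tell them apart in a saved log. The sketch below counts the reg.c warning and the "fail to flush all tx fifo queues" message separately. The regular expressions are derived only from the log lines pasted in this report, and the function name classify is made up for illustration; other kernel versions may word these messages differently.

```python
import re

# Patterns taken from the log excerpts in this report (assumed stable wording).
REG_WARNING = re.compile(r"WARNING: .* at net/wireless/reg\.c:\d+ reg_process_hint")
TX_FLUSH = re.compile(r"iwlwifi \S+ fail to flush all tx fifo queues", re.IGNORECASE)


def classify(lines):
    """Count occurrences of each symptom in an iterable of log lines."""
    counts = {"reg_warning": 0, "tx_flush_failure": 0}
    for line in lines:
        if REG_WARNING.search(line):
            counts["reg_warning"] += 1
        elif TX_FLUSH.search(line):
            counts["tx_flush_failure"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 classify.py
    print(classify(sys.stdin))
```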
Comment by Tigran G (tigrang) - Sunday, 16 November 2014, 02:04 GMT
I can no longer connect to 5 GHz networks. I keep getting this:

[67880.036075] iwlwifi 0000:02:00.0: fail to flush all tx fifo queues Q 0
[67880.036079] iwlwifi 0000:02:00.0: Current SW read_ptr 81 write_ptr 82
[67880.036100] iwl data: 00000000: 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[67880.036112] iwlwifi 0000:02:00.0: FH TRBs(0) = 0x00000000
[67880.036123] iwlwifi 0000:02:00.0: FH TRBs(1) = 0xc01100b9
[67880.036134] iwlwifi 0000:02:00.0: FH TRBs(2) = 0x00000000
[67880.036145] iwlwifi 0000:02:00.0: FH TRBs(3) = 0x80300051
[67880.036156] iwlwifi 0000:02:00.0: FH TRBs(4) = 0x00000000
[67880.036167] iwlwifi 0000:02:00.0: FH TRBs(5) = 0x00000000
[67880.036178] iwlwifi 0000:02:00.0: FH TRBs(6) = 0x00000000
[67880.036189] iwlwifi 0000:02:00.0: FH TRBs(7) = 0x007090a3
[67880.036237] iwlwifi 0000:02:00.0: Q 0 is active and mapped to fifo 3 ra_tid 0x0000 [81,82]
[67880.036285] iwlwifi 0000:02:00.0: Q 1 is active and mapped to fifo 2 ra_tid 0x0000 [0,0]
[67880.036333] iwlwifi 0000:02:00.0: Q 2 is active and mapped to fifo 1 ra_tid 0x0000 [94,95]
[67880.036381] iwlwifi 0000:02:00.0: Q 3 is active and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036428] iwlwifi 0000:02:00.0: Q 4 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036476] iwlwifi 0000:02:00.0: Q 5 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036524] iwlwifi 0000:02:00.0: Q 6 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036572] iwlwifi 0000:02:00.0: Q 7 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036619] iwlwifi 0000:02:00.0: Q 8 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036667] iwlwifi 0000:02:00.0: Q 9 is active and mapped to fifo 7 ra_tid 0x0000 [164,164]
[67880.036715] iwlwifi 0000:02:00.0: Q 10 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036763] iwlwifi 0000:02:00.0: Q 11 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036811] iwlwifi 0000:02:00.0: Q 12 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036858] iwlwifi 0000:02:00.0: Q 13 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036906] iwlwifi 0000:02:00.0: Q 14 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.036954] iwlwifi 0000:02:00.0: Q 15 is active and mapped to fifo 5 ra_tid 0x0000 [0,0]
[67880.037002] iwlwifi 0000:02:00.0: Q 16 is active and mapped to fifo 1 ra_tid 0x0000 [171,212]
[67880.037049] iwlwifi 0000:02:00.0: Q 17 is inactive and mapped to fifo 3 ra_tid 0x0006 [24,24]
[67880.037097] iwlwifi 0000:02:00.0: Q 18 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
[67880.037145] iwlwifi 0000:02:00.0: Q 19 is inactive and mapped to fifo 0 ra_tid 0x0000 [0,0]
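
To pick out the interesting queues from a dump like the one above, something along these lines may help. It is a sketch that assumes the bracketed pair at the end of each "Q n is ..." line is the queue's read and write pointer, which matches the "read_ptr 81 write_ptr 82" line and the [81,82] shown for Q 0 but is not confirmed anywhere in this report. The function name pending_queues is hypothetical.

```python
import re

# Matches lines such as:
#   Q 0 is active and mapped to fifo 3 ra_tid 0x0000 [81,82]
QUEUE_LINE = re.compile(
    r"Q (\d+) is (active|inactive) and mapped to fifo (\d+) "
    r"ra_tid 0x[0-9a-fA-F]+ \[(\d+),(\d+)\]"
)


def pending_queues(lines):
    """Return numbers of active queues whose two pointers differ."""
    pending = []
    for line in lines:
        match = QUEUE_LINE.search(line)
        if not match:
            continue
        queue, state, _fifo, first, second = match.groups()
        # Assumption: differing pointers mean frames are still queued.
        if state == "active" and first != second:
            pending.append(int(queue))
    return pending
```

Applied to the dump above, this would report queues 0, 2 and 16.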
Comment by Tigran G (tigrang) - Sunday, 16 November 2014, 02:05 GMT
Can someone make a new upstream bug report and link here?
Comment by karoneun (karoneun) - Friday, 28 November 2014, 16:35 GMT
Done. :) https://bugzilla.kernel.org/show_bug.cgi?id=89001
If there's anything I should add/change, let me know.
Comment by Steven (Stebalien) - Monday, 05 January 2015, 21:44 GMT
FYI, there's a patch now. It will be in 3.19 and is cosmetic, so you probably shouldn't bother applying it.
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/net/wireless/reg.c?id=70dcec5a488a7b81779190ac8089475fe4b8b962
Comment by Samantha McVey (samcv) - Friday, 10 June 2016, 22:55 GMT
Looks like this has been fixed upstream. Recommending this be closed.
