FS#55133 - [linux] NIC crash

Attached to Project: Arch Linux
Opened by Andrea Amorosi (AndreaA) - Sunday, 13 August 2017, 23:02 GMT
Last edited by Tobias Powalowski (tpowa) - Thursday, 17 August 2017, 13:23 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

After the upgrade to Linux n752vx 4.12.6-1-ARCH #1 SMP PREEMPT Sat Aug 12 09:16:22 CEST 2017 x86_64 GNU/Linux, the following crash happens on my PC 74 seconds after startup (this is from dmesg). The system seems to continue working correctly, but the LAN connection is no longer established.


[ 74.725482] ------------[ cut here ]------------
[ 74.725498] WARNING: CPU: 4 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x21a/0x220
[ 74.725502] Modules linked in: ctr ccm cmac rfcomm bnep joydev nls_iso8859_1 nls_cp437 vfat fat intel_rapl arc4 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper snd_hda_codec_realtek snd_hda_codec_generic iwlmvm mac80211 snd_hda_intel snd_hda_codec snd_hda_core iwlwifi asus_nb_wmi i2c_designware_platform snd_hwdep iTCO_wdt i2c_designware_core asus_wmi iTCO_vendor_support cryptd uvcvideo sparse_keymap mxm_wmi snd_pcm videobuf2_vmalloc intel_cstate rtsx_pci_ms videobuf2_memops cfg80211 intel_rapl_perf memstick hci_uart r8169 btusb snd_timer videobuf2_v4l2 btrtl snd btqca input_leds btbcm videobuf2_core soundcore mei_me mii pcspkr i2c_i801 idma64
[ 74.725589] btintel videodev bluetooth media mousedev mei processor_thermal_device intel_lpss_pci shpchp ecdh_generic intel_pch_thermal intel_soc_dts_iosf elan_i2c i2c_hid thermal int3403_thermal asus_wireless wmi rfkill battery intel_lpss_acpi int3400_thermal int3402_thermal evdev int3406_thermal intel_lpss acpi_thermal_rel int340x_thermal_zone led_class mac_hid acpi_pad ac sch_fq_codel sg acpi_call(O) ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache hid_generic usbhid hid sr_mod sd_mod cdrom xhci_pci serio_raw ahci rtsx_pci_sdmmc xhci_hcd atkbd libahci mmc_core libps2 libata usbcore rtsx_pci scsi_mod usb_common i8042 serio nvidia_drm(PO) nvidia_uvm(PO) nvidia_modeset(PO) nvidia(PO) i915 video button i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm intel_agp intel_gtt
[ 74.725686] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P O 4.12.6-1-ARCH #1
[ 74.725689] Hardware name: ASUSTeK COMPUTER INC. N752VX/N752VX, BIOS N752VX.301 08/18/2016
[ 74.725693] task: ffff8f53f16ec9c0 task.stack: ffffb18b41954000
[ 74.725701] RIP: 0010:dev_watchdog+0x21a/0x220
[ 74.725704] RSP: 0018:ffff8f5403d03e50 EFLAGS: 00010282
[ 74.725709] RAX: 000000000000003d RBX: 0000000000000000 RCX: 0000000000000000
[ 74.725712] RDX: 0000000000000000 RSI: ffff8f5403d0dcc8 RDI: ffff8f5403d0dcc8
[ 74.725715] RBP: ffff8f5403d03e80 R08: 0000000000000865 R09: 0000000000000004
[ 74.725718] R10: ffff8f5403d03ee0 R11: ffffffffbeca052d R12: ffff8f53f01c4a80
[ 74.725721] R13: 0000000000000004 R14: ffff8f53ebd96000 R15: 0000000000000001
[ 74.725725] FS: 0000000000000000(0000) GS:ffff8f5403d00000(0000) knlGS:0000000000000000
[ 74.725729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 74.725732] CR2: 00007fc97801ee14 CR3: 00000002bea09000 CR4: 00000000003406e0
[ 74.725735] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 74.725738] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 74.725741] Call Trace:
[ 74.725745] <IRQ>
[ 74.725754] ? qdisc_rcu_free+0x50/0x50
[ 74.725760] call_timer_fn+0x33/0x160
[ 74.725766] ? qdisc_rcu_free+0x50/0x50
[ 74.725771] run_timer_softirq+0x442/0x490
[ 74.725776] ? ktime_get+0x40/0xa0
[ 74.725783] ? lapic_next_deadline+0x26/0x30
[ 74.725789] ? clockevents_program_event+0xc8/0x100
[ 74.725795] __do_softirq+0xde/0x2d7
[ 74.725803] irq_exit+0xb6/0xc0
[ 74.725807] smp_apic_timer_interrupt+0x3d/0x50
[ 74.725812] apic_timer_interrupt+0x89/0x90
[ 74.725818] RIP: 0010:cpuidle_enter_state+0x12b/0x300
[ 74.725821] RSP: 0018:ffffb18b41957e58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[ 74.725825] RAX: ffff8f5403d18c40 RBX: 0000001165fb3be2 RCX: 000000000000001f
[ 74.725828] RDX: 0000001165fb3be2 RSI: ffff8f5403d16458 RDI: 0000000000000000
[ 74.725831] RBP: ffffb18b41957e98 R08: cccccccccccccccd R09: 0000000000000018
[ 74.725834] R10: 000000000000020f R11: 0000000000014185 R12: ffff8f5403d21300
[ 74.725837] R13: 0000000000000000 R14: 0000000000000004 R15: ffffffffbeaa91f8
[ 74.725840] </IRQ>
[ 74.725847] ? cpuidle_enter_state+0x11b/0x300
[ 74.725851] cpuidle_enter+0x17/0x20
[ 74.725857] call_cpuidle+0x23/0x40
[ 74.725861] do_idle+0x18a/0x1e0
[ 74.725867] cpu_startup_entry+0x71/0x80
[ 74.725873] start_secondary+0x158/0x1a0
[ 74.725877] secondary_startup_64+0x9f/0x9f
[ 74.725882] Code: 63 8e 64 04 00 00 eb 95 4c 89 f7 c6 05 2a 59 57 00 01 e8 6a 7c fd ff 89 d9 48 89 c2 4c 89 f6 48 c7 c7 e8 fc 99 be e8 99 1d c2 ff <0f> ff eb c3 66 90 0f 1f 44 00 00 48 c7 47 08 00 00 00 00 55 48
[ 74.725955] ---[ end trace c3f9d58ba87406cc ]---

Closed by Tobias Powalowski (tpowa)
Thursday, 17 August 2017, 13:23 GMT
Reason for closing: Not a bug
Comment by Andrea Amorosi (AndreaA) - Sunday, 13 August 2017, 23:40 GMT
If the LAN cable is not connected, the crash does not happen.
If I connect the LAN cable, the crash happens after 55 seconds.
Comment by Doug Newgard (Scimmia) - Sunday, 13 August 2017, 23:52 GMT
And you think this is a packaging problem? If not, nothing will get fixed here.
Comment by loqs (loqs) - Sunday, 13 August 2017, 23:58 GMT
Please try 4.12.7-1 (currently in testing). If that does not resolve the issue, bisect the kernel to find the bad commit and report the issue upstream.
Comment by Andrea Amorosi (AndreaA) - Monday, 14 August 2017, 00:10 GMT
I've found this:
https://sourceforge.net/p/e1000/bugs/571/?limit=25
Do you think it is related?
Comment by loqs (loqs) - Monday, 14 August 2017, 18:17 GMT
As your dmesg shows, the e1000e module is not loaded. If that issue is specific to that driver, then your issue is a different one.
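One way to double-check which network driver is involved is to parse the "Modules linked in:" part of the trace. The following is only a rough sketch in Python: the helper name linked_modules is made up for illustration, the sample is an abbreviated excerpt of the module list posted above, and the parsing assumes the list ends at the line starting with "CPU:", as it does in this trace.

```python
import re

def linked_modules(dmesg_text):
    """Collect module names from 'Modules linked in:' and its continuation lines.

    Hypothetical helper: assumes the module list ends at the 'CPU:' line.
    """
    modules = []
    collecting = False
    for line in dmesg_text.splitlines():
        # Strip a leading "[ 74.725502] " style timestamp if present.
        body = re.sub(r"^\[\s*\d+\.\d+\]\s*", "", line)
        if body.startswith("Modules linked in:"):
            collecting = True
            body = body[len("Modules linked in:"):]
        elif collecting and (body.startswith("CPU:") or not body.strip()):
            break
        if collecting:
            # Drop taint flags such as "(PO)" so "nvidia(PO)" becomes "nvidia".
            modules.extend(re.sub(r"\(.*?\)$", "", name) for name in body.split())
    return set(modules)

# Abbreviated excerpt of the trace from this report (most modules omitted).
sample = """\
[ 74.725502] Modules linked in: ctr ccm hci_uart r8169 btusb mii
[ 74.725589] btintel videodev nvidia(PO) i915
[ 74.725686] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P O 4.12.6-1-ARCH #1
"""

mods = linked_modules(sample)
print("e1000e" in mods, "r8169" in mods)
```

On the excerpt above, e1000e is absent while r8169 (the Realtek Ethernet driver) is present, which matches the observation that the linked e1000e report concerns a different driver.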
Comment by Andrea Amorosi (AndreaA) - Thursday, 17 August 2017, 09:41 GMT
SHORT VERSION
The problem has resolved itself.
LONG VERSION
I tried downgrading packages to a date when the LAN still worked (10 days ago), but the problem was still present.
Then I realized that a few days ago I had used Windows 10, which is installed in dual boot, and that this may have left the NIC in an inconsistent state.
So I booted into Windows again and discovered that the LAN didn't work there either.
I shut down the system and rebooted into Windows, and the LAN started working under both Windows and Linux.
So maybe Windows had left something "dirty" in the NIC that triggered the crash in Linux, and the subsequent use of Windows somehow cleared the inconsistent state. The crash no longer happens, even with the latest kernel, and I am no longer able to reproduce the bug.
So the issue can be closed.
Thank you