Arch Linux


FS#79020 - [linux] Realtek RTL8111/8168/8411 non-functioning on 6.4.1 and later

Attached to Project: Arch Linux
Opened by twixt (twixt) - Sunday, 09 July 2023, 06:35 GMT
Last edited by Toolybird (Toolybird) - Thursday, 24 August 2023, 22:31 GMT
Task Type: Bug Report
Category: Kernel
Status: Closed
Assigned To: No-one
Architecture: x86_64
Severity: Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description:
After installing kernel versions 6.4.1 and 6.4.2, the driver for the Realtek GigE RTL8111/8168/8411 stops functioning approximately 1-2 hours after boot; I've seen it fail as early as half an hour after boot. After downgrading to 6.3.9 or earlier, the issue does not recur. The latest kernel shows no such issues with Intel Ethernet controllers (both I225-V and I219-LM).

I first noticed the problem when the network went down and no attempt to fix it worked. Plugging in a USB NIC and taking the affected link down let traffic flow normally, but RTNETLINK then always failed to find the device when I tried to bring the link back up. After a reboot, once the network went down again, running "ip a" to check whether the DHCP address was still assigned hung for around 5 minutes. After that, the DHCP route and address were gone from the interface, and taking it down and up led to the same behavior as described above. I then installed the 6.4.2 kernel and saw the same issues.

Unfortunately, I do not have logs since I was more focused on bringing the machine back online to continue work. I can attempt to replicate and get logs if really necessary, but this is a machine in use almost every day. Forcing it into downtime isn't ideal.
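
Since the failure shows up anywhere from half an hour to a couple of hours after boot, one low-effort way to capture when it happens without taking the machine down is to record link-state changes in the background. The following is only a minimal sketch, not something taken from this report: it assumes the standard sysfs files /sys/class/net/<iface>/operstate and /sys/class/net/<iface>/carrier, and the function names, polling interval, and log path are all hypothetical. Given that "ip a" hung during the failure, these reads might stall as well, so treat the result as a rough timestamp rather than a diagnosis.

```python
import time
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Return (operstate, carrier) for iface, or None if the interface is missing.

    sysfs_root is a parameter only so the function can be pointed at a test directory.
    """
    base = Path(sysfs_root) / iface
    if not base.is_dir():
        return None

    def read(name):
        try:
            return (base / name).read_text().strip()
        except OSError:
            # The carrier file can fail to read while the interface is administratively down.
            return "unreadable"

    return read("operstate"), read("carrier")


def watch(iface, log_path, interval=30.0, sysfs_root="/sys/class/net"):
    """Append a timestamped line to log_path every time the link state changes.

    Runs until interrupted. The 30-second interval is an arbitrary assumption.
    """
    last = object()  # sentinel so the first observed state is always logged
    while True:
        state = read_link_state(iface, sysfs_root)
        if state != last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            with open(log_path, "a") as log:
                log.write(f"{stamp} {iface} {state}\n")
            last = state
        time.sleep(interval)
```

Calling something like watch("enp3s0", "/var/tmp/nic-state.log") would start the loop; the interface name here is a placeholder and should be replaced with the name reported by "ip a".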

Closed by  Toolybird (Toolybird)
Thursday, 24 August 2023, 22:31 GMT
Reason for closing:  Upstream
Additional comments about closing:  If still happening with latest kernels, please report it upstream. This is not an Arch packaging issue.
Comment by Toolybird (Toolybird) - Sunday, 09 July 2023, 06:51 GMT
It sounds like a kernel regression. The general advice for debugging kernel regressions is here [1]. You might have to report this upstream to the kernel folks. On the upstream kernel regression tracker [2] I found a possible lead [3] "transmit queue timeout on r8169". Please let us know how you get on.

[1] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
[2] https://linux-regtracking.leemhuis.info/regzbot/mainline/
[3] https://lore.kernel.org/netdev/c3465166-f04d-fcf5-d284-57357abb3f99%40freenet.de/
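
To compare a local failure against the "transmit queue timeout on r8169" lead, the kernel log from the boot in which the NIC dropped has to be searched for related messages. The sketch below is one possible way to do that and makes several assumptions: that the system keeps a persistent journal so "journalctl -k -b -1" (kernel messages from the previous boot) returns data, and that the relevant lines mention "r8169", "NETDEV WATCHDOG", or "transmit queue". Those search terms are a guess based on the title of the linked thread, not a confirmed signature of this bug, and the function names are hypothetical.

```python
import re
import subprocess

# Loose, case-insensitive filter; the terms are assumptions, not a confirmed signature.
SUSPECT = re.compile(r"r8169|NETDEV WATCHDOG|transmit queue", re.IGNORECASE)


def suspect_lines(kernel_log_text):
    """Return the lines of kernel_log_text that match any of the suspect terms."""
    return [line for line in kernel_log_text.splitlines() if SUSPECT.search(line)]


def previous_boot_kernel_log():
    """Return kernel messages from the previous boot as text.

    Assumes systemd-journald with persistent storage; raises CalledProcessError
    if journalctl fails (for example when no previous boot is recorded).
    """
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running suspect_lines(previous_boot_kernel_log()) right after rebooting from a failure would give a short list of lines to attach to an upstream report, if any match.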
Comment by twixt (twixt) - Sunday, 09 July 2023, 07:00 GMT
Cheers. I was going to post upstream, but saw the warning that bugs in kernels you did not compile yourself should be reported downstream.
I'll try to schedule some debugging in low-load hours so I have something to present there. Checking out mainline to see whether the issue is still present sounds like a plan.
Comment by twixt (twixt) - Sunday, 23 July 2023, 07:54 GMT
Doing some testing with kernel 6.4.4 -- so far so good. Only been 2 hours, though.

I will update once more after leaving the system up for a day or more; by then we can be certain whether this is resolved.
Comment by twixt (twixt) - Tuesday, 25 July 2023, 11:55 GMT
No dice. While it lasted a bit longer with the newer kernel, eventually the NIC dropped off the face of the Earth.
