FS#64366 - dhcpcd does not reliably detect carrier change/reacquiring

Attached to Project: Arch Linux
Opened by Nico Schottelius (telmich) - Sunday, 03 November 2019, 12:19 GMT
Last edited by Antonio Rojas (arojas) - Wednesday, 20 November 2019, 08:59 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Ronald van Haren (pressh)
Antonio Rojas (arojas)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

Description:

I am running wpa_supplicant to manage the wifi networks directly, and dhcpcd or dhcpcd <interface> to manage IP address assignment.

Since installing Arch Linux on a new Dell XPS 13" 2-in-1, I see the following behaviour:

After some suspend/resume cycles, dhcpcd does not notice that the link has been re-established and stays in the "deleted IP" state.
wpa_supplicant, on the other hand, does notice the disconnect and reassociates correctly.
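
The state described above (link re-established, but no address assigned) can be checked from the shell. The following is only a minimal sketch: the function names and the SYSFS_NET override are hypothetical, and it assumes that the kernel exposes the link state in /sys/class/net/<interface>/carrier and that iproute2's ip is installed.

```shell
# Sketch: report whether an interface looks "stuck" (carrier up, no IPv4 address).
# SYSFS_NET is a hypothetical override for testing; the real path is /sys/class/net.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

link_has_carrier() {
    # The kernel writes 1 to <iface>/carrier while the link is up.
    # Reading fails if the interface is administratively down; treat that as "no carrier".
    [ "$(cat "$SYSFS_NET/$1/carrier" 2>/dev/null)" = "1" ]
}

has_ipv4() {
    # Expects the output of `ip -4 -o addr show dev <iface>` on stdin.
    grep -q ' inet '
}

looks_stuck() {
    iface=$1
    link_has_carrier "$iface" || return 1
    # Succeeds only if the link is up and no IPv4 address is configured.
    ! ip -4 -o addr show dev "$iface" 2>/dev/null | has_ipv4
}
```

For example, `looks_stuck wlp0s20f3 && echo "carrier up, but no IPv4 address"` would flag the situation a while after wpa_supplicant has reassociated. Running `ip monitor link` in a second terminal shows whether the kernel emits a link event at all.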

I am using "modprobe.blacklist=intel_lpss_pci" to be able to boot the device; I am not sure whether this is related.

My assumption is that *maybe* the kernel is losing the netlink message. In any case, the symptom is that I no longer have network connectivity after a couple of suspend/resume cycles.
This is fully


Additional info:
* package version(s)

[root@diamond ~]# pacman -Ss dhcpcd
core/dhcpcd 8.1.1-1 [installed]
RFC2131 compliant DHCP client daemon
[root@diamond ~]# uname -a
Linux diamond 5.3.8-arch1-1 #1 SMP PREEMPT @1572357769 x86_64 GNU/Linux
[root@diamond ~]# pacman -Ss wpa_supplicant
core/wpa_supplicant 2:2.9-1 [installed]
A utility providing key negotiation for WPA wireless networks
[root@diamond ~]#


* config and/or log files etc.

[root@diamond ~]# dhcpcd -B wlp0s20f3
main: control_open: Connection refused
DUID 00:04:4c:4c:45:44:00:32:4c:10:80:56:b4:c0:4f:30:5a:32
wlp0s20f3: IAID 9a:54:c3:bf
wlp0s20f3: adding address fe80::3b98:cb58:ed02:c25
wlp0s20f3: soliciting an IPv6 router
wlp0s20f3: soliciting a DHCP lease
wlp0s20f3: offered 192.168.4.22 from 192.168.4.188
wlp0s20f3: probing address 192.168.4.22/24
wlp0s20f3: Router Advertisement from fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: adding address 2a0a:e5c1:111:111:6aa6:5bc:535a:8e21/64
wlp0s20f3: adding route to 2a0a:e5c1:111:111::/64
wlp0s20f3: adding default route via fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: leased 192.168.4.22 for 600 seconds
wlp0s20f3: adding route to 192.168.4.0/24
wlp0s20f3: adding default route via 192.168.4.188
wlp0s20f3: Router Advertisement from fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: fe80::20d:b9ff:fe46:3bd4 is unreachable
wlp0s20f3: carrier lost
wlp0s20f3: deleting address 2a0a:e5c1:111:111:6aa6:5bc:535a:8e21/64
wlp0s20f3: deleting route to 2a0a:e5c1:111:111::/64
wlp0s20f3: deleting default route via fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: deleting address fe80::3b98:cb58:ed02:c25
wlp0s20f3: deleting route to 192.168.4.0/24
wlp0s20f3: deleting default route via 192.168.4.188
wlp0s20f3: carrier acquired
wlp0s20f3: IAID 9a:54:c3:bf
wlp0s20f3: adding address fe80::9a78:130:c176:96c7
wlp0s20f3: soliciting an IPv6 router
wlp0s20f3: soliciting a DHCP lease
wlp0s20f3: offered 192.168.43.38 from 192.168.43.51
wlp0s20f3: ignoring offer of 192.168.43.38 from 192.168.43.51
wlp0s20f3: probing address 192.168.43.38/24
wlp0s20f3: leased 192.168.43.38 for 3600 seconds
wlp0s20f3: adding route to 192.168.43.0/24
wlp0s20f3: adding default route via 192.168.43.51
wlp0s20f3: no IPv6 Routers available
wlp0s20f3: carrier lost
wlp0s20f3: deleting address fe80::9a78:130:c176:96c7
wlp0s20f3: deleting route to 192.168.43.0/24
wlp0s20f3: deleting default route via 192.168.43.51
wlp0s20f3: carrier acquired
wlp0s20f3: IAID 9a:54:c3:bf
wlp0s20f3: adding address fe80::3b98:cb58:ed02:c25
wlp0s20f3: soliciting an IPv6 router
wlp0s20f3: soliciting a DHCP lease
wlp0s20f3: Router Advertisement from fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: adding address 2a0a:e5c1:111:111:6aa6:5bc:535a:8e21/64
wlp0s20f3: adding route to 2a0a:e5c1:111:111::/64
wlp0s20f3: adding default route via fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: offered 192.168.4.22 from 192.168.4.188
wlp0s20f3: probing address 192.168.4.22/24
wlp0s20f3: Router Advertisement from fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: leased 192.168.4.22 for 600 seconds
wlp0s20f3: adding route to 192.168.4.0/24
wlp0s20f3: adding default route via 192.168.4.188
wlp0s20f3: fe80::20d:b9ff:fe46:3bd4 is unreachable
wlp0s20f3: soliciting an IPv6 router
wlp0s20f3: Router Advertisement from fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: Router Advertisement from fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: fe80::20d:b9ff:fe46:3bd4 is unreachable
wlp0s20f3: carrier lost
wlp0s20f3: deleting address 2a0a:e5c1:111:111:6aa6:5bc:535a:8e21/64
wlp0s20f3: deleting route to 2a0a:e5c1:111:111::/64
wlp0s20f3: deleting default route via fe80::20d:b9ff:fe46:3bd4
wlp0s20f3: deleting address fe80::3b98:cb58:ed02:c25
Killed

* link to upstream bug report, if any

Steps to reproduce:

* suspend / resume a couple of times with the specific notebook

To fix:

* pkill -9 dhcpcd
** Interestingly, dhcpcd ignores SIGINT in this state!
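
Since dhcpcd stops reacting to the polite signals in this state, the workaround can be sketched as an escalation from SIGTERM to SIGKILL. This is only an illustration: stop_escalating and its grace-period parameter are hypothetical names, not part of dhcpcd or of any package.

```shell
# Hypothetical helper: stop a process by PID, escalating from SIGTERM to SIGKILL.
# Prints which step ended it: "GONE" (was not running), "TERM" or "KILL".
stop_escalating() {
    pid=$1
    grace=${2:-5}    # seconds to wait after SIGTERM (assumed default)
    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || { echo "GONE"; return 0; }
    i=0
    while [ "$i" -lt "$grace" ]; do
        sleep 1
        # kill -0 only tests whether the PID still exists.
        kill -0 "$pid" 2>/dev/null || { echo "TERM"; return 0; }
        i=$((i + 1))
    done
    # Still present after the grace period: same effect as `pkill -9` above.
    kill -KILL "$pid" 2>/dev/null
    echo "KILL"
}
```

It could be used as `for pid in $(pgrep -x dhcpcd); do stop_escalating "$pid"; done`, after which dhcpcd has to be started again for the interface.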



Closed by  Antonio Rojas (arojas)
Wednesday, 20 November 2019, 08:59 GMT
Reason for closing:  Fixed
Comment by Nico Schottelius (telmich) - Sunday, 03 November 2019, 21:45 GMT
Update: while I first thought this was suspend/resume related, it actually is not. Moving inside a building and roaming to another access point can trigger the same problem. I also tested SIGTERM, but dhcpcd no longer reacts to that either.
Comment by Nico Schottelius (telmich) - Tuesday, 05 November 2019, 17:46 GMT
I just discussed this with the upstream author, and there seems to be a fix in the repository already. I'm testing from a checkout and will report back whether the post-8.1.1 code fixes the problem.
Comment by George C. Privon (privong) - Wednesday, 20 November 2019, 03:17 GMT
I haven't done any careful testing, but since dhcpcd was updated to 8.1.2-1, I have no longer been having this issue.
Comment by Antonio Rojas (arojas) - Wednesday, 20 November 2019, 06:47 GMT
@telmich can you confirm that this is fixed?
Comment by Nico Schottelius (telmich) - Wednesday, 20 November 2019, 08:03 GMT
@arojas I can confirm that - I had actually been running dhcpcd 8.1.2 from source for some time, and the problem was already fixed there.
