FS#68744 - [unbound] dependency problem with systemd at startup

Attached to Project: Community Packages
Opened by Gaetan Bisson (vesath) - Wednesday, 25 November 2020, 19:30 GMT
Last edited by Toolybird (Toolybird) - Saturday, 03 June 2023, 21:49 GMT
Task Type Bug Report
Category Packages
Status Closed
Assigned To David Runge (dvzrv)
Bruno Pagani (ArchangeGabriel)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

With unbound-1.12.0-1 and systemd-246.6-1 and the following in /etc/systemd/resolved.conf:

[Resolve]
DNS=127.0.0.1

There's a dependency cycle at startup: unbound waits for the network to be online (as stipulated in its service file) and systemd waits for the DNS resolver to be up before declaring that the network is online. The cycle only breaks when systemd's network initialization times out and the unbound service is finally allowed to start. As a result, the system cannot resolve names for roughly the first ninety seconds after boot. Not a huge deal but seriously annoying. Here's the relevant excerpt from the journal:

Nov 25 09:18:43 kujira systemd-networkd[339]: enp3s0: DHCPv4 address 192.168.8.100/24 via 192.168.8.1
Nov 25 09:18:43 kujira systemd-resolved[344]: Using degraded feature set UDP instead of UDP+EDNS0 for DNS server 127.0.0.1.
Nov 25 09:18:43 kujira systemd-resolved[344]: Using degraded feature set TCP instead of UDP for DNS server 127.0.0.1.
Nov 25 09:18:43 kujira systemd-resolved[344]: Using degraded feature set UDP instead of TCP for DNS server 127.0.0.1.
Nov 25 09:18:43 kujira systemd-resolved[344]: Using degraded feature set TCP instead of UDP for DNS server 127.0.0.1.

[..]

Nov 25 09:20:18 kujira systemd-resolved[344]: Using degraded feature set UDP instead of TCP for DNS server 127.0.0.1.
Nov 25 09:20:18 kujira systemd-resolved[344]: Using degraded feature set TCP instead of UDP for DNS server 127.0.0.1.
Nov 25 09:20:26 kujira systemd-networkd-wait-online[342]: Event loop failed: Connection timed out
Nov 25 09:20:26 kujira systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Nov 25 09:20:26 kujira systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Nov 25 09:20:26 kujira systemd[1]: Failed to start Wait for Network to be Configured.
Nov 25 09:20:26 kujira systemd[1]: Reached target Network is Online.
Nov 25 09:20:26 kujira systemd[1]: Starting Validating, recursive, and caching DNS resolver...
Nov 25 09:20:26 kujira unbound[827]: [827:0] notice: init module 0: subnet

And finally unbound starts!
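
To put a number on the delay in this particular boot, the gap between the DHCP lease and the unbound start can be computed from the excerpt above. This is only a minimal sketch: the helper names are hypothetical, and the year 2020 is an assumption taken from the report date, since the short journal format omits it.

```python
from datetime import datetime

# Two lines copied verbatim from the journal excerpt above.
JOURNAL = [
    "Nov 25 09:18:43 kujira systemd-networkd[339]: enp3s0: DHCPv4 address 192.168.8.100/24 via 192.168.8.1",
    "Nov 25 09:20:26 kujira systemd[1]: Starting Validating, recursive, and caching DNS resolver...",
]


def journal_time(line: str, year: int = 2020) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a short-format journal line.

    The year is not part of the stamp, so it is passed in (assumed: 2020).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def startup_gap_seconds(first: str, second: str) -> int:
    """Return the number of seconds elapsed between two journal lines."""
    return int((journal_time(second) - journal_time(first)).total_seconds())


print(startup_gap_seconds(*JOURNAL))  # 103
```

So in this boot unbound started 103 seconds after the interface obtained its address.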

Closed by  Toolybird (Toolybird)
Saturday, 03 June 2023, 21:49 GMT
Reason for closing:  No response
Comment by David Runge (dvzrv) - Wednesday, 25 November 2020, 19:56 GMT
@vesath: Hi and thanks for the report! :)

I think it has been introduced by this upstream change [1].
Would you be able to test with that reverted?

It might also be worthwhile creating a ticket upstream about this.
There must be a better common dependency definition than what was introduced here [2].

Any testing and suggestions for change very much welcome!

[1] https://github.com/NLnetLabs/unbound/commit/afbc7bb4fec5026f6a1a1487e643b94b2ba1d694#diff-4a3cdefa4a03a59d6a6c6dbf348e92660a0ff71336574e4988f81c4988d1fb0a
[2] https://github.com/NLnetLabs/unbound/issues/296
Comment by Gaetan Bisson (vesath) - Thursday, 26 November 2020, 18:23 GMT
Reverting the commit you point to does indeed fix the problem.

From my perspective, the issue this commit was trying to fix is misunderstood: the GitHub issue is entitled "systemd nss-lookup.target is reached before unbound can successfully answer queries #296". But surely there are plenty of cases where nss-lookup.target has been reached yet unbound is unable to answer queries, such as intermittent network problems up the line (for example, systemd still believes the network is online but there is 100% packet drop). So it seems to me that if you do not want nss-lookup.target to be reached while the system is unable to answer queries, you need to rely on some other mechanism and not simply schedule unbound to be started at a certain time (which breaks other things).
Comment by Toolybird (Toolybird) - Thursday, 04 May 2023, 07:57 GMT
Related [1]. Still happening with latest pkgs? Can we close this?

[1] https://github.com/NLnetLabs/unbound/commit/5c041c0ba9f48f25e73e7675c4c654b2815d483b
