FS#56812 - [systemd] does not start enabled units for no reason
Attached to Project:
Arch Linux
Opened by Noel Kuntze (thermi) - Thursday, 21 December 2017, 21:18 GMT
Last edited by Dave Reisner (falconindy) - Tuesday, 02 January 2018, 16:21 GMT
Details
Description:
systemd does not start units, despite them being enabled and their dependencies already having started successfully. The target that pulls the unit in (via WantedBy=) is the target systemd is configured to boot into. I have reproduced this problem by disabling and re-enabling a unit. It also seems to happen randomly to services, e.g. named.service (belongs to bind9). This happens on at least two distinct machines (x86_64). The unit starts normally and works if it is started manually with `systemctl start <unit>`.

Additional info:
* package version(s): systemd 235
* config and/or log files etc.

Steps to reproduce:
1. Enable a unit / have a unit enabled
2. Verify it is actually pulled in by a target
3. Reboot
4. Check the unit's status
This task depends upon
Closed by Dave Reisner (falconindy)
Tuesday, 02 January 2018, 16:21 GMT
Reason for closing: Not a bug
Additional comments about closing: User configuration error
Units I have encountered this with so far:
lightdm.service
named.service
strongswan-swanctl.service
The units are only running because I started them manually.
As you can see, they're enabled and hence should start automatically.
multi-user.target is pulled in.
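What "enabled" means on disk can be sketched without systemd: `systemctl enable` creates a symlink in the target's `.wants` directory, and that symlink is what pulls the unit into the boot transaction. A minimal simulation in a throwaway directory (paths and unit name are just the ones from this report):

```shell
#!/bin/sh
# Simulated in a temp directory so it runs without systemd: enabling a
# unit with WantedBy=multi-user.target creates a symlink in the
# target's .wants directory.
root=$(mktemp -d)
mkdir -p "$root/etc/systemd/system/multi-user.target.wants"
# What `systemctl enable named.service` effectively does:
ln -s /usr/lib/systemd/system/named.service \
    "$root/etc/systemd/system/multi-user.target.wants/named.service"
# Verifying that the unit is pulled in then amounts to checking the link:
ls -l "$root/etc/systemd/system/multi-user.target.wants/"
```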
```
systemctl status -n0 named
● named.service - Internet domain name server
Loaded: loaded (/usr/lib/systemd/system/named.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/named.service.d
└─override.conf
Active: active (running) since Thu 2017-12-21 22:11:44 CET; 1h 32min ago
Main PID: 5300 (named)
Tasks: 5 (limit: 4915)
Memory: 33.3M
CGroup: /system.slice/named.service
└─5300 /usr/bin/named -f -u named
systemctl cat named
# /usr/lib/systemd/system/named.service
[Unit]
Description=Internet domain name server
After=network.target
[Service]
ExecStart=/usr/bin/named -f -u named
ExecReload=/usr/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/named.service.d/override.conf
[Unit]
Before=dnsmasq.service
```
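For reference, a minimal simulation (plain files, no systemd needed) of how the drop-in above gets merged: systemd reads the base unit first, then every `*.conf` under the `.d` directory, so `override.conf` only adds `Before=dnsmasq.service` to the effective `[Unit]` section without touching the packaged file:

```shell
#!/bin/sh
# Simulated merge of a base unit plus a drop-in, using plain files.
d=$(mktemp -d)
mkdir -p "$d/named.service.d"
printf '[Unit]\nAfter=network.target\n' > "$d/named.service"
printf '[Unit]\nBefore=dnsmasq.service\n' > "$d/named.service.d/override.conf"
# Rough equivalent of what `systemctl cat named` displays: the base
# file followed by every drop-in, in order.
cat "$d/named.service" "$d/named.service.d"/*.conf
```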
```
systemctl cat lightdm
# /usr/lib/systemd/system/lightdm.service
[Unit]
Description=Light Display Manager
Documentation=man:lightdm(1)
Conflicts=getty@tty1.service
After=getty@tty1.service systemd-user-sessions.service plymouth-quit.service acpid.service
[Service]
ExecStart=/usr/bin/lightdm
Restart=always
IgnoreSIGPIPE=no
BusName=org.freedesktop.DisplayManager
[Install]
Alias=display-manager.service
systemctl status lightdm
● lightdm.service - Light Display Manager
Loaded: loaded (/usr/lib/systemd/system/lightdm.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2017-12-21 17:25:59 CET; 6h ago
Docs: man:lightdm(1)
Main PID: 838
Tasks: 13 (limit: 4915)
CGroup: /system.slice/lightdm.service
├─838 /usr/bin/lightdm
└─953 /usr/lib/xorg-server/Xorg :7 -seat seat0 -auth /run/lightdm/root/:7 -nolisten tcp vt7 -novtswitch
Dez 21 17:25:57 thermi-pc.thermicorp.lan systemd[1]: Starting Light Display Manager...
Dez 21 17:25:59 thermi-pc.thermicorp.lan systemd[1]: Started Light Display Manager.
Dez 21 17:26:14 thermi-pc.thermicorp.lan lightdm[838]: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
Dez 21 17:26:14 thermi-pc.thermicorp.lan lightdm[3729]: pam_unix(lightdm-greeter:session): session opened for user lightdm by (uid=0)
Dez 21 17:26:46 thermi-pc.thermicorp.lan lightdm[838]: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
Dez 21 17:26:46 thermi-pc.thermicorp.lan lightdm[5949]: pam_unix(lightdm:session): session opened for user thermi by (uid=0)
```
```
┌─[root][thermi][/home/thermi]
└──╼ systemctl status strongswan-swanctl -n0
● strongswan-swanctl.service - strongSwan IPsec IKEv1/IKEv2 daemon using swanctl
Loaded: loaded (/etc/systemd/system/strongswan-swanctl.service; enabled; vendor preset: disabled)
Active: inactive (dead)
┌─[root][thermi][/home/thermi]
└──╼ systemctl cat strongswan-swanctl -n0
# /etc/systemd/system/strongswan-swanctl.service
[Unit]
Description=strongSwan IPsec IKEv1/IKEv2 daemon using swanctl
Requires=network.target
After=network.target
[Service]
Type=notify
ExecStartPre=-/usr/bin/ip xfrm policy flush
ExecStartPre=-/usr/bin/ip xfrm state flush
ExecStartPre=-/usr/bin/ip route flush table 220
ExecStart=/usr/bin/charon-systemd
ExecStartPost=-/usr/bin/swanctl --load-all --noprompt
ExecReload=-/usr/bin/swanctl --reload
Restart=on-failure
PermissionsStartOnly=true
#User=strongswan
#Group=strongswan
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_RAW CAP_NET_BIND_SERVICE
NoNewPrivileges=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX AF_NETLINK
MemoryDenyWriteExecute=true
ProtectSystem=full
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
RestrictRealtime=true
ReadWritePaths=/var/log/charon.log
[Install]
WantedBy=multi-user.target
┌─[root][thermi][/home/thermi]
└──╼ systemctl status multi-user.target
● multi-user.target - Multi-User System
Loaded: loaded (/usr/lib/systemd/system/multi-user.target; indirect; vendor preset: disabled)
Active: active since Fri 2017-12-08 10:46:44 CET; 1 weeks 6 days ago
Docs: man:systemd.special(7)
Dez 08 10:46:44 thermi.consulting systemd[1]: Reached target Multi-User System.
```
* any error regarding the units in the logs
* nor any indication systemd ever tried starting them
The units' status is "dead"; systemd itself says it never tried starting them.
The units run fine, though, if they are started manually, *which is what I did and also wrote in the details*.
> The units' status is "dead". systemd itself says it never tried starting it.
You can't have it both ways...
Booting with debug level logging will produce useful logs.
systemctl says exactly that for units that it never started. They are dead.
systemd detected a dependency cycle and deleted a start job. That somehow caused it to also delete the start jobs of units that are themselves enabled. It only logs this if debug logging is enabled. The complete log is 13016 lines long.
```
Jan 02 14:35:33 thermi.consulting systemd[1]: network.target: Found ordering cycle on iptables.service/start
Jan 02 14:35:33 thermi.consulting systemd[1]: network.target: Found dependency on dns_ipset.service/start
Jan 02 14:35:33 thermi.consulting systemd[1]: network.target: Found dependency on network.target/start
Jan 02 14:35:33 thermi.consulting systemd[1]: network.target: Job iptables.service/start deleted to break ordering cycle starting with network.target/start
Jan 02 14:35:33 thermi.consulting systemd[1]: smbd.service: Deleting job smbd.service/start as dependency of job iptables.service/start
Jan 02 14:35:33 thermi.consulting systemd[1]: network.target: Deleting job network.target/start as dependency of job iptables.service/start
Jan 02 14:35:33 thermi.consulting systemd[1]: znc.service: Deleting job znc.service/start as dependency of job network.target/start
Jan 02 14:35:33 thermi.consulting systemd[1]: privrouteserver.service: Deleting job privrouteserver.service/start as dependency of job network.target/start
Jan 02 14:35:33 thermi.consulting systemd[1]: strongswan-swanctl.service: Deleting job strongswan-swanctl.service/start as dependency of job network.target/start
```
The dependency cycle is nearly always shown (not just in debug logging) since it's done at the warning level:
https://github.com/systemd/systemd/blob/master/src/core/transaction.c#L411-L416
I suppose because there's an ordering cycle, there's some amount of undefined behavior and you see random units not included in the initial transaction. The fix is clear -- you need to resolve your ordering cycle, and I'd suggest looking at both dns_ipset.service and privrouteserver.service as the first suspects.
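The cycle from the log can be sketched with `tsort(1)` from coreutils, which refuses to produce an ordering when its input contains a loop (edge direction simplified; unit names taken from the log above):

```shell
#!/bin/sh
# Each "X Y" pair means X must be ordered before Y; the three edges
# reproduce the reported cycle between network.target,
# iptables.service and dns_ipset.service.
edges='network.target iptables.service
iptables.service dns_ipset.service
dns_ipset.service network.target'
# tsort reports a loop when no valid ordering exists, which is the
# situation that forces systemd to delete a start job.
echo "$edges" | tsort 2>&1 | grep -Ei 'loop|cycle' && echo 'ordering cycle detected'
```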
```
systemctl status rsyslog
● rsyslog.service - System Logging Service
   Loaded: loaded (/etc/systemd/system/rsyslog.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
```
I fixed the dependency cycles now. The problems were not with those units. The problems were with the ordering in iptables.service and some other units.
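For illustration, one way such a cycle can be broken (a hypothetical drop-in, not the exact change made here; the thread only says the ordering in iptables.service and some other units was fixed) is to reset the offending `After=` and order the firewall before `network.target` instead:

```ini
# /etc/systemd/system/iptables.service.d/ordering.conf
# Hypothetical drop-in, not the actual fix from this thread.
[Unit]
# An empty assignment resets any After= collected so far, removing the
# After=network.target edge that closed the cycle.
After=
# Load firewall rules before the network is considered up instead:
Before=network.target
```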