FS#31250 - [openssh] shutdown of systemd install doesn't notify/close client connections

Attached to Project: Arch Linux
Opened by c (c) - Wednesday, 22 August 2012, 17:24 GMT
Last edited by Gaetan Bisson (vesath) - Sunday, 14 October 2012, 10:28 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Gaetan Bisson (vesath)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:
IIRC, either initscripts or openssh's init script has code to notify or close client connections on shutdown. Without that, if you shut down a machine from within an ssh session, the ssh client has to wait a long time before it aborts with a pipe error.

Additional info:
* 6.0p1-3



Steps to reproduce:
* ssh to systemd install
* sudo halt -p

Closed by Gaetan Bisson (vesath)
Sunday, 14 October 2012, 10:28 GMT
Reason for closing: Won't fix
Additional comments about closing: Nothing we can do until upstream comes up with something magic. Feel free to reopen then.
Comment by Gaetan Bisson (vesath) - Wednesday, 22 August 2012, 23:56 GMT
The initscripts solution is hackish, and I consider this an upstream problem (or a problem with your ServerAliveInterval configuration).
So unless you can suggest a clean, one-line solution, I am unlikely to implement anything.
Comment by c (c) - Thursday, 23 August 2012, 10:49 GMT
How do openSUSE and Fedora solve this? Do they set a ServerAliveInterval default in /etc/ssh/ssh_config?
I know that Fedora with systemd works the same as initscripts with the workaround. I agree that the existing initscripts solution is not the right way. We should look over there and see how they fixed it. I don't think they use ServerAliveInterval, as the disconnect is immediate rather than following a short timeout caused by a low ServerAliveInterval setting.
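
For reference, the client-side keepalive mentioned here could be configured roughly as in the sketch below. The values 15 and 3 are illustrative assumptions, not settings taken from any distribution, and the snippet writes to a scratch file rather than to ~/.ssh/config or /etc/ssh/ssh_config.

```shell
# Scratch file standing in for ~/.ssh/config or /etc/ssh/ssh_config.
conf="$(mktemp)"

# ServerAliveInterval: seconds of silence before the client probes the server.
# ServerAliveCountMax: unanswered probes tolerated before the client gives up.
# Both values below are illustrative assumptions.
cat > "$conf" <<'EOF'
Host *
    ServerAliveInterval 15
    ServerAliveCountMax 3
EOF

# Worst-case detection time is roughly interval * count.
interval="$(awk '$1 == "ServerAliveInterval" { print $2 }' "$conf")"
count="$(awk '$1 == "ServerAliveCountMax" { print $2 }' "$conf")"
echo "dead server detected after about $((interval * count)) seconds"
```

With such settings the client still only notices the dead server after a timeout, which is different from the immediate disconnect described in this comment.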
Comment by Gaetan Bisson (vesath) - Thursday, 23 August 2012, 12:41 GMT
I was hoping you'd tell me to do some basic research...
Comment by c (c) - Thursday, 23 August 2012, 12:47 GMT
Sorry, that wasn't my intention. I honestly wanted to ask before trying to get hold of the relevant info, as I have no Fedora or openSUSE system. I'll probably look at their spec files and see if there's anything interesting.
Comment by Gaetan Bisson (vesath) - Thursday, 23 August 2012, 12:49 GMT
The reason it "works" in Fedora is because they do not have "KillMode=process" in sshd.service; but we really need this: otherwise existing connections get killed when restarting SSHD (please check for yourself that it is indeed the case with Fedora).
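
A minimal sketch of the directive under discussion, written to a scratch file. Only the lines relevant here are shown; this is not the packaged sshd.service, and the sshd path is an assumption.

```shell
# Scratch file standing in for a unit such as /etc/systemd/system/sshd.service.
unit="$(mktemp)"

cat > "$unit" <<'EOF'
[Service]
# The sshd path is an assumption; adjust it to the local installation.
ExecStart=/usr/bin/sshd -D
# On stop or restart, signal only the main sshd process. The per-connection
# processes it forked are left running, so existing sessions survive a
# restart of the daemon.
KillMode=process
EOF

grep -n '^KillMode=' "$unit"
```

Without this line systemd falls back to KillMode=control-group, which signals every process remaining in the unit's control group when the unit is stopped.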
Comment by Dave Reisner (falconindy) - Thursday, 23 August 2012, 13:05 GMT
There are two sshd configurations: inetd-style activation and the single daemon process. Which one are we talking about here? I cannot reproduce any hang with the single daemon process.
Comment by c (c) - Thursday, 23 August 2012, 13:31 GMT
I used 'systemctl enable' and 'systemctl start' after that and let systemd start it automatically on each subsequent bootup.
Comment by Dave Reisner (falconindy) - Thursday, 23 August 2012, 13:43 GMT
Great, you've omitted the interesting part. What comes _after_ enable and start?
Comment by c (c) - Thursday, 23 August 2012, 13:50 GMT
I don't understand the question. I kept the existing config in /etc/ssh unmodified after switching to systemd.
Comment by Evangelos Foutras (foutrelis) - Thursday, 23 August 2012, 14:10 GMT
I believe Dave is asking for the full command you used; e.g. 'systemctl enable sshd.service' vs 'systemctl enable sshd.socket'.
Comment by Dave Reisner (falconindy) - Thursday, 23 August 2012, 14:12 GMT
"enable" and "start" expect 1 to many units to follow. Let's try making this multiple choice:

Did you run:

a) systemctl enable sshd.service; systemctl start sshd.service
b) systemctl enable sshd.socket; systemctl start sshd.socket
c) cowsay "moo"
Comment by c (c) - Thursday, 23 August 2012, 14:12 GMT
Oh now I see. 'systemctl enable sshd.service'.
Is sshd.socket the on-demand version akin to inetd style of activation? Does that spawn one sshd per connection? I suppose it's configurable in systemd's unit file support.
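
As a rough illustration of what inetd-style activation can look like in systemd, the sketch below writes a socket unit and a matching template service to a scratch directory. The contents are assumptions for illustration, not the units shipped in the openssh package.

```shell
# Scratch directory standing in for /etc/systemd/system.
dir="$(mktemp -d)"

cat > "$dir/sshd.socket" <<'EOF'
[Socket]
ListenStream=22
# Accept=yes makes systemd accept each connection itself and start one
# instance of the sshd@.service template per connection.
Accept=yes

[Install]
WantedBy=sockets.target
EOF

cat > "$dir/sshd@.service" <<'EOF'
[Service]
# -i runs sshd in inetd mode, serving the single connection passed on stdin.
# The sshd path is an assumption.
ExecStart=-/usr/bin/sshd -i
StandardInput=socket
EOF

ls "$dir"
```

In this arrangement no sshd process runs until a client connects, and each connection gets its own service instance.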
Comment by Andreas (Evilandi666) - Tuesday, 02 October 2012, 22:42 GMT
  • Field changed: Percent Complete (100% → 0%)
This isn't fixed, and it is an Arch problem (other distros don't have it). It should also be fixed before Arch can move to systemd. This bug does not occur with the old initscripts. See https://bbs.archlinux.org/viewtopic.php?pid=1166093
Comment by Tom Gundersen (tomegun) - Tuesday, 02 October 2012, 22:47 GMT
Reopening as I think I might have a solution (though possibly not ideal), but I cannot test atm:

Could you try adding After=network.target to systemd-user-sessions.service? This means that all user sessions are shut down before the network connection is broken. If I understand correctly, stopping sshd.service will not necessarily kill the open ssh connections, so adding a dependency there won't help.

Could someone confirm/refute this theory? Even if this works it is not ideal as it means boot will be slowed down by not allowing anyone to log in until the network daemon has been started and possibly a network connection set up. If it works I'll try to come up with a better compromise and send it upstream.
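
One way to try this suggestion is sketched below. It uses scratch directories in place of the real unit directories so it can be run without root, and the stand-in unit contents are illustrative, not the real systemd-user-sessions.service.

```shell
# Scratch directories stand in for the real locations:
# SRC_DIR for /usr/lib/systemd/system, DST_DIR for /etc/systemd/system.
work="$(mktemp -d)"
SRC_DIR="$work/usr-lib-systemd-system"
DST_DIR="$work/etc-systemd-system"
mkdir -p "$SRC_DIR" "$DST_DIR"
unit="systemd-user-sessions.service"

# Illustrative stand-in for the packaged unit, not its real contents.
cat > "$SRC_DIR/$unit" <<'EOF'
[Unit]
Description=Permit User Sessions
After=remote-fs.target
EOF

# Copy the unit to the override location, adding the ordering line right
# after the [Unit] header. Several After= lines in one unit are merged.
awk '{ print } /^\[Unit\]$/ { print "After=network.target" }' \
    "$SRC_DIR/$unit" > "$DST_DIR/$unit"

cat "$DST_DIR/$unit"
# On a real system, "systemctl daemon-reload" would then pick up the change.
```

A unit file in /etc/systemd/system takes precedence over one of the same name in /usr/lib/systemd/system, so the packaged file stays untouched.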
Comment by Andreas (Evilandi666) - Wednesday, 03 October 2012, 12:08 GMT
I copied systemd-user-sessions.service to /etc/systemd/system/ and added After=network.target. I tried several reboots, no hang! I was logged out automatically (like it was before systemd). Thanks for your (temporary) solution!

Edit: I tried it on another machine, does not work there. It still hangs instead of logging out when trying to reboot via ssh.
Edit2: On the first machine (mentioned before edit) it works without this solution now. WTF.

I think this means that adding After=network.target to systemd-user-sessions.service does not help. Sorry :(
Comment by Gaetan Bisson (vesath) - Wednesday, 03 October 2012, 12:31 GMT
Even if we can get the systemd service to implement "StopBefore=network.target", one might not always want sshd to stop when eth0 is brought down, as sshd can still be used over the loopback interface, or over other interfaces that were configured manually.

So I think it might just be one of those problems for which we cannot provide a satisfying solution out of the box, and which is up to system administrators to address.
Comment by Tom Gundersen (tomegun) - Wednesday, 03 October 2012, 12:36 GMT
Gaetan: the ordering relation is orthogonal to the requirement relation. Ordering the stopping of user sessions (including ssh ones) before shutting down the network does not imply that the user sessions are stopped whenever the network is stopped. It only means that the sessions are stopped first if both happen to be scheduled for stopping in the same transaction (e.g. at shutdown).

I think this should work just fine (but I might obviously be missing something).
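
The distinction can be sketched with two hypothetical units written to a scratch directory. The unit names are invented for illustration.

```shell
# Scratch directory; both unit names below are hypothetical.
dir="$(mktemp -d)"

# Ordering only: if this unit and network.target are stopped in the same
# transaction (for example at shutdown), this unit is stopped first.
# Nothing here causes it to stop just because the network goes away.
cat > "$dir/ordered-only.service" <<'EOF'
[Unit]
After=network.target
EOF

# Ordering plus requirement: if network.target were explicitly stopped,
# this unit would be stopped as well.
cat > "$dir/ordered-and-required.service" <<'EOF'
[Unit]
After=network.target
Requires=network.target
EOF

# Count the Requires= lines per file to show the only difference.
grep -c '^Requires=' "$dir"/*.service
```

The proposal in this thread corresponds to the first form, which is why it would not stop sshd sessions merely because an interface is brought down.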
Comment by Gaetan Bisson (vesath) - Wednesday, 03 October 2012, 12:52 GMT
Ah, thanks Tom! I am glad we switched to systemd. :)
Comment by Steven Noonan (neunon) - Sunday, 14 October 2012, 08:05 GMT
I can confirm that in my testing (multiple poweroffs), adding 'network.target' to 'After=' resolves the issue. Can we please get this added to the openssh package?
Comment by Gaetan Bisson (vesath) - Sunday, 14 October 2012, 09:11 GMT
Steven: No. It's a flaky solution, as Andreas reported (see Tom's first post for an explanation), and therefore not worth the inconvenience at bootup (see Tom's first post again).
Comment by Steven Noonan (neunon) - Sunday, 14 October 2012, 09:29 GMT
Er, I should clarify. I added it to 'After=' in sshd.service rather than systemd-user-sessions.service. It worked reliably.
Comment by Dave Reisner (falconindy) - Sunday, 14 October 2012, 09:50 GMT
No, this does _not_ work reliably. There's no way it could. ssh sessions opened by users are not children of the main sshd process. They aren't even in the same cgroup because of pam_systemd usage. Therefore, any ordering on sshd.service itself will never have any effect on the ordering of remote user sessions and this can (and does) still hang on shutdown.
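
The cgroup separation described here can be checked with something like the following sketch. The helper name systemd_cgroup and the pid file location /run/sshd.pid are assumptions; the helper reads a /proc/<pid>/cgroup-style file and handles both the legacy "name=systemd" hierarchy and the unified "0::" line.

```shell
# Print the systemd cgroup path recorded in a /proc/<pid>/cgroup-style file.
# (Hypothetical helper for illustration.)
systemd_cgroup() {
    awk -F: '$2 == "name=systemd" || ($1 == "0" && $2 == "") { print $3; exit }' "$1"
}

# cgroup of the main daemon, if the assumed pid file is present.
if [ -r /run/sshd.pid ] && [ -r "/proc/$(cat /run/sshd.pid)/cgroup" ]; then
    echo "sshd:    $(systemd_cgroup "/proc/$(cat /run/sshd.pid)/cgroup")"
fi

# cgroup of the current process; run this from inside an ssh session.
if [ -r /proc/self/cgroup ]; then
    echo "session: $(systemd_cgroup /proc/self/cgroup)"
fi
```

When pam_systemd is active, the two paths are expected to differ: the daemon sits in the cgroup of sshd.service, while the login session is placed in a per-user session cgroup.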
Comment by c (c) - Sunday, 14 October 2012, 09:57 GMT
Does anybody know how it's done in Fedora?
Comment by Steven Noonan (neunon) - Sunday, 14 October 2012, 10:02 GMT
Hmm, weird. I don't know why it would work for so many reboots (probability dictates at least one halt would yield a bad result). But what you're saying matches what I'm seeing in other bug reports (i.e. https://bugzilla.redhat.com/show_bug.cgi?id=626477).

Fedora 18's sshd.service doesn't seem to do anything particularly special to avoid the issue (though I see they also include 'network.target' in After, but probably not for the same reason I did): http://pkgs.fedoraproject.org/cgit/openssh.git/tree/sshd.service?h=f18
Comment by Steven Noonan (neunon) - Sunday, 14 October 2012, 10:06 GMT
Comment by Dave Reisner (falconindy) - Sunday, 14 October 2012, 10:18 GMT
Except that they do. password-auth pulls in pam_systemd:

http://pkgs.fedoraproject.org/cgit/pam.git/tree/password-auth.pamd
