FS#37007 - [systemd] 207 hangs a few minutes on shutdown/reboot, 204 doesn't

Attached to Project: Arch Linux
Opened by cju (cju) - Saturday, 21 September 2013, 07:42 GMT
Last edited by Dave Reisner (falconindy) - Tuesday, 05 November 2013, 18:39 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Dave Reisner (falconindy)
Tom Gundersen (tomegun)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 23
Private No

Details

Description:

Every time since upgrading to systemd 207-3, Arch hangs on shutdown/reboot for a few minutes on my systems. On one machine (a VBox), there's the following message displayed:

> A job is running for User Manager for 1002

On another, fairly fresh machine (also VBox), the screen just stays black for that amount of time, and the same happens with Arch on my laptop (real hardware).


Additional info:

This happens only with systemd 207, downgrading to 204 solves this problem entirely! Besides, it doesn't seem to matter which shutdown/reboot command I use; I tried them all and didn't observe any differences...


Steps to reproduce:

Upgrade to systemd 207-3 and try to shut down/reboot.

Closed by  Dave Reisner (falconindy)
Tuesday, 05 November 2013, 18:39 GMT
Reason for closing:  Upstream
Comment by Dave Reisner (falconindy) - Saturday, 21 September 2013, 10:35 GMT
Comment by cju (cju) - Saturday, 21 September 2013, 12:09 GMT
Thanks for the info.

So... all I can do now is stay with systemd 204 until this patch gets applied, right?
Comment by Dave Reisner (falconindy) - Saturday, 21 September 2013, 12:12 GMT
Or apply the patch yourself and rebuild systemd...
Comment by cju (cju) - Saturday, 21 September 2013, 12:35 GMT
Hm, ok...

I'll definitely try this, but since I'm not used to the whole patching thing: how long does it usually take until a patch like this lands in the systemd git repo, so that one could use the corresponding AUR package instead?
Comment by Dave Reisner (falconindy) - Saturday, 21 September 2013, 12:44 GMT
I'd strongly discourage you from running systemd-git if you're unfamiliar with systemd development, let alone patching packages. Patching the package in core means literally adding 3 lines -- the patch file in the source array, the checksum, and the actual patch command in the prepare function. It's trivial. Attached is a source package which should build for you.

I'm not so familiar with systemd's internals or else I'd review and merge the patch myself. The Linux Plumbers Conference ended yesterday and we're all heading home, so there may be a bit of a lag until someone more familiar looks at it.
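
As a rough illustration of the three additions described above (a patch file in the source array, its checksum, and the patch command in the prepare function), a PKGBUILD fragment might look like the sketch below. The patch file name is hypothetical, the use of md5sums is an assumption about which checksum array the package uses, and 'SKIP' is only a placeholder for the real checksum.

```shell
# Sketch of the three additions to the existing systemd PKGBUILD (bash syntax).
# "shutdown-hang-fix.patch" is a hypothetical file name, not the real patch name.

# 1. Add the patch file to the source array.
source+=('shutdown-hang-fix.patch')

# 2. Add a matching checksum entry. 'SKIP' is only a placeholder here;
#    the real value can be generated with `makepkg -g`.
md5sums+=('SKIP')

# 3. Apply the patch in prepare(), which makepkg runs before build().
prepare() {
  cd "$srcdir/$pkgname-$pkgver"
  patch -Np1 -i "$srcdir/shutdown-hang-fix.patch"
}
```

With the patch file placed next to the PKGBUILD, running `makepkg` in that directory would then build the patched package.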
Comment by cju (cju) - Saturday, 21 September 2013, 12:55 GMT
All right, thank you very much for your support, I'm going to try this out and report if this actually does the trick.
Comment by Simon Perry (pezz) - Saturday, 21 September 2013, 13:08 GMT
Built and tested Dave's source package; it fixed the issue for me.
Comment by cju (cju) - Saturday, 21 September 2013, 14:14 GMT
So after rebuilding the package, I tested several shutdown/reboot scenarios (on VBox) and it actually worked most of the time, but every now and then the error described above still occurs.

I tested both my own build and the one built from Dave's source package, with the same behaviour in either case. So for the moment, I "learned" at least something useful after all. ;)
Comment by Eric Wang (enihcam) - Sunday, 22 September 2013, 10:39 GMT
This message could be a hint to the root cause:

Sep 22 10:30:56 archbox systemd[1370]: Failed to open private bus connection: Failed to connect to socket /run/user/1000/dbus/user_bus_socket: No such file or directory
Comment by cju (cju) - Sunday, 22 September 2013, 10:43 GMT
Since I get the same one w/ 207, you're probably right.
Comment by ido (ik_5) - Sunday, 22 September 2013, 16:46 GMT
This also happens to me, and not only on reboot: any process I start hangs a lot. Only downgrading to 204 solved it.
I cannot find any log entries indicating an error, though.
Comment by kemad zhong (kemadz) - Monday, 23 September 2013, 06:47 GMT
14:50:01 arch systemd[1]: Removed slice system-netctl.slice.
14:51:31 arch systemd[1]: user@0.service stopping timed out. Killing.

I found the above messages on my box (VBox). Searching for `user@0.service` turns up FS#36266. If you stop the service manually via `systemctl stop user@0`, the reboot goes smoothly.

So I suspect that the upstream bug has not been fixed properly yet.
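
The manual workaround mentioned above could be wrapped in a small helper, sketched here under the assumption that it is run as root. The function name is hypothetical, and the default UID of 0 simply matches the `user@0.service` unit from the log.

```shell
# Workaround sketch: stop the user manager by hand, then reboot.
# The function name is hypothetical; run it as root.
stop_user_manager_then_reboot() {
  uid="${1:-0}"   # UID of the user manager instance; 0 matches the log above
  # Only reboot if the user manager was stopped successfully.
  systemctl stop "user@${uid}.service" && systemctl reboot
}
```

Calling `stop_user_manager_then_reboot 1000` would target the user manager for UID 1000 instead.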
Comment by Jonathan Rodent (Nezumisama) - Tuesday, 24 September 2013, 10:00 GMT
I have the same issue with systemd 207: when shutting down or rebooting, the console is first blank for about half a minute, afterwards the usual messages get printed and the box shuts down (or restarts).

I've built the last 204 from arch's svn and that version doesn't have this problem.

Also, this problem occurs for me only if I have any console spawned - when using a DM and no consoles, the shutdown sequence is normal.
Comment by cju (cju) - Tuesday, 24 September 2013, 17:38 GMT
That's an interesting point I hadn't noticed so far, since I normally don't use any DM. But apparently, you're right: when using KDM for example, the shutdown sequence seems pretty normal.
Comment by Tomasz Jędrzejewski (Zyx) - Tuesday, 24 September 2013, 19:04 GMT
For me, there is a problem with php-fpm.service only, and I've been redirected here. systemd does not notice that the process has exited cleanly and keeps trying to kill it for ~3 minutes before giving up. This happens both when I try to stop it manually and during shutdown, but only with this one service. Everything else seems to work well. Version: systemd 207-5, fresh installation.
Comment by cju (cju) - Tuesday, 24 September 2013, 19:07 GMT
This is probably another issue/bug or at least an additional symptom since I don't have any PHP-stuff on my machine...
Comment by Olivier Brunel (jjacky) - Tuesday, 24 September 2013, 19:44 GMT
About the messages not showing up on shutdown/reboot, I don't think this is related to this issue (Type=notify service "timing out" after exiting). I've had such an issue as well, which was linked to console messages being stopped on boot and not "re-enabled" on shutdown.

See http://lists.freedesktop.org/archives/systemd-devel/2013-September/013330.html for more.
Comment by Larry Johnson (keepitsimpleengineer) - Tuesday, 24 September 2013, 19:48 GMT
This does not occur on my i686 laptop.
On my x86_64 workstation, it crashes the system including the f9 debug console.
Comment by douteiful (douteiful) - Wednesday, 25 September 2013, 09:19 GMT
I also confirm this bug. Running x86_64.

When shutting down the system was stuck at stopping the Transmission daemon service, and trying to stop it with systemctl also made it hang.
I was blaming Transmission until I saw this ticket. I downgraded to 204 and it stopped happening.
Comment by DexterLB (DexterLB) - Wednesday, 25 September 2013, 11:57 GMT
Confirmed, in about 1/3 of the cases it hangs at a blank screen with a cursor.
Comment by Dave Reisner (falconindy) - Wednesday, 25 September 2013, 12:36 GMT
Please stop +1'ing this bug report if you have nothing new to add. I know it exists, it really doesn't need to be "confirmed" for the Nth time. The posted patch fixes this. If it doesn't, then you have an unrelated problem. Feel free to take technical discussions upstream because this has nothing to do with Arch packaging.

I will happily delete any further comments which are not additive in a meaningful way.
Comment by douteiful (douteiful) - Wednesday, 25 September 2013, 18:48 GMT
In all projects I've worked on, confirmations of a bug are a helpful factor in determining its severity. I see; I didn't know Arch Linux was the exception. You have enlightened me.
It is true that this should be taken upstream though.
Comment by Pierre Schmitz (Pierre) - Monday, 30 September 2013, 16:33 GMT
Do we know which commit introduced this regression? Or can we apply that mentioned patch? Having to wait minutes for a service restart or system reboot can be an issue.

For the time being I maintain a patched version at https://repo.pierre-schmitz.com/
Comment by Dave Reisner (falconindy) - Monday, 30 September 2013, 16:40 GMT
If it weren't for all the noise on this bug report, your eyes would probably have gone to the mailing list link in the first reply which explains when this broke.
Comment by Jonathan Liu (net147) - Tuesday, 01 October 2013, 11:52 GMT
Comment by Dave Reisner (falconindy) - Wednesday, 02 October 2013, 14:41 GMT
systemd-208 has the above commit, so please test whether 208 resolves the hang on shutdown.
Comment by Jonathan Rodent (Nezumisama) - Wednesday, 02 October 2013, 17:07 GMT
208-1 seems to fix the issue.
The message "Failed to open private bus connection: Failed to connect to socket /run/user/<uid>/dbus/user_bus_socket: No such file or directory" still appears, so this seems to be a separate issue (if it's an issue at all).
Comment by Dave Reisner (falconindy) - Wednesday, 02 October 2013, 17:11 GMT
The dbus connection failure is entirely unrelated.
Comment by Arokux (arokux) - Tuesday, 05 November 2013, 18:39 GMT
  • Field changed: Percent Complete (100% → 0%)
Dave, can you please stop denying requests to reopen this bug, or at least change its state to WONTFIX or UPSTREAM? What is the problem with that?
