FS#5560 - samba makes shutdown freeze

Attached to Project: Arch Linux
Opened by pajaro (pajaro) - Tuesday, 10 October 2006, 09:26 GMT
Last edited by Aaron Griffin (phrakture) - Wednesday, 07 November 2007, 21:35 GMT
Task Type Bug Report
Category System
Status Closed
Assigned To No-one
Architecture not specified
Severity Critical
Priority Normal
Reported Version 0.7.2 Gimmick
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

If you have samba shares mounted and the connection to the sharing computer gets broken, the system doesn't shut down. It freezes.

Since the shutdown doesn't finish, the filesystems are not unmounted cleanly.

Closed by  Aaron Griffin (phrakture)
Wednesday, 07 November 2007, 21:35 GMT
Reason for closing:  None
Additional comments about closing:  Closure requested by reporter: this is frozen and i am not in a network anymore
Comment by Jan de Groot (JGC) - Wednesday, 11 October 2006, 15:50 GMT
This isn't something we can fix; it is an issue on your side. You just have to make sure your connection doesn't die. It's the same for NFS: we can't fix dead mounted filesystems.
Comment by pajaro (pajaro) - Wednesday, 11 October 2006, 22:11 GMT
I see Windows shutting down even with dead connections... :S
Comment by Jan de Groot (JGC) - Thursday, 12 October 2006, 06:16 GMT
That's because Windows isn't doing much with that connection from kernel space; it's just a silly program using some connection driver/library that times out and gets killed when shutting down. Try to close a hanging Explorer window that is waiting on a network connection: Windows will kill it and take your whole Explorer with it. In this case it wouldn't be Explorer but the kernel.

If you don't want your system to hang on dead samba mounts, then simply make sure the mount doesn't go dead, or don't mount it at all and use programs that utilize libsmbclient, like most gnome applications that use gnomevfs do, or all KDE applications that utilize the SMB KIO module do.
Comment by pajaro (pajaro) - Thursday, 12 October 2006, 07:33 GMT
That's true: closing an Explorer window that hung because of a network connection is impossible, because it hangs, though after some minutes Explorer comes back.

Killing it makes the bar disappear because the bar and the browsing windows share the same process. If you enable multiprocess mode for Explorer, that won't happen again.

What I am thinking of doing is writing a script for /etc/rc.d/ that sets timeouts for smbfs/cifs/nfs shares (all network shares?) to be unmounted, and kills whatever needs to be killed. smbmount shouldn't compromise Arch Linux integrity ;)

Do you know if there is any way to find out which process is associated with a certain mount point/device?
Comment by Jan de Groot (JGC) - Thursday, 12 October 2006, 07:59 GMT
The problem is that any process that touches a dead mounted filesystem becomes a dead process in D status, so there's no detection. A different solution is to use umount -f on network shares, which forces an umount when possible.
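The D status mentioned here can at least be observed, even though the stuck process cannot be killed. Below is a minimal Python sketch, assuming a Linux /proc filesystem; the function names are hypothetical and not part of any Arch tool. It reads each /proc/<pid>/stat file and reports processes whose state field is D. It only shows that a process is blocked, not which mount it is blocked on.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the LAST ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_state(fh.read())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or the file is unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process can sit in D state briefly during normal disk I/O, so a single hit is not proof of a dead mount; a process that stays in the list across several runs is the more telling sign.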
Comment by pajaro (pajaro) - Thursday, 12 October 2006, 08:12 GMT
Thank you for the explanations.

Gonna see what I can do next week.

I'll keep you up to date.
Comment by Roman Kyrylych (Romashka) - Monday, 13 November 2006, 21:55 GMT
Status?
Comment by pajaro (pajaro) - Monday, 13 November 2006, 22:26 GMT
smbfs is a dead end about this.

Even if you kill the process and erase the entry in mtab you can't mount it again.

Now I use cifs, since smbfs has no future.

The only thing I miss from smbfs is that cifs keeps your user across the network, so even if I can mount my friend's Mac, I can't access his files unless I create a common group with the same uid.

The solution is FUSE: smbnetfs, for example.
Comment by Leon Roy (dogbait) - Friday, 17 November 2006, 23:21 GMT
I have this problem as well. Can't the system just continue to shutdown instead of hanging?
Comment by pajaro (pajaro) - Saturday, 18 November 2006, 14:24 GMT
Check how they do that in Mac OS X.

I searched many times for a configuration setting that does this: if a computer that you are connected to dies, samba closes the connection. I never found one.

I really don't see any logic in the fact that when a computer dies, your samba mount process hangs instead of closing. I have programmed servers and clients, and that doesn't make any sense.

I feel that if I went to check samba's source code I would see something like this:
// Microsoft's patch. This is required for samba protocol. Implemented on 1992.
if (connection_status==DEAD_CONNECTION) {
connection_status=ALIVE_CONNECTION;
}

People say that it is impossible, even with NFS. Why?

:P
Comment by héctor (hacosta) - Thursday, 23 November 2006, 07:39 GMT
shouldn't this be confirmed?
Comment by pajaro (pajaro) - Thursday, 23 November 2006, 08:05 GMT
To confirm why mounts freeze we need to know:
- how mounting takes care of filesystem integrity.
- how samba handles dead connections.

It would be good to have the list of files and functions that do that, so that coders have quick access to the problem.

I am going to create a post in the forum to start the investigation.
Comment by pajaro (pajaro) - Thursday, 23 November 2006, 08:12 GMT
Comment by pajaro (pajaro) - Wednesday, 24 January 2007, 17:11 GMT
I found the solution.

The default Arch setup doesn't take care of keep alive.

I did a successful test.

Here it goes:
___________________________
keep alive (G)
The value of the parameter (an integer) represents the number of seconds between 'keepalive' packets. If this parameter is zero, no keepalive packets will be sent. Keepalive packets, if sent, allow the server to tell whether a client is still present and responding.

Keepalives should, in general, not be needed if the socket being used has the SO_KEEPALIVE attribute set on it (see "socket options"). Basically you should only use this option if you strike difficulties.

Default: keep alive = 0

Example: keep alive = 60
________________________________
Comment by Jan de Groot (JGC) - Wednesday, 24 January 2007, 21:41 GMT
So what's the solution in this? It tells the server whether a client is still active or not. If your server shuts down, you can send as many keepalive packets as you want, but the server is dead and you still have the same problem.
Comment by pajaro (pajaro) - Wednesday, 24 January 2007, 21:57 GMT
Oops, keep alive is only for servers.
Well, anyway, now, after a long delay, I can unmount with "umount -fl /path/to/mount/point".
Comment by Dawid Wróbel (cromo) - Friday, 06 April 2007, 11:47 GMT
pajaro, it doesn't seem to work here the way it works for you:
516:cromo@kromka:~$ sudo umount -fl smb4k/MAGDA/nwo
usage: smbumount mountpoint
usage: smbumount mountpoint

By default umount calls the umount.smbfs helper, and it seems umount passes -fl to it, which the helper doesn't recognize. The only way to force umounting here is:
519:cromo@kromka:~$ sudo umount -fli smb4k/MAGDA/nwo
The -i switch tells umount not to call the helper, and it works this way. So basically I think we should find a way to check whether the share is dead and, if so, force umounting it.


Comment by Dawid Wróbel (cromo) - Friday, 06 April 2007, 12:03 GMT
Actually, it's enough to just use the umount -i. No need for -l and -f switches then.
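The flags discussed in this thread (-i to skip the helper, -f to force, -l for a lazy unmount) can be put together for every network share at once. The following is a rough Python sketch, assuming mount data in the /proc/mounts format; the function names, the set of filesystem types and the sample paths are illustrative assumptions, not something Arch ships.

```python
# Filesystem types treated as network shares (an assumption for this sketch).
NETWORK_FS_TYPES = {"smbfs", "cifs", "nfs", "nfs4"}


def network_mount_points(mounts_text, fs_types=NETWORK_FS_TYPES):
    """Return mount points of network filesystems, last mounted first."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        if fs_type in fs_types:
            # /proc/mounts writes a space inside a path as \040;
            # other escapes are not handled in this sketch.
            points.append(mount_point.replace("\\040", " "))
    # Reverse so that nested mounts are unmounted before their parents.
    return points[::-1]


def umount_command(mount_point, skip_helper=True, force=False, lazy=False):
    """Build the umount argument list using the flags from this thread."""
    cmd = ["umount"]
    if skip_helper:
        cmd.append("-i")  # do not call the umount.<type> helper
    if force:
        cmd.append("-f")  # force the unmount when possible
    if lazy:
        cmd.append("-l")  # lazy unmount: detach now, clean up later
    cmd.append(mount_point)
    return cmd


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for point in network_mount_points(fh.read()):
            print(" ".join(umount_command(point)))
```

The script only prints the commands; running them needs root, and whether -f or -l is also required next to -i probably depends on the share and the kernel in use.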
Comment by pajaro (pajaro) - Friday, 06 April 2007, 12:12 GMT
I have a daemon called netfs_helper; when it gets stopped, it sends a delayed unmount to force the umount of samba shares.

I will set it to umount with -i and check what happens.
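The source of netfs_helper is not shown here, so the following is only a guess at the general idea: start the unmount, give it a deadline, and carry on with the shutdown if it has not returned. It is a minimal Python sketch; run_with_deadline and the /mnt/share path are hypothetical names.

```python
import subprocess
import time


def run_with_deadline(cmd, seconds, poll_interval=0.1):
    """Run cmd; return its exit code, or None if it did not finish in time."""
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        code = proc.poll()
        if code is not None:
            return code  # finished before the deadline
        time.sleep(poll_interval)
    # Ask the child to die, but do not wait for it: a process blocked in
    # D state ignores signals until its kernel call returns, so waiting
    # here could hang the caller as well.
    proc.kill()
    return None


if __name__ == "__main__":
    # Hypothetical mount point; adjust to the share in question.
    result = run_with_deadline(["umount", "-i", "/mnt/share"], seconds=30)
    if result is None:
        print("umount did not return in time, continuing anyway")
```

Polling instead of a blocking wait is the point of the design: the caller never blocks on a child that may be stuck on the dead share.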
Comment by David Fuhr (dcf) - Saturday, 16 June 2007, 11:04 GMT
I had this problem with dead smbmounts as well, but my system _did_ shut down. It sat at "unmounting filesystems..." for some time, but after 1 or 2 minutes it continued normally...
Comment by Felix (thetrivialstuff) - Sunday, 02 September 2007, 18:28 GMT
This problem causes nasty side-effects when the mount is over an SSH tunnel and the router dies (somehow this caused X to deadlock -- probably because I had konqueror copying some files to the lost share when it happened).

I was able to SSH in, but certain other things (including the shutdown command and dmesg) caused deadlocks as well and eventually sshd stopped responding and I had to do a hard reset.

It seems that smbfs/cifs lacks any facility for "device gone" -- or maybe it's the kernel? I've had similar lockups when a hard drive dies. I think this is something that bears fixing; total system freezeups because a network share became unreachable is what Windows 98 did.
