10-25-2006 03:14 PM
I know that this issue has been discussed a lot, but I must admit that even after all the reading, I still don't really understand the issue, nor am I aware of any useful work-arounds.
The issue is this:
When I ssh to a remote server, and restart a program that runs on that server (in this case, the program is 'foglight', which is a system monitoring tool), then type exit at the prompt, my xterm window hangs. I can't even ctrl-z or ctrl-c out. I believe the issue is that some of the processes started on the remote host are attached to the terminal, eg pts/4, and ssh doesn't exit out properly till that is detached, or the processes it started are killed.
So for example, as a sys-admin, if I want to restart this program (foglight) on a remote server, I ssh to that server, run /sbin/init.d/foglight stop ; /sbin/init.d/foglight start
and the terminal window (xterm) will hang requiring me to kill the window.
Now, while the window is in a hung state, I can open another window and ssh to that server. When I run ps -ef there, I can see that some foglight processes are still attached to the terminal (e.g. pts/4) of the original window. So it appears to be these processes that are stopping the ssh session from exiting completely.
FYI the sbin/init.d/foglight script executes an
su - quest -c "/foglight/startfgl"
BTW this sometimes happens with other commands also.
So... How can I stop the ssh session hanging? Is there some option? Can I modify the way the command is executed to force it to detach from the tty? Is there a workaround? Do other people have this same problem? How do they work-around it?
Any comments/questions welcome.
10-25-2006 05:01 PM
Running the process in the background with nohup doesn't fix the problem. The problem isn't so much that the process is a child of the remote shell; I believe it has to do with the daemon not properly detaching from the tty, which ssh doesn't like.
NB I notice that remsh doesn't exhibit this behaviour...it exits like it is supposed to after starting the daemon and then typing exit, but ssh behaves differently. Ssh will hang.
Has anyone else experienced something similar, where they have ssh'd to a remote server, run some commands and then exited, only to find that the window hangs and the only thing left to do is terminate the window? Are there any fixes for this problem?
10-25-2006 06:47 PM
One way to fix this would be to do something like this (all on one line):
su - quest -c "/foglight/startfgl" </dev/null >/dev/null 2>&1
As the program is going to be running as a daemon, you obviously don't intend to give it any input from your terminal. However, you might want to create a file for normal and/or error output instead of redirecting the output to /dev/null, so you don't lose any error messages the program might display.
To direct all the messages to a single log file (foglight.out) the command line would be:
... "/foglight/startfgl" </dev/null >foglight.out 2>&1
To separate the error output:
... "/foglight/startfgl" </dev/null >foglight.out 2>foglight.err
If /foglight/startfgl is a script, it probably runs something in the background using the "&" sign: a better way might be to add the I/O redirections (as above) to that line of the script, so you don't need to type those manually each time.
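A rough local illustration of why the redirections help, assuming the hang comes from the daemon keeping the session's output stream open. Here a plain pipe into cat stands in for the ssh channel, and fake_daemon is a hypothetical placeholder for the real program; neither is part of foglight or ssh.

```shell
#!/bin/sh
# fake_daemon is a hypothetical stand-in for a long-running program.
fake_daemon() { sleep 3; }

# Case 1: the background job inherits the pipe as its stdout, so `cat`
# (standing in for the ssh channel) gets no EOF until the job ends.
start=$(date +%s)
( fake_daemon & ) | cat
inherited_secs=$(( $(date +%s) - start ))

# Case 2: all three streams point at /dev/null, so nothing holds the pipe
# open once the launching subshell exits, and `cat` can return right away.
start=$(date +%s)
( fake_daemon </dev/null >/dev/null 2>&1 & ) | cat
detached_secs=$(( $(date +%s) - start ))

echo "inherited: ${inherited_secs}s  detached: ${detached_secs}s"
```

The first figure should come out at roughly the daemon's run time and the second close to zero, which mirrors the difference between a hanging and a cleanly exiting session.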
The process of separating from the controlling terminal is known as "daemonizing". It is a bit tricky to do in a shell script, and many startup scripts do an incomplete job.
Sometimes the programmer does not even intend a complete separation, e.g. if the startup script is designed to be run as a part of normal system startup. If the program later has problems, it can output the error messages just like a non-daemon program: the messages will show up on the system console.
This may be acceptable if someone actually monitors the console, but in my opinion it indicates the programmer is simply too lazy to design proper logging. It's not very hard to use syslog: even a shell script can do it with the "logger" command.
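As a minimal sketch of that idea, a start script could route its messages through logger instead of the terminal. The tag foglight, the priority daemon.notice and the helper name log_event are illustrative choices only, not anything the foglight scripts are known to use.

```shell
#!/bin/sh
# log_event is a hypothetical helper: it hands one message to syslog via
# logger(1), tagged so the entries are easy to find later.
log_event() {
    logger -t foglight -p daemon.notice "$*"
}

# Example call (commented out so that sourcing this file logs nothing):
# log_event "start requested"
```

Where the messages end up depends on the local syslog configuration for the chosen facility and priority.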
10-25-2006 11:40 PM
As a makeshift solution, maybe you could start foglight as a batch job?
# echo /sbin/init.d/foglight start|batch
If your tty hangs, are you able to send ssh escapes?
For instance you could send the session in the background with a ~&, or disconnect with a ~.
Issue a ~? to list the available ssh escapes.
If you know that your daemonizing application won't cleanly close all unused file handles,
you could also start your ssh session with the -n option.
How about creating a specialized RSA key for nothing else but starting your foglight?
You merely need to prepend the command= option to the public key file just in front of the key itself, before appending it to authorized_keys on remote host.
$ ssh-keygen -t rsa -b 1024 -N "" -f ~/.ssh/id_rsa_foglighter
and at the start insert something like
Then distribute it like
$ ssh remuser@remhost 'cat >>.ssh/authorized_keys' < ~/.ssh/id_rsa_foglighter.pub
Next time you'd do a
$ ssh -l remuser -i ~/.ssh/id_rsa_foglighter remhost
and the thing should get started.
If this is too much typing every time,
either set an alias or, better, edit ~/.ssh/config and insert a new Host entry with the IdentityFile directive for that remhost.
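The two pieces described above (a command= prefix on the public key, and a Host entry in ~/.ssh/config) could be generated roughly like this. The helper names forced_command_entry and host_stanza, the alias foglighter, and the choice of /sbin/init.d/foglight start as the forced command are assumptions for illustration; adjust them to the real setup.

```shell
#!/bin/sh
# forced_command_entry PUBKEY_FILE COMMAND
# Prints an authorized_keys line that runs COMMAND whenever this key logs in.
# (Hypothetical helper; COMMAND must not contain double quotes.)
forced_command_entry() {
    printf 'command="%s" %s\n' "$2" "$(cat "$1")"
}

# host_stanza ALIAS HOST USER KEY_FILE
# Prints a ~/.ssh/config block so that `ssh ALIAS` picks the dedicated key.
host_stanza() {
    printf 'Host %s\n    HostName %s\n    User %s\n    IdentityFile %s\n' \
        "$1" "$2" "$3" "$4"
}

# Example use (not run here):
# forced_command_entry ~/.ssh/id_rsa_foglighter.pub '/sbin/init.d/foglight start' |
#     ssh remuser@remhost 'cat >>.ssh/authorized_keys'
# host_stanza foglighter remhost remuser '~/.ssh/id_rsa_foglighter' >> ~/.ssh/config
```

With both in place, ssh foglighter would log in with the dedicated key, and the remote side would run only the forced command.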
10-26-2006 01:07 PM
Thanks for those suggestions. I tried the ssh-keygen thing, and the batch command thing, and the redirecting all 3 inputs/outputs thing, and they all worked.
ssh -n didn't work, because we need to redirect all three file handles (stdin, stdout and stderr).
I think I'll stick with the option of redirecting input/output to /dev/null, as it seems the most straightforward. It's a pity we lose the output though.
So something like the following is what I'll use:
ssh remhost '/sbin/init.d/foglight stop ; echo finished foglight stop ; /sbin/init.d/foglight start < /dev/null > /tmp/foglight_start.log 2>&1 ; echo finished start ; cat /tmp/foglight_start.log ; rm /tmp/foglight_start.log'
This still gives me some output from the command, and even though I've deleted the file, I suspect that it will still exist on the filesystem until the next time I shut down foglight. Would that be right?
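On that last point: on Unix-style filesystems rm removes only the directory entry, and the space is normally released once the last process holding the file open closes it, so the suspicion looks right. A small local sketch of the effect, using a temporary file and file descriptor 3 as stand-ins for the log file and the daemon's open handle:

```shell
#!/bin/sh
log=$(mktemp)                    # stand-in for /tmp/foglight_start.log
echo "daemon output" > "$log"

exec 3< "$log"                   # hold the file open, as a running daemon would
rm "$log"                        # removes the name only

if [ -e "$log" ]; then name_state=present; else name_state=gone; fi
read -r still_readable <&3       # the data is still reachable through fd 3
exec 3<&-                        # closing the last descriptor frees the space

echo "name: $name_state, contents: $still_readable"
```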
By the way, is there any chance that the developers of ssh will change this so that it behaves like remsh, where processes still attached to the terminal do not cause the session to hang?
Anyway, thanks again for your help. Appreciated.