07-15-2011 12:52 PM
I've got the infamous login hang that requires ctrl-c to finally get in.
I moved the user's .profile to .profile.no and /etc/profile to /etc/profile.no.
It is just the one newly built 11.11 box with patch bundles.
As root I can su user without a problem. But when I su - user, which sources the environment, it hangs and I have
to hit ctrl-c two or three times and wait a while to get a login prompt.
I've dealt with problems like this before involving name resolution, PATH and NFS. But I can't find anything
that would be causing this.
07-15-2011 01:01 PM
>I've got the infamous login hang requiring ctrl-c to finally get in.
Any infamous login hang that I've seen usually can't be interrupted with ctrl-C.
You need to add "set -x" to your .profile to see what the last lines are.
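The same tracing can be tried outside a login. This is only a minimal sketch, assuming the profile can be run on its own by a POSIX shell; `trace_script` and the log file name are hypothetical, not something from this thread.

```shell
# trace_script FILE LOG
# Runs FILE under "sh -x", discards its normal output, stores the
# execution trace (written to stderr) in LOG, then prints the last
# traced command. If FILE hangs, read LOG from another window instead.
trace_script() {
    sh -x "$1" >/dev/null 2>"$2"
    tail -n 1 "$2"
}
```

For example, `trace_script /etc/profile /tmp/profile.trace` would leave the full trace in /tmp/profile.trace. Anything the profile itself writes to stderr ends up in the same file.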
07-15-2011 06:48 PM
Did you try ssh in debug mode?
#ssh abc.om -v
We can get some hints from there.
07-17-2011 01:00 PM
I did the set -x and it goes all the way through to the end of /etc/profile. I even moved my .profile and /etc/profile aside so they would not get sourced at all. I tried running ssh host -v, but that works fine. It's just that I do not get a login prompt unless I hit ctrl-c several times. The last thing I did on the server was a patch commit to clear out the save dir. Does doing a patch commit require a reboot??
07-17-2011 01:41 PM
>I even moved my .profile and the /etc/profile so they would not get sourced at all.
And it still hung? Where is your $HISTFILE pointing to?
>I do not get a login prompt unless I do a ctrl-c several times.
Do the following (on another window) to see which process is hanging:
UNIX95=EXTENDED_PS ps -H -fu your-hung-username
>Does doing a patch commit require a reboot?
No, it just fiddles with the IPD.
07-17-2011 02:09 PM
I do know that as root, if I su user, it does not hang. If as root I su - user it hangs, so it's sourcing something.
Could you explain UNIX95? I vaguely remember something about that. How are you saying to do it in another window??
07-17-2011 02:22 PM - edited 10-19-2013 08:11 PM
>If as root I su - user it hangs. - its sourcing something.
Right, you said that. What shell are you using? (I've been assuming sh/ksh.)
>Could you explain UNIX95. I vaguely remember something about that.
Nothing to explain, just do it. ;-)
It enables the -H option, for a hierarchical listing of processes.
>How are you saying to do it on another window?
One window is hung, do it in another, it doesn't matter if you are root or any other user.
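For background: writing `UNIX95=EXTENDED_PS` in front of `ps` puts the variable into the environment of that one command only. On HP-UX its presence switches `ps` to XPG4 behavior, which is what accepts `-H`. The scoping part can be checked in any POSIX shell. A small sketch (`show_scope` is a made-up helper name):

```shell
# show_scope: demonstrate that a VAR=value prefix only affects the
# environment of the single external command it precedes.
show_scope() {
    unset UNIX95
    inside=$(UNIX95=EXTENDED_PS sh -c 'echo "${UNIX95:-unset}"')
    after=${UNIX95:-unset}
    echo "inside=$inside after=$after"
}
```

Calling `show_scope` prints `inside=EXTENDED_PS after=unset`, so the setting does not leak into the rest of the session.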
07-17-2011 03:00 PM - edited 07-17-2011 03:02 PM
I am currently hung in one window as user moorej. I'm running this in another window:
# UNIX95=EXTENDED_PS ps -H -fu moorej
UID PID PPID C STIME TTY TIME CMD
moorej 19309 19307 0 16:54:24 ? 00:00 sshd: moorej@pts/0
moorej 19312 19309 0 16:54:24 pts/0 00:00 -ksh
What's interesting is that I su - phamn who is running csh and that session comes right up. What gives with ksh??
07-17-2011 03:07 PM - edited 07-30-2011 07:09 PM
>What's interesting is that I su - phamn who is running csh and that session comes right up.
(When you use the scummy C shell, it's not tricky, so it can't fail like this. ;-)
I asked above, what is $HISTFILE? For old OS versions, if it was over NFS, you could get hangs.
Do you have all your NFS, automounter, etc patches?
But if you moved aside ~/.profile , then HISTFILE wouldn't be set.
07-17-2011 03:33 PM
Correct, no HISTFILE is set.
This server is a new build with all patches. Login was working before. It seems this problem started after the server was rebooted.
I am out of tricks and possible things to check. There is no truss on this server or I would try to use that. I will have to look into loading it.
07-17-2011 05:15 PM
You could try setting HISTFILE to a file on a local filesystem.
07-17-2011 06:52 PM - edited 07-17-2011 06:53 PM
I'm running truss on the sshd process, following forks, but I don't really see anything. I can see my ctrl-c and returns, but it then just sleeps. I am using PuTTY, but the problem does not occur on other 11.11, 11.23 or 11.31 boxes. I just don't get a prompt.
07-17-2011 07:07 PM - edited 07-17-2011 07:14 PM
Holy crap! It is the .sh_history file on the NFS mount. A default histfile does get created even though I am not specifying one. I saw it in the truss output. I changed it to be on a local filesystem, as you suggested, and now it works. WHY!!!!
I still don't know why it only does this on this one box??
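The workaround amounts to pointing HISTFILE at a local filesystem before the shell first touches its history. A minimal sketch for a user's .profile, assuming /var/tmp is local and writable on the box (the thread does not say which directory was actually used); `local_histfile` is a hypothetical helper:

```shell
# local_histfile DIR USER: build a per-user history file name under DIR.
local_histfile() {
    echo "$1/.sh_history.$2"
}

# Assumption: /var/tmp is a local (non-NFS) filesystem on this host.
HISTFILE=$(local_histfile /var/tmp "${LOGNAME:-$(id -un)}")
export HISTFILE
```

The trade-off is that history becomes per host instead of following the NFS home directory.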
07-18-2011 12:57 AM
And read 'man profile' - "If the file /etc/profile exists, it is executed by the shell for every user who logs in."
In a cold installation there will of course exist a default /etc/profile.
07-18-2011 07:42 AM
Please check the DNS server entries in /etc/resolv.conf and confirm whether they are working properly. Otherwise disable DNS in the sshd_config file. Also check the size of /var/adm/wtmp; it may be very big.
07-18-2011 08:16 AM - edited 07-18-2011 08:18 AM
Nyga, read farther back. I already ruled out /etc/profile. The profiles are the same as on our other servers.
Hakki, it is not a stale NFS mount. Other users could log in using the same NFS home mount. The automounter configs
are the same as on other servers.
Arunabha, I have seen problems with DNS in the past. But resolv.conf is the same as on other servers here.
Also, I had two servers with slow logins (not a permanent hang) that had wtmp files of 1-2 GB. But that is not the case here.
This is something with writing the history file to the home dir nfs mount with this server. Other 11.11 server
07-18-2011 09:42 AM
The 2 common problems are:
1) DNS not responding while nsswitch.conf is configured to use it.
2) NFS, when the account's home directory or the mail directory is NFS mounted.
To tusc/truss rlogind you need to either configure it in inetd.conf or just tusc
inetd with options -p -f -E -o resfile.out pidofinetd,
then try an rlogin and post the output.
07-18-2011 10:14 AM
>A default histfile does get created even though I am not specifying one.
Are you sure? I thought the default was some temp file or just memory? What was the name?
> I changed it to be on a local filesystem, as you suggested, and now it works. WHY!!!!
>I still don't know why it only does this on this one box?
You don't have the right patches on it? Or there was a networking glitch?
Do you have the RPC lock demons working correctly on the client and the server?
(Perhaps your control-C works because you have your NFS mounted with INTR?
I've had this problem over and over for more than a decade because I need a shared history file.)
>Hakki, it is not a stale nfs. Other users could login using same home nfs mount.
>This is something with writing the history file to the home dir nfs mount with this server.
Right, it is the RPC lock demons. Are these other users using ksh? Are they on the same machine?
You may need the NFS guru Dave to help you.
07-18-2011 10:52 AM
The .sh_history file gets created by default.
It's the patch bundles in their ignite build that they have been using. The other box has the same.
The export is /nfshomes with no options. Home directories come from the auto.direct file with the entry /nfshomes unity:/nfshomes,
pretty straightforward.
RPC lock daemons should be doing whatever the default is.
All users have the same problem unless the histfile is redirected to a local filesystem.
Who is Dave?
07-18-2011 11:10 AM - edited 10-19-2013 08:11 PM
>The .sh_history file gets created by default.
Hmm, I wasn't aware there was a file based default.
>RPC lock daemons should be doing whatever the default is.
The problem is they get tired and stop working correctly and hang. ;-)
>users have the same problem unless the histfile is redirected to local filesystem.
As expected if the RPC lock demons aren't working.
>Who is Dave?
07-18-2011 02:24 PM
I've had to restart them in the past before. But I restarted them on this box already.
(nfs.server and nfs.client stop/start.)
But I did not restart them on the NFS server. It would cause an outage, I'm sure.
07-19-2011 04:24 AM
>I've had to restart them in the past before.
That's using a hammer, you need a scalpel:
1) Kill rpc.statd and rpc.lockd ON BOTH SYSTEMS.
$ ps -ef | grep rpc.lockd
$ kill <rpc.lockd pid>
$ ps -ef | grep rpc.statd
$ kill <rpc.statd pid>
2) Remove all entries in /var/statmon/sm and /var/statmon/sm.bak ON BOTH SYSTEMS.
$ rm -r /var/statmon/sm /var/statmon/sm.bak
3) Restart rpc.statd (first), then rpc.lockd, on both systems:
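The three steps can be sketched as a script to run as root on both the NFS client and the server. Everything here is an assumption layered on the steps above: the helper names, the `SM_DIR` default, and in particular the daemon paths in `STATD_BIN` and `LOCKD_BIN`, which the post does not give. It clears the entries inside the two directories rather than removing the directories themselves. Nothing runs until `reset_lock_daemons` is called.

```shell
# Assumed locations; adjust for the system at hand.
: "${SM_DIR:=/var/statmon}"
: "${STATD_BIN:=/usr/sbin/rpc.statd}"
: "${LOCKD_BIN:=/usr/sbin/rpc.lockd}"

# pids_of NAME: read "ps -ef" output on stdin, print the PID of every
# process whose command (first word after the TIME column) is NAME.
# Matching on the command word keeps "grep NAME" style lines out.
pids_of() {
    awk -v name="$1" '
        {
            for (i = 6; i < NF; i++) {
                if ($i ~ /^[0-9]+:[0-9][0-9]$/) {  # TIME column
                    cmd = $(i + 1)                   # first word of CMD
                    sub(/^.*\//, "", cmd)            # drop directory part
                    if (cmd == name) print $2        # PID column
                    break
                }
            }
        }'
}

reset_lock_daemons() {
    # 1) Stop both daemons.
    for daemon in rpc.lockd rpc.statd; do
        for pid in $(ps -ef | pids_of "$daemon"); do
            kill "$pid"
        done
    done
    # 2) Clear the status monitor entries.
    rm -f "$SM_DIR"/sm/* "$SM_DIR"/sm.bak/*
    # 3) Restart rpc.statd first, then rpc.lockd.
    "$STATD_BIN"
    "$LOCKD_BIN"
}
```

Restarting the lock daemons on the NFS server affects every client holding locks, so this is best done in a quiet period.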
07-19-2011 10:36 AM
Dennis, YOU ARE THE MAN!!! It worked. No more hangs. It took two logins to finally clear, but
then it stopped hanging.
Thank you so much for not giving up on me. It was really starting to bother me that users might
start logging in and getting hung sessions.
I will remember this behavior with rpc.