07-05-2011 04:15 AM
Hi, I have a K-class server.
OS details: HP-UX ho B.10.20 A 9000/889 255864301 two-user license.
Please find my shutdown log below.
cat: Cannot ope: No such file or directory
$ cat shutdownlog
18:57 Thu Aug 26, 2010. Halt: (by ho!root)
19:14 Thu Aug 26, 2010. Halt: (by SAM)
19:15 Thu Aug 26, 2010. Halt: (by ho!root)
16:32 Wed Oct 20, 2010. Reboot: (by ho!root)
23:30 Tue Nov 9, 2010. Halt: (by ho!root)
11:54 Fri Jan 14, 2011. Halt: (by ho!root)
11:05 Thu Jan 20, 2011. Halt: (by ho!root)
14:31 Thu Feb 17, 2011. Reboot: (by ho!root)
09:51 Sat May 14, 2011. Reboot: (by ho!root)
15:54 Sat May 14, 2011. Halt: (by ho!root)
15:53 Tue May 31, 2011. Halt: (by ho!root)
08:03 Thu Jun 30 2011. Reboot after panic: , isr.ior = 0'10240001.0'398e707c
11:42 Thu Jun 30 2011. Reboot after panic: , isr.ior = 0'10240001.0'390e50b8
13:51 Thu Jun 30 2011. Reboot after panic: , isr.ior = 0'0.0'0
03:17 Sun Jul 03 2011. Reboot after panic: , isr.ior = 0'240028.0'6d5b4670
03:42 Mon Jul 04 2011. Reboot after panic: , isr.ior = 0'240028.0'6d5b4670
07-05-2011 04:44 AM
You have had 5 system crashes (panics) in the last few days. Since this is version 10.20 (obsolete for more than 10 years), the solution will be difficult. Patches are no longer available from HP. Start by looking at syslog.old in the /var/adm/syslog directory. You may also have crash dumps in /var/adm/crash. These will be taking a lot of disk space from /var. You may want to remove all but the most recent two core.# directories.
If your system has q4 installed, you can run the attached script. Just cd into /var/adm/crash/core.5 (or whatever core directory you want to analyze), then run the script. It will create a file called WhatHappened.txt. That file may explain why your system crashed.
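As a rough sketch of the cleanup step, the function below only lists the core.# directories that fall outside the newest two, so the list can be reviewed before anything is deleted. The function name old_crash_dirs and its arguments are made up for illustration. It assumes the dumps are named core.N under /var/adm/crash and that a higher N means a more recent dump.

```shell
# Hypothetical helper: print every core.N directory except the "keep"
# highest-numbered ones. It deletes nothing.
old_crash_dirs() {
    crash_dir=${1:-/var/adm/crash}   # directory holding the core.N dumps
    keep=${2:-2}                     # how many of the newest dumps to keep
    (
        cd "$crash_dir" 2>/dev/null || exit 0
        # Sort numerically on the part after the dot so core.10 follows core.9,
        # then print everything except the last "keep" entries.
        ls -d core.* 2>/dev/null |
            sort -t. -k2,2n |
            awk -v keep="$keep" '{ a[NR] = $0 }
                END { for (i = 1; i <= NR - keep; i++) print a[i] }'
    )
}

# Example: show the candidates for removal under the default directory.
# old_crash_dirs /var/adm/crash 2
```

Check the output by hand, and remove directories only after confirming that the dumps you still want to analyze are not in the list.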
07-05-2011 06:06 AM
Those crashes are likely the result of Service Guard -- and like Bill said there's nothing to be done as long as the versions are so out of date.
The best thing to do is to document the failure -- perhaps with some basic diagnostics (e.g. "if q4 says <this> then reboot") -- so that it's a known issue.
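One simple way to start documenting the failures is to summarize the panic entries in the shutdown log by date. This is only a sketch: panic_summary is a made-up name, and it assumes the log lines look like the "Reboot after panic" entries posted above (time, weekday, month, day, year).

```shell
# Hypothetical helper: count "Reboot after panic" entries per calendar day.
panic_summary() {
    log=${1:-shutdownlog}            # path to the shutdown log
    grep 'Reboot after panic' "$log" |
        awk '{
            day = $3 " " $4 " " $5   # e.g. "Jun 30 2011."
            sub(/\.$/, "", day)      # drop the trailing period
            if (!(day in count)) order[++n] = day
            count[day]++
        }
        END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }'
}

# Example: panic_summary shutdownlog
```

Run against the log in the first post, this should report three panics on Jun 30 2011 and one each on Jul 03 and Jul 04.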
07-05-2011 07:25 AM
Speedy action is necessary. If the system keeps crashing you may fill up /var or /var/crash. If your crash files are going to /var filesystem, your system could become unusable in short order.
I would also use mstm/cstm/xstm (your choice here) and look for broken hardware. This could be contributing to the crash.
Owner of ISN Corporation
07-05-2011 08:05 AM
Also, those could be the result of the system crashing due to hardware events, or HPMCs (High Priority Machine Checks). The shutdownlog rarely reports "HPMC", from what I have seen anyway. If your OS is patched, you should have a directory: /var/tombstones. Within the directory there will be lots of tsXX files. The file ts99 will be the latest. For K Class servers, reading these is not always difficult.
First off, check the timestamps to be sure they are valid. If so, look at each processor's chassis codes. The easiest thing to look for is anything starting with 0x2... This would indicate that the processor is causing the HPMCs to occur. Of course, that is only one scenario.
There can be a lot more to this, so if you like, you can attach the ts99 file for review (if you have one).
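For a quick first pass over a tombstone, something like the following can list the lines that contain a hex value beginning with 0x2. This is a coarse filter and only a sketch: scan_tombstone is a made-up name, it does not assume any particular layout for the ts files, and it will also match register values that merely happen to start with 0x2, so the hits still need to be read in context.

```shell
# Hypothetical helper: print (with line numbers) every line of a tombstone
# file that contains a hex token starting with 0x2.
scan_tombstone() {
    ts_file=${1:-/var/tombstones/ts99}   # ts99 is the latest, per the post above
    # Require 0x2 to start a token, so a value like 0x10x2 is not picked up.
    grep -n -E '(^|[^0-9A-Za-z])0x2[0-9A-Fa-f]*' "$ts_file"
}

# Example: scan_tombstone /var/tombstones/ts99
```

No output means no such value was found in that file, which by itself does not rule out a hardware cause.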
07-06-2011 07:15 AM
The panic was due to an HPMC. The HPMC was caused by a bus check. In other words, some module on the system bus initiated an HPMC. A review of the tombstone data would be needed to determine the cause. If you don't have the /var/tombstones directory, you can do one of the following:
- Install the pdcinfo patch that will give you the /var/tombstones directory. However, this being 10.20, you will have some trouble getting this from HP... if you can locate it, the patch ID is PHSS_14980.
- Reboot the server and access the BCH menus. Run SER PIM and capture the output.
Either of the two above will give you the register data needed to decode the problem.