12-14-2012 10:25 AM - last edited Monday by Lisa198503
Hi, we need help finding out what is causing our rx8640 to crash. The server has been online without errors since Feb-2012, but whenever a reboot is issued it crashes about 3 or 4 times and then comes back online without problems.
08:29 Sun Feb 26 2012. Reboot after panic: MCA, IIP:0xe000000000664ae0 IFA:0x0000000000000045
I attached the crash file.
P.S. This thread has been moved from HP-UX > General to Servers > Integrity. -HP Forum Moderator
12-14-2012 12:41 PM
As Dennis said, an MCA (Machine Check Abort) usually means a hardware problem somewhere. I don't think anyone on the forums here is going to decode a full crash dump for you (but I guess you never know!), so you'll need to log a call with HP support for that. They have experts who, in my experience, pinpoint the issue fairly quickly for most panics.
And if you don't have a support contract? Well, now you know why you need one...
12-17-2012 07:53 AM
I'm only following my nose here, and make no claim to read registers, etc. However, crashinfo.txt contains the following.
= Global Error Counters / kmem_writes =
scsi_bioerrors = 15
scsi_bioerrors_logged = 0
scsi_async_write_bioerrors = 0
scsi_async_write_bioerrors_logged = 0
sd_strategy_bioerrors = 3
sd_strategy_async_write_bioerrors = 0
sd_epowerf_bioerrors = 0
sd_epowerf_async_write_bioerrors = 0
Note: There are scsi_bioerrors !
Please contact HP support referencing CRASHINFO_NOTE_SCSI1.
This adds weight to the likelihood of a hardware problem that other respondents have noted. Best of luck!
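For anyone who wants to scan a crashinfo.txt for non-zero error counters before opening a support call, here is a minimal sketch in Python. It assumes the counters appear as plain "name = value" lines, as in the excerpt above; the function name nonzero_counters and the SAMPLE text are hypothetical, and this is not part of crashinfo or any HP tool.

```python
import re

# Assumption: counters are printed one per line as "name = value",
# matching the excerpt quoted in this post.
COUNTER_RE = re.compile(r"^\s*(\w+)\s*=\s*(\d+)\s*$")


def nonzero_counters(text):
    """Return {counter_name: value} for every counter with a non-zero value."""
    found = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line)
        if match and int(match.group(2)) != 0:
            found[match.group(1)] = int(match.group(2))
    return found


# Sample input copied from the excerpt above (hypothetical variable name).
SAMPLE = """\
= Global Error Counters / kmem_writes =
scsi_bioerrors = 15
scsi_bioerrors_logged = 0
scsi_async_write_bioerrors = 0
scsi_async_write_bioerrors_logged = 0
sd_strategy_bioerrors = 3
sd_strategy_async_write_bioerrors = 0
sd_epowerf_bioerrors = 0
sd_epowerf_async_write_bioerrors = 0
"""

if __name__ == "__main__":
    for name, value in nonzero_counters(SAMPLE).items():
        print(f"{name} = {value}")
```

This only flags which counters are non-zero; deciding what they mean for the MCA is still a job for HP support.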
12-18-2012 01:46 AM
crashinfo is an HP support tool. The right place to have its output interpreted is HP support.
At the beginning of crashinfo output you have:
Note: HP CONFIDENTIAL
I had the same problem with my BL860 server and found the solution below. Please take a look; it may help you: