11-14-2013 05:44 AM
I have a range of servers (DL360, DL380 & DL580), all G5 models, all running Windows 2003 or Windows 2003 R2. If I restart any of the servers, they go through the shutting down process and then freeze at a grey screen. The server continues to respond to ping. This also happens if the restart is initiated remotely with the shutdown command (shutdown /r /f /t 1 /m \\<servername>). I've left servers in that state for several hours to see if they would eventually restart, but they never do.
The only way to restart the server is by power cycling the server from the iLO. The servers all have a P400i array controller running firmware v4.12 or v5.20. The other common factor I can find is that all the affected servers have an iLO 2 with firmware v2.15 installed.
If the server is power cycled from the iLO so that it's back at the CTRL/ALT/DEL screen, and the shutdown command is then run again straight away, it restarts normally. So the issue only seems to develop after the server has been up for a while.
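For anyone who wants to reproduce the check, the symptom described above (server keeps answering ping but never actually restarts) can be sketched roughly as follows. This is only a sketch under assumptions: the function names, the sample count and interval, and the "TTL=" check on the ping output are my own choices, and it assumes it is run from a Windows machine where the ping and shutdown commands are available.

```python
import subprocess
import time


def build_shutdown_command(server: str) -> list[str]:
    # Same flags as the command quoted above: restart, force, 1 s delay, remote target.
    return ["shutdown", "/r", "/f", "/t", "1", "/m", rf"\\{server}"]


def ping_once(host: str) -> bool:
    # Windows ping: -n 1 sends one echo request, -w 1000 waits up to 1000 ms.
    result = subprocess.run(
        ["ping", "-n", "1", "-w", "1000", host],
        capture_output=True,
        text=True,
    )
    # Assumption: also require "TTL=" in the output so that a
    # "Destination host unreachable" line is not counted as a reply.
    return result.returncode == 0 and "TTL=" in result.stdout


def classify_reboot(ping_results) -> str:
    """Classify a sequence of ping outcomes (True = replied).

    "restarted"     : replies stopped and later came back
    "still_down"    : replies stopped and had not come back by the end
    "never_dropped" : replies never stopped (the hang described above)
    """
    went_down = False
    for replied in ping_results:
        if not replied:
            went_down = True
        elif went_down:
            return "restarted"
    return "still_down" if went_down else "never_dropped"


def watch_reboot(server: str, samples: int = 60, interval: float = 5.0) -> str:
    # Hypothetical helper: sample ping for samples * interval seconds.
    results = []
    for _ in range(samples):
        results.append(ping_once(server))
        time.sleep(interval)
    return classify_reboot(results)
```

Typical use would be to run subprocess.run(build_shutdown_command("<servername>")) and then call watch_reboot on the same name. A result of "never_dropped" matches the grey screen hang, since the server never leaves the network.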
I have multiple other ProLiant servers of different generations running Windows 2003, but none of them experience this issue.
Has anyone experienced this issue and managed to overcome it?
11-14-2013 10:13 PM
This sounds like something in the shutdown process of the OS is stuck. Is there a way to check what Windows is doing when it is shutting down?
Perhaps there are some services running on the servers that don't stop properly, so the shutdown gets stuck.
http://support.microsoft.com/kb/324268 has some pointers.
11-20-2013 01:15 AM
Thanks, I'll go through that troubleshooting guide.
One thing I have noticed, which seems strange but is very consistent, is that all of the ProLiants I have the issue with have iLO2 firmware 2.05 or 2.15. All those with iLO2 on older firmware (e.g. 1.60) reboot without any problems. This is a 100% common factor across my servers that don't restart. I've got a couple of cases where a specific server used to restart fine, the firmware was upgraded, and the next restart failed. So it does seem to be tied to the iLO2 firmware version, but I can't find any reason why it would be.
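To keep track of that pattern as more servers get tested, a small tally like the one below could help. It is just a sketch: the function name and the record layout (one pair of iLO2 firmware version and restart outcome per reboot attempt) are my own assumptions, not anything provided by HP tooling.

```python
from collections import defaultdict


def failures_by_firmware(records):
    """Count failed restarts per iLO2 firmware version.

    records: iterable of (ilo_firmware, restarted_ok) pairs,
             one pair per reboot attempt (assumed layout).
    Returns a dict mapping firmware -> (failed, total).
    """
    tally = defaultdict(lambda: [0, 0])  # firmware -> [failed, total]
    for firmware, restarted_ok in records:
        tally[firmware][1] += 1
        if not restarted_ok:
            tally[firmware][0] += 1
    return {fw: (failed, total) for fw, (failed, total) in tally.items()}
```

If the firmware really is the cause, the failed count should equal the total for the affected versions and stay at zero for the older ones, including after a version change on the same server.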
I've downloaded the latest iLO2 firmware (2.23) and am testing that on a couple of problematic servers. If that doesn't work, I plan to downgrade the firmware on the same test servers. If the change of firmware fixes the issue, I'll post back here.