05-08-2012 01:44 AM
I have 4 DL180 G6 servers: 3 with older firmware, and one brand new that I upgraded with the latest firmware on everything (DVD 10).
The LO100 interface stops responding on all 4 servers after some time.
That is, when you try to browse to the LO100 interface using a browser, you get "The website is too busy to show the webpage", and using telnet/SSH you just get a connection timeout.
I have rebooted two of the servers, and this does not help.
So I completely powered down one of the servers and turned it back on, and then the LO100 interface came back online and started working again.
Also, I use HP SIM to monitor these servers, and about once or twice a week the LO100 interfaces on 3 of the 4 servers become unpingable: I get an alert from HP SIM that the iLO interface cannot be pinged and is down, for all 3 at the exact same time. (I have around 100 servers, all connected to the same iLO switch, and it is only these 3 that lose their network connection.)
On the servers running Windows, I get these entries in the event log (several hundred entries):
EVENT ID : 57
DriverState = STATE_ILO_HOSED
Is there a way to correct this without completely powering down the server, and is there a more permanent fix? The latest firmware does not seem to correct the issue, as I have to keep power cycling the servers.
The scary bit is that, on the machines running Linux, the server seems to think it is a good idea to do an ASR to correct the issue, so I have had to uninstall the Insight Management Agents to prevent the servers from randomly rebooting.
05-08-2012 03:23 AM
You can try updating the management controller driver, if you have not already done so.
05-08-2012 03:30 AM
Are there any network scanners or monitors running on the iLO network that could perhaps cause issues? I haven't checked for HP SIM updates, but maybe there is an update for that?
The 3 out of 4 that lose connection: are they all running the older firmware? Do they become unpingable at irregular intervals?
I'd double-check that the IP settings are correct. Maybe the switch ports' settings can be changed, for example to fixed 100 Mbit full duplex instead of auto.
Is it LO100i version 4.23A on DVD 10?
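To answer the questions about timing (whether the interfaces drop at irregular intervals, and whether several drop at the same moment), a small logger run from a machine on the management network may help. The following is only a minimal sketch in Python: the addresses in `HOSTS` are placeholders, the function names are made up for illustration, and it tests TCP reachability of the web and SSH ports rather than ICMP ping, so it will not match the HP SIM ping alerts exactly.

```python
import socket
import time
from datetime import datetime

# Placeholder management addresses (assumption): replace with the real LO100 IPs.
HOSTS = ["192.0.2.11", "192.0.2.12", "192.0.2.13", "192.0.2.14"]
# Web interface and SSH, the two services mentioned in this thread.
PORTS = (80, 22)


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False


def check_once(hosts, ports, probe=port_open):
    """Probe every host/port pair once and return {host: {port: bool}}."""
    return {h: {p: probe(h, p) for p in ports} for h in hosts}


def down_hosts(result):
    """Return the sorted hosts on which none of the probed ports answered."""
    return sorted(h for h, ports in result.items() if not any(ports.values()))


def monitor(interval=60):
    """Print one timestamped line per round listing unresponsive hosts."""
    while True:
        down = down_hosts(check_once(HOSTS, PORTS))
        stamp = datetime.now().isoformat(timespec="seconds")
        print(stamp, "DOWN:", ", ".join(down) if down else "none", flush=True)
        time.sleep(interval)
```

Calling `monitor()` and redirecting the output to a file gives a timeline that can be compared against the HP SIM alerts and against any scheduled scans on the management network.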
05-08-2012 04:30 AM
I have the updated management controller driver on only one of the servers.
The three that completely lose network connection are the ones with the older firmware set.
The server that has the latest firmware has 4.23 for LO100 (I cannot see an A at the end).
I will try setting the switch ports to 100fdx.
I know of at least two departments that use HP SIM on our iLO network, so yes, it could be something like that happening (I know there was a similar issue with HP SIM in some 5.x version). I use HP SIM 6.3 myself, but I am not sure what the other department uses.
So as I see it, my way forward would be:
1 Make sure my iLO network switch uses 100fdx on the 4 DL180 G6 servers
2 Switch the 3 older servers off completely to kick iLO back into life
3 Update the three servers' firmware using the firmware DVD
4 Update to the latest PSP on the OS for drivers