ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset
Advisor
Iain Binnie
Posts: 34
Registered: ‎11-25-2008
Message 1 of 22

Hi,

This issue involves several DL380 and DL360 G5 servers.

I recently updated the iLO 2 firmware to version 1.79, and the HP System Management Homepage to v3.0.2.77, on several DL380 and DL360 servers.

Since then the servers have been power-cycling sporadically, and each time the iLO 2 log shows this message just before the power cycle: "BMC IPMI Watchdog Timer Timeout: Action=System Power Reset".

I have dug around online and found various threads that lead nowhere specific. They all indicate that HP is working on this, and suggest disabling ASR as an interim fix, which I have done, and it appears to have stopped the reboots.

My questions are:

1. What is the issue I am experiencing, and how do I fix it?

2. What exactly are ASR and the BMC IPMI Watchdog Timer, and what are they used for?

Your help as ever is greatly appreciated!

Best Regards
Iain
Honored Contributor
Michael Leu
Posts: 508
Registered: ‎01-17-2005
Message 2 of 22

I'm no expert in this, but I think the recommendation (for Linux systems) is to disable ASR.

The DL585 G5 User Guide on ASR:
ASR is a feature that causes the system to restart when a catastrophic operating system error occurs, such as a blue screen, ABEND, or panic. A system fail-safe timer, the ASR timer, starts when the System Management driver, also known as the Health Driver, is loaded. When the operating system is functioning properly, the system periodically resets the timer. However, when the operating system fails, the timer expires and restarts the server.
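
To make the timer mechanism in that quote a bit more concrete, here is a minimal Python sketch of a fail-safe (watchdog) timer. It is a toy model under my own assumptions, not HP's implementation: the class and function names, the one-second tick, and the timeout values are all hypothetical.

```python
class WatchdogTimer:
    """Toy model of an ASR-style fail-safe timer (not HP's implementation)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s      # countdown length in simulated seconds
        self.remaining_s = timeout_s
        self.expired = False

    def pet(self):
        """Called by a healthy OS/driver to reset the countdown."""
        if not self.expired:
            self.remaining_s = self.timeout_s

    def tick(self, elapsed_s):
        """Advance simulated time; expire once the countdown reaches zero."""
        if self.expired:
            return
        self.remaining_s -= elapsed_s
        if self.remaining_s <= 0:
            self.expired = True


def simulate(timeout_s, pet_times, end_s):
    """Return the simulated second at which the timer fires, or None.

    pet_times holds the seconds at which the (hypothetical) health driver
    manages to reset the timer; a hung OS simply stops appearing in it.
    """
    wd = WatchdogTimer(timeout_s)
    pets = set(pet_times)
    for now in range(1, end_s + 1):
        wd.tick(1)
        if wd.expired:
            return now      # a real system would restart the server here
        if now in pets:
            wd.pet()
    return None
```

With a 10-second timeout, `simulate(10, range(5, 61, 5), 60)` returns `None` because the timer is reset every 5 seconds, while `simulate(10, [5, 10, 15], 60)` returns `25`: the last reset happens at second 15 and the countdown runs out 10 seconds later. A driver that merely loses contact with the timer looks exactly like a hung OS in this model.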
Honored Contributor
Michael Leu
Posts: 508
Registered: ‎01-17-2005
Message 3 of 22

By the way, someone else has answered your second question much better than I ever could :-)

http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1374922
Honored Contributor
acartes
Posts: 685
Registered: ‎04-22-2004
Message 4 of 22

The problem solution requires both:
- update to iLO 2 v1.78 or later
- update the Windows Management Controller Driver to 1.11.2.0 or later

Updating one or the other is not a complete fix.

Discussed in this Customer Advisory:
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?locale=en_US&objectID=c01802766
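
As a rough illustration of the "both, not either" requirement, here is a small Python sketch that checks a server's versions against the two minimums listed above. The function names are made up for illustration, and it assumes version strings are plain dotted numbers as written in this thread.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.11.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimums taken from the two bullet points above.
MIN_ILO2_FIRMWARE = parse_version("1.78")
MIN_DRIVER = parse_version("1.11.2.0")


def meets_advisory_minimums(ilo_firmware, driver):
    """True only if BOTH the iLO 2 firmware and the OS driver are new enough."""
    return (parse_version(ilo_firmware) >= MIN_ILO2_FIRMWARE
            and parse_version(driver) >= MIN_DRIVER)
```

For example, `meets_advisory_minimums("1.79", "1.12")` is `True`, while `meets_advisory_minimums("1.70", "1.11.2.0")` and `meets_advisory_minimums("1.78", "1.8.0.0")` are both `False`. This only tests the stated minimums; it says nothing about whether a given combination is actually stable.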
Occasional Visitor
Support Microsoft Serv
Posts: 4
Registered: ‎03-19-2008
Message 5 of 22

Have you tried upgrading the iLO 2 Management Controller Driver version?
Occasional Visitor
Billy Barule
Posts: 4
Registered: ‎04-16-2007
Message 6 of 22

acartes, the Advisory you cite talks about this issue occurring when the iLO 2 firmware is at or below 1.70. Iain's issue (and mine, coincidentally) did not manifest until the iLO firmware was upgraded to 1.79.
I had a colleague reference the same Advisory article, but I'm not convinced the problem is solved.
Occasional Visitor
Mark Ottaway
Posts: 3
Registered: ‎08-06-2009
Message 7 of 22

I'm having the same issue. Mine didn't start until upgrading to 1.79!! I have updated the Instance driver, the controller driver and the system ROM. Still getting random reboots and reports of BMC IPMI Watchdog Timer Timeout.
Occasional Contributor
John Cunningham_4
Posts: 4
Registered: ‎03-20-2008
Message 8 of 22

I have the same issue here on a number of DL360 G5s. They reboot at random times after upgrading to iLO 1.79.

Has anyone tried reflashing back to 1.78?

It is suggested elsewhere that the reboot will only happen once - can anyone confirm this?

John
Occasional Advisor
Jimmy Tn
Posts: 12
Registered: ‎10-29-2008
Message 9 of 22

Hi,

My experience is that iLO2 Management Driver 1.11.2.0 AND iLO2 firmware 1.78 or later will fix the issue.

BR / Jimmy
Honored Contributor
acartes
Posts: 685
Registered: ‎04-22-2004
Message 10 of 22

>> acartes, the Advisory you cite talks about this issue occurring when the iLO2 Firmware is at or below 1.70. Iain's issue (and mine coincidentally) did not manifest until the iLO Firmware was upgraded to 1.79.
>> I had a colleague reference the same Advisory article, but I'm not convinced the problem is solved.

A complete solution requires that both iLO and the OS driver are updated.
Details for each are in the CA.
The iLO change addresses a "duplicate records in the SEL" bug (generated by the OS driver); the OS driver change prevents a communication hang-up that can result in an ASR.

The iLO change was introduced in iLO 2 v1.78, and released at the same time as the OS driver update. It is a difficult "update two pieces of software to address the issue" fix.
Occasional Visitor
Brutus the Barber Beefc
Posts: 1
Registered: ‎10-14-2009
Message 11 of 22

I had this problem on a first generation DL320s.

I have successfully reflashed the iLO firmware back to 1.78 from 1.79.

You can do this by using HP Software Update Manager.
1. Download the latest HP Firmware Maintenance CD ISO.
2. Edit the ISO and add the iLO firmware 1.78 .scexe file if it is not already present.
3. Write the image to a USB stick for faster booting and you're away!
You may have to use force downgrade and then select version 1.78.

Also use the latest iLO management driver.

I have had loads of grief on DL370s, but this went away with 1.78 and the Windows iLO driver.
Advisor
Matt Sebel
Posts: 21
Registered: ‎04-30-2009
Message 12 of 22

We just had the same problem last night after upgrading iLO to 1.79 on a couple of BL460 blades running 2003 SP2. They both rebooted early this morning with the same BMC message. We upgraded the driver to 1.12 (the latest version available). Now we wait to see if the problem manifests itself again.
Advisor
Martin T. Greene
Posts: 23
Registered: ‎04-19-2004
Message 13 of 22

Same issue with our 460c G1 blades. The iLO firmware is 1.79 and the HP ProLiant iLO 2 Management Controller Driver is at 1.12. Still a problem.
Occasional Visitor
Dennis D
Posts: 2
Registered: ‎10-29-2009
Message 14 of 22

DL360 G5
iLO 1.79 - BMC IPMI Watchdog Timer Timeout: Action=System Power Reset.

Had to revert to 1.78 to stabilize.
Occasional Visitor
Dennis D
Posts: 2
Registered: ‎10-29-2009
Message 15 of 22

Forgot to mention:
I am running Server 2003 R2 with SQL 2005 SP3.
Occasional Visitor
Klause
Posts: 2
Registered: ‎10-30-2009
Message 16 of 22

We are having the same issue on 4 DL360 G5s.
It started with NO UPDATES. It just started at the beginning of the month. I've updated EVERYTHING as per HP (using their software) and it's still happening. I disabled ASR and it is still happening, but now I don't get the watchdog error. We are on 1.79 now, but HP suggested I drop back to 1.78. It's not cool to have production servers reboot.
Occasional Visitor
Klause
Posts: 2
Registered: ‎10-30-2009
Message 17 of 22

Does anyone know if you have to reboot after an iLO firmware upgrade/downgrade?
Occasional Visitor
Mike Murphy_5
Posts: 4
Registered: ‎06-18-2003
Message 18 of 22

Same issue on an ML370 G5; however, I didn't notice it until now. The issue started the day AFTER updating iLO 2 to FW 1.79 (the logs support this). I have since updated to FW 1.81. We'll see.
Scary: the logs show many instances of BMC IPMI Watchdog Timer Timeout and ASRs. I wonder how that affects the W2K8 64-bit file systems. Fingers crossed.
There was a "temp solution" for RHEL of disabling ASR, but that defeats the purpose of ASR.
Advisor
8i5
Posts: 30
Registered: ‎04-16-2007
Message 19 of 22

We have the same issue on iLO 2 firmware 1.79 and 1.81, plus the latest PSP 8.30 and post-8.30 drivers.

Waiting for HP's answer, and I suspect it will be "disable ASR".
Occasional Visitor
Carol Northcut
Posts: 1
Registered: ‎06-06-2008
Message 20 of 22

We've been having the same issue with a DL380 G5 for several months. We have been on the phone with HP multiple times, with three different case numbers! The occurrence is random, but the logs always record the same errors. ASRs have occurred three times in the last three days. With ASR disabled, the blue screen only says "*** Hardware Malfunction Call your hardware vendor for support *** The system has halted ***".
Each time it ASRs, the iLO 2 log shows "BMC IPMI Watchdog Timer", "Server power removed" and "server power restored." The Integrated Management Log Viewer always shows "PCI Bus Error (Slot0, Bus0, Device 0, Function 0)" and "ASR Detected by system ROM." The only thing disabling ASR does is prevent the server from resetting after the failure. Not a good thing. We've replaced the system board and updated all the software/firmware multiple times (currently on iLO 1.81), and it's getting worse. This is a Win2K3 R2 SP2 x64 server running Tivoli, connected to a library and multiple MSAs with P800 and P400 controllers. HP has asked that I next re-run the SmartStart diagnostics and run a repair of Windows. Aaargh!
Occasional Visitor
Patrick Metcalfe
Posts: 1
Registered: ‎02-26-2010
Message 21 of 22

Had similar issues. Look at this link for blade servers; it probably works for the DL380 server also.

http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01723453/c01723453.pdf

http://h18004.www1.hp.com/products/blades/components/c-class.html

See also attached document


Advisor
8i5
Posts: 30
Registered: ‎04-16-2007
Message 22 of 22

Those compatibility matrix docs are standard, and if you don't follow them you will always be looking for trouble.

The latest I've been told by HP for my up-to-date blade servers, which are experiencing this issue, is to downgrade the iLO management controller driver to 1.8.0.0 and then upgrade back up to 1.13.0.0 again. I cannot imagine how this will help, but I will try it if the issue persists.