12-28-2012 01:46 PM
Following a high-temperature incident that has since been resolved, one of my rx6600 servers will not boot, either from disk or from install media. The front panel LEDs are: system --> flashing red; internal --> off; external --> green; power --> green
The server does not even seem to get past the EFI stage.
I got into the MP and read the SL log entries; this is the listing:
Log Entry 17:
Alert Level 7: Fatal
No events were received from system firmware
Logged by: Baseboard Management Controller;
Data1: FRB2/Hang in POST failure
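For anyone who wants to compare several entries like the one above, here is a minimal Python sketch that splits a pasted entry into fields. It assumes the entry has exactly the layout shown in this listing; the function name `parse_entry` and the dictionary keys are made up for illustration and are not part of any HP tool.

```python
import re

# The entry exactly as pasted from the MP log listing above.
SAMPLE = """Log Entry 17:
Alert Level 7: Fatal
No events were received from system firmware
Logged by: Baseboard Management Controller;
Data1: FRB2/Hang in POST failure"""


def parse_entry(text):
    """Split one pasted log entry into a dict (hypothetical helper)."""
    entry = {"message": []}
    for line in text.splitlines():
        # Drop surrounding whitespace and a trailing semicolon, if any.
        line = line.strip().rstrip(";")
        if not line:
            continue
        m = re.match(r"Log Entry (\d+):$", line)
        if m:
            entry["entry"] = int(m.group(1))
            continue
        m = re.match(r"Alert Level (\d+): (.+)$", line)
        if m:
            entry["alert_level"] = int(m.group(1))
            entry["severity"] = m.group(2)
            continue
        m = re.match(r"(Logged by|Data\d+): (.+)$", line)
        if m:
            entry[m.group(1)] = m.group(2)
            continue
        # Anything else is treated as free-text description.
        entry["message"].append(line)
    return entry


if __name__ == "__main__":
    print(parse_entry(SAMPLE))
```

Other entry layouts would need extra patterns; lines that match none of them simply end up in the `message` list.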
My firmware is 02.17, the BMC is 05.23, and the system ROM firmware is 04.03.
I urgently need to bring this server back up. Any help?
12-28-2012 02:18 PM
I'm not trying to make a joke. If you have a system that was exposed to high heat and it says it's hanging in POST and won't boot:
CALL YOUR HARDWARE MAINTENANCE PROVIDER or swap the server out. High heat is A Real Bad Thing for modern computer hardware. Warning or failure messages or indicators in this circumstance aren't trivial. Often you won't have some cute little way to defeat a test that fails after a system has been overheated. Newer technology is MUCH more sensitive, and its more densely packed components make systems even more sensitive than older units, so manufacturers have built in mechanisms to monitor heat conditions more closely.
12-30-2012 07:13 AM
Is there a mirrored root disk? If so, boot from the alternate HDD. Otherwise, boot from the CD-ROM into LVM maintenance mode, or try booting with the old kernel.
01-01-2013 01:06 AM
Hi and happy new year to you all.
The problem is almost solved. Since the BMC was talking about the processor in the MP logs, I suspected that one of my 3 dual processors was faulty. I then started testing them one by one, removing each one in turn in their load order.
That changed nothing: the server was still unable to boot.
Luckily, I have an identical server in another rack which is not used all the time. So, once the users were done, I replaced the entire processor board chassis of the faulty server with the one from the server in the second rack, and the boot process completed fine. I think I will ask for a quote for a new processor board.
Thanks to all. And Kevin, don't be afraid, the cooling system has been fully restored.