RAM errors? (rx2600) (1753 Views)
Frequent Advisor
Marco Gariboldi
Posts: 46
Registered: ‎01-21-2011
Message 1 of 3 (1,753 Views)

RAM errors? (rx2600)


Recently I had a very unfortunate problem with one of my rx2600s, which had been powered off for several days. Yesterday I powered it up again and rebooted it more than once. At some point the system hung (after successfully booting into OpenVMS), and I believe I had to power-cycle it via iLO. (I also posted about this on comp.os.vms, but got no responses.)

 

Later the EFI presented me with a series of memory (RAM) errors, as seen below:

 

    3 0 0x000FA0 0xFFFFFFFF000AFF74 DRAM failure on DIMM 0A, deallocate rank
    3 0 0x000FA0 0xFFFFFFFF000AFF74 DRAM failure on DIMM 0A, deallocate rank
    3 0 0x000FA0 0xFFFFFFFF004AFF74 DRAM failure on DIMM 4A, deallocate rank
    3 0 0x000FA0 0xFFFFFFFF002AFF74 DRAM failure on DIMM 2A, deallocate rank

After reseating the DIMMs in the above-mentioned slots, the system ‘works’ again (at least up to the moment of writing this message) and the errors have seemingly disappeared.
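
In case the errors come back, a quick way to keep track of which DIMMs are reported most often is to tally the log entries per slot. The following is only a minimal sketch: the column layout (two numbers, two hex words, then the message text) is an assumption inferred from the four lines quoted above, not from any HP documentation, and `tally_dimm_failures` is a hypothetical helper name.

```python
import re
from collections import Counter

# ASSUMPTION: each entry looks like
#   <number> <number> <hex word> <hex word> <message text>
# This layout is inferred only from the four lines quoted in this post.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(0x[0-9A-Fa-f]+)\s+(0x[0-9A-Fa-f]+)\s+(.*)$"
)
# The slot name is taken from the message text, e.g. "DIMM 0A".
DIMM_RE = re.compile(r"DIMM\s+(\w+)")


def tally_dimm_failures(log_text):
    """Count how many log entries mention each DIMM slot (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        entry = LINE_RE.match(line)
        if entry is None:
            continue  # skip anything that does not look like a log entry
        dimm = DIMM_RE.search(entry.group(5))
        if dimm is not None:
            counts[dimm.group(1)] += 1
    return counts


# The entries exactly as quoted above.
LOG = """\
    3 0 0x000FA0 0xFFFFFFFF000AFF74 DRAM failure on DIMM 0A, deallocate rank
    3 0 0x000FA0 0xFFFFFFFF000AFF74 DRAM failure on DIMM 0A, deallocate rank
    3 0 0x000FA0 0xFFFFFFFF004AFF74 DRAM failure on DIMM 4A, deallocate rank
    3 0 0x000FA0 0xFFFFFFFF002AFF74 DRAM failure on DIMM 2A, deallocate rank
"""

if __name__ == "__main__":
    for slot, n in sorted(tally_dimm_failures(LOG).items()):
        print(f"DIMM {slot}: {n} event(s)")
```

If the same slot keeps leading the tally after the DIMMs have been swapped between slots, that would point towards the slot or board rather than the module; if the count follows the module, the DIMM itself is the more likely suspect.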

 

I rebooted because there would sometimes be visual artifacts on the screen (in oddly varying ‘quantities’). Strangely enough, these artifacts seem to appear whenever I have a simultaneous serial console connection, or after I reboot from a console. However, they now seem to show up more frequently than before.

 

The video card in question is a Radeon 7500 (PCI). I have two of them and both are installed in rx2600s (they should be identical; I purchased them from the same retailer). Another thing I've noticed is that the Radeon 7500 in the rx2600 without the memory errors provides EFI graphics output, while the one in the system with the memory issues does not. Both rx2600s have the same firmware, and I'm not aware of any specific Radeon 7500 firmware.

 

Could all of this be caused by faulty memory, or is the PCI card backplane perhaps faulty? Maybe even the system board? Although the system is up and running again, I'm not very confident about it and I wonder how long that will last...

Frequent Advisor
Marco Gariboldi
Posts: 46
Registered: ‎01-21-2011
Message 2 of 3 (1,703 Views)

Re: RAM errors? (rx2600)

Does anyone perhaps have an idea? Could it be caused by one or more faulty DIMMs, or am I looking at a defective system board?

As a student and OpenVMS Hobbyist, buying spare parts is quite a commitment for me, so I'd like to have some certainty before I act.

Also, the strange issue with the Radeon 7500 puzzles me: on one rx2600 it gives me graphical output and on the other it doesn't. The cards are from the same reseller and should be identical. (I'm not aware of any firmware differences either, or of how to reflash a card.)

I have also swapped the card cages, but that didn't change the situation (so nothing appears to be wrong with those, nor with the graphics adapters).

HP Pro
Ian Miller.
Posts: 4,371
Registered: ‎06-03-2003
Message 3 of 3 (1,592 Views)

Re: RAM errors? (rx2600)

Have you tried running the offline diagnostics on this server?

 

IPF Offline Diagnostics and Utilities

 

There are two HP Support articles which may be of interest:

 

HP Integrity Server rx2600 - General Troubleshooting

HP Integrity Server rx2600 - possible problems

____________________
Purely Personal Opinion