07-14-2005 07:47 AM
(The overheating is happening overnight, when air conditioning for the server room is problematic - long story - so it is happening for a reason, not randomly).
The reason I bring it up is that this server seems to be particularly prone to overheating (my three dual processor DL380G3 and one quad processor ML570G2 tend to keep running, often indefinitely, after this one keels over).
At the time it shuts down (early AM hours), activity on the server should be close to zero, so I don't think it's overheating faster than the others due to workload...
Just added a RILOE II card to more easily resuscitate the server following shutdowns, but this actually seems to have dramatically exacerbated the heat problem, so I'm thinking of removing it (the ML570 has a RILOE II as well, with no ill effect).
Saw that there is an advisory regarding internal tape drives in the ML530 and overheating, so I will remove the one sitting unused in the bay; hopefully this will help. Since I never use it, however, could it really be causing a problem?
Looking at the System Management Homepage, I noticed that the ML530 has a 66'F threshold on the CPUs, while the ML570 (basically the same design, no?) has a 70'F on the CPUs. Why the difference? Can I adjust that?
Finally, the ambient temperature in my server room when the ML530 starts shutting down is 99'F - not ideal, but not crazy either... or?
Anybody have any thoughts, advice?
07-14-2005 09:52 PM
Updated to the latest System ROM, and installed the latest ProLiant Support Pack overnight. [Also took out the internal tape drive, never used it anyhow.]
This has raised the threshold temperature on the CPUs from 66'F to 69'F. Standing next to the server, it also sounds like the internal fans are working more aggressively.
Yesterday, ambient was 99'F and CPU was at 66'F. Today, ambient is at 95'F and CPU at 52'F.
07-19-2005 07:10 AM
We just completed a study of temperature levels in our server room vs. system shutdown.
Essentially, with the AC at 65, the (near server) environmental probe registers 75 and the CPU temp (via SNMP) is 91.
When we turned the AC off, within 20 mins the probe went to 82 and the CPU to 102.
We extrapolated the rest: after 45 mins the probe would be at 91 and the CPU at 116.
120 is about where the CPU shuts down and 160 is where it melts down.
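For anyone who wants to redo the numbers with their own readings: the 45-minute figures above are what you get from a straight line through the two measured points. Here is a rough Python sketch of that arithmetic. It assumes the temperature keeps climbing at the same rate after the AC goes off, which is only an approximation, and the function names are made up for illustration.

```python
def rate_per_minute(start_temp, later_temp, minutes):
    """Degrees gained per minute between two readings."""
    return (later_temp - start_temp) / minutes


def extrapolate(start_temp, rate, minutes):
    """Projected temperature after `minutes`, assuming a constant rate."""
    return start_temp + rate * minutes


def minutes_until(start_temp, rate, limit):
    """Minutes until `limit` is reached, assuming a constant rate."""
    return (limit - start_temp) / rate


# Readings from the study: AC on, then 20 minutes after the AC was turned off.
probe_rate = rate_per_minute(75, 82, 20)   # environmental probe
cpu_rate = rate_per_minute(91, 102, 20)    # CPU temperature via SNMP

probe_45 = extrapolate(75, probe_rate, 45)
cpu_45 = extrapolate(91, cpu_rate, 45)

# 120 is the approximate CPU shutdown point quoted in the post.
shutdown_minutes = minutes_until(91, cpu_rate, 120)

print(round(probe_45), round(cpu_45))  # 91 116
print(round(shutdown_minutes))         # 53
```

On these assumptions the CPU would reach the shutdown point a little under an hour after the AC fails, so any alerting would need to fire well inside that window.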
I'd suggest 70 ambient room temp and 40% relative humidity if at all possible.