10-18-2007 02:35 AM
10-23-2007 02:21 AM
12-04-2007 09:54 PM
Does anyone have the definitive fix for this? I have a brand new ML350 G5 with the same issue - intermittent 1792 notification on boot. I am supposed to go into production with this box in the next week or so and after reading this I am reluctant to do so. I have the latest firmware and drivers (7.91 series) installed and am running SBS 2003 R2 SP2. I don't believe I have the aforementioned MS KB installed but will look into it.
12-05-2007 05:18 AM
12-05-2007 05:49 PM
Thanks Pinnacle CS. I checked both supplies against the Customer Advisory and both are good. I installed hotfix 932755 and will beat the snot out of the system and reboot 50 times before I go into production. The few reboots I've done since the hotfix have been clear of 1792's, although I have not done a lot of I/O on the system. Please do let me know how testing goes with your new G5 as well. Thanks again.
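For anyone wondering how much a run of clean reboots actually proves, here is a rough sketch. It assumes each reboot is an independent trial with the same chance of throwing a 1792, which is probably too simple for an intermittent fault like this one, so treat the numbers as a sanity check only. The function names are made up for this example.

```python
import math


def max_failure_rate(clean_runs: int, confidence: float = 0.95) -> float:
    """Largest per-reboot failure probability that is still consistent with
    `clean_runs` consecutive clean reboots at the given confidence level.

    Assumption: reboots are independent and the failure rate is constant.
    """
    if clean_runs <= 0:
        raise ValueError("clean_runs must be positive")
    alpha = 1.0 - confidence
    # Solve (1 - p) ** clean_runs = alpha for p.
    return 1.0 - alpha ** (1.0 / clean_runs)


def runs_needed(max_rate: float, confidence: float = 0.95) -> int:
    """Number of clean reboots needed to rule out failure rates above max_rate."""
    if not 0.0 < max_rate < 1.0:
        raise ValueError("max_rate must be between 0 and 1")
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - max_rate))


if __name__ == "__main__":
    for n in (20, 50):
        print(f"{n} clean reboots: failure rate below {max_failure_rate(n):.1%}")
    print("clean reboots needed to rule out rates above 5%:", runs_needed(0.05))
```

Under those assumptions, 50 clean reboots only rule out error rates above roughly 6% per reboot, so a fault that shows up once every few dozen boots could still slip through.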
12-11-2007 08:45 AM
HP sent me a new cache board based on output from the ADU, which I knew would not work. It didn't. I'm going to call them back and see what they have to say, which I'm sure will be helpful! ;)
Now I am faced with either leaving the old driver on and dealing with hangs or stops on shutdown, or putting the latest drivers on and waiting for corruption. Great choice.
12-19-2007 09:18 AM
We have the same problem with 5 ML350 G5 servers now. All have SBS 2003 R2 installed and freeze now and again. We have a case with MS and one with HP. We found out that even with the newest SmartStart 7.91 we have bus errors on our hard drives, even when there is no MS OS on the server! HP now says (after 2 weeks of testing) that this is a known bug in firmware 1.66 of the E200i RAID controller and that it will be solved in the next firmware update....
We suggested upgrading the E200i to a P400 controller to help our customers, but no can do. We have to test and test and test to solve HP's problem. We have spent over 2 weeks of hours on this problem and 4 servers are in our office waiting to be completed. Installation is postponed... Clients are not very happy.
We see the problem with servers that have a lot of disk activity. I will keep you informed if there is a solution.
01-02-2008 12:05 AM
As you can read in my post dated September 9, 2007, you had almost the same experience I had, but it is quite funny to discover that the driver that seems to be stable is 18.104.22.168 (I got these results using release 22.214.171.124).
Anyway, I can confirm that AFTER DISABLING ACCELERATOR (I suppose this means using a write-through algorithm in "HP language"), ALL HAS BEEN WORKING without problems since the end of August (but I didn't apply any new patch or driver, waiting for some "official" solution).
I am very interested in knowing if this workaround solves your issues.
After reading all these (dramatic) posts, this is my definite opinion: the E200i controller (hardware+firmware+driver) is BAD, so the best thing you can do is: DON'T BUY IT.
I am very interested in knowing if using a P400 controller is the ultimate solution as expected.
If anyone has done any testing with a replacement for the E200i controller, please let us know.
01-02-2008 08:07 AM
I concur with your findings. Disabling the write cache does indeed stop the 1792's from occurring.
I had a follow-on issue and unfortunately cannot say what caused it, but I am suspicious of the old driver that I was using. I found the machine crashed one night, with no indication of the cause evident in any of the logs. It had corrupted data on the partitions.
I decided to reload everything from scratch and retrace my steps. What I found makes me somewhat suspicious of SBS 2003 vs. the HP controller, but perhaps it is the combination. I found that the 1792's happen after every reboot once the first phase of the SBS install is performed (domain controller, etc.). When the remainder is installed (R2, patches, etc.) the problem becomes intermittent again, which makes me wonder if SBS is not shutting down properly and you only see it with a controller that has a battery-backed cache. At this point, I gave up and purchased a P400 controller w/o bbc (for a lot of reasons) just to get going.
I agree that the e200i hardware-firmware-driver combination is weak. I would add that HP's support is also Very BAD. I sent an engineer the logs he requested approximately three weeks ago and have heard nothing back (and I paid for 7x24-4hr). I am going to call them and complain today.
01-02-2008 10:13 PM
HP have replaced the mainboard, and the battery backed up cache. One thing we did notice is that if we run a disk i/o stress test, it works on the RAID 1 config, but not on the RAID 5 config (system locks up after 10 seconds). We did this test with the HP tech standing next to us, so he could see the results himself. He is going to source another RAID controller so we are not using the e200i - will let you know once this has been done.
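The post above doesn't say which stress tool was used. For anyone who wants something quick to throw at the array, here is a minimal write-then-verify sketch. It is not the test described above, just an illustration: the function name, file names and sizes are all made up, and because the read-back may be served from the OS file cache rather than the disks, a clean result proves little. It mainly generates load and catches gross corruption.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory: str, files: int = 8, size_mb: int = 4) -> list[str]:
    """Write random files, flush them to disk, read them back and return
    the paths whose contents no longer match what was written."""
    expected = {}
    for i in range(files):
        path = os.path.join(directory, f"stress_{i:04d}.bin")
        data = os.urandom(size_mb * 1024 * 1024)
        with open(path, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # ask the OS to push the data to the device
        expected[path] = hashlib.sha256(data).hexdigest()

    mismatches = []
    for path, digest in expected.items():
        with open(path, "rb") as fh:
            actual = hashlib.sha256(fh.read()).hexdigest()
        if actual != digest:
            mismatches.append(path)
        os.remove(path)  # clean up the test file either way
    return mismatches


if __name__ == "__main__":
    # Point this at a directory on the array under test; a temp dir is used here.
    with tempfile.TemporaryDirectory() as tmp:
        bad = write_and_verify(tmp, files=4, size_mb=1)
        print("mismatched files:", bad)
```

To load the controller seriously you would raise `files` and `size_mb` well beyond the RAM size and run it in a loop on both the RAID 1 and RAID 5 volumes.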
01-02-2008 10:49 PM
I get the same warning during POST (1792) on a Windows Server 2003 R2 installation, so I don't think the problem comes from SBS itself, but affects all Windows installations. I don't know if the problem exists in Linux too, but anyway HP servers are certified to work with Windows O.S.
Yesterday I had a long talk with the reseller's technical support (HP Certified Partner) and we decided to try to replace the internal E200i with a P400 controller, probably with BBC. I hope this will be the ultimate solution.
As I already said in a previous post, HP technical support is worse. They actively cause damage.
Maybe next week they will call you asking if you fixed your issue, like they did with me..... :-(
01-03-2008 11:24 AM
Please let me know what you find if indeed you get to test a P400 w/bbc. I'd be interested in the results.
As to SBS vs. generic Windows being an issue, I did a basic load of Windows 2003 server (no domain, DNS, etc.) as a sort of control for the test. No 1792's. My approach wasn't all that scientific but it makes me wonder about SBS' role in this.
01-03-2008 11:35 PM
We encountered the lockups on heavy disk i/o, too, even after applying all updates from SmartStart, SupportPack and Firmware Maintenance CD.
Also, on one of our servers the RAID array would vanish after a single disk failure and had to be rebuilt from backup.
HP support would first assume the WD SATA drives we're using were not supported by HP but then had to realize that that's actually what they're shipping.
After exchanging reports from various analysis tools, HP confirmed the bus errors and now blames an inconsistency between the E200 BBWC and SATA drives regarding Native Command Queuing.
Now we're scheduled to get the E200 replaced with something else (most probably an E400) and the SATA drives with SAS drives.
Interesting to learn that it's actually a firmware issue (i.e. easy to fix), looking at the cost this is incurring on us as well as on HP.
01-03-2008 11:39 PM
The server would start to boot, and then just fail (we had only installed the card and hadn't attached any drives to it as yet, as per HP support's instructions). We tried numerous things, all to no avail. HP are going to come back next week with another P400 and see if they can get this working.
01-04-2008 12:02 AM
I will keep you informed about any developments, but I think it will take some time.... I hope to replace the controller before the end of January, but I am not sure about it.
Anyway, I always get the 1792 warning on a Windows Server 2003 *R2* SP1, a different release of Windows compared with SBS2003 Standard R2 which runs Windows Server 2003 (R1) SP1 + SP2 update, but in this case the warning *seems* harmless (I had no problem with disks).
This server is a Domain Controller (AD+DNS+DHCP+WINS) with two SAS disks (RAID 1); the SBS server uses four SAS disks (RAID 1+0).
01-06-2008 11:58 PM
Today we installed another P400 controller, ran our stress tests, and the server passed with flying colours! Yay...
Now all we need to do is convince HP that it is some fault with the E200i (be it drivers or firmware - I don't really care) on the ML350 G5, get them to supply the part (P400 controller) for free (and compensate my colleagues and me for the many hours we have wasted!).
We also need to get another P400 for our other client that is experiencing the exact same issues on the exact same hardware.
01-07-2008 03:12 PM
They told me (like HP technical support did) this is a software issue caused by an **improper configuration** of the server. They made this decision after our talk; they never had a look at the server!
Maybe working with HP products causes this kind of intellectual damage? ;-)
So, considering that the customer doesn't want to pay more than 500 Euro for a new controller, I think this could lead to legal action....
Please let us know any news about the P400 testing (with BBWC?).
01-07-2008 03:50 PM
No BBC on the P400, just the standard 256 MB cache. One thing I was reading about the E200i - it will support RAID 5 with 128 MB cache - both our servers had this upgrade in place, but still failed.
I will let you all know how we go with HP in regards to getting these parts for free - and also what sort of response we get from HP about the inability of the E200i to work properly.
01-08-2008 07:58 AM
About a month ago we needed to purchase a replacement server for one of our applications so we knowingly purchased an ML350 w/E200 and BBWC. I did some testing on this server. It was a plain Jane Windows 2003 build. I intentionally did not put SP2 on it. I tested w/7.9 and after 20 reboots and gigs of data, I did not receive the error. I put SP2 on the server and after just two reboots, I got the error message. I was not able to duplicate it by performing any particular action. I installed the Storport update from MS, and another 20 reboots and gigs of data later, no error. The ML370 has been in production for 3 months and the ML350 for about a month. Knock on wood, we haven't had any issues with either. Fortunately for my customer, I caught this and questioned it before it went into production. I've built enough servers and been in this industry long enough to know that any Array Controller status message after the server is built is not a good thing, regardless of what Tech Support says.
IMHO, I believe SP2 is the culprit and the issue occurs during shutdown. I have neither seen nor heard of anyone having the same issue with a Linux or Novell based host. I don't think the SCSI subsystem in Windows is working with the driver correctly. I have not been able to replicate this issue since I installed the updated Storport driver from MS. I wish I had time to build a Novell or Linux box and test with that.
BTW, I second everyone's opinion on HP's support. The techs I dealt with didn't seem to know the product or care about the outcome, and I see it time and time again. Why doesn't HP see that? Oh wait, the only thing most execs see is green. Sorry......
Hope this helps.
01-11-2008 09:16 AM
01-11-2008 09:36 AM
01-11-2008 10:01 AM
I'm running firmware 1.66 on the E200i with driver 126.96.36.199. I manage about 30 of these ML350 G5's at various clients. They are all identical and were all purchased within 6 months of each other. Almost all report 1794 errors, which are preceded by 1792 errors, saying the battery charge is low. Fortunately, I'm running SAS drives, so I hope not to experience the failures of the entire SATA RAID array that were reported above when a single drive fails. I want to get these tech issues resolved, but I also want to do something to send a loud message to HP corporate so it doesn't happen again. You know that many of their components are IBM, right? It's just how they integrate that's different. God knows IBM has been sucking lately too.
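With that many boxes it may help to tally which POST codes show up where. Below is a rough sketch that counts 179x codes in lines of text, for example boot messages you have copied into a file yourself. The input format is an assumption: it expects each message to start with the four-digit code followed by a hyphen, as in the 1792 message quoted in this thread. The sample line for 1794 is a placeholder, not the real wording.

```python
import re
from collections import Counter

# Matches a line that begins with a 179x POST code followed by a hyphen.
POST_CODE = re.compile(r"^\s*(179\d)-")


def count_post_codes(lines):
    """Return a Counter of 179x POST codes found at the start of each line."""
    counts = Counter()
    for line in lines:
        match = POST_CODE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample input; only the 1792 text comes from this thread.
SAMPLE = [
    "1792-Drive Array Reports Valid Data Found in Array Accelerator",
    "1794-(placeholder for the battery charge low message)",
    "Some unrelated boot line",
    "1792-Drive Array Reports Valid Data Found in Array Accelerator",
]

if __name__ == "__main__":
    for code, count in sorted(count_post_codes(SAMPLE).items()):
        print(code, count)
```

Running it per server and comparing the counts before and after a driver or firmware change would at least give numbers to put in front of support.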
01-13-2008 05:50 PM
Now they have asked for one my techs to go onsite again (so much time being wasted on this issue) so they can try playing around with the e200i settings, turn off write caching or something.
Will let you all know how it goes of course - needless to say (but I will say it anyway) getting really p*ssed off with all this.
02-25-2008 06:57 AM
After disabling it, I got neither a single "1792-Drive Array Reports Valid Data Found in Array Accelerator" message nor any DATA CORRUPTION problem, with any driver version (tested versions: 188.8.131.52, 184.108.40.206, 220.127.116.11).
So, if you get problems with E200i+BBWC controller, DISABLE ACCELERATOR !!!
I hope this helps.
In the meantime I discovered another nasty issue on the ML350 G5, but I will post it in a specific thread........