01-05-2011 06:58 AM
Currently the hpacucli port in the FreeBSD ports distribution is marked as broken for amd64, and trying to install it gives this message:
"===> hpacucli-7.50_3 is marked as broken: currently does not work on amd64 (see ports/128288).
*** Error code 1
Stop in /usr/ports/sysutils/hpacucli."
I am trying to diagnose an issue with my Smart Array P410 rebuilding the RAID 1+0 array using the hot-spare once or twice a week.
I've installed /usr/ports/sysutils/smartmontools hoping it would provide some useful data. However, during/after the array rebuild I've checked each of the 6 SAS drives in the array, plus the 7th hot-spare, and the only real errors I can spot are the following:
"Log Sense for temperature failed [scsi response fails sanity test]"
Though, each drive that reports this also reports: "SMART Health Status: OK"
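For anyone repeating the per-drive check, the smartctl output can be boiled down to one line per drive so a failing disk stands out. This is only a rough sketch under assumptions: that the drives are reachable through smartctl's `-d cciss,N` device type on a controller node such as /dev/ciss0, and that they are numbered 0 through 6. CTRL, DRIVES, summarize_drive and check_all_drives are made-up names, not part of smartmontools.

```shell
#!/bin/sh
# ASSUMPTIONS (not from the thread): controller node and drive numbering.
CTRL="${CTRL:-/dev/ciss0}"
DRIVES="${DRIVES:-0 1 2 3 4 5 6}"

# Reduce one drive's smartctl output (read from stdin) to a summary line:
# the reported health status plus a count of "Log Sense ... failed" messages.
summarize_drive() {
    awk -v id="$1" '
        /SMART Health Status:/ {
            sub(/^.*SMART Health Status:[ \t]*/, "")
            health = $0
        }
        /Log Sense .* failed/ { warn++ }
        END {
            printf "drive %s: health=%s log_sense_failures=%d\n",
                id, (health == "" ? "unknown" : health), warn
        }
    '
}

# Query every drive behind the controller and print one line per drive.
check_all_drives() {
    for n in $DRIVES; do
        smartctl -a -d "cciss,$n" "$CTRL" 2>&1 | summarize_drive "$n"
    done
}
```

Running check_all_drives during a rebuild and again afterwards gives two short listings that are easy to compare.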
FreeBSD has a page dedicated to HP Proliant servers located here: http://people.freebsd.org/~jcagle/
The author lists two HP engineers responsible for the FreeBSD ports of the Proliant software:
* hpasmd: Soumitri Kadambi (firstname.lastname@example.org)
* hpacucli: Sri Sai Ganesh Venkataramani (email@example.com)
I have attempted to contact both engineers over the last 18 months, but have not received a single response to my inquiries.
01-05-2011 09:22 AM
01-05-2011 10:42 AM
I can't afford the downtime! This system is a back-office system in production for a CFTC regulated Commodities Exchange.
I wrote a simple shell script to run cciss_vol_status once a day and email me when the array status returns UNHEALTHY. The problem is that by the time I arrive at the office in the morning, receive the UNHEALTHY alert email, and finally make my way to the server room, the array has finished rebuilding and all LED indicators show good status.
I have twice checked in on it remotely during the night session when the script returned UNHEALTHY, and cciss_vol_status does say that the array is rebuilding using the hot-spare. Though, when I get in to the office in the morning to physically check the machine, the array status is HEALTHY with the hot-spare returned to available status.
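Because a rebuild can start and finish between two checks, one option is to run the check every few minutes from cron and log each change of state with a timestamp, so the overnight rebuild leaves a trail. A minimal sketch, not the script described above: it assumes the status tool prints a "status: OK" line for each healthy logical drive (verify this against your own output), and STATE_FILE, LOG_FILE, classify_status and log_transition are hypothetical names.

```shell
#!/bin/sh
# Hypothetical paths; override via the environment.
STATE_FILE="${STATE_FILE:-/var/tmp/array_state}"
LOG_FILE="${LOG_FILE:-/var/tmp/array_state.log}"

# Read status text on stdin. Print HEALTHY only if at least one "status:"
# line is present and every such line reports OK; otherwise UNHEALTHY.
# ASSUMPTION: healthy volumes are reported as "status: OK".
classify_status() {
    awk '
        /status:/ { seen = 1; if ($0 !~ /status: OK/) bad = 1 }
        END { if (seen && !bad) print "HEALTHY"; else print "UNHEALTHY" }
    '
}

# Append a timestamped line to the log only when the state differs from the
# last recorded one. Returns 0 on a change, 1 when nothing changed.
log_transition() {
    new_state="$1"
    old_state="UNKNOWN"
    if [ -f "$STATE_FILE" ]; then
        old_state=$(cat "$STATE_FILE")
    fi
    if [ "$new_state" != "$old_state" ]; then
        printf '%s %s -> %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$old_state" "$new_state" >> "$LOG_FILE"
        printf '%s\n' "$new_state" > "$STATE_FILE"
        return 0
    fi
    return 1
}
```

From cron, the output of something like `cciss_vol_status /dev/ciss0` (an assumed device node) would be piped into classify_status and the result handed to log_transition, with the alert email sent only when log_transition reports a change. The log then shows when each rebuild began and ended.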
I've received two alerts from HP regarding the Smart Array P410i controller:
However, I have been unable to ascertain from the notes whether these updates pertain to my issue. As such, with these servers in production, I have neglected to research patch methods under FreeBSD (I suspect I will have to create a LiveCD running Windows or Linux, since UNIX and UNIX-like support seems to be restricted to HP-UX).
Thanks for the response!
01-05-2011 10:53 AM
Also, HP used to support FreeBSD by porting the utilities I listed above, but the engineers responsible for the ports have gone off-grid and do not respond to email.
FreeBSD currently supports the HP Smart Array P410i through the generic CISS driver.
I can pull SMART data from the individual drives using smartmontools, but I want to know more about the RAID array. I want to see a log of what caused the array to rebuild, etc. To do that I *think* I need hpacucli updated to work with amd64 systems.
01-05-2011 11:02 AM
I'm pretty sure I have at least 2 FreeBSD boxes out in the wild that have a very good potential to live longer than myself! :)
PS) My laptop is running Slackware 13.1
01-05-2011 11:21 AM
BTW, BSD is my Alma Mater OS (along with the old SunOS) Sir...
Good OS... very good OS...
01-05-2011 11:44 AM
Yes very strange!
I'm currently building a FreeBSD 7.2-RELEASE (x86) LiveCD with the hpasmd/hpacucli applications installed.
I've put in a downtime request for that particular machine, and I'm hoping to get in queue for the window this weekend so I can test the LiveCD.
Are you aware of any other RAID tools for FreeBSD that might help me diagnose the underlying issue? I found some CAM and PASS driver utilities that I might try to install on my test server, but I'm still keeping my hopes up that someone will track down one of the engineers that ported those utilities!
01-05-2011 04:58 PM
Again, this machine has also had two SAS drives fail and get sent back to HP. So my gut tells me it's a bad backplane or controller.
01-06-2011 12:00 PM
I'm trying to build a bootable USB stick with the Smart Array firmware found here:
I will try to apply it using the tools on the Firmware Maintenance disc I found here:
I'm not 100% sure if the version referenced is the same as the "SmartStart" disc that came with the systems, but my discs say Version 8.20 and the newest firmware says to use something later than Version 9.00. The Firmware Maintenance disc says Version 9.20.
The HP bootable USB tool was having an issue reading the .ISO (md5 checksums matched... odd...) so I'm currently burning it to DVD, and will try to create the USB image using that option instead.
Anyway, assuming it's the same utilities, I will be testing the new Smart Array firmware on our test Proliant this weekend (I wasn't able to get in a maintenance window for the problematic machine.) So, I'll test the firmware on the test machine, and put in a request for next weekend (Jan 15th) to down the troublesome server.
Thank $Deity I ordered three of these! :)
01-10-2011 05:32 AM
I updated the firmware on the test machine first, and it went smoothly and by the book. I noticed that it also seemed to have updated the main BIOS (wasn't expecting that, but cool!).
I'll run this in testing for a week, and hopefully I'll get the downtime this weekend to apply these updates to the symptomatic box.