10-21-2011 02:31 AM
I have a pair of rx7640s hooked up to a CX4-960 Clariion. Along with a multitude of other headaches, ioscan -fnC disk is taking forever to return (sometimes 5 to 7 minutes).
The Clariion is set up with 6 LUNs through a Brocade 5100B switch. I can see no issues with the fiber cards on the system (AH401A). Any ideas?
10-21-2011 03:56 AM
What OS release are you running? If you are on 11.31 (11iv3) you can issue the following to at least determine what element is taking so long to scan:
ioscan -P ms_scan_time
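To make the slow element easier to spot, the per-element scan times can simply be dumped and eyeballed. A rough sketch (the exact column layout varies a little between 11.31 releases):

```shell
# Per-element scan times from the last ioscan (11.31 and later only).
# Look for entries in the seconds-to-minutes range -- on a healthy
# system everything should be in milliseconds.
ioscan -P ms_scan_time | more

# A full hardware scan can then be timed for comparison:
time ioscan -fnC disk
```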
10-21-2011 08:16 AM
I ran that command (I'm running 11.31). Everything comes back in milliseconds, except for anything coming off our fiber cards. All the lunpaths and EMC disks are in the 2 to 3 minute range.
Any ideas where to go from here?
10-21-2011 11:22 AM - edited 10-24-2011 03:06 AM
Unless you're looking for newly created LUNs, run ioscan with the -k switch to scan the kernel structures rather than the actual hardware: ioscan -kfnC disk
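For reference, a minimal sketch of the difference (the agile-view variant is assumed from the 11.31 ioscan options):

```shell
# Cached kernel view: returns almost instantly, but does not probe
# the hardware, so newly presented LUNs will not appear.
ioscan -kfnC disk

# The same cached view in 11.31's agile (lunpath-based) addressing:
ioscan -kfNC disk

# Only a real scan (no -k) walks the hardware and picks up new LUNs.
```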
10-22-2011 09:45 PM
That worked as advertised! However, upon further frustration, I found that anything fiber-related is very, VERY slow to respond. That includes any powermt command, starting or stopping the Unisphere agent, or even just listing the fiber cards. Booting the system is very slow too; it seems to hang while starting and configuring PowerPath or Unisphere.
I'm trying to rule out my system(s) that are hooked up to the CX4.
10-23-2011 03:50 AM
Glad that helped, Ron. When you get an answer that you like, you can reward the responder by clicking on the kudos star.
Pete (I work for kudos) Randall
10-23-2011 04:53 AM
Kudos awarded, Pete. Thanks again. Any thoughts on my response? We're getting to a critical phase here. I am working overseas right now, and have to leave in 3 days for a medical semi-emergency. I'd like to get at least connected before I go. At least from there I can VPN in and work from whatever morgue I'm lying in.
10-23-2011 10:11 AM
Thanks Pete. I appreciate your help. This is getting so frustrating, especially with the language barrier, that I'm beginning to spell EMC using different letters.
10-23-2011 12:12 PM
I'm not a SAN administrator, but here are some thoughts:
Clariion is EMC's line of active/passive arrays. This means each disk/LUN should normally be accessed only through the SAN path that goes through its primary storage controller. If the non-primary controller is used to access the disk, it triggers a "trespass event" in the storage system: the primary controller hands over control of the disk/LUN to the other controller. The controller cannot serve disk requests while a trespass is being handled, and the design intent is that it should happen only when there are problems with the primary controller or the SAN path(s) leading to it. Your storage administrator should be able to check whether there is an excessive number of trespass events.
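If PowerPath is installed on the host, a rough host-side check is sketched below; the array-side trespass counters (via Unisphere or naviseccli) remain the authoritative record:

```shell
# Show every PowerPath pseudo device with its owning SP and the state
# of each path.  An "owner" SP that keeps changing between runs, or
# paths flapping between alive and dead, is consistent with trespass
# thrashing.
powermt display dev=all
```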
HP-UX 11.31 can handle this type of storage, although according to EMC documentation, a March 2008 or later release of 11.31 is required. With an active/passive array and HP-UX 11.31, you should strongly prefer the new agile DSFs (/dev/disk/diskNN or /dev/rdisk/diskNN) over the old legacy DSFs (/dev/[r]dsk/cXtYdZ). You might even want to disable the old legacy DSFs completely (rmsf -L).
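A cautious sketch of that cleanup, assuming nothing on the host still references the legacy names:

```shell
# First map each legacy DSF to its agile equivalent, so you know
# exactly what you are about to remove:
ioscan -m dsf

# Remove all legacy DSFs and disable the legacy naming model:
rmsf -L

# If something breaks, legacy mode can be re-enabled with:
insf -L
```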
We once had an 11.31 system connected to an active/passive array from another manufacturer. It was also abysmally slow, until we found that our standard monitoring package (BMC Patrol with an old version of the Hardware Sentry module) was using the legacy DSFs to monitor the state of every disk it saw. It repeatedly polled each legacy DSF... and since each legacy DSF is tied to a specific SAN path, this kept triggering trespass events in the storage, and those frequent trespasses totally ruined performance. The fix was 1) to disable the legacy DSFs to prevent any other program from causing the same problem, and 2) to update the Hardware Sentry module to a version that could use the new agile DSFs to monitor the local disks.
Either you or your SAN administrator (or whoever has the required EMC Powerlink account) should log in to the powerlink.emc.com website and find the latest version of a document titled "EMC Connectivity Guide for HP-UX". It is located in path:
Home > Support > Technical Documentation and Advisories > Host Connectivity/HBAs
It has a chapter titled "Configuration requirements for VNX series and CLARiiON support with 11iv3" (starting at page 186 of the current version of the document). With your SAN administrator, you should carefully double-check each point mentioned: in my experience, active/passive arrays require more attention to configuration details than active/active ones.
Note that those requirements don't mention PowerPath at all, so I assume you must comply with them whether you're using PowerPath or not.
10-24-2011 12:47 AM
MK's advice is spot on - the problem you describe sounds very much like a Clariion where the host modes are not set correctly for 11iv3 (the settings differ from 11iv1 and 11iv2), resulting in the inactive paths being constantly trespassed by the MPIO subsystem round-robining across them - which in turn causes the Clariion to constantly move the "owning controller" for any given LUN.
One quick and easy way to check whether things are set up correctly - execute the following against one of your agile DSFs for the Clariion storage:
scsimgr lun_map -D /dev/rdisk/diskXX
You should see the same number of paths listed STANDBY as you see ACTIVE (standard characteristics for an ALUA disk array like the Clariion). If you don't, then the Clariion host modes are not configured correctly for HP-UX 11iv3.
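As a hedged illustration (disk14 and the path counts below are placeholders, not your actual configuration):

```shell
scsimgr lun_map -D /dev/rdisk/disk14

# With four lunpaths to a correctly configured ALUA Clariion you would
# expect roughly half in ACTIVE state (paths via the owning SP) and
# half in STANDBY (paths via the peer SP).  All paths ACTIVE, or paths
# stuck in a failed state, points at wrong host modes on the array.
```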
10-25-2011 06:25 AM
I ran the command, but it complained that I was still using the legacy DSFs. If I run the rmsf command, will that immediately get rid of the old names and bring in the new ones, or (I imagine) do I have to run insf to see the new /dev/disk/disk## files?
I'm only here a couple more hours, then it's 2 days of travelling. I'll have to look into this when I get home. I appreciate any input while I'm in transit, though.
10-25-2011 06:55 AM
There is an excellent whitepaper (and a script: '/usr/contrib/bin/vgdsf') to guide you in migrating from legacy to agile devices:
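A sketch of how that script is typically used (vg01 is a placeholder volume group name, and the exact options are documented in the whitepaper, so double-check them there before running anything):

```shell
# Convert a volume group's LVM configuration from legacy to agile DSFs:
/usr/contrib/bin/vgdsf -c /dev/vg01

# Verify that the PVs now show as /dev/disk/diskNN devices:
vgdisplay -v /dev/vg01 | grep "PV Name"
```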
11-02-2011 09:19 AM
I took a brief look at that white paper. I even tried to run the script on another server (a BL870c) that is not hooked up to the Clariion. The first thing I noticed was that it did not change anything on the server (perhaps a reboot is required?). Secondly, the script seems to change device file names only for disks that are already set up in a volume group. The Clariion has LUNs set up on it, but we have not been able to get as far as configuring those LUNs on the system. Is there anything else I may be missing? I'm leaning towards the switch: the system dmesg shows both fiber cards losing, then regaining, connection. If it were only one of the cards, I would replace it. But both?
11-10-2011 11:48 AM
Are you using PowerPath? On my previous HP-UX install with Clariion storage, we were running 11.31 and performance was better without PowerPath.