03-23-2012 10:32 AM
Does anyone even use hpux11.3+Sybase+Itanium+Vnx?
I am wondering how obscure my problems actually are. Is this a customer base of one? Just me?
My crazy performance issue has continued since mid February. With the very painful, chaotic speed changes (2x faster, or 600x slower), I am surprised that I have not seen anyone with this type of problem.
I'll give the condition and not any of the details. Perhaps someone will recognize the symptom.
I have a computer that writes data to a disk array. But every so often, at VERY random times, it decides to throttle the data transfer to a trickle, making 2k writes take 1/5 of a second each.
The array says it is idle. And it sure looks like it.
Glance says there is a 100% chance of a disk bottleneck.
Sybase says it is waiting on the disks.
We stop everything. We start it. It runs great. We think it was some unknown glitch. Then as we say this, it starts to grind to a halt again.
It is like there is an imaginary little devil that tells the disks there is nothing to do, while at the same time telling the database that the disks are too busy to do anything.
03-26-2012 06:55 AM
You will need to log an HP support call.
But 2 things.
1. There is the pretty well-known select system call issue, introduced with patch PHKL_41700. PHKL_41700 gets installed with the March 2011 and Sept 2011 releases of HP-UX 11.31.
PHKL_41700 11.31 fs_select cumulative patch
- PHKL_41700 introduced behavior whereby on systems that
do not enable high resolution timers, applications
that use the select(2) system call as a timer may
suffer performance degradation. If the performance
degradation disappears after setting the kernel
then the performance degradation observed is caused by
the changes introduced in the select(2) system call
code path provided by PHKL_41700.
- This behavior is known to affect the performance of
DataProtector, Quest Shareplex, Oracle SQL, Veritas
Cluster Server, xntp (Network Time Protocol) and
xterm(1) commands. Furthermore, any application that
uses the select(2) system call with zero timeout and
a value for nfd > 0 will experience this behavior.
solution : install successor patch PHKL_41967
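If you want to check whether one of your own applications matches the call pattern in the patch text, here is a minimal Python sketch of a select(2) call with a zero timeout and nfd > 0. The pipe and the function name poll_readable are made up for illustration. It only shows the shape of the call; it does not reproduce the slowdown.

```python
import os
import select


def poll_readable(fd):
    """Return True if fd has data waiting, without blocking.

    A timeout of 0 makes select() return immediately, and passing one
    descriptor means nfd > 0, which is the pattern named in the patch text.
    """
    readable, _, _ = select.select([fd], [], [], 0)
    return fd in readable


if __name__ == "__main__":
    r, w = os.pipe()
    print(poll_readable(r))  # nothing has been written to the pipe yet
    os.write(w, b"x")
    print(poll_readable(r))  # one byte is now waiting to be read
    os.close(r)
    os.close(w)
```

An application that calls something like this in a tight loop, using select as a poll or a timer, is the kind the patch text describes.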
2. There is a not very well-known utild issue, introduced with the Sept 2011 release of HP-UX 11.31.
utild comes with the utilprovider fileset.
# swlist -l fileset|grep -i utilprovider
The utild version that is introduced with the Sept 2011 HP-UX 11.31 release is A.01.08.06.01.
This A.01.08.06.01 utild version will try to send, per utild "request", 8 SCSI inquiry IOs on all lunpaths of all LUNs, every 5 seconds. "Try", because utild will only be able to send the SCSI inquiry IO if it can open the lunpath device file with the open system call. If the lunpath device file is already opened, because it's part of an active volume group or because some application has the lunpath device file "opened", no SCSI inquiry IO will be sent through that device file.
symptoms : on systems which have problems with this, utild will be one of the top 10 CPU-using processes. On systems with problems, sar -H 1 10000 will show 100's of very small IOs sent, while sar -L and sar -d don't show a comparable amount of data IOs sent to the disk array. SCSI inquiry IOs are typically between 128 and 256 bytes in size. SCSI inquiry IO is only displayed through sar -H, not through sar -L and sar -d. Data IO is displayed through sar -H, sar -L and sar -d.
workaround : utild is started through /etc/inittab on multiuser level 2. Change the util entry in /etc/inittab to only start utild on multiuser level 4 and do an init q.
solution : upgrade to utilprovider fileset A.01.08.06.03, downloadable from software.hp.com.
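To get a feel for how much inquiry traffic the utild behaviour above can generate, here is a rough Python sketch. It assumes each request sends 8 inquiries down every lunpath and that one request happens every 5 seconds, which is my reading of the description above. The function name and the 100 LUN x 4 path layout are made-up examples, not numbers from your system.

```python
def inquiry_rate(luns, paths_per_lun, inquiries_per_path=8, interval_s=5):
    """Upper bound on SCSI inquiries per second sent by utild.

    Assumes every lunpath device file can be opened. Lunpaths already
    held open (active volume groups, applications) carry no inquiries,
    so the real rate will be lower than this.
    """
    lunpaths = luns * paths_per_lun
    return lunpaths * inquiries_per_path / interval_s


if __name__ == "__main__":
    # Hypothetical layout: 100 LUNs with 4 paths each.
    print(inquiry_rate(100, 4))
```

With a few hundred lunpaths this lands in the range of hundreds of small IOs per second, which is the sort of count sar -H shows on affected systems.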
04-02-2012 07:14 AM
I have not responded to your suggestions because I do not want to jinx myself. Things have been running well so far since the PHKL patch and the up-whatever software. Your suggestions did not hurt anything. And there are other things outside of HP-UX that have affected performance, like where to put datalogs.
As soon as I say everything works great, it will DIE again, just like the other 2 times it did before. So I'll have to type quietly so the computer can't hear me say.....
I think your suggestions may have fixed my problem.
In a few weeks I hope to really know the truth and can hopefully give a better account of the resolution of my crazy performance issues.