05-23-2011 12:47 AM
We have a big problem with our brand-new EVA8400 system. It is a version with 22 GB cache, 27 enclosures and 260 x 600 GB FC 15k disks (one disk group). Only VMware hosts running vSphere 4.0 are attached.
Every evening at 10 PM, when the backup jobs start, the EVA generates high write latencies and the ESX hosts get path failures. I have studied all the available documents about best practice, best performance and so on, but the most useful information is what I could see with EVAPerf: both CPUs went to 99% and then to 0% for a few minutes. At the same time the write latency of the disk group went to 250 ms on controller A and 350 ms on controller B. The result is timeouts on the ESX hosts.
Now my question: can anybody explain why the CPUs went idle while they had a total of 16000 host requests at 300 MB/s?
Many thanks for any ideas,
Norbert
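As a rough sanity check on those figures, the average transfer size per request can be estimated from throughput divided by request rate. This is only a sketch: it assumes the 16000 host requests are per second and that 1 MB is counted as 1024 KiB, neither of which the post states, and the function name is made up for illustration.

```python
def average_io_size_kib(throughput_mb_s: float, requests_per_s: float) -> float:
    """Average transfer size per request in KiB (assumes 1 MB = 1024 KiB)."""
    if requests_per_s <= 0:
        raise ValueError("requests_per_s must be positive")
    return throughput_mb_s * 1024.0 / requests_per_s


# Figures quoted in the post; "per second" is an assumption.
avg = average_io_size_kib(300, 16000)
print(f"{avg:.1f} KiB per request on average")
```

An average hides the mix: a few very large backup transfers among many small VM I/Os would give the same number, so this cannot confirm or rule out the large-transfer theory on its own.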
05-23-2011 06:15 PM
I hope you are talking about VCB. Which host are you using for VCB? If it is Windows 2008, then this is a known issue: you need to fine-tune the FC HBA driver on the host. The backup is a serial I/O job, and with 2008 it induces larger I/O transfer sizes, which is causing the issue.
The latest FC HBA driver documents the I/O transfer size setting on the HBA (preferably 128k), which should get rid of this issue.
Based on the HBA model, fine-tune the parameter so that you will not get the issue.
05-23-2011 10:08 PM
- c02697105 - HP StorageWorks Enterprise Virtual Arrays - VMware Hosts Reports Multitude of 'Task Set Full' Error in vmkernel Log
05-24-2011 06:38 AM
The EVA system is not able to handle and process the amount of data. FYI, we have 20 EVA systems worldwide, from EVA5000, 6000 and 8100 to EVA8400, and no other machine has such performance problems. It's very strange; maybe it's a bug in the controller firmware (09534000)?
05-24-2011 07:20 AM
How many vSphere physical servers are hooked up, and what kind of servers? Are these blade-based servers?
Also - how many FC Connections do your Linux Physicals have total?
Maybe you are just saturating your EVA?
And lastly - what kind of Virtual Machines do you run on these Virtual Environment? Windows solely or a mix of Linux, Solaris, etc?
05-26-2011 06:47 AM
05-26-2011 07:12 AM
Also, are these ESX physical servers blades? If so, how many fibre runs go from the blade enclosures to the SAN/EVA?
How many Vdisks are presented as datastores to your VI so far, and what size are they?
Do you use RDMs?
Things to check and suggestions:
- Check multipathing of the Datastore LUNs
- Adjust Queue Depths as well
- Make sure your EVA Vdisks alternate between controllers
And finally, find out the heavily taxed Vdisks via EVAperf -- something should come out of there as far as further tuning/adjustments etc.
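To get a feel for why queue depths matter here, Little's law (outstanding requests = arrival rate x time in system) gives a rough estimate of how many I/Os are in flight. The sketch below plugs in the figures from the first post and assumes the 16000 host requests are per second and that every request sees the quoted write latency, which overstates the real number; the function name is hypothetical.

```python
def outstanding_ios(requests_per_s: float, latency_ms: float) -> float:
    """Little's law: average number of requests in flight."""
    return requests_per_s * latency_ms / 1000.0


# Upper-bound style estimate: applies the 250 ms write latency of
# controller A to the whole request stream (an assumption).
in_flight = outstanding_ios(16000, 250.0)
print(f"roughly {in_flight:.0f} I/Os in flight")
```

Comparing an estimate like this with the sum of the queue depths configured on the hosts shows whether the hosts are even able to keep that many requests outstanding, or whether they are already queueing on their side.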
05-26-2011 07:16 AM
Excuse me, but that does NOT make sense !!
If EVAperf cannot retrieve the data,
it must say so and not give wrong values !!
Competent database systems, for example, offer the NULL value as a means to indicate unknown / missing data.
And no, I don't mean to "shoot the messenger" ;-)
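The point about NULL can be illustrated with a small sketch: represent a sample that could not be retrieved as `None` instead of 0, so that it is excluded from statistics rather than read as an idle controller. This is not how EVAperf works internally; the function and the sample values are made up for illustration.

```python
from typing import Optional, Sequence


def mean_latency_ms(samples: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the known samples; None if nothing could be measured."""
    valid = [s for s in samples if s is not None]
    if not valid:
        return None  # unknown, not zero
    return sum(valid) / len(valid)


# The middle sample is missing (collection failed), not zero.
print(mean_latency_ms([250.0, None, 350.0]))
print(mean_latency_ms([None, None]))
```

Had the missing sample been recorded as 0, the mean of the first series would drop to 200 ms and the gap would look like a real measurement.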
05-26-2011 07:28 AM
11-11-2011 08:43 AM
Are you still having issues with your EVA8400? I also have several EVA6400 and EVA 8400s that are giving me problems that I do not see with my 6100s or 8100s.
We have set the xfer size down to 128 KB, which helps, but we still get high latencies at times and the CPU goes close to 100% on the controllers.
We also had issues with multiple snapshots on a LUN - almost killed us and we had to switch to snapclones to get better performance.
I think they changed something in the cache algorithms.