Performance issues when using VSA on ESX with VMXNET3 driver (14420 Views)
Regular Advisor
Posts: 160
Registered: ‎07-27-2011
Message 51 of 84 (5,448 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Hi Braak,

 

Any update on your support case?

Frequent Advisor
Posts: 45
Registered: ‎10-20-2009
Message 52 of 84 (5,407 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

We stopped using the VSAs. We can't work with the product if it isn't stable and lacks proper support :(

VMware's last piece of advice was to set the --iops parameter to 1.

 

Hope you have more luck

Frequent Advisor
Posts: 45
Registered: ‎10-20-2009
Message 53 of 84 (4,771 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

We'll do some new testing with ESXi 5.1 and LeftHand OS 10 as soon as it's GA (4th of December).

I hope the problems are fixed by now.

 

The new version 10 VSAs have 2 vCPUs and should perform a lot better.

Also see http://marcelbraak.wordpress.com/2012/11/03/saniq-10-lefthand-os-soon-to-be-released/  

Regular Advisor
Posts: 160
Registered: ‎07-27-2011
Message 54 of 84 (4,760 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

I have had very positive experiences with ESXi 5 and 9.5. Like the blog post says, two vCPUs provide a noticeable boost in performance.
I hope to see the write performance increase in 10.0 as I did notice a dip in throughput after upgrading to 9.5.

I think the one thing HP needs to do is add a CMC plugin for the vSphere Client. If I could manage my SANiQ cluster from the vSphere Client, that would be killer.
Advisor
Posts: 20
Registered: ‎05-26-2010
Message 55 of 84 (4,731 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Hi all,

 

I hope that bonding across multiple virtual NICs will finally be supported in VSA 10.0.

With only one NIC the VSA is very "slow"...

Honored Contributor
Posts: 878
Registered: ‎07-20-2011
Message 56 of 84 (4,721 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

For VSAs, why can't you just do the bonding before presenting to the VM? That's what I'm doing in Hyper-V and it works fine.

Advisor
Posts: 20
Registered: ‎05-26-2010
Message 57 of 84 (4,695 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Hi oikjn,

 

Of course, I have bonding before presenting to the VM (VSA), but it is a failover type (active-passive), i.e. I can only see traffic flowing on one NIC at a time in real time.

Honored Contributor
Posts: 878
Registered: ‎07-20-2011
Message 58 of 84 (4,681 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

I don't really follow VMware much... can it not do something like LACP bonding at the host? I would have thought it could. If not, I guess the guest is the only option, but LACP or the equivalent requires the switches to accept an active-active link, so if you don't have that you can't really solve your problem through the VSA anyway.

Frequent Advisor
Posts: 56
Registered: ‎06-25-2010
Message 59 of 84 (4,676 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

VMware can do NIC bonding, but LACP doesn't really give you twice the bandwidth unless you are talking to multiple destinations. It isn't the recommended way of doing things with VMware, as their built-in load balancing and redundancy work better.

Also, you won't get more than 1 Gbps unless you are using VMXNET3, which is what this whole thread is about. All of the other virtual NICs are 1 Gbps, so bonding channels won't get you 2 Gbps of throughput out of a 1 Gbps virtual NIC.

 

Honored Contributor
Posts: 878
Registered: ‎07-20-2011
Message 60 of 84 (4,652 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Understood. My point was purely: why bond in the VM? What method of bonding inside the VM would give you the full bonded bandwidth?

Regular Advisor
Posts: 160
Registered: ‎07-27-2011
Message 61 of 84 (4,247 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

I wanted to update this thread as I have been able to pinpoint the exact cause of the latency problem after repeated testing in two different environments. If some of you could try to duplicate my findings that would be great.

I found that the high latency is caused by setting the IOPS per path lower than the default of 1000 using the command:

esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxx --iops 1 --type iops

Watching the device latency in esxtop shows that after applying this setting, the latency to my SANiQ volumes increases dramatically for any virtual machine that happens to be running on the same host as the gateway VSA. In some cases the latency is in the thousands of milliseconds.
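
For anyone who wants to watch the same counters, this is roughly how I'm looking at it in esxtop (just a sketch, the columns are the standard ones):

esxtop
# press 'u' for the disk device view
# DAVG/cmd is device latency, KAVG/cmd is kernel latency, GAVG/cmd is roughly what the guest sees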

Changing the IOPS setting on the fly and watching esxtop can reproduce or eliminate the high latency at will.
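
If you want to toggle it yourself, something along these lines should work; substitute your own device ID for naa.xxx:

# show the current Round Robin settings for a device
esxcli storage nmp psp roundrobin deviceconfig get --device=naa.xxx

# reproduce the problem: switch paths after every single IO
esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxx --iops=1 --type=iops

# revert to the default of 1000 IOs per path
esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxx --iops=1000 --type=iops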

If someone else could try this and verify they see similar behavior that would be awesome.

My configuration mimics what HP recommends: separate vSwitch and four NICs for iSCSI, iSCSI port bindings, PSP set to Round Robin, etc.

Hope to hear back from some of you.
Frequent Advisor
Posts: 56
Registered: ‎06-25-2010
Message 62 of 84 (4,219 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Just to clarify: when the default is used you see low latency, and when you set it to 1 you see high latency?

Have you tried somewhere down the middle, say 100?

 

Regular Advisor
Posts: 160
Registered: ‎07-27-2011
Message 63 of 84 (4,211 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Hi Rons, that is correct. I haven't tried anything in between. The difference is very noticeable, however. I still see occasional blips in latency with the IOPS at 1000, but there's a vast improvement overall.

I think the root cause of the latency is having a VSA portgroup sharing vmnics that are used for the iSCSI port bindings. I am testing to prove that theory now and will report my findings.
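
In the meantime, if anyone wants to check whether their VSA portgroup shares uplinks with the iSCSI port bindings, something like this should show it (vmhba33 is just an example adapter name):

# list the vmkernel ports bound to the software iSCSI adapter
esxcli iscsi networkportal list --adapter=vmhba33

# list each vSwitch with its portgroups and uplink vmnics
esxcfg-vswitch -l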
Occasional Visitor
Posts: 1
Registered: ‎03-06-2013
Message 64 of 84 (4,061 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

For those of you reporting performance issues, how are your vSwitches configured? I ran into a huge latency issue (ESX 5.0 software iSCSI, SANiQ 9.5) when my vmk bound to iSCSI was sharing the same vSwitch as the VSA node... this was with the system largely idle. By simply moving the VSA to its own vSwitch and forcing iSCSI traffic out through the physical switch, we saw dramatic improvement. The latency was only being seen by the ESX host local to the VSA; the conditions were reproducible with other ESX hosts. Each time, the second ESX host in the cluster (remote to the first VSA) saw no latency issues when attached to the first VSA. A single physical switch with separate VLANs for iSCSI and data is used to connect the ESX hosts.

 

I've only deployed the VSA using the flexible adapter, always keeping the VMXNET3 adapter disconnected in vSphere. The iSCSI VLAN is also always routed to allow the VSAs to communicate with email, NTP, CMC, etc. on the data network. HP VSA is a great piece of software IMO; I will be testing ESX 5.1 with SANiQ 10 in the coming weeks.
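
If anyone wants to try the same split from the ESXi shell, the rough shape of it is below; vSwitch2, vmnic4 and the portgroup name are placeholders for whatever you have free, and you still need to point the VSA's network adapter at the new portgroup afterwards in vSphere:

# create a dedicated vSwitch for the VSA with its own uplink and portgroup
esxcfg-vswitch -a vSwitch2
esxcfg-vswitch -L vmnic4 vSwitch2
esxcfg-vswitch -A VSA-iSCSI vSwitch2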

 

HTH

thanks,

 

 

 

 

Regular Advisor
Posts: 160
Registered: ‎07-27-2011
Message 65 of 84 (4,040 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

"or those of you reporting performance issues, how are your vSwitches configured? I ran into a huge latency issue (ESX 5.0 software iSCSI, SANiQ 9.5) when my vmk bound to iSCSI was sharing the same vSwitch as the VSA node... this was with the system largely idle. By simply moving the VSA to its own vSwitch, and forcing iSCSI traffic out through the physical switch we saw dramatic improvement."

 

This is exactly what I am seeing. I wrote in my previous post that the iSCSI port bindings and the VSA on the same vSwitch seem to be the root cause of the majority of the latency. Setting the IOPS to 1 makes the problem much worse.

 

After separating the VSA from the iSCSI vSwitch, the latency improved dramatically. Changing the IOPS setting doesn't seem to make any difference with this configuration.

 

In some cases I see a VSA max out a 2 Gbps EtherChannel (3-node cluster). I would imagine there must be extreme resource contention when that volume of traffic is mixed with the iSCSI traffic.

 

It seems separating the VSA and iSCSI initiator is the way to go.

 

Frequent Advisor
Posts: 45
Registered: ‎10-20-2009
Message 66 of 84 (3,994 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

That is indeed a workaround HP provided me. The downside is that you need additional NICs and switch ports.

I still find it very hard to believe HP hasn't fixed this issue yet.
Frequent Advisor
Posts: 56
Registered: ‎06-25-2010
Message 67 of 84 (3,992 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

You could do something like this: http://blog.davidwarburton.net/2010/10/25/rdm-mapping-of-local-sata-storage-for-esxi/

But this is NOT supported by VMware. I personally won't run my production storage on an unsupported solution.

Some local storage can use RDMs out of the box; if you have that set up, then great, run with it. I would, because I also feel like RDMs have to be at least a little bit faster, though I have never seen any documentation showing that they are. VMware will tell you there is no performance improvement, and a former director at LeftHand also told me there was no improvement.
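
For reference, the general technique is creating an RDM pointer file against the local device with vmkfstools, roughly like this (device and datastore paths are placeholders, and again this is not supported for local storage):

# create a physical compatibility mode RDM for a local disk
vmkfstools -z /vmfs/devices/disks/naa.xxxxxxxx /vmfs/volumes/datastore1/VSA/vsa-rdm.vmdk

# use -r instead of -z for a virtual compatibility mode RDM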

Regular Advisor
Posts: 160
Registered: ‎07-27-2011
Message 68 of 84 (3,952 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

I tried to configure my VSAs to use RDMs and even carved out five additional LUNs on my RAID controller to do it. Unfortunately, the RAID controller didn't allow me to use them. I think the exact reason was that the RAID controller did not report a unique NAA identifier for each LUN. I read on these forums that RDMs cut down on the latency, which made me eager to give it a shot.
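
If anyone else runs into the same thing, you can see exactly what identifiers the controller presents with a quick check from the ESXi shell:

# each logical drive should show up with its own naa identifier if the controller reports them properly
esxcli storage core device list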

 

Now that I have eliminated as much latency as possible via network and iSCSI settings, I notice the read latency is still a little high for the VSA. With very low I/O to the SAN, in the dozens of IOPS according to CMC, the VSA read latency hovers around 20 ms. The write latency is fine, likely due to the RAID controller cache. I see this same behavior on systems with 8 spindles and also on systems with 25 spindles. I guess this is more or less normal, as even the best predictive read-ahead-and-cache algorithm won't help with random reads.

 

When you factor in how the VSA operates (for example, the gateway VSA may have to request blocks from the other VSAs, wait on that server's seek times, and then transfer the I/O back through the gateway VSA to the initiator), it makes sense that the latency is going to be a little higher than usual.

 

If HP would create their own NMP/SATP/PSP for ESXi that functions similarly to how the DSM for Windows works, that would probably help with performance. If I understand correctly, the DSM for Windows has a gateway to each VSA and accesses the appropriate VSA for any given block. Someone had a good post on here recently that called out the differences.

 

I can live with the latency because of what the VSA allows me to do. If space is a concern and you have extremely high consolidation ratios, the VSA is the best option out there.

 

 

 

 

Occasional Visitor
Posts: 1
Registered: ‎08-23-2013
Message 69 of 84 (3,208 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

5y53ng

 

I am using VMware ESXi 5.0.0 build-469512 (ESXi 5.0 base). I have a single-node 9.5 VSA deployed on this ESX server, and I have a vSwitch configured so that my iSCSI adapter vmhba33 is bound to the vmkernel port on vSwitch1. That vSwitch is also where all VSA iSCSI traffic is configured to be.

 

I have an HP DL370 G6 with the integrated NC375i (quad-port network card) and a P410i Smart Array controller.

 

I created a volume in CMC and presented it to my ESXi 5.0 server via iSCSI. I created a datastore on this volume and use it for I/O tests to try to duplicate the high-latency issues.

 

I am unable to duplicate the issue you encountered below and was hoping you could share more configuration details.

 

On the config above, a ~1 GB file create using dd takes about 12 seconds, which works out to roughly 85 MB/s of write throughput, with latency in the sub-millisecond range.

 

/vmfs/volumes/52180571-4a9dfce4-fab5-0025b3a8ec7a # time dd if=/dev/zero of=testfile bs=1024 count=1024000
1024000+0 records in
1024000+0 records out
real 0m 12.07s

 

 

For a ~10 GB file create, similar latencies are observed and I get:

/vmfs/volumes/52180571-4a9dfce4-fab5-0025b3a8ec7a # time dd if=/dev/zero of=testfile bs=1024 count=10240000
10240000+0 records in
10240000+0 records out
real 1m 41.95s
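
If the small 1 KB block size is a concern, a larger-block variant of the same test would look something like this; the datastore path is a placeholder, and the block size is spelled out in bytes since I'm not sure the busybox dd on ESXi accepts size suffixes:

time dd if=/dev/zero of=/vmfs/volumes/<datastore>/testfile bs=1048576 count=1024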

 

So I think I may be missing a key configuration item needed to duplicate these high-latency issues.

 

For my test, I decided to use a single vSwitch and keep everything (VSA, management network, and iSCSI traffic) on that same switch. So vmhba33 (the software iSCSI initiator) is bound only to vmk0.

 

/vmfs/volumes/52180571-4a9dfce4-fab5-0025b3a8ec7a # esxcli iscsi logicalnetworkportal list -A vmhba33
Adapter Vmknic MAC Address MAC Address Valid Compliant
------- ------ ----------------- ----------------- ---------
vmhba33 vmk0 00:25:b3:a8:ec:78 true true

 

 

The vSwitch info looks like this:

vSwitch0
Name: vSwitch0
Class: etherswitch
Num Ports: 128
Used Ports: 5
Configured Ports: 128
MTU: 1500
CDP Status: listen
Beacon Enabled: false
Beacon Interval: 1
Beacon Threshold: 3
Beacon Required By:
Uplinks: vmnic0
Portgroups: VM Network, Management Network

 

 

Thanks in advance.

Valued Contributor
Posts: 184
Registered: ‎06-25-2013
Message 70 of 84 (3,132 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Try doing two Storage vMotions at the same time: one from DAS over the network to the VSA,

and one from iSCSI back to DAS.

Then run CDM (CrystalDiskMark, pure random mode) on three clients at the same time, 100% random read/write.

Watch the network stack crumble and the hosts doing the Storage vMotions skew their time.

Member
Posts: 5
Registered: ‎03-20-2014
Message 71 of 84 (2,599 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

I know that this is a very old thread... but I seem to be having this issue with ESX 5.x and VSA 11.0. 

 

Was this ever resolved? Also, I do not see a way to change the VMXNET3 adapter to E1000 (at least not during the install).

 

Please advise. Thanks.

Occasional Advisor
Posts: 7
Registered: ‎04-01-2014
Message 72 of 84 (2,535 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

FYI: I am also having the same problem, very high cluster write latency, with ESXi 5.5 (build 1623387) and VSA 11.0, with the latest patches on everything.

 

Write latency on each node is fine; only the cluster write latency is bad, 50 ms to 150 ms.

 

It seems to be vSwitch-related, like this thread says. I will try to split my VSA and my software iSCSI ports onto separate vSwitches as suggested. What a waste of NICs.

Member
Posts: 5
Registered: ‎03-20-2014
Message 73 of 84 (2,533 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Another (lame?) solution is to disable MPIO for the volume, i.e. do not select Round Robin as the path selection policy.
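
If you go that route, something like this should switch a volume's policy from the command line; naa.xxx is a placeholder for the volume's device ID:

# change the path selection policy away from Round Robin
esxcli storage nmp device set --device=naa.xxx --psp=VMW_PSP_FIXED

# confirm which PSP is now active
esxcli storage nmp device list --device=naa.xxx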

Occasional Advisor
Posts: 7
Registered: ‎04-01-2014
Message 74 of 84 (2,529 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Turning off Round Robin as the path selection policy for the ESXi software iSCSI devices also fixes the problem?

Occasional Advisor
Posts: 7
Registered: ‎04-01-2014
Message 75 of 84 (2,522 Views)

Re: Performance issues when using VSA on ESX with VMXNET3 driver

Has anyone tried modifying any of these advanced network settings within ESXi?

 

 
