04-02-2014 08:27 AM - edited 04-02-2014 08:34 AM
I am having the write latency issue on one production cluster and one lab cluster. The production cluster is using Round Robin with two paths to each datastore.
The lab cluster currently uses the Fixed path policy and has only one iSCSI NIC and one available path.
Write latency is very poor on both clusters (esxtop shows 200 ms+ for write latency), so I don't think the path policy makes a difference in my situation.
01-28-2015 02:11 AM
I feel compelled to contribute since we have experienced the same latency issues using VSA 11.5 / ESXi 5.5 u2. Similar to other contributors' experiences in this discussion, the latency seems to occur whenever the VSA cluster is accessed via a local gateway VSA node, which requires iSCSI traffic to pass through the local ESXi vSwitch network stack. In contrast, accessing the cluster via a remote VSA gateway on another host shows good performance. The issue would seem to be that having your VSA node share the same local vSwitch as your iSCSI vmk ports introduces latency whenever you access a VSA-presented datastore that the VSA cluster has decided to present through that same local VSA node.
This implies that it is as likely to be a hypervisor network stack performance issue as a VSA cluster issue.
2 x HP DL380 Gen8's; local 15K SAS HDD Storage Array; vSphere ESXi 5.5 u2 (HP Build)
2 x HP VSA 10TB v11.5; Software iSCSI Adapter; Standard twin path iSCSI Initiator configuration.
Network is 10GbE with Jumbo Frames (9000 MTU). Throughput to the non-local VSA node is around 300-400 MB/s at <20 ms latency. Throughput to the local VSA node is around 100-200 MB/s with >1000 ms latency spikes.
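To put those throughput figures in context, here is a rough back-of-envelope sketch of how much TCP payload a 10GbE link can carry at MTU 1500 versus MTU 9000. It assumes plain TCP/IPv4 over untagged Ethernet with no TCP options and ignores iSCSI PDU headers, so treat the numbers as an upper bound rather than a measurement; the function names are made up for illustration.

```python
# Per-frame overhead for TCP/IPv4 over Ethernet (assumes no VLAN tag, no TCP options).
ETH_WIRE_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, Ethernet header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20              # IPv4 header + TCP header, both counted inside the MTU


def payload_efficiency(mtu: int) -> float:
    """Fraction of bytes on the wire that carry TCP payload at a given MTU."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_WIRE_OVERHEAD)


def max_payload_mb_per_s(link_gbit: float, mtu: int) -> float:
    """Upper bound on TCP payload throughput in MB/s (1 MB = 10**6 bytes)."""
    return link_gbit * 1e9 / 8 / 1e6 * payload_efficiency(mtu)


for mtu in (1500, 9000):
    print(mtu, round(payload_efficiency(mtu), 4), round(max_payload_mb_per_s(10, mtu)))
```

Under these assumptions, even the 300-400 MB/s seen against the non-local VSA is well below what the link can carry, so raw 10GbE bandwidth is unlikely to be the limiting factor here.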
The VSA paths tend to settle on a pattern where one particular volume / datastore presented by the cluster VIP is always mapped to a local VSA on a particular host. This is desirable since it offers load balancing between VSAs. However, it often means that VSA Datastore 1 is being accessed by ESXi Host 1 via its local VSA, and VSA Datastore 2 is being accessed by ESXi Host 2 via its local VSA. Storage degradation is then experienced by ESXi Host 1 on VSA Datastore 1 (local) but not on VSA Datastore 2 (remotely accessed via pNIC / switch), and vice versa.
Running various storage performance tools, it seems that throughput and latency to the local VSA node begin acceptably, but as you ramp up the test data the path suddenly seems to become saturated, whereby latency goes through the roof. Using the Round Robin path policy at iops 1 or the default 1000 gives very good storage performance on the non-local VSA, but abysmal performance on the local VSA. Defaulting to the Most Recently Used path policy gives poorer but acceptable performance on the non-local VSA and poor performance on the local VSA; however, latency seems to remain just within acceptable tolerances - still spiking occasionally to several hundred ms, but averaging 20-30 ms. The inference, perhaps, is that the lower throughput and reduced path switching make it less frequent for the local hypervisor network stack to become saturated with iSCSI traffic passing between a local target and initiator.
As already suggested in this discussion, the solution would seem to be to separate the VSA and the iSCSI Software Initiator vmk ports; however, we have no more pNICs to offer each ESXi node at the moment, and 10GbE cards and switch modules are expensive!
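For anyone unsure what the "iops 1 or default 1000" setting changes, here is a toy model of round robin path selection with an I/O count limit. It is only a sketch of the general idea (switch to the next path after N I/Os), not VMware's actual implementation, and the path labels are hypothetical.

```python
from itertools import islice
from typing import Iterator, List


def round_robin(paths: List[str], iops_limit: int) -> Iterator[str]:
    """Yield the path used for each successive I/O, switching after iops_limit I/Os."""
    if iops_limit < 1:
        raise ValueError("iops_limit must be at least 1")
    while True:
        for path in paths:
            for _ in range(iops_limit):
                yield path


def longest_burst(sequence: List[str]) -> int:
    """Length of the longest run of consecutive I/Os sent down the same path."""
    longest = run = 0
    previous = None
    for path in sequence:
        run = run + 1 if path == previous else 1
        previous = path
        longest = max(longest, run)
    return longest


paths = ["vmk1-path", "vmk2-path"]  # hypothetical labels for the two iSCSI paths
for limit in (1, 1000):
    ios = list(islice(round_robin(paths, limit), 4000))
    print(limit, {p: ios.count(p) for p in paths}, longest_burst(ios))
```

Both settings spread the I/Os evenly over the two paths in the long run; what differs is how often the switch happens, which is the "path switching" behaviour referred to above.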
Hope all this helps someone.
01-28-2015 01:05 PM
I agree with your conclusions, but getting VMware to resolve the bug will only happen if a very large customer of theirs complains about this. Any customer large enough to have the clout needed will probably not be using HP VSA.
03-06-2015 04:38 AM
Quite likely, I have to agree. I certainly doubt that my organisation has the clout....
One solution I have come up with since last posting, which I'll share in case it's of use to anyone:
As I suggested in the last post, the issue described can potentially be negated by separating the iSCSI target and initiator interfaces used with the VSA. The goal is to avoid sending iSCSI traffic through the ESXi hypervisor network stack locally on a host, which seems to introduce a lag with certain pathing / load balancing configurations. Adding additional 10GbE adapters, however, is expensive and adds more cabling complexity to the solution. One way round this is to use an HP FlexFabric 10GbE adapter that supports NPAR, such as the 533FLR-T. This allows the two physical 10GbE interfaces to be partitioned into additional virtual adapters in ESXi. My idea is to split each physical 10GbE port into 2 x virtual adapters in ESXi, then distribute these virtual adapters between 2 x vSwitches for iSCSI traffic, rather than a single vSwitch with 2 x pNICs as before (each vSwitch taking one virtual adapter from each physical port, as usual, in order to retain port failure resiliency). If we then split the target / initiator interfaces between the two vSwitches, this forces all iSCSI traffic to leave the hypervisor and reach its destination via a physical port. It does mean that adapter bandwidth is halved - i.e. 10GbE could be partitioned into two virtual ports at 5GbE each. However, this should still be more than acceptable for your average iSCSI setup, especially when using MTU 9000.
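To make the proposed layout concrete, here is a small sketch that models the partitioning and checks the two properties described: each vSwitch keeps an uplink if either physical port fails, and the partitions on a port add up to its 10 Gb/s. The partition and vSwitch names are hypothetical, and this models only the idea, not any actual ESXi or NPAR configuration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Partition:
    """One NPAR virtual adapter carved out of a physical 10GbE port."""
    name: str
    physical_port: int
    gbit: float


def survives_port_failure(uplinks: List[Partition], failed_port: int) -> bool:
    """True if at least one uplink remains after the given physical port fails."""
    return any(p.physical_port != failed_port for p in uplinks)


# Hypothetical names: each physical port is split into two 5 Gb/s partitions.
partitions = [
    Partition("p0-a", 0, 5.0), Partition("p0-b", 0, 5.0),
    Partition("p1-a", 1, 5.0), Partition("p1-b", 1, 5.0),
]

# Each vSwitch takes one partition from each physical port.
layout: Dict[str, List[Partition]] = {
    "vSwitch-VSA-target": [partitions[0], partitions[2]],
    "vSwitch-iSCSI-initiator": [partitions[1], partitions[3]],
}

for name, uplinks in layout.items():
    resilient = all(survives_port_failure(uplinks, port) for port in (0, 1))
    print(name, [p.name for p in uplinks], sum(p.gbit for p in uplinks), resilient)
```

Putting both partitions of the same physical port on one vSwitch would fail the resiliency check, which is why the partitions are crossed over between the two vSwitches.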
I'll post the results of testing this when I next have an opportunity to build the same sort of rig.
"Most Recently Used" seems to settle things down a little - storage performance is still poor, though not cripplingly so. This and "Fixed" still allow a path to be created locally to the VSA, since the path distribution itself is managed by the VIP and the internal pathing logic of LeftHand OS (which you cannot change manually).
I have now tried using the NPAR feature of the adapter. As expected, each port was given an additional 3 virtual adapters that can be used to create dedicated vSwitches - one for the VSA and one for the iSCSI adapter. In practice, however, I have found the setup to be unstable - two active iSCSI paths could be sustained on only one host in the cluster at a time. On the partner host, one path would always be down. This seems more likely to be an artifact of how the cluster VIP interprets requests from the adapter to establish paths than anything else - it did not seem to like virtual adapters.
Thus I reverted to the 2 x 10GbE 'real physical' port arrangement. Then I discovered the article linked below, implemented the recommendation, and now storage runs like the wind with no slowdowns. It seems HP LeftHand is amongst those storage systems susceptible to this condition, especially when storage data is passed over the hypervisor stack and MTU 9000 is used. With the recommendation applied, the situation seems to be reversed - local VSA storage access is even faster than storage data accessed over the network from the partner VSA (which is still fast). According to the Performance Monitor in the CMC, I am now getting up to 600 MB/s with latencies of 7-8 ms. VMs are also reporting substantial storage performance via vCenter. Happy days!
Thank you for sharing the link.
Just to confirm, you applied the workaround "Disabling Delayed ACK in ESX/ESXi 4.x and ESXi 5.x" and now your HP VSA storage system is performing considerably faster, correct?