09-26-2013 05:19 AM
I have a simple setup for testing purposes, which, when working, will be implemented at a busy animation studio.
HP 2910al-48G switch (will be connected to linux server via 10Gbe-CX4 backplane).
2x Windows 7 PCs equipped with HP NC382t dual-port NICs, bonded using the HP Network Configuration Utility, which gives each PC a 2Gbps link. My goal is 2Gbps for both sending and receiving traffic to/from these machines over the LAN, equally utilising both legs of the bonded connection.
The best result I have got so far is the Windows machines utilising both legs of the "Team" for transmit (TX) traffic, but only one for receive (RX). I achieved this by testing all possible variations, and the only one which worked in the way described above was SLB+Round Robin. See the screenshot below.
I'm pretty sure the configuration of the switch is correct as I was able to utilise both legs of a bonded connection when testing linux to linux machines. I posted the 'show config' output below, just in case something needs to be different.
I'm not sure I'm using correct settings in HP network configuration utility on win7 machines though. From what I read, this should work (and is preferred) with 802.3ad + Automatic. But it doesn't seem to be the case for me. I'm also not too fussed about the speed at the moment as the test computers are running standard hard drives, not raided or solid state so they wouldn't be able to go faster. But the main point to figure out is how to make them use both legs of a bond equally in the first place.
Hope this makes sense. I look forward to suggestions, as I've been banging my head against a brick wall for quite some time now.
Here are various outputs from the CLI of the switch, which may be useful:
 Status and Counters - VLAN Information

  Maximum VLANs to support : 256
  Primary VLAN : DEFAULT_VLAN
  Management VLAN :

  VLAN ID Name                 | Status     Voice Jumbo
  ------- -------------------- + ---------- ----- -----
  1       DEFAULT_VLAN         | Port-based No    Yes

ProCurve 2910al-48G Switch(config)# show jumbo

 Jumbos Global Values

  Configured : MaxFrameSize : 9216 Ip-MTU : 9198
  In Use     : MaxFrameSize : 9216 Ip-MTU : 9198

ProCurve 2910al-48G Switch(config)# show trunks

 Load Balancing

  Port | Name                             Type      | Group Type
  ---- + -------------------------------- --------- + ----- --------
  25   |                                  100/1000T | Trk1  LACP
  26   |                                  100/1000T | Trk1  LACP
  27   |                                  100/1000T | Trk2  LACP
  28   |                                  100/1000T | Trk2  LACP
  29   |                                  100/1000T | Trk3  LACP
  30   |                                  100/1000T | Trk3  LACP
  31   |                                  100/1000T | Trk4  LACP
  32   |                                  100/1000T | Trk4  LACP
  33   |                                  100/1000T | Trk5  LACP
  34   |                                  100/1000T | Trk5  LACP
  35   |                                  100/1000T | Trk6  LACP
  36   |                                  100/1000T | Trk6  LACP

ProCurve 2910al-48G Switch(config)# show lacp

                           LACP

   PORT   LACP      TRUNK     PORT      LACP      LACP
   NUMB   ENABLED   GROUP     STATUS    PARTNER   STATUS
   ----   -------   -------   -------   -------   -------
   25     Active    Trk1      Up        Yes       Success
   26     Active    Trk1      Up        Yes       Success
   27     Active    Trk2      Up        Yes       Success
   28     Active    Trk2      Up        Yes       Success
   29     Active    Trk3      Down      No        Success
   30     Active    Trk3      Down      No        Success
   31     Active    Trk4      Down      No        Success
   32     Active    Trk4      Down      No        Success
   33     Active    Trk5      Down      No        Success
   34     Active    Trk5      Down      No        Success
   35     Active    Trk6      Down      No        Success
   36     Active    Trk6      Down      No        Success

ProCurve 2910al-48G Switch(config)# show config

Startup configuration:

; J9147A Configuration Editor; Created on release #W.14.38

hostname "ProCurve 2910al-48G Switch"
module 1 type J9147A
module 3 type J9149A
interface 25
   speed-duplex auto-1000
exit
interface 26
   speed-duplex auto-1000
exit
interface 27
   speed-duplex auto-1000
exit
interface 28
   speed-duplex auto-1000
exit
interface 29
   speed-duplex auto-1000
exit
interface 30
   speed-duplex auto-1000
exit
interface 31
   speed-duplex auto-1000
exit
interface 32
   speed-duplex auto-1000
exit
interface 33
   speed-duplex auto-1000
exit
interface 34
   speed-duplex auto-1000
exit
interface 35
   speed-duplex auto-1000
exit
interface 36
   speed-duplex auto-1000
exit
trunk 25-26 Trk1 LACP
trunk 27-28 Trk2 LACP
trunk 29-30 Trk3 LACP
trunk 31-32 Trk4 LACP
trunk 33-34 Trk5 LACP
trunk 35-36 Trk6 LACP
ip default-gateway 192.168.1.254
vlan 1
   name "DEFAULT_VLAN"
   untagged 1-24,37-48,B1-B2,Trk1-Trk6
   ip address 10.0.100.152 255.255.255.0
   ip address 192.168.1.250 255.255.255.0
   ip address 10.0.10.250 255.255.255.0
   jumbo
   ip igmp
   exit
snmp-server community "public" unrestricted
snmp-server contact "Roman Sputnik" location "Office"
spanning-tree
spanning-tree Trk1 priority 4
spanning-tree Trk2 priority 4
spanning-tree Trk3 priority 4
spanning-tree Trk4 priority 4
spanning-tree Trk5 priority 4
spanning-tree Trk6 priority 4
no autorun
09-26-2013 05:46 PM
How many IP interfaces do these PCs have?
The reason I ask is that the algorithm used for allocating a frame to a member of an aggregated link is not a load-balancing algorithm. It is usually based on an IP hash, and if the destination IP is the same for all packets, then all packets will be assigned to the same physical member of the aggregated link.
09-27-2013 07:49 AM
Thanks for your reply.
Each pc has 1 inbuilt network interface on the motherboard (not being used) and both have HP NC382t dual interface PCIe network cards, configured as described above. Even if I try to send 2 files separately, only one physical member is being used for RX, but both for TX.
09-28-2013 05:46 AM
Like Vince mentioned before, it is important to understand the principle of the load balancing.
Load balancing over L2 interfaces will typically not be packet based, but hash based.
The reason it is normally not packet based is that there could be minor latency differences between the 2 links, so packets on one link could arrive a bit sooner or later than the packets on the other link.
When all these packets are for the same TCP session, it means that the packets would arrive out of order, and you would experience a lot of TCP retransmits, causing worse performance.
So with the hash-based system you do not have this problem.
The hash is calculated per packet and the algorithm will use several fields from the actual packet for the calculation.
The original concept was to use the source + dest mac address of the packet. As a result, every packet with the same source and dest mac would result in the same hash result (e.g. 0 or 1). This hash result is used to assign the packet to a physical interface.
So assume a packet from mac A1 to mac A2 would result in a hash 0, and hash 0 would be assigned to link 1, then all future packets from A1 to A2 will actually take link1 (ensuring in order delivery). There is no adaptive loadbalancing here, so even when link1 is at 100% and link2 at 0%, the packets from A1 to A2 would still be sent over link1.
A packet from A1 to A3 would result in hash 1, and would always take link2 in this example.
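The A1/A2/A3 example above can be sketched in a few lines of Python. This is only an illustration: the XOR-and-modulo hash, the `pick_link` helper and the MAC addresses are all made up for the sketch, and are not the algorithm the 2910al or any particular switch actually uses.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC address such as '00:1b:21:00:00:10' to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def pick_link(src_mac: str, dst_mac: str, n_links: int = 2) -> int:
    """Return the hash result (link member index) for a source/dest pair.

    Hypothetical hash: XOR the two addresses and take the result modulo
    the number of links. Real switches use vendor-specific functions.
    """
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % n_links


# Made-up addresses standing in for hosts A1, A2 and A3.
A1 = "00:1b:21:00:00:10"
A2 = "00:1b:21:00:00:20"
A3 = "00:1b:21:00:00:31"

link_a1_a2 = pick_link(A1, A2)  # hash 0, i.e. "link 1" in the text above
link_a1_a3 = pick_link(A1, A3)  # hash 1, i.e. "link 2" in the text above

# Every frame of the same pair gets the same answer, regardless of load.
assert all(pick_link(A1, A2) == link_a1_a2 for _ in range(1000))
print(link_a1_a2, link_a1_a3)
```

Nothing in `pick_link` looks at how busy a link is, which is why the A1 to A2 traffic stays on its link even when that link is full.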
This means that, for example, all traffic from a host to a gateway (which always has the same source/dest MAC) would take the same link.
To overcome this limitation, the algorithm has been updated and can use other fields from the packet. So all current switches will take the source/dest IP from the packet for the calculation. Again, this means that all packets between the same two hosts would take the same link.
More recent switches (not the 2910 AFAIK) will allow the Layer 4 (TCP/UDP) ports to be used as input for the algorithm. So when you have 2 different TCP sessions between 2 hosts, these 2 sessions could be sent over 2 links. Consider, however, that if the dest TCP port remains the same, and the client TCP port (which changes per TCP session) is 1024 for session 1 and 1026 for session 2 (increment by 2), then the resulting hash will be the same and both sessions will use the same link.
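The port collision described above is easy to reproduce with a toy hash. The sketch below assumes a two-link aggregate and a hash that XORs the source and destination ports and takes the result modulo 2. Both the hash and the server port 445 are assumptions chosen for illustration, and the IP addresses are left out because they are identical for every session between the same two hosts.

```python
def pick_link_l4(src_port: int, dst_port: int, n_links: int = 2) -> int:
    """Hypothetical Layer 4 hash: XOR of the two ports, modulo link count."""
    return (src_port ^ dst_port) % n_links


DST_PORT = 445  # assumed server port, the same for every session

for client_port in (1024, 1025, 1026):
    print(client_port, "-> link", pick_link_l4(client_port, DST_PORT))
```

With this hash, client ports 1024 and 1026 land on the same link, while 1025 lands on the other, so two sessions do not automatically mean two links.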
If the switch supports this, it is typically an option which needs to be configured on the switch.
All Comware-based switches support this; for ProVision, you should consider the 2920/3500/3800 models (the 2620 supports this, but it is only 100Mbps).
Also consider that when performing 2 file copies, it is possible that the same TCP session is used, so you would not experience this benefit.
The last consideration is that each device in the chain will perform its own hash calculation. Suppose you have pc1 to switch1 to switch2 to pc2, with all connections using 2 links with link aggregation. It is possible that pc1 will transmit to switch1 based on TCP (which is where you will see good outbound load balancing), but switch1 will calculate its own hash for the traffic to switch2 (in the case of the 2910 this will be based on source/dest IP), so it is possible all traffic is sent over 1 link. Then switch2 will do its own calculation for the traffic to pc2 (so if switch2 supports TCP-based load balancing, the traffic would be spread over 2 links again) before it arrives at pc2.
The return path is again calculated based on its own rules.
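A rough way to picture the per-device behaviour is to give every hop its own hash function. In this sketch pc1 and switch2 are assumed to hash on IP plus TCP ports, and switch1 on IP only. The hash functions, IP addresses and port numbers are all hypothetical, and only the forward direction is modelled.

```python
import ipaddress


def hash_l3(flow: dict, n_links: int = 2) -> int:
    """Hypothetical source/dest IP hash."""
    src = int(ipaddress.ip_address(flow["src_ip"]))
    dst = int(ipaddress.ip_address(flow["dst_ip"]))
    return (src ^ dst) % n_links


def hash_l4(flow: dict, n_links: int = 2) -> int:
    """Hypothetical hash over IP addresses and TCP ports."""
    src = int(ipaddress.ip_address(flow["src_ip"]))
    dst = int(ipaddress.ip_address(flow["dst_ip"]))
    return (src ^ dst ^ flow["src_port"] ^ flow["dst_port"]) % n_links


# Each hop decides independently which link member to use.
HOPS = [
    ("pc1 -> switch1", hash_l4),
    ("switch1 -> switch2", hash_l3),
    ("switch2 -> pc2", hash_l4),
]


def path(flow: dict) -> list:
    """Return the link index chosen at every hop for this flow."""
    return [choose(flow) for _, choose in HOPS]


# Two TCP sessions between the same two hosts (made-up values).
session1 = {"src_ip": "10.0.100.11", "dst_ip": "10.0.100.12",
            "src_port": 50000, "dst_port": 445}
session2 = dict(session1, src_port=50001)

for name, flow in (("session1", session1), ("session2", session2)):
    print(name, path(flow))
```

In this run the two sessions use different links on the first and last hop, but share a single link between switch1 and switch2, so that middle hop limits the total throughput.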
So bottom line: 2x 1G links is not 2G! If you need more performance, the only valid solution is 10G (10GBASE-T is quite cost effective nowadays).
Hope this helps,
09-30-2013 04:03 PM
You've described the physical interfaces.
What you need to look at is the IP interfaces. Regardless of the number of physical interfaces you might bond together, if they support just the one IP interface, then the switch will only ever send traffic up one of the physical links unless you figure out a way to reconfigure its hash method to something different.
09-30-2013 04:07 PM
After setting this up in the HP Network Configuration Utility, the two bonded interfaces appear as one interface in network settings, with one IP and one MAC address. It also appears that way in cmd->ipconfig.
09-30-2013 05:47 PM
If the target IP address is that single IP belonging to the bonded-pair, then the switch will only ever send frames down one physical link.
You could get traffic to use both physical links if the target IP address was varied - if you had multiple virtual NICs and multiple virtual hosts or different applications using a different one as their source address.
10-01-2013 04:52 AM
May I reveal an almost complete lack of understanding on this and try and start from a most basic position, as I too have attempted something similar to the OP without success.
My belief is that there is a distinct difference between the possibilities afforded by switch to switch LACP connections, and switch to NIC in a single machine, to the extent that it is perhaps unrealistic to even expect faster than 1Gbe over bonded multiple 1Gbe connections to a single machine?
Regardless of the hashes used etc. etc. (and indeed most of the detail in the above posts) I've concluded that one will never be able to, for example, connect an HP dual gigabit NIC via 2 cables to 2 ports on a switch like the 2910-al and see anything faster than 1Gbe (ie 125MBs at best) over the link.
Can someone confirm this please? Then I (and perhaps also the OP) can stop trying to accomplish the impossible!
But also, is this true in both directions? Or, as the OP observes, is it possible to transmit over both links from the machine to the switch (and beyond) but only ever receive at 1Gbe speeds because of the hash issue etc.?
The lure of the 2910-al, for example, is when it's used with a 10Gbe backplane module to a compatible 10Gbe interface on a server, for example. In this instance one might assume that multiple 2Gbe LACP connections between several workstations with dual gigabit NICs etc. would be able to communicate with bonded ports on the switch at up to 2Gbe speeds each, with a 10Gbe backplane ensuring that there was plenty of bandwidth for several such workstations to that server?
But this appears to be nonsense from what people are posting above, as while a workstation transmit speed might exceed 125MBs, the receive speeds (which is probably what people are most interested in?) will not.
Am I getting this right?
10-01-2013 02:45 PM
Think of it from the switch's point of view: it decides on a packet-by-packet basis which physical link in the link aggregation to send the packet up.
The way it decides is to look at the destination IP address, hash it, and match the hash to a link member.
If the destination IP address is always the same, then the hash is always the same, and so the physical link member will always be the same.
To use the second link member, you would need one of:
- varied destination IP addresses
- a different hashing algorithm based on something that does vary
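As a small illustration of the first option, here is a sketch that hashes only the destination IP, as described above. The modulo hash and the 10.0.100.x addresses are assumptions for the example, not what the switch really computes.

```python
import ipaddress
from collections import Counter


def pick_link_by_dst_ip(dst_ip: str, n_links: int = 2) -> int:
    """Hypothetical hash: destination IP as an integer, modulo link count."""
    return int(ipaddress.ip_address(dst_ip)) % n_links


# 1000 packets to one bonded PC with a single IP: always the same member.
single = Counter(pick_link_by_dst_ip("10.0.100.11") for _ in range(1000))

# One packet to each of four different destination IPs: both members used.
varied = Counter(pick_link_by_dst_ip(f"10.0.100.{host}")
                 for host in range(11, 15))

print("single destination:", dict(single))
print("varied destinations:", dict(varied))
```

The single-destination case puts every packet on one member, which matches the RX behaviour the OP is seeing on the bonded PCs.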