07-31-2012 07:33 AM
I have several Windows 2008 R2 HA clusters, all running on DL580 G7s. The primary network link is an NFT team (2 NICs) going to a Cisco 3750 switch stack. There is no load balancing configured on the switches, and STP is disabled on the teamed NIC ports. Both NICs are in the same VLAN.
What I'm seeing is CPQTEAMMP errors, then the NICs appear to reset, and then some Windows Failover-Clustering errors follow.
The errors, in order, are:
CPQTEAMMP - 384
HP Network Team #1: PROBLEM: A Failover occurred: The Primary Network Link is down. ACTION: Please check your cabling or switch port status, or run diagnostics to test card. Also make sure all teamed NICs are on the same network.
CPQTEAMMP - 439
HP Network Team #1: PROBLEM: A non-Primary Network Link is being Closed. This is typically because of a PnP action, possibly it was reconfigured through Network-Properties or through HP Network Configuration Utility? Possibly it was Disabled? Possibly it is being dropped from a Team or the Team is being Dissolved? ACTION: No action is required if the described behavior is expected. Otherwise, investigate the PnP reason, possibly re-enable the miniport.
FailoverClustering - 1127
Cluster network interface 'server1 - NetworkTeam' for cluster node 'server1' on network 'Cluster Network 1' failed. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
I then get this event repeating for each interface until the NICs have failed.
CPQTEAMMP - 416
HP Network Team #1: A Failover occurred: The Primary NIC is being PnP Closed by the Operating System.
CPQTEAMMP - 434
HP Network Team #1: PROBLEM: A non-Primary Network Link is not receiving. Receive-path validation has been enabled for this Team by selecting the Enable receive-path validation Heartbeat Setting. ACTION: Please check your cabling to the link partner. Check the switch port status, including verifying that the switch port is not configured as a Switch-assist Channel. Generate Broadcast traffic on the network to test whether these are being received. Also make sure all teamed NICs are on the same broadcast domain. Run diagnostics to test card. Drop the NIC from the team, determine whether it is receiving broadcast traffic in that configuration.
The Validate a Configuration wizard says everything is OK. I'm pretty sure the switch and NIC team configs are OK, as we are only running NFT.
The only other setting I can think of changing is Receive Path Validation: should I disable it, since it's mentioned in the errors? All of the above errors are logged within a few seconds of each other (many of them in the same second), and sometimes CPQTEAMMP 434 is one of the first.
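Because so many of these events land in the same second, it is hard to tell by eye which one actually comes first. Below is a rough Python sketch of how exported events could be grouped into bursts to check that. It assumes the System log has been exported as rows of (timestamp, source, event ID) in log order. The sample rows, the function names, and the 5-second window are all made-up placeholders for illustration, not my real log data.

```python
from datetime import datetime, timedelta

# Hypothetical sample rows (timestamp, source, event ID) in exported log order.
# These are placeholders for illustration only, not real log data.
SAMPLE_EVENTS = [
    ("2012-07-31 07:00:02", "CPQTEAMMP", 439),
    ("2012-07-31 07:00:01", "CPQTEAMMP", 434),
    ("2012-07-31 07:00:01", "CPQTEAMMP", 384),
    ("2012-07-31 07:00:03", "FailoverClustering", 1127),
    ("2012-07-31 09:30:00", "CPQTEAMMP", 384),
]


def group_bursts(events, window_seconds=5):
    """Sort events by timestamp and split them into bursts.

    A new burst starts whenever the gap to the previous event is larger
    than window_seconds. The sort is stable, so events sharing the same
    second keep the order they had in the input.
    """
    parsed = sorted(
        (
            (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), source, event_id)
            for ts, source, event_id in events
        ),
        key=lambda record: record[0],
    )
    window = timedelta(seconds=window_seconds)
    bursts = []
    for record in parsed:
        if bursts and record[0] - bursts[-1][-1][0] <= window:
            bursts[-1].append(record)
        else:
            bursts.append([record])
    return bursts


def first_events(bursts):
    """Return the (source, event ID) pair that opens each burst."""
    return [(burst[0][1], burst[0][2]) for burst in bursts]


if __name__ == "__main__":
    for burst in group_bursts(SAMPLE_EVENTS):
        print(burst[0][0], [(source, event_id) for _, source, event_id in burst])
```

One caveat: with one-second timestamp resolution, the order of events within the same second depends entirely on the order of the exported rows, so the result is only a hint about which event triggered the rest.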
All suggestions gratefully received!