Occasional Advisor
Tim Goodman
Posts: 12
Registered: 10-23-2007
Message 1 of 5 (87 Views)

Bonding Failover Problem

I have DL380 G4s with NC7771 NICs running Red Hat Enterprise Linux ES 3 Update 6. The NICs are bonded in mode 1 (active-backup) and plugged into separate switches. I have tried both the tg3 and bcm5700 NIC drivers. The problem is that when I unplug the active NIC, traffic does not pass over the other NIC for about 60-90 seconds, even though the dmesg output shows the link failing immediately and the other NIC being activated. Any ideas why this is not working correctly?

modules.conf -
#alias eth0 tg3
alias eth0 bcm5700
#alias eth1 tg3
alias eth1 bcm5700
#alias eth2 bcm5700
alias scsi_hostadapter cciss
alias usb-controller usb-uhci
alias usb-controller1 ehci-hcd
alias bond0 bonding
options bond0 mode=1 miimon=100

ifcfg-bond0 -
DEVICE=bond0
BOOTPROTO=none
IPADDR=10.70.80.119
NETMASK=255.255.240.0
ONBOOT=yes
TYPE=Ethernet
USERCTL=no

ifcfg-eth0 -
# eth0
DEVICE=eth0
#IPADDR=10.70.80.119
#NETMASK=255.255.240.0
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
ONBOOT=yes
#ETHTOOL_OPTS="speed 100 duplex full autoneg off"
TYPE=Ethernet

ifcfg-eth1 -
DEVICE=eth1
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
ONBOOT=yes
#ETHTOOL_OPTS="speed 100 duplex full autoneg off"
TYPE=Ethernet

dmesg output
bcm5700: eth0 NIC Link is Down
bond0: link status definitely down for interface eth0, disabling it and making interface eth1 the active one.
Honored Contributor
Ivan Ferreira
Posts: 6,957
Registered: 05-07-2004
Message 2 of 5 (87 Views)

Re: Bonding Failover Problem

Maybe it is a switch issue (spanning tree or something similar). It looks like it takes too long for the new location of the MAC address to be learned.

Check the status in /proc/net/bonding/bond0.

Also check the port status for autonegotiation problems.
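
The bonding status check can be scripted if you want to compare several servers. Below is a minimal sketch in Python 3 that splits the text of /proc/net/bonding/bond0 into bond-level fields and per-slave fields. The function name parse_bonding_status and the SAMPLE text are made up for illustration (the sample is not output from the poster's servers), and the field names follow the layout used by common bonding driver versions, so an older driver may print different keys.

```python
# Illustrative sample only: not captured from the servers in this thread.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: down
Link Failure Count: 1

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
"""


def parse_bonding_status(text):
    """Split bonding status text into (bond_fields, {slave_name: slave_fields})."""
    bond = {}
    slaves = {}
    current = bond  # lines before the first "Slave Interface:" describe the bond
    for line in text.splitlines():
        # Split on the first colon only, so values such as MAC addresses survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and anything that is not "key: value"
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = slaves.setdefault(value, {})
        else:
            current[key] = value
    return bond, slaves


bond, slaves = parse_bonding_status(SAMPLE)
print("active slave:", bond.get("Currently Active Slave"))
for name, fields in slaves.items():
    print(name, "MII Status:", fields.get("MII Status"))
```

On a real server you would pass in the contents of /proc/net/bonding/bond0 instead of SAMPLE. RHEL 3 ships a much older Python, so it may be easier to copy the file to another machine and parse it there.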
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Occasional Advisor
Tim Goodman
Posts: 12
Registered: 10-23-2007
Message 3 of 5 (87 Views)

Re: Bonding Failover Problem

It's not a switch problem because I have some DL380 G5s and DL360 G3s that fail over correctly. The servers are negotiating correctly.
Honored Contributor
Ivan Ferreira
Posts: 6,957
Registered: 05-07-2004
Message 4 of 5 (87 Views)

Re: Bonding Failover Problem

Check your kernel configuration for ARP values, for example, arp_filter.
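
One way to see all of the ARP-related values at once is to read them from /proc. The sketch below (Python 3) collects every arp_* file for each interface under /proc/sys/net/ipv4/conf. The function name read_arp_settings is made up for illustration, and which arp_* files exist depends on the kernel version, so treat the result as whatever your kernel happens to expose.

```python
import os


def read_arp_settings(conf_dir="/proc/sys/net/ipv4/conf"):
    """Return {interface: {setting: value}} for every arp_* file under conf_dir."""
    settings = {}
    for iface in sorted(os.listdir(conf_dir)):
        iface_dir = os.path.join(conf_dir, iface)
        if not os.path.isdir(iface_dir):
            continue
        for name in sorted(os.listdir(iface_dir)):
            if name.startswith("arp_"):
                with open(os.path.join(iface_dir, name)) as f:
                    settings.setdefault(iface, {})[name] = f.read().strip()
    return settings
```

Calling read_arp_settings() with no arguments reads the live values. Comparing the output from a server that fails over correctly with one that does not should show any differences in these settings.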

Then I would try the fail_over_mac option.
Occasional Advisor
Tim Goodman
Posts: 12
Registered: 10-23-2007
Message 5 of 5 (87 Views)

Re: Bonding Failover Problem

arp_filter is 0

I can't use fail_over_mac because it was added in bonding driver v3.2 and I'm running v2.6.