Expert Day: HPVM migration network & host SG heartbeat

by Community Manager on ‎12-21-2011 06:18 AM


We are seeing heartbeat warnings from cmcld that coincide with HPVM online migrations.




Dec 11 12:15:08 node-t cmcld[11261]: Warning: cmcld was unable to run for the last 3.5 seconds.  Consult the Managing Serviceguard manual for guidance on setting MEMBER_TIMEOUT, and information on cmcld.

Dec 11 12:15:08 node-t cmcld[11261]: This node is at risk of being evicted from the running cluster.  Increase MEMBER_TIMEOUT.

Dec 11 12:15:08 node-t cmcld[11261]: Member node-a seems unhealthy, not receiving heartbeats from it.


node-t common.log:


12/11/11 12:14:45|SUMMARY|CLI|root|/opt/hpvm/bin/hpvmmigrate -S -P vm-j -h node-a


12/11/11 12:18:20|SUMMARY|vm-j|root|Guest 'vm-j' migrated successfully to VM host 'node-a'
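The correlation described above can be checked mechanically. What follows is a minimal sketch, not an HP tool: it assumes the two timestamp formats shown in the excerpts, assumes that the argument after -P in the hpvmmigrate command line is the guest name (the success line in the excerpt is consistent with that), and needs the year supplied because syslog lines omit it. The function names are hypothetical.

```python
import re
from datetime import datetime

# Sample input copied from the excerpts above.
SYSLOG = """\
Dec 11 12:15:08 node-t cmcld[11261]: Warning: cmcld was unable to run for the last 3.5 seconds.  Consult the Managing Serviceguard manual for guidance on setting MEMBER_TIMEOUT, and information on cmcld.
Dec 11 12:15:08 node-t cmcld[11261]: This node is at risk of being evicted from the running cluster.  Increase MEMBER_TIMEOUT.
Dec 11 12:15:08 node-t cmcld[11261]: Member node-a seems unhealthy, not receiving heartbeats from it.
"""

HPVM_LOG = """\
12/11/11 12:14:45|SUMMARY|CLI|root|/opt/hpvm/bin/hpvmmigrate -S -P vm-j -h node-a
12/11/11 12:18:20|SUMMARY|vm-j|root|Guest 'vm-j' migrated successfully to VM host 'node-a'
"""


def migration_windows(text):
    """Pair each hpvmmigrate start line with its success line, per guest."""
    starts = {}
    windows = []
    for line in text.splitlines():
        parts = line.split("|")
        if len(parts) < 5:
            continue
        stamp = datetime.strptime(parts[0], "%m/%d/%y %H:%M:%S")
        message = parts[4]
        # Assumption: -P introduces the guest name.
        started = re.search(r"hpvmmigrate\b.*-P\s+(\S+)", message)
        if started:
            starts[started.group(1)] = stamp
            continue
        done = re.search(r"Guest '([^']+)' migrated successfully", message)
        if done and done.group(1) in starts:
            guest = done.group(1)
            windows.append((starts.pop(guest), stamp, guest))
    return windows


def cmcld_messages(text, year):
    """Return (timestamp, message) for every cmcld line in a syslog excerpt."""
    found = []
    pattern = re.compile(r"(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ cmcld\[\d+\]: (.*)")
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            stamp = datetime.strptime(f"{year} {match.group(1)}",
                                      "%Y %b %d %H:%M:%S")
            found.append((stamp, match.group(2)))
    return found


def warnings_during_migration(syslog, hpvm_log, year):
    """List cmcld messages whose timestamp falls inside a migration window."""
    windows = migration_windows(hpvm_log)
    hits = []
    for stamp, message in cmcld_messages(syslog, year):
        for start, end, guest in windows:
            if start <= stamp <= end:
                hits.append((guest, stamp, message))
    return hits


if __name__ == "__main__":
    for guest, stamp, message in warnings_during_migration(SYSLOG, HPVM_LOG, 2011):
        print(f"{stamp} during migration of {guest}: {message[:60]}")
```

Run over a longer stretch of both logs, this shows whether the cmcld warnings appear only inside migration windows or also at other times, which helps tell migration traffic apart from other causes of cmcld starvation.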


The hosts have multiple networks. Per the Serviceguard documentation, “HP recommends that you configure all subnets that connect cluster nodes as heartbeat networks”, and we have done this.


Would you recommend removing heartbeat from the one network used for online migration (that is, changing the migration network IP from HEARTBEAT_IP to STATIONARY_IP), provided there are enough other networks for heartbeat redundancy?


Background: The customer runs HPVM on nPars of an SX2000-based midrange server for database and app workloads. Production is on HPVM 4.2, with VMs configured as Serviceguard A.11.19 packages on Virtual Disk SAN storage. An update to HPVM 4.3 and A.11.20 is planned. (The customer started with HPVM 2.0 in 2007.) Test/dev VMs run on an i2 blade using HPVM 4.3, with Virtual FileDisks on NFS.



> Certain optimization features are disabled automatically and the NIC port is put into promiscuous mode.

The above is for two-port NICs on the SX2000-based midrange server. We also have blades with Flex10 NICs. Do you know whether these two changes, made when a vswitch is defined, affect only how the HP-UX network stack handles the interface, or whether they change how the entire NIC operates? What I’m getting at: if lan1 and lan2 are on the same two-port NIC, or (on the blade) lan16 - lan23 are FlexNICs on the same two-port Flex10 mezzanine card, and a vswitch is configured on lan1 or lan16 but not on lan2 or lan17, is lan2 or lan17 affected because it is on the same card?


Answer> In the example lan2 or lan17 should not be affected.


Question: The HPVM host also has another pair of interfaces used only for host administration (admin logins, reaching the default router for the HPVM host, monitoring tools, an additional heartbeat network, network recovery archive creation). It currently has no vswitch configured on it. Since this interface is lightly used, we are considering defining a vswitch on it and moving some VM frontend traffic (app and client traffic) onto it from other interfaces. Would you recommend against this?


Answer> One of the uses you mention is for archive creation. I would be worried about the potential impact this could have on the VM traffic.

by on ‎12-28-2011 01:28 PM

The Answer section above gives the detailed answer but does not directly answer the question.


The basic answer is:


Yes, you should have a network that is dedicated to online migration.


Dedicated means no vswitches on this network and, if Serviceguard is used on the host, no heartbeat on this network. For heartbeat, this is an exception to the HP recommendation to use all available networks.
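As an illustration of the heartbeat part of this rule, here is a minimal sketch that inspects a Serviceguard cluster configuration file and flags a migration interface that still carries HEARTBEAT_IP. The interface names (lan0 to lan2), the documentation-range IP addresses, the function names, and the threshold of two remaining heartbeat networks are all assumptions made for the example, not values from the original post.

```python
# Hypothetical excerpt of a cluster configuration file. lan2 stands in for
# the migration network and is configured as STATIONARY_IP (no heartbeat).
CLUSTER_ASCII = """\
NODE_NAME node-t
  NETWORK_INTERFACE lan0
    HEARTBEAT_IP 192.0.2.10
  NETWORK_INTERFACE lan1
    HEARTBEAT_IP 198.51.100.10
  NETWORK_INTERFACE lan2
    STATIONARY_IP 203.0.113.10
"""


def interface_roles(text):
    """Map (node, interface) to HEARTBEAT_IP or STATIONARY_IP."""
    roles = {}
    node = iface = None
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) < 2:
            continue
        key, value = fields[0], fields[1]
        if key == "NODE_NAME":
            node, iface = value, None
        elif key == "NETWORK_INTERFACE":
            iface = value
        elif key in ("HEARTBEAT_IP", "STATIONARY_IP") and iface:
            roles[(node, iface)] = key
    return roles


def check_migration_interface(text, node, migration_iface, min_heartbeats=2):
    """Return a list of problems; an empty list means the rule is satisfied."""
    roles = interface_roles(text)
    problems = []
    if roles.get((node, migration_iface)) == "HEARTBEAT_IP":
        problems.append(f"{node}:{migration_iface} carries heartbeat "
                        "but is the migration network")
    heartbeat_ifaces = [i for (n, i), role in roles.items()
                        if n == node and role == "HEARTBEAT_IP"
                        and i != migration_iface]
    # Assumption: at least two other heartbeat networks counts as redundant.
    if len(heartbeat_ifaces) < min_heartbeats:
        problems.append(f"{node} has only {len(heartbeat_ifaces)} "
                        "heartbeat network(s) besides the migration network")
    return problems


if __name__ == "__main__":
    print(check_migration_interface(CLUSTER_ASCII, "node-t", "lan2"))
```

The check looks only at the cluster configuration file. Whether a vswitch is defined on the migration network would have to be verified separately on the VM host.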


The detailed answer in the Answer section above further explains why the mere presence of a vswitch on a network makes that network less than optimal for host traffic, including online migration traffic.


(I am the person who posted the original question that became this knowledge base item.)


