Understanding clearly about AUTO_RUN, Failback and NODE_FAIL_FAST options of a package
Advisor
Senthil_N
Posts: 28
Registered: ‎03-04-2013
Message 1 of 2 (326 Views)
Accepted Solution

Hi All,

I want to clearly understand the AUTO_RUN option of a package.

There is a package called abcdxyz in a cluster.

# grep -v "#" abcdxyz.ascii

PACKAGE_NAME                abcdxyz
PACKAGE_TYPE                FAILOVER
FAILOVER_POLICY             CONFIGURED_NODE
FAILBACK_POLICY             MANUAL

NODE_NAME                   server1
NODE_NAME                   server2
NODE_NAME                   server3

AUTO_RUN                    YES
NODE_FAIL_FAST_ENABLED      NO


# cmviewcl -v -p abcdxyz

PACKAGE        STATUS       STATE        AUTO_RUN     NODE
abcdxyz        up           running      disabled     server1

    Policy_Parameters:
    POLICY_NAME     CONFIGURED_VALUE
    Failover        configured_node
    Failback        manual

    Script_Parameters:
    ITEM       STATUS   MAX_RESTARTS  RESTARTS   NAME
    Service    up                  2         0   abcdxyz_01
    Subnet     up                                xxx.xxx.xxx.xxx

    Node_Switching_Parameters:
    NODE_TYPE    STATUS       SWITCHING    NAME
    Primary      up           enabled      server1 (current)
    Alternate    up           enabled      server2
    Alternate    up           enabled      server3


My Questions:


1) AUTO_RUN: according to the package configuration file this is enabled, so why does the output of "cmviewcl -v -p abcdxyz" show it as disabled? Yes, I know that the command "cmmodpkg -e <pkg name>" can be used to enable this option.

1.1) Why is AUTO_RUN disabled even though it is enabled in the package configuration file?

1.2) When is the package configuration file read, so that this option is enabled automatically without using the "cmmodpkg" command?

1.3) What will happen if the output of "cmviewcl" shows AUTO_RUN is disabled and the node switching options for all nodes are enabled?

1.4) What will happen if the output of "cmviewcl" shows AUTO_RUN is enabled and the node switching options for all nodes are disabled?

2) Failback: in my configuration it is manual.

2.1) To fail the package back to the desired node, do we need to halt the package on the node where it is running and then run it on the desired node? Please explain how to fail back the package, with commands.

3) What is the purpose of NODE_FAIL_FAST?

Honored Contributor
Matti_Kurkela
Posts: 6,271
Registered: ‎12-02-2001
Message 2 of 2 (283 Views)

Re: Understanding clearly about AUTO_RUN, Failback and NODE_FAIL_FAST options of a package

1.1) The configuration file specifies the initial state for the AUTO_RUN option for each package. This initial state is used at the cluster start-up only (i.e. when you run "cmruncl"). After that, as long as even one cluster node remains running, the cluster will maintain the current state of each package in RAM on all active cluster nodes. If new nodes join the cluster later, they will receive a copy of this state from the already-running node(s).

 

With cmmodpkg -d and -e, you are updating the current state of the package AUTO_RUN option, not the initial state.

 

When you run "cmhaltpkg" for a package, the AUTO_RUN option for that package will automatically be set to disabled. This is required: otherwise Serviceguard would automatically restart the package immediately.

 

In short:

  • When AUTO_RUN is enabled, Serviceguard has the authority to start a failover operation for the package if it is necessary, and Serviceguard will decide where the package will failover to, using the package switching options.
  • When AUTO_RUN is disabled, Serviceguard will not start or failover the package automatically: the system administrator has full authority to start or stop packages, and to choose where to move them.
  • When you use the cmhaltpkg command, it will automatically disable AUTO_RUN. Newer Serviceguard versions will remind you of this each time.
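
To check which of these cases applies to a package, the current AUTO_RUN state can be read from the summary line of the cmviewcl output. The following is only a minimal sketch: the function name pkg_auto_run is made up for illustration, and it assumes the column order PACKAGE STATUS STATE AUTO_RUN NODE shown in the question, which may differ between Serviceguard versions.

```shell
# pkg_auto_run PKG: read "cmviewcl -v -p PKG" style output on stdin and
# print the AUTO_RUN column (enabled/disabled) of the package summary line.
# Hypothetical helper; assumes the column order
# PACKAGE STATUS STATE AUTO_RUN NODE from the output in the question.
pkg_auto_run() {
    awk -v pkg="$1" '$1 == pkg && NF >= 5 { print $4; exit }'
}

# Usage on a cluster node (not executed here):
#   cmviewcl -v -p abcdxyz | pkg_auto_run abcdxyz
#   cmmodpkg -e abcdxyz    # set the current AUTO_RUN state to enabled
#   cmmodpkg -d abcdxyz    # set the current AUTO_RUN state to disabled
```

With the output from the question as input, the function prints the fourth field of the "abcdxyz" line, i.e. the current state and not the value from the configuration file.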

 

1.2) The ASCII package configuration file is read only when you run the "cmcheckconf" or "cmapplyconf" commands, and never otherwise.

 

When cmapplyconf is run for the package and the ASCII configuration file is successfully validated, the contents of the ASCII package configuration file will be copied to the binary Serviceguard configuration file, which is automatically kept in sync on all cluster nodes. After that, you can even delete the ASCII package configuration file if you want: as long as even one of the cluster nodes is usable, you can get an up-to-date copy of the ASCII package configuration file using the cmgetconf command. Serviceguard will only use the settings in the binary configuration file.
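
The round trip described above could look roughly like the sketch below. The run helper, the function name reapply_pkg_conf and the file name abcdxyz.ascii are illustrative assumptions, and by default the commands are only printed, not executed. Check the man pages of your Serviceguard version for the exact options before running anything for real.

```shell
# Dry-run helper (illustrative): print each command instead of executing it,
# unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

PKG=abcdxyz
CONF="${PKG}.ascii"

reapply_pkg_conf() {
    # Regenerate the ASCII file from the binary cluster configuration.
    run cmgetconf -p "$PKG" "$CONF" || return 1
    # ... edit "$CONF" by hand at this point ...
    # Validate only; the running cluster is not changed.
    run cmcheckconf -P "$CONF" || return 1
    # Validate again and copy the settings into the binary configuration.
    run cmapplyconf -P "$CONF"
}
```

Calling reapply_pkg_conf in dry-run mode just lists the three commands in order. Note that applying a changed AUTO_RUN value this way changes only the initial state used at cluster start-up, not the current state.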

 

The AUTO_RUN setting that is stored in the binary Serviceguard configuration file (the initial state for the package AUTO_RUN setting) will be used only at the cluster start-up time. At any other time, the cluster will hold the current state of the package in RAM and will automatically copy this state information to any nodes that (re)join the cluster later.

 

1.3) If the switching options are enabled for all nodes, the package can run on any node. But if AUTO_RUN is disabled, the package will not failover automatically. When the options are set like this, the package may either be halted or running.

 

If AUTO_RUN is disabled in the package configuration file, the package will not be started automatically at cmruncl time: the sysadmin must start it manually, using the cmrunpkg command.

 

1.4) Trying to enable AUTO_RUN for a package while the switching options are all disabled will not be very useful: Serviceguard will log an error message stating that the package cannot run because there are no more nodes that are allowed to run it, and will automatically disable AUTO_RUN.

 

If the switching options for a package are all disabled, the package cannot be running on any node. So in this situation, the package must be in halted state.
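
To see which nodes a package is currently allowed to run on, the Node_Switching_Parameters section of the cmviewcl output can be filtered. This is a rough sketch only: switchable_nodes is a hypothetical helper name, and the parsing assumes the NODE_TYPE STATUS SWITCHING NAME layout shown in the question.

```shell
# switchable_nodes: read "cmviewcl -v -p PKG" style output on stdin and print
# the names of all nodes whose SWITCHING column is "enabled".
# Hypothetical helper; assumes the Node_Switching_Parameters layout
# NODE_TYPE STATUS SWITCHING NAME from the output in the question.
switchable_nodes() {
    awk '($1 == "Primary" || $1 == "Alternate") && $3 == "enabled" { print $4 }'
}

# Usage on a cluster node (not executed here):
#   cmviewcl -v -p abcdxyz | switchable_nodes
```

An empty result would correspond to the situation in 1.4: no node is allowed to run the package, so it has to be in the halted state.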

 

2.1) To failback the package abcdxyz to the primary node server1:

  • first, fix the problem that caused the original failover
  • if necessary, use "cmmodpkg -n server1 -e abcdxyz" to re-enable the switching option for server1 for this package (with this action, you're telling Serviceguard that the problem is fixed: if Serviceguard detects a problem with a package on a node, it may disable the switching option for that node to prevent being stuck in an infinite loop trying to failover the package to a node where it already failed once.)
  • run "cmhaltpkg abcdxyz" on any node. (This will automatically change the current state of the package AUTO_RUN setting to disabled, as you are taking over the authority to make decisions on the package state.)
  • run "cmrunpkg -n server1 abcdxyz" on any node, or "cmrunpkg abcdxyz" on node server1.
  • to re-enable automatic failover, run "cmmodpkg -e abcdxyz" after you've confirmed that the package is running normally (At this point, you're authorizing Serviceguard to automatically failover the package again if necessary). This is an important step, don't forget it!
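
The steps above can be collected into a small script. This is a sketch, not a tested procedure: the run helper and the function name failback_pkg are made up for illustration, the commands are only printed unless DRY_RUN=0 is set, and it assumes that re-enabling node switching is harmless when it is already enabled. Fixing the original problem (the first step) still has to be done by hand beforehand.

```shell
# Dry-run helper (illustrative): print each command instead of executing it,
# unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

# failback_pkg PKG NODE: move PKG back to NODE and hand control back to
# Serviceguard. Stops at the first command that fails.
failback_pkg() {
    pkg=$1
    node=$2
    # Tell Serviceguard the node is fit to run the package again.
    run cmmodpkg -n "$node" -e "$pkg" || return 1
    # Halt the package where it runs now; this also disables AUTO_RUN.
    run cmhaltpkg "$pkg" || return 1
    # Start the package on the desired node.
    run cmrunpkg -n "$node" "$pkg" || return 1
    # Re-enable automatic failover. In real use, first confirm with
    # cmviewcl that the package is running normally.
    run cmmodpkg -e "$pkg"
}

# Usage (dry run): failback_pkg abcdxyz server1
```

In dry-run mode the call prints the four commands in the order given in the list, which makes it easy to review them before setting DRY_RUN=0.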

3) Serviceguard can failover a package slightly faster if it does not need to wait for the package to halt normally, but can intentionally crash the node with the failing package. This is what the NODE_FAIL_FAST setting does.

 

Obviously, this is only suitable for clusters where each node will only run at most one package at a time, and for applications that can handle crashes safely (e.g. all the important data is stored in a database elsewhere; the package filesystem(s) contain only application binaries that won't need to be updated while the application is running).

 

If you are not sure that your application is suitable for this, do not enable NODE_FAIL_FAST. If your cluster typically runs more than one package per node, do not enable NODE_FAIL_FAST.

MK