03-29-2014 05:04 AM
We are facing the following situation and would really appreciate your opinion:
Following a catastrophic power outage, our Serviceguard platform went down. It consists of two rp5440 servers.
After troubleshooting we were forced to reinstall node1 from scratch (11iv2), having first migrated all VGs/packages to the second node. The situation is really critical, especially if we lose node2 as well. So, in this case, what is the procedure to follow on node1 after reinstalling 11iv2 / patches / SG / SGeRAC / and mirroring the boot disk?
Could you please help by providing a list of the tasks to proceed with?
- Do we need to import the ASCII file from the running node?
- What about the cluster lock disk configuration? Does the hardware path of the lock disk change on node1? Apparently it is not the same hardware path on both nodes...
Your help is much appreciated
Thanks in advance
03-30-2014 08:29 PM
A complete reinstall? I suppose it is too late to ask about your Ignite backups that eliminate the massive time required to cold install and configure any HP-UX system. You should be able to take an Ignite backup of the working server and restore the dead one from that backup and get a lot farther along. If the hardware is very different between the two nodes, yes, you'll have to do some ioscan comparisons and match devices. Once ignited, you'll have a copy of the /etc/cmcluster directory which should have all the required files. However, your regular filesystem backup can restore any differences.
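For anyone reading later, a recurring recovery archive of a working node is typically taken with the Ignite-UX recovery tools. A rough sketch (the Ignite server name and tape device below are placeholders; options vary slightly by Ignite-UX version):

```shell
# Hypothetical example: write a network recovery archive of the root
# volume group (vg00) to an Ignite-UX server named "igniteserver".
#   -s  : Ignite-UX server that stores the archive
#   -x inc_entire=vg00 : include the entire root volume group
make_net_recovery -s igniteserver -x inc_entire=vg00

# Alternatively, write the recovery archive to a local tape drive:
make_tape_recovery -a /dev/rmt/0mn -x inc_entire=vg00
```

Scheduling one of these weekly (e.g. from cron) is what makes the "restore the dead node from backup" path possible at all.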
03-31-2014 01:15 AM
Thanks Bill for your reply.
However, suppose that we do not have Ignite and we have already reinstalled the OS from scratch on node1, mirrored the boot disks, patched node1 to the same level as the other node, and installed SG and SGeRAC.
Can we proceed as follows (the same as if we were adding a third node to the running cluster)?
=> We get the latest version of the ASCII file using cmgetconf on the running node and transfer it to the reinstalled node, then proceed to check and apply the config. Is that right? Kindly confirm, please, just to avoid causing problems on the running node; otherwise it would be catastrophic.
Thanks in advance
03-31-2014 05:09 PM
The ASCII file is just the beginning. Depending on the version of SG you are running, you may have package config scripts, application and database scripts, and perhaps monitor scripts. Once you get the ASCII file, you'll have to modify it to match the reinstalled node (LAN devices, shared volume groups and the cluster quorum lock disk). You need good records about how the packages and the cluster were designed to avoid errors during cmapplyconf. The running cluster must be stopped in order to apply the new config.
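As a rough outline of that flow, run from the surviving node (the cluster and package names here are placeholders, and exact behavior depends on your Serviceguard version):

```shell
# On the running node: dump the current cluster and package configs
cmgetconf -c myclustername /tmp/cluster.ascii   # cluster ASCII file
cmgetconf -p mypkg /tmp/mypkg.ascii             # repeat per package

# Edit /tmp/cluster.ascii to match the reinstalled node
# (NETWORK_INTERFACE entries, lock PV paths, etc.), then verify:
cmcheckconf -C /tmp/cluster.ascii -P /tmp/mypkg.ascii

# Applying a changed cluster config requires the cluster halted:
cmhaltcl -f
cmapplyconf -C /tmp/cluster.ascii -P /tmp/mypkg.ascii
cmruncl
```

Halting the cluster means a full outage of the packages, so this is scheduled work, not something to improvise on the surviving node.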
This is why an Ignite backup is not optional -- it is mandatory for all Serviceguard nodes. And not once a year... weekly, or at least monthly.
04-02-2014 01:19 PM
Make an ignite "golden" image of the active node and install it on the target node. That will install the SAME software, features, patches and the ioscan instance numbering and /etc/lvmtab. Then boot in single-user mode and change the hostname and IP addresses (/etc/rc.config.d/netconf) to the valid names. Then reboot and see if it will join the cluster.
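For reference, the values usually changed in /etc/rc.config.d/netconf look roughly like this (the interface name and addresses below are placeholders -- use the rebuilt node's own values, not the clone's):

```shell
# /etc/rc.config.d/netconf -- edited in single-user mode after the
# Ignite restore, before the node first boots on the network.
HOSTNAME="node1"                 # new node name (also check /etc/hosts)
INTERFACE_NAME[0]="lan0"         # adjust to the actual LAN instance
IP_ADDRESS[0]="192.168.1.11"     # node1's own address
SUBNET_MASK[0]="255.255.255.0"
ROUTE_GATEWAY[0]="192.168.1.1"
```

If the golden image came from the active node, missing this step would bring the clone up with the active node's identity -- a duplicate IP and hostname on the cluster network.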
The next method is not straightforward (meaning alternative measures may be needed).
Since you have reinstalled the system, match the product and patch levels to the running node.
Copy /etc/services, /etc/hosts, and /etc/nsswitch.conf from the running node to the target node.
Import the shared VGs into /etc/lvmtab on the target system, using LVM map files generated on the running node with 'vgexport -ps -m <mapfile_name> /dev/<VG_name>'. Then, on the target node, for each VG:
# mkdir /dev/<VG_name>
# mknod /dev/<VG_name>/group c 64 0xNN0000 (where NN is a minor number unique on that node)
# vgimport -vs -m <mapfile_name> /dev/<VG_name>
Recreate the mount directories for the package file systems.
If a lock disk is used, determine whether its path on the target system is the same as it was before. If not, the cluster configuration file must be updated with the new path and applied with cmapplyconf.
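One way to check this (file and device names below are examples -- use whatever your site keeps as the cluster ASCII file):

```shell
# On each node, list the disk devices and their hardware paths:
ioscan -funC disk

# See which device file the cluster config expects as the lock PV:
grep FIRST_CLUSTER_LOCK_PV /etc/cmcluster/cluster.ascii

# If the path differs on the rebuilt node, edit that node's
# FIRST_CLUSTER_LOCK_PV entry in the ASCII file, then verify and apply:
cmcheckconf -C /etc/cmcluster/cluster.ascii
cmapplyconf -C /etc/cmcluster/cluster.ascii
```

The lock PV is defined per node in the ASCII file, which is why the two nodes can legitimately reference the same lock disk through different hardware paths.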
If the LAN names and MAC addresses changed, the cluster configuration must be refreshed. It may be necessary to remove the target node from the cluster and re-add it. This entails removing the node from all packages and the cluster (comment out all references to it in the cluster and package configuration files) and running cmapplyconf, then uncommenting the references and running cmapplyconf again to add the target node with its new LAN names/paths, etc.
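In outline, that two-pass edit might look like this (cluster and package names are placeholders; follow your site's change procedure for when the cluster must be halted):

```shell
# Pass 1: drop the target node from the configuration
cmgetconf -c myclustername /tmp/cluster.ascii
# ...comment out node1's NODE_NAME/NETWORK_INTERFACE stanza in
#    /tmp/cluster.ascii and any node1 references in the package
#    ASCII files, then verify and apply:
cmcheckconf -C /tmp/cluster.ascii -P /tmp/mypkg.ascii
cmapplyconf -C /tmp/cluster.ascii -P /tmp/mypkg.ascii

# Pass 2: re-add the node with its rebuilt LAN names and disk paths
# ...uncomment and correct the node1 entries, then:
cmcheckconf -C /tmp/cluster.ascii -P /tmp/mypkg.ascii
cmapplyconf -C /tmp/cluster.ascii -P /tmp/mypkg.ascii
cmrunnode node1        # let the rebuilt node join the cluster
```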
04-03-2014 02:12 AM
Hello, and thanks for your reply; indeed we need a lot of good luck ;)
Suppose we adopt the easiest method you mentioned above: we reformat the OS we already installed from scratch, make an Ignite golden image of the remaining active node, and install this image on the target node. (Here I guess that Ignite will only back up vg00, is that right?) After restoring the golden image to the node, I guess we still need to redo the mirroring of the boot disk. In addition, could you please confirm the list of files that we need to modify after the Ignite restore (/etc/rc.config.d/netconf, /etc/hosts ...)?
Moreover, by applying this method, do we still need to remove the failed node from the cluster/packages and re-add it? Or do we just boot it after the Ignite restore and file modifications and see if it can join the cluster?
Thanks in advance