11-01-2013 08:04 AM
We had one server running RHEL 5.8 that was about one hour behind the other servers in a 6-node RAC cluster. When I changed the ntp.conf file and restarted NTP, the time was corrected but the DB crashed. I want to know the correct way to slew the time slowly so that it adjusts gradually. I know it can be done in HP-UX. How do I do it in RHEL 5?
11-03-2013 01:39 AM
With default settings, ntpd causes the system clock to jump if the required correction is more than +/- 128 ms.
On database systems, you'll probably want to start ntpd with the option -x, which increases this threshold to 600 s.
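But as you have a difference of a whole hour (= 3600 s), even that is not enough. If the required correction is more than +/- 1000 s, ntpd will refuse to correct the clock and exit instead of running (unless it is started with the -g option, which permits one large initial correction).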
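To make the three thresholds above easier to compare, here is a minimal Python sketch that classifies an offset the way this post describes ntpd's behaviour. The function name and return strings are made up for illustration, and the sketch ignores -g and any tinker settings in ntp.conf.

```python
# Thresholds as described in this post (seconds).
STEP_THRESHOLD_S = 0.128       # default: larger offsets are stepped
SLEW_MODE_THRESHOLD_S = 600.0  # with "ntpd -x": step threshold raised to this
PANIC_THRESHOLD_S = 1000.0     # beyond this ntpd gives up instead of correcting


def ntpd_action(offset_s, slew_mode=False):
    """Return 'slew', 'step' or 'exit' for a given clock offset.

    Hypothetical helper: it only encodes the rules quoted in the post,
    not ntpd's real decision logic.
    """
    magnitude = abs(offset_s)
    if magnitude > PANIC_THRESHOLD_S:
        return "exit"
    limit = SLEW_MODE_THRESHOLD_S if slew_mode else STEP_THRESHOLD_S
    return "step" if magnitude > limit else "slew"


for offset, x_flag in [(0.05, False), (300, False), (300, True), (3600, True)]:
    print(offset, "-x" if x_flag else "default", "->", ntpd_action(offset, x_flag))
```

The one-hour case falls into the last category with or without -x, which is why ntpd alone cannot slew it away.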
If you want to slew the system clock to the correct value, you might use "ntpdate -B" to do it. But slewing a whole hour is likely to take several days, and you might have to run "ntpdate -B" several times during that time to maintain the maximal slew rate. Apparently a single "ntpdate -B" command can only achieve a certain maximum amount of slewing in either direction, and to slew more than that, you'll have to repeat the command after the first slew operation is completed.
"man 3 adjtime" indicates that the maximum amount of slew achievable with a single operation is +/- 2145 seconds, so you'll need to run "ntpdate -B" at least twice to fix a one-hour error. Unfortunately I haven't found documentation of the slew rate achievable with Linux; I have seen some things suggesting that different kernel versions might have different slew rates, but I might be wrong on that.
If you can temporarily switch the RAC databases to the other cluster nodes, you might be able to jump the clock to the correct value on the problematic node, and then resume normal operations. That will most likely be much quicker than getting rid of a one-hour difference by slewing the clock.
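To put a rough number on the slewing alternative, here is a back-of-the-envelope sketch in Python. It takes the +/- 2145 s per-call limit from "man 3 adjtime" and ASSUMES a slew rate of 500 ppm (0.5 ms per second), which is the figure the ntpd documentation quotes for typical Unix kernels; I have not confirmed that rate for the RHEL 5 kernel, so treat the result as an estimate only. The function name is made up for illustration.

```python
import math

ADJTIME_MAX_S = 2145    # per-call limit quoted from "man 3 adjtime"
ASSUMED_SLEW_PPM = 500  # ASSUMPTION: 0.5 ms/s, not verified for RHEL 5


def slew_plan(offset_s, slew_ppm=ASSUMED_SLEW_PPM, max_per_call_s=ADJTIME_MAX_S):
    """Return (number of slew operations, total days) for an offset."""
    magnitude = abs(offset_s)
    # Each operation can absorb at most max_per_call_s seconds of error.
    calls = math.ceil(magnitude / max_per_call_s)
    # At slew_ppm, one second of error takes 1e6 / slew_ppm seconds to remove.
    wall_seconds = magnitude * 1_000_000 / slew_ppm
    return calls, wall_seconds / 86400


calls, days = slew_plan(3600)
print(f"{calls} slew operations, about {days:.1f} days at {ASSUMED_SLEW_PPM} ppm")
```

If the 500 ppm assumption holds, a one-hour error works out to weeks of slewing rather than days, which is one more argument for moving the databases off the node and stepping the clock.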
11-04-2013 06:37 AM
Strange that Linux does not have a straightforward option like HP-UX. I remember doing this on an HP-UX MC/ServiceGuard cluster some years ago with success. Thanks again, Matti.