Modify the Private Network Information in Oracle Clusterware with HAIP in place

The MOS note “How to Modify Private Network Information in Oracle Clusterware (Doc ID 283684.1)” explains how to change the private network information when the interconnect is made of a single interface (which implies downtime).

So what about changing the interconnect configuration with Highly Available IP (HAIP) in place?

Let’s remember what HAIP is (from Oracle® Database High Availability Best Practices):

[Image: HAIP description from the Oracle documentation]

As HAIP provides a redundant interconnect, we should be able to change the configuration of one private interface without any downtime, right?

First let’s check the current interconnect configuration:

oifcfg getif
eth4  172.16.0.128  global  cluster_interconnect
eth6  172.16.1.0  global  cluster_interconnect

and the associated Virtual IPs:

oifcfg iflist -p -n
eth4  172.16.0.128  PRIVATE  255.255.255.128
eth4  169.254.0.0  UNKNOWN  255.255.128.0
eth6  172.16.1.0  PRIVATE  255.255.255.128
eth6  169.254.128.0  UNKNOWN  255.255.128.0

I removed the public network from the output. As you can see, each private interface is hosting a Virtual IP (169.254.x.x).
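To make that check less dependent on eyeballing the output, here is a minimal sketch that counts the 169.254.x.x subnets per interface. The function name haip_count_per_interface is hypothetical (it is not part of oifcfg); it only assumes the four-column output format shown above.

```shell
# Hypothetical helper: count link-local (169.254.x.x) subnets per interface.
# Reads `oifcfg iflist -p -n` style output on stdin and prints
# "<interface> <count>" lines, sorted by interface name.
haip_count_per_interface() {
    awk '$2 ~ /^169\.254\./ { count[$1]++ }
         END { for (ifname in count) print ifname, count[ifname] }' | sort
}

# Usage on a cluster node:
#   oifcfg iflist -p -n | haip_count_per_interface
```

In the normal state each private interface should report a count of 1.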

Now your sysadmin makes the change (for example, a new subnet and VLAN) on one of the private interfaces (let’s say eth4), so that the ohasd.log file reports something like:

2014-01-27 11:16:17.154: [GIPCHGEN][1837012736]gipchaInterfaceFail: marking interface failing 0x7f4c0c1b5f00 { host '', haName 'CLSFRAME_olrdev1', local (nil), ip '172.16.0.129:28029', subnet '172.16.0.128', mask '255.255.255.128', mac 'e8-39-35-12-77-7e', ifname 'eth4', numRef 0, numFail 0, idxBoot 0, flags 0x184d }
2014-01-27 11:16:17.334: [GIPCHGEN][1856595712]gipchaInterfaceDisable: disabling interface 0x7f4c0c1b5f00 { host '', haName 'CLSFRAME_olrdev1', local (nil), ip '172.16.0.129:28029', subnet '172.16.0.128', mask '255.255.255.128', mac 'e8-39-35-12-77-7e', ifname 'eth4', numRef 0, numFail 0, idxBoot 0, flags 0x19cd }
2014-01-27 11:16:17.339: [GIPCHDEM][1856595712]gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x7f4c0c1b5f00 { host '', haName 'CLSFRAME_olrdev1', local (nil), ip '172.16.0.129:28029', subnet '172.16.0.128', mask '255.255.255.128', mac 'e8-39-35-12-77-7e', ifname 'eth4', numRef 0, numFail 0, idxBoot 0, flags 0x19ed }

So now let’s check the Virtual IPs and the available interfaces and subnets again:

oifcfg iflist -p -n
eth4  172.17.3.0  PRIVATE  255.255.255.128
eth6  172.16.1.0  PRIVATE  255.255.255.128
eth6  169.254.128.0  UNKNOWN  255.255.128.0
eth6  169.254.0.0  UNKNOWN  255.255.128.0
bond0  158.168.4.0  UNKNOWN  255.255.252.0

Well, we can see two things:

First, eth6 is now “hosting” both Virtual IPs (169.254.x.x).
Second, the new subnet (172.17.3.0) is now “available” on eth4.

So we just have to remove the previous eth4 configuration:

oifcfg delif -global eth4/172.16.0.128

and add the new one:

oifcfg setif -global eth4/172.17.3.0:cluster_interconnect
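If you prefer to review the commands before running them, the two steps can be sketched as a small dry-run helper. The function name print_interconnect_change is hypothetical; it only prints the delif and setif commands used above, followed by oifcfg getif to verify the result, and executes nothing.

```shell
# Hypothetical dry-run helper: print the oifcfg commands needed to move
# one interconnect interface from its old subnet to its new one.
# Arguments: interface name, old subnet, new subnet. Nothing is executed.
print_interconnect_change() {
    ifname=$1
    old_subnet=$2
    new_subnet=$3
    printf 'oifcfg delif -global %s/%s\n' "$ifname" "$old_subnet"
    printf 'oifcfg setif -global %s/%s:cluster_interconnect\n' "$ifname" "$new_subnet"
    printf 'oifcfg getif\n'
}

# Example with the values from this post:
#   print_interconnect_change eth4 172.16.0.128 172.17.3.0
```

Printing instead of executing is a deliberate choice: a typo in the interface name is much cheaper to catch on screen than on a live interconnect.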

Now check the Virtual IPs again:

oifcfg iflist -p -n
eth4  172.17.3.0  PRIVATE  255.255.255.128
eth4  169.254.128.0  UNKNOWN  255.255.128.0
eth6  172.16.1.0  PRIVATE  255.255.255.128
eth6  169.254.0.0  UNKNOWN  255.255.128.0

Perfect, each private interface hosts a Virtual IP again (we are back to a “normal” configuration).

Remarks:

No downtime will occur as long as:

You don’t change the configuration of both interfaces at the same time ;-)
You don’t remove by mistake the configuration of the interface that hosts both Virtual IPs.
The interface hosting both Virtual IPs doesn’t fail before you put back the updated interface.
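The second condition can be turned into a quick guard to run before the delif step. This is only a sketch under the assumption that the iflist output has the format shown in this post; the function name safe_to_reconfigure is hypothetical.

```shell
# Hypothetical guard: succeed (exit 0) only if the given interface hosts
# no 169.254.x.x subnet while at least one other interface does.
# Reads `oifcfg iflist -p -n` style output on stdin.
safe_to_reconfigure() {
    awk -v target="$1" '
        $2 ~ /^169\.254\./ {
            if ($1 == target) on_target++
            else elsewhere++
        }
        END { exit !(on_target == 0 && elsewhere > 0) }'
}

# Usage on a cluster node:
#   oifcfg iflist -p -n | safe_to_reconfigure eth4 && echo "eth4 can be reconfigured"
```

With the output captured after the sysadmin change, the guard succeeds for eth4 and fails for eth6, which is the interface hosting both Virtual IPs at that point.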

There is nothing new in this post. I just had to do the exercise, so I am sharing it ;-)

Conclusion:

Thanks to HAIP, we have been able to change the interconnect network configuration of one interface without any downtime.