
This web page is no longer maintained. Information presented here exists only to avoid breaking historical links.
The Project stays maintained, and lives on: see the Linux-HA Reference Documentation.


Last site update:
2017-12-13 05:49:22

Network failover strategies

This topic has been discussed at length on the mailing list. Several things seem clear:

  • It is an important and fundamental capability for most HA clusters
  • What type of takeover strategy works best depends on local network topology. Three basic kinds have been discussed:
    • IP address takeover
    • MAC address takeover
    • Dynamic DNS reconfiguration
  • Good implementations may need custom tweaks to work optimally with the local networking hardware (routers, switches, etc.)
  • Basic function is not too hard to implement, but doing it well looks difficult. This may be an ideal candidate for the Bazaar style of development, since many people need to test in many environments to produce a good product. Since commercial clusters are often expensive, the broader testing that free software can obtain could be a significant advantage.
  • Local routers and switches may have to be "certified" if you want truly reliable operation. The same could be true for NICs, but that isn't obvious yet.

Each of these techniques has an inherent takeover speed. For example, MAC address takeover is almost instantaneous (but messy), IP address takeover is a little slower and less reliable, and Dynamic DNS reconfiguration is slower still but has nice load-balancing properties.

For scheduled outages of things like web server cluster elements, disabling a node via Dynamic DNS an hour or so beforehand could be a useful adjunct to the other techniques. One could envision a cluster of DNS servers using IP address takeover to ensure that query server switchovers don't hang clients. Microsoft clients don't seem to handle server switching very gracefully.
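The "an hour or so beforehand" rule of thumb can be made concrete with a little arithmetic. The sketch below assumes that clients and resolvers cache a node's DNS record for at most its TTL, which is not guaranteed in practice. The function name, parameters, and safety margin are hypothetical and only illustrate the calculation.

```python
from datetime import datetime, timedelta


def latest_dns_removal_time(outage_start: datetime,
                            ttl_seconds: int,
                            margin_seconds: int = 0) -> datetime:
    """Return the latest time a node's record should be removed from DNS.

    Assumption: cached copies of the record expire within ttl_seconds of
    removal. margin_seconds is an extra (hypothetical) allowance for
    clients that hold on to addresses a little longer.
    """
    if ttl_seconds < 0 or margin_seconds < 0:
        raise ValueError("ttl_seconds and margin_seconds must be non-negative")
    return outage_start - timedelta(seconds=ttl_seconds + margin_seconds)


if __name__ == "__main__":
    outage = datetime(2010, 2, 1, 12, 0, 0)
    # With a one-hour TTL and a five-minute margin, the record should be
    # removed no later than 65 minutes before the outage begins.
    print(latest_dns_removal_time(outage, ttl_seconds=3600, margin_seconds=300))
```

A shorter TTL shrinks the lead time needed, at the cost of more DNS queries during normal operation.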

There is a software package called Fake, which was designed to switch in backup servers on a LAN. It does most of what we need for IP address takeover, but it expects to be in control of the cluster, which isn't appropriate here. It has been incorporated into the heartbeat code.
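One common mechanism for IP address takeover on an Ethernet LAN is for the backup node to bring up the failed node's address and then broadcast a gratuitous ARP, so that neighbours update their ARP caches to point at the backup's MAC address. The following is a minimal sketch of building such a frame, not the code used by Fake or heartbeat. The function names are hypothetical, and the sending helper assumes Linux (AF_PACKET raw sockets) and root privileges.

```python
import socket
import struct

BROADCAST_MAC = bytes([0xFF] * 6)
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(p, 16) for p in parts)


def build_gratuitous_arp(ip: str, mac: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for ip."""
    hw = mac_to_bytes(mac)
    addr = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC as source, ARP type.
    ethernet_header = BROADCAST_MAC + hw + struct.pack("!H", ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        hw,           # sender hardware address (the node taking over)
        addr,         # sender protocol address (the address taken over)
        b"\x00" * 6,  # target hardware address (left as zeros)
        addr,         # target protocol address equals sender: "gratuitous"
    )
    return ethernet_header + arp_payload


def send_frame(interface: str, frame: bytes) -> None:
    """Send a raw frame. Assumes Linux and root privileges."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((interface, 0))
        sock.send(frame)
```

A takeover script would call something like `send_frame("eth0", build_gratuitous_arp("192.168.1.10", "00:11:22:33:44:55"))` after configuring the address locally; the interface name and addresses here are placeholders. How quickly routers and switches honour the announcement depends on the local hardware, which is why testing in many environments matters.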

NICs based on the DEC Tulip chips are especially well-suited to MAC address takeover, since they can support multiple MAC addresses on a single NIC. Otherwise, you need a spare NIC for each MAC address to be taken over.

Another class of methods for IP address takeover is discussed on the Linux Network Address Translation project web site.