This web page is no longer maintained. Information presented here exists only to avoid breaking historical links.
The Project stays maintained, and lives on: see the Linux-HA Reference Documentation.

Last site update:
2014-04-17 16:35:45

The ha.cf file

The ha.cf file is one of the more important files to understand when configuring Heartbeat. It lists the cluster nodes, the communications topology, and which features of the configuration are enabled. This page does not discuss the FineArtOfConfiguringaCluster, which is a separate topic worthy of significant thought. For Heartbeat novices, it is probably worthwhile to read the article on GettingStartedWithHeartbeat.

Global ha.cf options

Certain options in the ha.cf file are global in nature, and their ordering matters, because each directive is interpreted as it is encountered in ha.cf.

These global options are:

It is recommended that these options be placed first in the ha.cf file when they are entered. In particular, placing the logging entries first is especially recommended.

The default values for each of these options can be found on the ha.cf Default Values page.

A Minimum ha.cf file

A minimum ha.cf file contains one or more node directives, and one or more of the communication topology (bcast, mcast, ucast, or serial) directives.
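
The rule above can be checked mechanically. The following is a minimal sketch, not part of Heartbeat: the function names and the node names in the sample are hypothetical, and the sketch assumes that '#' starts a comment, as in the samples on this page.

```python
# Directive names for the communication topology, as listed on this page.
COMM_DIRECTIVES = {"bcast", "mcast", "ucast", "serial"}


def directive_names(ha_cf_text):
    """Return the directive name (first word) of every non-empty, non-comment line."""
    names = []
    for line in ha_cf_text.splitlines():
        # Assumption: '#' starts a comment, as in the samples on this page.
        body = line.split("#", 1)[0].strip()
        if body:
            names.append(body.split()[0])
    return names


def is_minimum_ha_cf(ha_cf_text):
    """True if the text has at least one node line and one communication line."""
    names = directive_names(ha_cf_text)
    has_node = "node" in names
    has_comm = any(name in COMM_DIRECTIVES for name in names)
    return has_node and has_comm


sample = """
node alpha beta     # hypothetical node names
bcast eth0
"""
print(is_minimum_ha_cf(sample))         # True
print(is_minimum_ha_cf("node alpha"))   # False: no communication path
```

The check only looks at directive names. It does not validate arguments, and it does not account for the autojoin directive, which relaxes the need for node directives.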

An Exhaustive Alphabetical List of ha.cf Directives

apiauth - API authorization directive

The apiauth directive specifies what users and/or groups are allowed to connect to a specific API group name. The syntax is simple:

apiauth apigroupname [uid=uid1,uid2 ...] [gid=gid1,gid2 ...]

You must specify a uid list, a gid list, or both. If you include both a uid list and a gid list, then a process is authorized to connect to that API group if it is either in the uid list or in the gid list.
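
The either/or rule can be sketched in a few lines of Python. This is an illustration only, not Heartbeat's implementation: the function names are hypothetical, and a client is modelled as having one user name and one group name.

```python
def parse_apiauth(line):
    """Parse 'apiauth group [uid=...] [gid=...]' into (group, uids, gids)."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "apiauth":
        raise ValueError("not an apiauth directive: %r" % line)
    group, uids, gids = fields[1], set(), set()
    for field in fields[2:]:
        key, _, value = field.partition("=")
        names = {name for name in value.split(",") if name}
        if key == "uid":
            uids |= names
        elif key == "gid":
            gids |= names
        else:
            raise ValueError("unknown apiauth field: %r" % field)
    if not uids and not gids:
        # At least one of the two lists is required.
        raise ValueError("apiauth needs a uid list or a gid list")
    return group, uids, gids


def is_authorized(user, group, uids, gids):
    """Accept a client whose user is in the uid list OR whose group is in the gid list."""
    return user in uids or group in gids


# Hypothetical directive combining both lists.
_, uids, gids = parse_apiauth("apiauth ipfail uid=hacluster gid=haclient")
print(is_authorized("nobody", "haclient", uids, gids))   # True: group matches
print(is_authorized("nobody", "users", uids, gids))      # False: neither matches
```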

The API group name default has special meaning. If it is specified, it will be used for authorizing clients without any API group name, and all client groups not identified by any other apiauth directive.

Unless you specify otherwise in the ha.cf file, certain services will be provided default authorizations as follows:

service          default apiauth
ipfail           uid=hacluster
ccm              gid=haclient
ping             gid=haclient
cl_status        gid=haclient
lha-snmpagent    uid=root
crm              uid=hacluster

auto_failback directive - set failback policy

The auto_failback option determines whether a resource will automatically fail back to its "primary" node, or remain on whatever node is serving it until that node fails, or an administrator intervenes.

The possible values for auto_failback are:

  • on - enable automatic failbacks

  • off - disable automatic failbacks

  • legacy - enable automatic failbacks in systems where all nodes in the cluster do not yet support the auto_failback option.

Both auto_failback on and auto_failback off are backwards compatible with the old "nice_failback on" setting.

See the FAQ document for information on how to convert from "legacy" to "on" without a flash cut (i.e., using a RollingUpgrade process).

The default value for auto_failback is "legacy", which will issue a warning at startup. So, make sure you put an auto_failback directive in your ha.cf file (note: auto_failback can be any HeartbeatBoolean value or legacy). Typically, you want to set auto_failback on for an ActiveActive cluster, and commonly to off for an ActivePassive cluster.

NOTE: auto_failback does not have any effect on a Release 2 CRM-style cluster (one configured with crm on). For CRM-style clusters, this has been replaced with the default_resource_stickiness attribute in the CIB.

autojoin - enables automatic node joining

The autojoin directive enables nodes to join automatically just by communicating with the cluster, hence not requiring node directives in the ha.cf file. Since our communication is normally strongly authenticated, only nodes which know the cluster key can join (automatically or otherwise).

The general syntax of the autojoin directive is:

autojoin (none|other|any)

All legal autojoin directives are shown below:

autojoin none
autojoin other
autojoin any

The values you can give for the autojoin directive have the following meanings:

  • none: disables automatic joining.

  • other: allows nodes other than the local node that are not listed in ha.cf to join automatically. In other words, our node has to be listed in ha.cf, but other nodes do not.

  • any: allows any node to join automatically without being listed in ha.cf, even the current node.

Note that the set of nodes currently considered part of the cluster is kept in the hostcache file.

With autojoin enabled, the node directive is no longer authoritative - the hostcache file is.

baud - set serial communication speed

The baud directive is used to set the speed for serial communications. Any of the following speeds can be specified, provided they are supported by your operating system: 9600, 19200, 38400, 57600, 115200, 230400, 460800. The default speed is 19200. A sample baud directive is shown below:

baud 38400

bcast - configure broadcast communication path

The bcast directive is used to configure which interfaces Heartbeat sends UDP broadcast traffic on. More than one interface can be specified on the line. The port used for these broadcast communications is set by the udpport directive, provided udpport is specified before the bcast directive; otherwise the default port is used. A couple of sample bcast lines are shown below.

bcast eth0 eth1  # on Linux systems
bcast le0        # for Solaris systems

On CRM-enabled clusters, the bcast directive does not work on FreeBSD and OpenBSD because of the fragmentation issue described in

W. Richard Stevens - Unix Network Programming - Vol 1 - 3rd Edition: The Sockets Networking API

20.4 dg_cli Function Using Broadcasting
..
IP Fragmentation and Broadcasts

Berkeley-derived kernels do not allow a broadcast datagram to be fragmented.
If the size of an IP datagram that is being sent to a broadcast address exceeds
the outgoing interface MTU, EMSGSIZE is returned (pp. 233-234 of TCPv2).
This is a policy decision that has existed since 4.2BSD. There is nothing that
prevents a kernel from fragmenting a broadcast datagram, but the feeling is
that broadcasting puts enough load on the network as it is, so there is no need
to multiply this load by the number of fragments.
....
AIX, FreeBSD, and MacOS implement this limitation. Linux, Solaris, and HP-UX
fragment datagrams sent to a broadcast address. 

This happens because CRM clusters try to send large (greater than MTU size) packets over the cluster communication media.

compression - set compression method

The compression directive sets which compression method is used when a message is large enough to need compression.

The method can be either zlib or bz2, depending on which of the corresponding libraries is installed on the system. You can check /usr/lib/heartbeat/plugins/HBcompress to see which compression modules are available.

If this directive is not set, there will be no compression.

compression    <bz2/zlib>

compression_threshold - set compression threshold for a message

The compression_threshold directive sets the threshold to compress a message, e.g. if the threshold is 1, then any message with size greater than 1 KB will be compressed. The default is 2 (KB). This directive only makes sense if you have set the compression directive.

compression_threshold 2

conn_logd_time - set interval for reconnecting to the logging daemon

The conn_logd_time directive specifies how long Heartbeat waits before trying to reconnect to the logging daemon if the connection between Heartbeat and the logging daemon is broken. The conn_logd_time value is specified according to the HeartbeatTimeSyntax. For example,

conn_logd_time 60 # 60 seconds

Default is 60 seconds.

Note: Heartbeat will not automatically reconnect to the logging daemon. It only tries to reconnect when it needs to log a message and at least conn_logd_time has passed since the last attempt to connect.

coredumps - enable capturing core dumps

The coredumps directive tells Heartbeat to take the steps needed to enable core dumps, should it ever need to dump core.

The general syntax of a coredumps directive is:

coredumps HeartbeatBoolean

The most common coredumps directives are shown below:

coredumps true
coredumps false

crm - enabling and disabling the 2.x cluster manager

The crm directive specifies whether Heartbeat should run the 1.x-style cluster manager or the 2.x-style cluster manager that supports more than 2 nodes.

The syntax is simple:

crm off|on|respawn

When set to on or respawn, the directive automatically implies:

        apiauth stonithd        uid=root
        apiauth crmd            uid=hacluster
        apiauth cib             uid=hacluster

        respawn hacluster       ccm
        respawn hacluster       cib
        respawn root            stonithd
        respawn root            lrmd
        respawn hacluster       crmd

deadping - set failure (death) detection time for ping nodes

The deadping directive is used to specify how quickly Heartbeat should decide that a ping node in a cluster is dead. Setting this value too low will cause the system to falsely declare the ping node dead. Setting it too high will delay detection of communication failure.

The deadping value is specified according to the HeartbeatTimeSyntax. Two sample deadping specifications are shown below.

deadping 20    # 20 seconds
deadping 750ms # 750 milliseconds

deadtime - set failure (death) detection time

The deadtime directive is used to specify how quickly Heartbeat should decide that a node in a cluster is dead. Setting this value too low will cause the system to falsely declare itself dead. Setting it too high will delay takeover after the failure of a node in the cluster. Please read the FAQ document for more information on how to configure (tune) this important parameter.

The deadtime value is specified according to the HeartbeatTimeSyntax. Two sample deadtime specifications are shown below.

deadtime 10    # 10 seconds
deadtime 250ms # 250 milliseconds (1/4 second)
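
For readers scripting around ha.cf, the two forms used in the samples on this page (a bare number of seconds, and a number with an ms suffix) can be converted as sketched below. The function name is hypothetical, and the sketch covers only these two forms; the full HeartbeatTimeSyntax is defined on its own page and may allow more.

```python
def parse_time_seconds(value):
    """Convert a time value such as '10' or '250ms' to seconds (float)."""
    text = value.strip().lower()
    if text.endswith("ms"):
        # Millisecond form, e.g. '250ms'.
        return float(text[:-2]) / 1000.0
    # Bare number: seconds, as in 'deadtime 10'.
    return float(text)


print(parse_time_seconds("10"))      # 10.0
print(parse_time_seconds("250ms"))   # 0.25
```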

debug - set debug level

The debug directive is used to set the level of debugging in effect in the system. Production systems should have their debug level set to zero (i.e., turned off). This is the default. Legal values of the debug option range from 0 to 255. The most useful values are between 0 (off) and 3. Setting the debug level greater than 1 can have an adverse effect on the size of your log files, and on the system's ability to send heartbeats at rapid rates, thus affecting the cluster reliability.

The debug level of the system can also be specified on the command line using the -d option. Additionally, the debug level of the system can be dynamically changed by sending the heartbeat process SIGUSR1 and SIGUSR2 signals. SIGUSR1 raises the debug level, and SIGUSR2 lowers it. A sample debug directive is shown below.

debug 0

debugfile - configures file for debug messages

The debugfile directive is deprecated for version 2.x configurations. Please enable the use_logd directive instead.

The debugfile directive specifies the file Heartbeat will write debug messages to.

A sample debugfile directive is shown below:

debugfile /var/log/ha-debug

See Also

ha.cf/UseLogdDirective

hbaping directive

Hbaping directives are given to declare Fibre Channel devices as PingNodes to Heartbeat.

The syntax of the hbaping directive is simple:

hbaping fc-card-name

The fc-card-name is the name obtained from the hbaapitest program that is part of the hbaapi package mentioned below. Running hbaapitest will produce verbose output. One of the first lines is similar to:

  • Adapter number 0 is named: qlogic-qla2200-0

Here fc-card-name is qlogic-qla2200-0.

This directive is not normally enabled in distributed versions of the Linux-HA software. To enable this directive, follow these steps:

  • Obtain the source to the HBAAPI library from http://hbaapi.sourceforge.net,

  • Compile it. (Unfortunately, the Makefile included in the tarball does not work on Linux. You can download the Makefile for Linux here.)

  • Copy the libHBAAPI.so file it produced into /usr/lib,

  • Copy the hbaapi.h file from the package to /usr/include,

  • Obtain and install the vendor-specific HBAAPI plugin specific to your HBA (Host Bus Adapter) from your HBA vendor,
  • Configure, compile and install the Linux-HA (Heartbeat) package.
    • As an alternative:
      1. install Heartbeat from RPM
      2. configure and compile Heartbeat from same-version source, and manually copy only the hbaping.so file to /usr/lib/heartbeat/plugins/HBcomm.

hbgenmethod - specifies method for creating Heartbeat communications generation number

The hbgenmethod directive specifies how Heartbeat should compute its current generation number for communications. This is a specialized and obscure directive, used mainly in firewalls which have no local disk, and other devices which do not have a method of storing data persistently across reboots. It defaults to storing the Heartbeat generations in a file. Generation numbers are used by Heartbeat for replay attack protection.

All legal hbgenmethod directives are shown below:

hbgenmethod time
hbgenmethod file # this is the default.

Caveats

If one specifies the time method, trouble can arise in certain cases. If a machine restarts Heartbeat and its local time of day clock is less than or equal to the value of the time of day clock when Heartbeat last started, then that node will be unable to join the cluster.

hopfudge - sets serial port forwarding maximum count

The hopfudge directive controls how many nodes a packet can be forwarded through, in the worst case, before it is thrown away. The limit is the number of nodes in the system plus the hopfudge value. It defaults to 1.

A sample hopfudge directive is shown below:

hopfudge 1

initdead - set initial deadtime detection interval

The initdead parameter is used to set the time that it takes to declare a cluster node dead when Heartbeat is first started. This parameter generally needs to be set to a higher value than deadtime, because experience suggests that operating systems sometimes need many seconds before their communication systems operate correctly. initdead is specified according to the HeartbeatTimeSyntax. A sample initdead value is shown below:

initdead 30

In some switched network environments, switches engage in a spanning tree algorithm whenever a NIC connects to a port. This can take a long time to complete, and it is only necessary if the NIC being connected is another switch. If this is the case, you may be able to configure certain NICs as not being switches and shrink the connection delay significantly. If not, you'll need to raise initdead to make this problem go away.

If initdead is set too low, you'll see one node declare the other as dead, and for non-CRM clusters, you'll see "both nodes own XXX resources" in the logs.

keepalive - set heartbeat keep-alive interval

The keepalive directive sets the interval between heartbeat packets. It is specified according to the HeartbeatTimeSyntax.

Two sample keepalive directives are shown below:

keepalive 100ms

keepalive 2 # 2 seconds

logfacility - configures syslog logging facility

The logfacility directive tells Heartbeat which syslog logging facility it should use for logging its messages.

The possible values for logfacility vary by operating system, but some of the most common ones are {auth, authpriv, daemon, syslog, user, local0, local1, local2, local3, local4, local5, local6, local7}.

A sample logfacility directive is shown below:

logfacility local7

If you want to disable logging to syslog:

logfacility none

logfile - configures logging file

The logfile directive is deprecated for version 2.x configurations. Please enable the use_logd directive instead.

The logfile directive configures a log file. All non-debug messages from Heartbeat will go into this file.

A sample logfile directive is shown below:

logfile /var/log/ha-log

Caveats

Configuring a log file (instead of using syslog logging) can cause Heartbeat to block for several seconds under heavy load. This can affect the deadtime required for the system.

See Also

ha.cf/UseLogdDirective

mcast - configures multicast communication path

The mcast directive is used to configure a multicast communication path.

The syntax of an mcast directive is:

mcast dev mcast-group udp-port ttl 0

  • dev - IP device to send/receive heartbeats on

  • mcast-group - multicast group to join (class D multicast address 224.0.0.0 - 239.255.255.255). For most Heartbeat uses, the first byte should be 239.

  • udp-port - UDP port to send to/receive from (set this to the same value as udpport)

  • ttl - the TTL value for outbound heartbeats (1-255). This affects how far the multicast packet will propagate. Set to 1 for the current subnet. Must be greater than zero.

A sample mcast directive is shown below:

mcast eth0 239.0.0.1 694 1 0
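
The field rules for mcast lend themselves to a quick sanity check before Heartbeat is started. The sketch below is a hypothetical helper, not part of Heartbeat. It uses Python's standard ipaddress module to test for a multicast address and treats the final field as the literal 0 shown in the syntax line.

```python
import ipaddress


def parse_mcast(line):
    """Parse 'mcast dev mcast-group udp-port ttl 0' and check the documented ranges."""
    fields = line.split("#", 1)[0].split()
    if len(fields) != 6 or fields[0] != "mcast":
        raise ValueError("expected: mcast dev mcast-group udp-port ttl 0")
    _, dev, group, port, ttl, last = fields
    address = ipaddress.ip_address(group)
    if not address.is_multicast:
        raise ValueError("%s is not a multicast (class D) address" % group)
    if not 1 <= int(ttl) <= 255:
        raise ValueError("ttl must be between 1 and 255")
    if last != "0":
        raise ValueError("the final field is written as 0 in the documented syntax")
    return {"dev": dev, "group": str(address), "port": int(port), "ttl": int(ttl)}


print(parse_mcast("mcast eth0 239.0.0.1 694 1 0"))
# {'dev': 'eth0', 'group': '239.0.0.1', 'port': 694, 'ttl': 1}
```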

Bugs

This directive has a few more parameters than it should.

See Also

Wikipedia, The TCP/IP guide

msgfmt - set the message format on the wire

The msgfmt directive specifies the format Heartbeat uses on the wire.

msgfmt  <classic/netstring>

Default is classic.

  • classic - Heartbeat will convert a message into a string and transmit it on the wire. Binary values are converted with a base64 library.

  • netstring - Binary messages will be transmitted directly. This is more efficient since it avoids conversion between string and binary values.

If not sure, choose classic (default).

node directive

The node directive specifies which machines are in the cluster. The syntax of the node directive is simple:

node nodename1 nodename2 ...

Node names in the directive must (normally) match the "uname -n" of that machine.

You can declare multiple node names in one directive. You can also use the directive multiple times. Normally every node in the cluster must be listed in the ha.cf file, including the current node, unless the autojoin directive is enabled.

Note that starting with 2.0.4, the node directive is not completely authoritative with regard to the nodes Heartbeat will communicate with. If a node has ever been added in the past, it will tend to remain in the hostcache file until it is manually removed. See Also: http://www.osdl.org/developer_bugzilla/show_bug.cgi?id=1226

ping directive

Ping directives are given to declare PingNodes to Heartbeat.

The syntax of the ping directive is simple:

ping ip-address ...

Each IP address listed in a ping directive is considered to be independent. That is, connectivity to each node is considered to be equally important.

In order to declare that a group of nodes are equally qualified for a particular function, and that the presence of any of them indicates successful communication, use the ping_group directive.

ping_group directive

Ping group directives are given in the ha.cf file to declare a group PingNode to Heartbeat.

The syntax of the ping_group directive is simple:

ping_group group-name ip-address ...

Each IP address listed in a ping_group directive is considered to be related, and connectivity to any one node is considered to be connectivity to the group.

A ping group is considered by Heartbeat to be a single cluster node (group-name). The ability to communicate with any of the group members means that the group-name member is reachable. This is useful when (for example) two different routers may be used to contact the internet, depending on which is up, or when finding an appropriate reliable single ping node is difficult.
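
The difference between ping and ping_group can be summarized in code: each ping address is judged on its own, while a group is reachable as soon as any one member answers. The following is a minimal sketch with hypothetical names and addresses; it models only the reachability rule, not how Heartbeat actually probes the nodes.

```python
def ping_status(ping_nodes, ping_groups, reachable):
    """Map each ping node or group name to True (reachable) or False.

    ping_nodes:  list of IP addresses from ping directives
    ping_groups: dict of group-name -> list of member IP addresses
    reachable:   set of IP addresses that currently answer
    """
    # Each ping address is an independent ping node.
    status = {ip: ip in reachable for ip in ping_nodes}
    # A group counts as one node; any answering member makes it reachable.
    for name, members in ping_groups.items():
        status[name] = any(ip in reachable for ip in members)
    return status


status = ping_status(
    ping_nodes=["10.10.10.1", "10.10.10.2"],
    ping_groups={"routers": ["10.10.10.253", "10.10.10.254"]},
    reachable={"10.10.10.1", "10.10.10.254"},
)
print(status)
# {'10.10.10.1': True, '10.10.10.2': False, 'routers': True}
```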

realtime - enable realtime features in Heartbeat

The realtime directive specifies whether or not Heartbeat should try to take advantage of the operating system's realtime scheduling features. When enabled, Heartbeat will lock itself into memory and raise its priority to a realtime priority (as set by the rtprio directive). This feature is mainly used for debugging various kinds of loops which might otherwise cripple the system and hinder debugging them. The realtime flag is a HeartbeatBoolean value, whose default value is true. A sample realtime directive is shown below.

realtime on

respawn - specifies programs for Heartbeat to run at startup

The respawn directive is used to specify a program to run and monitor while it runs. If this program exits with anything other than exit code 100, it will be automatically restarted. The first parameter is the user id to run the program under, and the second parameter is the program to run. Subsequent parameters will be given to the program as arguments.

At the current time, the program most people will be interested in running this way is ipfail.

A sample respawn directive is shown below:

respawn hacluster /usr/lib/heartbeat/ipfail
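
The restart rule for respawn (restart on any exit code except 100) can be sketched as a small loop. This is an illustration of the rule only, not Heartbeat's process management: run_once is a hypothetical callable that starts the program, waits for it, and returns its exit code, and max_runs exists only to keep the sketch from looping forever.

```python
def supervise(run_once, max_runs=10):
    """Call run_once() until it returns exit code 100; return all exit codes seen."""
    codes = []
    for _ in range(max_runs):
        code = run_once()
        codes.append(code)
        if code == 100:
            # Exit code 100 is the only one that stops the restarts.
            break
    return codes


# Simulated exit codes of successive runs of a hypothetical program.
exit_codes = iter([1, 0, 100, 7])
print(supervise(lambda: next(exit_codes)))   # [1, 0, 100]
```

Note that a clean exit (code 0) also leads to a restart under this rule.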

SECURITY NOTE: It is a bad security practice to run programs from Heartbeat as root unless they are prepared to change their user ids once they're started. None of the programs which come with Heartbeat to be used with respawn should be run as root. Do not run them as root. If you ignore this advice, just remember that BadThingsMayHappen, and don't blame us.

rtprio - specifies Heartbeat's realtime priority

The rtprio directive is used to specify the priority at which Heartbeat runs. It does not need to be specified unless other realtime priority programs are also running on the system. The minimum and maximum values for this field can be determined from the sched_get_priority_min(SCHED_FIFO) and sched_get_priority_max(SCHED_FIFO) calls respectively. The default value for rtprio is halfway between the minimum and maximum values.

A sample rtprio directive is shown below:

rtprio 5
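
To see which rtprio values your system accepts, you can query the same scheduler limits from Python, which exposes them in the os module on Linux and some other Unix systems. The sketch below is a convenience helper with a hypothetical name. Rounding the midpoint down is an assumption; this page only says the default is halfway between the two limits.

```python
import os


def rtprio_range():
    """Return (minimum, maximum, midpoint) of the SCHED_FIFO priority range."""
    low = os.sched_get_priority_min(os.SCHED_FIFO)
    high = os.sched_get_priority_max(os.SCHED_FIFO)
    # Assumption: the "halfway" default is rounded down to an integer.
    return low, high, (low + high) // 2


low, high, middle = rtprio_range()
print("rtprio may range from %d to %d; midpoint %d" % (low, high, middle))
```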

serial - configure serial communication path

The serial directive tells Heartbeat to use the specified serial port(s) for its communication. The parameters to the serial directive are the names of tty devices suitable for opening without waiting for carrier first. On Linux, those ports are typically named /dev/ttySX.

A few sample serial directives are shown below:

serial /dev/ttyS0 /dev/ttyS1     # Linux
serial /dev/cuaa0                # FreeBSD
serial /dev/cua/a                # Solaris

The baud rate for the port(s) is set by the baud directive, provided baud is specified before the serial directive; otherwise the default baud rate is used.

stonith directive

The stonith directive is used to configure Heartbeat's STONITH configuration (release 1 only). It assumes you're going to put in a STONITH configuration file on each machine in the cluster to configure the (single) STONITH device that this node will use to reset the other node in the cluster.

Sample stonith directive

stonith {stonith-device-type} {stonith-configuration-file}

where {stonith-device-type} is the type of (supported) STONITH device being configured, and {stonith-configuration-file} is the name of the file in which you put the STONITH configuration information for this particular STONITH device.

To get a list of valid {stonith-device-type}s, issue this command:

stonith -L

To get a list of how to configure each type of STONITH device, issue the following command:

stonith -h

NOTE: This directive is mutually exclusive with the stonith_host directive.

stonith_host directive

The stonith_host directive is used to configure Heartbeat's STONITH configuration (release 1 only). With this directive, you put all the STONITH configuration information for the devices in your cluster in the ha.cf file, rather than in a separate file.

You can configure multiple stonith devices using this directive. The format of the line is:

stonith_host {hostfrom} {stonith_type} {params...} 
  • {hostfrom} is the machine the stonith device is attached to or * to mean it is accessible from any host.

  • {stonith_type} is the type of stonith device

  • {params...} are the configuration parameters this STONITH device requires.

Only one stonith_host directive can have a * for {hostfrom}.

Caveats:
If you put your stonith device access information in ha.cf, and you make this file publicly readable, you're inviting a denial of service attack.

To get a list of valid {stonith_type}s, issue this command:

stonith -L

To get a list of {params...} for each type of STONITH device, issue the following command:

stonith -h

NOTE: This directive is mutually exclusive with the stonith directive.

traditional_compression - controls compression mode

The general syntax of a traditional_compression directive is:

traditional_compression HeartbeatBoolean

A sample traditional_compression directive is shown below:

traditional_compression false

Note that it is highly recommended that you set traditional_compression to false - otherwise Heartbeat performance can suffer significantly.

ucast - configures unicast Heartbeat communication

The ucast directive configures Heartbeat to communicate over a UDP unicast communications link. The port used for these unicast communications is set by the udpport directive, provided udpport is specified before the ucast directive; otherwise the default port is used.

The general syntax of a ucast directive is:

ucast dev peer-ip-address

Where dev is the device to use when talking to the peer, and peer-ip-address is the IP address we will send packets to.

Although this is a unicast communication link, the UDP packets sent over this link carry a multicast protocol.

A sample ucast directive is shown below:

ucast eth0 10.10.10.133

This directive will cause us to send packets to 10.10.10.133 over interface eth0.

Note that ucast directives which go to the local machine are effectively ignored. This allows the ha.cf directives on all machines to be identical.

udpport - specifies port for UDP communication

The udpport directive specifies which port Heartbeat will use for its UDP intra-cluster communication. There are two common reasons for overriding this value: there are multiple bcast clusters on the same subnet, or this port is already in use in accordance with some locally-established policy.

The default value for this parameter is the port ha-cluster in /etc/services (if present), or 694 if port ha-cluster is not in /etc/services. 694 is the IANA registered port number for Heartbeat (a.k.a. ha-cluster).

A sample udpport directive is shown below.

udpport 694

You have to configure udpport (in ha.cf) before you configure ucast or bcast; otherwise Heartbeat will use the default port (694).
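
Because directives are interpreted in the order they appear, a udpport line only affects the bcast and ucast lines that follow it. The sketch below illustrates that ordering rule. The function name and port 695 are hypothetical, and the sketch uses 694 as the default without consulting /etc/services.

```python
DEFAULT_PORT = 694  # default named on this page when ha-cluster is not in /etc/services


def effective_ports(lines):
    """Return (directive line, port) for each bcast/ucast line, read top to bottom."""
    port = DEFAULT_PORT
    result = []
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue
        if fields[0] == "udpport":
            # Affects only the communication directives that follow.
            port = int(fields[1])
        elif fields[0] in ("bcast", "ucast"):
            result.append((" ".join(fields), port))
    return result


config = ["bcast eth0", "udpport 695", "ucast eth1 10.10.10.133"]
print(effective_ports(config))
# [('bcast eth0', 694), ('ucast eth1 10.10.10.133', 695)]
```

Here the bcast line precedes udpport and therefore keeps the default port, which is usually not what was intended.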

NOTE: The GUI doesn't use UDP, and isn't intracluster communications, so GUI communication is not affected by this directive.

BUGS: Due to a specification error in the syntax of the mcast directive, this directive does not apply to mcast communications.

use_logd - determines whether Heartbeat uses the logging daemon

The use_logd directive specifies whether Heartbeat logs its messages through the logging daemon or not. The syntax is simple:

use_logd <on/off>

(Note: use_logd can be any HeartbeatBoolean value)

The detailed policy is:

1. if there is any entry for debugfile/logfile/logfacility in ha.cf

  • a) if use_logd is not set, logging daemon will not be used

  • b) if use_logd is set to on, logging daemon will be used

  • c) if use_logd is set to off, logging daemon will not be used

2. if there is no entry for debugfile/logfile/logfacility in ha.cf

  • a) if use_logd is not set, logging daemon will be used

  • b) if use_logd is set to on, logging daemon will be used

  • c) if use_logd is set to off, config error, i.e. you can not turn off all logging options
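
The six cases above reduce to a short decision function. The following sketch is an illustration of the policy as stated here, not Heartbeat's own code; the function and parameter names are hypothetical.

```python
def uses_logging_daemon(has_log_entries, use_logd=None):
    """Decide whether the logging daemon is used.

    has_log_entries: True if ha.cf has any debugfile/logfile/logfacility entry
    use_logd:        None if the directive is absent, else True (on) or False (off)
    """
    if has_log_entries:
        # Case 1: explicit log destinations exist, so the daemon is used only on request.
        return use_logd is True
    if use_logd is False:
        # Case 2c: nothing would be left to log to.
        raise ValueError("config error: cannot turn off all logging options")
    # Cases 2a and 2b: no explicit destinations, so the daemon is used.
    return True


print(uses_logging_daemon(has_log_entries=True))                  # False (case 1a)
print(uses_logging_daemon(has_log_entries=True, use_logd=True))   # True  (case 1b)
print(uses_logging_daemon(has_log_entries=False))                 # True  (case 2a)
```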

If the logging daemon is used, all log messages will be sent through IPC to the logging daemon, which then writes them into log files. In case the logging daemon dies (for whatever reason), a warning message will be logged and all messages will be written to log files directly.

If the logging daemon is used, the logfile/debugfile/logfacility entries in this file are no longer meaningful. You should check the config file for the logging daemon (the default is /etc/logd.cf).

If the logging daemon is not used, all log messages will be written to log files directly.

The logging daemon is started and stopped by the heartbeat script.

Setting use_logd to "yes" is recommended.

uuidfrom - selects how the local UUID is generated

In the normal case, heartbeat generates a UUID for each node in the system as a way of uniquely identifying a node - even if it should change nodenames. This UUID is typically stored in the file /var/lib/heartbeat/hb_uuid.

For certain kinds of installations (those booting from CDs or other read-only media), it is impossible for heartbeat to save a generated UUID to disk as it normally does. In these cases, one can use the uuidfrom directive to instruct heartbeat to use the nodename as though it were a UUID, by specifying uuidfrom nodename.

All possible legal uuidfrom directives are shown below.

uuidfrom file
uuidfrom nodename

warntime - set late heartbeat warning time

The warntime directive is used to specify how quickly Heartbeat should issue a "late heartbeat" warning.

The warntime value is specified according to the HeartbeatTimeSyntax. A sample warntime specification is shown below.

warntime 10    # 10 seconds

The warntime directive is important for tuning deadtime.

watchdog - configure watchdog device

The watchdog directive configures Heartbeat to use a watchdog device. In some circumstances, a watchdog device can be used in place of a STONITH device. In any case, it is a reasonable thing to configure if you don't have a STONITH device, or if you wish, in addition to your STONITH device.

It is the purpose of a watchdog device to shut the machine down if Heartbeat does not hear its own heartbeats as often as it thinks it should. This keeps things like scheduler bugs from becoming split-brain configurations.

The general syntax of a watchdog directive is:

watchdog watchdog-device-name

A sample watchdog directive is shown below:

watchdog /dev/watchdog

The most common watchdog device currently used with general Linux systems is the softdog device. The softdog device is a software-based watchdog device and is usually referred to as /dev/watchdog - although like most UNIX devices, this is a convention not a rule.

Special Watchdog Caveats

Heartbeat tries to set the watchdog device to reboot the system at the next second after it would declare itself dead.

It also tries to ensure that, if it is shut down gracefully, it will keep the system from rebooting when it exits. However, this behavior is out of its hands. It depends on the watchdog device driver. For the softdog driver, see the softdog page for details on how you can make this work the way you want it to.

See Also

haresources, authkeys, ha.cf Default Values, Configuring Heartbeat