
Version 1.x of Heartbeat[1] supported three actions:
Version 2 also supports only the start and stop actions directly but brings the benefit of being able to specify a timeout for either action. If the action does not complete within the timeout, the action is considered to have failed and recovery measures will be taken.
Timeouts must be specified per-action, per-resource. There is no global or resource default.
<primitive id="NameServer" class="lsb" type="named">
<operations>
<op id="1" name="stop" timeout="3s"/>
<op id="2" name="start" timeout="5s"/>
</operations>
</primitive>
One of the most requested Heartbeat features was the ability for it to detect when a resource failed (not just the whole node).
To support this, the CRM[3] also knows about monitor actions.
NOTE: monitor actions are not executed by default. If you want Heartbeat to make sure the resource is running, you must specify one or more monitor actions in the operations section of the resource. You have to define one monitor action for each of the resource's roles (e.g. role="Master", role="Slave").
In addition to the timeout field, monitor actions must also specify an interval. This tells Heartbeat how often it should check the resource's status.
This example indicates that the resource should be checked every 10 seconds to see if it is still running.
<primitive id="NameServer" class="lsb" type="named">
<operations>
<op id="1" name="stop" timeout="3s"/>
<op id="2" name="start" timeout="5s"/>
<op id="3" name="monitor" interval="10s" timeout="3s"/>
</operations>
</primitive>
Here we add a second monitor action, one that runs once per minute.
<primitive id="NameServer" class="lsb" type="named">
<operations>
<op id="1" name="stop" timeout="3s"/>
<op id="2" name="start" timeout="5s"/>
<op id="3" name="monitor" interval="10s" timeout="3s"/>
<op id="4" name="monitor" interval="1min" timeout="5s"/>
</operations>
</primitive>
NOTE: Each monitor operation for the resource must have a unique interval.
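Because duplicate intervals are easy to introduce by accident, a quick check before loading a configuration can help. The following is a minimal sketch and not part of Heartbeat: the function name duplicate_monitor_intervals is hypothetical, it assumes each op element sits on a single line as in the examples above, and it compares interval strings literally, so "60s" and "1min" would not be recognized as the same interval.

```shell
# Print every monitor interval that appears more than once in the XML on stdin.
# Hypothetical helper; assumes one <op .../> element per line.
duplicate_monitor_intervals() {
    # Keep only monitor operations, extract the interval value,
    # then report the values that occur more than once.
    grep 'name="monitor"' \
        | sed -n 's/.*interval="\([^"]*\)".*/\1/p' \
        | sort \
        | uniq -d
}
```

Pipe the operations of one resource at a time into the function. Any interval it prints is used by more than one monitor operation of that resource.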
Here we define a monitor action for a MultiState[4] (master_slave) resource.
<master_slave id="ms_1" interleave="true">
<meta_attributes id="ms_1_ma">
<attributes>
...
</attributes>
</meta_attributes>
<primitive class="ocf" id="drbd" provider="heartbeat" type="drbd">
<operations>
<!-- role="Started" is the default value -->
<op name="monitor" id="drbd_www_mon_normal" interval="15s" timeout="10s" />
<op name="monitor" id="drbd_www_mon_slave" interval="10s" timeout="10s" role="Slave" />
<op name="monitor" id="drbd_www_mon_master" interval="5s" timeout="10s" role="Master" />
</operations>
</primitive>
</master_slave>
NOTE: As always, each monitor operation for the resource must have a unique interval. Moreover, if no role="" attribute is given, role defaults to "Started".
It is also possible to pass extra parameters to a ResourceAgent[5] depending on the type of action being performed. This is done using instance_attributes[6].
The example below shows how to tell a resource agent what type of check to use when determining the resource's status.
It follows the OCF[7] standard (found here[8], specifically section 2.5.3.1) to specify the type of check to make.
<primitive id="NameServer" class="ocf" type="apache" provider="heartbeat">
<operations>
<op id="1" name="stop" timeout="3s"/>
<op id="2" name="start" timeout="5s"/>
<op id="3" name="monitor" interval="10s" timeout="3s">
<instance_attributes id="monitor_10s">
<attributes>
<nvpair id="OCF_CHECK_LEVEL_MON_10SEC" name="OCF_CHECK_LEVEL" value="0"/>
</attributes>
</instance_attributes>
</op>
<op id="4" name="monitor" interval="1min" timeout="5s">
<instance_attributes id="monitor_1min">
<attributes>
<nvpair id="OCF_CHECK_LEVEL_MON_1MIN" name="OCF_CHECK_LEVEL" value="10"/>
</attributes>
</instance_attributes>
</op>
<op id="5" name="monitor" interval="30min" timeout="20s">
<instance_attributes id="monitor_30min">
<attributes>
<nvpair id="OCF_CHECK_LEVEL_MON_30MIN" name="OCF_CHECK_LEVEL" value="20"/>
</attributes>
</instance_attributes>
</op>
</operations>
</primitive>
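On the agent side, the check level chosen in the configuration above has to be interpreted by the monitor action. The following is a minimal sketch of how that might look, assuming the value reaches the agent in an environment variable named OCF_CHECK_LEVEL. The function name describe_check_level and the labels it prints are hypothetical and only illustrate the branching; a real agent would run its actual checks in each branch.

```shell
# Sketch of check-level selection inside an OCF resource agent's monitor action.
# describe_check_level is a hypothetical helper, not part of any shipped agent.
describe_check_level() {
    # Treat a missing OCF_CHECK_LEVEL as level 0 (an assumption of this sketch).
    case "${OCF_CHECK_LEVEL:-0}" in
        0)  echo "basic" ;;       # cheapest check, run every 10 seconds above
        10) echo "medium" ;;      # more expensive check, run once per minute
        20) echo "thorough" ;;    # most expensive check, run every 30 minutes
        *)  echo "unsupported"    # level this sketch does not know about
            return 1 ;;
    esac
}
```

Pairing the cheapest level with the shortest interval, as the configuration does, keeps the frequent checks lightweight and leaves the costly ones for the long intervals.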
The example below assumes someone named named-provider has provided you with an OCF-compliant resource agent.
<primitive id="NameServer" class="ocf" type="named" provider="named-provider">
<operations>
<op id="1" name="stop" timeout="3s"/>
<op id="2" name="start" timeout="5s">
<instance_attributes id="start">
<attributes>
<nvpair id="foo" name="foo" value="bar"/>
</attributes>
</instance_attributes>
</op>
<op id="3" name="monitor" interval="10s" timeout="3s">
<instance_attributes id="monitor_10s">
<attributes>
<nvpair id="OCF_CHECK_LEVEL_MON_10S" name="OCF_CHECK_LEVEL" value="0"/>
<nvpair id="check_hosts_mon_10s" name="check_hosts" value="www.mycorp.com"/>
</attributes>
</instance_attributes>
</op>
<op id="4" name="monitor" interval="1min" timeout="5s">
<instance_attributes id="monitor_1min">
<attributes>
<nvpair id="OCF_CHECK_LEVEL_MON_1MIN" name="OCF_CHECK_LEVEL" value="10"/>
<nvpair id="check_hosts_mon_1min" name="check_hosts" value="www.mycorp.com,www.google.com"/>
</attributes>
</instance_attributes>
</op>
<op id="5" name="monitor" interval="30min" timeout="20s">
<instance_attributes id="monitor_30min">
<attributes>
<nvpair id="OCF_CHECK_LEVEL_MON_30MIN" name="OCF_CHECK_LEVEL" value="20"/>
<nvpair id="check_hosts_mon_30_min" name="check_hosts" value="www.mycorp.com,www.google.com"/>
<nvpair id="verify_with" name="verify_with" value="alt.dns.server"/>
</attributes>
</instance_attributes>
</op>
</operations>
</primitive>
Remember that parameters are named differently depending on the type of ResourceAgent[5] you are using.
LSB init scripts do not take parameters.
OCF resource scripts are the only form of resource agent that takes name/value parameters. Each parameter is passed to the agent in an environment variable named after the parameter, prefixed with OCF_RESKEY_. For example, inside the agent:
echo $OCF_RESKEY_check_hosts
www.mycorp.com,www.google.com
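An agent will usually want to process such a value rather than just print it. The following is a minimal sketch, assuming the check_hosts parameter from the named-provider example holds a comma-separated list of host names. The function name list_check_hosts is hypothetical, and the comma-separated format is an assumption based on the example values, not something the OCF standard prescribes.

```shell
# Print the hosts from the check_hosts parameter, one per line.
# list_check_hosts is a hypothetical helper for illustration only.
list_check_hosts() {
    # Print nothing if the parameter was not configured for this action.
    [ -n "${OCF_RESKEY_check_hosts:-}" ] || return 0
    # Turn the comma-separated list into one host name per line.
    printf '%s\n' "$OCF_RESKEY_check_hosts" | tr ',' '\n'
}
```

A monitor action could then loop over the output, for example with `list_check_hosts | while read -r host; do ...; done`, and run its check against each host in turn.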
Legacy Heartbeat resource agents do not support named parameters; instead, the name must refer to the (relative) position of the value as an argument.
There is no example provided here as use of action parameters with Legacy Heartbeat RAs is painful and discouraged. Please consider using the OCF version instead or converting any custom RAs to the OCF scripts. It's really quite easy.
There is a reason they are called Legacy scripts.
| [1] | http://www.linux-ha.org/HeartbeatProgram |
| [2] | http://www.linux-ha.org/ClusterResourceManager/DTD1.0/Annotated#action |
| [3] | http://www.linux-ha.org/CRM |
| [4] | http://www.linux-ha.org/v2/Concepts/MultiState |
| [5] | http://www.linux-ha.org/ResourceAgent |
| [6] | http://www.linux-ha.org/ClusterResourceManager/DTD1.0/Annotated#attributes |
| [7] | http://www.linux-ha.org/OCF |
| [8] | http://opencf.org/cgi-bin/viewcvs.cgi/specs/ra/resource-agent-api.txt?rev=1.10&content-type=text/vnd.viewcvs-markup |
This information provided courtesy of the Linux-HA project at http://linux-ha.org/