Last site update:
2008-05-17 09:33:33

CIB Idioms

Like any language, the CIB has certain good ways to express certain things. That is not to say there are no equivalent ways, but certain ways of expressing things are known to be good, to work, and to have minimal side-effects.

We call those CIB Idioms. One could also call them CIB recipes, or CIB HOWTOs. The ideas are about the same.

This page is dedicated to cataloging the most common idioms.

Contents

  1. CIB Idioms
  2. Simple Resource Location Idioms
    1. Locating a Resource on Only Certain Nodes
    2. Locating a Resource on Nodes with a Certain Attribute
  3. Master/Slave Idioms
    1. Colocating a resource with a resource in master state
    2. Colocating a resource with a resource in slave state
    3. Controlling which node becomes master
    4. Controlling which nodes remain slaves
  4. Pingd-related idioms
    1. Run a resource on the node with the best external connectivity
    2. Make a resource stop when external connectivity is lost
  5. Time-based idioms
    1. Restricting failback to weekends
    2. Running an expensive monitor only at night
  6. Reboot-related idioms
    1. Causing a node to reboot when a resource stop fails
  7. STONITH-related idioms
    1. Configure an iLO STONITH device with the riloe plugin
    2. Configure an IBM RSA STONITH device with the ibmrsa plugin
    3. Configure an IBM RSA STONITH device with the ibmrsa-telnet plugin

Simple Resource Location Idioms

Locating a Resource on Only Certain Nodes

It is sometimes necessary to restrict a resource so that it will only run on certain nodes, or nowhere at all. The example below creates such a restriction:

<rsc_location id="rloc_unique_id" rsc="my_resource">
  <rule score="-INFINITY" boolean_op="and" id="rcloc_rule_unique_id">
    <expression attribute="#uname" operation="ne" value="host1" id="expr:h1"/>
    <expression attribute="#uname" operation="ne" value="host2" id="expr:h2"/>
    <expression attribute="#uname" operation="ne" value="host3" id="expr:h3"/>
  </rule>
</rsc_location>

Translating this into English: We will never schedule resource my_resource to run on any node unless it is one of the set of host1, host2, or host3. This is equivalent to this expression:

  if ($uname != host1 && $uname != host2 && $uname != host3)
      {score=-INFINITY;};

This creates an absolute prohibition from running on any other node, regardless of anything else in the CIB relating to my_resource.

Locating a Resource on Nodes with a Certain Attribute

It is sometimes necessary to restrict a resource so that it will only run on nodes with a certain attribute value, or nowhere at all. The example below creates such a restriction:

<rsc_location id="rloc_unique_id" rsc="my_resource">
  <rule score="-INFINITY" boolean_op="or" id="rcloc_rule_unique_id">
    <expression id="expr:hfc:undefined"
         attribute="has_fibre_channel" operation="not_defined"/>
    <expression id="expr:hfc:false"
         attribute="has_fibre_channel" operation="ne" value="true"/>
  </rule>
</rsc_location>

Translating this into English: We will never schedule resource my_resource to run on any node unless it has the attribute "has_fibre_channel" with a value of "true". This is equivalent to this expression:

  if (!defined($has_fibre_channel) || $has_fibre_channel != true)
      {score=-INFINITY;};

This creates an absolute prohibition from running on any node without that attribute value, regardless of anything else in the CIB relating to my_resource. Note that if you have an agent (a cron job, for example) which sets has_fibre_channel to false when the fibre channel connection fails, then that node will become unable to run my_resource.
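
The rule above tests a node attribute. As a rough sketch of where such an attribute might live, the fragment below shows a node entry in the nodes section of the CIB, assuming the same instance_attributes/attributes/nvpair layout used by the other examples on this page. The node id, the uname host1, and the ids on instance_attributes and nvpair are placeholders, not values from a real cluster.

```xml
<!-- Hypothetical node entry: the id and uname are placeholders -->
<node id="node-uuid-placeholder" uname="host1" type="normal">
  <instance_attributes id="host1:attrs">
    <attributes>
      <!-- The attribute tested by the has_fibre_channel rule -->
      <nvpair id="host1:has_fibre_channel"
              name="has_fibre_channel" value="true"/>
    </attributes>
  </instance_attributes>
</node>
```

A node without this nvpair matches the not_defined expression, and a node with any value other than "true" matches the ne expression, so either one receives the -INFINITY score.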

Master/Slave Idioms

Colocating a resource with a resource in master state

It is very common to want to have a certain resource run only on a node which is master for a given resource.

You can either make sure the resource runs on the master, which is probably the more common approach:

<rsc_order id="ms-drbd0_before_fs0" from="fs0" action="start" to="ms-drbd0" to_action="promote"/>
<rsc_colocation id="fs0_on_ms-drbd0" to="ms-drbd0" to_role="master" from="fs0" score="infinity"/>

Translation:

  • Start fs0 after ms-drbd0 has been promoted
  • Only start fs0 when ms-drbd0 is running as master

Or you can make sure the resource is not run on the slave:

<rsc_order id="ms-drbd0_before_fs0" from="fs0" action="start" to="ms-drbd0" to_action="promote"/>
<rsc_colocation id="fs0_not_on_stopped_ms-drbd0" to="ms-drbd0" to_role="stopped" from="fs0" score="-infinity"/>
<rsc_colocation id="fs0_not_on_slave_ms-drbd0" to="ms-drbd0" to_role="slave" from="fs0" score="-infinity"/>

Translation:

  • Never run fs0 on a node where ms-drbd0 is stopped.

  • Never run fs0 on a node where ms-drbd0 is running as slave.

  • Only start fs0 after promoting ms-drbd0 to master.

Since the total set of states for a master/slave resource is {stopped, slave, master}, this only allows fs0 to run on a node where ms-drbd0 is running as master.

Colocating a resource with a resource in slave state

The idiom to use if you have a resource which you want to run on the slave node is similar to the colocation with master case:

<rsc_colocation id="mount-drbd0_not_on_stopped_drbd0" to="drbd0-partition"
  to_role="stopped"
  from="mount-drbd0" score="-infinity"/>
<rsc_colocation id="mount-drbd0_not_on_master_drbd0" to="drbd0-partition"
  to_role="master"
  from="mount-drbd0" score="-infinity"/>

Translation:

  • Never run mount-drbd0 on a node where drbd0-partition is stopped.

  • Never run mount-drbd0 on a node where drbd0-partition is running as master.

Since the total set of states for a master/slave resource is {stopped, slave, master}, this only allows mount-drbd0 to run on a node where drbd0-partition is running as slave.

Question:
I don't know how to write the corresponding ordering rule for this case - to ensure that it only runs after being started OR after being demoted from master. Is it necessary? If it isn't necessary, then probably the ordering rule for the master case isn't needed either.

Controlling which node becomes master

It is sometimes desirable to preferentially constrain the master instance of a master/slave resource to run on a particular node or set of nodes in the cluster.

The CIB snippet below adds a preference of 100 points for the node ace to be master of the resource ms-drbd1.

<rsc_location id="loc:ms-drbd1_likes_ace" rsc="ms-drbd1">
  <rule id="rule:ms-drbd1_likes_ace" role="master" score="100">
    <expression id="expr:ms-drbd1_likes_ace" attribute="#uname" operation="eq" value="ace"/>
  </rule>
</rsc_location>
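
The same idiom can be extended to a set of nodes. The sketch below assumes a second, hypothetical node named deuce and uses boolean_op="or" so that either node earns the 100-point preference. All ids in it are illustrative only.

```xml
<rsc_location id="loc:ms-drbd1_likes_ace_or_deuce" rsc="ms-drbd1">
  <!-- role="master" limits the preference to the master instance -->
  <rule id="rule:ms-drbd1_likes_ace_or_deuce" role="master" score="100"
        boolean_op="or">
    <expression id="expr:ms-drbd1_likes_ace_or_deuce:ace"
      attribute="#uname" operation="eq" value="ace"/>
    <!-- "deuce" is a hypothetical node name -->
    <expression id="expr:ms-drbd1_likes_ace_or_deuce:deuce"
      attribute="#uname" operation="eq" value="deuce"/>
  </rule>
</rsc_location>
```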

Controlling which nodes remain slaves

It is sometimes desirable to preferentially constrain the slave instance of a master/slave resource to run on a particular node or set of nodes in the cluster.

However, since every instance passes through the slave state before being promoted, what we have to do is discourage it from becoming master.

The CIB snippet below will prefer to promote any node but fred to be master of the resource drbd1 by 100 points. This is effectively the same as saying "we want fred to avoid becoming master if possible".

<rsc_location id="loc:drbd1_likes_fred" rsc="drbd1">
  <rule id="rule:drbd1_likes_fred" role="master" score="100">
    <expression id="expr:drbd1_likes_fred" attribute="#uname" operation="ne" value="fred"/>
  </rule>
</rsc_location>

Pingd-related idioms

In these idioms, some rules are given scores through the score attribute and others through the score_attribute attribute. It is an error to have both a score and a score_attribute set on a rule, because they conflict.

Run a resource on the node with the best external connectivity

It is often desirable to use the value of the attribute that pingd sets directly as the score for a particular rule.

If you set the pingd scaling factor to 100, then having access to one ping node is worth 100, two ping nodes are worth 200, and so on.

This way, if all else is equal, the node with the highest ping connectivity will be selected. If two or more eligible nodes have the same score, then they will be given equal weight according to the rule below.

<rsc_location id="my_resource:connected" rsc="my_resource">
  <rule id="my_resource:connected:rule" score_attribute="pingd" >
    <expression id="my_resource:connected:expr:defined"
      attribute="pingd" operation="defined"/>
  </rule>
</rsc_location>

Of course, if you have configured the pingd daemon to set some attribute name besides its default (pingd), then you need to change both the score_attribute and the attribute tested by the expression above from pingd to whatever attribute you have configured the pingd daemon to use.
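
As an illustration of that substitution, here is the same rule rewritten for a pingd daemon assumed to set an attribute called my_ping_attr. That name is hypothetical; replace it with the name your pingd daemon is actually configured to use.

```xml
<rsc_location id="my_resource:connected" rsc="my_resource">
  <!-- "my_ping_attr" is a hypothetical attribute name -->
  <rule id="my_resource:connected:rule" score_attribute="my_ping_attr">
    <!-- The expression must test the same attribute that supplies the score -->
    <expression id="my_resource:connected:expr:defined"
      attribute="my_ping_attr" operation="defined"/>
  </rule>
</rsc_location>
```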

Make a resource stop when external connectivity is lost

It is sometimes desirable to shut a particular service down if ping connectivity is lost. This rule will prohibit the service from running anywhere that there is no ping connectivity to the outside world, and all nodes with some connectivity are treated as the same, regardless of how many ping nodes are accessible.

<rsc_location id="my_resource:connected" rsc="my_resource">
  <rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
    <expression id="my_resource:connected:expr:undefined"
      attribute="pingd" operation="not_defined"/>
    <expression id="my_resource:connected:expr:zero"
      attribute="pingd" operation="lte" value="0"/>
  </rule>
</rsc_location>

Of course, if you have configured the pingd daemon to set some attribute name besides its default (pingd), then you need to change the name of the attribute above from pingd to whatever name you have configured the pingd daemon to use.

Attention: Note that this will stop the resource everywhere if the pinged node(s) themselves go down or heartbeat loses connectivity to them (because of firewalls, et cetera). Consider using the idiom "Run a resource on the node with the best external connectivity" instead, which expresses a positive preference for the node with the best connectivity.

Time-based idioms

There are a number of things one might want to do differently depending on what time or day it is.

This section contains some examples of how to do these things in the CIB.

Restricting failback to weekends

Many clusters find it desirable to have a policy which allows failback only on weekends, and prohibits it during the week.

Below is an XML snippet for the crm_config section which does that.

Note that the CRM follows the convention of making Monday the first day of the week, similarly to the date %u string, and differently from the %w string. Cron allows Sunday to be either first (0) or last (7). For the CIB, Sunday is the last day of the week (7) only.

<cluster_property_set id="weekend_override" score="100">
  <rule id="my_ip:failover" boolean_op="and">
    <date_expression id="my_ip:days" operation="date_spec">
      <date_spec id="my_ip:days" weekdays="6-7"/>
    </date_expression>
  </rule>
  <attributes>
    <nvpair id="weekend-sticky"
      name="default_resource_stickiness"
      value="0"/>
  </attributes>
</cluster_property_set>
<cluster_property_set id="default_cluster_properties" score="10">
  <attributes>
    <nvpair id="default-sticky"
      name="default_resource_stickiness"
      value="INFINITY"/>
  </attributes>
</cluster_property_set>

Translating this into English: The first cluster property set is given a score of 100, and has a date restriction on it which is only true on weekends (days 6-7).

In other words, when the date expression is true, the attributes in this property set apply with a weight of 100.

The second cluster_property_set has a weight (score) of 10 and no rule, so it always applies. However, since its weight is only 10 and the conditional set above it has a weight of 100, whenever both sets apply and they conflict, the values provided by the weekend_override cluster_property_set win out.

The net result of this is that for days 6 and 7, default_resource_stickiness has the value 0, but on other days, it has the value INFINITY.

Running an expensive monitor only at night

Sometimes a resource has a monitor operation which is too expensive to run during normal production. It would then be desirable to run the cheap monitor operation frequently during normal hours, but run the expensive monitor operation only at night.

(This idiom has not been written yet.)

Reboot-related idioms

Causing a node to reboot when a resource stop fails

It is often desirable to have a failed stop operation on a particular resource trigger a reboot. To do that, add the XML below to the <operations/> section for the resource:

  <op id="dummy-resource-stop-id" name="stop" on_fail="fence"/>

If there are any timeouts or other settings that you want to specify for this stop operation, then those should be included in this op tag as well.
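
For instance, a stop operation that both carries a timeout and fences the node on failure might look like the sketch below. The timeout attribute is the same one used by the op elements in the STONITH examples on this page; the value of 60s is purely illustrative, not a recommendation.

```xml
  <!-- timeout="60s" is an illustrative value; choose one that suits the resource -->
  <op id="dummy-resource-stop-id" name="stop" timeout="60s" on_fail="fence"/>
```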

STONITH-related idioms

STONITH plugins can be classified into two types: those that can control more than one node (e.g. some power switches), and those that can control only one node (for example, iLO or DRAC).

In both cases, the usual way to configure STONITH in the CIB is to use "clone" resources. The reason is that heartbeat must be able to reach a STONITH Resource Agent for any node (i.e. a process that provides STONITH capabilities for that node), no matter what state the cluster is in. The simplest and most straightforward approach is to run one instance on each node, hence the use of clones. When a single clone STONITH RA is configured, Heartbeat runs the agent on each node.

For STONITH devices where each agent can only control a single node, one clone is needed per node and each clone runs on every node, so the total number of STONITH resources running will be equal to the square of the number of nodes. A three-node cluster, for example, runs 3 x 3 = 9 instances.

Configure an iLO STONITH device with the riloe plugin

Each instance of the external iLO STONITH module (riloe) can only power off a single node. There must therefore be a clone for each node with an iLO card.

In the example below, the iLO interface is on node01-rm.

<clone id="CL_stonithset_node01">
   <instance_attributes id="CL_stonithset_node01">
    <attributes>
      <nvpair id="CL_stonithset_node01_clone_node_max" name="clone_node_max" value="1"/>
    </attributes>
  </instance_attributes>
  <primitive id="CL_stonith_node01" class="stonith" type="external/riloe-ng" provider="heartbeat">
    <operations>
      <op name="monitor" interval="30s" timeout="20s" id="CL_stonith_node01_monitor"/>
      <op name="start" timeout="60s" id="CL_stonith_node01_start"/>
    </operations>
    <instance_attributes id="CL_stonith_node01">
      <attributes>
        <nvpair id="CL_stonith_node01_hostlist" name="hostlist" value="node01"/>
        <nvpair id="CL_stonith_node01_RI_HOST" name="RI_HOST" value="node01"/>
        <nvpair id="CL_stonith_node01_RI_HOSTRI" name="RI_HOSTRI" value="node01-rm"/>
        <nvpair id="CL_stonith_node01_RI_LOGIN" name="RI_LOGIN" value="Heartbeat"/>
        <nvpair id="CL_stonith_node01_RI_PASSWORD" name="RI_PASSWORD" value="password"/>
      </attributes>
    </instance_attributes>
  </primitive>
</clone>

Configure an IBM RSA STONITH device with the ibmrsa plugin

Each instance of the external IBM RSA stonith resource agent (ibmrsa) can only power off a single node, and it is not clone aware. This means you have to provide one primitive resource for each node to be shot.

The following example is a snippet which shoots node01 if requested. Save this snippet to a file 'file.xml', change the values appropriately, and load it into a running HAv2 cluster via  cibadmin -M -x 'file.xml' 

The constraint guarantees that the stonith resource agent does not run on the node which it is meant to shoot. Suicide is not allowed at the moment (2008-01-27).

This external stonith agent requires the IBM tool 'MPCli.sh'.

<?xml version="1.0" ?>
<cib>
  <configuration>
    <resources>
      <primitive id="r_stonith-node01" class="stonith" type="external/ibmrsa" provider="heartbeat" resource_stickiness="INFINITY">
        <operations>
          <op name="monitor" interval="60" timeout="300" prereq="nothing" id="r_stonith-node01-mon"/>
          <op name="start" timeout="180" id="r_stonith-node01-start"/>
          <op name="stop" timeout="180" id="r_stonith-node01-stop"/>
        </operations>
        <instance_attributes id="r_stonith-node01">
          <attributes>
            <nvpair id="r_stonith-node01-hostname" name="hostname" value="node01"/>
            <nvpair id="r_stonith-node01-ipaddr" name="ipaddr" value="192.168.0.1"/>
            <nvpair id="r_stonith-node01-userid" name="userid" value="userid"/>
            <nvpair id="r_stonith-node01-passwd" name="passwd" value="password"/>
            <nvpair id="r_stonith-node01-type" name="type" value="ibm"/>
          </attributes>
        </instance_attributes>
      </primitive>
    </resources>
    <constraints>
      <rsc_location id="r_stonith-node01_hates_node01" rsc="r_stonith-node01">
        <rule id="r_stonith-node01_hates_node01_rule" score="-INFINITY">
          <expression attribute="#uname" id="r_stonith-node01_hates_node01_expr" operation="eq" value="node01"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
</cib>

(contributed by Andreas Mock)

Configure an IBM RSA STONITH device with the ibmrsa-telnet plugin

Each instance of the external IBM RSA stonith resource agent (ibmrsa-telnet) can only power off a single node, and it is not clone aware. This means you have to provide one primitive resource for each node to be shot.

This external stonith plugin connects via telnet to the RSA board to reboot the node. It does not require the IBM tool 'MPCli.sh', which consumes a relatively large amount of resources.

The following example is a snippet which shoots node01 if requested. Save this snippet to a file 'file.xml', change the values appropriately, and load it into a running HAv2 cluster via  cibadmin -M -x 'file.xml' 

The constraint guarantees that the stonith resource agent does not run on the node which it is meant to shoot. Suicide is not allowed at the moment (2008-01-27).

<?xml version="1.0" ?>
<cib>
  <configuration>
    <resources>
      <primitive id="r_stonith-node01" class="stonith" type="external/ibmrsa-telnet" provider="heartbeat" resource_stickiness="INFINITY">
        <operations>
          <op name="monitor" interval="60" timeout="300" prereq="nothing" id="r_stonith-node01-mon"/>
          <op name="start" timeout="180" id="r_stonith-node01-start"/>
          <op name="stop" timeout="180" id="r_stonith-node01-stop"/>
        </operations>
        <instance_attributes id="r_stonith-node01">
          <attributes>
            <nvpair id="r_stonith-node01-nodename" name="nodename" value="node01"/>
            <nvpair id="r_stonith-node01-ipaddr" name="ipaddr" value="192.168.0.1"/>
            <nvpair id="r_stonith-node01-userid" name="userid" value="userid"/>
            <nvpair id="r_stonith-node01-passwd" name="passwd" value="password"/>
          </attributes>
        </instance_attributes>
      </primitive>
    </resources>
    <constraints>
      <rsc_location id="r_stonith-node01_hates_node01" rsc="r_stonith-node01">
        <rule id="r_stonith-node01_hates_node01_rule" score="-INFINITY">
          <expression attribute="#uname" id="r_stonith-node01_hates_node01_expr" operation="eq" value="node01"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
</cib>

(contributed by Andreas Mock)