
You have basically two options to integrate DRBD with Heartbeat v2 in CRM mode.
Use the "legacy" Heartbeat v1 style drbddisk resource agent to move the Primary role. In this case, you have to let /etc/init.d/drbd load and configure DRBD.
Use the "new" Heartbeat v2 style DRBD OCF resource agent. In this case, you must not let init load and configure DRBD, because this resource agent wants to do that itself.
Of course, you are free to invent more options yourself.
Note: as of 2008-02-15, the DRBD developers recommend using the v1 drbddisk RA, although some users have reported that the v2 DRBD RA works (decide on your own!).
Heartbeat version 2's Cluster Resource Manager supports multi-state resources natively. These are resources that can be in one of three states instead of the usual two: stopped and started are complemented by promoted (the slave state is equivalent to started). See http://wiki.linux-ha.org/v2/Concepts/MultiState[1] for more details.
The first resource agent to make full use of this functionality is the DRBD one. DRBD's primary and secondary concepts map directly to this model; in fact, they were used to design it.
While configuring DRBD this way - as opposed to just letting drbddisk move the primary state, as in v1-legacy style configurations - is slightly more complex, it does provide advantages:
Secondary can be relocated and moved as well in response to failures (see below: Floating peers).
DRBD must not be started by init. Prevent your init system from starting it (chkconfig drbd off or insserv -r drbd, whichever your distribution provides). The DRBD RA takes care of loading the DRBD module and all other start-up requirements.
A recent version of Pacemaker (the new name of the CRM) must be used: version 0.6.3 or later. Do not use older versions; they will not work reliably.
The XML snippets below will need to be somewhat customized to fit your setup and be loaded into the CIB using cibadmin. You should be familiar with how to use this tool.
This cannot be correctly configured using the GUI. You will need to use the commandline tools.
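As a rough sketch of that workflow, assume each XML snippet has been saved to its own file (the file names below are hypothetical). The helper only prints the cibadmin call that would create the objects in the given CIB section; nothing is changed until you run the printed command on a cluster node.

```shell
# Sketch: print the cibadmin call that creates the objects from an XML
# file in one section of the CIB. File names are placeholders.
# $1: CIB section (resources or constraints), $2: file holding the XML
cib_create_cmd() {
    echo "cibadmin -C -o $1 -x $2"
}

cib_create_cmd resources ms-drbd0.xml
cib_create_cmd constraints drbd0-constraints.xml
```

To apply a snippet for real, pipe the output to sh (for example cib_create_cmd resources ms-drbd0.xml | sh). Afterwards, cibadmin -Q -o resources prints the section back so you can check what was stored.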
The most common way to configure DRBD is to replicate a volume between two fixed nodes, using IP addresses statically assigned on each.
Your /etc/drbd.conf will look similar to this; of course, you must adjust it to fit your environment. This example configures /dev/drbd0 to be replicated between xen-1 and xen-2; the instance is called drbd0:
# 'drbd0' is the identifier of this DRBD instance. You will need it to configure the resource
# in the CIB correctly. This name is arbitrary, but I chose to name it after the device node.
resource drbd0 {
protocol C;
incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
startup {
degr-wfc-timeout 120; # 2 minutes.
}
disk {
on-io-error pass_on;
}
net {
# TODO: Should these timeouts be relative to some heartbeat settings?
# timeout 60; # 6 seconds (unit = 0.1 seconds)
# connect-int 10; # 10 seconds (unit = 1 second)
# ping-int 10; # 10 seconds (unit = 1 second)
on-disconnect reconnect;
}
syncer {
rate 100M;
group 1;
al-extents 257;
}
on xen-1 {
device /dev/drbd0;
disk /dev/hdd1;
address 192.168.200.1:7788;
meta-disk internal;
}
on xen-2 {
device /dev/drbd0;
disk /dev/hdc1;
address 192.168.200.2:7788;
meta-disk internal;
}
}
From now on, we will assume that you have set up DRBD and that it is working (test it with the DRBD init script outside Heartbeat's control). If not, debug this first. The DRBD user's guide is a helpful document for setting DRBD up; read http://www.drbd.org/users-guide/index.html[2]
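One way to run that test is to start DRBD with the init script on both nodes and then look at the connection state in /proc/drbd. The sketch below pulls the cs: field out for one device. The sample line is illustrative only, and the exact set of fields differs between DRBD versions.

```shell
# Sketch: extract the connection state (cs:) of one DRBD device from
# /proc/drbd-style input on stdin. $1 is the device minor number.
drbd_cstate() {
    sed -n "s/^ *$1: cs:\([A-Za-z]*\).*/\1/p"
}

# Illustrative sample line; real output carries more fields.
sample=' 0: cs:Connected st:Primary/Secondary ld:Consistent'
echo "$sample" | drbd_cstate 0
```

On a node, drbd_cstate 0 < /proc/drbd should report Connected once both sides see each other; a state such as WFConnection means the peer has not been reached yet.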
As explained, the resource is configured differently than a drbddisk resource before. It makes use of some more advanced CIB features (explained below), and goes into the resources section:
<master_slave id="ms-drbd0">
<meta_attributes id="ma-ms-drbd0">
<attributes>
<nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
<nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
<nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
<nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
<nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
<nvpair id="ma-ms-drbd0-6" name="globally_unique" value="false"/>
<nvpair id="ma-ms-drbd0-7" name="target_role" value="stopped"/>
</attributes>
</meta_attributes>
<primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd">
<instance_attributes id="ia-drbd0">
<attributes>
<nvpair id="ia-drbd0-1" name="drbd_resource" value="drbd0"/>
</attributes>
</instance_attributes>
<operations>
<op id="op-drbd0-1" name="monitor" interval="59s" timeout="10s" role="Master"/>
<op id="op-drbd0-2" name="monitor" interval="60s" timeout="10s" role="Slave"/>
</operations>
</primitive>
</master_slave>
The primitive DRBD resource, similar to what you would have used to configure drbddisk, is now embedded in a complex object master_slave. This specifies the abilities and limitations of DRBD - with DRBD v7, there can be only two instances, one per node, and only one master ever. The notify attribute specifies that DRBD needs to be told about what happens to its peer; globally_unique set to false lets Heartbeat know that the instances cannot be told apart on a single node.
Note that I'm creating the resource in stopped state first, so that I can finish configuring its constraints and dependencies before activating it. (To do that later, you can use crm_resource -r ms-drbd0 -v '#default' --meta -p target_role, for example.)
If you have a two-node cluster, you can skip this step, because the resource can obviously only run on those two nodes. If your cluster has more nodes and drbd0 should run on only two of them, you will have to tell the cluster about this constraint:
<rsc_location id="drbd0-placement-1" rsc="ms-drbd0">
<rule id="drbd0-rule-1" score="-INFINITY">
<expression id="exp-01" value="xen-1" attribute="#uname" operation="ne"/>
<expression id="exp-02" value="xen-2" attribute="#uname" operation="ne"/>
</rule>
</rsc_location>
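This constraint tells the Policy Engine that drbd0 cannot run anywhere except on xen-1 or xen-2: the rule matches every node whose name is neither xen-1 nor xen-2 and gives it a score of -INFINITY. Since nothing forbids xen-1 and xen-2, the PE remains free to run the resource on those two.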
Note: This assumes a symmetric cluster. If not, you will have to invert the rules.
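For an asymmetric cluster, where a resource may run nowhere unless explicitly allowed, the inverted rule might look like what the sketch below prints. Treat it as an illustration under assumptions rather than a tested configuration: the ids and the score of 100 are made up, and boolean_op="or" is used so that a match on either node name is enough.

```shell
# Sketch: print a location constraint that allows ms-drbd0 on two named
# nodes, for a cluster where nothing may run anywhere by default.
# The ids and the score value are hypothetical.
# $1, $2: names of the two nodes that may run DRBD
allow_on_nodes_xml() {
    cat <<EOF
<rsc_location id="drbd0-placement-allow" rsc="ms-drbd0">
  <rule id="drbd0-allow-rule" score="100" boolean_op="or">
    <expression id="exp-allow-1" attribute="#uname" operation="eq" value="$1"/>
    <expression id="exp-allow-2" attribute="#uname" operation="eq" value="$2"/>
  </rule>
</rsc_location>
EOF
}

allow_on_nodes_xml xen-1 xen-2
```

The output can be saved to a file and loaded into the constraints section with cibadmin like the other snippets.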
If you want to prefer a node to run the master role (i.e. DRBD primary), you can express that like this:
<rsc_location id="drbd0-master-1" rsc="ms-drbd0">
<rule id="drbd0-master-on-xen-1" role="master" score="100">
<expression id="exp-1" attribute="#uname" operation="eq" value="xen-1"/>
</rule>
</rsc_location>
With this, you can now activate the DRBD resource. It should be started and promoted on one of the two nodes - or, if you specified such a constraint, on the node you wanted the master role on.
DRBD is rarely useful by itself; you will want to run a service on top of it. Or, very likely, you want to mount the filesystem on the master side.
Let us assume that you've created an ext3 filesystem on top of drbd0, which you now want managed by Heartbeat as well. The filesystem resource object is straightforward and, if you have any experience with configuring Heartbeat v2, will look rather familiar:
<primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
<meta_attributes id="ma-fs0">
<attributes>
<nvpair name="target_role" id="ma-fs0-1" value="stopped"/>
</attributes>
</meta_attributes>
<instance_attributes id="ia-fs0">
<attributes>
<nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
<nvpair id="ia-fs0-2" name="directory" value="/mnt/share1"/>
<nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
</attributes>
</instance_attributes>
</primitive>
Make sure that the various settings match your setup. Again, this object has been created as stopped first.
Now the interesting bits. Obviously, this should only be mounted on the same node where drbd0 is in primary state, and only after drbd0 has been promoted, which is easily expressed in two constraints:
<rsc_order id="drbd0_before_fs0" from="fs0" action="start" to="ms-drbd0" to_action="promote"/>
<rsc_colocation id="fs0_on_drbd0" to="ms-drbd0" to_role="master" from="fs0" score="infinity"/>
Et voila! You now can activate the filesystem resource and it'll be mounted at the proper time in the proper place.
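Activation uses the same crm_resource call mentioned earlier for ms-drbd0. A small sketch that prints the commands for both resources rather than running them (the helper name is made up for this example):

```shell
# Sketch: print the command that clears the "stopped" target_role of a
# resource, leaving the cluster free to start it.
# $1: resource id
activate_cmd() {
    printf "crm_resource -r %s -v '#default' --meta -p target_role\n" "$1"
}

# DRBD first, then the filesystem on top of it.
activate_cmd ms-drbd0
activate_cmd fs0
```

The single quotes around #default matter when the command is run in a shell; without them the shell would treat the rest of the line as a comment.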
Just as this was done with a filesystem resource, this can be done with a group. In a lot of cases, you will not just want a filesystem, but also an IP-address and some sort of daemon to run on top of the DRBD master. Put those resources in a group, use the constraints above and replace "fs0" with the name of your group. The following example includes an apache webserver.
<resources>
<master_slave id="ms-drbd0">
<meta_attributes id="ma-ms-drbd0">
<attributes>
<nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
<nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
<nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
<nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
<nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
<nvpair id="ma-ms-drbd0-6" name="globally_unique" value="false"/>
<nvpair id="ma-ms-drbd0-7" name="target_role" value="stopped"/>
</attributes>
</meta_attributes>
<primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd">
<instance_attributes id="ia-drbd0">
<attributes>
<nvpair id="ia-drbd0-1" name="drbd_resource" value="drbd0"/>
</attributes>
</instance_attributes>
<operations>
<op id="op-drbd0-1" name="monitor" interval="59s" timeout="10s" role="Master"/>
<op id="op-drbd0-2" name="monitor" interval="60s" timeout="10s" role="Slave"/>
</operations>
</primitive>
</master_slave>
<group id="apache_group">
<meta_attributes id="ma-apache">
<attributes>
<nvpair id="ma-apache-1" name="target_role" value="started"/>
</attributes>
</meta_attributes>
<primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
<instance_attributes id="ia-fs0">
<attributes>
<nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
<nvpair id="ia-fs0-2" name="directory" value="/usr/local/apache/htdocs"/>
<nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" provider="heartbeat" type="apache" id="webserver">
<instance_attributes id="ia-webserver">
<attributes>
<nvpair id="ia-webserver-1" name="configfile" value="/usr/local/apache/conf/httpd.conf"/>
<nvpair id="ia-webserver-2" name="httpd" value="/usr/local/apache/bin/httpd"/>
<nvpair id="ia-webserver-3" name="port" value="80"/>
</attributes>
</instance_attributes>
<operations>
<op id="op-webserver-1" name="monitor" interval="30s" timeout="30s"/>
</operations>
</primitive>
<primitive id="virtual-ip" class="ocf" type="IPaddr2" provider="heartbeat">
<instance_attributes id="ia-virtual-ip">
<attributes>
<nvpair id="ia-virtual-ip-1" name="ip" value="10.0.0.1"/>
<nvpair id="ia-virtual-ip-2" name="broadcast" value="10.0.0.255"/>
<nvpair id="ia-virtual-ip-3" name="nic" value="eth0"/>
<nvpair id="ia-virtual-ip-4" name="cidr_netmask" value="24"/>
</attributes>
</instance_attributes>
<operations>
<op id="op-virtual-ip-1" name="monitor" interval="21s" timeout="5s"/>
</operations>
</primitive>
</group>
</resources>
<constraints>
<rsc_order id="drbd0_before_apache_group" from="apache_group" action="start" to="ms-drbd0" to_action="promote"/>
<rsc_colocation id="apache_group_on_drbd0" to="ms-drbd0" to_role="master" from="apache_group" score="infinity"/>
<rsc_location id="drbd0_master_on_xen-1" rsc="ms-drbd0">
<rule id="drbd0_master_on_xen-1_rule1" role="master" score="100">
<expression id="drbd0_master_on_xen-1_expression1" attribute="#uname" operation="eq" value="xen-1"/>
</rule>
</rsc_location>
</constraints>
This will load the drbd module on both nodes and promote the instance on xen-1. After successful promotion, it will first mount /dev/drbd0 to /usr/local/apache/htdocs, then start the apache webserver and in the end put the virtual IP-address 10.0.0.1/24 on eth0.
If you want to move the DRBD master role to the other node, you should not attempt to move just the master role. On top of DRBD, you will probably have a Filesystem resource or a resource group with your application, filesystem, IP address or whatever (remember, DRBD isn't usually useful by itself). You can move the master role by moving the Filesystem resource or group that is colocated with the DRBD master (and properly ordered). This can be done with crm_resource[3]. Given the group example from above, you would run crm_resource -M -r apache_group -H <hostname>. This will stop all resources in the group, demote the current master, promote DRBD on the other server and, after successful promotion, start the group there (mounting the filesystem and so on).
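The sketch below prints the move command from the paragraph above together with its counterpart. It assumes, as crm_resource normally behaves, that -M works by adding a location constraint that stays in the CIB until it is removed again with -U; the function names and the target node xen-2 are only examples.

```shell
# Sketch: print the commands to move a group (and with it the DRBD
# master role) to another node, and to drop the move constraint later.
# $1: group or resource id, $2: target host name
move_group_cmd() {
    echo "crm_resource -M -r $1 -H $2"
}

# Removes the location constraint that -M left behind, so the cluster
# may place the group freely again.
unmove_group_cmd() {
    echo "crm_resource -U -r $1"
}

move_group_cmd apache_group xen-2
unmove_group_cmd apache_group
```

Run the second command only once the group is running where you want it; until then the constraint is what keeps it on the target node.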
With Heartbeat managing the slave as well, there's a further possibility: the two sides do not have to be statically assigned to a node! How is that useful, you ask?
The use case for which this was developed is where DRBD is used to replicate between two sets of nodes (possibly two geographically distributed sites). While each set does have shared storage locally, this is not available across the sets. So, DRBD can move freely from one node to the other within one set.
(While it is theoretically possible to use this to build a hot-spare slave without shared storage, that has a serious deficiency: it would always require a full resync to the new slave. This might be worth it in some scenarios though.)
This setup is somewhat more involved. Heartbeat must also manage the IP addresses which DRBD uses to replicate across. Heartbeat must know to not activate both DRBD instances within one set. DRBD must be told that it is not tied to specific nodes.
The following sections will walk you through the required settings.
To keep this explanation sane, I will assume that anything ending in _0 goes with site A, and anything ending in _1 belongs to site B.
/etc/drbd.conf is the easiest change in this setup. Instead of referring to the nodes by their assigned hostnames, change the references to node_0 and node_1. (The DRBD RA will tell DRBD which one to use.)
Configure node_0 to use the path by which the shared storage can be accessed on site A, and configure it to use the proper IP address too. Same for node_1's section.
It'll look something like this:
resource drbd0 {
protocol C;
incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
startup {
degr-wfc-timeout 120; # 2 minutes.
}
disk {
on-io-error pass_on;
}
net {
# TODO: Should these timeouts be relative to some heartbeat settings?
# timeout 60; # 6 seconds (unit = 0.1 seconds)
# connect-int 10; # 10 seconds (unit = 1 second)
# ping-int 10; # 10 seconds (unit = 1 second)
on-disconnect reconnect;
}
syncer {
rate 100M;
group 1;
al-extents 257;
}
on node_0 {
device /dev/drbd0;
disk /dev/hdd1;
address 192.168.200.100:7788;
meta-disk internal;
}
on node_1 {
device /dev/drbd0;
disk /dev/hdc1;
address 192.168.200.200:7788;
meta-disk internal;
}
}
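A quick way to confirm that no real hostnames are left in the file is to list the names used in its "on" sections. The sketch below does that for any drbd.conf-style file; the cut-down sample it writes to a temporary file is only there to make the example self-contained.

```shell
# Sketch: list the host names used in the "on <name> {" sections of a
# drbd.conf-style file given as $1. Options such as on-io-error are
# not matched because "on" must be followed by whitespace.
drbd_conf_hosts() {
    sed -n 's/^[[:space:]]*on[[:space:]][[:space:]]*\([^[:space:]{]*\).*/\1/p' "$1"
}

# Cut-down sample input, not a complete configuration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
resource drbd0 {
  disk { on-io-error pass_on; }
  on node_0 { device /dev/drbd0; }
  on node_1 { device /dev/drbd0; }
}
EOF
drbd_conf_hosts "$conf"
rm -f "$conf"
```

Against the real file, drbd_conf_hosts /etc/drbd.conf should print only node_0 and node_1 for a floating-peers resource.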
It is useful to tag the nodes in the CIB according to which site they belong to. We'll use a node attribute named site for this:
# for N in 1 2 3 4 ; do crm_attribute -t nodes -U xen-$N -n site -v a ; done
# for N in 5 6 7 8 9 ; do crm_attribute -t nodes -U xen-$N -n site -v b ; done
You can also set the node attribute via XML of course.
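The same loops can be wrapped in a small helper, together with a query that reads the attribute back. This is only a sketch: the function names are made up, and it prints the crm_attribute calls instead of running them.

```shell
# Sketch: print the crm_attribute calls that tag a list of nodes with a
# site value. $1: site value, remaining arguments: node names.
tag_site_cmds() {
    site=$1
    shift
    for node in "$@"; do
        echo "crm_attribute -t nodes -U $node -n site -v $site"
    done
}

# Print the call that reads the site attribute of one node back.
query_site_cmd() {
    echo "crm_attribute -t nodes -U $1 -n site -G"
}

tag_site_cmds a xen-1 xen-2 xen-3 xen-4
query_site_cmd xen-1
```

Checking a node from each site with the query before adding the location rules helps catch a mistyped node name early.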
Another easy step is to create the appropriate IP addresses:
<primitive class="ocf" provider="heartbeat" type="IPaddr" id="ip0">
<instance_attributes id="ia-ip0">
<attributes>
<nvpair id="ia-ip0-1" name="ip" value="192.168.200.100"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" provider="heartbeat" type="IPaddr" id="ip1">
<instance_attributes id="ia-ip1">
<attributes>
<nvpair id="ia-ip1-1" name="ip" value="192.168.200.200"/>
</attributes>
</instance_attributes>
</primitive>
The DRBD resource is very similar to the one we created above - can you spot the difference?
<master_slave id="ms-drbd0">
<meta_attributes id="ma-ms-drbd0">
<attributes>
<nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
<nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
<nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
<nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
<nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
<nvpair id="ma-ms-drbd0-6" name="globally_unique" value="false"/>
<nvpair id="ma-ms-drbd0-7" name="target_role" value="stopped"/>
</attributes>
</meta_attributes>
<primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd">
<instance_attributes id="ia-drbd0">
<attributes>
<nvpair id="ia-drbd0-1" name="drbd_resource" value="drbd0"/>
<nvpair id="ia-drbd0-2" name="clone_overrides_hostname" value="yes"/>
</attributes>
</instance_attributes>
</primitive>
</master_slave>
Indeed; it is the clone_overrides_hostname setting that tells the resource agent to handle this for us.
We will assume you have a filesystem configured on top of this as well, named fs0 just as above.
The constraints are more interesting though. The easy part is making sure that the IPs only get started on the proper site; we do this by telling Heartbeat that all other nodes are not able to run them (ip0 belongs to site a, ip1 to site b):
<rsc_location id="location-ip0" rsc="ip0">
<rule id="ip0-rule-1" score="-INFINITY">
<expression id="exp-ip0-1" value="a" attribute="site" operation="ne"/>
</rule>
</rsc_location>
<rsc_location id="location-ip1" rsc="ip1">
<rule id="ip1-rule-1" score="-INFINITY">
<expression id="exp-ip1-1" value="b" attribute="site" operation="ne"/>
</rule>
</rsc_location>
And now, we are building drbd0 on top of this; the IPs need to be active before DRBD can start, and properly colocated as well:
<rsc_order id="order_drbd0_ip0" to="ip0" from="ms-drbd0"/>
<rsc_order id="order_drbd0_ip1" to="ip1" from="ms-drbd0"/>
<rsc_colocation id="colo_drbd0_ip0" to="ip0" from="drbd0:0" score="infinity"/>
<rsc_colocation id="colo_drbd0_ip1" to="ip1" from="drbd0:1" score="infinity"/>
All set! Placing the filesystem on top of drbd0 is exactly as above and not repeated here.
So you want to have both sides of the mirror within one site, but obviously not on the same node? This is easily done as well. Instead of the location-ip rules above, simply add:
<rsc_colocation id="ip0_vs_ip1" to="ip0" from="ip1" score="-infinity"/>
This will ensure that the IPs never get assigned to the same node.
| [1] | http://wiki.linux-ha.org/v2/Concepts/MultiState |
| [2] | http://www.drbd.org/users-guide/index.html |
| [3] | http://www.linux-ha.org/v2/AdminTools/crm_resource |
| [4] | http://www.linux-ha.org/CategoryHowto |
This information provided courtesy of the Linux-HA project at http://linux-ha.org/