Node-granularity fencing in the new ClusterResourceManager framework

This is the summary of the discussion on linux-ha-dev.

The goal is to make fencing behave as much like a regular resource as possible. This goal has been reached, except for one special action which STONITH resources need to support; see below.

Who initiates STONITH requests

STONITH requests are always initiated from the DesignatedCoordinator.

How are the STONITH controllers configured in the ClusterInformationBase

The STONITH controllers are configured in the resources section of the CIB as resources of the class stonith. All normal constraints for resource placement et cetera apply.

For sanity, a stonith-class resource may not require node fencing itself.
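
For illustration only, a minimal sketch of such a stonith-class resource entry and the sanity rule above, written in Python; the field names and agent type are hypothetical stand-ins, not the actual CIB syntax.

    # Illustrative only: a stonith-class resource from the resources section of
    # the CIB, modelled as a plain dictionary. Names are hypothetical stand-ins.
    stonith_controller = {
        "id": "fencing-apc-1",
        "class": "stonith",          # marks this as a STONITH controller
        "type": "apc",               # the StonithAgent plugin driving the device
        "parameters": {"ipaddr": "10.0.0.5", "login": "apc"},
    }

    def validate(resource):
        # Sanity rule from above: a stonith-class resource may not itself
        # require node fencing.
        if resource["class"] == "stonith" and resource.get("node_fencing") == "yes":
            raise ValueError("stonith-class resources may not require node fencing")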

Who owns the STONITH device

The STONITH device is controlled via a StonithAgent, which is a special resource agent running under the control of the LocalResourceManager; see LocalResourceManager/FencingOperations.

As the STONITH controller is internally a regular resource, just of a special class, the regular node placement rules apply. This limits access to the STONITH device to the nodes which can actually reach it - likely a single node for a serial STONITH device, or a wildcard (any node) for most network power switches.

As all requests are made through this single node, we also avoid the limitation that some network power switches only allow a single session to connect to them.

As explained on the StonithAgents page, we learn which nodes a given STONITH device can control at start time.
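
A minimal sketch of that start-time discovery, assuming a hypothetical agent interface with a gethosts-style query; the class and method names here are illustrative, not the actual StonithAgent API.

    # Illustrative sketch: when the STONITH controller resource is started, the
    # cluster asks the agent which nodes its device can fence and records the
    # answer. gethosts() is a hypothetical stand-in for the agent's real query.
    class StonithController:
        def __init__(self, agent):
            self.agent = agent
            self.controllable_nodes = set()

        def start(self):
            self.agent.start()
            # Learned at start time, as described above.
            self.controllable_nodes = set(self.agent.gethosts())

        def can_fence(self, node):
            return node in self.controllable_nodes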

Monitoring the STONITH device

As the STONITH controller, through which all further requests to a given STONITH device are gated, is a regular resource, it is also subject to monitoring. We can therefore find out immediately (and not only at the time when we want to use it) that a STONITH device has become unusable, inform the administrator, and re-allocate the STONITH controller somewhere else.
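
As a sketch of what that buys us (function names are made up for illustration): a failed recurring monitor marks the device unusable right away, rather than at the moment a fencing request arrives.

    # Illustrative sketch of the monitoring benefit described above. A recurring
    # monitor operation on the STONITH controller detects a dead device early;
    # the cluster can then alert the admin and move the controller elsewhere.
    def monitor_stonith_controller(controller, notify_admin, relocate):
        if not controller.agent.status_ok():      # hypothetical health check
            notify_admin("STONITH device %s unusable" % controller.agent.name)
            relocate(controller)                  # let placement pick another node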

When to STONITH

Whether or not a STONITH dependency is needed in the TransitionGraph is of course decided by the PolicyEngine via the resource parameters.

How do we determine whether a given resource needs to wait/block on node fencing

For regular resources, whether or not they need node-granularity fencing is controlled via the mandatory node_fencing="(yes|no)" attribute in the CIB.

For OCF agents, the default for this attribute should be set by the GUI/administrator from the Resource Agent metadata (available in the CIB in the lrm_agent section); for heartbeat or lsb agents it should default to yes for safety.
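
A sketch of that default resolution, assuming hypothetical helper names for reading the agent metadata:

    # Illustrative: how the GUI/administrator tooling might pick the default for
    # the node_fencing attribute. Helper names are hypothetical.
    def default_node_fencing(agent_class, ocf_metadata=None):
        if agent_class == "ocf" and ocf_metadata is not None:
            # OCF agents declare their needs in their metadata (mirrored into
            # the lrm_agent section of the CIB).
            return ocf_metadata.get("node_fencing", "yes")
        # heartbeat and lsb agents carry no such metadata: default to yes for safety.
        return "yes"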

Which nodes need to be STONITHed

We need to compute the maximum set of eligible nodes for a given resource - assuming that all nodes were up right now and no other resources were running - and contrast this with the list of nodes which actually are up and healthy. Everything else needs killing.
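
In set terms, a minimal sketch (function and parameter names are illustrative):

    # Illustrative: the set computation described above. 'eligible' is the maximum
    # set of nodes the resource could run on if every node were up and no other
    # resources were placed; 'healthy' is the set of nodes currently up and healthy.
    def nodes_to_fence(eligible, healthy):
        return set(eligible) - set(healthy)

    # Example: nodes_to_fence({"node1", "node2", "node3"}, {"node1", "node3"})
    # yields {"node2"}: node2 must be fenced before the resource may be recovered.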

STONITH in response to stop failures

Another scenario where a node may be STONITHed is a failed stop operation. Before we can recover the resource on another node, we must clean up by force.

Whether a failed stop operation causes the node the resource is running on to be STONITHed shall be controlled by a failstop_type=(ignore|block|stonith) attribute of either the resource or a resource depending on it.

ignore should only be used for self-fencing resources; the default must be either stonith or block for all others. As with the node_fencing attribute, the default should be retrieved from the resource agent metadata.
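
A sketch of how the three values might be acted upon when a stop operation fails (names other than failstop_type are illustrative):

    # Illustrative: reaction to a failed stop, driven by the failstop_type
    # attribute of the resource or of a resource depending on it.
    def on_stop_failure(resource, node, fence, block):
        policy = resource.get("failstop_type", "block")   # default must not be 'ignore'
        if policy == "stonith":
            fence(node)          # shoot the node, then recover the resource elsewhere
        elif policy == "block":
            block(resource)      # leave the resource blocked; require admin action
        elif policy == "ignore":
            pass                 # only safe for self-fencing resources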

LarsMarowskyBree still wonders what happens if a lower-priority resource has stonith set and fails to stop, but a higher-priority resource (not depending on the first) is happily running along on that node; if we follow the wish of the lower-priority resource, we affect the service level of the higher-priority resource...

STONITHing failed nodes in general

Yet another scenario is that a STONITH-induced reboot of a failed node may cure an intermittent fault of the node, and thus reduce the MeanTimeToRepair and the time we spend in a partially degraded mode. Even if no resource actively requires the node to be shot, it may still be desirable because of this.

Whether or not a potentially failed node is shot because of this shall be controlled by a global always_stonith_failed_nodes flag; whether or not a given resource actually has to wait until this has succeeded is controlled via the other parameters discussed above.
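
Sketched as a simple predicate (the flag name is taken from the text above; everything else is illustrative):

    # Illustrative: whether a failed node gets shot at all is a global decision;
    # whether a given resource must wait for that to succeed is decided per
    # resource via node_fencing / failstop_type as discussed above.
    def should_fence_failed_node(cluster_config, node_required_by_resources):
        if node_required_by_resources:
            return True
        return cluster_config.get("always_stonith_failed_nodes", False)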

How to handle STONITH failures

If STONITH for a given node fails, we of course retry indefinitely, but in the meantime we block all resources which depend on it.

A manual override needs to be possible; the admin needs to be able to manually confirm that a given node (or set of nodes) is really down, so that the cluster can proceed.
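
A sketch of that retry-with-override behaviour, with hypothetical helper names:

    # Illustrative: retry fencing indefinitely, keep dependent resources blocked
    # in the meantime, and allow the administrator to confirm manually that the
    # node is down so the cluster can proceed.
    import time

    def fence_with_retry(node, try_fence, admin_confirmed_down, block_dependents,
                         unblock_dependents, retry_interval=30):
        block_dependents(node)
        while True:
            if try_fence(node) or admin_confirmed_down(node):
                unblock_dependents(node)
                return
            time.sleep(retry_interval)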

See also

ResourceFencing