This web page is no longer maintained. Information presented here exists only to avoid breaking historical links.
The Project stays maintained, and lives on: see the Linux-HA Reference Documentation.
To get rid of this notice, you may want to browse the old wiki instead.

Last site update:
2017-12-11 17:14:32

Note: This is still a proposal and under reasonably heavy discussion. Please read the NodeFencing page in addition to this one; this page only explains the implementation of the STONITH Agents, not their integration into the CRM design.

StonithAgents are essentially a special ResourceAgentClass (a minor extension of the OCF one), but named differently to avoid confusion from the start. They are very similar to OpenClusterFramework resource agents, with the following differences:

  • They are not located under /usr/ocf, but under /usr/lib/heartbeat/stonith.d/.

  • There is one agent per STONITH method. Initially they will likely all call into the same STONITH backend (i.e., act as wrappers around the existing heartbeat STONITH plug-ins), complementing rather than replacing them. This also means that we have an additional plug-in structure, so that people can choose.

  • On start, they start a daemon process to connect to the STONITH device; this is the instantiation of what is called the STONITH Controller in the NodeFencing page. This daemon monitors and controls the STONITH Device, and, if possible/necessary, does what is needed to ensure that we are the only one accessing the device (so that no other task interferes).

  • Unsurprisingly, on stop, said daemon is stopped and the ownership of the STONITH Device is thus released.

  • monitor does just what it does for regular resources; it verifies whether the daemon is still running and performs a health check request by which the daemon reports whether it can still reach the STONITH Device.
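
The start, stop, and monitor behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ControllerDaemon` and `run_operation` are hypothetical names, the daemon is simulated by an in-memory object rather than a real process, and the numeric values are the usual OCF return codes.

```python
# Common OCF return codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


class ControllerDaemon:
    """Hypothetical stand-in for the daemon that owns the STONITH device."""

    def __init__(self, device_reachable=True):
        self.running = False
        self.device_reachable = device_reachable

    def health_check(self):
        # A real daemon would query the device; here it is just a flag.
        return self.device_reachable


def run_operation(daemon, operation):
    """Map an agent operation onto the daemon and return an OCF exit code."""
    if operation == "start":
        daemon.running = True   # daemon starts and takes ownership of the device
        return OCF_SUCCESS
    if operation == "stop":
        daemon.running = False  # daemon stops and releases the device
        return OCF_SUCCESS
    if operation == "monitor":
        if not daemon.running:
            return OCF_NOT_RUNNING
        # Running is not enough: the daemon must still reach the device.
        return OCF_SUCCESS if daemon.health_check() else OCF_ERR_GENERIC
    return OCF_ERR_UNIMPLEMENTED
```

In this sketch, a monitor call against a running daemon that has lost contact with the device returns a generic error rather than "not running", which keeps the two failure modes distinguishable.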

There are two additional commands which the StonithAgents must support:

  • The fence operation, which we supply with a comma-separated list of node names to fence (via an OCF_RESKEY_STONITH_NODES environment variable) and which reports back on stdout, for each node name, whether the STONITH operation was successful or not. A non-zero exit code on the fence operation shall be interpreted as a complete failure to reach the STONITH device, so the CRM can reallocate the STONITH controller resource to another node, if applicable.

  • The list-fence-targets operation, which (after the resource has been started) reports the list of nodes which the device controls to stdout (which is already relayed back to us via the LocalResourceManager).
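
A minimal sketch of these two operations might look like the following. The proposal does not fix the stdout format, so the `node ok` / `node failed` lines are an assumption, as are the function names and the `power_off` and `device_reachable` callables, which stand in for whatever the daemon actually does. The functions return the lines they would print so that the logic is easy to check.

```python
def fence(env, power_off, device_reachable):
    """Sketch of the fence operation.

    env              -- mapping that holds OCF_RESKEY_STONITH_NODES
    power_off        -- hypothetical callable(node) -> bool, True on success
    device_reachable -- hypothetical callable() -> bool
    Returns (exit_code, lines_for_stdout).
    """
    if not device_reachable():
        # Complete failure: the CRM may move the controller to another node.
        return 1, []
    raw = env.get("OCF_RESKEY_STONITH_NODES", "")
    nodes = [name.strip() for name in raw.split(",") if name.strip()]
    lines = []
    for node in nodes:
        status = "ok" if power_off(node) else "failed"
        lines.append(f"{node} {status}")  # assumed format, one line per node
    # Per-node failures are reported on stdout, not through the exit code.
    return 0, lines


def list_fence_targets(controlled_nodes):
    """Sketch of list-fence-targets: one controlled node name per line."""
    return 0, sorted(controlled_nodes)
```

A real agent would pass `os.environ` as `env` and write the returned lines to stdout before exiting with the returned code.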

SunJiangDong said: According to the new node fencing architecture, there are now no actual StonithAgent scripts, only a simulation of their function via the stonith RA plugin plus the stonith daemon. Moreover, the functions corresponding to the above two operations will be moved to, and implemented in, the node fencing daemon's API library; these two operations will not exist on the virtual StonithAgents. In other words, the virtual StonithAgents now act more like standard OCF RAs. Please refer to SmartFencingDaemonProposal. Any comments?

Extension under discussion:

  • The monitor operation shall report not only whether the STONITH device it controls is reachable, but also whether the list of nodes it can control has changed, so that we can reload that list dynamically. The daemon can easily track this. AlanRobertson really doesn't seem to like this, but LarsMarowskyBree still hopes to convince him ;-)
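
To make the proposed extension concrete, here is a rough sketch of how the daemon could track changes to its node list between monitor calls. How the change would actually be signalled to the CRM is not settled in this proposal; the `TargetTracker` class and the `(reachable, targets_changed)` return value are purely hypothetical.

```python
class TargetTracker:
    """Hypothetical helper: remembers the node list last reported to the CRM."""

    def __init__(self, initial_targets):
        self._reported = frozenset(initial_targets)

    def monitor(self, device_reachable, current_targets):
        """Return (reachable, targets_changed) for one monitor call."""
        current = frozenset(current_targets)
        # Only trust the node list while the device can be reached.
        changed = device_reachable and current != self._reported
        if changed:
            self._reported = current  # signal each change only once
        return device_reachable, changed
```

Comparing sets rather than lists means a mere reordering of node names does not trigger a reload.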

See also

NodeFencing, LocalResourceManager/FencingOperations