This web page is no longer maintained. Information presented here exists only to avoid breaking historical links.
The Project stays maintained, and lives on: see the Linux-HA Reference Documentation.

Last site update:
2018-01-17 07:05:54

12) LarsMarowskyBree: The LocalResourceManager also needs to track resources which have failed on this node, and when they failed (i.e., timestamp / reboot counter).

  • The ClusterResourceManager does not track state information; the design assumes that all node state is tracked by the nodes themselves, and that the LRM knows this state.

    AlanRobertson does not fully understand this request. The use of the word "Tracking" tends to imply the desire is to do something with the information. Since there was no requirement specified to do anything with this data, and the LocalResourceManager is PolicyFree, it isn't obvious what the word tracking means in this context. It could mean something as simple as logging the information. If that's what's intended, I'm sure we can do that.

    LarsMarowskyBree: By tracking I just mean keeping the records around until the node reboots. For example, we require the LRM to keep a list of active ResourceInstances (obviously). This request extends that to failed/stopped resources: even if the resource instance has been stopped on that node, the LRM can still tell me that it is stopped, or that it has failed. The LRM should remember the last known state of a ResourceInstance on that node. That the LocalResourceManager should/could also easily keep a reboot counter of the node is just a nice touch.

    The rationale for this is that the LocalResourceManager is authoritative for the node status, so that the current relevant cluster status can always be easily accessed by combining the data from all LRMs, avoiding the need for truly distributed book-keeping.

    AlanRobertson does not have the foggiest idea why the LocalResourceManager should be in charge of the status of nodes. It is in charge of the status of resources, not nodes. It knows absolutely nothing about clusters or nodes at all. It knows about resources on the current machine, which from its point of view is all there is to know about. PS: heartbeat keeps a restart count already. We can add it to the API if you want.

    But with regard to "status"... The LRM can tell you the status of any resource, including ones which have never been started by it. That's inherent in the ResourceAgents. You just have to populate us with the configuration information for a resource, and we're off and running... In fact, on that subject, please see the next item...

    LarsMarowskyBree: That the LocalResourceManager would track local resources (which in turn represent the full node state) was just the current assumption to work with; as a node does not have any other state except for the resources which it holds, that seemed sensible. We can discuss that, obviously, but it was the way of the original proposal: the full cluster status (with respect to resources) could always be assembled by querying all LocalResourceManagers.
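
To make the request concrete, here is a minimal sketch of what "tracking" could look like under the assumptions discussed above: each LRM keeps only the last known state of every ResourceInstance it has seen on its node (including stopped and failed ones), stamped with a time and the node's reboot counter, and the cluster-wide view is assembled by asking every LRM for its table. All names here (`ResourceRecord`, `LocalResourceTable`, `cluster_status`, the state strings) are hypothetical and chosen for illustration; none of this is the actual LRM API. The table is deliberately policy-free: it records and reports, and never acts on the data.

```python
import time
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ResourceRecord:
    """Last known state of one ResourceInstance on one node (hypothetical)."""
    state: str         # e.g. "running", "stopped", "failed" (assumed labels)
    timestamp: float   # when this state was recorded
    reboot_count: int  # node reboot counter at the time of recording


class LocalResourceTable:
    """Per-node, in-memory record keeping; contents vanish on node reboot."""

    def __init__(self, node: str, reboot_count: int = 0) -> None:
        self.node = node
        self.reboot_count = reboot_count
        self._records: Dict[str, ResourceRecord] = {}

    def record(self, resource_id: str, state: str,
               now: Optional[float] = None) -> None:
        # Overwrite the previous entry: only the last known state is kept.
        stamp = time.time() if now is None else now
        self._records[resource_id] = ResourceRecord(
            state, stamp, self.reboot_count)

    def last_known(self, resource_id: str) -> Optional[ResourceRecord]:
        # None means this node has never been told about the resource.
        return self._records.get(resource_id)

    def snapshot(self) -> Dict[str, ResourceRecord]:
        # Return a copy so callers cannot modify the table.
        return dict(self._records)


def cluster_status(
    tables: Iterable[LocalResourceTable],
) -> Dict[str, Dict[str, ResourceRecord]]:
    """Combine all per-node tables into {resource_id: {node: record}}."""
    status: Dict[str, Dict[str, ResourceRecord]] = {}
    for table in tables:
        for resource_id, rec in table.snapshot().items():
            status.setdefault(resource_id, {})[table.node] = rec
    return status
```

In this sketch a resource that was started and later failed on a node still shows up in that node's table as "failed", which is the extension being asked for; whether the reboot counter lives in the LRM or is taken from heartbeat's existing restart count is left open, as in the discussion.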