Ciblint
From Linux-HA
ciblint
The ciblint program examines your CIB in detail, looking for inconsistencies, possible errors, and things you might not have noticed, and prints out whatever it finds. Not everything it reports is an error, but each item is probably worth investigating to make sure you understand it. The version below works with current versions of Pacemaker.
Source to ciblint
media:Ciblint.gz
You may have to change the value of the HA_LIBHBDIR constant in the code. This will eventually be fixed once the program is put under source control somewhere.
ciblint usage message
usage: ./ciblint [-C] -f cib-file
       ./ciblint [-C] -L
       ./ciblint [-w] (-A|--list-meta_attributes-config-options
       ./ciblint [-w] (-l|--list-crm-config-options)
       ./ciblint [-w] -h --help

  -f cib-filename
      analyze CIB from this XML file
  -L --live-cib
      analyze live CIB gotten via cibadmin -Q
  -C --ignore-non-defaults
      don't print messages for non-default crm_config values
  -w --wiki-format
      print usage or crm-config-options in wiki format
  -l --list-crm_config-options
      print all valid names for use in <nvpair> sections inside the <crm_config> section
  -A --list-meta_attributes-config-options
      print all valid names for use in <nvpair> sections inside <meta_attributes> sections

CIB file can either have a status section or not. Either is acceptable.

This program is a work-in-progress, but for many CIBs it's probably useful now.

It currently looks for a number of classes of possible errors, including these:
- Non-unique 'id' strings for the given <tag>
- 'id' strings (outside the status section) which are not globally unique
- Incorrect <nvpair> 'name's or 'value's
- Duplicate <nvpair> 'name's in a list
- Incorrect XML attribute names or values
- references to non-existent resources
- references to non-existent nodes
- invalid values for the data type (integer, boolean, enum, etc.) involved
- resources you're not monitoring
- non-negative values for default-resource-failure-stickiness
- STONITH not enabled
- No STONITH resources configured
- Use of for-testing-only ssh or external/ssh STONITH resource agents
- validation of <nvpair> names and values in <meta_attributes> sections
- validation of class and type for <primitive> resources
- validation of <attributes> names and values in <nvpair>s for <primitive> resources
- check if clone form resource names are used for non-clone resources
- ensure that clone form resource names have integral clone numbers
- check for ids with ":" characters in them in <primitive>, <group>, <clone>, or <master_slave> tags
- check for resources with non-zero failcounts
- check for <rule> tags with both score and score_attribute
- check for missing values and attributes in expressions
- make sure attributes mentioned in the CIB are defined somewhere in the cluster

More documentation can be found online at http://linux-ha.org/ciblint
The section above is the result of running ciblint -w -h
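To give a feel for what two of the listed checks involve (globally unique 'id' strings outside the status section, and resources that are not being monitored), here is a minimal Python sketch. It is not ciblint's own code. The sample CIB fragment and the function names are hypothetical, and the sketch assumes that monitoring is expressed as an <op name="monitor"> element nested inside each <primitive>.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical CIB fragment, made up for illustration only.
SAMPLE_CIB = """\
<cib>
  <configuration>
    <resources>
      <primitive id="vip1" class="ocf" provider="heartbeat" type="IPaddr">
        <operations>
          <op id="vip1-mon" name="monitor" interval="10s"/>
        </operations>
      </primitive>
      <primitive id="app1" class="lsb" type="app1"/>
      <primitive id="vip1" class="ocf" provider="heartbeat" type="IPaddr">
        <operations>
          <op id="vip1-mon2" name="monitor" interval="10s"/>
        </operations>
      </primitive>
    </resources>
  </configuration>
  <status/>
</cib>
"""


def duplicate_ids(root):
    """Return the 'id' values used more than once outside <status>."""
    counts = Counter()

    def walk(elem):
        # The status section is skipped entirely, as in the check description.
        if elem.tag == "status":
            return
        if "id" in elem.attrib:
            counts[elem.attrib["id"]] += 1
        for child in elem:
            walk(child)

    walk(root)
    return sorted(i for i, n in counts.items() if n > 1)


def unmonitored_primitives(root):
    """Return ids of <primitive> resources that have no monitor operation."""
    missing = []
    for prim in root.iter("primitive"):
        if not any(op.get("name") == "monitor" for op in prim.iter("op")):
            missing.append(prim.get("id"))
    return missing


root = ET.fromstring(SAMPLE_CIB)
print("duplicate ids:", duplicate_ids(root))
print("unmonitored primitives:", unmonitored_primitives(root))
```

A real CIB saved with cibadmin -Q could be loaded with ET.parse(filename).getroot() instead of the inline sample.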
Special Notes
For ciblint to do the most checking, it needs to run some harmless lrmadmin commands as root. Currently, it attempts to do that using sudo, which may result in some password: prompts.
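If you want to know ahead of time whether sudo is going to prompt, one option is to probe it in non-interactive mode. The sketch below is only an illustration and is not part of ciblint. The function name is hypothetical, and it relies on sudo's -n option, which makes sudo fail rather than ask for a password.

```python
import subprocess


def sudo_would_prompt(run=subprocess.run):
    """Return True if sudo cannot run a command without asking for a password.

    The 'run' parameter defaults to subprocess.run and exists only so the
    function can be exercised without a real sudo installation.
    """
    # "sudo -n true" exits non-zero instead of prompting when a password
    # would be required.
    result = run(
        ["sudo", "-n", "true"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode != 0
```

If sudo_would_prompt() returns True, running sudo -v beforehand caches your credentials, so the ciblint run that follows should not be interrupted by prompts.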
Sample Output
INFO: CIB has non-default value for expected-quorum-votes [3]. Default value is [2]
Explanation of expected-quorum-votes option: The number of nodes expected to be in the cluster
Used to calculate quorum in openais based clusters.
INFO: CIB has non-default value for startup-fencing [false]. Default value is [true]
Explanation of startup-fencing option: STONITH unseen nodes
Advanced Use Only! Not using the default is very unsafe!
INFO: CIB has non-default value for pe-input-series-max [5000]. Default value is [-1]
Explanation of pe-input-series-max option: The number of other PE inputs to save
Zero to disable, -1 to store unlimited.
INFO: CIB has non-default value for dc-deadtime [5s]. Default value is [60s]
Explanation of dc-deadtime option: How long to wait for a response from other nodes during startup.
The "correct" value will depend on the speed/load of your network and the type of switches used.
INFO: CIB has non-default value for no-quorum-policy [ignore]. Default value is [stop]
Explanation of no-quorum-policy option: What to do when the cluster does not have quorum
What to do when the cluster does not have quorum
Allowed values: stop, freeze, ignore, suicide
INFO: CIB has non-default value for cluster-recheck-interval [15m]. Default value is [15min]
Explanation of cluster-recheck-interval option: Polling interval for time based changes to options, resource parameters and constraints.
The Cluster is primarily event driven, however the configuration can have elements that change based on time. To ensure these changes take effect, we can optionally poll the cluster's status for changes.
Allowed values: Zero disables polling. Positive values are an interval in seconds (unless other SI units are specified. eg. 5min)
INFO: CIB has non-default value for batch-limit [10]. Default value is [30]
Explanation of batch-limit option: The number of jobs that the TE is allowed to execute in parallel
The "correct" value will depend on the speed and load of your network and cluster nodes.
WARNING: external/ssh STONITH resource NOT approved for production
INFO: Resource vip1 running on node xen-e
INFO: Resource ping-1:0 running on node xen-e
WARNING: Resource stateful-1:0 not running anywhere.
INFO: Resource ping-1:2 running on node xen-d
INFO: Resource app1 running on node xen-d
INFO: Resource d2 running on node xen-d
INFO: Resource migrator running on node xen-d
INFO: Resource ping-1:1 running on node xen-f
WARNING: Resource FencingChild not running anywhere.
INFO: Resource stateful-1:1 running on node xen-f
INFO: Resource stateful-1:2 running on node xen-d
INFO: Resource d1 running on node xen-d
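Since most of this output is informational, it can be handy to pull out just the warnings. The following Python sketch shows one way to do that. It assumes each message starts on its own line with an INFO: or WARNING: prefix, as in the sample; other prefixes may exist but do not appear here. The function name and the shortened sample text are illustrative only.

```python
import re
from collections import Counter

# Only the two prefixes seen in the sample output are recognized.
LINE_RE = re.compile(r"^(INFO|WARNING):\s*(.*)$")

# A few lines taken from the sample output above.
SAMPLE_OUTPUT = """\
INFO: CIB has non-default value for batch-limit [10]. Default value is [30]
Explanation of batch-limit option: The number of jobs that the TE is allowed to execute in parallel
WARNING: external/ssh STONITH resource NOT approved for production
INFO: Resource vip1 running on node xen-e
WARNING: Resource stateful-1:0 not running anywhere.
WARNING: Resource FencingChild not running anywhere.
"""


def parse_messages(text):
    """Return (level, message) pairs; explanation lines are skipped."""
    messages = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            messages.append((match.group(1), match.group(2)))
    return messages


messages = parse_messages(SAMPLE_OUTPUT)
counts = Counter(level for level, _ in messages)
warning_msgs = [msg for level, msg in messages if level == "WARNING"]

print(dict(counts))
for msg in warning_msgs:
    print("WARNING:", msg)
```

To use it on real output, capture what ciblint prints into a string and pass that to parse_messages() in place of SAMPLE_OUTPUT.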