Docu review done: Mon 03 Jul 2023 17:08:17 CEST

Commands

| Command | Description |
| --- | --- |
| `crm ra list systemd` | Lists the cluster resource agents of class systemd (services managed as systemd units) |
| `crm ra list lsb` | Lists the LSB (Linux Standard Base) cluster resource agents, i.e. the init scripts found in the `/etc/init.d` directory |
| `crm ra list ocf` | Lists the OCF (Open Cluster Framework) cluster resource agents; these are extended LSB resource agents and usually support additional parameters |
| `crm resource cleanup <resource>` | Cleans up the messages (failed actions) recorded for the resource |
| `crm node standby <node1>` | Puts the node into standby |
| `crm node online <node1>` | Makes the node available again |
| `crm configure property maintenance-mode=true` | Enables maintenance mode and sets all resources to unmanaged |
| `crm configure property maintenance-mode=false` | Disables maintenance mode and sets all resources to managed |
| `crm status failcount` | Shows failcounts on top of the status output |
| `crm resource failcount <resource> set <node> 0` | Sets the failcount of the resource on the given node to 0 (needs to be done for every node) |
| `crm resource meta <resource> set is-managed false` | Sets the `is-managed` meta attribute to false, so crm stops managing the service |
| `crm resource meta <resource> set is-managed true` | Sets the `is-managed` meta attribute to true, so crm manages the service again |
| `crm resource unmanage <resource>` | Takes the resource out of crm management but keeps its functionality |
| `crm resource manage <resource>` | Puts the resource back under crm management and keeps its functionality |
| `crm resource refresh` | Re-checks for resources started outside of crm without changing anything in crm |
| `crm resource reprobe` | Re-checks for resources started outside of crm |
| `crm_mon --daemonize --as-html /var/www/html/cluster/index.html` | Generates the crm_mon output as an HTML file (can be served by a web server); should run on all nodes |
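
Because the failcount has to be reset on every node separately, the `crm resource failcount` command from the table above ends up being repeated once per node. A minimal sketch of such a loop follows. The function name `reset_failcounts` and the `CRM` variable are illustrative assumptions and not part of crmsh; setting `CRM="echo crm"` only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper, not part of crmsh.
# CRM defaults to the real "crm" binary; set CRM="echo crm" for a dry run.
CRM="${CRM:-crm}"

# Usage: reset_failcounts <resource> <node> [<node> ...]
reset_failcounts() {
    resource="$1"
    shift
    for node in "$@"; do
        # Same command as in the table, repeated for each node
        $CRM resource failcount "$resource" set "$node" 0
    done
}

# Example (dry run):
#   CRM="echo crm" reset_failcounts sambad server1 server2
```

Running the dry run first makes it easy to check the resource and node names before anything is changed in the cluster.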

Installation

Sample: https://zeldor.biz/2010/12/activepassive-cluster-with-pacemaker-corosync/

Errors and Solutions

Standby on fail

Node server1: standby (on-fail)
Online: [ server2 ]

Full list of resources:

 Resource Group: samba
     fs_share   (ocf::heartbeat:Filesystem):    Started server2
     ip_share   (ocf::heartbeat:IPaddr2):   Started server2
     sambad (systemd:smbd): Started server2
     syncthingTomcat    (systemd:syncthing@tomcat): server2
     syncthingShare (systemd:syncthing@share):  Started server2
     syncthingRoot  (systemd:syncthing@root):   Started server2
     worm_share_syncd   (systemd:worm_share_syncd): Started server2
 Clone Set: ms_drbd_share [drbd_share] (promotable)
     Masters: [ server2 ]
     Stopped: [ server1 ]

Failed Resource Actions:
* sambad_monitor_15000 on server1 'unknown error' (1): call=238, status=complete, exitreason='',
    last-rc-change='Mon Jul 26 06:28:15 2021', queued=0ms, exec=0ms
* syncthingTomcat_monitor_15000 on server1 'unknown error' (1): call=239, status=complete, exitreason='',
    last-rc-change='Mon Jul 26 06:28:15 2021', queued=0ms, exec=0ms

Solution

Refreshing only the failed resources might have been enough, but this has not been confirmed. The sequence that was used:

crm resource unmanage ms_drbd_share
crm resource refresh ms_drbd_share
crm resource refresh syncthingTomcat
crm resource manage ms_drbd_share
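
The four commands above could be wrapped in a small function so that the unmanaged resource is always handed back to the cluster, even if one of the refreshes fails. This is only a sketch under assumptions: the function name `refresh_unmanaged` and the `CRM` override are hypothetical, and the resource names are the ones from the status output in this section.

```shell
#!/bin/sh
# Hypothetical helper, not part of crmsh.
# CRM defaults to the real "crm" binary; set CRM="echo crm" for a dry run.
CRM="${CRM:-crm}"

# Usage: refresh_unmanaged <guarded-resource> [<other-resource> ...]
# Unmanages the first resource, refreshes it and all other listed
# resources, then puts the first resource back under cluster control.
refresh_unmanaged() {
    guarded="$1"
    shift
    $CRM resource unmanage "$guarded" || return 1
    rc=0
    for rsc in "$guarded" "$@"; do
        $CRM resource refresh "$rsc" || rc=1
    done
    # Re-manage in any case, even if a refresh failed
    $CRM resource manage "$guarded" || rc=1
    return "$rc"
}

# Example (dry run):
#   CRM="echo crm" refresh_unmanaged ms_drbd_share syncthingTomcat
```

With the dry run shown in the last comment, the function prints the same four commands in the same order as listed above.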