We are using a three-node cluster (version 3.3-5). Currently we are having an issue with rgmanager.
On each node, rgmanager doesn't start after reboot.
So each time after boot I have to restart cman and afterwards start rgmanager manually.
After this procedure everything works fine until the next reboot. I have no clue why rgmanager doesn't start automatically. Currently we have one VM running in HA mode, so there is also an 'rm' section in the cluster config.
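For reference, the manual workaround I run after every reboot looks roughly like this; the runlevel check at the end is only my guess at where the cause might be:
Code:
# manual workaround after every reboot (on the affected node)
service cman restart
service rgmanager start

# check that both init scripts are enabled for the default runlevel
# (runlevel 2 on Debian Wheezy) - just a guess at where the cause might be
ls /etc/rc2.d/ | grep -E 'cman|rgmanager'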
Here are my configs:
Code:
pveversion -v
proxmox-ve-2.6.32: 3.3-139 (running kernel: 2.6.32-34-pve)
pve-manager: 3.3-5 (running version: 3.3-5/bfebec03)
pve-kernel-2.6.32-32-pve: 2.6.32-136
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-34-pve: 2.6.32-140
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.3-3
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-25
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-10
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1
We are using three bonded networks. Bond0 is used for the Proxmox cluster, Bond1 for NFS backup connections, and Bond2 for Ceph communication.
Code:
# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual
iface eth4 inet manual
iface eth5 inet manual

auto bond0
iface bond0 inet manual
    slaves eth0 eth2
    bond_miimon 100
    bond_mode 802.3ad

auto bond1
iface bond1 inet manual
    slaves eth1 eth3
    bond_miimon 100
    bond_mode 802.3ad

auto bond2
iface bond2 inet manual
    slaves eth4 eth5
    bond_miimon 100
    bond_mode 802.3ad

auto vmbr1
iface vmbr1 inet static
    address 192.168.151.3
    netmask 255.255.255.0
    bridge_ports bond1
    bridge_stp off
    bridge_fd 3

auto vmbr1:0
iface vmbr1:0 inet static
    address 192.168.153.3
    netmask 255.255.255.0

auto vmbr1:1
iface vmbr1:1 inet static
    address 192.168.154.3
    netmask 255.255.255.0

auto vmbr0
iface vmbr0 inet static
    address 172.18.0.32
    netmask 255.255.252.0
    gateway 172.18.0.1
    bridge_ports bond0
    bridge_stp off
    bridge_fd 3

auto vmbr2
iface vmbr2 inet static
    address 192.168.152.3
    netmask 255.255.255.0
    bridge_ports bond2
    bridge_stp off
    bridge_fd 3
Code:
cat /etc/pve/cluster.conf

<?xml version="1.0"?>
<cluster config_version="18" name="dmc-cluster-ni">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" auth="password" ipaddr="172.18.0.33" lanplus="1" login="YY" name="lx-vmhost-ni0-ipmi" passwd="XX" power_wait="10"/>
    <fencedevice agent="fence_ipmilan" auth="password" ipaddr="172.18.0.36" lanplus="1" login="YY" name="lx-vmhost-ni1-ipmi" passwd="XX" power_wait="10"/>
    <fencedevice agent="fence_ipmilan" auth="password" ipaddr="172.18.0.38" lanplus="1" login="YY" name="lx-vmhost-ni2-ipmi" passwd="XX" power_wait="10"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="lx-vmhost-ni1" nodeid="1" votes="1">
      <fence>
        <method name="power">
          <device name="lx-vmhost-ni1-ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="lx-vmhost-ni0" nodeid="2" votes="1">
      <fence>
        <method name="power">
          <device name="lx-vmhost-ni0-ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="lx-vmhost-ni2" nodeid="3" votes="1">
      <fence>
        <method name="power">
          <device name="lx-vmhost-ni2-ipmi"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
  </rm>
</cluster>
We are also running into another problem: after each reboot of any of our nodes, one or two Ceph OSDs on the rebooted node are marked as 'out'. I can manually start the OSD and afterwards it works fine.
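For completeness, this is roughly how I bring the OSD back; osd.2 is only an example ID here, the affected OSD varies:
Code:
# see which OSDs are down/out after the reboot
ceph osd tree

# start the OSD daemon on the rebooted node (sysvinit on Proxmox 3.x)
service ceph start osd.2

# mark it back in, in case it does not rejoin on its own
ceph osd in 2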
Could we possibly have a network issue during the boot-up phase? I can't see any error messages after startup, but maybe I'm looking in the wrong places. Please give me a hint about what further information is needed to investigate these issues.
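So far I've mainly been looking at syslog. If it helps, these are the checks I can run right after a reboot and post here (just a rough list, happy to collect anything else):
Code:
# cluster / rgmanager state
clustat
fence_tool ls

# LACP state of the bond carrying the Proxmox cluster traffic
cat /proc/net/bonding/bond0

# boot-time messages from the cluster stack and the bonds
grep -iE 'cman|rgmanager|fenced|bond' /var/log/syslog

# Ceph OSD state after the reboot
ceph osd tree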
Any help is appreciated.