Two node sheepdog cluster fail

webfrank

Member
May 28, 2011
38
1
6
Hi, I have a two node proxmox cluster in OVH with sheepdog. Everything works as expected but I got twice a network problem inside my vRack in OVH which lead to a disconnection between the nodes.
Corosync reports the new cluster configuration with only one member and the same does sheepdog but when network connection is re-estalished, corosync reports the new topology with two nodes but not sheepdog which remain with one node connected. It seems like the daemon does not listen to corosync new configuration.
The only way I could re-establish sheepdog cluster config is to restart one node which trigger a cluster recovery.

proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-29-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-29-pve: 2.6.32-126
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
Sheepdog daemon version 0.8.0