Broken cluster after reboot

galphanet
Long story short, we had to reboot our Proxmox cluster (6 nodes) after a switch change, and everything went south.
We had HA enabled, and at some point each node decided to start every VM, which corrupted their disks on the shared storage.

That part was our fault. Here is what we ran on the nodes to get the VMs started again:
Code:
pvecm expected 1
pmxcfs -l

So now each node is running its VMs, but the nodes don't see each other, and we need help getting the cluster back together. Thanks for your time.
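From what we have read, our rough plan to get back to clustered mode once the network is confirmed good is below. We have not run it yet because we are worried HA will start the VMs twice again, so please correct us if this is wrong:
Code:
# 1) on every node: stop the HA services first so nothing gets
#    auto-started while the cluster re-forms (our assumption)
systemctl stop pve-ha-lrm pve-ha-crm

# 2) on every node: stop the pmxcfs instance we started in local mode
systemctl stop pve-cluster
killall pmxcfs

# 3) restart corosync so the 6 nodes can see each other again
systemctl restart corosync

# 4) start the cluster filesystem in normal (clustered) mode
systemctl start pve-cluster

# 5) check that Expected votes is back to 6 instead of the forced 1
pvecm status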

Each node shows only itself in pvecm:
Code:
# pvecm status
Quorum information
------------------
Date:             Thu Sep 13 19:56:48 2018
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000003
Ring ID:          3/56416
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   1
Highest expected: 1
Total votes:      1
Quorum:           1
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000003          1 10.50.188.146 (local)
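In case it is useful, we can also post the corosync view from each node; these are the commands we would run (assuming they are the right ones to compare against pvecm):
Code:
# quorum state as corosync itself sees it
corosync-quorumtool -s

# current totem membership according to corosync
corosync-cmapctl | grep members

# corosync log since the reboot
journalctl -u corosync -b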

Code:
# systemctl status pve-cluster
Sep 13 19:41:37 blade6 systemd[1]: Starting The Proxmox VE cluster filesystem...
Sep 13 19:41:37 blade6 pmxcfs[64119]: [status] notice: update cluster info (cluster name qls2, version = 6)
Sep 13 19:41:37 blade6 pmxcfs[64119]: [dcdb] notice: members: 3/64119
Sep 13 19:41:37 blade6 pmxcfs[64119]: [dcdb] notice: all data is up to date
Sep 13 19:41:37 blade6 pmxcfs[64119]: [status] notice: members: 3/64119
Sep 13 19:41:37 blade6 pmxcfs[64119]: [status] notice: all data is up to date
Sep 13 19:41:38 blade6 systemd[1]: Started The Proxmox VE cluster filesystem.
Sep 13 19:50:44 blade6 pmxcfs[64119]: [status] notice: node has quorum

Code:
# systemctl status pvestatd.service
Sep 13 19:40:54 blade6 pvestatd[3404]: ipcc_send_rec[1] failed: Connection refused
Sep 13 19:40:54 blade6 pvestatd[3404]: ipcc_send_rec[2] failed: Connection refused
Sep 13 19:40:54 blade6 pvestatd[3404]: ipcc_send_rec[3] failed: Connection refused
Sep 13 19:40:54 blade6 pvestatd[3404]: ipcc_send_rec[4] failed: Connection refused
Sep 13 19:40:54 blade6 pvestatd[3404]: status update error: Connection refused
Sep 13 19:41:04 blade6 pvestatd[3404]: ipcc_send_rec[1] failed: Connection refused
Sep 13 19:41:04 blade6 pvestatd[3404]: ipcc_send_rec[2] failed: Connection refused
Sep 13 19:41:04 blade6 pvestatd[3404]: ipcc_send_rec[3] failed: Connection refused
Sep 13 19:41:04 blade6 pvestatd[3404]: ipcc_send_rec[4] failed: Connection refused
Sep 13 19:41:04 blade6 pvestatd[3404]: status update error: Connection refused
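If we read the timestamps right, those errors are only from the window before pmxcfs came back up (19:40:54 vs. 19:41:37 in the pve-cluster log above). We assume that once the cluster is healthy again a restart clears anything left over, but please confirm:
Code:
systemctl restart pvestatd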

Code:
root@blade6:~# pveversion --verbose
proxmox-ve: 5.2-2 (running kernel: 4.15.18-1-pve)
pve-manager: 5.2-6 (running version: 5.2-6/bcd5f008)
pve-kernel-4.15: 5.2-4
pve-kernel-4.13: 5.2-2
pve-kernel-4.15.18-1-pve: 4.15.18-17
pve-kernel-4.15.17-2-pve: 4.15.17-10
pve-kernel-4.13.16-4-pve: 4.13.16-51
pve-kernel-4.13.16-3-pve: 4.13.16-50
pve-kernel-4.13.16-1-pve: 4.13.16-46
pve-kernel-4.13.13-6-pve: 4.13.13-42
pve-kernel-4.13.13-5-pve: 4.13.13-38
pve-kernel-4.13.13-4-pve: 4.13.13-35
pve-kernel-4.13.13-2-pve: 4.13.13-33
pve-kernel-4.10.17-2-pve: 4.10.17-20
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-37
libpve-guest-common-perl: 2.0-17
libpve-http-server-perl: 2.0-9
libpve-storage-perl: 5.0-24
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-3
lxcfs: 3.0.0-1
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-19
pve-cluster: 5.0-29
pve-container: 2.0-24
pve-docs: 5.2-5
pve-firewall: 3.0-13
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-30
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.9-pve1~bpo9
 
