1 node is out of sync with the cluster

PiotrDev

Here are some details from the logs: https://pastebin.com/dhUUKSGm
Basically, one node dropped out of the cluster. In the logs I see a problem with a replica file (that file exists on the other, working nodes; on the failing node the directory /etc/pve/priv is empty, and the directory doesn't even have the "w" flag for writing).
Can you suggest where I should look?

pvecm status on the "broken" node:

Code:
root@anna4 /etc/pve # pvecm status
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
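
For anyone hitting the same output: "ipcc_send_rec ... Connection refused" is commonly reported when the cluster filesystem service (pve-cluster, which runs pmxcfs) is not running on the node. Whether that is the cause here is an assumption, since the logs above do not confirm it. A minimal Python sketch for checking the two relevant services, assuming the standard unit names pve-cluster and corosync (the helper names unit_state and not_running are made up for illustration):

```python
import subprocess

# Assumed unit names (standard on Proxmox VE); adjust if your setup differs.
UNITS = ("pve-cluster", "corosync")


def unit_state(unit: str) -> str:
    """Return the state printed by `systemctl is-active`, e.g. 'active' or 'failed'."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,  # is-active exits non-zero for inactive units
        )
    except FileNotFoundError:
        return "unknown"  # systemctl is not available on this machine
    return result.stdout.strip() or "unknown"


def not_running(states: dict) -> list:
    """List the units whose reported state is anything other than 'active'."""
    return [unit for unit, state in states.items() if state != "active"]


if __name__ == "__main__":
    states = {unit: unit_state(unit) for unit in UNITS}
    for unit, state in states.items():
        print(f"{unit}: {state}")
    for unit in not_running(states):
        print(f"check the journal for {unit}, e.g. journalctl -u {unit}")
```

If either unit is reported as not active, its journal is probably the first place to look.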

And on one of the 2 working nodes:
Code:
root@anna3 /etc/pve # pvecm status
Quorum information
------------------
Date:             Tue Feb 18 18:16:22 2020
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.41668
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.10.1.3 (local)
0x00000003          1 10.10.1.5
 
After restarting pvestatd on the disconnected node, pvecm status/nodes started showing output, but the node seems to think the other machines are unreachable:

Code:
root@anna4 /etc/pve # pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         2          1 anna4 (local)
        
root@anna4 /etc/pve # pvecm status
Quorum information
------------------
Date:             Tue Feb 18 18:29:26 2020
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          1/267956
Quorate:          No

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:           

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 10.10.1.4 (local)
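
The vote numbers in the two status outputs above can be checked with a small sketch. It assumes the default votequorum majority rule (more than half of the expected votes) with no special options such as two_node enabled; the function names are made up for illustration.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True if the partition holds at least a majority of the expected votes."""
    return total_votes >= quorum_threshold(expected_votes)


# Values taken from the pvecm status outputs in this thread.
print(quorum_threshold(3))   # "Quorum:" line on both sides
print(is_quorate(2, 3))      # anna3 side: Total votes 2
print(is_quorate(1, 3))      # anna4 side: Total votes 1
```

With 3 expected votes the threshold is 2, so the two-node partition stays quorate while the isolated node (1 vote) reports "Activity blocked". That is consistent with the isolated node not seeing the other two members at all.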
 
How is your network configured? Do you have a dedicated physical NIC for corosync?
 
