Qdevice not voting in my cluster

jmalms

New Member
Aug 13, 2024
To preface: I am completely new to this and have no idea what is going on, so bear with me.

I am trying to set up a qdevice to be the third vote in my cluster, but it won't vote. The logs seem to show everything is working fine, but maybe I don't understand which logs I need to look at. I have no idea where to even start, really. I can ssh between the qdevice and the nodes: node to node, and node to qdevice. I've reinstalled everything and reconfigured things, and I am totally lost. I've spent 40 minutes googling to no avail. At some point I think I saw someone complaining about a bug like this in VE 8.1?

Here is where I am at, and I'll put the corosync.log below this:

[Attached screenshot: 1723510741769.png]

These are the logs from corosync.log on the qdevice from my last attempt:

Code:
Aug 13 00:41:39.892 [2974] orangepizero3 corosync notice  [CFG   ] Node 1 was shut down by sysadmin
Aug 13 00:41:39.896 [2974] orangepizero3 corosync notice  [SERV  ] Unloading all Corosync service engines.
Aug 13 00:41:39.896 [2974] orangepizero3 corosync info    [QB    ] withdrawing server sockets
Aug 13 00:41:39.896 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync vote quorum service v1.0
Aug 13 00:41:39.896 [2974] orangepizero3 corosync info    [QB    ] withdrawing server sockets
Aug 13 00:41:39.896 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync configuration map access
Aug 13 00:41:39.896 [2974] orangepizero3 corosync info    [QB    ] withdrawing server sockets
Aug 13 00:41:39.896 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync configuration service
Aug 13 00:41:39.896 [2974] orangepizero3 corosync info    [QB    ] withdrawing server sockets
Aug 13 00:41:39.896 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync cluster closed process group service v1.01
Aug 13 00:41:39.900 [2974] orangepizero3 corosync info    [QB    ] withdrawing server sockets
Aug 13 00:41:39.900 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync cluster quorum service v0.1
Aug 13 00:41:39.900 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync profile loading service
Aug 13 00:41:39.900 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync resource monitoring service
Aug 13 00:41:39.900 [2974] orangepizero3 corosync notice  [SERV  ] Service engine unloaded: corosync watchdog service
Aug 13 00:41:40.436 [2974] orangepizero3 corosync notice  [MAIN  ] Corosync Cluster Engine exiting normally
Aug 13 00:41:47.252 [1114] orangepizero3 corosync notice  [MAIN  ] Corosync Cluster Engine 3.1.6 starting up
Aug 13 00:41:47.252 [1114] orangepizero3 corosync info    [MAIN  ] Corosync built-in features: dbus monitoring watchdog augeas systemd xmlconf vqsim nozzle snmp pie relro bindnow
Aug 13 00:41:47.413 [1114] orangepizero3 corosync notice  [TOTEM ] Initializing transport (Kronosnet).
Aug 13 00:41:47.977 [1114] orangepizero3 corosync info    [TOTEM ] totemknet initialized
Aug 13 00:41:48.097 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync configuration map access [0]
Aug 13 00:41:48.097 [1114] orangepizero3 corosync info    [QB    ] server name: cmap
Aug 13 00:41:48.097 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync configuration service [1]
Aug 13 00:41:48.097 [1114] orangepizero3 corosync info    [QB    ] server name: cfg
Aug 13 00:41:48.097 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Aug 13 00:41:48.097 [1114] orangepizero3 corosync info    [QB    ] server name: cpg
Aug 13 00:41:48.097 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync profile loading service [4]
Aug 13 00:41:48.101 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync resource monitoring service [6]
Aug 13 00:41:48.101 [1114] orangepizero3 corosync warning [WD    ] Watchdog not enabled by configuration
Aug 13 00:41:48.101 [1114] orangepizero3 corosync warning [WD    ] resource load_15min missing a recovery key.
Aug 13 00:41:48.101 [1114] orangepizero3 corosync warning [WD    ] resource memory_used missing a recovery key.
Aug 13 00:41:48.101 [1114] orangepizero3 corosync info    [WD    ] no resources configured.
Aug 13 00:41:48.101 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync watchdog service [7]
Aug 13 00:41:48.101 [1114] orangepizero3 corosync notice  [QUORUM] Using quorum provider corosync_votequorum
Aug 13 00:41:48.101 [1114] orangepizero3 corosync notice  [QUORUM] This node is within the primary component and will provide service.
Aug 13 00:41:48.101 [1114] orangepizero3 corosync notice  [QUORUM] Members[0]:
Aug 13 00:41:48.101 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync vote quorum service v1.0 [5]
Aug 13 00:41:48.101 [1114] orangepizero3 corosync info    [QB    ] server name: votequorum
Aug 13 00:41:48.101 [1114] orangepizero3 corosync notice  [SERV  ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Aug 13 00:41:48.101 [1114] orangepizero3 corosync info    [QB    ] server name: quorum
Aug 13 00:41:48.105 [1114] orangepizero3 corosync info    [TOTEM ] Configuring link 0
Aug 13 00:41:48.105 [1114] orangepizero3 corosync info    [TOTEM ] Configured link number 0: local addr: 127.0.0.1, port=5405
Aug 13 00:41:48.105 [1114] orangepizero3 corosync info    [KNET  ] host: host: 1 (passive) best link: 0 (pri: 0)
Aug 13 00:41:48.105 [1114] orangepizero3 corosync warning [KNET  ] host: host: 1 has no active links
Aug 13 00:41:48.109 [1114] orangepizero3 corosync notice  [QUORUM] Sync members[1]: 1
Aug 13 00:41:48.109 [1114] orangepizero3 corosync notice  [QUORUM] Sync joined[1]: 1
Aug 13 00:41:48.109 [1114] orangepizero3 corosync notice  [TOTEM ] A new membership (1.a) was formed. Members joined: 1
Aug 13 00:41:48.113 [1114] orangepizero3 corosync notice  [QUORUM] Members[1]: 1
Aug 13 00:41:48.113 [1114] orangepizero3 corosync notice  [MAIN  ] Completed service synchronization, ready to provide service.
 
Ahhh. All the nodes in your cluster also need to be able to reach the Qdevice on TCP port 5403.

Is there a firewall or similar that needs to be adjusted to let traffic from the nodes through to the Qdevice?
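One quick way to test that path is to try opening a TCP connection to port 5403 from each cluster node. The snippet below is only a minimal sketch, assuming Python 3 is available on the nodes; the function name `can_reach` is made up for illustration and is not part of any Proxmox or corosync tooling.

```python
import socket


def can_reach(host: str, port: int = 5403, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A False result only says the connection attempt failed (refused,
    filtered by a firewall, timed out, or name resolution failed).
    It does not say which of those happened.
    """
    try:
        # create_connection performs the full TCP handshake and
        # raises OSError (or a subclass) if it cannot complete.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from a node, `can_reach("<qdevice address>")` returning False would be consistent with a firewall or a service not listening on the Qdevice side, though it cannot tell those two cases apart.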

Once that communication path is working, the "Total votes" count under "Votequorum information" should increase to 3, and the vote count for the Qdevice under "Membership information" should change to 1, something like this:

Code:
Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.2.104 (local)
0x00000002          1    A,V,NMW 192.168.2.106
0x00000000          1            Qdevice
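If you would rather check this with a script than by eye, the status text above can be parsed for the two numbers that matter. This is a rough sketch that assumes the output layout is exactly as shown in this post; `qdevice_summary` and `SAMPLE` are hypothetical names, and the patterns may need adjusting if your output differs.

```python
import re

# Status text copied from the example output above.
SAMPLE = """\
Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.2.104 (local)
0x00000002          1    A,V,NMW 192.168.2.106
0x00000000          1            Qdevice
"""


def qdevice_summary(status_text: str) -> dict:
    """Extract the total vote count and the Qdevice's own vote count.

    Either value is None when its line is not found in the text.
    """
    total = re.search(r"^Total votes:\s+(\d+)", status_text, re.MULTILINE)
    # The Qdevice row uses node id 0 and has no flags column.
    qdev = re.search(r"^0x0+\s+(\d+)\s+Qdevice\s*$", status_text, re.MULTILINE)
    return {
        "total_votes": int(total.group(1)) if total else None,
        "qdevice_votes": int(qdev.group(1)) if qdev else None,
    }
```

For the sample above, `qdevice_summary(SAMPLE)` gives 3 total votes and 1 Qdevice vote, which is the state you are aiming for.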
 
Thanks so much, that's probably it!

I am going to write this down to try later. I just set up my experimental homelab, which doesn't actually need to be in a cluster. I only wanted to try a cluster because it seemed fascinating and I wanted to learn more about it. For now I will just have isolated nodes, which will work just fine for what I am doing right now.

I will probably create a proper 3-node cluster in the future when I've got the cash to do so. I already had a Pi running, so I thought I'd at least try it. I did learn a decent bit even though I did not accomplish my goal!
 