corosync

  1. A

    Proxmox cluster lost synchronization

    Hello, today our cluster lost synchronization. Most of the nodes were shown as offline or unknown. The nodes were up, but every node could see only itself and a few other nodes. Restarting pve-cluster and corosync didn't help, so we brought everything down and started them one by one. For most...
  2. TwiX

    PVE7 - corosync upgrade rebooted all nodes !

    Hi, I tried to upgrade a 6-node PVE 7 cluster yesterday. We use Ceph and HA for all VMs. I was able to upgrade 4 nodes without issue. But on the fifth node, I lost the whole cluster. All nodes rebooted! Syslogs for nodes 22 to 27: update ok for nodes 22, 23, 24 and 27, upgrade of node 25...
  3. Y

    Upgrade proxmox from 5 to 6

    Hello, what is the best choice for a 12-node cluster? 1) Upgrade nodes one by one. Initially, the newly upgraded node(s) will not be quorate on their own. Once at least half of the nodes plus one have been upgraded, the upgraded partition will become quorate and the not-yet-upgraded...
  4. T

    Some notes and questions about Proxmox Cluster networking

    Hello, I am trying to find more information about Proxmox Cluster networking, and especially the use of ports 22, 5404 and 5405 for intra-cluster communication. I feel like the PVE admin guide could be updated with more accurate information (some of which I am contributing in this thread). I...
  5. T

    Issues joining new servers to clusters

    Hi all, I appreciate any feedback I get here. I have an existing 5-node cluster. It runs Ceph; I don't believe this is applicable to the situation, but thought I'd mention it. I also have an existing 3-node cluster. No shared storage. I've migrated everything away from the 3-node cluster and...
  6. S

    48 node pve cluster

    Hello everyone, we're running a 48 node pve cluster with this setup: AMD EPYC 7402P, 512GB Memory, Intel X520-DA2 or Mellanox Connect X3 NIC, Ceph Pool with only NVMe, 2x 10Gbit/s interfaces (for cluster traffic) + 2x 1G (for public traffic). As a few others have recently reported in the forum...
  7. A

    Adding node to cluster crash all nodes from it

    Hi, tonight is the second time I have faced a huge problem when trying to add a node to my existing cluster. My cluster contains approximately 15 nodes, I use Ceph as storage, and everything is working pretty well. The FQDNs of all our nodes and "future" nodes are contained in /etc/hosts like: -...
  8. L

    Qdevice is not voting

    Hello guys, I have a problem with my QDevice. If I type "pvecm status", my first node gives the following result: root@pve1:~# pvecm status Cluster information ------------------- Name: server Config Version: 7 Transport: knet Secure auth: on Quorum information...
  9. M

    Whole cluster randomly rebooted twice (maybe corosync?)

    Dear all, I am currently dealing with a problem in a hyperconverged (Ceph) cluster where the whole cluster reboots seemingly at random. Every single node (of seven in total) resets at the same time. I suspect corosync is unable to communicate properly. This problem has only popped up...
  10. L

    1x NIC to 2x NICs (Keep VMs/WAN on 1st & Move PVE/Corosync/etc to 2nd)

    Hi there, I'm looking to update the networking setup for Proxmox VE now that the switches have been swapped out for more capable models. At the moment, each host has 1x NIC connected, which is serving VMs (WAN) and PVE (Web GUI, SSH, Corosync, etc). Static addressing for PVE is set against...
  11. I

    Multiple cluster in corosync

    Hello all, maybe someone can help: how do I add a second PVE cluster (as an observer) to an already existing PVE cluster? How can I add a second, separate CLUSTER-2 here? # cat /etc/pve/corosync.conf nodelist { node { name: n1 nodeid: 1 quorum_votes: 1 ring0_addr: n1 mcastaddr...
  12. H

    corosync constant retransmit

    Hello, I have up to 16 nodes in a Proxmox cluster and corosync is constantly showing retransmits in the logs: Jan 05 13:38:15 node2 corosync[18227]: [TOTEM ] Retransmit List: 66a9 Jan 05 13:38:23 node2 corosync[18227]: [TOTEM ] Retransmit List: 6708 Jan 05 13:38:30 node2 corosync[18227]...
  13. Y

    [SOLVED] corosync crash when adding a 15th node

    Previously I had a cluster of 13 nodes. I added 3 more nodes, and within 5 minutes I lost the whole cluster. I restarted corosync on the nodes one by one, but when I start a 15th node I get this message: corosync[29232]: [TOTEM ] Token has not been received in 380 ms then after a few minutes the cluster...
  14. A

    Corosync - Mysterious reboot after network flapping

    Hello, I have four servers in a cluster. Last night, we experienced heavy network flapping on 'srva' (private network and public network), with an impact on the private network '10.50.255.0/24'. The expected behavior was to get the three nodes (srvb, srvc, srvd) working together and the node...
  15. se4n_1

    [SOLVED] corosync-qdevice.service fails to start with 'received server error 18. Disconnecting from server'

    I have a problem getting a QDevice to work on Proxmox 6.2-12. First I installed the QDevice package on the third (witness) box (Raspberry Pi OS 20-08-2020): # apt install corosync-qnetd Reading package lists... Done Building dependency tree Reading state information... Done The following NEW...
  16. C

    Made mistake in corosync.conf; now cannot edit

    I have (had) a 3 node Proxmox VE 6.2-11 and Ceph cluster. I'm modifying my config after install and some light use. Ceph is now on its own 10Gx2 LAN. I decided to dedicate a 1Gb interface and create a VLAN for corosync and attempted to modify corosync.conf before understanding exactly what...
  17. M

    HA Design

    I currently have a 4 node HCI cluster that's working quite well. It will be expanding to 8 nodes total and be used for critical services. All of the testing was satisfactory and management was duly impressed. I am reinstalling the cluster from scratch in order to ensure none of the testing bits...
  18. 1

    Corosync Cluster Engine is dead. Is this normal?

    Hello. I recently installed Proxmox on one physical server (node). I was browsing around the node's settings when I noticed that under System, the status of the Corosync Cluster Engine is shown as dead. I did some Googling and learned that the Corosync Cluster Engine is how physical...
  19. I

    IP Range Correction

    Hey guys, I'm stuck with an issue: we replaced some of the servers in our cluster, and somewhere we made a mistake. Can someone please help me? We would like the public IP range to be the 129.232.156.xx range, and the Ceph data sync IP range to be 10.161.0.xx. The reason obviously being that...
  20. S

    2-node cluster: concerns about separating the Corosync network

    I now want to move my Corosync network to a different network interface, following this wiki article. At the moment there are only 2 nodes in the cluster. My concern is that if I change the file to point to the new network card as described in the guide, and I do this on only one node via...
