After upgrading from PVE 6 to PVE 7, the syslog of both servers is spammed with:
corosync: [KNET ] nsscrypto: Incorrect packet size.
After some googling I haven't found a solution yet.
Hopefully someone can help.
Related corosync GitHub issue
The reason was the...
Before I begin I want to thank you all for your great posts/replies here. We've been using Proxmox for over a year and have never made a post, as we've always found the answers here by searching (and that could be the case here too, but I'm currently on vacation and my wife is gonna kill me if...
We upgraded Proxmox to 7 with the latest Ceph, and almost everything is back to normal.
Currently our main network is shared with corosync and consists of 11 nodes (4 of them with Ceph).
We plan to double the server count in the near future, and I'm thinking of moving corosync...
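For reference, a dedicated corosync network is normally added as an extra link in corosync.conf; a minimal sketch of one nodelist entry, assuming a hypothetical 10.10.10.0/24 corosync-only subnet:

node {
  name: node1
  nodeid: 1
  quorum_votes: 1
  ring0_addr: 10.10.10.1   # dedicated corosync link
  ring1_addr: 192.168.1.1  # existing shared network kept as a fallback link
}

Since corosync 3 uses kronosnet, it fails over between links on its own, so keeping the old network as a second link adds redundancy.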
Today our cluster lost synchronization. Most of the nodes were shown as offline or unknown. The nodes were up, but every node could see only itself and a few other nodes.
Restarting pve-cluster and corosync didn't help, so we brought everything down and started the nodes one by one.
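For anyone hitting the same state, the per-node restart sequence is usually along these lines (a sketch, not the poster's exact commands):

systemctl stop pve-cluster corosync
systemctl start corosync
systemctl start pve-cluster
pvecm status    # verify the node rejoined and the cluster is quorate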
I tried to upgrade a 6-node cluster to PVE 7 yesterday.
We use Ceph and HA for all VMs.
I was able to upgrade 4 nodes without issue.
But on the fifth node, I lost the whole cluster. All nodes rebooted!
Syslogs for nodes 22 to 27:
update OK for nodes 22, 23, 24 and 27
upgrade of node 25...
What is the best choice for a 12-node cluster?
1) Upgrade nodes one by one. Initially, the newly upgraded node(s) will not be quorate on their own. Once at least half of the nodes plus one have been upgraded, the upgraded partition will become quorate and the not-yet-upgraded...
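As a quick sanity check of the quorum math for 12 nodes, with one vote per node:

# quorum = floor(votes / 2) + 1  ->  floor(12 / 2) + 1 = 7
# so the upgraded partition becomes quorate once it holds 7 of the 12 nodes
pvecm status | grep -E 'Expected votes|Total votes|Quorum:'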
I am trying to find more information about Proxmox cluster networking, and especially the use of ports 22, 5404 and 5405 for intra-cluster communication. I feel like the PVE admin guide could be updated with more accurate information (some of which I am contributing in this thread).
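To summarize what is documented: corosync cluster traffic runs over UDP (ports 5404/5405 in the corosync 2 era; current knet-based setups are listed in the PVE docs as UDP 5405-5412, one port per link), while TCP port 22 is plain SSH, used for node joins and migration. A rough firewall sketch, assuming a hypothetical cluster subnet of 192.168.10.0/24:

iptables -A INPUT -p udp -s 192.168.10.0/24 --dport 5404:5412 -j ACCEPT  # corosync (defaults; verify against your totem config)
iptables -A INPUT -p tcp -s 192.168.10.0/24 --dport 22 -j ACCEPT         # SSH: node join, pvecm, migration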
Hi All, I appreciate any feedback I get here.
I have an existing 5-node cluster. It runs Ceph, but I don't believe that's applicable to the situation; I thought I'd mention it anyway.
I also have an existing 3-node cluster. No shared storage.
I've migrated everything away from the 3-node cluster and...
We're running a 48-node PVE cluster with this setup:
AMD EPYC 7402P, 512 GB memory, Intel X520-DA2 or Mellanox ConnectX-3 NIC, Ceph pool with only NVMe, 2x 10 Gbit/s interfaces (for cluster traffic) + 2x 1 Gbit/s (for public traffic).
As a few others have recently reported in the forum...
Hi, tonight is the second time I've faced a huge problem when trying to add a node to my existing cluster.
My cluster contains approximately 15 nodes, I use Ceph as storage, and everything is working pretty well.
All our nodes' and "future" nodes' FQDNs are contained in /etc/hosts, like:
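Such /etc/hosts entries typically look like this (hostnames and addresses here are hypothetical):

192.168.10.11  pve-node01.example.com  pve-node01
192.168.10.12  pve-node02.example.com  pve-node02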
I have a problem with my QDevice. If I type "pvecm status", my first node gives the following result:
root@pve1:~# pvecm status
Config Version: 7
Secure auth: on
I am currently dealing with a problem in a hyperconverged (Ceph) cluster where the whole cluster reboots seemingly at random. Every single one of the seven nodes resets at the same time. I suspect corosync is not able to communicate properly. This problem has only popped up...
I'm looking to update the networking setup for Proxmox VE now that the switches have been swapped out for more capable models.
At the moment, each host has 1x NIC connected, which serves both VMs (WAN) and PVE (web GUI, SSH, Corosync, etc.). Static addressing for PVE is set against...
Maybe someone can help: how do I add a second PVE cluster (as an observer) to an already existing PVE cluster?
How can I add a second, separate CLUSTER-2 here?
# cat /etc/pve/corosync.conf
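The file contents are not quoted, but a PVE corosync.conf generally has this shape (cluster name, node name, and address below are hypothetical):

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.10.11
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: cluster-1
  config_version: 7
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  secauth: on
  version: 2
}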
I have up to 16 nodes in a Proxmox cluster, and corosync is constantly showing retransmits in the logs:
Jan 05 13:38:15 node2 corosync: [TOTEM ] Retransmit List: 66a9
Jan 05 13:38:23 node2 corosync: [TOTEM ] Retransmit List: 6708
Jan 05 13:38:30 node2 corosync...
Before this I had a cluster of 13 nodes. I added 3 more nodes and within 5 minutes I lost the whole cluster. I restarted corosync node by node, but when I start a 15th node I get this message:
corosync: [TOTEM ] Token has not been received in 380 ms
then after a few minutes the cluster...
I have four servers in a cluster. Last night, we faced major network flapping on 'srva' (private network and public network), with an impact on the private network '10.50.255.0/24'. The expected behavior was to get the three nodes (srvb, srvc, srvd) working together and the node...
I have a problem getting a QDevice to work on Proxmox 6.2-12.
First I install the QDevice package on the 3rd witness box (Raspberry Pi OS 20-08-2020):
# apt install corosync-qnetd
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW...
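For completeness, the usual remaining steps per the PVE docs are roughly as follows; the witness IP below is hypothetical:

apt install corosync-qdevice      # on every cluster node
pvecm qdevice setup 192.168.1.50  # from one cluster node; use your witness's IP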
I have (had) a 3-node Proxmox VE 6.2-11 and Ceph cluster. I'm modifying my config after install and some light use. Ceph is now on its own 2x 10G LAN. I decided to dedicate a 1 Gb interface and create a VLAN for corosync, and attempted to modify corosync.conf before understanding exactly what...
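For context, the documented way to modify corosync.conf on PVE is to edit a copy and move it into place, always bumping config_version; a sketch:

cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
nano /etc/pve/corosync.conf.new                       # change the ring addresses AND increment config_version
mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf  # pmxcfs propagates the change cluster-wide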
I currently have a 4-node HCI cluster that's working quite well. It will be expanding to 8 nodes total and will be used for critical services. All of the testing was satisfactory, and management was duly impressed. I am reinstalling the cluster from scratch to ensure none of the testing bits...