I recently applied the 11.2024 update. After a reboot, some of the VMs (primarily Linux) on one of my servers no longer have network connectivity. I am passing VLAN tags through to the VMs over two bonded 10Gbps NICs. The server is part of a two-server cluster with a QDevice quorum member. All of the interfaces on the server itself are accessible, and I tried migrating the non-working VMs to the server that does not seem to have the issue, but those VMs still have no network connectivity after the migration (they start and are accessible through the console).
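For reference, the bond/bridge layout is roughly like the sketch below (interface names `enp1s0f0`/`enp1s0f1`, `bond0`, and `vmbr0` are placeholders, not copied from the actual host):

```text
# /etc/network/interfaces (sketch of the setup described above)
auto bond0
iface bond0 inet manual
    bond-slaves enp1s0f0 enp1s0f1
    bond-mode 802.3ad
    bond-miimon 100

auto vmbr0
iface vmbr0 inet manual
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
```

The VMs each have a VLAN tag set on their virtual NIC, so the VLAN-aware bridge is what carries the tags down to the bond.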
Also of note, the cluster seems to flap, which can be seen in the corosync log.
Here is relevant info on the server:
Code:
CPU(s) - 32 x AMD Ryzen 9 5950X 16-Core Processor (1 Socket)
Kernel Version - Linux 6.8.12-4-pve (2024-11-06T15:04Z)
Boot Mode - Legacy BIOS
Manager Version - pve-manager/8.3.2/3e76eec21c4a14a7
Code:
root@pve2:~# pveversion -v
proxmox-ve: 8.3.0 (running kernel: 6.8.12-4-pve)
pve-manager: 8.3.2 (running version: 8.3.2/3e76eec21c4a14a7)
proxmox-kernel-helper: 8.1.0
pve-kernel-5.15: 7.4-9
proxmox-kernel-6.8: 6.8.12-4
proxmox-kernel-6.8.12-4-pve-signed: 6.8.12-4
proxmox-kernel-6.8.12-2-pve-signed: 6.8.12-2
proxmox-kernel-6.8.8-4-pve-signed: 6.8.8-4
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.8.4-3-pve-signed: 6.8.4-3
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.13-5-pve-signed: 6.5.13-5
proxmox-kernel-6.5.13-1-pve-signed: 6.5.13-1
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
pve-kernel-5.15.131-2-pve: 5.15.131-3
pve-kernel-5.15.102-1-pve: 5.15.102-1
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2+deb12u1
frr-pythontools: 8.5.2-1+pve1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.2.0
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.10
libpve-cluster-perl: 8.0.10
libpve-common-perl: 8.2.9
libpve-guest-common-perl: 5.1.6
libpve-http-server-perl: 5.1.2
libpve-network-perl: 0.10.0
libpve-rs-perl: 0.9.1
libpve-storage-perl: 8.3.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.5.0-1
proxmox-backup-client: 3.3.2-1
proxmox-backup-file-restore: 3.3.2-2
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.3.3
pve-cluster: 8.0.10
pve-container: 5.2.2
pve-docs: 8.3.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.14-1
pve-ha-manager: 4.0.6
pve-i18n: 3.3.2
pve-qemu-kvm: 9.0.2-4
pve-xtermjs: 5.3.0-3
qemu-server: 8.3.3
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.6-pve1
Code:
root@pve2:~# head -n -0 /etc/apt/sources.list /etc/apt/sources.list.d/*
==> /etc/apt/sources.list <==
deb http://ftp.us.debian.org/debian bookworm main contrib
deb http://ftp.us.debian.org/debian bookworm-updates main contrib
# security updates
deb http://security.debian.org bookworm-security main contrib
==> /etc/apt/sources.list.d/ceph.list <==
deb https://enterprise.proxmox.com/debian/ceph-quincy bookworm enterprise
==> /etc/apt/sources.list.d/pve-enterprise.list <==
deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise
Code:
root@pve2:~# journalctl -u corosync
Jun 18 12:29:59 pve2 corosync[4668]: [TOTEM ] Token has not been received in 2250 ms
Jun 18 12:30:00 pve2 corosync[4668]: [TOTEM ] A processor failed, forming new configuration: token timed out (3000ms), waiting 3600ms for>
Jun 18 12:30:03 pve2 corosync[4668]: [QUORUM] Sync members[1]: 2
Jun 18 12:30:03 pve2 corosync[4668]: [QUORUM] Sync left[1]: 1
Jun 18 12:30:03 pve2 corosync[4668]: [VOTEQ ] waiting for quorum device Qdevice poll (but maximum for 30000 ms)
Jun 18 12:30:03 pve2 corosync[4668]: [TOTEM ] A new membership (2.2a72) was formed. Members left: 1
Jun 18 12:30:03 pve2 corosync[4668]: [TOTEM ] Failed to receive the leave message. failed: 1
Jun 18 12:30:04 pve2 corosync[4668]: [QUORUM] Members[1]: 2
Jun 18 12:30:04 pve2 corosync[4668]: [MAIN ] Completed service synchronization, ready to provide service.
Jun 18 12:30:06 pve2 corosync[4668]: [QUORUM] Sync members[2]: 1 2
Jun 18 12:30:06 pve2 corosync[4668]: [QUORUM] Sync joined[1]: 1
Jun 18 12:30:06 pve2 corosync[4668]: [VOTEQ ] waiting for quorum device Qdevice poll (but maximum for 30000 ms)
Jun 18 12:30:06 pve2 corosync[4668]: [TOTEM ] A new membership (1.2a76) was formed. Members joined: 1
Jun 18 12:30:06 pve2 corosync[4668]: [QUORUM] Members[2]: 1 2
Jun 18 12:30:06 pve2 corosync[4668]: [MAIN ] Completed service synchronization, ready to provide service.
Jun 18 12:30:13 pve2 corosync[4668]: [CFG ] Node 1 was shut down by sysadmin
Jun 18 12:30:13 pve2 corosync[4668]: [QUORUM] Sync members[1]: 2
Jun 18 12:30:13 pve2 corosync[4668]: [QUORUM] Sync left[1]: 1
Jun 18 12:30:13 pve2 corosync[4668]: [VOTEQ ] waiting for quorum device Qdevice poll (but maximum for 30000 ms)
Jun 18 12:30:13 pve2 corosync[4668]: [TOTEM ] A new membership (2.2a7a) was formed. Members left: 1
Jun 18 12:30:13 pve2 corosync[4668]: [QUORUM] Members[1]: 2
Jun 18 12:30:13 pve2 corosync[4668]: [MAIN ] Completed service synchronization, ready to provide service.
Jun 18 12:30:14 pve2 corosync[4668]: [KNET ] link: host: 1 link: 0 is down
Jun 18 12:30:14 pve2 corosync[4668]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Jun 18 12:30:14 pve2 corosync[4668]: [KNET ] host: host: 1 has no active links
Jun 18 12:31:14 pve2 corosync[4668]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
Jun 18 12:31:14 pve2 corosync[4668]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Jun 18 12:31:14 pve2 corosync[4668]: [QUORUM] Sync members[2]: 1 2
Jun 18 12:31:14 pve2 corosync[4668]: [QUORUM] Sync joined[1]: 1
Jun 18 12:31:14 pve2 corosync[4668]: [VOTEQ ] waiting for quorum device Qdevice poll (but maximum for 30000 ms)
Jun 18 12:31:14 pve2 corosync[4668]: [TOTEM ] A new membership (1.2a7f) was formed. Members joined: 1
Jun 18 12:31:14 pve2 corosync[4668]: [QUORUM] Members[2]: 1 2
Jun 18 12:31:14 pve2 corosync[4668]: [MAIN ] Completed service synchronization, ready to provide service.
Jun 18 12:31:15 pve2 corosync[4668]: [KNET ] pmtud: Global data MTU changed to: 1397
Jul 04 12:59:39 pve2 corosync[4668]: [CFG ] Node 1 was shut down by sysadmin
Jul 04 12:59:39 pve2 corosync[4668]: [QUORUM] Sync members[1]: 2
Jul 04 12:59:39 pve2 corosync[4668]: [QUORUM] Sync left[1]: 1
Jul 04 12:59:39 pve2 corosync[4668]: [VOTEQ ] waiting for quorum device Qdevice poll (but maximum for 30000 ms)
Jul 04 12:59:39 pve2 corosync[4668]: [TOTEM ] A new membership (2.2a83) was formed. Members left: 1
Jul 04 12:59:39 pve2 corosync[4668]: [QUORUM] Members[1]: 2
Jul 04 12:59:39 pve2 corosync[4668]: [MAIN ] Completed service synchronization, ready to provide service.
Jul 04 12:59:39 pve2 corosync[4668]: [KNET ] link: host: 1 link: 0 is down
Jul 04 12:59:39 pve2 corosync[4668]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Jul 04 12:59:39 pve2 corosync[4668]: [KNET ] host: host: 1 has no active links
Jul 04 13:00:28 pve2 corosync[4668]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
Jul 04 13:00:28 pve2 corosync[4668]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Jul 04 13:00:28 pve2 corosync[4668]: [KNET ] pmtud: Global data MTU changed to: 1397
Jul 04 13:00:28 pve2 corosync[4668]: [QUORUM] Sync members[2]: 1 2
Jul 04 13:00:28 pve2 corosync[4668]: [QUORUM] Sync joined[1]: 1
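Since the flapping started, I have been spot-checking the cluster links and the bridge VLAN state with commands along these lines (the `vmbr0`/`bond0` names are my examples for the bridge and bond, adjust as needed):

```shell
# corosync's own view of link/ring health on this node
corosync-cfgtool -s
pvecm status

# confirm the bridge still has VLAN filtering enabled, and which
# VIDs are mapped on each port (vmbr0/bond0 names assumed)
cat /sys/class/net/vmbr0/bridge/vlan_filtering
bridge vlan show

# check whether tagged frames are actually arriving on the bond
tcpdump -eni bond0 -c 10 vlan
```

So far nothing obvious jumps out, which is why I'm posting the corosync excerpt above.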