I have a 6-node cluster that I upgraded to 8.0.4 two nights ago. The cluster had been running strong with no issues at all until after the upgrade.
The upgrade itself went smoothly, with zero issues.
Now, one node in the cluster keeps 'greying out' in the GUI. I rebooted the server and it came back, but only minutes later it greyed out again. All LXCs and VMs on it are running and working fine, but I am having trouble determining why it keeps losing connectivity to the GUI.
Corosync traffic is on a dedicated VLAN; all nodes are connected via 10G for Corosync, with a separate 10G trunk link for all other traffic. Nothing shares the network with Corosync traffic.
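For reference, this is roughly what I plan to run on the affected node the next time it greys out, to rule out the Corosync links themselves (the grep pattern is just my guess at the relevant keywords):
corosync-cfgtool -s    # per-link status of the knet links to the other nodes
journalctl -u corosync --since "-2 hours" | grep -iE 'link|retransmit|token'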
PVE version (all nodes):
pve-manager/8.0.4/d258a813cfa6b390 (running kernel: 6.2.16-12-pve)
I tried the following to get it back:
systemctl restart pve-cluster
(does not bring it back)
systemctl restart pvestatd
(this sort of brings it back: all the VMs go green, but the LXC containers and the listed storage stay grey; it stays that way for about 3 to 5 minutes and then everything goes grey again. Rerunning the command brings it back, but it fails again after that.)
systemctl restart pveproxy
(does not bring it back)
Journal from restarting pve-cluster:
Sep 20 12:46:22 proxmox02 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
Sep 20 12:46:22 proxmox02 pmxcfs[1899301]: [main] notice: resolved node name 'proxmox02' to '10.200.70.3' for default node IP address
Sep 20 12:46:22 proxmox02 pmxcfs[1899301]: [main] notice: resolved node name 'proxmox02' to '10.200.70.3' for default node IP address
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [status] notice: update cluster info (cluster name Proxmox, version = 6)
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [status] notice: node has quorum
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [dcdb] notice: members: 1/5120, 2/1899303, 3/3553, 4/5629, 5/4677, 6/28004
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [dcdb] notice: starting data syncronisation
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [dcdb] notice: received sync request (epoch 1/5120/0000000C)
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [status] notice: members: 1/5120, 2/1899303, 3/3553, 4/5629, 5/4677, 6/28004
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [status] notice: starting data syncronisation
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [status] notice: received sync request (epoch 1/5120/0000000C)
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [dcdb] notice: received all states
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [dcdb] notice: leader is 1/5120
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [dcdb] notice: synced members: 1/5120, 2/1899303, 3/3553, 4/5629, 5/4677, 6/28004
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [dcdb] notice: all data is up to date
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [status] notice: received all states
Sep 20 12:46:22 proxmox02 pmxcfs[1899303]: [status] notice: all data is up to date
root@proxmox02:~# pvecm status
Cluster information
-------------------
Name: Proxmox
Config Version: 6
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Wed Sep 20 14:02:37 2023
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000002
Ring ID: 1.447
Quorate: Yes
Votequorum information
----------------------
Expected votes: 6
Highest expected: 6
Total votes: 6
Quorum: 4
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.200.80.2
0x00000002 1 10.200.80.3 (local)
0x00000003 1 10.200.80.4
0x00000004 1 10.200.80.5
0x00000005 1 10.200.80.6
0x00000006 1 10.200.80.7
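Since quorum looks fine, my next guess is that pvestatd is hanging on something (a storage backend or a guest) rather than on the cluster itself. A quick sanity check I can run on the affected node:
time pvesm status    # should return within a couple of seconds; a long stall would point at a storage backend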
Sep 20 12:53:08 proxmox02 systemd[1]: Stopping pvestatd.service - PVE Status Daemon...
Sep 20 12:53:10 proxmox02 pvestatd[7144]: received signal TERM
Sep 20 12:53:10 proxmox02 pvestatd[7144]: server closing
Sep 20 12:53:10 proxmox02 pvestatd[7144]: server stopped
Sep 20 12:53:11 proxmox02 systemd[1]: pvestatd.service: Deactivated successfully.
Sep 20 12:53:11 proxmox02 systemd[1]: Stopped pvestatd.service - PVE Status Daemon.
Sep 20 12:53:11 proxmox02 systemd[1]: pvestatd.service: Consumed 1min 59.482s CPU time.
Sep 20 12:53:11 proxmox02 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Sep 20 12:53:12 proxmox02 pvestatd[1940505]: starting server
Sep 20 12:53:12 proxmox02 systemd[1]: Started pvestatd.service - PVE Status Daemon.
Sep 20 13:48:28 proxmox02 systemd[1]: Stopping pvestatd.service - PVE Status Daemon...
Sep 20 13:48:29 proxmox02 pvestatd[1940505]: received signal TERM
Sep 20 13:48:29 proxmox02 pvestatd[1940505]: server closing
Sep 20 13:48:29 proxmox02 pvestatd[1940505]: server stopped
Sep 20 13:48:30 proxmox02 systemd[1]: pvestatd.service: Deactivated successfully.
Sep 20 13:48:30 proxmox02 systemd[1]: Stopped pvestatd.service - PVE Status Daemon.
Sep 20 13:48:30 proxmox02 systemd[1]: pvestatd.service: Consumed 3.900s CPU time.
Sep 20 13:48:31 proxmox02 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Sep 20 13:48:32 proxmox02 pvestatd[2274251]: starting server
Sep 20 13:48:32 proxmox02 systemd[1]: Started pvestatd.service - PVE Status Daemon.
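To catch what actually happens at the moment the node greys out again (roughly every 3 to 5 minutes after a restart), I can leave this running on the affected node:
journalctl -f -u pvestatd -u pve-cluster -u pveproxy -u corosync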
● pvestatd.service - PVE Status Daemon
Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; preset: enabled)
Active: active (running) since Wed 2023-09-20 14:06:18 PDT; 33s ago
Process: 2374955 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
Main PID: 2375152 (pvestatd)
Tasks: 2 (limit: 309322)
Memory: 83.0M
CPU: 2.334s
CGroup: /system.slice/pvestatd.service
├─2375152 pvestatd
└─2375967 lxc-info -n 120 -p
Sep 20 14:06:17 proxmox02 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Sep 20 14:06:18 proxmox02 pvestatd[2375152]: starting server
Sep 20 14:06:18 proxmox02 systemd[1]: Started pvestatd.service - PVE Status Daemon.
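One thing that stands out in the status output above is the lxc-info -n 120 -p process still sitting under pvestatd. If that call never returns, pvestatd would block on it each status cycle, which would fit the containers and storage staying grey. A quick check (CT 120 being the ID shown in the CGroup line):
time lxc-info -n 120 -p    # should print the container's PID almost instantly
time pct status 120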
root@proxmox02:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-12-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
proxmox-kernel-helper: 8.0.3
pve-kernel-5.15: 7.4-6
pve-kernel-5.13: 7.1-9
pve-kernel-5.4: 6.4-15
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
proxmox-kernel-6.2: 6.2.16-12
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.4.174-2-pve: 5.4.174-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.8
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-5
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
At this point I am not sure what other information or logs might be useful.