I can log in via SSH. I can still access pveHA via the web UI, but it only half sees pveMain: the CPU status etc. updates live, but it does not see the containers.
All the containers on pveMain are still running,
and quorum is working, so I am unsure why I cannot log in to pveMain.
The last error from journalctl was from months ago.
So far, all the services I have checked report that they are OK:
systemctl status pve-cluster corosync pvedaemon pve-firewall pve-ha-crm pve-ha-lrm pvestatd
apart from
(pvemain pvestatd[1444]: got timeout)
What can I do to correct it without rebooting? I would rather not take the containers offline.
Code:
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
Active: active (running) since Tue 2025-03-18 22:19:18 AEST; 52min ago
Main PID: 3307622 (pmxcfs)
Tasks: 6 (limit: 37970)
Memory: 54.9M
CPU: 6.401s
CGroup: /system.slice/pve-cluster.service
└─3307622 /usr/bin/pmxcfs
Mar 18 22:27:58 pvemain pmxcfs[3307622]: [dcdb] notice: leader is 1/3307622
Mar 18 22:27:58 pvemain pmxcfs[3307622]: [dcdb] notice: synced members: 1/3307622
Mar 18 22:27:58 pvemain pmxcfs[3307622]: [dcdb] notice: start sending inode updates
Mar 18 22:27:58 pvemain pmxcfs[3307622]: [dcdb] notice: sent all (6) updates
Mar 18 22:27:58 pvemain pmxcfs[3307622]: [dcdb] notice: all data is up to date
Mar 18 22:27:58 pvemain pmxcfs[3307622]: [status] notice: received all states
Mar 18 22:27:58 pvemain pmxcfs[3307622]: [status] notice: all data is up to date
Mar 18 22:37:12 pvemain pmxcfs[3307622]: [status] notice: received log
Mar 18 22:46:00 pvemain pmxcfs[3307622]: [status] notice: received log
Mar 18 23:01:00 pvemain pmxcfs[3307622]: [status] notice: received log
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; preset: enabled)
Active: active (running) since Fri 2024-09-06 20:29:10 AEST; 6 months 10 days ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 1375 (corosync)
Tasks: 9 (limit: 37970)
Memory: 140.0M
CPU: 1d 19h 14min 24.760s
CGroup: /system.slice/corosync.service
└─1375 /usr/sbin/corosync -f
Mar 18 22:27:51 pvemain corosync[1375]: [MAIN ] Completed service synchronization, ready to provide service.
Mar 18 22:27:57 pvemain corosync[1375]: [KNET ] rx: host: 2 link: 0 is up
Mar 18 22:27:57 pvemain corosync[1375]: [KNET ] link: Resetting MTU for link 0 because host 2 joined
Mar 18 22:27:57 pvemain corosync[1375]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Mar 18 22:27:57 pvemain corosync[1375]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 18 22:27:58 pvemain corosync[1375]: [QUORUM] Sync members[2]: 1 2
Mar 18 22:27:58 pvemain corosync[1375]: [QUORUM] Sync joined[1]: 2
Mar 18 22:27:58 pvemain corosync[1375]: [TOTEM ] A new membership (1.cbb3) was formed. Members joined: 2
Mar 18 22:27:58 pvemain corosync[1375]: [QUORUM] Members[2]: 1 2
Mar 18 22:27:58 pvemain corosync[1375]: [MAIN ] Completed service synchronization, ready to provide service.
● pvedaemon.service - PVE API Daemon
Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; preset: enabled)
Active: active (running) since Fri 2024-09-06 20:29:11 AEST; 6 months 10 days ago
Process: 3308037 ExecReload=/usr/bin/pvedaemon restart (code=exited, status=0/SUCCESS)
Main PID: 1455 (pvedaemon)
Tasks: 31 (limit: 37970)
Memory: 360.3M
CPU: 5h 21min 32.733s
CGroup: /system.slice/pvedaemon.service
├─ 1455 pvedaemon
├─ 774335 /usr/bin/dtach -A /var/run/dtach/vzctlconsole118 -r winch -z lxc-console -n 118 -e -1
├─ 774336 lxc-console -n 118 -e -1
├─ 774469 /usr/bin/dtach -A /var/run/dtach/vzctlconsole125 -r winch -z lxc-console -n 125 -e -1
├─ 774470 lxc-console -n 125 -e -1
├─1817428 /usr/bin/dtach -A /var/run/dtach/vzctlconsole127 -r winch -z lxc-console -n 127 -e -1
├─1817429 lxc-console -n 127 -e -1
├─2132523 /usr/bin/dtach -A /var/run/dtach/vzctlconsole103 -r winch -z lxc-console -n 103 -e -1
├─2132524 lxc-console -n 103 -e -1
├─2219253 /usr/bin/dtach -A /var/run/dtach/vzctlconsole129 -r winch -z lxc-console -n 129 -e -1
├─2219254 lxc-console -n 129 -e -1
├─3214215 /usr/bin/dtach -A /var/run/dtach/vzctlconsole110 -r winch -z lxc-console -n 110 -e -1
├─3214216 lxc-console -n 110 -e -1
├─3236038 "pvedaemon worker"
├─3238250 "pvedaemon worker"
├─3238683 "pvedaemon worker"
├─3245390 lxc-info -n 110 -p
├─3245690 lxc-info -n 110 -p
├─3245973 lxc-info -n 110 -p
├─3308045 "pvedaemon worker"
├─3308046 "pvedaemon worker"
├─3308047 "pvedaemon worker"
├─3308054 lxc-info -n 110 -p
├─3308055 lxc-info -n 110 -p
├─3308056 lxc-info -n 110 -p
├─3419697 /usr/bin/dtach -A /var/run/dtach/vzctlconsole128 -r winch -z lxc-console -n 128 -e -1
├─3419698 lxc-console -n 128 -e -1
├─3659515 "task UPID:pvemain:0037D6FB:391FA33E:676D15EE:vncshell::root@pam:"
├─3659516 /usr/bin/termproxy 5900 --path /nodes/pvemain --perm Sys.Console -- /bin/login -f root
├─3961656 /usr/bin/dtach -A /var/run/dtach/vzctlconsole132 -r winch -z lxc-console -n 132 -e -1
└─3961657 lxc-console -n 132 -e -1
Mar 18 22:19:41 pvemain pvedaemon[1455]: received signal HUP
Mar 18 22:19:41 pvemain pvedaemon[1455]: server closing
Mar 18 22:19:41 pvemain pvedaemon[1455]: server shutdown (restart)
Mar 18 22:19:41 pvemain systemd[1]: Reloaded pvedaemon.service - PVE API Daemon.
Mar 18 22:19:42 pvemain pvedaemon[1455]: restarting server
Mar 18 22:19:42 pvemain pvedaemon[1455]: starting 3 worker(s)
Mar 18 22:19:42 pvemain pvedaemon[1455]: worker 3308045 started
Mar 18 22:19:42 pvemain pvedaemon[1455]: worker 3308046 started
Mar 18 22:19:42 pvemain pvedaemon[1455]: worker 3308047 started
Mar 18 22:19:42 pvemain pvedaemon[3308046]: <root@pam> successful auth for user 'root@pam'
● pve-firewall.service - Proxmox VE firewall
Loaded: loaded (/lib/systemd/system/pve-firewall.service; enabled; preset: enabled)
Active: active (running) since Fri 2024-09-06 20:29:10 AEST; 6 months 10 days ago
Main PID: 1428 (pve-firewall)
Tasks: 1 (limit: 37970)
Memory: 100.7M
CPU: 16h 28min 13.975s
CGroup: /system.slice/pve-firewall.service
└─1428 pve-firewall
Sep 06 20:29:10 pvemain systemd[1]: Starting pve-firewall.service - Proxmox VE firewall...
Sep 06 20:29:10 pvemain pve-firewall[1428]: starting server
Sep 06 20:29:10 pvemain systemd[1]: Started pve-firewall.service - Proxmox VE firewall.
Mar 18 22:19:36 pvemain systemd[1]: Reloading pve-firewall.service - Proxmox VE firewall...
Mar 18 22:19:36 pvemain pve-firewall[3307874]: send HUP to 1428
Mar 18 22:19:36 pvemain pve-firewall[1428]: received signal HUP
Mar 18 22:19:36 pvemain pve-firewall[1428]: server shutdown (restart)
Mar 18 22:19:36 pvemain systemd[1]: Reloaded pve-firewall.service - Proxmox VE firewall.
Mar 18 22:19:37 pvemain pve-firewall[1428]: restarting server
● pve-ha-crm.service - PVE Cluster HA Resource Manager Daemon
Loaded: loaded (/lib/systemd/system/pve-ha-crm.service; enabled; preset: enabled)
Active: active (running) since Tue 2025-03-18 22:20:31 AEST; 51min ago
Process: 3335263 ExecStart=/usr/sbin/pve-ha-crm start (code=exited, status=0/SUCCESS)
Main PID: 3335269 (pve-ha-crm)
Tasks: 1 (limit: 37970)
Memory: 112.4M
CPU: 1.052s
CGroup: /system.slice/pve-ha-crm.service
└─3335269 pve-ha-crm
Mar 18 22:20:31 pvemain systemd[1]: Starting pve-ha-crm.service - PVE Cluster HA Resource Manager Daemon...
Mar 18 22:20:31 pvemain pve-ha-crm[3335269]: starting server
Mar 18 22:20:31 pvemain pve-ha-crm[3335269]: status change startup => wait_for_quorum
Mar 18 22:20:31 pvemain systemd[1]: Started pve-ha-crm.service - PVE Cluster HA Resource Manager Daemon.
Mar 18 22:20:36 pvemain pve-ha-crm[3335269]: status change wait_for_quorum => slave
Mar 18 22:24:56 pvemain pve-ha-crm[3335269]: successfully acquired lock 'ha_manager_lock'
Mar 18 22:24:56 pvemain pve-ha-crm[3335269]: watchdog active
Mar 18 22:24:56 pvemain pve-ha-crm[3335269]: status change slave => master
● pve-ha-lrm.service - PVE Local HA Resource Manager Daemon
Loaded: loaded (/lib/systemd/system/pve-ha-lrm.service; enabled; preset: enabled)
Active: active (running) since Tue 2025-03-18 22:20:27 AEST; 51min ago
Process: 3335231 ExecStart=/usr/sbin/pve-ha-lrm start (code=exited, status=0/SUCCESS)
Main PID: 3335249 (pve-ha-lrm)
Tasks: 1 (limit: 37970)
Memory: 111.8M
CPU: 16.817s
CGroup: /system.slice/pve-ha-lrm.service
└─3335249 pve-ha-lrm
Mar 18 22:20:26 pvemain systemd[1]: Starting pve-ha-lrm.service - PVE Local HA Resource Manager Daemon...
Mar 18 22:20:27 pvemain pve-ha-lrm[3335249]: starting server
Mar 18 22:20:27 pvemain pve-ha-lrm[3335249]: status change startup => wait_for_agent_lock
Mar 18 22:20:27 pvemain systemd[1]: Started pve-ha-lrm.service - PVE Local HA Resource Manager Daemon.
Mar 18 22:20:37 pvemain pve-ha-lrm[3335249]: successfully acquired lock 'ha_agent_pvemain_lock'
Mar 18 22:20:37 pvemain pve-ha-lrm[3335249]: watchdog active
Mar 18 22:20:37 pvemain pve-ha-lrm[3335249]: status change wait_for_agent_lock => active
● pvestatd.service - PVE Status Daemon
Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; preset: enabled)
Active: active (running) since Fri 2024-09-06 20:29:10 AEST; 6 months 10 days ago
Process: 3308064 ExecReload=/usr/bin/pvestatd restart (code=exited, status=0/SUCCESS)
Main PID: 1444 (pvestatd)
Tasks: 2 (limit: 37970)
Memory: 353.8M
CPU: 3d 14h 15min 9.321s
CGroup: /system.slice/pvestatd.service
├─ 1444 pvestatd
└─3245436 lxc-info -n 110 -p
Mar 18 20:29:59 pvemain pvestatd[1444]: got timeout
Mar 18 20:29:59 pvemain pvestatd[1444]: status update time (20.470 seconds)
Mar 18 20:30:04 pvemain pvestatd[1444]: got timeout
Mar 18 20:30:09 pvemain pvestatd[1444]: got timeout
Mar 18 20:30:14 pvemain pvestatd[1444]: got timeout
Mar 18 20:30:19 pvemain pvestatd[1444]: got timeout
Mar 18 22:19:42 pvemain systemd[1]: Reloading pvestatd.service - PVE Status Daemon...
Mar 18 22:19:43 pvemain pvestatd[3308064]: send HUP to 1444
Mar 18 22:19:43 pvemain pvestatd[1444]: received signal HUP
Mar 18 22:19:43 pvemain systemd[1]: Reloaded pvestatd.service - PVE Status Daemon.
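One thing that stands out in the output above is the number of `lxc-info -n 110 -p` processes sitting under both pvedaemon and pvestatd. A minimal sketch for listing them with their state and age is below. The helper name `filter_lxc_info` is made up for illustration, and the idea that these processes are stuck is an assumption, not something the output confirms.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -eo pid,stat,etime,args` style lines on stdin
# and keeps only the rows whose command is lxc-info.
# Columns: $1 = PID, $2 = process state, $3 = elapsed time, $4 = command name.
filter_lxc_info() {
    awk '$4 == "lxc-info" || $4 ~ /\/lxc-info$/ { print }'
}

# Intended use on the node (assumes the standard procps `ps` is installed):
#   ps -eo pid,stat,etime,args | filter_lxc_info
```

A state starting with `D` in the second column means uninterruptible sleep. If the `lxc-info` processes for container 110 show that state with a long elapsed time, that would be consistent with the pvestatd timeouts, but it would still need confirming before acting on it.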