Since updating to PVE 8, I've been having an issue that appears to be vzdump related, or an issue that's stopping vzdump from executing. The more I think about it, the more I suspect it's the latter.
Between 12:00 and 12:30 on each of the last couple of days, the statuses of all my VMs and containers have just disappeared. I restarted pvestatd when I noticed today, which brought the VM status back but not that of the containers. Whilst checking things out for this post, it's lost the status of everything again.
All VMs are running. Yesterday the containers seemed to lose network on the back of it - my rProxy certainly couldn't forward traffic to them. I assume the same happened today (didn't check), but having shut them down and restarted them, they appear to be OK.
On both days, a vzdump had failed on the same container around the time I noticed the behaviour.
The backup job in question runs every 6 hours to snapshot more frequently used containers - but this issue isn't appearing at other times, so I think something's a bit wonky elsewhere.
In writing the post, I restarted pvestatd for a second time, which yet again brought the VM status back, and has since left me with full unknowns again. /etc/pve/.rrd is now blank.
If I check the status of the pvestatd service, I get the following - container 912 is the problematic container for backup.
Code:
root@proxmox:~# systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; preset: enabled)
     Active: active (running) since Tue 2023-06-27 12:57:17 BST; 10s ago
    Process: 1626430 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
   Main PID: 1626432 (pvestatd)
      Tasks: 2 (limit: 134945)
     Memory: 86.5M
        CPU: 688ms
     CGroup: /system.slice/pvestatd.service
             ├─1626432 pvestatd
             └─1642002 lxc-info -n 912 -p
Not really sure where to go from here, any idea of what to check next?
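One thing that might be worth checking, given pvestatd is sitting on that lxc-info child: whether any process is stuck in uninterruptible sleep. This is only a rough sketch and an assumption on my part that a hung lxc-info would show up in state D; filter_dstate is just a made-up helper name, and it reads ps output on stdin.

Code:
```shell
#!/bin/sh
# filter_dstate (hypothetical helper): keep only rows whose STAT column
# starts with D (uninterruptible sleep). Expects `ps` output on stdin with
# a header line and the columns: PID STAT WCHAN COMMAND...
filter_dstate() {
    awk 'NR > 1 && $2 ~ /^D/'
}

# On the live host (procps ps), something like:
#   ps -eo pid,stat,wchan:32,args | filter_dstate
```

If the lxc-info for 912 turns up in that list, the WCHAN column should give a hint as to what it is waiting on.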