I have a single machine running the latest Proxmox (recently updated, too), but it freezes completely about once a month.
The machine is a very simple setup (no cluster) and runs our office software. Some limited downtime is not a big problem.
This machine has:
- ZFS
- 6 WD Red drives
- Intel Optane SSD disk as ZIL
- Latest generation Xeon E-2176G CPU
- SUPERMICRO Server board MBD-X11SCA-F-O
- Intel 10 GbE NIC
- 64GB of ECC RAM
When it freezes, nothing works anymore:
- No SSH
- No web UI
- Not a single VM works
- Not a single Docker container works (I have installed Docker on Proxmox)
One thing keeps working: the Nginx instance that I run on the host system.
I'm using it to forward domains to the Docker containers.
Nginx returns 503 bad gateway errors though, because the Docker containers behind it are dead.
When looking at syslog, the last messages were:
Dec 8 00:21:00 proxmox systemd[1]: Starting Proxmox VE replication runner...
Dec 8 00:21:00 proxmox systemd[1]: pvesr.service: Succeeded.
Dec 8 00:21:00 proxmox systemd[1]: Started Proxmox VE replication runner.
Dec 8 00:22:00 proxmox systemd[1]: Starting Proxmox VE replication runner...
Dec 8 00:22:00 proxmox systemd[1]: pvesr.service: Succeeded.
Dec 8 00:22:00 proxmox systemd[1]: Started Proxmox VE replication runner.
Dec 8 00:23:00 proxmox systemd[1]: Starting Proxmox VE replication runner...
Dec 8 00:23:00 proxmox systemd[1]: pvesr.service: Succeeded.
Dec 8 00:23:00 proxmox systemd[1]: Started Proxmox VE replication runner.
After this I rebooted the machine entirely and everything came back up nicely.
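Since the replication runner logs once a minute, a long silence in syslog should mark roughly when the host stopped logging. Below is a rough sketch of how I could scan a log for such gaps on future freezes. The function name `find_gaps`, the 5-minute threshold, and the `year` parameter (classic syslog timestamps carry no year) are my own assumptions, not anything Proxmox provides.

```python
from datetime import datetime, timedelta


def find_gaps(lines, threshold=timedelta(minutes=5), year=2000):
    """Return (last_before, first_after) timestamp pairs around each silence
    longer than `threshold`.

    Assumes classic syslog lines such as
    'Dec 8 00:23:00 proxmox systemd[1]: ...'.
    `year` is an assumption because the timestamp has none; the default is a
    leap year so that 'Feb 29' still parses.
    """
    stamps = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # too short to hold 'Mon DD HH:MM:SS'
        try:
            stamp = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}",
                "%Y %b %d %H:%M:%S",
            )
        except ValueError:
            continue  # not a syslog-style line, skip it
        stamps.append(stamp)

    # Compare each timestamp with the next one and keep the large jumps.
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > threshold]
```

It could be run over the lines of the syslog file after a reboot, for example `find_gaps(open(path))` with `path` pointing at the log. A log that crosses New Year would produce a negative difference and is not handled here.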
So... what should be my next steps in trying to figure this out?