Hi everyone,
We are running a Proxmox VE cluster (version 8.2.7) that is connected to a separate Ceph cluster running Reef (18.2.4).
Two or three times over the past few years, a node has restarted after a hard shutdown/reset. We were not able to find out what triggered it.
The node came back up without any problems, but every VM hosted on that node (with and without HA) had a filesystem that was corrupted beyond repair. All VMs had to be restored from backups.
I’m now wondering whether I can configure Proxmox and the VMs in a way that prevents this. After the previous incident we switched the VM disks to the direct sync cache mode, but that did not prevent the corruption this time.
Does anyone have experience with this and can suggest something?