[SOLVED] VM filesystems all broken after cluster node crashed

woodstock · Oct 7, 2024

Hi everyone,

We are running a Proxmox VE cluster (version 8.2.7) that is connected to a separate Ceph cluster running Reef (18.2.4).

Two or three times over the last few years, one of our nodes has restarted after a hard shutdown/reset. We weren’t able to find out what triggered it.

The node came back up without any problems, but all VMs (with and without HA) hosted on that node had filesystems that were damaged beyond repair. All VMs had to be restored from backups.

I’m now wondering whether I can configure Proxmox and the VMs in a way that prevents this. We switched to the direct sync cache mode after the previous incident, but it did not help this time.

Does anyone have experience with this and can suggest something?

michel.seicon · Oct 8, 2024

Hello,
I have noticed that a node running Proxmox on its own sometimes restarts itself.
When the node is part of a cluster, this does not happen.
This is very strange, but it happens to me too, and I don't know how to solve it because it happens very sporadically.

woodstock · Oct 17, 2024

I'll rephrase my question: which cache settings for Proxmox (librbd) and Ceph will prevent this?
Is there a way to completely disable I/O caching for Proxmox VMs that have their disks on Ceph storage?

Johannes S · Oct 17, 2024

You should be able to configure this in the disk settings of the VM. I don't know whether you can set a node-wide or cluster-wide default.

woodstock · Oct 17, 2024

Thanks for your reply.

I know these settings, but I'm not sure what they really do in combination with an external Ceph cluster.

We already use direct sync, and it did not prevent the corrupted filesystems.
Is there any layer involved that could explain that?

VictorSTS · Oct 17, 2024

In theory, direct sync makes every write, whether sync or async, get pushed to the storage in sync mode. Ceph by default will commit to at least two OSDs before returning an ACK to the client (PVE in this case). So if both PVE and Ceph are properly configured, this should not happen.

The PVE settings are clear. What is the Ceph configuration in that Ceph cluster? Does this happen if you force a power-off manually (i.e. can you easily reproduce the issue)?
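
The values asked about here can be collected with read-only commands. The following is a minimal sketch, not a verified procedure for this cluster: the VM ID `100` is a made-up example, `poolname` is the placeholder pool name used later in the thread, and the `run` helper is a hypothetical dry-run wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Assumed values: replace with your own pool name and VM ID.
POOL="poolname"
VMID="100"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On a Ceph admin node: number of replicas, and the minimum number of
# replicas that must be available for the pool to accept I/O.
run ceph osd pool get "$POOL" size
run ceph osd pool get "$POOL" min_size

# On the PVE node: the disk lines of the VM config show the cache=
# option if one has been set explicitly.
run qm config "$VMID"
```

Run it once as-is to review the commands, then with `DRY_RUN=0` on a host where the `ceph` and `qm` tools are available.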

woodstock · Mar 17, 2025

I'm writing this in case others have the same problem.

We found out that our Ceph user needed different permissions/capabilities. We’ve been running with:

Code:

mon = "allow r" osd = "allow * pool=poolname"

At some point in the past this was recommended to us as the minimum needed.
We had to change this to:

Code:

mgr 'profile rbd' mon 'profile rbd' osd 'profile rbd pool=poolname'

It seems that our old capabilities did not include image locking and/or unlocking.
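
A minimal sketch of how such a capability change could be applied and checked with the standard `ceph auth` and `rbd` commands. The user name `client.proxmox` and the image name `vm-100-disk-0` are hypothetical; `poolname` is the placeholder from above. The `run` helper is an assumption of this sketch: it only prints each command unless `DRY_RUN=0` is set, so the change can be reviewed first. As far as the Ceph RBD documentation describes it, the `rbd` profile on the monitor is what lets a client blocklist a dead lock holder and take over its image lock, which a plain `allow r` does not permit.

```shell
#!/bin/sh
# Assumed values: replace with your own Ceph user, pool, and image.
CEPH_USER="client.proxmox"
POOL="poolname"
IMAGE="vm-100-disk-0"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode (quoting is not shown), execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show the capabilities the user has right now.
run ceph auth get "$CEPH_USER"

# Switch the user to the rbd profiles quoted in the post.
run ceph auth caps "$CEPH_USER" \
    mgr 'profile rbd' \
    mon 'profile rbd' \
    osd "profile rbd pool=$POOL"

# List the locks currently held on one VM disk image.
run rbd lock ls "$POOL/$IMAGE"
```

Note that `ceph auth caps` replaces the user's existing capabilities rather than adding to them, so all three entries have to be given in one call.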
