Last Friday I ran into a strange issue while cloning a 1.55 TB virtual machine on an NVMe Ceph pool. The pool had 3.5 TB of free space remaining.
Cloning went normally at first, but at some point during the night the NVMe pool itself began to shrink, at the same rate it was being filled (refer to the picture below).

When I noticed this the following morning, all VMs using this pool were stuck and I couldn't remove the unfinished clone.
I took the following steps:
- Stopped the clone; it was still running but stuck because the pool was out of space
- Removed the VM lock from the CLI
- Attempted to remove the disk of the failed clone, but it could not be removed until Ceph had fully healed
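
For anyone who wants to retrace this, here is a rough sketch of the steps above as CLI commands, written as a small Python helper that only builds and prints them (it does not run anything). The exact commands are an assumption on my side: `qm unlock` for removing the lock and `rbd rm` for deleting the leftover disk are one common way to do it, and the VM ID, pool name, and image name are hypothetical placeholders.

```python
import shlex


def recovery_commands(vmid: int, pool: str, image: str) -> list[list[str]]:
    """Build (but do not run) CLI commands roughly matching the steps above.

    vmid, pool, and image are hypothetical placeholders; substitute your own.
    """
    return [
        ["ceph", "status"],                # check cluster health and recovery progress
        ["ceph", "df"],                    # per-pool usage and remaining capacity
        ["qm", "unlock", str(vmid)],       # assumed way to clear the lock left by the stopped clone
        ["rbd", "rm", f"{pool}/{image}"],  # assumed way to delete the leftover clone disk
    ]


# Print the commands for review instead of executing them.
for cmd in recovery_commands(9999, "nvme-pool", "vm-9999-disk-0"):
    print(shlex.join(cmd))
```

Printing instead of executing is deliberate: in my case the disk removal did not go through until Ceph had healed, so it makes sense to look at the `ceph status` output first and only then run the remaining commands by hand.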
Has anyone seen this before? As you can see, once Ceph had healed and the VM disk was removed, the datastore grew back to 3.5 TB.