One specific VM has issues with CEPH storage

kiwistag

Mar 18, 2025
Hi all.
This is a weird one, and I should have fixed it sooner.
I have one VM on Proxmox that starts, but the console shows "Failed to connect to server". The bigger issue, though, is the CEPH storage.
If I try to change any settings for the disks stored on the CEPH pool, the task just freezes and I have to stop it. Backups do the same and just sit there.

Last night it did boot, but it was insanely slow and then locked up. Now it does nothing at all. All other VMs are fine.
I have a file-level backup, but I am after ideas on what I could try to get this VM running first, so I can recover the more recent data from it.

I have tried:
  • (CLI) qm unlock *vmID*
  • Migrating to another host.
  • Deleting an unneeded disk on the VM (one that is on the CEPH pool)
Datacenter environment:
4x HP DL360 Gen7's
All with approx 4x 1TB SATA disks each.
2x bonded eth interfaces on bond0 (public)
6x bonded eth interfaces on bond1 (Private/Cluster)

At one stage, moving a disk to local storage reported:

Code:
trying to acquire lock...
TASK ERROR: can't lock file '/var/lock/qemu-server/lock-116.conf' - got timeout
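
The timeout above suggests another task still holds the per-VM lock file, which is separate from the `lock:` entry that `qm unlock` clears from the VM config. A minimal sketch of how one might look for the holder, assuming VMID 116 (taken from the path in the error) and that `fuser` is installed on the node. The `lock_path` helper is only illustrative and not part of Proxmox:

```shell
#!/bin/sh
# VMID 116 is taken from the lock path in the task error above.
VMID=116

# Illustrative helper: build the per-VM lock file path seen in the error.
lock_path() {
    printf '/var/lock/qemu-server/lock-%s.conf\n' "$1"
}

LOCK=$(lock_path "$VMID")

# Show which processes, if any, currently have the lock file open.
if [ -e "$LOCK" ] && command -v fuser >/dev/null 2>&1; then
    fuser -v "$LOCK" || echo "no process has $LOCK open"
fi

# List processes in uninterruptible sleep (state D), which are usually
# blocked on I/O and would explain tasks that never finish.
ps -eo pid,stat,etime,cmd 2>/dev/null | awk 'NR == 1 || $2 ~ /^D/'
```

If a stuck task shows up in state D, it is probably waiting on storage I/O, so the lock timeout would be a symptom rather than the cause.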

Note this is not a corporate-use cluster.

If you have any ideas for fault finding, or need more information to diagnose this further, please let me know.
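
As a starting point for the information that could be gathered, here is a rough sketch of read-only checks, run on one of the nodes. The pool name `ceph-pool` and the image name `vm-116-disk-0` are placeholders I am assuming for illustration, so they need to be swapped for the real ones. Each command is wrapped in `timeout` because Ceph commands can hang when the cluster is unhealthy:

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to the real pool and disk names.
VMID=116
POOL="ceph-pool"              # hypothetical pool name
IMAGE="vm-${VMID}-disk-0"     # assumed image name for the VM's first disk

# Run a command with a 30 second limit so a hung cluster cannot block
# the shell; skip it if the tool is not installed on this node.
run() {
    if command -v "$1" >/dev/null 2>&1; then
        timeout 30 "$@" || echo "'$*' failed or timed out"
    else
        echo "skipped: $1 not installed"
    fi
}

run ceph -s                          # overall cluster status
run ceph health detail               # details on any warnings or errors
run ceph osd tree                    # which OSDs are up/down, per host
run qm config "$VMID"                # VM config, including any lock entry
run rbd status "${POOL}/${IMAGE}"    # watchers currently attached to the image
```

None of these commands change anything, so the output can be posted as-is for others to look at.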
 