error with cfs lock '****': unable to create image: got lock timeout - aborting command

no, because this is a cluster-wide lock: the other nodes, and other components besides the Perl code, assume this 60s timeout is in place. If you change it only in the Perl code, multiple things can end up entering the locked section at the same time, each thinking it has exclusive access to the shared resource.
 
Yeah, but in my case I changed this timeout on every cluster node to match. I see your point, though. Still, I don't see any other options for me: I have a 3-node system and Ceph doesn't seem recommended for that use case, and my ZFS on spinning disks takes a long time to respond.
 
like I said, it's not just that one part of the Perl code on every node in the cluster that needs to be changed; other components, including the automated unlock on timeout, have the same hardcoded value.
 
Proxmox users, developers, et alia:

I am seeing this issue when I try to create a 500GB HD via NFS. Is there a way to see the actual qemu-img command that Proxmox is trying to issue to build the qcow2 file?

Presuming I had the precise command being run and then wanted to run it, how would I then scan and add that device to the appropriate virtual machine?

Stuart
 
Hi,
if you run into timeouts when allocating an image on a network storage, it's recommended to use pvesm set <storage ID> --preallocation off. This disables the default preallocation of metadata, thus speeding up the image allocation a lot. For such images, QEMU will then allocate the metadata as needed.
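For example, with a hypothetical storage ID of nfs_sata_pool (substitute whatever your NFS storage is actually called under Datacenter -> Storage):

# example only - replace "nfs_sata_pool" with your own storage ID
pvesm set nfs_sata_pool --preallocation off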
 
Fiona,

Thank you for the response. I will give that a try.

My end state solution was simply to run:

sudo qemu-img create -f qcow2 -o cluster_size=65536 -o extended_l2=off -o preallocation=metadata -o compression_type=zlib -o size=500G -o lazy_refcounts=off -o refcount_bits=16 /mnt/pve/nfs_sata_pool/images/121/vm-121-disk-0.qcow2

Thereafter I just edited the 121.conf file to ensure that the new qcow2 file was set up as an available disk therein.
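For reference, the disk entry added to 121.conf looks roughly like the line below; the scsi1 bus/slot is just an example, so pick whichever free slot fits your config (the storage ID nfs_sata_pool matches the mount path above):

# example only - the scsi1 slot is an assumption, use a free bus/slot in 121.conf
scsi1: nfs_sata_pool:121/vm-121-disk-0.qcow2,size=500G

Alternatively, running qm rescan --vmid 121 makes Proxmox pick up the new image and list it as an unused disk that can then be attached to the VM.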

The VM seems to be running fine now and is using the new space without any appreciable anomalies.

Thanks in advance and stay safe and healthy.


Stuart
 
I am having the same issue as the OP. When I try pvesm set <storage ID> --preallocation off, it fails with: unexpected property preallocation. Any suggestions?
 

(Attachment: Screenshot 2023-01-01 154135.png)
what kind of storage type is "virtualMachines"?
 
In my case I was testing a Proxmox 8.0.3 server by creating VMs on a share mounted via CIFS, and the following error occurred:

TASK ERROR: unable to create VM 110 - unable to create image: 'storage-Data'-locked command timed out - aborting

I found no evidence of what could be happening.

After the error while creating that VM, no other VM created on the same storage worked.

I went to look at the physical host and saw the error shown in the attachment.
 

(Attachments: WhatsApp Image 2023-08-22 at 15.45.12.jpeg, ErrorDiskCifs.png)
After a few hours of conducting tests trying to identify the error and how I could fix it, I managed to do so today.

The procedure carried out involved selecting the storage that was experiencing a failure and editing it, enabling the advanced option, changing the preallocation mode to off, and saving it. Then, I edited it again and left it on Metadata.

I noticed that many people referred to the "Preallocation" parameter but didn't specify where it should be edited directly in the storage, as this option wasn't available for modification when creating a virtual machine through the interface.
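For anyone who prefers the command line, the same change can be made with pvesm. The storage ID Data below is only a guess based on the 'storage-Data' lock name in the error above, so substitute your own:

# "Data" is an assumed storage ID - use the name of your CIFS storage
pvesm set Data --preallocation off
# and later, to go back to the default metadata preallocation:
pvesm set Data --preallocation metadata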
 
I got this error as well:
trying to acquire cfs lock 'storage-ceph-pool'
while destroying a VM.

Is the solution the same, i.e. using the --preallocation off option?
 
preallocation only has an effect for directory-based storages (dir, nfs, cifs, ...)
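To check which type a configured storage actually is, pvesm status lists every storage together with its type:

# the Type column shows dir, nfs, cifs, rbd, zfspool, ...
pvesm status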
 
no, it sounds like there is some issue with the performance of your Ceph storage.
 
I'm aware of that one, but I don't think anybody started working on implementing that enhancement yet.
 
Do you think this is an issue with Ceph itself? Maybe with the latest Ceph version it would improve and no longer hit locks like that?
Or maybe there is something else that I also don't know about.
 
no, the locking is on our side, and it is currently a global lock per shared storage for certain tasks - the linked issue is about improving that by reducing the scope of the lock.

there are basically two (related) causes/issues:
- lock contention/failure to acquire the lock at all (too many tasks trying to lock the same storage -> solved by reducing the scope of the lock)
- lock timeout/failure to release the lock in time (the task running in the locked context takes too long -> this can only be solved by making the storage faster or doing less work in the locked context)
 
