We are running Proxmox VE 8.4.1 with the enterprise subscription repositories. I have noticed that, for several VMs, the clone process either does not start or hangs at the same point (the same % done for a disk). We have tried quite a few VMs and the results are mostly negative (oddly, it works for a few of them). However, if we create a new VM and clone it, it works fine. We are running hyperconverged Ceph RBD, version 19.2.1. Ceph does not show any errors, and an fio test from one of the VMs looks fine, including read and write performance. We can add a new disk to an existing VM. Restoring a VM works fine. Offline and online migration both work fine. This setup had been working until yesterday, for about 3 months on 8.4.1 and about 6 months on version 8.
More oddly, cloning a live VM works fine (the same VM for which the offline clone fails). We have tried the same operation from the command line, with the same results. We also rebooted one host and performed the operations on that host, again with the same results.
We have already reported this issue to Proxmox, so this post is for the community, in case someone has encountered anything similar.
The output of a failed job is very simple: the clone job just stops showing progress. It is stuck at some %, always the same % if we repeat the process. It "never" finishes; we canceled the job after 20 minutes. After the cancellation, the unfinished disk has to be cleaned up manually.
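For anyone who wants to try reproducing this from the command line, here is a minimal Python sketch of how the stall could be captured. It is only an illustration under assumptions: the helper names `last_percent` and `run_clone` are made up, the VM IDs in the usage comment are placeholders, and it assumes the clone command prints progress lines containing a `%` value. It runs a command with a timeout and reports the last percentage seen before it finished or was killed.

```python
import re
import subprocess

# Matches a number followed by a percent sign, e.g. "37%" or "42.5 %".
# Assumption: the clone progress output contains values in this form.
PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def last_percent(text):
    """Return the last percentage found in text as a float, or None."""
    matches = PERCENT.findall(text)
    return float(matches[-1]) if matches else None


def run_clone(cmd, timeout_s):
    """Run cmd with a timeout; return (finished, last_percent_seen).

    On timeout the child process is killed, so a partially created
    disk may still need to be cleaned up manually afterwards.
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
        output, finished = done.stdout, True
    except subprocess.TimeoutExpired as exc:
        # Partial output captured before the timeout (may be None).
        output, finished = exc.stdout or b"", False
    if isinstance(output, bytes):
        output = output.decode(errors="replace")
    return finished, last_percent(output)


# Hypothetical usage (VM IDs 100 and 9000 are placeholders):
# finished, pct = run_clone(["qm", "clone", "100", "9000", "--full", "1"], 20 * 60)
# print("finished" if finished else f"stalled, last progress seen: {pct}%")
```

Running this several times against the same VM would show whether the last reported percentage really is identical on every attempt.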
I guess the first question is: what is the difference between a live clone and an offline clone (copy job)?