Hello,
I run several 7.3 PVE clusters. Two of these clusters are NOT hyperconverged and instead use an external Nautilus Ceph cluster for storage. Each of the two PVE clusters uses, of course, a different Ceph pool on that Ceph cluster. On one of the PVE clusters (call it "B") I noticed that kernel RBD is enabled for the Ceph storage, whereas the other PVE cluster (call it "A") does not use kernel RBD.
On the B cluster I stumbled upon a problem by accident: although there is no problem with fully cloning an existing VM from a template to a new VM, cloning an existing, not running VM to a new one fails. In that case the clone starts as usual and runs up to 100%, but then never finishes and the new VM stays locked. The only way to proceed is to cancel the clone. The log then shows something like this:
Code:
Feb 13 10:17:26 B pvedaemon[3633192]: VM 129 qmp command failed - VM 129 not running
Feb 13 10:17:28 B pvedaemon[3633192]: can't unmap rbd device /dev/rbd-pve/7397a0cf-bfc6-4d25-aabb-be9f6564a13b/pxb-rb>
Feb 13 10:17:28 B pvedaemon[3633192]: clone failed: copy failed: command '/usr/bin/qemu-img convert -p -n -f raw -O r>
Feb 13 10:17:28 B pvedaemon[3457650]: <root@pam> end task UPID:B:00377028:01962EB7:63E9FE62:qmclone:129:root@>
Feb 13 10:17:37 B pvedaemon[3457650]: worker exit
I then tried to delete the incompletely cloned VM's RBD disk image and got an error message. It turned out that the cloned VM's RBD image was still mapped on the host. After unmapping it manually I was able to delete it. Is this a known problem?
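For reference, this is roughly the cleanup I did on the B host (the pool and image names are just examples, not my real ones):

Code:
# show leftover kernel RBD mappings on the host
rbd showmapped
# unmap the half-cloned image by its pool/image spec
rbd unmap <pool>/vm-129-disk-0
# afterwards deleting the image works again
rbd rm <pool>/vm-129-disk-0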
I then looked at my cluster "A", where cloning works, and found that no RBD images are mapped there at all, which leads me to the question: is RBD mapping on a PVE host only used on servers with kernel RBD enabled?
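This is how I compared the two clusters (assuming rbd showmapped is the right way to list kernel mappings):

Code:
# lists kernel RBD mappings; empty on every A-cluster node,
# but on B-cluster nodes the disks of running VMs show up here
rbd showmapped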
So my guess is that kernel RBD is the culprit here. Can I simply turn off kernel RBD in PVE under Datacenter -> Storage -> <rbd-storagename> without harming VMs that are still running with kernel RBD?
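I assume that GUI checkbox corresponds to the krbd flag of the storage entry; a sketch of what I mean (the storage name "ceph-ext" is just an example):

Code:
# current entry in /etc/pve/storage.cfg on cluster B
rbd: ceph-ext
        krbd 1
        pool <pool>
        ...

# what I would change, e.g. on the CLI:
pvesm set ceph-ext --krbd 0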
Thanks for your help
Rainer