Move disk shows "TASK OK" but it's not!

udo

Hi,
I found an issue after transferring four big (4 TB) VM disks from one Ceph pool (pve) to another Ceph pool (file).

The task shows OK, but the old image isn't deleted:
Code:
create full clone of drive virtio7 (ceph_pve:vm-410-disk-4)
transferred: 0 bytes remaining: 4398046511104 bytes total: 4398046511104 bytes progression: 0.00 % busy: true
transferred: 41943040 bytes remaining: 4398004568064 bytes total: 4398046511104 bytes progression: 0.00 % busy: true
transferred: 94371840 bytes remaining: 4397952139264 bytes total: 4398046511104 bytes progression: 0.00 % busy: true
transferred: 157286400 bytes remaining: 4397889224704 bytes total: 4398046511104 bytes progression: 0.00 % busy: true
transferred: 199229440 bytes remaining: 4397847281664 bytes total: 4398046511104 bytes progression: 0.00 % busy: true
transferred: 251658240 bytes remaining: 4397794852864 bytes total: 4398046511104 bytes progression: 0.01 % busy: true
transferred: 293601280 bytes remaining: 4397752909824 bytes total: 4398046511104 bytes progression: 0.01 % busy: true
...
transferred: 4397696286720 bytes remaining: 350224384 bytes total: 4398046511104 bytes progression: 99.99 % busy: true
transferred: 4397811630080 bytes remaining: 234881024 bytes total: 4398046511104 bytes progression: 99.99 % busy: true
transferred: 4397906001920 bytes remaining: 140509184 bytes total: 4398046511104 bytes progression: 100.00 % busy: true
transferred: 4398000373760 bytes remaining: 46137344 bytes total: 4398046511104 bytes progression: 100.00 % busy: true
transferred: 4398046511104 bytes remaining: 0 bytes total: 4398046511104 bytes progression: 100.00 % busy: false
Removing all snapshots: 100% complete...done.
image has watchers - not removing
Removing image: 0% complete...failed.
rbd: error: image still has watchers
TASK OK
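The "image has watchers" message means some client - here most likely the running VM itself - still holds a watch on the RBD header, so Ceph refuses to delete the image. To see who is watching, something like the following should work (a sketch; "rbd status" only exists in newer Ceph releases, on older ones you have to query the header object directly - the image id below is only an example):
Code:
# newer Ceph releases:
rbd status pve/vm-410-disk-4

# older releases (format-2 images): look up the image id first,
# then list the watchers on the header object
rbd info pve/vm-410-disk-4      # note the block_name_prefix, e.g. rbd_data.102a74b0dc51
rados -p pve listwatchers rbd_header.102a74b0dc51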
The old image is (luckily) still there:
Code:
# rbd -p pve ls
vm-410-disk-4

# rbd -p file ls
...
vm-410-disk-1
vm-410-disk-2
vm-410-disk-3
vm-410-disk-4
...
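To double-check that the copy in the file pool really completed before deciding which image to delete, one could compare the metadata of both (just a sketch with plain rbd info):
Code:
rbd info pve/vm-410-disk-4      # old image, still in use by the VM
rbd info file/vm-410-disk-4     # new copy written by the move job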

The VM config shows the new storage location - but this is wrong!
Code:
cat /etc/pve/qemu-server/410.conf 
bootdisk: virtio0
cores: 4
cpu: kvm64
ide2: none,media=cdrom
memory: 32768
name: prod-srv
net0: virtio=46:E3:35:AA:93:07,bridge=vmbr20
ostype: l26
sockets: 2
virtio0: d_sas_r0:vm-410-disk-1,cache=writethrough,size=32G
virtio1: d_sas_r0:vm-410-disk-2,cache=writethrough,backup=no,size=100G
virtio2: d_sas_r0:vm-410-disk-3,cache=writethrough,backup=no,size=50G
virtio3: ceph_file:vm-410-disk-1,cache=writethrough,backup=no,size=4096G
virtio4: ceph_file:vm-410-disk-2,cache=writethrough,backup=no,size=4096G
virtio5: ceph_file:vm-410-disk-3,cache=writethrough,backup=no,size=4096G
virtio6: d_sas_r0:vm-410-disk-4,cache=writethrough,size=170G
virtio7: ceph_file:vm-410-disk-4,cache=writethrough,backup=no,size=4096G
because the VM still uses the old disk - note the file=rbd:pve/vm-410-disk-4 and the ceph_pve keyring for drive-virtio7 (I added some line breaks to the ps output for better readability):
Code:
root       10062 86.8 21.3 40884180 28131620 ?   Sl   Feb12 7340:51 /usr/bin/kvm -id 410 \
 -chardev socket,id=qmp,path=/var/run/qemu-server/410.qmp,server,nowait \
 -mon chardev=qmp,mode=control -vnc unix:/var/run/qemu-server/410.vnc,x509,password \
 -pidfile /var/run/qemu-server/410.pid -daemonize -name prod-srv -smp sockets=2,cores=4 \
 -nodefaults -boot menu=on -vga cirrus -cpu kvm64,+lahf_lm,+x2apic,+sep -k de -m 32768 \
 -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f \
 -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1\
 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \
 -iscsi initiator-name=iqn.1993-08.org.debian:01:15a1d996f6d3 \
 -drive file=/dev/d_sas_r0/vm-410-disk-2,if=none,id=drive-virtio1,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb \
 -drive file=rbd:pve/vm-410-disk-4:mon_host=172.20.2.64 172.20.2.65 172.20.2.62:id=pve:auth_supported=cephx:keyring=/etc/pve/priv/ceph/ceph_pve.keyring,if=none,id=drive-virtio7,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio7,id=virtio7,bus=pci.2,addr=0x2 \
 -drive file=/dev/d_sas_r0/vm-410-disk-3,if=none,id=drive-virtio2,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio2,id=virtio2,bus=pci.0,addr=0xc \
 -drive file=/dev/d_sas_r0/vm-410-disk-4,if=none,id=drive-virtio6,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio6,id=virtio6,bus=pci.2,addr=0x1 \
 -drive file=rbd:file/vm-410-disk-1:mon_host=172.20.2.64 172.20.2.65 172.20.2.62:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/ceph_file.keyring,if=none,id=drive-virtio3,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio3,id=virtio3,bus=pci.0,addr=0xd \
 -drive if=none,id=drive-ide2,media=cdrom,aio=native \
   -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 \
 -drive file=/dev/d_sas_r0/vm-410-disk-1,if=none,id=drive-virtio0,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=105 \
 -drive file=rbd:file/vm-410-disk-2:mon_host=172.20.2.64 172.20.2.65 172.20.2.62:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/ceph_file.keyring,if=none,id=drive-virtio4,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio4,id=virtio4,bus=pci.0,addr=0xe \
 -drive file=rbd:pve/vm-410-disk-3:mon_host=172.20.2.64 172.20.2.65 172.20.2.62:id=pve:auth_supported=cephx:keyring=/etc/pve/priv/ceph/ceph_pve.keyring,if=none,id=drive-virtio5,cache=writethrough,aio=native,detect-zeroes=on \
   -device virtio-blk-pci,drive=drive-virtio5,id=virtio5,bus=pci.0,addr=0xf \
 -netdev type=tap,id=net0,ifname=tap410i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on \
   -device virtio-net-pci,mac=46:E3:35:AA:93:07,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
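Instead of digging through the ps output, one can also ask the running QEMU directly which image each drive is backed by (a sketch via the HMP monitor; the output format differs between QEMU versions):
Code:
qm monitor 410
# at the qm> prompt:
info block        # lists every drive together with the file=... it currently uses
# leave the monitor shell again with CTRL+C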
What is the best way to fix this issue? I assume I should stop the VM, edit the config for vm-410-disk-4 back to ceph_pve, and start the VM again?! After that, try the move again?
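Roughly what I have in mind, as an untested sketch (removing the copy in the file pool is my own assumption - only do that if nothing else references that image):
Code:
qm stop 410

# edit /etc/pve/qemu-server/410.conf and point virtio7 back at the old image:
#   virtio7: ceph_pve:vm-410-disk-4,cache=writethrough,backup=no,size=4096G

# assumption: drop the unused copy in the 'file' pool before retrying
rbd -p file rm vm-410-disk-4

qm start 410
# afterwards retry the "Move disk" for virtio7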

Udo
 
Yes. Which version are you running exactly (I thought we had already fixed that bug)?
Hi Dietmar,
here is my version (not too old, I think):
Code:
pveversion -v
proxmox-ve-2.6.32: 3.3-139 (running kernel: 2.6.32-34-pve)
pve-manager: 3.3-5 (running version: 3.3-5/bfebec03)
pve-kernel-2.6.32-32-pve: 2.6.32-136
pve-kernel-2.6.32-27-pve: 2.6.32-121
pve-kernel-2.6.32-34-pve: 2.6.32-140
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.3-3
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-25
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-10
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1
Udo
 
AFAIK this is fixed in the upcoming 3.4.