PVE 4.1 to 4.2 upgrade: Corrupted CEPH VM images

...
As I already mentioned, this wasn't a problem on PVE 4.1 (not applicable to PVE 3.x, since it was not possible to migrate more than 1 VM at the same time at all with PVE 3.x), so it seems to be a bug in PVE 4.2.
...

As I remember, I did this in Proxmox 3.x with multiple browser windows too, and I think I ran into the same trap. :-(
My VMs had filesystem errors too when I restarted them. :-(
So I never ever moved more than 1 VM at the same time. :)
 

Can you test with the unsecure migration flag?
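In case it helps: on PVE 4.x this should just be the migration_unsecure option in /etc/pve/datacenter.cfg (a minimal sketch; the option name is assumed from the 4.x docs, adjust to your setup):

Code:
# /etc/pve/datacenter.cfg
# skip the SSH tunnel for live-migration traffic; only use this on a trusted migration network
migration_unsecure: 1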
 
Hi Alexandre,

Not a big problem (no VMs are corrupted), but since I use the unsecure migration flag I still sometimes get a migration problem on one VM when using the 'migrate all' option. I have some logs now:

Code:
()
task started by HA resource agent
Jun 06 12:42:30 starting migration of VM 121 to node 'host03' (192.168.110.134)
Jun 06 12:42:30 copying disk images
Jun 06 12:42:30 starting VM 121 on remote node 'host03'
Jun 06 12:42:31 trying to acquire lock... OK
Jun 06 12:42:33 start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 121 --description \''Proxmox VE VM 121'\' -p 'KillMode=none' -p 'CPUShares=1000' /usr/bin/kvm -id 121 -chardev 'socket,id=qmp,path=/var/run/qemu-server/121.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/121.pid -daemonize -smbios 'type=1,uuid=2417a306-e6a4-4e4f-9b24-db282c8ed9c8' -name HOSTNAME -smp '4,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -vnc unix:/var/run/qemu-server/121.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 12288 -k en-us -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:13a9185295e7' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100' -drive 'file=rbd:cl1/vm-121-disk-1:mon_host=192.168.110.131\:6789;192.168.110.133\:6789;192.168.110.135\:6789:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/SSD-cluster.keyring,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=200' -drive 'file=/mnt/pve/backup/images/121/vm-121-disk-1.qcow2,if=none,id=drive-virtio1,format=qcow2,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb' -netdev 'type=tap,id=net0,ifname=tap121i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=8A:A4:08:1D:F4:91,netdev=net0,bus=pci.0,addr=0x12,id=net0' -machine 'type=pc-i440fx-2.5' -incoming tcp:192.168.110.134:60002 -S' failed: exit code 1
Jun 06 12:42:33 ERROR: online migrate failure - command '/usr/bin/ssh -o 'BatchMode=yes' root@192.168.110.134 qm start 121 --stateuri tcp --skiplock --migratedfrom host05 --machine pc-i440fx-2.5' failed: exit code 255
Jun 06 12:42:33 aborting phase 2 - cleanup resources
Jun 06 12:42:33 migrate_cancel
Jun 06 12:42:33 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems

Code:
Running as unit 121.scope.
kvm: -incoming tcp:192.168.110.134:60002: Failed to bind socket: Address already in use
TASK ERROR: start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 121 --description \''Proxmox VE VM 121'\' -p 'KillMode=none' -p 'CPUShares=1000' /usr/bin/kvm -id 121 -chardev 'socket,id=qmp,path=/var/run/qemu-server/121.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/121.pid -daemonize -smbios 'type=1,uuid=2417a306-e6a4-4e4f-9b24-db282c8ed9c8' -name HOSTNAME -smp '4,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -vnc unix:/var/run/qemu-server/121.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 12288 -k en-us -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:13a9185295e7' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100' -drive 'file=rbd:cl1/vm-121-disk-1:mon_host=192.168.110.131\:6789;192.168.110.133\:6789;192.168.110.135\:6789:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/SSD-cluster.keyring,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=200' -drive 'file=/mnt/pve/backup/images/121/vm-121-disk-1.qcow2,if=none,id=drive-virtio1,format=qcow2,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb' -netdev 'type=tap,id=net0,ifname=tap121i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=8A:A4:08:1D:F4:91,netdev=net0,bus=pci.0,addr=0x12,id=net0' -machine 'type=pc-i440fx-2.5' -incoming tcp:192.168.110.134:60002 -S' failed: exit code 1
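The relevant bit seems to be the 'Failed to bind socket: Address already in use' on -incoming tcp:192.168.110.134:60002, i.e. two parallel migrations apparently picked the same incoming port on host03. A quick way to see what is still holding that port on the target node (just a standard socket check, nothing PVE-specific):

Code:
# run on the target node (host03) right after the failure
ss -tlnp | grep 60002
# or, if ss is not available:
netstat -tlnp | grep 60002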

Any idea what's causing this and maybe how to fix? Thanks!
 
Did you use packages from today? Post your 'pveversion -v'.
 
No, I didn't. But are there changes in the transfer process that could fix these errors? Unfortunately I can't update right now: I'm leaving on holiday in about 2 weeks, and I don't want to make any non-critical changes to the config or servers that close to being out of office.

Code:
proxmox-ve: 4.2-51 (running kernel: 4.4.8-1-pve)
pve-manager: 4.2-5 (running version: 4.2-5/7cf09667)
pve-kernel-4.4.8-1-pve: 4.4.8-51
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-75
pve-firmware: 1.1-8
libpve-common-perl: 4.0-62
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-17
pve-container: 1.0-64
pve-firewall: 2.0-27
pve-ha-manager: 1.0-31
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
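
For reference, once I'm back the plan is just the standard package update to pull in current packages (assuming the existing repository setup is fine):

Code:
apt-get update
apt-get dist-upgrade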
 

can you test with the unsecure migration flag?

Sorry for the late answer, I was very busy.
I'd prefer not to touch it! It's a production system and I don't want to crash the filesystems.
Maybe I wouldn't be lucky enough to repair it again and would have to stay offline for longer; I don't want that nightmare.