PVE 4.1 to 4.2 upgrade: Corrupted CEPH VM images

...
As I already mentioned, this wasn't a problem on PVE 4.1 (and it doesn't apply to PVE 3.x, since PVE 3.x couldn't migrate more than one VM at the same time at all), so it seems to be a bug in PVE 4.2.
...

As I remember, I did this in Proxmox 3.x with multiple browser windows too, and I think I ran into the same trap. :-(
My VMs had filesystem errors too when I restarted them. :-(
So I never ever moved more than one VM at the same time. :)
 

Can you test with the unsecure migration flag?
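
The flag goes in /etc/pve/datacenter.cfg; if I recall correctly, on 4.x it looks roughly like this (example only, adjust to your setup):

Code:
# /etc/pve/datacenter.cfg
# use a direct TCP connection for live migration instead of the SSH tunnel
migration_unsecure: 1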
 
Hi Alexandre,

Not a big problem (no VMs are corrupted), but since I use the unsecure migration flag I still sometimes get a migration problem on one VM when using the 'migrate all' option. I have some logs now:

Code:
()
task started by HA resource agent
Jun 06 12:42:30 starting migration of VM 121 to node 'host03' (192.168.110.134)
Jun 06 12:42:30 copying disk images
Jun 06 12:42:30 starting VM 121 on remote node 'host03'
Jun 06 12:42:31 trying to acquire lock... OK
Jun 06 12:42:33 start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 121 --description \''Proxmox VE VM 121'\' -p 'KillMode=none' -p 'CPUShares=1000' /usr/bin/kvm -id 121 -chardev 'socket,id=qmp,path=/var/run/qemu-server/121.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/121.pid -daemonize -smbios 'type=1,uuid=2417a306-e6a4-4e4f-9b24-db282c8ed9c8' -name HOSTNAME -smp '4,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -vnc unix:/var/run/qemu-server/121.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 12288 -k en-us -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:13a9185295e7' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100' -drive 'file=rbd:cl1/vm-121-disk-1:mon_host=192.168.110.131\:6789;192.168.110.133\:6789;192.168.110.135\:6789:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/SSD-cluster.keyring,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=200' -drive 'file=/mnt/pve/backup/images/121/vm-121-disk-1.qcow2,if=none,id=drive-virtio1,format=qcow2,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb' -netdev 'type=tap,id=net0,ifname=tap121i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=8A:A4:08:1D:F4:91,netdev=net0,bus=pci.0,addr=0x12,id=net0' -machine 'type=pc-i440fx-2.5' -incoming tcp:192.168.110.134:60002 -S' failed: exit code 1
Jun 06 12:42:33 ERROR: online migrate failure - command '/usr/bin/ssh -o 'BatchMode=yes' root@192.168.110.134 qm start 121 --stateuri tcp --skiplock --migratedfrom host05 --machine pc-i440fx-2.5' failed: exit code 255
Jun 06 12:42:33 aborting phase 2 - cleanup resources
Jun 06 12:42:33 migrate_cancel
Jun 06 12:42:33 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems

Code:
Running as unit 121.scope.
kvm: -incoming tcp:192.168.110.134:60002: Failed to bind socket: Address already in use
TASK ERROR: start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 121 --description \''Proxmox VE VM 121'\' -p 'KillMode=none' -p 'CPUShares=1000' /usr/bin/kvm -id 121 -chardev 'socket,id=qmp,path=/var/run/qemu-server/121.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/121.pid -daemonize -smbios 'type=1,uuid=2417a306-e6a4-4e4f-9b24-db282c8ed9c8' -name HOSTNAME -smp '4,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -vnc unix:/var/run/qemu-server/121.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 12288 -k en-us -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:13a9185295e7' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100' -drive 'file=rbd:cl1/vm-121-disk-1:mon_host=192.168.110.131\:6789;192.168.110.133\:6789;192.168.110.135\:6789:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/SSD-cluster.keyring,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=200' -drive 'file=/mnt/pve/backup/images/121/vm-121-disk-1.qcow2,if=none,id=drive-virtio1,format=qcow2,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb' -netdev 'type=tap,id=net0,ifname=tap121i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=8A:A4:08:1D:F4:91,netdev=net0,bus=pci.0,addr=0x12,id=net0' -machine 'type=pc-i440fx-2.5' -incoming tcp:192.168.110.134:60002 -S' failed: exit code 1
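
From the second log it looks like the migration port is already taken on the target node. To see what is still holding it on host03, something like this should work (port 60002 taken from the log above):

Code:
# on the target node: show which process is still listening on the migration port
ss -tlnp | grep ':60002'
# and check for leftover kvm processes still waiting for an incoming migration
ps aux | grep 'incoming tcp'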

Any idea what's causing this and how to fix it? Thanks!
 
Did you use packages from today? Post your pveversion -v output.
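
If not, updating is the usual apt routine (assuming the enterprise or pve-no-subscription repository is already configured):

Code:
apt-get update
apt-get dist-upgrade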
 
No, I don't. But are there changes in the transfer process that could fix these errors? Unfortunately I can't update right now: I'm leaving for a holiday in about two weeks, and I don't want to make any non-critical changes to the configuration or servers in the two weeks before I'm out of the office.

Code:
proxmox-ve: 4.2-51 (running kernel: 4.4.8-1-pve)
pve-manager: 4.2-5 (running version: 4.2-5/7cf09667)
pve-kernel-4.4.8-1-pve: 4.4.8-51
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-75
pve-firmware: 1.1-8
libpve-common-perl: 4.0-62
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-17
pve-container: 1.0-64
pve-firewall: 2.0-27
pve-ha-manager: 1.0-31
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
 

Can you test with the unsecure migration flag?[/QUOTE]

Sorry for the late answer, I was very busy.
I prefer not to touch it! It's a production system and I don't want to crash the filesystems.
I might not be lucky enough to repair it again and would have to stay offline for longer; I don't want that nightmare.
 
