Proxmox VE 4.0 2-node cluster: cannot migrate online

Hivane

Hi,

I had a PVE 3.4 cluster, on which I used to run several VMs, using a shared iSCSI storage with LVM. Everything was fine with that.

To move to PVE 4.0, I installed two new servers and created a separate cluster (no HA).
Then I attached the iSCSI storage/LVM to that cluster and migrated the VMs to the first node of the PVE 4 cluster.

The problem is that I now cannot migrate some VMs online to the second node. I get the following error:

Code:
Nov 01 14:21:57 starting migration of VM 120 to node 'pve02-ivr' (172.16.43.2)
Nov 01 14:21:57 copying disk images
Nov 01 14:21:57 starting VM 120 on remote node 'pve02-ivr'
Nov 01 14:21:59 start failed: command '/usr/bin/systemd-run --scope  --slice qemu --unit 120 -p 'CPUShares=1000' /usr/bin/kvm -id 120  -chardev 'socket,id=qmp,path=/var/run/qemu-server/120.qmp,server,nowait'  -mon 'chardev=qmp,mode=control' -vnc  unix:/var/run/qemu-server/120.vnc,x509,password -pidfile  /var/run/qemu-server/120.pid -daemonize -name iloth -smp  '4,sockets=2,cores=2,maxcpus=4' -nodefaults -boot  'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -cpu  kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,-kvm_steal_time,enforce  -m 2048 -k fr -device  'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device  'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device  'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device  'usb-tablet,id=tablet,bus=uhci.0,port=1' -device  'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi  'initiator-name=iqn.1993-08.org.debian:01:c06b724e73ba' -drive  'if=none,id=drive-ide2,media=cdrom,aio=threads' -device  'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -drive  'file=/dev/vg-keylargo-ng/vm-120-disk-1,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on'  -device  'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100'  -netdev  'type=tap,id=net0,ifname=tap120i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on'  -device  'virtio-net-pci,mac=F6:BA:2C:DF:0E:7B,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300'  -machine 'type=pc-i440fx-2.4' -incoming 'tcp:[localhost]:60000' -S'  failed: exit code 1
Nov 01 14:21:59 ERROR: online migrate failure - command '/usr/bin/ssh -o  'BatchMode=yes' root@172.16.43.2 qm start 120 --stateuri tcp --skiplock  --migratedfrom pve01-ivr --machine pc-i440fx-2.4' failed: exit code 255
Nov 01 14:21:59 aborting phase 2 - cleanup resources
Nov 01 14:21:59 migrate_cancel
Nov 01 14:22:00 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems

What's wrong here? Am I missing something?
Thanks!


Code:
# pveversion -v
proxmox-ve: 4.0-16 (running kernel: 4.2.2-1-pve)
pve-manager: 4.0-50 (running version: 4.0-50/d3a6b7e5)
pve-kernel-4.2.2-1-pve: 4.2.2-16
lvm2: 2.02.116-pve1
corosync-pve: 2.3.5-1
libqb0: 0.17.2-1
pve-cluster: 4.0-23
qemu-server: 4.0-31
pve-firmware: 1.1-7
libpve-common-perl: 4.0-32
libpve-access-control: 4.0-9
libpve-storage-perl: 4.0-27
pve-libspice-server1: 0.12.5-1
vncterm: 1.2-1
pve-qemu-kvm: 2.4-10
pve-container: 1.0-10
pve-firewall: 2.0-12
pve-ha-manager: 1.0-10
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.3-1
lxcfs: 0.9-pve2
cgmanager: 0.37-pve2
criu: 1.6.0-1
zfsutils: 0.6.5-pve4~jessie
 
Sure!

Code:
iscsi: keylargo-ng
    target iqn.datastore
    portal 172.16.0.1
    content none

dir: local
    path /var/lib/vz
    maxfiles 0
    content vztmpl,rootdir,images,iso

lvm: lvm-keylargo-ng
    vgname vg-keylargo-ng
    content images
    shared

Before I ran into this issue, I had a different problem with iscsiadm.
Searching the forum, I found some advice, so I removed the following line from the LVM section: "base keylargo-ng:0.0.0.scsi-36589cfc000000f0b2575a81868851c07".

Thanks!
 
Searching the forum, I found some advice, so I removed the following line from the LVM section: "base keylargo-ng:0.0.0.scsi-36589cfc000000f0b2575a81868851c07".

That advice is definitely wrong. Storage activation will fail if there is no base storage reference.
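
To spot this in a storage.cfg, a small sketch along the following lines could list every "lvm:" section that has no "base" line. The function name lvm_without_base is made up for illustration, and the parser only assumes the layout shown in this thread (section headers starting in column 1, indented properties). A volume group on a local disk legitimately has no base, so a hit is only a hint for iSCSI-backed volume groups.

```shell
#!/bin/sh
# lvm_without_base (hypothetical helper): reads a storage.cfg-style file on
# stdin and prints the name of every "lvm:" section without a "base" line.
lvm_without_base() {
    awk '
        # A section header starts in column 1 and looks like "type: name".
        /^[a-z]+:/ {
            # Report the previous section if it was an lvm one without base.
            if (in_lvm && !has_base) print name
            in_lvm = ($1 == "lvm:"); name = $2; has_base = 0
            next
        }
        # Indented property lines: remember whether "base" was seen.
        in_lvm && $1 == "base" { has_base = 1 }
        # The last section in the file has no following header.
        END { if (in_lvm && !has_base) print name }
    '
}
```

It could be run as `lvm_without_base < /etc/pve/storage.cfg`; no output means every lvm section carries a base reference.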
 
Here is my storage.cfg:

Code:
iscsi: keylargo-ng
    target iqn.datastore
    portal 172.16.0.1
    content none

dir: local
    path /var/lib/vz
    maxfiles 0
    content vztmpl,rootdir,images,iso

lvm: lvm-keylargo-ng
    vgname vg-keylargo-ng
    content images
    base keylargo-ng:0.0.0.scsi-36589cfc000000f0b2575a81868851c07
    shared

... and now, the new error message when trying to migrate:

Code:
iscsiadm: default: 1 session requested, but 1 already present.
iscsiadm: Could not log into all portals
command '/usr/bin/iscsiadm --mode node --targetname iqn.datastore --login' failed: exit code 15
Nov 01 16:41:34 starting migration of VM 120 to node 'pve02-ivr' (172.16.43.2)
Nov 01 16:41:34 copying disk images
Nov 01 16:41:34 starting VM 120 on remote node 'pve02-ivr'
Nov 01 16:41:36 start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 120 -p 'CPUShares=1000' /usr/bin/kvm -id 120 -chardev 'socket,id=qmp,path=/var/run/qemu-server/120.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -vnc unix:/var/run/qemu-server/120.vnc,x509,password -pidfile /var/run/qemu-server/120.pid -daemonize -name iloth -smp '4,sockets=2,cores=2,maxcpus=4' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,-kvm_steal_time,enforce -m 2048 -k fr -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:c06b724e73ba' -drive 'file=/dev/vg-keylargo-ng/vm-120-disk-1,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -netdev 'type=tap,id=net0,ifname=tap120i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=F6:BA:2C:DF:0E:7B,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=pc-i440fx-2.4' -incoming 'tcp:[localhost]:60000' -S' failed: exit code 1
Nov 01 16:41:36 ERROR: online migrate failure - command '/usr/bin/ssh -o 'BatchMode=yes' root@172.16.43.2 qm start 120 --stateuri tcp --skiplock --migratedfrom pve01-ivr --machine pc-i440fx-2.4' failed: exit code 255
Nov 01 16:41:36 aborting phase 2 - cleanup resources
Nov 01 16:41:36 migrate_cancel
Nov 01 16:41:37 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems

Thanks for your help!
 
So why does the iSCSI login fail? Try running it manually:

# iscsiadm --mode node --targetname iqn.datastore --login
 
I'm not an expert with iSCSI... here is the output of the manual command:
Code:
root@pve01-ivr:~# iscsiadm --mode node --targetname iqn.datastore --login 
iscsiadm: default: 1 session requested, but 1 already present.
iscsiadm: Could not log into all portals
 
I have not rebuilt any VM on it (yet). Should I try to set up a VM and move it to the other node?
By the way, I get the same output from the iscsiadm command on the second node.
 
Here, on both nodes:

Code:
root@pve01-ivr:~# iscsiadm --mode session 
tcp: [1] 172.16.0.1:3260,2 iqn.datastore (non-flash)
root@pve02-ivr:~#  iscsiadm --mode session 
tcp: [1] 172.16.0.1:3260,2 iqn.datastore (non-flash)
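
For what it's worth, the open-iscsi man page lists exit code 15 as "session is logged in", which fits the listing above: the login is refused because a session to the target already exists. A hedged sketch of a pre-check that parses the `iscsiadm --mode session` output format shown above; the helper name has_iscsi_session is hypothetical, and the field position of the target name is assumed from that output.

```shell
#!/bin/sh
# has_iscsi_session TARGET (hypothetical helper): reads the output of
# `iscsiadm --mode session` on stdin and succeeds if TARGET is listed.
has_iscsi_session() {
    target="$1"
    # Each session line looks like:
    #   tcp: [1] 172.16.0.1:3260,2 iqn.datastore (non-flash)
    # so the target IQN is assumed to be the fourth field.
    awk -v t="$target" '$4 == t { found = 1 } END { exit found ? 0 : 1 }'
}
```

Usage could look like `iscsiadm --mode session | has_iscsi_session iqn.datastore && echo "already logged in"`, which only attempts nothing and just reports the existing session.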
 
Hi,

Well, I installed it on both nodes.
I still have a problem, but it is not related to iscsiadm :)

I am now back to the initial error message:
Code:
Nov 02 17:39:16 starting migration of VM 118 to node 'pve02-ivr' (172.16.43.2)
Nov 02 17:39:16 copying disk images
Nov 02 17:39:16 starting VM 118 on remote node 'pve02-ivr'
Nov 02 17:39:18 start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 118 -p 'CPUShares=1000' /usr/bin/kvm -id 118 -chardev 'socket,id=qmp,path=/var/run/qemu-server/118.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -vnc unix:/var/run/qemu-server/118.vnc,x509,password -pidfile /var/run/qemu-server/118.pid -daemonize -name mx-ns3.vedege.net -smp '8,sockets=2,cores=4,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,-kvm_steal_time,enforce -m 1024 -k fr -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:c06b724e73ba' -drive 'file=/dev/vg-keylargo-ng/vm-118-disk-1,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -netdev 'type=tap,id=net0,ifname=tap118i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=8E:3E:1D:94:2A:85,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=pc-i440fx-2.4' -incoming 'tcp:[localhost]:60000' -S' failed: exit code 1
Nov 02 17:39:18 ERROR: online migrate failure - command '/usr/bin/ssh -o 'BatchMode=yes' root@172.16.43.2 qm start 118 --stateuri tcp --skiplock --migratedfrom pve01-ivr --machine pc-i440fx-2.4' failed: exit code 255
Nov 02 17:39:18 aborting phase 2 - cleanup resources
Nov 02 17:39:18 migrate_cancel
Nov 02 17:39:19 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems

Still, that's better... :-)

Do you need any other output or tests?
 
Dietmar,

Yes, it works. I shut down a VM and migrated it successfully:
Code:
Nov 02 17:48:25 starting migration of VM 118 to node 'pve02-ivr' (172.16.43.2)
Nov 02 17:48:25 copying disk images
Nov 02 17:48:26 migration finished successfully (duration 00:00:02)
TASK OK

However, I have not been able to start it, either from the console's context menu or from the "Start" button in the main GUI.

Code:
Running as unit 118.scope.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
TASK ERROR: start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 118 -p 'CPUShares=1000' /usr/bin/kvm -id 118 -chardev 'socket,id=qmp,path=/var/run/qemu-server/118.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -vnc unix:/var/run/qemu-server/118.vnc,x509,password -pidfile /var/run/qemu-server/118.pid -daemonize -name mx-ns3.vedege.net -smp '8,sockets=2,cores=4,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,-kvm_steal_time,enforce -m 1024 -k fr -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:c06b724e73ba' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -drive 'file=/dev/vg-keylargo-ng/vm-118-disk-1,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap118i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=8E:3E:1D:94:2A:85,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300'' failed: exit code 1
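
The "Could not access KVM kernel module: No such file or directory" message generally means that /dev/kvm does not exist on the node where the VM is being started. A rough sketch of a check to run on that node follows; the function names are made up for illustration, and the two paths are parameters only so the logic can be exercised without a real host. A vmx/svm flag in /proc/cpuinfo does not guarantee that virtualization is enabled in the firmware, so the output is a hint rather than a verdict.

```shell
#!/bin/sh
# cpu_has_virt [CPUINFO_FILE]: succeeds if the file lists the vmx (Intel)
# or svm (AMD) CPU flag.
cpu_has_virt() {
    grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}"
}

# kvm_status [CPUINFO_FILE] [KVM_DEVICE]: prints a one-line diagnosis.
kvm_status() {
    cpuinfo="${1:-/proc/cpuinfo}"
    kvmdev="${2:-/dev/kvm}"
    if [ -e "$kvmdev" ]; then
        echo "ok: $kvmdev exists"
    elif cpu_has_virt "$cpuinfo"; then
        # The CPU advertises the extension, so the module may just be unloaded
        # (or it refused to load; the kernel log would say why).
        echo "module: vmx/svm flag present but $kvmdev is missing; try modprobe kvm_intel or kvm_amd and check dmesg"
    else
        echo "firmware: no vmx/svm flag; virtualization may be disabled in BIOS/UEFI or unsupported"
    fi
}
```

Calling `kvm_status` with no arguments on pve02-ivr would check the real /proc/cpuinfo and /dev/kvm.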