Hi,
Slowly building my first production PVE HA cluster (2 nodes), I have solved all the troubles I faced so far (and learned _a_lot_ along the way!), but I'm breaking my teeth on the last one :'(
Nodes (pve01 and pve02, HP ML350, up-to-date PVE 2.2), network (2x GbE bond0/vmbr0 for the LAN, 2x GbE direct links bond1/vmbr1 with jumbo frames for storage) and storage (HW RAID, LVM VG on DRBD on LVM LV) are set up, the cluster is created and both nodes have joined, and fencing (fence domain and ipmilan devices) is configured and only lightly tested so far. Only the qdisk is left to configure.
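In case the network layout matters, the relevant part of /etc/network/interfaces looks roughly like this (NIC names, addresses and bond modes below are illustrative, not a verbatim copy of my file):
Code:
# bond0 + vmbr0: LAN
auto bond0
iface bond0 inet manual
        slaves eth0 eth1
        bond_miimon 100
        bond_mode active-backup

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        gateway 192.168.1.1
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0

# bond1 + vmbr1: direct node-to-node storage links, jumbo frames
auto bond1
iface bond1 inet manual
        slaves eth2 eth3
        bond_miimon 100
        bond_mode balance-rr
        mtu 9000

auto vmbr1
iface vmbr1 inet static
        address 10.10.10.1
        netmask 255.255.255.0
        bridge_ports bond1
        bridge_stp off
        bridge_fd 0
        mtu 9000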
About the storage configuration:
Code:
2x Intel 520 240GB SSDs
\_ HP SA P410 w/ 512MB FBWC, RAID1 array with accelerator disabled as recommended by HP
   \_ /dev/sda3 configured as LVM physical volume
      \_ vgpve01vms: LVM VG for virtual machine storage
         \_ lvdrbd0vm101: DRBD backing LV for VM 101 (for resizable DRBD devices etc.)
            \_ drbd0: Pri/Pri DRBD device dedicated to VM 101 (for isolated DRBD mgmt of each VM)
               \_ vgdrbd0vm101: "vm101" PVE _shared_ storage (dedicated to VM 101)
                  \_ vm-101-disk-1: VM 101 KVM disk device configured as "virtio0: vm101:vm-101-disk-1"
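For completeness, the stack was assembled more or less like this on each node (simplified and from memory; device names, sizes and the DRBD resource name below are illustrative):
Code:
# LVM PV + VG on the hardware RAID partition
pvcreate /dev/sda3
vgcreate vgpve01vms /dev/sda3

# backing LV for the per-VM DRBD device (easy to resize later)
lvcreate -L 40G -n lvdrbd0vm101 vgpve01vms

# DRBD device on top of that LV (resource name is illustrative)
drbdadm create-md r0vm101
drbdadm up r0vm101
# ... initial sync, then promote the resource to Primary on both nodes ...

# PVE "shared" storage VG on top of the DRBD device
pvcreate /dev/drbd0
vgcreate vgdrbd0vm101 /dev/drbd0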
VM 101 is a KVM W7 VM configured as follows:
Code:
bootdisk: virtio0
cores: 2
cpu: host
keyboard: fr
memory: 2048
name: wseven
net0: virtio=8E:CA:D1:58:82:A0,bridge=vmbr0
ostype: win7
sockets: 1
virtio0: vm101:vm-101-disk-1
DRBD is working nicely through its dedicated network.
Code:
0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
ns:21734972 nr:0 dw:13289664 dr:131091039 al:2372 bm:1298 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
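The resource itself is a fairly standard dual-primary definition, something along these lines (DRBD 8.3-style syntax; the hostnames are real but the resource name, IPs and backing device on pve02 are illustrative):
Code:
resource r0vm101 {
    protocol C;
    startup {
        become-primary-on both;
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    on pve01 {
        device    /dev/drbd0;
        disk      /dev/vgpve01vms/lvdrbd0vm101;
        address   10.10.10.1:7788;
        meta-disk internal;
    }
    on pve02 {
        device    /dev/drbd0;
        disk      /dev/vgpve02vms/lvdrbd0vm101;
        address   10.10.10.2:7788;
        meta-disk internal;
    }
}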
Network configuration seems stable and performs well according to iperf.
PVE storage configuration looks right and is consistent on both nodes. The "vm101" VG is indeed marked as shared and made available to nodes pve01 and pve02 in storage.cfg on both servers:
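Nothing fancy in how I checked it, basically (storage-network address of pve02 is illustrative):
Code:
# on pve02
iperf -s
# on pve01
iperf -c 10.10.10.2 -t 30
# confirm jumbo frames really pass end to end (9000 MTU minus 28 bytes of headers)
ping -M do -s 8972 10.10.10.2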
Code:
lvm: vm101
        vgname vgdrbd0vm101
        shared
        content images
        nodes pve01,pve02
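To cross-check the storage from the CLI I use the usual commands on both nodes (I can post the actual output if useful):
Code:
pvesm status          # is storage "vm101" active on this node?
pvesm list vm101      # does the shared storage list vm-101-disk-1?
pvs                   # which PV backs vgdrbd0vm101? (should be /dev/drbd0)
lvs vgdrbd0vm101      # the vm-101-disk-1 LV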
VM starts, runs, stops and is backed up (to RDX) smoothly...
BUT I can't migrate it because of the following error:
Code:
Dec 14 12:02:26 starting migration of VM 101 to node 'pve02' (192.168.100.20)
Dec 14 12:02:26 copying disk images
Dec 14 12:02:26 ERROR: Failed to sync data - can't do online migration - VM uses local disks
Dec 14 12:02:26 aborting phase 1 - cleanup resources
Dec 14 12:02:26 ERROR: migration aborted (duration 00:00:00): Failed to sync data - can't do online migration - VM uses local disks
TASK ERROR: migration aborted
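If it helps the diagnosis, here is what I can easily compare on both nodes and paste on request:
Code:
qm config 101             # which storage the virtio0 disk actually references
pvesm list vm101          # is vm-101-disk-1 visible through the shared storage on both nodes?
pvs ; vgs                 # is vgdrbd0vm101 sitting on /dev/drbd0 on both nodes?
cat /etc/pve/storage.cfg  # should be identical anyway (pmxcfs)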
I may have made a mistake somewhere, but I really can't see which one, as this worked on my PoC setup (which I sadly can't access ATM :'( ). That PoC was built exactly the same way (LVM VG on DRBD on LVM LV on soft RAID) and live migration of (W$, Linux, FreeBSD) VMs worked without a hitch.
But The Beast keeps telling me I'm using local disks and I can't figure out why?!
The only suspect I can think of is CLVM, which I would have expected to be running... but isn't. The few relevant mentions of it I found (that it replaces PVE's own lock mechanism in v2 on one side, and that it's only needed in v2 in rare circumstances on the other) seem a little contradictory to me, so I thought I'd better ask for enlightenment before struggling further.
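For the record, this is how I concluded CLVM isn't active (assuming these are the right places to look):
Code:
pgrep clvmd                           # no clvmd process running here
grep locking_type /etc/lvm/lvm.conf   # 3 would mean clustered locking via clvmd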
Thank you in advance for your answers and comments.
Time for a comforting 12 year ol' bram...
Have a nice WE
Bests