I've built out a test environment for Proxmox 2.0b (specifications listed at the end of this post): a three-node cluster set up with containers, images, backups, etc. on NFS. Fencing is established and HA configured. I performed my first failover test tonight by setting up one container (ID 100) on node 1 for management by HA, then shutting down the node 1 rgmanager. Container 100 successfully moved to node 2 and everything continued normally.
The behavior I observed was that when rgmanager on node 1 was started again, the container stayed on node 2, which may be expected; however, I was unable to get the container back to node 1 without removing it from HA management. Every time I tried (online or offline migration), the container appeared to move back to node 1, but before it could start successfully it was relocated to one of the other nodes.
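For anyone trying to reproduce this, here is a sketch of how one might inspect and relocate the HA-managed container using the redhat-cluster tools directly, rather than the Proxmox migrate. This assumes the HA resource is named `pvevm:100` (the naming Proxmox 2.0's rgmanager integration uses for VMID 100) and that the cluster node is named `node1`; adjust for your configuration.

```shell
# Show rgmanager's view of the cluster and which node service pvevm:100 is on
clustat

# Ask rgmanager itself to relocate the HA-managed container back to node 1
# (a plain Proxmox migration can be undone by rgmanager, since rgmanager
# owns the service placement)
clusvcadm -r pvevm:100 -m node1

# Alternatively, request a live migration of the service:
clusvcadm -M pvevm:100 -m node1
```

The key point of the sketch: once a container is under HA management, rgmanager decides where it runs, so relocation requests should go through `clusvcadm` rather than the regular migration path.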
Can anyone speak to this behavior?
Warren D. 'Cal' Calhoun
Specifications
-------------
3 x Dell PowerEdge 1950 Servers
- Dual quad-core 2.0 GHz Xeons
- 8 GB DDR2 ECC RAM (4x2 GB)
- Dual 73 GB SAS drives, RAID 1
- DRAC-V Remote Access Cards
root@earth:~# pveversion --verbose
pve-manager: 2.0-18 (pve-manager/2.0/16283a5a)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 2.0-55
pve-kernel-2.6.32-6-pve: 2.6.32-55
lvm2: 2.02.88-2pve1
clvm: 2.02.88-2pve1
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-1
libqb: 0.6.0-1
redhat-cluster-pve: 3.1.8-3
pve-cluster: 1.0-17
qemu-server: 2.0-13
pve-firmware: 1.0-14
libpve-common-perl: 1.0-11
libpve-access-control: 1.0-5
libpve-storage-perl: 2.0-9
vncterm: 1.0-2
vzctl: 3.0.29-3pve8
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-1
ksm-control-daemon: 1.1-1
Three CentOS 6.2 OpenVZ containers run on each physical node, providing a variety of services (Zimbra, PowerDNS x2, Apache2 x2 [round-robin], MySQL x2 [master-master replication], LDAP, RADIUS, management utilities, etc.).