General HA Discussion

cswizard

Member
Jan 13, 2012
Ringgold, Georgia, United States
I've built a test environment for Proxmox 2.0b (specifications are listed at the end of this post). It is a three-node cluster with containers, images, backups, etc. on NFS. Fencing is established and HA is configured. I performed my first failover test tonight by putting one container (ID 100) on node 1 under HA management and then shutting down rgmanager on node 1. Container 100 successfully moved to node 2 and everything continued normally.

When rgmanager on node 1 was started again, the container stayed on node 2, which may be expected. However, I was unable to get the container back to node 1 without removing it from HA management. Every time I tried (online or offline migration), the container seemed to move back to node 1, but before it could start successfully it was moved to one of the other nodes.

Can anyone speak to this behavior?

Warren D. 'Cal' Calhoun

Specifications
-------------
3 x Dell PowerEdge 1950 Servers
- Dual Quad Core 2.0 GHz Xeons
- 8 GB DDR2 ECC RAM (4x2 GB)
- Dual 73 GB SAS Drives, RAID 1
- DRAC-V Remote Access Cards

root@earth:~# pveversion --verbose
pve-manager: 2.0-18 (pve-manager/2.0/16283a5a)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 2.0-55
pve-kernel-2.6.32-6-pve: 2.6.32-55
lvm2: 2.02.88-2pve1
clvm: 2.02.88-2pve1
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-1
libqb: 0.6.0-1
redhat-cluster-pve: 3.1.8-3
pve-cluster: 1.0-17
qemu-server: 2.0-13
pve-firmware: 1.0-14
libpve-common-perl: 1.0-11
libpve-access-control: 1.0-5
libpve-storage-perl: 2.0-9
vncterm: 1.0-2
vzctl: 3.0.29-3pve8
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-1
ksm-control-daemon: 1.1-1

Three CentOS 6.2 OpenVZ containers run on each physical server, providing a variety of services (Zimbra, PowerDNS x2, Apache2 x2 [round-robin], MySQL x2 [master-master replication], LDAP, RADIUS, management utilities, etc.).
 
Live migration via the GUI for HA-managed VMs is not implemented yet. This behavior is expected, and it at least shows that HA is working.

Take a look at the CLI commands; see http://pve.proxmox.com/wiki/High_Availability_Cluster#Testing
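For anyone landing here later, a rough sketch of what the CLI route might look like. This assumes the HA service shows up in clustat as pvevm:<vmid> and that rgmanager's clusvcadm is used to relocate it; the node name "node1" and the helper relocate_cmd are made up for illustration. The helper only prints the command, so nothing is moved until you run the printed line yourself.

```shell
#!/bin/sh
# Build (but do not run) the rgmanager relocate command for an HA-managed guest.
# Assumption: the HA service is named pvevm:<vmid>, as reported by clustat.
relocate_cmd() {
    # $1 = guest ID, $2 = target cluster member (hypothetical name below)
    printf 'clusvcadm -r pvevm:%s -m %s\n' "$1" "$2"
}

# Print the command that would move container 100 back to a node called "node1".
relocate_cmd 100 node1

# On a cluster node, check the service state with clustat first,
# then run the printed command as root.
```

Check the exact service name and member names in your own clustat output before running anything; the wiki page linked above is the authoritative reference.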

Thanks Tom. I suspected as much so it didn't alarm me very much. All-in-all things are looking pretty good. As the features get finished up and reliability increases, it's going to be a fantastic product. I'm looking forward to putting it into production.

Thanks

Cal