If you run HA, you need at least 3 nodes and, very importantly, reliable fencing (e.g. power fencing devices).
You can find all requirements on our wiki. If you do not follow these recommendations, weird situations can occur; if you do it right, it will work very reliably.
http://pve.proxmox.com/wiki/High_Availability_Cluster#System_requirements
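For reference, the fencing part of such a setup lives in /etc/cluster/cluster.conf and typically looks something like the sketch below. All names, addresses and credentials are placeholders, and fence_ilo3 is just one of several available agents (iLO3 over lanplus can also be driven via fence_ipmilan), so check the agent's man page for your hardware:

```xml
<!-- sketch only: placeholder names/addresses, one node shown -->
<clusternodes>
  <clusternode name="node1" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="fence-node1" action="reboot"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice agent="fence_ilo3" name="fence-node1"
               ipaddr="10.0.0.11" login="admin" passwd="secret"/>
</fencedevices>
```

The point is that each cluster node must reference a fence device the surviving nodes can actually reach, otherwise recovery will hang or misbehave.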
Hi,
thanks for your reply.
My setup uses a quorum disk, and fencing based on directly connected iLO3/lanplus. I ran several tests in my environment to make sure that during a network split no VM was started before one node had shut down the other.
But that was not the case here: there was no split, no network partition, no host down; everything was working normally, up and running. I just added a (running) VM to HA, then activated the new configuration. Then this happened:
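For context, adding a VM to HA simply adds a pvevm resource entry to the resource manager section of cluster.conf, which both nodes then reload. A sketch of what the change looks like (vmid 109 as in the logs below; autostart value is illustrative):

```xml
<rm>
  <!-- the newly added HA-managed VM -->
  <pvevm autostart="1" vmid="109"/>
  <!-- VMs already under HA management -->
  <pvevm autostart="1" vmid="110"/>
  <pvevm autostart="1" vmid="111"/>
  <pvevm autostart="1" vmid="112"/>
</rm>
```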
-- node 1
Apr 2 20:05:40 vmhost01 rgmanager[2943]: Loading Service Data
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Stopping changed resources.
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Restarting changed resources.
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Starting changed resources.
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Initializing pvevm:109
Apr 2 20:05:42 vmhost01 rgmanager[2943]: pvevm:109 was added to the config, but I am not initializing it.
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Starting stopped service pvevm:109
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Migration: pvevm:110 is running on 2
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Migration: pvevm:112 is running on 2
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Migration: pvevm:111 is running on 2
Apr 2 20:05:43 vmhost01 rgmanager[977258]: [pvevm] Move config for VM 109 to local node
Apr 2 20:05:43 vmhost01 pvevm: <root@pam> starting task UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam:
Apr 2 20:05:43 vmhost01 task UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam:: start VM 109: UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam:
Apr 2 20:05:44 vmhost01 multipathd: dm-12: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-5: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-5: devmap already registered
Apr 2 20:05:44 vmhost01 multipathd: dm-13: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-6: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-6: devmap already registered
Apr 2 20:05:44 vmhost01 kernel: device tap109i0 entered promiscuous mode
Apr 2 20:05:44 vmhost01 kernel: vmbr0: port 3(tap109i0) entering forwarding state
Apr 2 20:05:44 vmhost01 rgmanager[977357]: [pvevm] Task still active, waiting
Apr 2 20:05:45 vmhost01 pvevm: <root@pam> end task UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam: OK
Apr 2 20:05:45 vmhost01 rgmanager[2943]: Service pvevm:109 started
Apr 2 20:05:45 vmhost01 rgmanager[977386]: [pvevm] VM 109 is running
-- node 2
Apr 2 20:05:40 vmhost02 pmxcfs[8683]: [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
Apr 2 20:05:40 vmhost02 corosync[4063]: [QUORUM] Members[2]: 1 2
Apr 2 20:05:40 vmhost02 pmxcfs[8683]: [status] notice: update cluster info (cluster name vmhosts-cluster, version = 24)
Apr 2 20:05:40 vmhost02 rgmanager[5735]: Reconfiguring
Apr 2 20:05:40 vmhost02 rgmanager[5735]: Loading Service Data
Apr 2 20:05:43 vmhost02 rgmanager[291782]: [pvevm] VM 110 is running
Apr 2 20:05:43 vmhost02 rgmanager[291794]: [pvevm] VM 112 is running
Apr 2 20:05:43 vmhost02 rgmanager[291808]: [pvevm] VM 111 is running
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Stopping changed resources.
Apr 2 20:05:43 vmhost02 pmxcfs[8683]: [status] notice: received log
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Restarting changed resources.
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Starting changed resources.
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Initializing pvevm:109
Apr 2 20:05:43 vmhost02 rgmanager[5735]: pvevm:109 was added to the config, but I am not initializing it.
Apr 2 20:05:45 vmhost02 pmxcfs[8683]: [status] notice: received log
Apr 2 20:05:45 vmhost02 rgmanager[5735]: Migration: pvevm:109 is running on 1
*BUT* VM 109 was already running on node 2; I just can't understand why rgmanager decided it wasn't.