Same VM wrongly running on both nodes

panda

Hi,

I have a two-node cluster running Proxmox 2.2 with multipath shared iSCSI storage.
Something really bad happened, and I don't understand whether I did something wrong:

I was working with a KVM VM on node 2, and when I was satisfied with its state I put it
(while running) into the HA configuration with autostart.
After a while I noticed that the VM had been "migrated" to node 1, BUT WAS STILL RUNNING ON NODE 2 too.
I don't know what I did wrong; rgmanager simply did not notice that the VM was running on node 2 and started it on node 1.
I'm trying to recover the filesystems now.

I had hoped that even after a wrong procedure (as putting a running VM into HA might be), the cluster would not have allowed such behaviour.
Did I miss something else?
 
If you run HA, you need at least 3 nodes and, very importantly, reliable fencing (e.g. power fencing devices).

You can find all the requirements on our wiki. If you do not follow these recommendations, weird situations can occur. If you do it right, it will work very reliably.

http://pve.proxmox.com/wiki/High_Availability_Cluster#System_requirements

Hi,


thanks for your reply.
My setup uses a quorum disk and fencing based on directly connected iLO3/Lanplus. I ran several tests on my environment to be sure that, with a network split, no VM was started before one node had shut down the other.
But that was not the case here: there was no split, no network partition, no hosts down; everything was up and running normally. I just added a (running) VM to HA and then activated the new configuration. Then this happened:

-- node 1

Apr 2 20:05:40 vmhost01 rgmanager[2943]: Loading Service Data
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Stopping changed resources.
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Restarting changed resources.
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Starting changed resources.
Apr 2 20:05:42 vmhost01 rgmanager[2943]: Initializing pvevm:109
Apr 2 20:05:42 vmhost01 rgmanager[2943]: pvevm:109 was added to the config, but I am not initializing it.
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Starting stopped service pvevm:109
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Migration: pvevm:110 is running on 2
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Migration: pvevm:112 is running on 2
Apr 2 20:05:43 vmhost01 rgmanager[2943]: Migration: pvevm:111 is running on 2
Apr 2 20:05:43 vmhost01 rgmanager[977258]: [pvevm] Move config for VM 109 to local node
Apr 2 20:05:43 vmhost01 pvevm: <root@pam> starting task UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam:
Apr 2 20:05:43 vmhost01 task UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam:: start VM 109: UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam:
Apr 2 20:05:44 vmhost01 multipathd: dm-12: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-5: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-5: devmap already registered
Apr 2 20:05:44 vmhost01 multipathd: dm-13: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-6: add map (uevent)
Apr 2 20:05:44 vmhost01 multipathd: dm-6: devmap already registered
Apr 2 20:05:44 vmhost01 kernel: device tap109i0 entered promiscuous mode
Apr 2 20:05:44 vmhost01 kernel: vmbr0: port 3(tap109i0) entering forwarding state
Apr 2 20:05:44 vmhost01 rgmanager[977357]: [pvevm] Task still active, waiting
Apr 2 20:05:45 vmhost01 pvevm: <root@pam> end task UPID:vmhost01:000EE97E:07BB344C:515B1DF7:qmstart:109:root@pam: OK
Apr 2 20:05:45 vmhost01 rgmanager[2943]: Service pvevm:109 started
Apr 2 20:05:45 vmhost01 rgmanager[977386]: [pvevm] VM 109 is running

--node 2


Apr 2 20:05:40 vmhost02 pmxcfs[8683]: [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
Apr 2 20:05:40 vmhost02 corosync[4063]: [QUORUM] Members[2]: 1 2
Apr 2 20:05:40 vmhost02 pmxcfs[8683]: [status] notice: update cluster info (cluster name vmhosts-cluster, version = 24)
Apr 2 20:05:40 vmhost02 rgmanager[5735]: Reconfiguring
Apr 2 20:05:40 vmhost02 rgmanager[5735]: Loading Service Data
Apr 2 20:05:43 vmhost02 rgmanager[291782]: [pvevm] VM 110 is running
Apr 2 20:05:43 vmhost02 rgmanager[291794]: [pvevm] VM 112 is running
Apr 2 20:05:43 vmhost02 rgmanager[291808]: [pvevm] VM 111 is running
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Stopping changed resources.
Apr 2 20:05:43 vmhost02 pmxcfs[8683]: [status] notice: received log
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Restarting changed resources.
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Starting changed resources.
Apr 2 20:05:43 vmhost02 rgmanager[5735]: Initializing pvevm:109
Apr 2 20:05:43 vmhost02 rgmanager[5735]: pvevm:109 was added to the config, but I am not initializing it.
Apr 2 20:05:45 vmhost02 pmxcfs[8683]: [status] notice: received log
Apr 2 20:05:45 vmhost02 rgmanager[5735]: Migration: pvevm:109 is running on 1

*BUT* VM 109 was already running on node 2; I just can't understand why rgmanager decided it wasn't.
 
Just another bit of information:

This is the last piece of the log on node 2 referring to VM 109 before the entries I just posted.

Apr 2 19:55:32 vmhost02 pvedaemon[286711]: start VM 109: UPID:vmhost02:00045FF7:07B95131:515B1B94:qmstart:109:root@pam:
Apr 2 19:55:32 vmhost02 pvedaemon[285902]: <root@pam> starting task UPID:vmhost02:00045FF7:07B95131:515B1B94:qmstart:109:root@pam:
Apr 2 19:55:33 vmhost02 kernel: device tap109i0 entered promiscuous mode
Apr 2 19:55:33 vmhost02 kernel: vmbr0: port 4(tap109i0) entering forwarding state
Apr 2 19:55:34 vmhost02 pvedaemon[285902]: <root@pam> end task UPID:vmhost02:00045FF7:07B95131:515B1B94:qmstart:109:root@pam: OK
Apr 2 19:55:42 vmhost02 rgmanager[286800]: [pvevm] VM 110 is running
Apr 2 19:55:44 vmhost02 kernel: tap109i0: no IPv6 routers present
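
For anyone comparing log excerpts like the ones above, the double start can be flagged mechanically. The following is a minimal sketch in Python, not a Proxmox tool: the function name `find_double_starts` and the regular expression are assumptions based only on the `start VM <id>:` lines shown in these logs. It does not track stops or migrations, so a hit only means the VM was started on more than one host within the supplied lines and still needs to be checked by hand.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpts in this thread:
#   Apr 2 19:55:32 vmhost02 pvedaemon[286711]: start VM 109: UPID:...
# Groups: syslog timestamp, host name, VM id.
START_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+.*\bstart VM (\d+):")


def find_double_starts(lines):
    """Return {vmid: [(timestamp, host), ...]} for VMs started on more than one host.

    Heuristic only: stop and migration events are not considered.
    """
    starts = defaultdict(list)
    for line in lines:
        match = START_RE.match(line)
        if match:
            timestamp, host, vmid = match.groups()
            starts[vmid].append((timestamp, host))
    # Keep only VMs whose start events span at least two distinct hosts.
    return {
        vmid: events
        for vmid, events in starts.items()
        if len({host for _, host in events}) > 1
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python double_starts.py node1.log node2.log
    all_lines = []
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            all_lines.extend(handle)
    for vmid, events in find_double_starts(all_lines).items():
        print(f"VM {vmid} started on multiple hosts:")
        for timestamp, host in events:
            print(f"  {timestamp}  {host}")
```

Fed the two excerpts from this thread, the sketch would report VM 109 with a start on vmhost02 at 19:55:32 and another on vmhost01 at 20:05:43, which is the situation described above.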
 
