ansanto (Guest)
Assume we create all the (KVM) VMs on the slave node only, i.e. the slave node is the only one responsible for running all the VMs.
In my understanding, in such a scenario a DRBD split-brain (or a cluster split-brain too) will never cause data inconsistency, simply because all the VMs run on only one side of the cluster(!) while the other side doesn't issue any I/O operations on the replicated disk (apart from the ones issued by DRBD in order to sync the disks).
Is that correct?
That said, assume we also have some simple code:
- clonevm
scan and "clone" the VM definition files from the slave (node2:/etc/qemu-server/*.conf)
to the master (node1:/etc/qemu-server-clone/*.conf). It could also be scheduled for update purposes.
- initvm
copy the [selected|all] conf files from /etc/qemu-server-clone to /etc/qemu-server (inside the master node1). These initialized VM(s) are kept down (stopped).
- startvm/stopvm
SystemV-style rc script which starts|stops the [selected|all] cloned VM(s) (inside the master node1); a rough sketch of all three helpers is below.
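Just to make the idea concrete, here is a minimal sketch of those helpers. It is only a sketch under my assumptions: passwordless ssh/rsync between the nodes, the standard Proxmox 'qm' tool for starting/stopping, and a VMID passed as the first argument; names and paths are just illustrative.

Code:
#!/bin/sh
# Sketch only -- assumes node2 is reachable via ssh/rsync and that the
# 'qm' tool manages the VMs on the master node1.

clonevm() {
    # pull the VM definition files from the slave into a "clone" dir on the master
    rsync -a node2:/etc/qemu-server/ /etc/qemu-server-clone/
}

initvm() {
    # make a cloned definition visible to qemu-server on the master, without starting it
    cp /etc/qemu-server-clone/"$1".conf /etc/qemu-server/
}

startvm() { qm start "$1"; }
stopvm()  { qm stop  "$1"; }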
'initvm' and 'startvm' may be triggered by DRBD/cluster events or started manually, so that in case of a slave node2 failure we get a fast on-line migration (as the VM images are kept up to date by DRBD).
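As an example of such a trigger (again only a sketch: resource name 'r0' and VMID '101' are placeholders, and in a real setup this would rather be a cluster resource or a DRBD handler than a polling loop), something like this could watch the DRBD connection state and bring the clones up on node1 when node2 disappears:

Code:
#!/bin/sh
# Placeholder trigger: if the DRBD resource loses its peer, init and start
# the cloned VM on the master node1.
RES=r0
while sleep 10; do
    if [ "$(drbdadm cstate $RES)" != "Connected" ]; then
        initvm 101
        startvm 101
        break
    fi
done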
In a few words: replicated images and cloned VMs.
It is in the above scenario that we could experience data inconsistency due to the split-brain, if the slave node2 becomes available again and starts all the VMs at boot(!).
In that case, however, the split-brain recovery policy to be adopted is quite simple. Once the slave node2 disk is synced (DRBD manual/automatic recovery), we have to stop the cloned VMs on the master node1 and start them (the original ones) on the slave node2.
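For reference, the manual recovery I'm testing looks roughly like this (DRBD 8.3-style syntax; resource 'r0' and VMID '101' are placeholders, so please double-check against the DRBD docs for your version):

Code:
# on node2 (the split-brain victim, whose changes get discarded):
drbdadm secondary r0
drbdadm -- --discard-my-data connect r0
# on node1 (the survivor / sync source):
drbdadm connect r0
# once the resync is finished, move the load back:
qm stop 101     # node1: stop the cloned VM
qm start 101    # node2: start the original VM again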
Is that correct, or am I missing something? (I'm testing what I posted and it seems to work like a charm...)
Antonio