Understanding how migration and failover work in a Proxmox/Ceph cluster?

victorhooi

Well-Known Member
Apr 3, 2018
Hi,

Say I have a Proxmox cluster, with Ceph as the shared storage for VMs. Our VMs are mostly running Windows, and clients access them via RDP.

To confirm - live-migrating a running VM from one node to another should be fairly quick, and the VM stays running the whole time - so an RDP session should not drop, right?

However, if a node suddenly fails - what happens? E.g. what happens to the memory contents of the VM on the failed node? Is there any way to set it up so that it migrates across seamlessly?

Thanks,
Victor
 
However, if a node suddenly fails - what happens? E.g. what happens to the memory contents of the VM on the failed node? Is there any way to set it up so that it migrates across seamlessly?
No. If a node fails, the VMs/CTs that were running on it will be started on another node.
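
To make the difference concrete, here is a toy Python model (not Proxmox code; the class, function names, and node names are made up for illustration). It assumes the disk lives on shared storage and the memory lives only in the RAM of whichever node currently runs the VM:

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    vmid: int
    node: str
    disk: dict = field(default_factory=dict)    # on shared storage (e.g. Ceph)
    memory: dict = field(default_factory=dict)  # only in the RAM of the current node
    boots: int = 1


def live_migrate(vm: VM, target: str) -> None:
    """Planned move: the source node is alive, so RAM contents can be copied."""
    vm.node = target  # memory and disk both survive


def fail_over(vm: VM, target: str) -> None:
    """Unplanned move: the source node is gone, and its RAM with it."""
    vm.memory = {}    # nothing left to copy from
    vm.node = target
    vm.boots += 1     # fresh boot from the disk image on shared storage


vm = VM(vmid=100, node="node1",
        disk={"file.txt": "saved"}, memory={"rdp_session": "open"})

live_migrate(vm, "node2")
after_migration = (vm.node, dict(vm.memory), vm.boots)

fail_over(vm, "node3")
after_failover = (vm.node, dict(vm.memory), dict(vm.disk), vm.boots)
```

After the live migration the in-memory state is still there. After the failover only what was written to disk is left, and the VM has gone through a new boot.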
 
Right - so it will be a new boot of that VM.

Curious - is there any method, or scenario, under which it could be seamlessly migrated over without a restart? Is such a thing possible under Proxmox (or elsewhere)?
 
Curious - is there any method, or scenario, under which it could be seamlessly migrated over without a restart? Is such a thing possible under Proxmox (or elsewhere)?
If you mean something like fault tolerance (hot standby), then no. Code for this has landed in QEMU, but to my knowledge it is not production-ready yet.
 
For some services there is also the option of doing such hot standby/failover at the application level - e.g. some database solutions offer it. It is very costly though, since you basically carry more than 2x the load all the time. For most scenarios, shared storage that gives you a readily available cold standby (with automatic failover via our HA stack) is more than enough.
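
As a rough sketch of the two operations discussed in this thread, the Python snippet below only builds the command lines and does not run them. The VM ID 100 and node name "node2" are placeholders, and the helper names are invented for this example. It assumes the `qm migrate ... --online` and `ha-manager add vm:<vmid> --state started` forms of the Proxmox CLI, so check the man pages on your version before relying on them:

```python
import shlex


def live_migrate_command(vmid: int, target_node: str) -> list[str]:
    """Planned move of a running VM to another node (VM keeps running)."""
    return ["qm", "migrate", str(vmid), target_node, "--online"]


def ha_add_command(vmid: int, state: str = "started") -> list[str]:
    """Put a VM under HA management so it is restarted elsewhere if its node fails."""
    return ["ha-manager", "add", f"vm:{vmid}", "--state", state]


# Placeholder values, for illustration only.
print(shlex.join(live_migrate_command(100, "node2")))
print(shlex.join(ha_add_command(100)))
```

The first command is the seamless, planned case. The second only arranges the cold-standby behaviour described above: after a node failure the VM is booted again on another node, not resumed.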