Watchdog to make sure VMs/CTs don’t die

exp

Jul 20, 2023
Two or three times in the past year, a virtual machine or container was simply killed, most likely due to an out-of-memory (OOM) condition.

While I understand that OOM conditions can happen and that containers/VMs have to be killed as a last resort, this can be absolutely detrimental. For example, my router is a VM, and if it gets killed I can lose remote access to my network (clearly I don’t want to expose the host directly).

Is there any watchdog function to restart containers/VMs? Is there an option to define priorities for containers/VMs? For example, it is much more acceptable for my NAS to be killed than my router VM.
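To make the watchdog idea concrete, here is a minimal sketch of what I have in mind, assuming it runs as root on the host and uses the `qm` and `pct` command-line tools to check and start guests. The guest IDs (100, 101) and the function names are hypothetical, and I have not verified this as a recommended approach.

```python
import subprocess

# Hypothetical guests to keep alive: ("qm", id) for VMs, ("pct", id) for containers.
GUESTS = [("qm", 100), ("pct", 101)]


def parse_status(output: str) -> str:
    """Extract the state from output such as 'status: running'."""
    for line in output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "status":
            return value.strip()
    return "unknown"


def guest_status(tool, guest_id, run=subprocess.run):
    """Ask the given tool ('qm' or 'pct') for the guest's current state."""
    result = run([tool, "status", str(guest_id)], capture_output=True, text=True)
    return parse_status(result.stdout)


def check_once(guests=GUESTS, run=subprocess.run):
    """Start every listed guest that reports 'stopped'; return what was started."""
    restarted = []
    for tool, guest_id in guests:
        if guest_status(tool, guest_id, run) == "stopped":
            run([tool, "start", str(guest_id)], capture_output=True, text=True)
            restarted.append((tool, guest_id))
    return restarted
```

Calling `check_once()` every minute or so (from cron or a systemd timer) would be the simplest way to use it. One obvious drawback: it cannot tell a killed guest from one that was shut down on purpose, so it would restart both.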
 
Thanks, I’ll set up a per-VM watchdog separately.

My question is really about the whole VM dying, for example the QEMU process being killed by the host (e.g. due to OOM).
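On the priority side, the only knob I am aware of is the Linux `oom_score_adj` value under `/proc/<pid>/`, which ranges from -1000 (never pick this process) to 1000 (pick it first). A rough sketch of lowering it for the router VM's QEMU process follows, assuming the PID can be read from a file such as `/var/run/qemu-server/<vmid>.pid` (that path and the helper names are my assumptions). It needs root and would have to be reapplied after every VM start.

```python
from pathlib import Path


def read_pid(pid_file: Path) -> int:
    """Read a process ID from a PID file (path is an assumption, see above)."""
    return int(pid_file.read_text().strip())


def set_oom_score_adj(pid: int, score: int, proc_root: Path = Path("/proc")) -> None:
    """Write the OOM adjustment for a process; lower means less likely to be killed."""
    if not -1000 <= score <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    (proc_root / str(pid) / "oom_score_adj").write_text(f"{score}\n")
```

For example, `set_oom_score_adj(read_pid(Path("/var/run/qemu-server/100.pid")), -500)` would make VM 100 a less likely victim, while a positive value on the NAS guest would make it a more likely one. This only shifts the kernel's choice; it does not prevent OOM conditions.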

Does defining it as HA mean “High Availability”? Does this require more than one instance, though?