Watchdog to make sure VMs/CTs don’t die

exp · Dec 31, 2023

It has happened two or three times in the past year: a virtual machine or container was simply killed, most likely due to an out-of-memory (OOM) condition.

While I understand that OOM conditions can happen and containers/VMs have to be killed as a last resort, this can be absolutely detrimental. For example, my router is a VM, so I can lose remote access to my network (clearly I don’t want to expose the host directly).

Is there any watchdog function to restart containers/VMs? Is there an option to define priorities for containers? For example, it is much more acceptable for my NAS to be killed than for my router VM.

LnxBil · Dec 31, 2023

exp said:
Is there any watchdog function to restart containers/VMs?

Define them as HA. A watchdog does exist inside QEMU, but it only covers the case where the OS inside the guest crashes.

exp · Dec 31, 2023

Thanks, I’ll set up a per-VM watchdog separately.

My question is really about the whole VM dying, for example QEMU being killed by the host (e.g. due to OOM).

Does defining them as HA mean “High Availability”? Doesn’t this require more than one instance, though?

Dunuin · Dec 31, 2023

You could write a simple bash script that polls the state of a VM with "qm status" and starts it with "qm start" if it is stopped.
I've seen a few such scripts in this forum, so you might search for them. For example: https://forum.proxmox.com/threads/how-to-keep-the-vm-boot-state-is-not-shut-down？.100565/post-434266
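
A minimal sketch of such a polling script might look like the following. It is not taken from the linked thread; the function names, the WATCHDOG_VMIDS and WATCHDOG_INTERVAL variables, and the 30-second default are assumptions made for illustration. It relies only on "qm status" printing a line like "status: running" and on "qm start".

```bash
#!/usr/bin/env bash
# Sketch of a polling watchdog built on `qm status` and `qm start`.
# Assumption: WATCHDOG_VMIDS and WATCHDOG_INTERVAL are hypothetical
# variable names chosen for this example, not Proxmox settings.

# Return 0 if the VM reports "status: running", non-zero otherwise.
vm_is_running() {
    local vmid="$1"
    qm status "$vmid" 2>/dev/null | grep -q '^status: running'
}

# Start the VM if it is not running and log the action.
ensure_running() {
    local vmid="$1"
    if vm_is_running "$vmid"; then
        return 0
    fi
    echo "$(date '+%F %T') VM $vmid is not running, starting it"
    qm start "$vmid"
}

# Check every listed VM, sleep, repeat.
watchdog_loop() {
    local interval="${WATCHDOG_INTERVAL:-30}"
    local vmid
    while true; do
        for vmid in $WATCHDOG_VMIDS; do
            ensure_running "$vmid" || echo "failed to start VM $vmid" >&2
        done
        sleep "$interval"
    done
}

# Only start polling when a list of VM IDs was provided.
if [ -n "${WATCHDOG_VMIDS:-}" ]; then
    watchdog_loop
fi
```

Run it on the host as root, for example with WATCHDOG_VMIDS="100 101" set in the environment (the IDs are placeholders). Note that this restarts a VM even if you shut it down on purpose, so stop the script before doing maintenance. For containers, the same idea should work with "pct status" and "pct start".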

mac.linux.free · Dec 31, 2023

https://tteck.github.io/Proxmox/

and search for monitor-all
