[SOLVED] Understanding HA and Watchdog

Grunt

New Member
Sep 6, 2022
I've got a two-node (plus qnode) cluster of consumer hardware, connected to TrueNAS storage over iSCSI. My goal is that if one node goes down, any HA-configured resources will be spun up on the surviving host. Under fencing, all it says is to use watchdog-based fencing. However, I can't seem to figure out where the fencing needs to occur. I would think it would be on the Proxmox host that died, but all the guides I find on the internet reference setting it up on the guest VMs. I also gave myself a quick scare when I tried to install watchdog on the host and it said it would uninstall Proxmox.

So, where should software watchdog be configured, on the host or guest, and if on the host...how?

As far as understanding HA goes: if the node a VM lives on dies, the VM should be moved over to the surviving host and started up there, right?
 
Did you already check our docs for that topic?
https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#ha_manager_fencing

So, where should software watchdog be configured, on the host or guest, and if on the host...how?
Nothing needs to be configured if you want to use the Linux kernel softdog. Yes, it runs on the host, but it is already the default.
You'd only need to configure something if you want to use a hardware-based watchdog, e.g., one that your server's CPU or management stack provides. IME those are not always "better" (some of those drivers/firmware are somewhat lacking), and the softdog is so simple and reliable that I've never seen it fail.
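
For anyone who wants to confirm this on their own host, here is a minimal Python sketch, not an official tool. It assumes, based on my reading of the linked documentation, that a hardware watchdog is selected via a `WATCHDOG_MODULE=` line in `/etc/default/pve-ha-manager` and that softdog is used when that line is absent or commented out. The function names are made up for illustration.

```python
from pathlib import Path

# Assumption: softdog is the fallback when no module is configured.
DEFAULT_MODULE = "softdog"


def configured_watchdog_module(config_text):
    """Return the module named by WATCHDOG_MODULE, or softdog if unset."""
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines and commented-out settings.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "WATCHDOG_MODULE":
            return value.strip().strip("\"'") or DEFAULT_MODULE
    return DEFAULT_MODULE


def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    return {ln.split()[0] for ln in proc_modules_text.splitlines() if ln.strip()}


if __name__ == "__main__":
    cfg = Path("/etc/default/pve-ha-manager")  # assumed location
    proc = Path("/proc/modules")
    module = configured_watchdog_module(cfg.read_text() if cfg.exists() else "")
    loaded = loaded_modules(proc.read_text() if proc.exists() else "")
    print(f"configured watchdog module: {module}")
    print(f"currently loaded: {module in loaded}")
```

Run as root on the host, not in a guest. If it reports `softdog` on an untouched install, that matches the "nothing to configure" answer above.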
 
I did. Since the section only references hardware watchdog, it led me to believe it was something to do on the host since it's the only thing that has 'real' hardware to it.

Oh, and out of interest: Which guides?
Just a search for 'configure proxmox watchdog' and the like; I looked through the first dozen or so links. I couldn't find a definitive answer along the lines of, "Set up watchdog on the Proxmox hosts, and if one fails, all HA resources will fail over to the surviving node(s)."
 
I kind of agree with Grunt: the explanations in this specific section of the user guide fall short (I'm talking about "§.15.7.1 How Proxmox VE Fences" and "§.15.7.2 Configure Hardware Watchdog").

I think it lacks a simple phrase such as:

"By default, Proxmox VE will use softdog, which is automatically enabled on any PVE system", or something similar describing what is enabled, what needs to be configured (or not), and how it operates.

As it stands, this section is confusing.
 