Fencing device on Proxmox 8.x?

Pigi_102

May 24, 2024
Hello all, I'm trying to find a way to set up and configure fencing devices on a test cluster, but I can't find any good documentation.
Section 15.7 of the manual contains very little information about this, although the feature is reported to be present.
HA Manager Fencing
Can someone point me to documentation on configuring and using a fence device (APC/iLO/iDRAC, for example)?
If available, some good documentation on the different kinds of watchdog modules would also be nice.

Thanks in advance

Pigi_102
 
Hello, and thanks for the reply!
For work, I install and configure quite a lot of Linux clusters, and fencing is a must: it ensures that a split-brain can never happen. Also, in a RHEL cluster (with official RH support), a production environment without fencing, or with only soft fencing, is not even supported.
I've also configured quite a few corosync/pacemaker clusters on other distros, always with a fence device.
It sounds strange to me to have a corosync/pacemaker cluster without one. I feel more comfortable with some sort of STONITH (shoot the other node in the head) mechanism than with a suicide one (which is what you get with a watchdog), especially when you have some kind of shared storage without an extremely strong reservation mechanism.
In the old Sun Cluster, for example, there was a package for NFS storage that modified the exports on the (supported) storage to block access to the shares in case of a split-brain.
I thought it would be nice to have a STONITH fence (like the one used in the old PVE 3.x series).

The link you provided has some documentation, but unfortunately not much, or at least that's my understanding (and it is still a suicide mechanism, not STONITH).
But really, thank you for the answer!
 
The node fences itself if it fails to connect to the cluster (i.e. when it loses corosync quorum) for longer than 60 seconds. While it is true that the node itself is responsible for its own fencing, we have not had issues in practice with this difference. As a second layer of protection, everything in the cluster filesystem is mounted read-only as soon as a node loses quorum.

AFAIK the only thing that would prevent a node from fencing itself would be the kernel crashing (which would crash the watchdog), but at that point there should be no risk anyway, since guests are no longer able to reach shared resources.
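
To make the timing described above concrete, here is a minimal Python sketch of the self-fencing rule. It is only an illustration, not the actual Proxmox HA code: the names `SelfFenceTimer`, `tick`, and `cluster_fs_writable` are hypothetical, and the only value taken from this thread is the 60-second limit.

```python
from dataclasses import dataclass

# Limit quoted in the post above; everything else here is an assumption.
FENCE_TIMEOUT_S = 60.0


@dataclass
class SelfFenceTimer:
    """Hypothetical model of a node that fences itself after losing quorum."""

    timeout_s: float = FENCE_TIMEOUT_S
    last_quorate_at: float = 0.0
    fenced: bool = False

    def tick(self, now: float, quorate: bool) -> bool:
        """Record one quorum observation; return True once the node is fenced."""
        if self.fenced:
            # Fencing is a reset in practice, so it is never undone here.
            return True
        if quorate:
            # Quorum is present: restart the countdown.
            self.last_quorate_at = now
        elif now - self.last_quorate_at > self.timeout_s:
            # Quorum has been missing for longer than the limit.
            self.fenced = True
        return self.fenced


def cluster_fs_writable(quorate: bool) -> bool:
    """Second layer of protection: no writes as soon as quorum is lost."""
    return quorate


if __name__ == "__main__":
    timer = SelfFenceTimer()
    for now, quorate in [(0, True), (30, False), (61, False)]:
        print(now, quorate, timer.tick(now, quorate), cluster_fs_writable(quorate))
```

In this model a short outage that ends before the limit resets the countdown, while an outage longer than 60 seconds fences the node for good. The cluster filesystem stops accepting writes immediately in both cases.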
 
Thanks for your explanation!
I'll take a closer look at the link you provided in your previous post for the other kinds of watchdog, just to see how they work.
 
