Proxmox VE 4.0 beta1 released!

All new mainboards have a hardware watchdog, so there is normally no need for an extra device. If not, softdog is part of the kernel and works as well.
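Roughly, a watchdog-based self-fencing loop looks like this (a minimal Python sketch against the standard Linux /dev/watchdog interface; the device path, interval and the node_is_healthy() check are illustrative, not the actual Proxmox code):

import time

WATCHDOG_DEV = "/dev/watchdog"   # assumption: default device node
PET_INTERVAL = 5                 # seconds; must stay well below the watchdog timeout

def node_is_healthy() -> bool:
    return True                  # placeholder for a real health/quorum check

# Opening the device arms the watchdog; every write resets its timer.
# If this loop ever stalls (kernel hang, process killed, lost quorum),
# the watchdog expires and resets the node -- that is the self-fencing part.
with open(WATCHDOG_DEV, "wb", buffering=0) as wd:
    try:
        while node_is_healthy():
            wd.write(b"\0")
            time.sleep(PET_INTERVAL)
    finally:
        wd.write(b"V")           # "magic close": disarm the watchdog on a clean exit

This works the same way for a hardware watchdog and for softdog; only the driver behind /dev/watchdog differs.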
 
All new mainboards have a hardware watchdog, so there is normally no need for an extra device. If not, softdog is part of the kernel and works as well.
Would the hardware watchdog work in conjunction with shared storage fencing, e.g. the way Pacemaker uses sbd?

In any of these cases, how do the other nodes know for certain that the rogue node has been killed? Do they simply wait for the watchdog timer period and then after that assume that the node has been killed?
 
It doesn't matter whether the node is unreachable from the cluster or powered off.
If the rest of the cluster has quorum, the cluster knows that it is fine and that the one missing node is not.
So after a time window (which depends on several factors) the cluster treats that node as down, because it must have fenced itself (self-fencing).
If no partition has quorum, the real state doesn't matter: there is no node that may make a decision.
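To make the majority rule above concrete, here is a minimal sketch (plain majority voting in Python, not the actual corosync/votequorum code):

def has_quorum(votes_seen: int, expected_votes: int) -> bool:
    # A partition may act only if it holds more than half of all expected votes.
    return votes_seen > expected_votes // 2

expected = 3
print(has_quorum(2, expected))   # True:  the two remaining nodes keep quorum and
                                 #        assume the missing node has self-fenced
print(has_quorum(1, expected))   # False: the isolated node must not make decisions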
 
We use a distributed locking mechanism, combined with the watchdog feature.
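Very roughly, the interplay of the two looks like this (a sketch under assumed timings; the names and values are illustrative, not the actual Proxmox HA implementation):

LEASE_LIFETIME = 120       # seconds a node's cluster lock (lease) stays valid -- illustrative
WATCHDOG_TIMEOUT = 60      # seconds until an unfed watchdog resets the node -- illustrative

def owner_already_reset(seconds_since_last_renewal: float) -> bool:
    # The owner stops feeding its watchdog as soon as it can no longer renew its lease.
    return seconds_since_last_renewal > WATCHDOG_TIMEOUT

def may_take_over(seconds_since_last_renewal: float) -> bool:
    # Another node may recover the VM only once the old lease has certainly expired.
    return seconds_since_last_renewal > LEASE_LIFETIME

# Because WATCHDOG_TIMEOUT < LEASE_LIFETIME, at the moment a takeover becomes
# legal the old owner has already been reset and cannot touch the VM image anymore.
print(owner_already_reset(130), may_take_over(130))   # True True

The key design choice is simply that the watchdog timeout is shorter than the lock lifetime.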
wolfgang said:
It doesn't matter whether the node is unreachable from the cluster or powered off.
If the rest of the cluster has quorum, the cluster knows that it is fine and that the one missing node is not.

The concern I have is if the rogue node were to "recover" from the failed state and then start writing bad data to the shared storage (e.g. DRBD, Ceph, NFS, etc.) at the same time as the new node that took over its VMs. If this were to occur, wouldn't it result in two nodes writing data to the same VM image files at once and thus corrupting them?
 
The concern I have is if the rogue node were to "recover" from the failed state and then start writing bad data to the shared storage (e.g. DRBD, Ceph, NFS, etc.) at the same time as the new node that took over its VMs. If this were to occur, wouldn't it result in two nodes writing data to the same VM image files at once and thus corrupting them?

Our software solves exactly that problem. Or what do you think the software is for?
 
Our software solves exactly that problem. Or what do you think the software is for?
Is the mechanism it uses to solve that problem generic, i.e. does it work with any shared storage backend (DRBD, NFS, Ceph, Gluster, iSCSI, etc.)? Does it intercept I/O requests to the shared storage, or how does it work?
 
Is it still recommended to use a 32-bit OS in an LXC container to save resources?
 
I see in the docs that the minimum number of nodes for an HA configuration is 3. Is a 2-node cluster no longer supported?
 
As I wrote before, Corosync no longer has a quorum disk. That's the point.
And it is not impossible, but it is not reliable. Where is the tie-breaker when a 2-node cluster splits?
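For illustration: quorum needs more than half of all votes, i.e. floor(N/2) + 1. With N = 2 that is 2 votes, so after a split each side holds only 1 of 2 votes and neither side has quorum or may recover anything. With N = 3 the threshold is 2 votes, so the larger partition can still make decisions after losing one node.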
 
Do you mean using:

quorum {
  provider: corosync_votequorum
  two_node: 1
}

I'm no cluster expert, but how do other products (e.g. VMware) deal with 2-node clusters without fencing and quorum disks?
 
I don't know what other products (e.g. VMware) do.
But I know one thing for sure: it is not possible to build a reliable HA cluster with fewer than 3 decision makers.
This can be implemented in different ways, e.g. with shared storage.
See https://en.wikipedia.org/wiki/Byzantine_fault_tolerance
 