Partial failures and fencing

mgiammarco

Renowned Member
Feb 18, 2010
Hello,
Proxmox used to support hardware fencing devices and ILOM/IPMI fencing.
Now it supports only watchdog fencing.
I have a server in a Proxmox cluster that sometimes loses all of its hard disks: it is a partial failure.
The watchdog does not trigger; the server seems alive, but the Ceph OSDs are down.
Up to that point, it is not a big problem.
But without disks, the Ceph monitor does not work correctly and hangs the Ceph filesystem for the whole cluster.
Also, you cannot do anything on the server, because even commands such as "ls", "cp", or "reboot" do not work, since they need to read from the disks.
How can I solve this problem?
Thanks,
Mario
 
"I have a server in a proxmox cluster that, sometimes, loses all hard disks: it is a partial failure."
Wild guess: is it an HP with the RAID controller in JBOD mode?
 
It is a Supermicro server with an Areca 1883i RAID controller with battery backup, a Supermicro backplane, and 12G SAS disks.
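
For the partial failure described in this thread (disks gone, node still up, watchdog still being fed), one possible workaround is a small self-fencing daemon. What follows is only a rough sketch under several assumptions, not a Proxmox feature or a tested solution. It assumes the script is started while the disks are still healthy, so the interpreter is already in memory. It assumes the probed paths are block devices you choose yourself (the names in the usage note are hypothetical). It also assumes a plain read is a good enough probe, although a read could be served from the page cache and hide a dead disk. The reboot goes through /proc/sysrq-trigger, which is a kernel interface and does not need to read from disk. The function names probe_read, should_fence, fence_self and watch are made up for this illustration.

```python
import os
import threading
import time

SYSRQ_TRIGGER = "/proc/sysrq-trigger"  # kernel interface, not backed by a disk


def probe_read(path, timeout=5.0, nbytes=4096):
    """Return True if a read from path completes within timeout seconds."""
    result = {"ok": False}

    def _read():
        try:
            fd = os.open(path, os.O_RDONLY)
            try:
                os.read(fd, nbytes)
                result["ok"] = True
            finally:
                os.close(fd)
        except OSError:
            pass  # missing device or I/O error counts as a failed probe

    worker = threading.Thread(target=_read, daemon=True)
    worker.start()
    worker.join(timeout)  # a hung read is abandoned in its daemon thread
    return result["ok"]


def should_fence(history, max_failures):
    """True when the last max_failures probe rounds all failed."""
    recent = history[-max_failures:]
    return len(recent) == max_failures and not any(recent)


def fence_self(trigger_path=SYSRQ_TRIGGER):
    """Force an immediate reboot via magic SysRq 'b' (no sync, no unmount)."""
    with open(trigger_path, "w") as trigger:
        trigger.write("b")


def watch(paths, interval=10.0, max_failures=3, fence=fence_self, sleep=time.sleep):
    """Probe the paths periodically; fence once none is readable for max_failures rounds."""
    history = []
    while True:
        # A round is healthy if at least one path is still readable.
        history.append(any(probe_read(p) for p in paths))
        if should_fence(history, max_failures):
            fence()
            return
        history = history[-max_failures:]
        sleep(interval)
```

As a usage example, running `watch(["/dev/sda", "/dev/sdb"])` as root would reboot the node after three consecutive rounds in which neither device could be read. The device names are placeholders. The `fence` and `sleep` parameters exist so the loop can be tried out with a harmless callback before pointing it at the real SysRq trigger.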