The both nodes are stoping by idrac at the same time

djahida

Active Member
Mar 24, 2015
44
0
26
Hi,
I configure a cluster Proxmox HA with 3 nodes (Two proxmox + one quorum)
The fencing is by iDRAC express
The problem is when I test, I have the both nodes stoping by idrac at the same time.
Proxmox version's: 3.4.1
Do you have this problem before? or do you have some suggestion?
Thank you
 
Hi,

I don't see the third node in cluster.conf.

also

"Nov 20 10:45:52 SRPP104 corosync[15661]: [CPG ] chosen downlist: sender r(0) ip(192.168.100.151) ; members(old:2 left:1)"

Show that only 2 nodes are in the cluster.


what is the output of :

#pvecm status

?

(when all the nodes are online)

 
Oh, ok sorry.

This is strange than in log, you lost contact to quorum device.

Nov 20 10:45:52 SRPP104 corosync[15661]: [CMAN ] lost contact with quorum device
Nov 20 10:45:52 SRPP104 corosync[15661]: [CMAN ] quorum lost, blocking activity


How do you test the HA feature ? killing a node ?
 
can you test with simulate a kernel panic ?

echo c > /proc/sysrq-trigger

Sorry, for the late reply
I have not test "echo c > /proc/sysrq-trigger" yet.
I juste want to precise that I have this problem, in this case:
When I unplug the cable from eth 0 on the first machine
the first will stopped and all VM are migrated to the second, (the cluster works fine)
I start the first machine, when I'm sure that the cluster worked .I repeat the same actions on the same machine, the first one.
and here I have a probleme, both machines are stopped.
Do you think, that the way of testing is wrong?
 
Last edited:
Sorry, for the late reply
I have not test "echo c > /proc/sysrq-trigger" yet.
I juste want to precise that I have this problem, in this case:
When I unplug the cable from eth 0 on the first machine
the first will stopped and all VM are migrated to the second, (the cluster works fine)
I start the first machine, when I'm sure that the cluster worked .I repeat the same actions on the same machine, the first one.
and here I have a probleme, both machines are stopped.
Do you think, that the way of testing is wrong?
Hi spirit,
I test the simulate kernel panic with echo c > /proc/sysrq-trigger
The cluster woks fine
Do you have any idea why I have the both nodes stopped when I test whith puling the network cable?
Thanks