drbd problem with 2-node HA

zordio

I have a 2-node HA cluster set up as per http://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster. When I first bring the servers up, clustat reports both nodes and the quorum disk as online. However, if I reboot a node, it cannot connect to DRBD on the second node. The second node also cannot be rebooted gracefully; it must be powered off.

The output of /proc/drbd on the first node:
Code:
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
   ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:16384

The output of /proc/drbd on the second node:
Code:
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   s-----
    ns:16396 nr:0 dw:0 dr:16476 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
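
To compare the two outputs side by side, the connection state (cs), roles (ro) and disk states (ds) can be pulled out of each resource line. The following is only a rough sketch: the function name parse_drbd_status and the regular expression are illustrative assumptions, written against the DRBD 8.3 line format shown above, and may not match other DRBD versions.

```python
import re

# Assumed pattern for a DRBD 8.3-style resource line, for example:
#  0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"\s+ro:(?P<ro_local>[^/\s]+)/(?P<ro_peer>\S+)"
    r"\s+ds:(?P<ds_local>[^/\s]+)/(?P<ds_peer>\S+)"
)


def parse_drbd_status(text):
    """Return one dict per resource line found in /proc/drbd-style text.

    Hypothetical helper: lines that do not look like a resource line
    (version, GIT-hash, counters) are skipped.
    """
    resources = []
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if match:
            fields = match.groupdict()
            fields["minor"] = int(fields["minor"])
            resources.append(fields)
    return resources


if __name__ == "__main__":
    # Reads the live status file; only meaningful on a host running DRBD.
    with open("/proc/drbd") as handle:
        for res in parse_drbd_status(handle.read()):
            print(res["minor"], res["cs"], res["ro_local"], res["ds_local"])
```

Run against the outputs above, this would give cs WFConnection on the first node (waiting for the peer to become visible) and cs StandAlone on the second (not attempting to connect at all), which is the mismatch worth looking at first.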

Any suggestions? Any other info needed?
 
The real problem with the post was that I have an old version of Chrome. Fixed now. Problem with servers remains.
 
Hi, yes, understood. For what it is worth, my testing with a 2-node HA DRBD config left me not keen to use it in production. Behaviour when one host was unexpectedly rebooted (i.e. simulating hardware failure and recovery) was not great, at least in my tests. This was a while ago, however.

My general experience to date is that decent server boxes (redundant power, RAID/redundant HDD) are very unlikely to suffer hardware-related crashes or failures. Human error, network glitches, and OS-level failures such as a kernel panic or BSOD are more common than outright failure of redundant hardware, IMHO. Because of that, the benefits of "HA", weighed against the added complexity in management and operation, are simply not worth it. I do appreciate that some people want these sorts of features and choose to configure them. I'm just speaking from my own experience.

Tim