DRBD issue after upgrading one of 2 nodes

RobFantini

Famous Member
May 24, 2012
Boston,Mass
Hello,
After upgrading one of our 2 cluster nodes (fbc243) to Proxmox 3, we have some DRBD issues.

We use a primary/primary setup.

During the upgrade, the file /etc/drbd.d/global_common.conf was changed on the upgraded node, so it no longer matched the other node. This resulted in the following:
node fbc241:
Code:
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51

 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:792045025 nr:6368318 dw:815806772 dr:756083736 al:7683902 bm:17026 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:6640200
 2: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:1268921457 dr:499815611 al:132 bm:241 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:114800

node fbc243:
Code:
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51

 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:12400 dr:80812 al:47 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:13808
 2: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:327917 dr:2087349 al:1308 bm:180 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:2584024
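
For anyone who wants to catch this state from a script, here is a minimal sketch that pulls the connection state (the cs: field) out of /proc/drbd text in the 8.3 format shown above. The function name and the choice to flag everything other than Connected are assumptions for illustration; states such as SyncSource or SyncTarget would also be reported and are not necessarily a problem.

```python
import re

# Matches the per-resource status line of /proc/drbd as printed above, e.g.
# " 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----"
STATUS_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")


def disconnected_resources(proc_drbd_text):
    """Return (minor, connection_state) for every resource whose cs: is not Connected."""
    problems = []
    for line in proc_drbd_text.splitlines():
        match = STATUS_LINE.match(line)
        if match and match.group(2) != "Connected":
            problems.append((int(match.group(1)), match.group(2)))
    return problems


# Status lines copied from node fbc241 in this post.
sample = (
    " 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----\n"
    " 2: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----\n"
)

print(disconnected_resources(sample))  # [(1, 'StandAlone'), (2, 'StandAlone')]
```

On a live node the input would be the contents of /proc/drbd read as text.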

When I started a manual recovery, there was an error message about verify-alg being different. I checked and found that /etc/drbd.d/global_common.conf had been changed.

Has anyone else run into this? We will get it fixed, but it is a PIA. In Debian, configuration files are not supposed to get clobbered, so there could be something wrong with how I ran the upgrade script, or with the script itself.
 
This issue was caused by this file getting overwritten during the upgrade: /etc/drbd.d/global_common.conf

DRBD did not like the fact that the files were different on the two nodes. We had set this per the PVE wiki: verify-alg sha1. After the upgrade there was a different verify-alg on each node, so DRBD stopped working. We were left with split brain, and although we had not dealt with a DRBD issue in over a year, the docs at drbd.com had the answers.
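
A rough sketch of how the verify-alg mismatch could be spotted before DRBD complains: extract the setting from each node's config text and compare. The two config strings below are hypothetical examples, not the actual files from fbc241 and fbc243, and the function name is made up for illustration.

```python
import re


def verify_alg(config_text):
    """Return the verify-alg value found in DRBD config text, or None if it is not set."""
    for line in config_text.splitlines():
        line = line.split("#", 1)[0]  # ignore anything after a comment marker
        match = re.search(r"\bverify-alg\s+([\w-]+)\s*;", line)
        if match:
            return match.group(1)
    return None


# Hypothetical contents: one node keeps the setting, the other has it commented out.
node_a = "common {\n  syncer {\n    verify-alg sha1;\n  }\n}\n"
node_b = "common {\n  syncer {\n    # verify-alg sha1;\n  }\n}\n"

if verify_alg(node_a) != verify_alg(node_b):
    print("verify-alg differs:", verify_alg(node_a), "vs", verify_alg(node_b))
```

This only looks at one setting; any other difference between the two files would go unnoticed.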

What I learned: upgrade both DRBD nodes at the same time, and make sure the configs are the same. I had tried to do one node in the morning, planning to do the other at noon.
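
One simple way to confirm the configs are the same is to compare checksums. This is only a sketch: it assumes the peer's global_common.conf has already been copied to the local machine by some other means (scp, for example), and both helper names are hypothetical.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of the file at path."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def configs_match(local_path, peer_copy_path):
    """True if the local config and the copy fetched from the peer are byte-identical."""
    return file_digest(local_path) == file_digest(peer_copy_path)
```

A call might look like configs_match("/etc/drbd.d/global_common.conf", "/tmp/peer_global_common.conf"), where the second path is wherever the peer's copy was saved. A byte-for-byte comparison is strict, so even a whitespace or comment difference counts as a mismatch.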
 
Thanks for sharing this. I will be upgrading 16 DRBD nodes to 3.x soon. If config files are the only upgrade-related problem, things should go smoothly.

The problem you had is one reason I love Chef.
http://www.opscode.com/chef/

All of my Proxmox nodes are configured by Chef, including setting up the DRBD config files and even performing the initial pairing and sync of the DRBD volumes.