DRBD in HA - PRA

Cyt

Apr 29, 2013
Hello,

I have a two-node DRBD cluster.

The configuration is as follows.

/etc/drbd.conf:

Code:
# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example
 
include "drbd.d/global_common.conf";
include "drbd.d/*.res";

drbd.d/global_common.conf:

Code:
global { usage-count no; }
common {
    protocol C;
    startup {
        degr-wfc-timeout 120;
#        become-primary-on proxmox001;
        become-primary-on both;
    }
    disk {
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    syncer {
        verify-alg md5;
        rate 30M;
    }
}
drbd.d/r0.res:
Code:
resource r0 {
        protocol C;
        on proxmox001 {
                device /dev/drbd0;
                disk /dev/mapper/pve-lv_data;
                address 192.168.0.1:7788;
                meta-disk internal;
        }
        on proxmox002 {
                device /dev/drbd0;
                disk /dev/mapper/pve-lv_data;
                address 192.168.0.2:7788;
                meta-disk internal;
        }
}

When one of the physical hosts becomes unstable, I lose the DRBD link.

In normal operation, /proc/drbd shows:

Code:
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@proxmox001, 2013-04-24 12:55:32
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:421142 nr:55715 dw:8498959 dr:10994034 al:1144 bm:420 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0


During an instability, /proc/drbd reports:

Code:
cs:Standalone or cs:WFConnection st:Secondary/Unknown
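
To tell the healthy state from the broken one without reading /proc/drbd by eye, a small helper along these lines could extract the cs: field. This is only a sketch: the function names drbd_cs and drbd_link_status are made up for illustration, and it assumes the DRBD 8.x /proc/drbd line format shown above.

```shell
#!/bin/sh
# Extract the connection state (the cs: field) from one /proc/drbd status line.
# $1: a line such as " 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----"
drbd_cs() {
    printf '%s\n' "$1" | sed -n 's/.*cs:\([A-Za-z]*\).*/\1/p'
}

# Classify the connection state of one status line (hypothetical helper).
drbd_link_status() {
    case "$(drbd_cs "$1")" in
        Connected)                          echo ok ;;
        StandAlone|Standalone|WFConnection) echo disconnected ;;
        "")                                 echo unknown ;;   # line has no cs: field
        *)                                  echo other ;;     # e.g. a sync state
    esac
}
```

On a node, it could be fed from the live file, for example: grep 'cs:' /proc/drbd | while IFS= read -r line; do drbd_link_status "$line"; done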

When I then try to restart the DRBD sync, I get either an error about UpToDate data or:
Code:
ERROR: Module drbd is in use proxmox

For now, the only workaround is to stop the running VMs, restart the DRBD service on each node, and run a resync. After that, I start the VMs that were stopped.

Thanks in advance for your help!


EDIT:


Hello all,

I tried various tests that Google suggested:



Code:
root@proxmox001:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2
  Found volume group "drbdvg" using metadata type lvm2


root@proxmox001:~# vgchange -an /dev/drbdvg
  Can't deactivate volume group "drbdvg" with 1 open logical volume(s)


root@proxmox001:~# /sbin/vgchange -a y
  4 logical volume(s) in volume group "pve" now active
  2 logical volume(s) in volume group "drbdvg" now active

root@proxmox001:~# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@proxmox001, 2013-04-24 12:55:32
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:8523047 dr:11025118 al:1147 bm:421 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8080

root@proxmox001:~# drbdadm connect all

root@proxmox001:~# drbdadm verify r0
0: State change failed: (-15) Need a connection to start verify or resync
Command 'drbdsetup 0 verify' terminated with exit code 11

root@proxmox001:~# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@proxmox001, 2013-04-24 12:55:32
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:8523371 dr:11025534 al:1147 bm:421 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8288

root@proxmox001:~# drbdadm secondary all
0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 0 secondary' terminated with exit code 11

root@proxmox001:~# drbdadm up r0
0: Failure: (124) Device is attached to a disk (use detach first)
Command 'drbdsetup 0 disk /dev/mapper/pve-lv_data  /dev/mapper/pve-lv_data internal --set-defaults --create-device'  terminated with exit code 10
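
Both the vgchange and the "drbdadm secondary" failures above say that something still holds a device open, most likely a logical volume in drbdvg used by a running VM (that is my guess, not something the output proves). A sketch for listing the open logical volumes, assuming the standard lv_attr layout where the sixth character is "o" for an open device; the function name open_lvs is hypothetical:

```shell
#!/bin/sh
# Print the names of logical volumes whose lv_attr marks the device as open.
# Input on stdin: lines of "<lv_name> <lv_attr>", as printed by
#   lvs --noheadings -o lv_name,lv_attr drbdvg
open_lvs() {
    # The 6th character of lv_attr is "o" when the device is open.
    awk 'NF >= 2 && substr($2, 6, 1) == "o" { print $1 }'
}
```

It would be used as: lvs --noheadings -o lv_name,lv_attr drbdvg | open_lvs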


Have a good day, and thanks in advance.



EDIT:


I now use the following procedure to restart and resync DRBD:

- I stop the running VMs
- I deactivate the volume group
- I restart the DRBD service, and the resync starts


Code:
root@proxmox001:~# service drbd stop
Stopping all DRBD resources:/dev/drbd0: State change failed: (-12) Device is held open by someone
ERROR: Module drbd is in use
.
root@proxmox001:~# drbdadm detach r0
0: State change failed: (-2) Need access to UpToDate data
Command 'drbdsetup 0 detach' terminated with exit code 17

root@proxmox001:~# vgchange -an /dev/drbdvg
  0 logical volume(s) in volume group "drbdvg" now active

root@proxmox001:~# service drbd stop
Stopping all DRBD resources:.

root@proxmox001:~# service drbd start
Starting DRBD resources:[ d(r0) s(r0) n(r0) ].

root@proxmox001:~#
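
The working sequence from the transcript above could be wrapped in a small script like this. It is a sketch, not a tested tool: "qm shutdown" for stopping the VMs and the VM IDs in the usage line are my assumptions (the transcript does not show how the VMs were stopped), and the names run and restart_drbd are made up. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the manual restart procedure. By default it only prints
# the commands; set DRY_RUN=0 to really execute them (this interrupts the VMs).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: volume group on top of DRBD; remaining arguments: IDs of the VMs to stop.
restart_drbd() {
    vg="$1"; shift
    for vmid in "$@"; do
        run qm shutdown "$vmid" || return 1   # assumption: VMs are stopped with qm
    done
    run vgchange -an "$vg" || return 1        # release the LVs that keep drbd0 open
    run service drbd stop || return 1
    run service drbd start || return 1
    # The VMs are started again by hand once /proc/drbd shows the resync is done.
}
```

Example dry run: restart_drbd drbdvg 100 101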


But this still means a production interruption every time...

Thanks
 
