DRBD config issues

skeates

New Member
Feb 12, 2015
I'm working through the following guide: http://pve.proxmox.com/wiki/DRBD. It all makes sense so far, but I'm hitting an issue at the "Prepare drbd configuration" section.

I've followed the guide and set up the /etc/drbd.d/global_common.conf file as well as the /etc/drbd.d/r0.res file.

When I try to start DRBD, I get what appear to be errors. The output is below:

Code:
root@prox01:/etc/drbd.d# /etc/init.d/drbd start
Starting DRBD resources:[ 
r0
no suitable meta data found :(
Command '/sbin/drbdmeta 0 v08 /dev/sdb1 internal check-resize' terminated with exit code 255
drbdadm check-resize r0: exited with code 255
d(r0) 0: Failure: (119) No valid meta-data signature found.

    ==> Use 'drbdadm create-md res' to initialize meta-data area. <==


[r0] cmd /sbin/drbdsetup 0 disk /dev/sdb1 /dev/sdb1 internal --set-defaults --create-device --no-disk-barrier --no-disk-flushes  failed - continuing!
 
s(r0) n(r0) ]..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 60 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 0 seconds. [wfc-timeout]
   (These values are for resource 'r0'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [  23]:
 To abort waiting enter 'yes' [  27]:yes

0: State change failed: (-2) Need access to UpToDate data
Command '/sbin/drbdsetup 0 primary' terminated with exit code 17
0: State change failed: (-2) Need access to UpToDate data
Command '/sbin/drbdsetup 0 primary' terminated with exit code 17
0: State change failed: (-2) Need access to UpToDate data
Command '/sbin/drbdsetup 0 primary' terminated with exit code 17
0: State change failed: (-2) Need access to UpToDate data
Command '/sbin/drbdsetup 0 primary' terminated with exit code 17
0: State change failed: (-2) Need access to UpToDate data
Command '/sbin/drbdsetup 0 primary' terminated with exit code 17
.
root@prox01:/etc/drbd.d#

This happens on both nodes. My two nodes are called prox01 and prox02. DNS is set up, so an nslookup of prox01 or prox02 resolves to the machine IP addresses, which are 192.168.1.10 (prox01) and 192.168.1.11 (prox02).

The two linked DRBD NICs are on 10.0.1.10 (prox01) and 10.0.1.11 (prox02).

My config file is below. I'm not quite sure what I am missing here. I did notice the shared-secret option, which at the moment is set to "my-secret". Should this be something specific to my setup, and if so, how do I find it?

Code:
resource r0 {
        protocol C;
        startup {
                wfc-timeout  0;     # non-zero wfc-timeout can be dangerous (http://forum.proxmox.com/threads/3465-Is-it-safe-to-use-wfc-timeout-in-DRBD-configuration)
                degr-wfc-timeout 60;
                become-primary-on both;
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "my-secret";
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                #data-integrity-alg crc32c;     # has to be enabled only for test and disabled for production use (check man drbd.conf, section "NOTES ON DATA INTEGRITY")
        }
        on prox01 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.0.1.10:7788;
                meta-disk internal;
        }
        on prox02 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.0.1.11:7788;
                meta-disk internal;
        }
        disk {
                # no-disk-barrier and no-disk-flushes should be applied only to systems with non-volatile (battery backed) controller caches.
                # Follow links for more information:
                # http://www.drbd.org/users-guide-8.3/s-throughput-tuning.html#s-tune-disable-barriers
                # http://www.drbd.org/users-guide/s-throughput-tuning.html#s-tune-disable-barriers
                no-disk-barrier;
                no-disk-flushes;
        }
}
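
On the shared-secret question: as far as I know this is not a value you look up anywhere. It is an arbitrary string you choose yourself (the drbd.conf man page allows up to 64 characters), and it has to be identical in r0.res on both nodes. A minimal sketch for generating one, assuming a shell with /dev/urandom, od and tr available; the variable name "secret" is only for illustration:

```shell
# Sketch: build a random hex string to use as the DRBD shared-secret.
# 24 random bytes become 48 hex characters, which stays under the
# 64-character limit mentioned above.
secret=$(head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n')

# Print the line to paste into the net { } section on BOTH nodes.
printf 'shared-secret "%s";\n' "$secret"
```

Any other way of picking the string works just as well; the only hard requirement is that both nodes use the same value.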

My /etc/hosts file on each node looks like the following:

Code:
127.0.0.1 localhost.localdomain localhost
192.168.1.10 prox01.fourthfloorsolutions.com prox01 pvelocalhost
192.168.1.11 prox02.fourthfloorsolutions.com prox02 pvelocalhost

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Any help would be great.
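
The startup log above already hints at a likely first step: "Use 'drbdadm create-md res' to initialize meta-data area", and the "Need access to UpToDate data" failures are consistent with a device that has never been synced. A rough sketch of that sequence, as I understand it from the linked wiki (DRBD 8.3-style syntax assumed). The run() wrapper and the DRBD_APPLY switch are made up for this example so that nothing is executed by accident:

```shell
#!/bin/sh
# Sketch only: by default every command is printed, not executed.
# DRBD_APPLY=1 is a hypothetical switch for this example that really runs them.
run() {
    if [ "${DRBD_APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# On BOTH nodes: write the missing meta-data, then bring r0 up.
init_resource() {
    run drbdadm create-md r0
    run drbdadm up r0
}

# On ONE node only: start the initial full sync towards the peer.
# This overwrites whatever is on the peer's backing disk (/dev/sdb1 here).
start_initial_sync() {
    run drbdadm -- --overwrite-data-of-peer primary r0
}

init_resource
```

start_initial_sync is deliberately not called by the script, because it must only be run on the node whose data should win.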
 
OK, so after rebuilding the servers a few times I've finally managed to iron out most of these issues. I think most of them were down to me missing some of the information in the wiki, and others were just down to not understanding some of the information correctly.

I have now got to the point where I am trying to create the LVM physical volume, and when I run the following command:

Code:
pvcreate /dev/drbd0

I get the following output:

Code:
Device /dev/drbd0 not found (or ignored by filtering).

If I run:

Code:
pvcreate -vvv /dev/drbd0

I get:

Code:
Setting activation/monitoring to 0
        Processing: pvcreate -vvv /dev/drbd0
        O_DIRECT will be used
      Setting global/locking_type to 1
      Setting global/wait_for_locks to 1
      File-based locking selected.
      Setting global/locking_dir to /run/lock/lvm
      Setting global/prioritise_write_locks to 1
      metadata/pvmetadataignore not found in config: defaulting to n
      metadata/pvmetadatasize not found in config: defaulting to 255
      metadata/pvmetadatacopies not found in config: defaulting to 1
      Locking /run/lock/lvm/P_orphans WB
        _do_flock /run/lock/lvm/P_orphans:aux WB
        _do_flock /run/lock/lvm/P_orphans WB
        _undo_flock /run/lock/lvm/P_orphans:aux
        /dev/drbd0: Added to device cache
        Opened /dev/drbd0 RO O_DIRECT
      /dev/drbd0: size is 860022880 sectors
        /dev/drbd0: block size is 4096 bytes
        /dev/drbd0: Skipping: Partition table signature found
        Closed /dev/drbd0
        /dev/drbd0: Skipping (cached)
      Setting devices/sysfs_scan to 1
      Setting devices/md_component_detection to 1
      Setting devices/multipath_component_detection to 1
      Setting devices/ignore_suspended_devices to 0
      Setting devices/cache_dir to /run/lvm
      Setting devices/write_cache_state to 1
        Opened /dev/drbd0 RO O_DIRECT
      /dev/drbd0: size is 860022880 sectors
        /dev/drbd0: block size is 4096 bytes
        /dev/drbd0: Skipping: Partition table signature found
        Closed /dev/drbd0
  Device /dev/drbd0 not found (or ignored by filtering).
      Unlocking /run/lock/lvm/P_orphans
        _undo_flock /run/lock/lvm/P_orphans
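
In a trace this long the interesting part is easy to miss: the device is opened and sized, and then LVM skips it with "Partition table signature found". A small sketch for pulling just those lines out of the verbose output; the function name skip_reasons is invented for this example:

```shell
# Sketch: reduce verbose LVM output (read from stdin) to the unique
# lines that say why a device was skipped.
skip_reasons() {
    grep 'Skipping' | sed 's/^[[:space:]]*//' | sort -u
}

# Possible use (same command as above, so it has the same effect if it succeeds):
#   pvcreate -vvv /dev/drbd0 2>&1 | skip_reasons
```

Fed the output above, this leaves two lines: the "Skipping (cached)" note and the partition table message, which suggests LVM is rejecting /dev/drbd0 rather than failing to find it.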

So I did some more digging and ran:

Code:
pvscan

which returned:

Code:
PV /dev/sda3   VG pve   lvm2 [136.20 GiB / 16.00 GiB free]
  Total: 1 [136.20 GiB] / in use: 1 [136.20 GiB] / in no VG: 0 [0   ]

So if I understand this correctly, is /dev/drbd0 looking at /dev/sda3, which would already have a file system on it, instead of /dev/sdb1, which does not?

If that is the case, how do I point /dev/drbd0 to /dev/sdb1?
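
One way to sanity-check this from the numbers already posted is to compare sizes. The verbose output reports /dev/drbd0 as 860022880 sectors, while pvscan shows /dev/sda3 at 136.20 GiB. A quick sketch of the conversion, assuming LVM counts 512-byte sectors (an assumption on my part, not something the output states):

```shell
# Sketch: convert the sector count reported for /dev/drbd0 into GiB.
# Assumes 512-byte sectors; integer arithmetic, so the result is rounded down.
sectors=860022880
gib=$(( sectors * 512 / 1024 / 1024 / 1024 ))
echo "/dev/drbd0 is about ${gib} GiB"
```

Under that assumption the device comes out at roughly 410 GiB, about three times the size of /dev/sda3, so it does not look like drbd0 is sitting on top of sda3.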
 
The output of pvscan is not related to DRBD.

If /dev/drbd0 is missing, then something is wrong: DRBD is not started, drbd0 is not primary, or maybe there is a configuration issue.

What's the output of "cat /proc/drbd"?

No offense intended, but be sure you really understand DRBD before using it in production. You can lose data very easily; it happened to me once, about 10 years ago.
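
For the /proc/drbd check, the fields that matter most are the connection state (cs:), the roles (ro:) and the disk states (ds:). A small sketch that pulls them out for one device, assuming the DRBD 8.x layout where each device line starts with its minor number followed by a colon; the function name drbd_state is invented for this example:

```shell
# Sketch: print the cs:/ro:/ds: fields for one DRBD minor.
#   $1 = minor number (0 for /dev/drbd0)
#   $2 = status file, defaults to /proc/drbd
drbd_state() {
    awk -v dev="$1:" '$1 == dev { print $2, $3, $4 }' "${2:-/proc/drbd}"
}

# Only query the live status file if DRBD is actually loaded.
if [ -r /proc/drbd ]; then
    drbd_state 0
fi
```

For the dual-primary setup in this thread, the ds: field would need to show UpToDate on both sides before both nodes can become primary.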
 
Thanks for the advice. I've since rebuilt again and it seems to be working now. No offense taken. I'm trying to get my head around how the whole system works and how to put it together correctly, so I can put it into testing and then into production. To be honest, I'll probably go with a support plan for this, but before then I want to understand as much as possible about the system and how it all fits together.