DRBD comes up with Connected Diskless/Diskless after reboot

Gabor Koltai

Mar 12, 2019
After a reboot, DRBD comes up with Connected Diskless/Diskless status.

About the environment:

This DRBD resource is normally used as block storage for LVM, which is configured as shared LVM storage for a Proxmox VE 5.3-8 cluster. LVM is configured on top of the DRBD block device, but in the LVM config on the DRBD hosts the backing device below the DRBD service (/dev/nvme0n1p1) is filtered out (/etc/lvm/lvm.conf is shown below).

Things that did NOT help:

  • Rebooting again
  • Restarting the drbd service
  • drbdadm detach/disconnect/attach followed by a service restart
  • nfs-kernel-server is not configured on these DRBD nodes (so there is no NFS server to unconfigure)

After some investigation:

dump-md responds: Found meta data is "unclean", please apply-al first
The apply-al command terminated with exit code 20 and this message:
open(/dev/nvme0n1p1) failed: Device or resource busy
It seems the problem is that the device (/dev/nvme0n1p1) used by my
DRBD resource config cannot be opened exclusively.


Failing DRBD commands:

root@pmx0:~# drbdadm attach r0
open(/dev/nvme0n1p1) failed: Device or resource busy
Operation canceled.

Command 'drbdmeta 0 v08 /dev/nvme0n1p1 internal apply-al' terminated with exit code 20
root@pmx0:~# drbdadm apply-al r0
open(/dev/nvme0n1p1) failed: Device or resource busy
Operation canceled.

Command 'drbdmeta 0 v08 /dev/nvme0n1p1 internal apply-al' terminated with exit code 20

root@pmx0:~# drbdadm dump-md r0
open(/dev/nvme0n1p1) failed: Device or resource busy

Exclusive open failed. Do it anyways?
[need to type 'yes' to confirm] yes

Found meta data is "unclean", please apply-al first
Command 'drbdmeta 0 v08 /dev/nvme0n1p1 internal dump-md' terminated with exit code 255



DRBD service status/commands:

root@pmx0:~# drbd-overview
0:r0/0 Connected Secondary/Secondary Diskless/Diskless
root@pmx0:~# drbdadm dstate r0
Diskless/Diskless
root@pmx0:~# drbdadm disconnect r0
root@pmx0:~# drbd-overview
0:r0/0 . . .
root@pmx0:~# drbdadm detach r0
root@pmx0:~# drbd-overview
0:r0/0 . . .



Trying to reattach resource r0:

root@pmx0:~# drbdadm attach r0
open(/dev/nvme0n1p1) failed: Device or resource busy
Operation canceled.
Command 'drbdmeta 0 v08 /dev/nvme0n1p1 internal apply-al' terminated with exit code 20
root@pmx0:~# drbdadm apply-al r0
open(/dev/nvme0n1p1) failed: Device or resource busy
Operation canceled.
Command 'drbdmeta 0 v08 /dev/nvme0n1p1 internal apply-al' terminated with exit code 20



lsof and fuser produce no output:

root@pmx0:~# lsof /dev/nvme0n1p1
root@pmx0:~# fuser /dev/nvme0n1p1
root@pmx0:~# fuser /dev/nvme0n1
root@pmx0:~# lsof /dev/nvme0n1


Resource disk partition and LVM config:

root@pmx0:~# fdisk -l /dev/nvme0n1
Disk /dev/nvme0n1: 1.9 TiB, 2048408248320 bytes, 4000797360 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x59762e31

Device Boot Start End Sectors Size Id Type
/dev/nvme0n1p1 2048 3825207295 3825205248 1.8T 83 Linux
root@pmx0:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sdb2 pve lvm2 a-- 135.62g 16.00g
root@pmx0:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- 135.62g 16.00g
root@pmx0:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
data pve twi-a-tz-- 75.87g 0.00 0.04
root pve -wi-ao---- 33.75g
swap pve -wi-ao---- 8.00g
root@pmx0:~# vi /etc/lvm/lvm.conf
root@pmx0:~# cat /etc/lvm/lvm.conf | grep nvm
filter = [ "r|/dev/nvme0n1p1|", "a|/dev/sdb|", "a|sd.*|", "a|drbd.*|", "r|.*|" ]
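
To make it easier to reason about what a filter line like the one above does, here is a minimal shell sketch of first-match evaluation: patterns are tried in order, the first one that matches decides, and a path that matches nothing is accepted. The function name lvm_filter_decision is made up for illustration, and grep -E only approximates LVM's own regex engine. LVM also checks every alias of a device, which this sketch ignores.

```shell
# Decide whether an LVM-style filter list accepts or rejects a device path.
# Usage: lvm_filter_decision PATH PATTERN...
# Each PATTERN looks like 'a|regex|' (accept) or 'r|regex|' (reject).
# Hypothetical helper: it approximates LVM's matching with grep -E.
lvm_filter_decision() {
    local path="$1" pat action regex
    shift
    for pat in "$@"; do
        action=${pat%%|*}      # leading 'a' or 'r'
        regex=${pat#?|}        # drop the leading 'a|' or 'r|'
        regex=${regex%|}       # drop the trailing '|'
        # Patterns are unanchored: a match anywhere in the path counts.
        if printf '%s\n' "$path" | grep -Eq -- "$regex"; then
            if [ "$action" = a ]; then
                echo accept
            else
                echo reject
            fi
            return 0
        fi
    done
    echo accept                # nothing matched: the device is accepted
}
```

For example, lvm_filter_decision /dev/nvme0n1p1 'r|/dev/nvme0n1p1|' 'a|/dev/sdb|' 'a|sd.*|' 'a|drbd.*|' 'r|.*|' prints reject, while the same call with /dev/drbd0 as the path prints accept.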


OTHER NODE:

root@pmx1:~# drbd-overview
0:r0/0 Connected Secondary/Secondary Diskless/Diskless


and so on: every command response and configuration on pmx1 is the same as on node pmx0 above.

DRBD resource config:

root@pmx0:~# cat /etc/drbd.d/r0.res
resource r0 {
protocol C;
startup {
wfc-timeout 0;
degr-wfc-timeout 300;
become-primary-on both;
}
net {
cram-hmac-alg sha1;
shared-secret "*********";
allow-two-primaries;
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
after-sb-2pri disconnect;
}
on pmx0 {
device /dev/drbd0;
disk /dev/nvme0n1p1;
address 10.0.20.15:7788;
meta-disk internal;
}
on pmx1 {
device /dev/drbd0;
disk /dev/nvme0n1p1;
address 10.0.20.16:7788;
meta-disk internal;
}
disk {

no-disk-barrier;
no-disk-flushes;
}
}



Debian and DRBD versions:

root@pmx0:~# uname -a
Linux pmx0 4.15.18-10-pve #1 SMP PVE 4.15.18-32 (Sat, 19 Jan 2019 10:09:37 +0100) x86_64 GNU/Linux
root@pmx0:~# cat /etc/debian_version
9.8
root@pmx0:~# dpkg --list| grep drbd
ii drbd-utils 8.9.10-2 amd64 RAID 1 over TCP/IP for Linux (user utilities)
root@pmx0:~# lsmod | grep drbd
drbd 364544 1
lru_cache 16384 1 drbd
libcrc32c 16384 2 dm_persistent_data,drbd
root@pmx0:~# modinfo drbd
filename: /lib/modules/4.15.18-10-pve/kernel/drivers/block/drbd/drbd.ko
alias: block-major-147-*
license: GPL
version: 8.4.10
description: drbd - Distributed Replicated Block Device v8.4.10
author: Philipp Reisner <phil@linbit.com>, Lars Ellenberg <lars@linbit.com>
srcversion: 9A7FB947BDAB6A2C83BA0D4
depends: lru_cache,libcrc32c
retpoline: Y
intree: Y
name: drbd
vermagic: 4.15.18-10-pve SMP mod_unload modversions
parm: allow_oos:DONT USE! (bool)
parm: disable_sendpage:bool
parm: proc_details:int
parm: minor_count:Approximate number of drbd devices (1-255) (uint)
parm: usermode_helper:string


MOUNTS:

root@pmx0:~# cat /proc/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,relatime 0 0
udev /dev devtmpfs rw,nosuid,relatime,size=24679656k,nr_inodes=6169914,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=4940140k,mode=755 0 0
/dev/mapper/pve-root / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/rdma cgroup rw,nosuid,nodev,noexec,relatime,rdma 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=39,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=20879 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime,pagesize=2M 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
sunrpc /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
/dev/sda1 /mnt/intelSSD700G ext3 rw,relatime,errors=remount-ro,data=ordered 0 0
lxcfs /var/lib/lxcfs fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
/dev/fuse /etc/pve fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
10.0.0.15:/samba/shp /mnt/pve/bckNFS nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.0.0.15,mountvers=3,mountport=42772,mountproto=udp,local_lock=none,addr=10.0.0.15 0 0
 
Hi Gabor,
It looks like I have a quite similar setup and, unfortunately, the same problem. I'm trying to set up the following:
  • Proxmox VE 6.1
  • BCACHE
  • DRBD 9.0
  • With or without LINBIT LINSTOR (I don't mind either way; it would be nice, but it gave me hell on the first try)
  • 2 Servers IBM x3650,
    • each server 2CPU 2x4Cores
    • 36GB RAM
    • 2x onboard NICs
    • 1x4 port Intel 82576 NIC
    • 1x 10GE Tehuti NIC (cross-wire connected)
  • 1x PCIe SSD 128GB as BCACHE device for the storage devices
  • 2x 76GB Raid1 SAS drives for Proxmox
  • 6x 146G Raid6
  • Storage "design"
    • 500GB SAS (/dev/sdb1) + 50GB SSD (/dev/nvme1n1p1, as cache) >> drbd01 "device" (/dev/bcache1) volume - then as LVM storage to Proxmox
    • 150GB SAS (/dev/sdb2) + 50GB SSD (/dev/nvme1n1p2, as cache) >> drbd02 "device" (/dev/bcache2) volume - then as LVM storage to Proxmox
  • 10G for storage replication, the other NICs as MGMT and VMNET (LACP -> which drives me nuts as well)
What happens now when I install and configure the whole thing is that everything works perfectly fine until one node is rebooted. Sync works, migration works, creation of VMs works. After the reboot DRBD is unable to start and reports Diskless on both nodes. When I try to start it with force or any other option, I get "Device or resource busy". I first thought this could be because BCACHE creates its device before DRBD starts (which is the correct order in my opinion). But I also tried the setup without BCACHE, with the same behavior.
  • lsof on the devices shows nothing
  • there are no mount points for the devices
    • I've read about /proc/mounts, which I will check as well today
  • the DRBD resource cannot be recreated either
  • Funnily enough, I can change the LVM config, add /dev/bcache1 to Proxmox directly as LVM storage, and start my VMs (locally)
Since I'm currently not on site I cannot post the full configuration files right now.

Basically my question to you is, have you been able to sort/solve that problem?

regards
 
You still need help with this? I had the same issue and figured out the solution.

In my case the logical volumes (the raw disks) were activated on the physical partition of my disk instead of the DRBD device (/dev/sdb1 instead of /dev/drbd0).
 
Finally I have found the root cause of this problem!
At boot time the device mapper creates the mapping for the logical volume that lives on top of the DRBD device, but it attaches it directly to the backing device, and this happens before the drbd service starts. After that, DRBD cannot open the backing device exclusively. If I remove the LV's device mapping manually, the DRBD resource can be brought up.

The mapping can be removed with dmsetup remove <volume name>, as in the example below:


root@pmx0:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 893.8G 0 disk
└─sda1 8:1 0 744.6G 0 part
  └─drbd--intel800--vg-vm--103--disk--0 253:2 0 80G 0 lvm
sdb 8:16 0 136.1G 0 disk
├─sdb1 8:17 0 512M 0 part
└─sdb2 8:18 0 135.6G 0 part
  ├─pve-swap 253:0 0 8G 0 lvm [SWAP]
  └─pve-root 253:1 0 33.8G 0 lvm /
sdc 8:32 0 1.8T 0 disk
└─sdc1 8:33 0 1.8T 0 part
  └─drbd1 147:1 0 1.8T 1 disk
sr0 11:0 1 1024M 0 rom
root@pmx0:~# dmsetup info -c
Name Maj Min Stat Open Targ Event UUID
pve-swap 253 0 L--w 2 1 0 LVM-upAG64GGzE9OLCOcDKvIwuNVzCg238v0xxrfApwyCQdQN3HBHnpPOhCSJe0eMQP3
pve-root 253 1 L--w 1 1 0 LVM-upAG64GGzE9OLCOcDKvIwuNVzCg238v0kYEDRlWWy5IXJYWqB2Fzc117JT9w2004
drbd--intel800--vg-vm--103--disk--0 253 2 L--w 0 1 0 LVM-849ik4y1F5s9tZbA21R2TaiI9uK42SPp4waMZVBqzudKY3vXBxAV3IULRlEthcGW
root@pmx0:~# drbdadm up r0
open(/dev/sda1) failed: Device or resource busy
Operation canceled.
Command 'drbdmeta 0 v08 /dev/sda1 internal apply-al' terminated with exit code 20
root@pmx0:~# dmsetup remove drbd--intel800--vg-vm--103--disk--0
root@pmx0:~# dmsetup info -c
Name Maj Min Stat Open Targ Event UUID
pve-swap 253 0 L--w 2 1 0 LVM-upAG64GGzE9OLCOcDKvIwuNVzCg238v0xxrfApwyCQdQN3HBHnpPOhCSJe0eMQP3
pve-root 253 1 L--w 1 1 0 LVM-upAG64GGzE9OLCOcDKvIwuNVzCg238v0kYEDRlWWy5IXJYWqB2Fzc117JT9w2004
root@pmx0:~# drbdadm attach r0
Marked additional 4948 MB as out-of-sync based on AL.
root@pmx0:~# drbd-overview
0:r0/0 Connected Secondary/Secondary UpToDate/Diskless
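
The holder that keeps the backing device busy does not show up in lsof or fuser because it is a kernel device-mapper table rather than a process. A minimal sketch for listing such holders through sysfs follows. list_dm_holders is a hypothetical helper name, and the optional second argument (an alternative sysfs root) exists only so the function can be tried against a test directory.

```shell
# Print whatever is stacked directly on top of a block device.
# Usage: list_dm_holders DEVNAME [SYSFS_ROOT]   e.g. list_dm_holders sda1
# Output: one line per holder, "<kernel name> <dm mapping name or ->".
list_dm_holders() {
    local dev="$1" sysfs="${2:-/sys}" holder
    for holder in "$sysfs/class/block/$dev/holders"/*; do
        # An unexpanded glob means the holders directory is empty.
        [ -e "$holder" ] || continue
        if [ -r "$holder/dm/name" ]; then
            # dm-N devices expose their mapping name in dm/name
            printf '%s %s\n' "${holder##*/}" "$(cat "$holder/dm/name")"
        else
            # some other kind of holder, not a device-mapper table
            printf '%s -\n' "${holder##*/}"
        fi
    done
}
```

On the node above, list_dm_holders sda1 should report the dm-N device together with the drbd--intel800--vg-vm--103--disk--0 mapping for as long as that mapping is active.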

After doing the same thing on the other node, the DRBD disk status goes to UpToDate/UpToDate, and the resource can be promoted to Primary/Primary with the command drbdadm primary r0.

root@pmx1:~# drbdadm attach r0
root@pmx1:~# drbd-overview
0:r0/0 SyncTarget Secondary/Secondary Inconsistent/UpToDate
[==>.................] sync'ed: 17.3% (4100/4948)M
root@pmx0:~# drbd-overview
0:r0/0 Connected Secondary/Secondary UpToDate/UpToDate
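
Before promoting, it can be worth checking that both sides really are UpToDate. A small sketch, assuming a single-volume resource so that drbdadm dstate prints exactly one line; both_uptodate is a hypothetical helper name:

```shell
# Succeeds only if the disk state read from stdin is UpToDate on both nodes.
# Intended input: the single line printed by `drbdadm dstate <resource>`.
both_uptodate() {
    local state=""
    # Read one line; tolerate a missing trailing newline, fail on empty input.
    IFS= read -r state || [ -n "$state" ] || return 1
    [ "$state" = "UpToDate/UpToDate" ]
}
```

It could be used as: drbdadm dstate r0 | both_uptodate && drbdadm primary r0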


Unfortunately this problem occurs on every reboot. The device mapper creates the mapping again, and the same issue occurs.

So DRBD cannot set up the resource at boot time. I didn't find any clean solution for this. The LVM filter in lvm.conf does not affect this issue; it only filters PVs/VGs/LVs from scanning, but the device mapper still creates the mapping at boot time. Unfortunately I could not find a way to prevent it.

I have a workaround, but I don't like it: place the dmsetup remove <devicename> command in the drbd startup script at the beginning of the start) section. With this, the script removes the mapping before starting DRBD, but it needs the exact names of the LVs. Every time I create a new LV on the shared VG, it has to be added to the drbd startup script.
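
The need to list every LV by hand could probably be avoided by deriving the mapping names from the VG name, since device-mapper names for LVs are built as <vg>-<lv> with hyphens inside either part doubled. The sketch below only prints the dmsetup remove commands instead of running them. The function names are made up, and the VG name drbd-intel800-vg in the usage line is an assumption inferred from the mapping name shown earlier in this thread.

```shell
# Convert a VG name to the prefix device-mapper uses for its LVs:
# hyphens inside the VG name are doubled, then one '-' separates VG and LV.
dm_prefix_for_vg() {
    printf '%s-\n' "$(printf '%s' "$1" | sed 's/-/--/g')"
}

# Read `dmsetup ls` output on stdin and print one `dmsetup remove` command
# per mapping that belongs to the given VG. Dry run: nothing is executed.
plan_dm_removals() {
    local prefix name rest
    prefix=$(dm_prefix_for_vg "$1")
    while read -r name rest; do
        case $name in
            # The character after the prefix must not be '-', otherwise the
            # mapping belongs to a different VG whose name merely starts alike.
            "$prefix"[!-]*) printf 'dmsetup remove %s\n' "$name" ;;
        esac
    done
}
```

Running dmsetup ls | plan_dm_removals drbd-intel800-vg prints the commands for review; piping that output to sh would execute them. This is only safe while none of those LVs are open.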
 
I found the solution to this issue: add the following entry to /etc/lvm/lvm.conf so that LVM does not scan these block devices, then update the initramfs and reboot the system.

cat /etc/lvm/lvm.conf
global_filter = [ "r|/dev/zd.*|","r|/dev/sda[0-9]|", "a|/dev/vgdrbd*|","r|/dev/mapper/vg*|", "r|/dev/mapper/pve-.*|" "r|/dev/mapper/.*-(vm|base)--[0-9]+--disk--[0-9]+|"]



update-initramfs -u

Then reboot the machine.
 

cat /etc/lvm/lvm.conf
global_filter = [ "r|/dev/zd.*|","r|/dev/sda[0-9]|", "a|/dev/vgdrbd*|","r|/dev/mapper/vg*|", "r|/dev/mapper/pve-.*|" "r|/dev/mapper/.*-(vm|base)--[0-9]+--disk--[0-9]+|"]
Careful: the format of global_filter is to match regular expressions (and not shell globs), i.e. * matches 0 or more occurrences of the previous character. In this case the pattern would match '/dev/vgdrb', '/dev/vgdrbdddddd', and so on (the same goes for /dev/mapper/vg*).

you probably want:
Code:
global_filter = [ "r|/dev/zd.*|","r|/dev/sda[0-9]|", "a|/dev/vgdrbd.*|","r|/dev/mapper/vg.*|", "r|/dev/mapper/pve-.*|", "r|/dev/mapper/.*-(vm|base)--[0-9]+--disk--[0-9]+|"]
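
The difference between the two spellings can be checked quickly with grep -E, which is close to (but not identical with) the regex dialect LVM uses. A minimal sketch, with filter_regex_matches as a made-up helper name:

```shell
# Succeeds if PATH contains a match for the extended regex PATTERN.
# Usage: filter_regex_matches PATTERN PATH
# Like LVM filter patterns, the regex is unanchored, so a match
# anywhere inside the path is enough.
filter_regex_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}
```

filter_regex_matches '/dev/vgdrbd*' /dev/vgdrbx succeeds, because d* is allowed to match zero d characters, whereas filter_regex_matches '/dev/vgdrbd.*' /dev/vgdrbx fails.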
 