[noob needs help] Can't get Proxmox working with Linstor DRBD

Steve087

New Member
Sep 15, 2025
Hi, I'm pretty new to Proxmox and a total noob when it comes to LINSTOR and DRBD. I have a 2-node Proxmox cluster with an additional corosync qdevice running on my Unraid NAS.

Now I'm trying to get DRBD working, but unfortunately I'm failing. I followed these tutorials:

https://linbit.com/drbd-user-guide/instor-guide-1_0-en/#ch-proxmox-linstor

https://linbit.com/blog/setting-up-...age-for-proxmox-using-linstor-the-linbit-gui/

and this video:

https://youtu.be/pP7nS_rmhmE

From my point of view everything should be configured correctly, and I can also see the DRBD storage in the Proxmox web UI. Unfortunately, when I try to move a VM disk to the DRBD storage or create a new VM on it, I get an error saying that there are not enough nodes.
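For reference, the storage entry in /etc/pve/storage.cfg follows the LINBIT tutorial. Reconstructed from memory, it looks roughly like this (the storage name, controller IP and resource group are from my setup above, so treat it as a sketch):
Code:
drbd: ha-lvm
        content images,rootdir
        controller 192.168.2.31
        resourcegroup ha-lvm-rg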

All seems to be in order though:
Code:
root@gateway:~# linstor node list
╭─────────────────────────────────────────────────────────╮
┊ Node    ┊ NodeType ┊ Addresses                 ┊ State  ┊
╞═════════════════════════════════════════════════════════╡
┊ gateway ┊ COMBINED ┊ 192.168.2.31:3366 (PLAIN) ┊ Online ┊
┊ tvbox   ┊ COMBINED ┊ 192.168.2.32:3366 (PLAIN) ┊ Online ┊
╰─────────────────────────────────────────────────────────╯
root@gateway:~# linstor storage-pool list
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node    ┊ Driver   ┊ PoolName                       ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName                   ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ gateway ┊ DISKLESS ┊                                ┊              ┊               ┊ False        ┊ Ok    ┊ gateway;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ tvbox   ┊ DISKLESS ┊                                ┊              ┊               ┊ False        ┊ Ok    ┊ tvbox;DfltDisklessStorPool   ┊
┊ ha-lvm-storage       ┊ gateway ┊ LVM_THIN ┊ ha-lvm-thin-vg/ha-lvm-thinpool ┊   381.36 GiB ┊    381.36 GiB ┊ True         ┊ Ok    ┊ gateway;ha-lvm-storage       ┊
┊ ha-lvm-storage       ┊ tvbox   ┊ LVM_THIN ┊ ha-lvm-thin-vg/ha-lvm-thinpool ┊   381.36 GiB ┊    381.36 GiB ┊ True         ┊ Ok    ┊ tvbox;ha-lvm-storage         ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
root@gateway:~# linstor resource-group list
╭───────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter                   ┊ VlmNrs ┊ Description ┊
╞═══════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2                  ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ ha-lvm-rg     ┊ PlaceCount: 2                  ┊ 0      ┊             ┊
┊               ┊ StoragePool(s): ha-lvm-storage ┊        ┊             ┊
┊               ┊ LayerStack: ['DRBD']           ┊        ┊             ┊
╰───────────────────────────────────────────────────────────────────────╯
root@gateway:~# linstor resource-definition list
╭───────────────────────────────────────────────╮
┊ ResourceName ┊ ResourceGroup ┊ Layers ┊ State ┊
╞═══════════════════════════════════════════════╡
╰───────────────────────────────────────────────╯
root@gateway:~#

When I try to create resources I see this error message:
Code:
root@gateway:~# linstor resource-group spawn ha-lvm-rg test10 1GiB
ERROR:
Description:
    Not enough available nodes
Details:
    Not enough nodes fulfilling the following auto-place criteria:
     * has a deployed storage pool named [ha-lvm-storage]
     * the storage pools have to have at least '1048576' free space
     * the current access context has enough privileges to use the node and the storage pool
     * the node is online
    
    Auto-place configuration details:
       Replica count: 2
       Additional replica count: 2
       Storage pool name:
          ha-lvm-storage
       Do not place with resource:
          test10
       Do not place with resource (regex): 
       Layer stack:
          DRBD
    
    Resource group: ha-lvm-rg
root@gateway:~#
Where are these additional replicas coming from? I only configured PlaceCount = 2.
How can I fix this?
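For completeness, I created the resource group by following the tutorial, roughly like this (reconstructed from memory, so the exact flags may be slightly off):
Code:
linstor resource-group create ha-lvm-rg --place-count 2 --storage-pool ha-lvm-storage --layer-list DRBD
linstor volume-group create ha-lvm-rg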

Thanks for your time and help.
 
If I change the layer-stack from DRBD to STORAGE...
Code:
root@gateway:~# linstor resource-group list
╭──────────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter                      ┊ VlmNrs ┊ Description ┊
╞══════════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2                     ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ ha-lvm-rg     ┊ PlaceCount: 2                     ┊ 0      ┊             ┊
┊               ┊ StoragePool(s): ha-lvm-storage    ┊        ┊             ┊
┊               ┊ LayerStack: ['STORAGE']           ┊        ┊             ┊
┊               ┊ ProviderList: ['LVM_THIN', 'LVM'] ┊        ┊             ┊
╰──────────────────────────────────────────────────────────────────────────╯
root@gateway:~#
... then I can suddenly create resources:
Code:
root@gateway:~# linstor resource-group spawn-resources ha-lvm-rg test 1GiB
SUCCESS:
    Volume definition with number '0' successfully  created in resource definition 'test'.
SUCCESS:
Description:
    New resource definition 'test' created.
Details:
    Resource definition 'test' UUID is: 568ff267-b77b-41ba-ad26-e222999813b0
SUCCESS:
    Successfully set property key(s): StorPoolName
SUCCESS:
    Successfully set property key(s): StorPoolName
SUCCESS:
Description:
    Resource 'test' successfully autoplaced on 2 nodes
Details:
    Used nodes (storage pool name): 'gateway (ha-lvm-storage)', 'tvbox (ha-lvm-storage)'
SUCCESS:
    (gateway) Volume number 0 of resource 'test' [LVM-Thin] created
SUCCESS:
    Created resource 'test' on 'gateway'
SUCCESS:
    (tvbox) Volume number 0 of resource 'test' [LVM-Thin] created
SUCCESS:
    Created resource 'test' on 'tvbox'
root@gateway:~# linstor resource list
╭──────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node    ┊ Layers  ┊ Usage ┊ Conns ┊   State ┊ CreatedOn           ┊
╞══════════════════════════════════════════════════════════════════════════════════╡
┊ test         ┊ gateway ┊ STORAGE ┊       ┊ Ok    ┊ Created ┊ 2025-12-25 23:21:26 ┊
┊ test         ┊ tvbox   ┊ STORAGE ┊       ┊ Ok    ┊ Created ┊ 2025-12-25 23:21:26 ┊
╰──────────────────────────────────────────────────────────────────────────────────╯
root@gateway:~# linstor resource-definition  list
╭────────────────────────────────────────────────╮
┊ ResourceName ┊ ResourceGroup ┊ Layers  ┊ State ┊
╞════════════════════════════════════════════════╡
┊ test         ┊ ha-lvm-rg     ┊ STORAGE ┊ ok    ┊
╰────────────────────────────────────────────────╯
root@gateway:~#
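(The layer-stack change above was done with something like the following; I'm quoting the flags from memory, so double-check them.)
Code:
linstor resource-group modify ha-lvm-rg --layer-list STORAGE --providers LVM_THIN,LVM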

However, Proxmox still fails to move existing VM disks to the DRBD storage:
Code:
create full clone of drive scsi0 (local-lvm:vm-9900-disk-1)

NOTICE
  Trying to create diskful resource (pm-bacfc095) on (gateway).
  Diskfull assignment on gateway failed, let's autoplace it.
TASK ERROR: storage migration failed: API Return-Code: 500. Message: Could not autoplace resource pm-bacfc095, because: [{"ret_code":-4611686018407201820,"message":"Not enough available nodes","details":"Not enough nodes fulfilling the following auto-place criteria:\n * has a deployed storage pool named [ha-lvm-storage]\n * the storage pools have to have at least '33554432' free space\n * the current access context has enough privileges to use the node and the storage pool\n * the node is online\n\nAuto-place configuration details:\n   Replica count: 2\n   Additional replica count: 2\n   Storage pool name:\n      ha-lvm-storage\n   Do not place with resource:\n      pm-bacfc095\n   Do not place with resource (regex): \n   Layer stack:\n      STORAGE\n   Allowed Providers:\n      LVM_THIN\n      LVM\n   Diskless on remaining: false\n\nAuto-placing resource: pm-bacfc095","obj_refs":{"RscDfn":"pm-bacfc095"},...

VM creation sometimes works and sometimes doesn't:

Code:
NOTICE
  Trying to create diskful resource (pm-54e3d011) on (gateway).
scsi0: successfully created disk 'ha-lvm:pm-54e3d011_101,iothread=1,size=2G,ssd=1'
TASK OK

Code:
NOTICE
  Trying to create diskful resource (pm-2fab2977) on (gateway).
  Diskfull assignment on gateway failed, let's autoplace it.
TASK ERROR: unable to create VM 102 - API Return-Code: 500. Message: Could not autoplace resource pm-2fab2977, because: [{"ret_code":-4611686018407201820,"message":"Not enough available nodes","details":"Not enough nodes fulfilling the following auto-place criteria:\n * has a deployed storage pool named [ha-lvm-storage]\n * the storage pools have to have at least '2097152' free space\n * the current access context has enough privileges to use the node and the storage pool\n * the node is online\n\nAuto-place configuration details:\n   Replica count: 2\n   Additional replica count: 2\n   Storage pool name:\n      ha-lvm-storage\n   Do not place with resource:\n      pm-2fab2977\n   Do not place with resource (regex): \n   Layer stack:\n      STORAGE\n   Allowed Providers:\n      LVM_THIN\n      LVM\n   Diskless on remaining: false\n\nAuto-placing resource: pm-2fab2977","obj_refs":{"RscDfn":"pm-2fab2977"},...

Even when the VM creation succeeds, the VM doesn't work:
Code:
blockdev: cannot open /dev/drbd/by-res/pm-54e3d011/0: No such file or directory
TASK ERROR: stat for '/dev/drbd/by-res/pm-54e3d011/0' failed - No such file or directory

Does this mean DRBD is not running on my nodes?
Code:
root@gateway:~# ls /dev/ | grep drbd
root@gateway:~#
root@gateway:~# linstor node info
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node    ┊ Diskless ┊ LVM ┊ LVMThin ┊ ZFS/Thin ┊ File/Thin ┊ SPDK ┊ Remote SPDK ┊ Storage Spaces ┊ Storage Spaces/Thin ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ gateway ┊ +        ┊ +   ┊ +       ┊ +        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
┊ tvbox   ┊ +        ┊ +   ┊ +       ┊ +        ┊ +         ┊ -    ┊ +           ┊ -              ┊ -                   ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭──────────────────────────────────────────────────────────────────────╮
┊ Node    ┊ DRBD ┊ LUKS ┊ NVMe ┊ Cache ┊ BCache ┊ WriteCache ┊ Storage ┊
╞══════════════════════════════════════════════════════════════════════╡
┊ gateway ┊ -    ┊ -    ┊ -    ┊ +     ┊ -      ┊ +          ┊ +       ┊
┊ tvbox   ┊ -    ┊ -    ┊ -    ┊ +     ┊ -      ┊ +          ┊ +       ┊
╰──────────────────────────────────────────────────────────────────────╯
root@gateway:~# modprobe drbd
root@gateway:~#
I'm very confused.
 
Problem solved. I realized that due to Secure Boot the DRBD 9 module was rejected and the system was falling back to the in-kernel DRBD 8 module. I had to reinstall everything, properly sign DRBD 9 and enroll the key with mokutil. After that I had to start over from scratch with the LINSTOR/DRBD configuration. Now it's working like a charm. :)
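In case anyone else runs into this, the steps were roughly the following; the mok.pub path is what recent dkms versions use on my system, so adjust it if yours differs:
Code:
# reinstall the DKMS package so the DRBD 9 module gets rebuilt and signed
apt install --reinstall drbd-dkms
# enroll the DKMS signing key so Secure Boot accepts the module
mokutil --import /var/lib/dkms/mok.pub
reboot    # finish the enrollment in the MOK manager screen during boot
# afterwards the kernel should report DRBD 9 instead of the in-tree 8.4 module
cat /proc/drbd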
 
The issue is that we still have this stupid and useless DRBD 8 kernel module.

I stumbled across this too. It should either be removed completely or updated to DRBD 9 in the kernel itself.
DRBD 8 in the kernel just causes issues and is absolutely useless.
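For anyone wondering which module they are actually running, something like this should show it (just a sketch):
Code:
# the in-tree module lives under kernel/drivers/block/drbd, the DKMS build under updates/dkms
modinfo -n drbd
modinfo -F version drbd    # 8.4.x means the in-kernel module is being used
dkms status drbd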

@t.lamprecht did you perhaps forget to remove it from the Proxmox 9 kernels? The drbd module isn't in the official Ubuntu kernels.

Please do something about that: either remove it completely or update it to v9.

Cheers
 
Hello! Happy to see people interested in LINSTOR DRBD. I've been trying to get a similar system working for quite a while now: two diskful nodes with a dedicated MGMT network and a direct-attached network for replication, plus a corosync qdevice that will also serve as a DRBD witness on the MGMT network.

I wanted to know, have you managed to get the controller H.A. working? Or do you have another method? If my controller is running on PVE01 for example, and that node goes down, I lose all access to my storage :'( and I'm having a lot of trouble getting DRBD Reactor to work properly