[SOLVED] 3-node cluster with Proxmox 5.4 and DRBD 9

Hello,

First of all, I know that DRBD's license changed and that I should be posting this only on Linbit's website. But because I couldn't find any information at all about my problem, I preferred to post it on both sites, in case it helps someone who is in the same situation.

(I also think that Proxmox's forum is more ergonomic for users who are seeking help or answers.)


My project is to build a 3-node Proxmox 5.4 cluster first, then add DRBD to it (a 3-node linstor cluster with nodes 1 and 2 as satellites, plus node 3 as a diskless controller for quorum).

Nodes 1 and 2 hardware configuration:
- Intel Xeon 4c/8t at 3.8 GHz / 4.2 GHz
- 32 GB DDR4 ECC 2133 MHz
- 1 Gbps public network + 1 Gbps private network (vRACK from OVH)
- 3x 960 GB SSD in RAID 1 (2 disks in use and 1 unused) for VMs

Node 3 hardware configuration:
- Intel Xeon 4c/8t at 3.5 GHz / 3.9 GHz
- 16 GB DDR4 ECC 2400 MHz
- 1 Gbps public network + 1 Gbps private network (vRACK from OVH)
- 2x 4 TB HDD in "soft RAID" for backups

I started by installing Proxmox 5.4 on all nodes, then I configured each node's /etc/hosts with the nodes' vRACK IP addresses, and then I created my Proxmox cluster. Everything went well.
(The fact that you can now create a cluster with only a few clicks is, indeed, amazing.)


/!\ The sources that I used are listed at the bottom of this post /!\


After this step, I installed the pve-headers package, then added Linbit's repository and installed some packages (drbd-dkms drbd-utils drbd-top) on all nodes.
Then I installed the linstor packages (satellite, client) on each node and the linstor-controller package on my 3rd node (the 2x 4 TB one).

As I said above, nodes 1-2 (SSDs) are going to be satellites that hold VMs; node 3 (HDD) is the diskless controller for quorum and will hold backups later.



The next step was to enable and start each node's services. Nodes 1-2 need linstor-satellite; node 3 needs both linstor-satellite and linstor-controller.
I verified with systemctl status that these services were started.
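As a rough sketch of the step above (not taken from Linbit's documentation), the mapping from role to services can be written down like this. The helper names services_for_role and print_service_commands are hypothetical, and the script only prints the systemctl commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: map a node's role in this setup to the
# linstor services it should run.
services_for_role() {
    case "$1" in
        satellite) echo "linstor-satellite" ;;
        combined)  echo "linstor-satellite linstor-controller" ;;
        *)         echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print (do not execute) the enable/start and status commands for a role.
print_service_commands() {
    svcs="$(services_for_role "$1")" || return 1
    for svc in $svcs; do
        echo "systemctl enable --now $svc"
        echo "systemctl status --no-pager $svc"
    done
}

print_service_commands satellite   # nodes 1-2
print_service_commands combined    # node 3
```

Printing instead of executing is a deliberate choice here: the output can be checked first and then pasted into a root shell on the matching node.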

Next, I needed to create my linstor nodes, so I started by adding node 3 with the node type "Combined", and then the other nodes as "Combined" too, which is recommended in the documentation I've seen.



As an example, nodes 1-2 are named "second" and "third", and node 3 is named "first".
I verified again that the nodes were created, which was the case.



I needed an LVM volume that supports snapshots on both nodes (1-2), so I chose an lvmthin pool of 760 GB named drbdpool, which is in the VG (volume group) pve. (You can verify all of this with commands like "pvs", "vgs", "lvs".)
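A minimal sketch of this thin pool step, assuming the pool is created with lvcreate inside the existing pve volume group. The run wrapper and the DRY_RUN variable are hypothetical additions: by default the script only prints the commands, and they are executed only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set explicitly.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

VG="pve"          # volume group named in the post
POOL="drbdpool"   # thin pool named in the post
SIZE="760G"       # pool size from the post

# On nodes 1 and 2: create the thin pool inside the existing VG.
run lvcreate -L "$SIZE" --thinpool "$POOL" "$VG"

# Check the result, as suggested in the post.
run pvs
run vgs
run lvs "$VG"
```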


This is the step where my problem appears. (I tried many times and in different ways, but I get the same error at the same step, even on a different setup. I even tried it on some VMs to check whether it was a hardware problem.)

I needed to create a resource-definition, name it and create a volume-definition; that worked.
But after this, I needed to create the resource on node 1 (second) and node 2 (third), and both gave me the same error description:

"Satellite 'second/third' does not support the following layers : [DRBD]"

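For readers who want to reproduce the sequence, here is a hedged sketch of the three linstor client calls involved. The resource name vm-test, the 1G volume size and the linstor storage pool name drbdpool are assumptions for illustration and do not come from the post. The node names are the ones registered in linstor at this point. The hypothetical run wrapper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set explicitly.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

RES="vm-test"     # hypothetical resource name
SIZE="1G"         # hypothetical volume size
POOL="drbdpool"   # assumed name of the linstor storage pool

# These two steps worked in the post.
run linstor resource-definition create "$RES"
run linstor volume-definition create "$RES" "$SIZE"

# This is the step that failed with "does not support the following layers".
for node in second third; do
    run linstor resource create "$node" "$RES" --storage-pool "$POOL"
done
```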


At first, I told myself that it wasn't a big deal if this step didn't work, but after it I had to install the linstor-proxmox package and then "declare" the storage that will be used and seen by DRBD.

This storage appears in the Proxmox interface, but in my case it appears as an active storage of type Unknown, and I can't create any VM on it or see the remaining storage capacity.



Then I thought back to the step that gave me the error, and I concluded that either my Proxmox has a problem, or Proxmox cannot recognize the storage at all because of this error. (I tried with a different setup and got the same error.)

Even with the drbdtop tool, I could see that there was no activity at all from DRBD. I created a folder and a file on node 1 to see whether they would be replicated to node 2, but they weren't.



This is where I've been stuck for days, without finding any information about it.

Thanks a lot for reading and helping,

Sources :
- Linbit's official DRBD documentation, which has a special chapter for Proxmox
- "How to setup LINSTOR on Proxmox VE" by Roland Kammerer on Linbit's website, 26th of July 2018
- "Configuring Linstor storage plugin for Proxmox VE" by yannis-itwiki on his blogspot, 29th of July 2018
(I apologize, I can't post links because I'm a new member.)

Screenshots: see the posts below this one
 
Here are the screenshots I found useful.
 

Attachments:
  • pvs vgs lvs second.PNG
  • pvs vgs lvs third.PNG
  • node list.png
  • storage pool list from controller.PNG
  • Node second error.png
  • Node third error.png

Update: I posted my project on Linbit's mailing list (the community support) and someone from Linbit replied to me.

The error was caused by the fact that when I created my linstor cluster (when adding the nodes to linstor), I didn't use the nodes' actual host names.
My nodes are called alpha, bravo and charlie, but in my linstor cluster they were called first, second and third. So, to avoid this problem, you need to delete your linstor cluster and then re-add your nodes with their real names ("alpha", "bravo", "charlie").

And it worked! The DRBD "resource layer" appeared as soon as I added the nodes to my linstor cluster, and the storage entry in Proxmox's interface isn't "Unknown" anymore. I was even able to create a resource-definition on nodes 1-2 and delete it afterwards.
Of course, I can now create VMs on the drbdstorage, which has the type "DRBD" in Proxmox's interface.


Linbit's support answer :
Which also leads us to the core issue: In order to DRBD being supported by linstor, the linstor node-names must match the actual host names (uname -n). As I assume that your machines are not named "first", "second" and "third" (otherwise you would not had to grey out the "--controllers ______ " in your "n l" command), linstor reports that this UNAME-check failed which is required for DRBD-support.

I will put this to my todo-list to report a reason why a layer is not supported (i.e. in your case you would see something like "Satellite '....' does not support the following layers: [DRBD]. Failed checks: [UNAME]" or something like that), although with quite low priority, so don't expect this to be ready by tomorrow :)
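As a minimal sketch of the rule described in this reply (this is not linstor's actual implementation of the check), the name you plan to register can be compared with the output of uname -n on each node before adding it. The function name check_node_name is hypothetical, and "alpha" is simply one of the host names from this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the name you plan to register
# in linstor equals this machine's host name as reported by `uname -n`.
check_node_name() {
    wanted="$1"
    actual="$(uname -n)"
    if [ "$wanted" = "$actual" ]; then
        echo "OK: '$wanted' matches uname -n"
        return 0
    fi
    echo "MISMATCH: wanted '$wanted' but uname -n reports '$actual'" >&2
    return 1
}

# Example: run on each node before registering it on the controller.
# "alpha" is a host name from this thread; adjust it to your own.
check_node_name "alpha" || echo "Register this node as '$(uname -n)' instead."
```

Once the check passes on every node, the nodes can be added to linstor under exactly those names.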


My problem is now SOLVED.
 