[TUTORIAL] PVE 7.x Cluster Setup of shared LVM/LV with MSA2040 SAS [partial howto]

Another question about write performance,

I have done some tests with fio, and I get abysmal results when the VM disk file is not preallocated.

Preallocated: I get around 20,000 IOPS for 4k randwrite and 3 GB/s for 4M writes. (This is almost the same as my physical disk without GFS2.)

But when the disk is not preallocated, or when I take a snapshot of a preallocated disk (so new writes are no longer preallocated), I get:

60 IOPS for 4k randwrite, 40 MB/s for 4M writes
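
For reference, a comparable test can be run with something like the following fio invocations (file path, size and runtime are placeholders, not necessarily the exact values used for the numbers above):

Code:
# 4k random write, direct I/O, against a file on the gfs2 mount
fio --name=randwrite4k --filename=/mnt/pve/gfs2/testfile --size=10G \
    --rw=randwrite --bs=4k --direct=1 --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting

# 4M sequential write
fio --name=write4m --filename=/mnt/pve/gfs2/testfile --size=10G \
    --rw=write --bs=4M --direct=1 --ioengine=libaio --iodepth=8 \
    --runtime=60 --time_based --group_reporting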
I have not examined or taken measurements in regard to performance, so I cannot provide you with data.
ok thanks !

It works fine without LVM in my tests, so there is no need to use lvmlockd, vgscan, and all the other LVM stuff.

About performance, I have compared with OCFS2, and it's really night and day for 4k direct writes when the file is not preallocated (I'm around 20,000 IOPS on OCFS2 and 200 IOPS on GFS2).

I have also noticed that a qcow2 snapshot lowers IOPS to around 100~200 for 4k direct writes. This also happens with local storage, so I'll look into implementing external qcow2 snapshots (snapshot in an external file). I don't see this performance regression with external snapshots.
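
For illustration, an external qcow2 snapshot is just a new overlay file backed by the original image. Outside of PVE it can be created with qemu-img roughly like this (file names are made up):

Code:
# create an external snapshot: new writes go to the overlay, the base stays untouched
qemu-img create -f qcow2 -b vm-100-disk-0.qcow2 -F qcow2 vm-100-disk-0-snap1.qcow2

# inspect the resulting backing chain
qemu-img info --backing-chain vm-100-disk-0-snap1.qcow2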
 
Hey @spirit & @Glowsome thank you for such an informative thread.

I have 6 hosts in my cluster and 2 MSAs that I am trying to use as clustered, shared storage. I initially tried using LVM on top of iSCSI to implement that, but soon found out that files created on one node were not visible on the others and realised I needed GFS2. So I've installed and configured it to the best of my knowledge (I don't want to use LVM if I can avoid it, so I have configured only GFS2 and DLM), but I don't get a prompt back when I try to mount. Here is my dlm.conf:


Code:
log_debug=1
protocol=tcp
post_join_delay=10
enable_fencing=0
lockspace Xypro-Cluster nodir=1

Code:
dlm_tool status
cluster nodeid 1 quorate 1 ring seq 9277 9277
daemon now 2743 fence_pid 0
node 1 M add 16 rem 0 fail 0 fence 0 at 0 0
node 2 M add 710 rem 0 fail 0 fence 0 at 0 0
node 3 M add 785 rem 0 fail 0 fence 0 at 0 0
node 4 M add 751 rem 0 fail 0 fence 0 at 0 0
node 5 M add 816 rem 0 fail 0 fence 0 at 0 0
node 6 M add 1145 rem 0 fail 0 fence 0 at 0 0

I'd appreciate any help.
 
Hi,
here my dlm.conf

Code:
# Enable debugging
log_debug=1
# Use sctp as protocol (required with multiple corosync links)
protocol=sctp
# Delay at join
#post_join_delay=10
# Disable fencing (for now)
enable_fencing=0

I'm using protocol=sctp because I have multiple corosync links, and sctp is mandatory in that case.

Then I format my block device with gfs2:

Code:
mkfs.gfs2 -t <corosync_clustername>:testgfs2 -j 4 -J 128 /dev/mapper/36742b0f0000010480000000000e02bf3

(here I'm using a multipathed iSCSI LUN)

and finally I mount it:

Code:
mount -t gfs2 -o noatime /dev/mapper/36742b0f0000010480000000000e02bf3 /mnt/pve/gfs2
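
To actually use the mount in PVE, a shared directory storage pointing at the mount point can be added. A minimal sketch of the /etc/pve/storage.cfg entry, assuming the storage ID and path used above:

Code:
dir: gfs2
        path /mnt/pve/gfs2
        content images,rootdir
        shared 1
        is_mountpoint 1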
 
Hi,

I’m writing this post after testing the Glowsome configuration for about two months, followed by four months of production use on a three-node cluster of mixed servers connected via FC to a Lenovo DE2000H SAN.
I want to thank @Glowsome for the excellent work they’ve done.

I sincerely hope that this solution can become officially supported in Proxmox in the future.

Thank you again!
 
There is this tutorial https://forum.proxmox.com/threads/poc-2-node-ha-cluster-with-shared-iscsi-gfs2.160177/ which I have used to set up a 2-node cluster in our lab: FC SAN (all-flash storage) with GFS2 directly on the multipath device (simple setup).
From a feature perspective everything seems to be working; all the basic features we need (snapshots + SAN) are there. The only thing we miss is that TPM 2.0 disks block snapshots.
In the lab it seems to be stable, performance is also OK, and even discard is supported on GFS2.
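
(As a side note, discard can also be exercised manually against the mounted filesystem if you want to verify it; the mount path below is just an example:)

Code:
fstrim -v /mnt/pve/gfs2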
Some performance numbers from a Windows VM:

[attached screenshot: disk benchmark results]

The sequential speeds show that the 8 Gbit HBAs are the bottleneck in this case.
 
I have issues with DLM/mount on boot with this setup, although I'm not using LVM but the raw LUNs themselves. I've added some dependencies to the fstab entries, but the automatic mount still somehow runs into an indefinite "kern_stop" for the mount commands. It can only be fixed by rebooting.

My current workaround is to define the mount as "noauto" and mount it manually after the Proxmox box has completely booted. That has worked fine up until now.

Here's my fstab entry:
Code:
/dev/disk/by-uuid/8ee5d7a9-7b19-4b45-b388-bb5758c20d77 /mnt/pve/storage-gfs2-01 gfs2 _netdev,noauto,noacl,lazytime,noatime,rgrplvb,discard,x-systemd.requires=dlm.service,x-systemd.requires=nvmf-connect-script.service,x-systemd.requires=pve-ha-crm.service,nofail 0 0
/dev/disk/by-uuid/1a89385a-965c-4014-9b83-f90a1f3782f6 /mnt/pve/storage-gfs2-02 gfs2 _netdev,noauto,noacl,lazytime,noatime,rgrplvb,discard,x-systemd.requires=dlm.service,x-systemd.requires=nvmf-connect-script.service,x-systemd.requires=pve-ha-crm.service,nofail 0 0

With the x-systemd.requires options and the _netdev flag, systemd adds the following dependencies:
Code:
After=dlm.service nvmf-connect-script.service pve-ha-crm.service
Requires=dlm.service nvmf-connect-script.service pve-ha-crm.service
After=blockdev@dev-disk-by\x2duuid-1a89385a\x2d965c\x2d4014\x2d9b83\x2df90a1f...target

DLM should obviously be started, and the NVMe-over-TCP connection should be established. The last entry (pve-ha-crm.service) was a first stab at a workaround, trying to wait for corosync to be ready, but it didn't work reliably. systemd automatically added the "After=blockdev@...target", which seems fine.

I don't know whether it's a race condition because of mounting two shares at once, or whether it's a fencing-related issue. This is my /etc/default/dlm config; I'm using sctp because I've got two rings defined in corosync. I was already experimenting with disabling additional fencing-related options, though I wasn't sure whether disabling something like "enable_quorum_lockspace" would be a good idea...
Code:
# cat /etc/default/dlm
DLM_CONTROLD_OPTS="--enable_fencing 0 --protocol sctp --log_debug"

# new settings might add
# --enable_startup_fencing 0 --enable_quorum_fencing 0

Can anyone see an error I've overlooked?
 
Hi einhirn: you don't have to use DLM; it's required by GFS2, not by shared LVM. I recommend referencing https://kb.blockbridge.com/technote/proxmox-lvm-shared-storage/
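
For comparison, shared thick LVM needs no extra locking daemon on the PVE side; a minimal storage.cfg sketch, with a made-up storage ID and volume group name, looks like this:

Code:
lvm: san-lvm
        vgname vg_san
        content images
        shared 1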
 
it's required by GFS2
Exactly, that's what I'm using. OK, I didn't mention that other than in the fstab lines, but since this thread is about using GFS2 I didn't think it necessary.

Btw: I'm also using shared thick-LVM storage via iSCSI + multipathing and NVMe-over-TCP, but I really like having thin provisioning for VMs and possibly snapshots. I was surprised that qcow2 snapshots are internal (i.e. in the same file) in PVE, but that's a different topic.
 
I have issues with DLM/mount on boot with this setup, although I'm not using LVM but the raw LUNs themselves. I've added some dependencies to the fstab entries, but the automatic mount still somehow runs into an indefinite "kern_stop" for the mount commands. It can only be fixed by rebooting.
[...]

Can anyone see an error I've overlooked?
It seems that there are some dependencies to take care of:



I'll try those and check whether it helps...
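
(For anyone hitting the same problem: one direction to experiment with is ordering the mount explicitly behind corosync and dlm as well, for example by extending the fstab options. This is an untested sketch with a placeholder UUID, not a confirmed fix:)

Code:
/dev/disk/by-uuid/<uuid> /mnt/pve/storage-gfs2-01 gfs2 _netdev,noatime,nofail,x-systemd.requires=corosync.service,x-systemd.requires=dlm.service 0 0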
 
Hi there,

I must say we configured a production cluster with GFS2, following the instructions in this thread, and it worked like a charm for around a year, but over time the storage became completely unstable and left the cluster unusable.

For the moment, we've switched to RAW storage. Losing the ability to have snapshots is preferable to having such an unstable filesystem.

Just wanted to leave this comment as a warning to potential users: GFS2 does work, but in the long term it can also become corrupted (maybe it requires some additional maintenance?)
 
The main issue with GFS2 (and OCFS2) is that they are not really supported, so if something bad happens you are on your own. I might be wrong, but I also remember that there isn't much development going on with them. Luckily there is a high chance that Proxmox 9 will feature snapshot support with qcow2 on LVM-thick, in a VMFS-like fashion (there is development work going on at the moment; I don't know whether it will be ready in time). This should cover most use cases why people use OCFS2 or GFS2, and it will be supported officially.
For the moment, we've switched to RAW storage. Losing the ability to have snapshots is preferable to having such an unstable filesystem.

Until snapshot/qcow2 support on LVM-thick is available, this might be a workaround:

Alternatives to Snapshots
If an existing iSCSI/FC/SAS storage needs to be repurposed for a Proxmox VE cluster and using a network share like NFS/CIFS is not an option, it may be possible to rethink the overall strategy; if you plan to use a Proxmox Backup Server, then you could use backups and live restore of VMs instead of snapshots.

Backups of running VMs will be quick thanks to dirty bitmap (aka changed block tracking) and the downtime of a VM on restore can also be minimized if the live-restore option is used, where the VM is powered on while the backup is restored.
https://pve.proxmox.com/wiki/Migrate_to_Proxmox_VE#Alternatives_to_Snapshots
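
To make the idea concrete, the "pseudo-snapshot" cycle could look roughly like this on the CLI (VM ID, storage name and backup volume ID are placeholders; a sketch, not a tested procedure):

Code:
# back up the running VM 100 to a PBS storage (quick thanks to dirty-bitmap tracking)
vzdump 100 --storage pbs-backup --mode snapshot

# later, "roll back" by live-restoring that backup over the existing VM
qmrestore <backup-volume-id> 100 --live-restore 1 --force 1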


Now obviously this isn't a solution for every use case, but maybe it's enough for you. Even if you use other backup software and have a limited budget, you could still use PBS only for "pseudo-snapshots" without obtaining a subscription, as long as you can live with the nag screen. I wouldn't do this as a permanent solution without a support subscription, but it works as a stopgap until qcow2 on LVM-thick is supported.
 
Luckily there is a high chance that Proxmox 9 will feature snapshot support with qcow2 on LVM-thick, in a VMFS-like fashion (there is development work going on at the moment; I don't know whether it will be ready in time). This should cover most use cases why people use OCFS2 or GFS2, and it will be supported officially.


I stand corrected: the Proxmox 9 beta has snapshot support on LVM for iSCSI and FC SANs, without the overhead of a filesystem. So @javieitez and other SAN owners can finally have snapshots without using unsupported stuff like OCFS2 or GFS2.
 
Hello everyone,

Support for LVM-thick is a big step, but the question is how many people will use it. I personally manage several Proxmox and VMware environments. The VMware environments will eventually be converted to Proxmox, but unfortunately, that will only be possible once LVM-thin support with snapshots on shared storage is finally available. That's the closest thing to VMFS. Unfortunately, these environments are all at a size where Ceph would be too complex and wouldn't deliver the appropriate performance for certain workloads. They're all 3-4 server environments with different SANs. Sure, you can make the LUN 16 TB with LVM-thick, but not every SAN supports deduplication underneath. And each snapshot of a VM always takes up the entire reserved space of the VM. 2-3 snapshots, and you've quickly used up 1 TB.

I hope there will be more development in this regard in PVE version 9.
 
but unfortunately, that will only be possible once LVM-thin support with snapshots on shared storage is finally available.
Why is LVM-thin required? Your SAN is likely thin provisioned anyway, so there is no actual benefit to thin provisioning on top of that. The problem was that there was no snapshot support, not the LVM-thin part.

Sure, you can make the LUN 16 TB with LVM-thick, but not every SAN supports deduplication underneath.
Your SAN either supports dedup or it doesn't. I'm not sure what the relevance is for LVM in either case.
And each snapshot of a VM always takes up the entire reserved space of the VM. 2-3 snapshots, and you've quickly used up 1 TB.
That's true no matter what kind of snapshot you use, and whether LVM is thin or thick. So what?
 
Why is LVM-thin required? Your SAN is likely thin provisioned anyway, so there is no actual benefit to thin provisioning on top of that. The problem was that there was no snapshot support, not the LVM-thin part.

Adding to that, LVM-thin can't be used as shared storage because there is no way to ensure its consistency inside a cluster.

As far as I know this is a technical limitation which is not present in LVM-thick.
 
Hello everyone,

I'd like to explain why LVM-thin is so important, especially for shared storage operation. First, Proxmox doesn't support a cluster-capable COW file system like ESXi does with VMFS, which allows VM images to be made available to all hosts via block storage. While this is compensated for with LVM-thick, it's not very efficient.

The reason is quite simple; anyone can try it out for themselves. If you deploy an LVM-thick volume of, say, 1 TB and have a VM on it that can take up 500 GB of space but only actually uses 100 GB, those 500 GB are still reserved for the VM. If I now create a snapshot on the LVM-thick storage, the VM already requires 1 TB of storage, even though the actual usage would still be only 100 GB + changed blocks. This means you can no longer create VMs or snapshots on the LVM. Of course, this can be compensated for by creating the LVM with, say, 10 TB, but that's not a solution either, since those 10 TB must then also be available in the SAN, unless the SAN supports thin provisioning or deduplication.

Overall, this is currently not a satisfactory solution, since you shouldn't assume that every SAN supports thin provisioning or deduplication; you have to provide a solution that truly works comfortably with every SAN. Also, the LVM LUN should only be as large as the actual physical storage space in the SAN, otherwise confusion quickly arises, and operators of such an environment who have nothing to do with storage and SAN might think they suddenly have that much storage space (10 TB) available.

LVM-thin helps enormously here, since you only use as much storage space as you actually need, even if you could use more, and snapshots, even if they are nominally as large as the VM (here 500 GB), only take up the space they actually need. The allocation difference is sketched below.
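
To illustrate the allocation difference on plain LVM (the VG, pool and LV names here are made up):

Code:
# thick: the full 500G is reserved in the volume group immediately, used or not
lvcreate -L 500G -n vm-100-disk-0 vg_san

# thin: create a pool once, then thin LVs only consume pool space as data is written
lvcreate -L 900G -T vg_san/thinpool
lvcreate -V 500G -T vg_san/thinpool -n vm-100-disk-0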

If you rely on a SAN that operates like a file system and supports thin provisioning with deduplication and compression, and has snapshot support, you might want to, for example, snapshot all VMs on the LVM, as is common with some VMware plugins, and then create a hardware snapshot on the SAN and then delete the snapshots on the LVM so that they only exist in the snapshot on the SAN. This concept is also not possible with LVM Thick due to its limitations.

It's not necessary for Proxmox VE to provide a cluster file system such as VMFS for SAN operation, but then you should provide a solution that enables a similar operating concept, and in this case, that can only be "LVM Thin." Anything else is unfortunately not a sensible solution. The widely held misconception that SANs are no longer needed has also been around for 25 years. A cluster file system like Ceph can never be as powerful as a SAN, as this would require hardware that would far exceed the cost of a SAN.

Snapshot support for the VMs' TPM disks would also be necessary, as this has unfortunately become essential for virtualizing Windows systems and for compliance with security regulations.
 

I fear we have a misunderstanding. I can see why one would want thin provisioning on shared storage, so you can keep using your existing storage hardware without sacrificing features compared to VMware. The trouble is just that implementing shared storage on LVM-thin is (at least to my understanding) more challenging than on LVM-thick, if it is even possible, for technical reasons. But maybe I misremember and it might actually be possible. Whether this will happen is a different question, though. The lack of snapshots was the biggest issue for many folks considering a migration from VMware to Proxmox VE, and that problem is solved now.

Snapshot support for the VMs' TPM disks would also be necessary, as this has unfortunately become essential for virtualizing Windows systems and for compliance with security regulations.

Agreed. AFAIK there are plans for how to implement TPM with qcow2 images, but I have no idea whether any actual work has been done yet.
 
Possible solutions for using LVM-thin on shared SAN storage could be the implementation of "Atomic Test and Set" or "Atomic Test and Set with SCSI Reservations" (SCSI COMPARE AND WRITE, ATS). This means that a protected area or set of blocks must be created in the LVM metadata area that functions as a journal and mailbox area and is dynamically allocated when the LVM is mounted. How this works with VMFS is explained here, for example: https://virtualcubes.wordpress.com/2017/11/27/vmfs-locking-and-ats-miscompare-issues/

But actually, if you implement something like this for LVM, you're almost not that far removed from a file system. A file system would be just a VFS layer with a corresponding metadata database. Cluster locking for metadata would be similar here.

A file system would, of course, have the advantage of being able to store the configuration there in addition to the VM disk files. Then you would only need PMXFS under /etc/pve for the host configuration.

When I look at file systems like OCFS or GFS2, which are designed for SAN operation, you can see from the way locking is handled that these file systems are only intended for data exchange and not for running VMs. If you have applications that write a file with a fixed length and nothing changes within the file, these file systems might work. But not for VM disks that don't have a fixed length, because data is constantly being appended or data is changing within the file.

A SAN file system for VMs would have to be COW, because that's the only way to ensure that data being written always goes to new sectors and is not overwritten in place. LVM is also read-modify-write in this case. This doesn't cause any problems with thick provisioning, of course, but with thin provisioning it won't work and would lead to the same problems that OCFS and GFS2 have.

For people who can get by with a 2-node PVE cluster, a VFS layer for ZFS would be good, for example. This layer does nothing more than provide the ZFS mount point 1:1, with all file system functions (VFS), on the other node. This would allow ZFS to be used directly on shared SAS storage and be used by both hosts. The host that has imported the pool always writes to it directly; the other node, which has not imported the pool, must read and write all data over the network connection. Unfortunately, it is not easy to implement this with NFS, as NFS restricts ZFS's file system functions too much.

A corresponding implementation existed in the Solaris cluster back then under the name "Global Filesystem", not to be confused with the GFS2 SAN file system I mentioned above. The Global File System was a VFS layer for Solaris ZFS that allowed access to the ZFS datasets and volumes of another node in the cluster as if the storage were locally mounted. NetApp storage systems use the same principle, because there the storage is also active-passive; active-active behaviour is only achieved by sharing the storage with other nodes via the cluster network ports. In a larger NetApp cluster, this is also shared with all other nodes in the cluster.
 
If you deploy an LVM-thick volume of, say, 1 TB and have a VM on it that can take up 500 GB of space but only actually uses 100 GB, those 500 GB are still reserved for the VM. If I now create a snapshot on the LVM-thick storage, the VM already requires 1 TB of storage, even though the actual usage would still be only 100 GB + changed blocks.
This is only true if the underlying storage is also thick provisioned.

If you rely on a SAN that operates like a file system and supports thin provisioning with deduplication and compression, and has snapshot support, you might want to, for example, snapshot all VMs on the LVM, as is common with some VMware plugins, and then create a hardware snapshot on the SAN and then delete the snapshots on the LVM so that they only exist in the snapshot on the SAN. This concept is also not possible with LVM Thick due to its limitations.
You are conflating hardware snapshots with all snapshots. It's true that PVE does not provide any built-in orchestration tools for hardware snapshotting, but the option to build your own was always available using the PVE and storage APIs, and it could work exactly as you describe the VMware orchestration function. Since this approach is too hardware dependent, and no storage vendor has (as of yet) decided to make such a plugin, the PVE team made the choice to instead write support for qemu snapshots; if you require the same coordination between an external snapshot and a guest snapshot, it would require the same kind of orchestration using PBS hookscripts regardless of which method you choose.

The long and the short of it is that:
1. Storage utilization efficiency is subject to the underlying storage. If you are using dumb block storage (e.g. a simple RAID controller) you are correct: LVM-thick will sequester the entirety of the volume. Most modern SANs thin provision and compress volumes regardless of what the host puts on them; better ones dedup too.
2. Snapshots use space. Period. Deltas aren't free.
3. Guest snapshot orchestration is not provided, and is usually not necessary, since file system quiescence is usually enough for consistency. If your application does require it, you have a choice of writing the orchestration yourself or requesting it from the devs (ideally you'd write it and offer it to the devs for inclusion); see the sketch below for the general shape.
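
As a rough illustration only, a vzdump hookscript reacting to the backup phases might look like this (the SAN calls are placeholders; this is a sketch under assumptions, not a ready-made integration):

Code:
#!/bin/bash
# hypothetical vzdump hookscript: phase, mode and vmid are passed as arguments
phase="$1"; mode="$2"; vmid="$3"

case "$phase" in
    backup-start)
        # placeholder: trigger a hardware snapshot on the SAN via its API/CLI here
        echo "would take SAN snapshot before backing up VM $vmid (mode: $mode)"
        ;;
    backup-end)
        # placeholder: clean up or tag the SAN snapshot here
        echo "backup of VM $vmid finished"
        ;;
esac
exit 0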
 