Multipath - LVM Disappearing VGs

Hi there,
My setup is as shown in the attached picture:
The controllers are Dell ME5400s connected via SAS to each node for redundancy; the controllers handle the RAID.

[attached image: setup diagram]

We plan to have a quorum of three nodes, but at the moment we're testing with the two we have.

At the moment, multipath is set up and working (verified with fio testing) in a round-robin configuration, so there is only one mpatha device:
size=56T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=30 status=active
|- 3:0:0:0 sdb 8:16 active ready running
`- 3:0:1:0 sdc 8:32 active ready running
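
A minimal sketch of an /etc/multipath.conf that produces this layout (the vendor/product strings are my assumptions for the ME5 series, so verify them against `multipath -ll` and the controller docs):

defaults {
    user_friendly_names yes
    find_multipaths yes
}

devices {
    device {
        # vendor/product are placeholders for the ME5 series; confirm locally
        vendor  "DellEMC"
        product "ME5"
        # all paths in one group, round-robin across both (what I have now)
        path_grouping_policy multibus
        path_selector "round-robin 0"
        prio alua
        failback immediate
    }
}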

Since it is a single shared block storage device, we went with LVM and did the following:
root@X2:~# pvcreate /dev/mapper/mpatha
Physical volume "/dev/mapper/mpatha" successfully created.

root@X2:~# vgcreate VMData /dev/mapper/mpatha
Physical volume "/dev/mapper/mpatha" successfully created.
Volume group "VMData" successfully created

root@X2:~# vgs
VG #PV #LV #SN Attr VSize VFree
VMData 1 0 0 wz--n- <55.86t <55.86t
pve 1 3 0 wz--n- <445.62g 16.00g

root@X2:~# pvs
PV VG Fmt Attr PSize PFree
/dev/mapper/mpatha VMData lvm2 a-- <55.86t <55.86t
/dev/sda3 pve lvm2 a-- <445.62g 16.00g

[attached screenshot]
(I may have skipped listing some commands on X1)
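
One thing I'm not sure about is whether LVM on each node also needs a filter so it only scans the multipath device and never the underlying /dev/sdb and /dev/sdc paths; a sketch of what that would look like in /etc/lvm/lvm.conf (the device paths are just the ones from this setup):

devices {
    # accept the multipath device and the local boot disk, reject everything else
    global_filter = [ "a|^/dev/mapper/mpatha$|", "a|^/dev/sda3$|", "r|.*|" ]
}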

After that, I added the storage in Datacenter with the shared option enabled and verified that it replicated to the other node. I then attempted to install a VM and got an error when trying to save the formatting changes. Both nodes now list the storage as not active.
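
For reference, the resulting entry in /etc/pve/storage.cfg looks roughly like this (the storage ID and content types are just what I picked):

lvm: VMData
    vgname VMData
    content images,rootdir
    shared 1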

I go into both nodes and get:
root@X2:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 pve lvm2 a-- <445.62g 16.00g

root@X2:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- <445.62g 16.00g

root@X1:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 pve lvm2 a-- 445.62g 16.00g

root@X1:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- 445.62g 16.00g

Using journalctl I get: Dec 20 13:06:36 X1 pvestatd[2648]: command '/sbin/vgscan --ignorelockingfailure --mknodes' failed: exit code 5
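
(I assume a manual re-scan along these lines would bring the VG back temporarily, though it obviously wouldn't fix whatever makes it disappear:)

pvscan --cache /dev/mapper/mpatha
vgscan
vgchange -ay VMData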

Multipath still intact:
size=56T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=30 status=active
|- 3:0:0:0 sdb 8:16 active ready running
`- 3:0:1:0 sdc 8:32 active ready running

I attempted to create the same VG again and it succeeded, with no error about it already existing. Now that the VG is back and verified on the originating node, the other node still does not show the "new" VG, yet attempting to create it there says it already exists in the filesystem. The storage view in the GUI shows an active status on the originating node, but it is not active on the other node.

root@X2:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 pve lvm2 a-- <445.62g 16.00g

root@X2:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- <445.62g 16.00g

root@X2:~# vgcreate VMData /dev/mapper/mpatha
Physical volume "/dev/mapper/mpatha" successfully created.
Volume group "VMData" successfully created

root@X1:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- 445.62g 16.00g
root@X1:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- 445.62g 16.00g

root@X1:~# vgcreate VMData /dev/mapper/mpatha
/dev/VMData: already exists in filesystem
Run `vgcreate --help' for more information.

root@X1:~# /sbin/vgscan --ignorelockingfailure --mknodes
Found volume group "VMData" using metadata type lvm2
Found volume group "pve" using metadata type lvm2
Command failed with status code 5.

As I was writing this post, X2 lost the VG again.
root@X2:~# /sbin/vgscan --ignorelockingfailure --mknodes
Found volume group "pve" using metadata type lvm2
Command failed with status code 5.

dmesg shows nothing.
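
(I'm assuming the next thing to check when the VG drops is whether the LVM label is even still visible on the device, e.g.:)

blkid /dev/mapper/mpatha
pvck /dev/mapper/mpatha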

Is there anything that I am missing? Sorry if this is all over the place; I have been beating at this for the last day and am still lost after going through the forums and blog posts.
 
Update: changing my multipath policy from round-robin to group_by_prio allowed VMs to install. It seems this might be related to the paths back to the controller?
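
Roughly what that change looks like in the device section of /etc/multipath.conf (again, the vendor/product strings are assumptions for the ME5; verify them locally):

devices {
    device {
        # placeholders; confirm with multipath -ll and the controller docs
        vendor  "DellEMC"
        product "ME5"
        # group paths by ALUA priority instead of one big round-robin group
        path_grouping_policy group_by_prio
        path_selector "round-robin 0"
        prio alua
        failback immediate
    }
}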