FC Storage challenges

jone

New Member
Jul 11, 2024
I am on a path to migrate 15 ESXi servers (with vCenter, distributed networking, and clustering) and a 200TB FC SSD SAN to Proxmox virtualization. I am facing a lot of challenges due to the lack of a "good filesystem" like VMFS.

I have my first 2 PVE servers up and running with multipath and LVM in shared mode. The performance is really good, and migration of virtual machines between the nodes works really well.
Here are some challenges I am trying to get ideas on:
1. What is the easiest way to make sure we maintain an identical multipath config / disk naming across all servers? If I make a change or provision a new LUN, I will have to modify the multipath files on EVERY server.
2. How can I create a shared LUN and use it as a source folder for ISO / installation media, available to all nodes? It looks like I cannot upload ISO files to my LVM storage, nor create folders on LVM.
3. What is the easiest way to expand a LUN without downtime?

These were all tasks that were quick and seamless on ESXi.
 
1. What is the easiest way to make sure we maintain an identical multipath config / disk naming across all servers? If I make a change or provision a new LUN, I will have to modify the multipath files on EVERY server.
Automation is the worst enemy of human error. Implement Ansible and version control if you want to avoid copy/pasting.
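For illustration only, a minimal sketch of keeping /etc/multipath.conf identical from a single admin host, assuming passwordless root SSH and placeholder node names; Ansible's copy module plus a handler that reloads multipathd does the same thing in a more maintainable, version-controlled way.

# push the canonical multipath.conf to every node and reload multipathd
for node in pve01 pve02 pve03; do    # extend the list to all 15 nodes
    scp /srv/cluster-config/multipath.conf root@"$node":/etc/multipath.conf
    ssh root@"$node" systemctl reload multipathd
done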
How can I create a shared LUN and use it as a source folder for ISO / installation media, available to all nodes? It looks like I cannot upload ISO files to my LVM storage, nor create folders on LVM.
You either need to deploy a cluster-aware filesystem (e.g. OCFS2), or pass the LUN to a VM that runs an NFS server.
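If you go the NFS route, the export only has to be registered once at the datacenter level and every node will see it. A minimal sketch, assuming an NFS server (VM or external NAS) already exports a directory; the address, export path, and storage ID below are placeholders:

# register an existing NFS export as cluster-wide ISO/template storage
pvesm add nfs iso-share --server 10.0.0.50 --export /export/iso --content iso,vztmpl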
What is the easiest way to expand a LUN without downtime?
Perhaps you can expand on what you have done so far that caused downtime?
FC LUN expansion happens on the storage side; the host should automagically see the larger raw LUN size. You will then need to expand all the involved layers (PV, VG, LV/pool). In general, all of this happens without downtime.
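For illustration, a minimal sketch of those layers once the multipath map already reports the new size; <alias>, <vg>, and <lv> are placeholders, and the last two steps only apply if you actually want to grow an existing LV and the filesystem on it (resize2fs shown for ext4):

pvresize /dev/mapper/<alias>    # grow the PV; the VG picks up the free space automatically
lvextend -L +100G <vg>/<lv>     # optionally grow a specific LV into the new space
resize2fs /dev/<vg>/<lv>        # optionally grow the filesystem on that LV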


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
1. As LUNs have unique IDs, you can put all LUNs in the multipath config at once and even distribute the same file to all servers.
2. Create a filesystem on a LUN, after which you can create a folder for your ISOs and provide it in the datacenter for all PVE nodes.
3. Extend the LUN in your storage, then use lvextend. If that LVM volume has a filesystem, resize that as well.
 
1. As LUNs have unique IDs, you can put all LUNs in the multipath config at once and even distribute the same file to all servers.
2. Create a filesystem on a LUN, after which you can create a folder for your ISOs and provide it in the datacenter for all PVE nodes.
3. Extend the LUN in your storage, then use lvextend. If that LVM volume has a filesystem, resize that as well.
For 2, are there any filesystems that can be used on FC storage and support shared access (that anyone would advise using)?
I am looking at either installing an iSCSI system that I have in addition to FC, or running FreeNAS as iSCSI for sharing files.
 
Automation is the worst enemy of human error. Implement Ansible and version control if you want to avoid copy/pasting.

You either need to deploy a cluster-aware filesystem (e.g. OCFS2), or pass the LUN to a VM that runs an NFS server.

Perhaps you can expand on what you have done so far that caused downtime?
FC LUN expansion happens on the storage side; the host should automagically see the larger raw LUN size. You will then need to expand all the involved layers (PV, VG, LV/pool). In general, all of this happens without downtime.


Been doing some more testing.
Expanded my test LUN from 100G to 200G on my storage system. This LUN is shared to 2 different systems, using multipath.

root@SVR-DC-PX1:~# multipath -ll
3PARPV1 (360002ac0000000000000032c0001fdcd) dm-5 3PARdata,VV
size=30T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
|- 1:0:1:0 sde 8:64 active ready running
|- 2:0:0:0 sdg 8:96 active ready running
|- 1:0:0:0 sdc 8:32 active ready running
`- 2:0:1:0 sdi 8:128 active ready running
3PARPV2 (360002ac000000000000003370001fdcd) dm-6 3PARdata,VV
size=100G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
|- 1:0:1:1 sdf 8:80 active ready running
|- 2:0:0:1 sdh 8:112 active ready running
|- 1:0:0:1 sdd 8:48 active ready running
`- 2:0:1:1 sdj 8:144 active ready running



Rescanned devices, no change. Restarted the multipath service, and the new size shows up.
root@SVR-DC-PX1:~# multipath -ll
3PARPV1 (360002ac0000000000000032c0001fdcd) dm-5 3PARdata,VV
size=30T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
|- 1:0:1:0 sde 8:64 active ready running
|- 2:0:0:0 sdg 8:96 active ready running
|- 1:0:0:0 sdc 8:32 active ready running
`- 2:0:1:0 sdi 8:128 active ready running
3PARPV2 (360002ac000000000000003370001fdcd) dm-6 3PARdata,VV
size=200G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
|- 1:0:1:1 sdf 8:80 active ready running
|- 2:0:0:1 sdh 8:112 active ready running
|- 1:0:0:1 sdd 8:48 active ready running
`- 2:0:1:1 sdj 8:144 active ready running

Trying to resize the multipath device, but it fails.
root@SVR-DC-PX1:~# multipathd -k"resize map /dev/mapper/3PARV2"
fail


My previous attempt was resolved by rebooting all shared nodes.
Hoping to find a flow that works seamlessly.

Thanks
 
Hoping to find a flow that works seamlessly.
There are many variables involved. Things like "holders", IO load, etc.

You can try this procedure: https://docs.redhat.com/en/document...ath/online_device_resize#online_device_resize (the second Google result for me for "multipath new size").
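Roughly, that procedure applied to the map above would look like the sketch below; the sd* path names are taken from the earlier multipath -ll output for 3PARPV2:

# re-read the capacity on every path of the map first
echo 1 > /sys/block/sdd/device/rescan
echo 1 > /sys/block/sdf/device/rescan
echo 1 > /sys/block/sdh/device/rescan
echo 1 > /sys/block/sdj/device/rescan
# then ask multipathd to resize the map and verify
multipathd -k"resize map 3PARPV2"
multipath -ll 3PARPV2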

You may also want to consult HPE/3PAR support or their site for the best approach. It's not PVE specific; there are bugs in the software:

https://bugzilla.redhat.com/show_bug.cgi?id=352421#c16
https://dm-devel.redhat.narkive.com...ath-maps-reload-ioctl-failed-invalid-argument

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
1. What is the easiest way to make sure we maintain an identical multipath config / disk naming across all servers? If I make a change or provision a new LUN, I will have to modify the multipath files on EVERY server.
Why do you require the multipath file to be customized? I have implemented some clusters for customers, and I only mask the local disks and allow all disks with the correct vendor string.
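As a sketch of that approach: blacklist everything by default and whitelist the array by its vendor/product strings (the 3PARdata/VV values below match the multipath -ll output earlier in the thread). Any new LUN from the same array is then multipathed automatically, with no per-LUN entry needed:

blacklist {
    wwid ".*"
}
blacklist_exceptions {
    device {
        vendor "3PARdata"
        product "VV"
    }
}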
2. How can I create a shared LUN and use it as a source folder for ISO / installation media, available to all nodes? It looks like I cannot upload ISO files to my LVM storage, nor create folders on LVM.
You can use OCFS, but I prefer to use NFS for ISO files.
3. What is the easiest way to expand a LUN without downtime?
I have very mixed experiences. With some storage systems it's a bit tricky, and with others it works better.
Often it is enough to run rescan-scsi-bus.sh once and then multipath -v3 once.
But I have already seen several times that the expansion was only successful after all nodes had been restarted once.

I generally size the LUNs a little larger; the storage arrays do thin provisioning anyway.
I usually create a new additional LUN instead, which is easier and faster.
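A rough sketch of bringing such an additional LUN into the cluster as shared LVM, assuming it already shows up under /dev/mapper; the alias, VG name, and storage ID below are placeholders:

# on one node only:
pvcreate /dev/mapper/<new-alias>
vgcreate vg_san2 /dev/mapper/<new-alias>
# register it once at datacenter level; every node can then use it
pvesm add lvm san2-lvm --vgname vg_san2 --content images --shared 1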

However, I also prepare these customers for a switch to Ceph when they purchase new nodes. Then life is much easier.
 
1. What is the easiest way to make sure we maintain an identical multipath config / disk naming across all servers? If I make a change or provision a new LUN, I will have to modify the multipath files on EVERY server.
For the most part, once you set up your multipath config file, it should work the same way on all hosts using it. This isn't really a problem; usually even the default config works "well enough." Mostly, issues arise from LVM and multipath race conditions at boot, which are solvable with the multipath-tools-boot package.
2. How can I create a shared LUN and use it as a source folder for ISO / installation media, available to all nodes? It looks like I cannot upload ISO files to my LVM storage, nor create folders on LVM.
Create a NAS bridge VM serving out NFS. Normally I don't bother with this; your NAS can and should be external to your main cluster. I usually either use the backup device or a separate device altogether.

What is the easiest way to expand a LUN without downtime?
AFAIK resizing a LUN won't cause downtime anyway, and a simple echo "- - -" > /sys/class/scsi_host/hostX/scan should cause the host to see the new capacity.
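One caveat, as a sketch: that host-level scan discovers newly presented LUNs, while re-reading the size of an already-known LUN generally also needs the per-device rescan (hostX and sdX are placeholders):

# discover newly presented LUNs on every HBA
for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done
# re-read the capacity of an already-known (grown) LUN path
echo 1 > /sys/block/sdX/device/rescan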

What you aren't asking is just as important. There isn't a simple way to expose snapshot functionality for SCSI-based storage. If you're going to use your storage array for snapshots, it will not be exposed to the host OS, nor will it have exposed filesystem quiescence, meaning you'll need to manually orchestrate this both for taking snapshots and for any kind of restoration.
 
What you aren't asking is just as important. There isn't a simple way to expose snapshot functionality for SCSI-based storage. If you're going to use your storage array for snapshots, it will not be exposed to the host OS, nor will it have exposed filesystem quiescence, meaning you'll need to manually orchestrate this both for taking snapshots and for any kind of restoration.
Yes, snapshots do not work on LVM, but you can still make snapshot-mode backups, which are also very fast with PBS. This is usually sufficient as a replacement for VM snapshots.

But why shouldn't quiescing work? If the QEMU guest agent is installed, it works normally.
 
Yes, snapshots on the storage side, but the problem exists in many environments and is often only solved via agents in the hosts.
 
But many customers don't use the functions at all
I LOVE customers who pay for stuff they don't use! My favorite!

Snapshot integration is a pretty big deal in any vSphere infrastructure I support. If the OP doesn't want/need/use it, then it's not an issue. The good news is that implementing it (at least in a basic form) is a simple script that issues a VM freeze, calls the snapshot-create API, and then a VM thaw, so the orchestration isn't complicated. But it does not come in the box, either from Proxmox or the storage vendor.
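A rough sketch of that kind of wrapper, assuming the QEMU guest agent is running inside the VM; the VMID is a placeholder, and the array snapshot step is deliberately left as a comment since it depends entirely on your storage vendor's CLI or REST API:

#!/bin/bash
VMID=100                                          # placeholder VMID
trap 'qm guest cmd "$VMID" fsfreeze-thaw' EXIT    # always thaw, even if the snapshot step fails
qm guest cmd "$VMID" fsfreeze-freeze              # quiesce guest filesystems via the guest agent
# placeholder: call your storage array's snapshot CLI/API here while the guest is frozen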
 
AFAIK resizing a LUN won't cause downtime anyway, and a simple echo "- - -" > /sys/class/scsi_host/hostX/scan should cause the host to see the new capacity.
Worst case is to rescan on every level:
  • rescan all disks (echo 1 > /sys/block/sda/device/rescan) for all devices in your multipath or on your host, or use rescan-scsi-bus.sh from scsitools
  • reload multipath (multipath -r or systemctl reload multipathd)
  • pvresize the affected physical volumes
 
Worst case is to rescan on every level:
  • rescan all disks (echo 1 > /sys/block/sda/device/rescan) for all devices in your multipath or on your host, or use rescan-scsi-bus.sh from scsitools
  • reload multipath (multipath -r or systemctl reload multipathd)
  • pvresize the affected physical volumes
I spent some time over the weekend and tested. My test LUN grew from 100G to 350G, but as has been mentioned, it's thin provisioned on my 3PAR. I had to rescan, reload, and pvresize. Now that I have that sorted out, everything gets easier. Next, I need to change the multipath file to use wildcards. This is a long way to go coming from vCenter / ESXi, but it's free :D
 
Why do you require the multipath file to be customized? I have implemented some clusters for customers, and I only mask the local disks and allow all disks with the correct vendor string.

You can use OCFS, but I prefer to use NFS for ISO files.

I have very mixed experiences. With some storage systems it's a bit tricky, and with others it works better.
Often it is enough to run rescan-scsi-bus.sh once and then multipath -v3 once.
But I have already seen several times that the expansion was only successful after all nodes had been restarted once.

I generally size the LUNs a little larger; the storage arrays do thin provisioning anyway.
I usually create a new additional LUN instead, which is easier and faster.

However, I also prepare these customers for a switch to Ceph when they purchase new nodes. Then life is much easier.
With multipath, my experience has been to use the multipath config file with a WWID and an alias. This requires each LUN to have a static config entry on every node.

Like this:
multipaths {
    multipath {
        wwid "360002ac0000000000000032c0001fdcd"
        alias 3PARData1
    }
}


If I am using a wildcard config, will I be able to add new LUNs without editing the multipath file?
Is there anyone who has an example file to show me / educate me?

Thank you
 
