Fiber Channel SAN and Multipath after a reboot

Hi all, I managed to configure Proxmox with a Fibre Channel SAN (HPE MSA 2062) and multipath-tools.

Everything is fine until I reboot: every time, the disk shows up as LVM2 Member instead of mpath_member.

After that, when I migrate or create a VM, I get a warning that I'm accessing a multipath device through /dev/sda instead of /dev/mapper/xxxx.

What am I missing?

Thanks in advance!
 
Yes, multipath -ll returns this, but dm-05 and dm-06 were the original IDs before the reboot.

This happens as soon as I set up LVM; if I just configure multipath without LVM, it comes back correctly every time.

RAID10 (3600c0ff00053025039357d6601000000) dm-31 HPE,MSA 2060 FC
size=3.5T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 0:0:0:3 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
`- 1:0:0:3 sdd 8:48 active ready running
RAID5 (3600c0ff000530276c0ea7a6601000000) dm-30 HPE,MSA 2060 FC
size=17T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 1:0:0:1 sdc 8:32 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
`- 0:0:0:1 sda 8:0 active ready running
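(General note, not from the original post: the dm-N numbers are assigned in discovery order and can change between boots; the stable handles are the /dev/mapper aliases and the WWIDs. A quick way to list the current mapping:)

Code:
ls -l /dev/mapper        # aliases -> current dm-N nodes
multipathd show maps     # map name, dm device and wwid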

**EDIT: I manually deleted the old disks in /dev/mapper that were shown in lsblk under /dev/sda instead of /dev/sda/RAID5 (the name of my multipath), and when I create a new disk it now shows up correctly under both multipath devices. Seems like this was the problem from the start...

I'll continue to test this, as we're planning to fully move from VMware to Proxmox at our renewal date (we'd need the new Foundation licenses at 100k+ if we don't move, and 6 Proxmox Premium subscriptions are way cheaper for our Platinum CPUs).

Here's the new lsblk output, which seems correct (since it's not in prod, only one VM was created, and it moved between both mpaths correctly):

sda 8:0 0 17.4T 0 disk
└─RAID5 252:2 0 17.4T 0 mpath
├─RAID5-vm--100--disk--0 252:4 0 128G 0 lvm
└─RAID5-vm--100--disk--1 252:5 0 4M 0 lvm
sdb 8:16 0 3.5T 0 disk
└─RAID10 252:3 0 3.5T 0 mpath
sdc 8:32 0 17.4T 0 disk
└─RAID5 252:2 0 17.4T 0 mpath
├─RAID5-vm--100--disk--0 252:4 0 128G 0 lvm
└─RAID5-vm--100--disk--1 252:5 0 4M 0 lvm
sdd 8:48 0 3.5T 0 disk
└─RAID10 252:3 0 3.5T 0 mpath
 
**EDIT: I manually deleted the old disks in /dev/mapper that were shown in lsblk under /dev/sda instead of /dev/sda/RAID5 (the name of my multipath), and when I create a new disk it now shows up correctly under both multipath devices. Seems like this was the problem from the start...
What exactly? /dev/sda should never be a directory; it should be a block device.
 
What exactly? /dev/sda should never be a directory; it should be a block device.
lsblk was showing the VM drives directly under /dev/sda, like this, instead of under the mpath device:

sda 8:0 0 17.4T 0 disk
├─RAID5-vm--100--disk--0 252:4 0 128G 0 lvm
└─RAID5-vm--100--disk--1 252:5 0 4M 0 lvm
└─RAID5 252:2 0 17.4T 0 mpath
sdb 8:16 0 3.5T 0 disk
└─RAID10 252:3 0 3.5T 0 mpath
sdc 8:32 0 17.4T 0 disk
└─RAID5 252:2 0 17.4T 0 mpath
sdd 8:48 0 3.5T 0 disk
└─RAID10 252:3 0 3.5T 0 mpath
 
Oh, that is what you mean ... not the actual path, but the output of lsblk ... yes. This is normal and I have this too.
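If in doubt, a generic LVM check (not from this thread) shows which device each LV is actually built on:

Code:
lvs -o lv_name,vg_name,devices
# the Devices column should point at /dev/mapper/RAID5(...) rather than /dev/sda(...)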
 
Oh, that is what you mean ... not the actual path, but the output of lsblk ... yes. This is normal and I have this too.
OK, in the host's Disks section, do you see LVM2 Member or mpath_member for those devices?

That was my first problem (before a restart it's mpath_member and afterwards it's LVM2 Member, but it was only accessing one of the two paths when moving/creating a VM).

Since I removed all the old VM disks as previously said, I no longer get the error about accessing the mpath device through sda, but they still show LVM2 Member.
 
Take a look at /etc/lvm/lvm.conf and look at the section on filters. You probably need to only allow some of your devices to be handled by LVM locally on the hypervisors, and block LVM from activating at boot on other block devices. This will include volumes presented from the SAN.
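A minimal sketch of what such a filter could look like in /etc/lvm/lvm.conf, assuming the multipath aliases used in this thread (RAID5/RAID10) and that the local boot disk is not one of the /dev/sdX devices; adjust to your own layout:

Code:
devices {
    # keep the stock PVE rejects, accept only the multipath aliases,
    # and stop LVM from scanning the raw SAN paths
    global_filter = [ "a|/dev/mapper/RAID5|", "a|/dev/mapper/RAID10|", "r|/dev/zd.*|", "r|/dev/rbd.*|", "r|/dev/sd.*|" ]
}

Devices that match no pattern (for example an NVMe boot disk) are still accepted by default.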
 
Take a look at /etc/lvm/lvm.conf and look at the section on filters. You probably need to only allow some of your devices to be handled by LVM locally on the hypervisors, and block LVM from activating at boot on other block devices. This will include volumes presented from the SAN.
I'll check that; when I was blacklisting sdX, the shared storage simply stopped working completely, so I just reverted to the stock ZFS and RBD blacklist.
 
There may be a race (for some reason) where your LVM starts earlier than DM. As @d0glesby mentioned, you may need to exclude the devices that are members of DMs from the LVM scan.
Keep in mind that you do want LVM to activate on your boot disk.
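A quick generic check (not from the original post) of whether LVM grabbed the raw path or the multipath device:

Code:
pvs -o pv_name,vg_name
# a PV listed as /dev/sdX instead of /dev/mapper/<alias> means LVM scanned the raw path first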

Good luck


I boot from an NVMe BOSS RAID card, so the DMs are sda, sdb, sdc and sdd, but when I blacklist those, the shared storage goes down completely.
 
I boot from an NVMe BOSS RAID card, so the DMs are sda, sdb, sdc and sdd, but when I blacklist those, the shared storage goes down completely.
Sounds like you possibly need to review your configuration bottom to top. Remove the storage pool from PVE, wipe the disks, and start building it back methodically.

Good luck


 
OK, in the host's Disks section, do you see LVM2 Member or mpath_member for those devices?
Yes, exactly like yours. Here is an excerpt (colorization with grc):

[screenshot attachment]

I also have this filter set in /etc/lvm/lvm.conf:

Code:
filter = [ "a|/dev/mapper/DX100*|", "a|/dev/mapper/DX200*|", "a|/dev/sda3|", "r|.*|" ]
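If useful, a generic way to check which filter settings LVM actually picked up (not part of the original post):

Code:
lvmconfig devices/filter
lvmconfig devices/global_filter
pvs -o pv_name,vg_name    # the SAN PVs should appear under /dev/mapper/..., not /dev/sdX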
 
I tried adding /dev/mapper/RAID5 to mine and got the same result.

I also deleted everything and restarted the creation of the pools; as soon as I run pvcreate /dev/mapper/RAID5, the devices appear as LVM2_member after a reboot, even without a VG on them (just before, it was mpath_member).
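(Side note, an assumption not confirmed in the thread: once pvcreate writes LVM metadata to the LUN, blkid on the underlying sdX paths will also report LVM2_member, so the GUI label alone isn't conclusive; what matters is whether multipath has claimed the path. A generic way to check, property names may vary between multipath-tools versions:)

Code:
udevadm info --query=property /dev/sda | grep -E 'ID_FS_TYPE|DM_MULTIPATH'
# DM_MULTIPATH_DEVICE_PATH=1 means multipathd has claimed this path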
 
It would probably help the community if you were to provide more data. Start from scratch and show the exact commands you run and the output.

Don't cut things out; if the output is long, use the SPOILER tag.

Start with no LVM or Multipath, and stop the service if needed. Provide "lsblk", "lsscsi", "blkid"
Add/enable multipath. Provide your exact configuration file: cat /etc/multipath.conf | egrep -v "^$|^#".
What is in your /etc/multipath/wwids?

What are the OS and PVE versions? (pveversion). Did you install it from PVE ISO or Debian?
What are the package versions for multipath?
Make sure your system is operational and survives reboot at this point.

What is in your lvm.conf? cat /etc/lvm/lvm.conf |egrep -v "^$|^#| *#"
Only after that add LVM. Make sure you show the commands you use to do so, the state of devices, etc. Follow on with reboot.

Many thousands of people use this configuration at this time; there is something in your workflow that is different, but it's hard to say with limited information.
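For convenience, the checks above could be gathered in one pass, roughly like this (lsscsi may need to be installed first; a sketch, not an exact script from this post):

Code:
lsblk
lsscsi                                        # apt install lsscsi
blkid
cat /etc/multipath.conf | egrep -v "^$|^#"
cat /etc/multipath/wwids
pveversion -v
dpkg -l | grep multipath
cat /etc/lvm/lvm.conf | egrep -v "^$|^#| *#"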



 
I used some guides from the internet for that, as expected, and merged information from two posts. PVE 8.2.4 (ISO installed) on no-subscription for now (I need a POC to justify the 10k CAD cost of 6 Premium sockets to my boss). lsscsi returns "command not found", but I included lsblk and blkid along with multipath -ll.

I think it's about the filter in /etc/lvm/lvm.conf.

https://gist.github.com/mrpeardotnet/547aecb041dbbcfa8334eb7ffb81d784
https://pve.proxmox.com/wiki/ISCSI_Multipath#Configuration

Here's the config I run:


/etc/multipath.conf

defaults {
    polling_interval 2
    path_selector "round-robin 0"
    path_grouping_policy multibus
    rr_min_io 100
    failback immediate
    no_path_retry queue
    find_multipaths yes
}

blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z][0-9]*"
    devnode "^cciss!c[0-9]d[0-9].*"
}

devices {
    device {
        vendor "(HP|HPE)"
        product "MSA [12]0[456]0 (SAN|SAS|FC|iSCSI)"
        path_grouping_policy "group_by_prio"
        prio "alua"
        failback "immediate"
        no_path_retry 18
    }
}

multipaths {
    multipath {
        wwid 3600c0ff000530276c0ea7a6601000000
        alias RAID5
    }
    multipath {
        wwid 3600c0ff00053025039357d6601000000
        alias RAID10
    }
}
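(As a side note, not part of the original configuration dump: to see the effective settings multipathd actually applies for the MSA, that is the built-in hardware table merged with the devices override above, these generic commands can help:)

Code:
multipathd show config | grep -A 15 'MSA'
multipath -t    # dump of the internal/merged configuration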



/etc/multipath/wwids

# Multipath wwids, Version : 1.0
# NOTE: This file is automatically maintained by multipath and multipathd.
# You should not need to edit this file in normal circumstances.
#
# Valid WWIDs:
/3600c0ff000530276c0ea7a6601000000/
/3600c0ff00053025039357d6601000000/



/etc/multipath/bindings

# Multipath bindings, Version : 1.0
# NOTE: this file is automatically maintained by the multipath program.
# You should not need to edit this file in normal circumstances.
#
# Format:
# alias wwid
#
RAID5 3600c0ff000530276c0ea7a6601000000
RAID10 3600c0ff00053025039357d6601000000



/etc/lvm/lvm.conf

devices {
    # added by pve-manager to avoid scanning ZFS zvols and Ceph rbds
    global_filter=["r|/dev/zd.*|","r|/dev/rbd.*|"]
}

cat /etc/lvm/lvm.conf |egrep -v "^$|^#| *#"
config {
}
devices {
}
allocation {
}
log {
}
backup {
}
shell {
}
global {
}
activation {
}
dmeventd {
}
devices {
global_filter=["r|/dev/zd.*|","r|/dev/rbd.*|"]
}
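Based on the discussion above, one possible adjustment would be to extend this global_filter so LVM only sees the multipath aliases and never the raw FC paths. This is a sketch, assuming the boot device is on NVMe and that sda-sdd remain the only FC path devices:

Code:
devices {
    # stock PVE rejects, plus: accept the multipath aliases, reject the raw FC paths
    global_filter=["r|/dev/zd.*|","r|/dev/rbd.*|","a|/dev/mapper/RAID5|","a|/dev/mapper/RAID10|","r|/dev/sd[a-d].*|"]
}

After changing the filter it may also be worth refreshing the initramfs (update-initramfs -u) so the early-boot LVM scan uses the same rules; again, an assumption, not something tested in this thread.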
 

Also, keep in mind that nothing in the above steps/configuration is PVE-specific; it's general Linux management of FC storage.


Yeah, I understand. It's my first time with an FC SAN and multipath, but I've been using Proxmox with Ceph/ZFS since 2018; it's just that my company wants to keep the current SAN storage they paid for last year, so I'm learning how to do it ;)
 