Hi
I have recently started using Proxmox and have a cluster running fine.
I have a Huawei server with 6 blades and a big SAN attached to them, and I reloaded the blades with Proxmox.
I set up the SAN as Fibre Channel rather than iSCSI, as we have speed issues with the iSCSI setup we are currently using.
I managed to set up Fibre Channel on the first node by following 2 guides:
https://pve.proxmox.com/wiki/Multipath
https://www.youtube.com/watch?v=aF2QUbmxvcw
These were my steps on the first node:
Install Proxmox, then run a handy script that enables the no-subscription repositories and updates Proxmox, then restart the node.
Then apt update, followed by apt install multipath-tools; I also did apt install grub-efi-amd64, then rebooted.
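Put together, the package prep on the first node was roughly (the repository script is whatever helper you already use for that):
apt update
apt install multipath-tools
apt install grub-efi-amd64
reboot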
Then fdisk -l
found 2 devices (well, 4: each one shows up twice, once per path)
Disk /dev/sdb: 17.28 TiB, 19002344603648 bytes, 37113954304 sectors
Disk model: XSG1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/sdc: 139.93 TiB, 153854788239360 bytes, 300497633280 sectors
Disk model: XSG1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
To get the WWID of a device: /lib/udev/scsi_id -g -u -d /dev/sdc (and the same for /dev/sdb)
Now add the WWIDs to the multipath setup:
multipath -a 36785860100b13fad1aa17fca00000001
multipath -a 36785860100b13fad1aa1950100000002
Then reload the multipath maps with multipath -r
Create the file /etc/multipath.conf (fcpath1 and fcpath2 are names that I chose):
defaults {
    find_multipaths yes
    user_friendly_names yes
}
blacklist {
    devnode "^hd[a-z][0-9]*"
    devnode "^sda$"
}
multipaths {
    multipath {
        wwid "36785860100b13fad1aa17fca00000001"
        alias "fcpath1"
    }
    multipath {
        wwid "36785860100b13fad1aa1950100000002"
        alias "fcpath2"
    }
}
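To make sure the aliases from the new config get picked up after editing the file, reloading multipath should do it, something like:
systemctl restart multipathd.service
multipath -r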
To see if the system is working I run multipath -ll:
fcpath1 (36785860100b13fad1aa17fca00000001) dm-5 HUAWEI,XSG1
size=17T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 11:0:0:1 sdb 8:16 active ghost running
`- 12:0:0:1 sdd 8:48 active ready running
fcpath2 (36785860100b13fad1aa1950100000002) dm-6 HUAWEI,XSG1
size=140T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 11:0:0:2 sdc 8:32 active ready running
`- 12:0:0:2 sde 8:64 active ghost running
Now I run
pvcreate /dev/mapper/fcpath1
pvcreate /dev/mapper/fcpath2
and then
vgcreate fastsas /dev/mapper/fcpath1
vgcreate slowsas /dev/mapper/fcpath2
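At this point pvs and vgs on the first node should show the new physical volumes and volume groups sitting on top of the multipath devices:
pvs    # expect /dev/mapper/fcpath1 and /dev/mapper/fcpath2 listed as PVs
vgs    # expect fastsas (~17T) and slowsas (~140T)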
Then I went into the web GUI and added the node to my cluster.
Then I went to Datacenter > Storage, clicked Add and selected LVM. For the ID I used FCfast, selected the volume group fastsas, selected the node and ticked Shared. I did the same with the other volume group but called it FCslow.
All works well: I can live-migrate VMs to it, the VMs run fine, they can ping out and see the internet, all is good. Pinging them from outside computers works fine too, no packet loss.
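For reference, that GUI step just writes entries into the cluster-wide /etc/pve/storage.cfg, so with the names above the result should look roughly like this (node names here are placeholders for my real ones):
lvm: FCfast
        vgname fastsas
        content images,rootdir
        shared 1
        nodes node1

lvm: FCslow
        vgname slowsas
        content images,rootdir
        shared 1
        nodes node1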
Now I have 4 more nodes that are also connected to the same SAN. I installed Proxmox on 3 of them and followed the same steps as on the first one, except I stopped right before pvcreate and vgcreate, because as I understand it that would destroy the data on the devices? Am I wrong?
I then added them to the cluster, went to Datacenter > Storage and just added the new node names to the FCfast and FCslow LVM storages.
The LVMs come up under storage on those nodes and I can migrate offline VMs to them no problem, but with a live VM the migration completes and then fails: the VM does not recover and stays offline. I need to start it manually, no biggie.
But now I have a weird problem. VMs running on the 2nd, 3rd and 4th nodes can ping outwards with no packet loss, BUT pinging them from outside computers I get either no reply or massive packet loss of 15-25%.
As I have seen on this post: https://forum.proxmox.com/threads/fiber-channel-setup.35225/
"- you must manually configure the SAN Backed LVM storage, from each proxmox node. There is not any 'automatic propigation' of shared storage from one Proxmox node to other nodes, for any (!) of the shared storage types, as far as I am aware (ie, iSCSI, etc)
ie, you basically
- setup on first proxmox node
- then go to second node,
- only step you won't have to repeat, is the creation of the LVM on the SAN Disk volume. But you do need to 'add' the storage into the proxmox node. Flag it as type = shared with the check-box option. Then it comes online. You must rinse and repeat this config process on all nodes who need access to the shared storage."
the "you must manually configure the SAN Backed LVM storage, from each proxmox node." which i believe i did with multipath?
but i dont understand:
"you do need to 'add' the storage into the proxmox node. Flag it as type = shared with the check-box option. Then it comes online. You must rinse and repeat this config process on all nodes who need access to the shared storage."
You can't add it to the node itself; you have to do it via Datacenter, which is cluster wide.
Now my theory is that the Fibre Channel setup was not completed on the 2nd, 3rd and 4th nodes, and that they are accessing FCfast and FCslow over the network via node 1, and that's causing the issues... well, I don't know, I'm spitballing.
Should I run pvcreate and vgcreate on the 2nd, 3rd and 4th nodes as well? Does this just create a physical volume and volume group on the node without touching the data on the SAN?
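As a sanity check I guess I can already look at this read-only on nodes 2-4 (these commands only read the LVM metadata, which lives on the SAN LUNs themselves, so they shouldn't touch anything):
pvs    # should already list /dev/mapper/fcpath1 and /dev/mapper/fcpath2 as PVs
vgs    # should already show fastsas and slowsas if the nodes really see the SAN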
Running multipath -ll gives the same results on all 4 nodes.
I can ping nodes 1, 2, 3 and 4 plus the VMs on node 1 and the VMs on node 2 (nothing on 3 and 4 yet) at the same time, and only the VMs on node 2 drop packets, while the nodes themselves stay stable and the VMs on node 1 are also stable, so it's not the nodes' network interfaces. (A VM on node 3 could ping the internet fine but had 100% packet loss when pinged from another device, so I migrated it to node 1.) VMs work fine if I migrate them to node 1 or to the other nodes in the cluster where they originally were.
All 4 nodes use the same network interface; it's basically a virtual interface that the Huawei chassis makes available to all blades.
All 4 nodes have identical interface configs (except the IPs of course), hosts files and resolv.conf files.
All nodes are on the same management VLAN 210 (it has a gateway and DNS, but outside comms are blocked on the firewall).
All nodes are on the same Proxmox cluster VLAN 220.
All nodes are on the same iSCSI VLAN 210.
All nodes are also on the same VLAN 10 as the servers they host, just in case.
All servers and devices used for the ping tests are either on VLAN 10 or have full access to VLAN 10 via firewall rules.
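If it helps, a quick way I can double-check that the configs really are identical is to diff each node's config against node 1 (node names are placeholders):
for n in node2 node3 node4; do
    echo "== $n =="
    ssh root@$n cat /etc/network/interfaces | diff /etc/network/interfaces - && echo "matches node 1"
done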
Any insights will be appreciated.