Hi,
OK, I have two PVE 9.0.11 nodes in my cluster where I'm trying to get iSCSI and multipath working. As for the overall hardware setup, I've got the two nodes (call them node 1 and node 2), a TrueNAS SAN (Enterprise, dual controller), and two UniFi fiber aggregation switches (not the ECS model, so no MLAG unfortunately). Each node has a connection to each switch (2 different subnets), and the SAN is cross-connected to the switches over the 2 subnets (controller A-1 to switch 1, controller A-2 to switch 2, controller B-1 to switch 2, controller B-2 to switch 1).
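To make the cabling easier to reason about, here it is written out as a small Python sketch. The port labels (node1-a, san-A-1, and so on) are made up for illustration and are not real interface names, and the sketch assumes each subnet lives on exactly one switch with no routing between the two switches.

```python
# Cabling as described above. Port labels are hypothetical, for illustration only.
# Assumption: each switch carries one subnet and there is no inter-switch routing.
LINKS = {
    "switch1": ["node1-a", "node2-a", "san-A-1", "san-B-2"],
    "switch2": ["node1-b", "node2-b", "san-A-2", "san-B-1"],
}


def paths_through(switch):
    """Return (node_port, san_port) pairs whose traffic crosses the given switch."""
    ports = LINKS[switch]
    nodes = [p for p in ports if p.startswith("node")]
    sans = [p for p in ports if p.startswith("san")]
    return [(n, s) for n in nodes for s in sans]


def surviving_paths(rebooting):
    """Return the paths that should stay up while one switch is rebooting."""
    return [p for sw in LINKS if sw != rebooting for p in paths_through(sw)]


if __name__ == "__main__":
    for sw in LINKS:
        print(sw, "down ->", surviving_paths(sw))
```

Under those assumptions, each node should keep its paths via the other switch during a reboot and get the rest back once the rebooted switch is up again.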
I've followed the multipath wiki and it works - I have iSCSI connections and a connected LVM for guests. The problem is that I'm running into reconnect issues when the switches are rebooted. If I reboot switch 1, node 2 loses one of its iSCSI paths and does not reconnect that path until I reboot the node. Node 1 does reconnect once the switch has fully booted. If I reboot switch 2, both nodes lose one of their iSCSI paths and neither reconnects until I reboot each node. I'm also unable to ping the SAN from the nodes on the affected interface.
To be clear, multipath does appear to be working insofar as I never lose connectivity completely... I just can't seem to get full connectivity back after a switch reboot without also rebooting the nodes. Note that I DO NOT see this behavior if I simply unplug either interface. After an unplug, the corresponding target goes down but is restored/reconnected once the cable is plugged back in.
Does anyone have any thoughts on why this happens, and is there any way to mitigate it? Thanks...