Hi,
we are currently migrating to Proxmox, slowly adding more nodes (HPE DL380 Gen10) to our cluster. The existing nodes are on 9.0.10, and the new nodes we are adding now are on 9.1.1 (we are aware that all nodes in the cluster have to be brought to the same version).

On those new 9.1.1 nodes we are having trouble when adding a new LUN over Fibre Channel. As part of our process we have to trigger a SCSI bus rescan, since the newly presented disks don't show up without one. We trigger the rescan by running:

Bash:
for host in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$host"; done

This works on the 9.0.10 nodes without any problem, but on 9.1.1 the command never finishes and locks up the shell. Only rebooting the node resolves this.

If we run "dmesg | grep lun", we get the following:

Bash:
[ 5677.064642] sd 2:1:0:0: lun4194304 has a LUN larger than allowed by the host adapter

Running lsscsi, we find that this ID belongs to the HPE P408i-a adapter, which holds our local RAID 1. It does not belong to the HPE SN1100Q adapter we use to access the Fibre Channel LUN.

Bash:
[2:0:0:0] enclosu HPE Smart Adapter 6.22 -
[2:1:0:0] disk HPE LOGICAL VOLUME 6.22 /dev/sda
[2:2:0:0] storage HPE P408i-a SR Gen10 6.22 -

This means we can work around the issue by running a rescan that excludes this adapter (host2):

Bash:
for host in $(ls /sys/class/scsi_host/ | grep -v host2); do echo "- - -" > "/sys/class/scsi_host/$host/scan"; done

Running the rescan without this adapter finishes without any problems.
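Since host numbering can change across reboots or hardware changes, hardcoding host2 is a bit fragile. A sketch of a more robust variant of our workaround (the rescan_hosts function name and the parameterized base directory are ours for illustration) skips hosts by driver name, read from each host's proc_name attribute, instead:

```shell
#!/bin/sh
# Sketch: rescan all SCSI hosts except those bound to a given driver
# (here smartpqi, the P408i-a driver), so the exclusion survives host
# renumbering. The sysfs base dir is a parameter only to make this testable.
rescan_hosts() {
    base=${1:-/sys/class/scsi_host}   # directory containing the hostN entries
    skip_driver=${2:-smartpqi}        # driver whose hosts must not be rescanned
    for host in "$base"/host*; do
        [ -e "$host/proc_name" ] || continue
        driver=$(cat "$host/proc_name")
        if [ "$driver" != "$skip_driver" ]; then
            # "- - -" means: scan all channels, all targets, all LUNs
            echo "- - -" > "$host/scan"
        fi
    done
}
```

Called with no arguments, rescan_hosts targets the real sysfs tree and skips every smartpqi host rather than a hardcoded host number.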
We have found a similar problem here: https://forum.proxmox.com/threads/i...roblem-with-qlogic-fiber-channel-cards.78797/
But the P408i-a runs on the smartpqi module, which can't make use of the ql2xmaxlun option; according to its manpage, smartpqi has no such option.

Now we are wondering how we can solve this so that we don't accidentally run into the shell lockup again.

Thanks!

PS: All nodes have the same hardware and, as far as we can tell, the same configuration.
PPS: On the nodes running 9.0.10 this LUN also has a weirdly high ID, without causing issues:
Bash:
root@node:~# cat /sys/class/scsi_disk/2\:1\:0\:0/device/lunid
0x0000004000000000
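As far as we can tell, the two numbers are consistent: the kernel converts the 8-byte LUN from sysfs into its internal integer by treating it as four big-endian 16-bit words and shifting word i left by 16*i bits (the scsilun_to_int conversion). A small arithmetic sketch of that conversion, using the lunid shown above, reproduces the value from the dmesg message:

```shell
#!/bin/sh
# Sketch of the kernel's scsilun_to_int conversion, applied to the lunid
# reported in sysfs. The 64-bit lunid is split into four 16-bit words;
# word i is shifted left by 16*i bits to form the internal LUN number.
lunid=$(( 0x0000004000000000 ))
lun=0
for i in 0 1 2 3; do
    word=$(( (lunid >> (48 - 16 * i)) & 0xFFFF ))
    lun=$(( lun | (word << (16 * i)) ))
done
echo "$lun"    # prints 4194304, i.e. 0x400000, matching "lun4194304" in dmesg
```

So the "weirdly high" ID is just how this LUN's address bytes map to Linux's flat integer LUN number; what differs between 9.0.10 and 9.1.1 is only that the newer kernel's scan stalls on it.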