I have 2 nodes plus a Pi as a quorum witness. Both nodes are directly connected to a SAN, with multipathing set up on each node. The nodes will switch to an unknown status and then back to online every few minutes. VMs continue to run fine, and I can migrate between the two nodes without any issues (moving only compute resources, not storage). I have spent a lot of time trying to figure out what is going on with no luck and am hoping someone here can help. The status issue goes away when I remove the shared LVM and iSCSI from the 2nd node, so I am wondering if it is my config. I will explain my config first (no switches are in use to connect the SAN to the hosts):
SAN iSCSI connections:
nic1 172.16.1.1 direct connect to host 1 (iscsi1)
nic2 172.16.2.1 direct connect to host 1 (iscsi2)
nic3 172.16.3.1 direct connect to host 2 (iscsi3)
nic4 172.16.4.1 direct connect to host 2 (iscsi4)

Host 1 iSCSI connections:
nic1 172.16.1.2 (iscsi1)
nic2 172.16.2.2 (iscsi2)

Host 2 iSCSI connections:
172.16.3.2 (iscsi3)
172.16.4.2 (iscsi4)
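The sessions themselves are nothing special; a minimal open-iscsi sketch of what a direct-connect login like this looks like (the IQN below is only a placeholder, not my real target name, and host 2 would use the 172.16.3.1 / 172.16.4.1 portals instead):

# discover the target on each portal, then log in over both links (host 1 shown)
iscsiadm -m discovery -t sendtargets -p 172.16.1.1
iscsiadm -m discovery -t sendtargets -p 172.16.2.1
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:placeholder -p 172.16.1.1 --login
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:placeholder -p 172.16.2.1 --login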
multipath.conf on both hosts (the multipath -ll output below confirms it is working):
blacklist {
    wwid .*
}
blacklist_exceptions {
    wwid "36589cfc000000b637b5ff277165ec1e0"
}
multipaths {
    multipath {
        wwid 36589cfc0000001d00c1272f749a6d297
        alias mpathb
    }
}
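For reference, the wwid can be double-checked on each host like this (sdb is just whichever /dev/sdX the LUN happens to show up as; multipathd is restarted afterwards to pick up config changes):

/lib/udev/scsi_id -g -u -d /dev/sdb    # print the wwid of the iSCSI LUN
systemctl restart multipathd           # re-read multipath.conf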
multipath -ll output on each host:
root@hosta:~# multipath -ll
mpathb (36589cfc0000001d00c1272f749a6d297) dm-7 TrueNAS,iSCSI Disk
size=12T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 6:0:0:0 sdb 8:16 active ready running
  `- 7:0:0:0 sdc 8:32 active ready running

root@hostb:~# multipath -ll
mpathb (36589cfc0000001d00c1272f749a6d297) dm-5 TrueNAS,iSCSI Disk
size=12T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 7:0:0:0 sdb 8:16 active ready running
  `- 8:0:0:0 sdc 8:32 active ready running
All 4 iSCSI connections show as connected on both hosts, and so does the LVM storage with the shared option enabled.
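For reference, a minimal sketch of how I check that and of the storage definition, assuming a storage named lvm-san on a volume group named vg_san (both names are placeholders for my actual ones):

# list the four iSCSI sessions and confirm the storage is online
iscsiadm -m session
pvesm status

# /etc/pve/storage.cfg entry for the shared LVM on top of mpathb
lvm: lvm-san
        vgname vg_san
        content images,rootdir
        shared 1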
One or both nodes will go to an unknown status along with everything below them, but I am still able to work from one host's web GUI on the other, and it does not affect anything at all besides the status. Any ideas? Did I do something wrong in this setup?
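In case it is useful, this is the kind of thing I can watch while the status flaps (my understanding is that a slow or hung storage scan in pvestatd is one possible cause of nodes showing as unknown, but I have not pinned it down):

# follow the cluster and status daemons while the nodes flap
journalctl -f -u corosync -u pve-cluster -u pvestatd

# confirm quorum is held with the QDevice on the Pi
pvecm status

# time a storage scan; a hang here would point at the shared LVM/iSCSI
time pvesm status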