4-node iSCSI multipath not working on 2 of the nodes

poxin

I had multipath working in the past on all 4 nodes that connect to a TrueNAS share. I'm not sure when it changed, but 2 of the nodes no longer use multipath, and I've been struggling to figure out why. I've verified the config files and the WWIDs on the drives, rebooted the nodes, and restarted the multipath service, but nothing changed.
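
For reference, the service restart and map reload were along these lines (standard multipath-tools commands):
code_language.shell:
~# systemctl restart multipathd
~# multipath -r     # force a reload of the existing multipath maps
~# multipath -ll    # re-check the resulting topology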

The multipath.conf file is identical on each one:
code_language.shell:
~# cat /etc/multipath.conf
blacklist {
        wwid .*
}

blacklist_exceptions {
    wwid "36589cfc000000564f17ba1e2c35fde22"
    wwid "36589cfc000000c942157c6355c3e1c7e"
    wwid "36589cfc00000061b98ff986eb4c5d026"
    wwid "36589cfc0000003058d269875bf10299b"
}

defaults {
        polling_interval        2
        path_selector           "round-robin 0"
        path_grouping_policy    multibus
        uid_attribute           ID_SERIAL
        rr_min_io               100
        failback                immediate
        no_path_retry           queue
        user_friendly_names     yes
}
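
To confirm the running daemon actually picked this file up (rather than falling back to built-in defaults), you can dump the live configuration:
code_language.shell:
~# multipathd show config | less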

And the device WWIDs are present on each node, though I do notice the devices map to different letters (/dev/sdc through /dev/sdi) depending on the node, if that matters:

code_language.shell:
~# /lib/udev/scsi_id -g -u -d /dev/sdc
36589cfc000000c942157c6355c3e1c7e

~# /lib/udev/scsi_id -g -u -d /dev/sde
36589cfc000000564f17ba1e2c35fde22

~# /lib/udev/scsi_id -g -u -d /dev/sdg
36589cfc00000061b98ff986eb4c5d026

~# /lib/udev/scsi_id -g -u -d /dev/sdi
36589cfc0000003058d269875bf10299b
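
A quick way to dump the WWID for every candidate device at once (a sketch; the device letters vary per node, as noted above):
code_language.shell:
~# for d in /dev/sd[c-i]; do printf '%s: ' "$d"; /lib/udev/scsi_id -g -u -d "$d"; done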

They are also present in /etc/multipath/wwids:
code_language.shell:
~# cat /etc/multipath/wwids
# Multipath wwids, Version : 1.0
# NOTE: This file is automatically maintained by multipath and multipathd.
# You should not need to edit this file in normal circumstances.
#
# Valid WWIDs:
/36589cfc000000564f17ba1e2c35fde22/
/36589cfc000000c942157c6355c3e1c7e/
/36589cfc00000061b98ff986eb4c5d026/
/36589cfc0000003058d269875bf10299b/

The first node below is one that is no longer working; the second one looks fine. What else can I check?

PVE NODE 1 (no longer working)
code_language.shell:
pve-01:~# multipath -ll
mpatha (36589cfc000000c942157c6355c3e1c7e) dm-22 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 8:0:0:0  sdd 8:48 active ready running
mpathb (36589cfc00000061b98ff986eb4c5d026) dm-0 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 7:0:0:0  sdc 8:32 active ready running
mpathc (36589cfc000000564f17ba1e2c35fde22) dm-24 TrueNAS,iSCSI Disk
size=3.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 9:0:0:0  sde 8:64 active ready running
mpathd (36589cfc0000003058d269875bf10299b) dm-23 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 10:0:0:0 sdf 8:80 active ready running

PVE NODE 4 (working)
code_language.shell:
pve-04:~# multipath -ll
mpatha (36589cfc000000c942157c6355c3e1c7e) dm-0 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 8:0:0:0  sdd 8:48  active ready running
  `- 7:0:0:0  sdc 8:32  active ready running
mpathb (36589cfc00000061b98ff986eb4c5d026) dm-15 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 11:0:0:0 sdg 8:96  active ready running
  `- 12:0:0:0 sdh 8:112 active ready running
mpathc (36589cfc000000564f17ba1e2c35fde22) dm-14 TrueNAS,iSCSI Disk
size=3.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 10:0:0:0 sdf 8:80  active ready running
  `- 9:0:0:0  sde 8:64  active ready running
mpathd (36589cfc0000003058d269875bf10299b) dm-41 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 13:0:0:0 sdi 8:128 active ready running
 
Here are the multipath -v3 outputs from both nodes:

PVE NODE 1 (no longer working)
code_language.shell:
pve-01:~# multipath -v3
Jun 30 09:08:41 | set open fds limit to 1048576/1048576
Jun 30 09:08:41 | loading //lib/multipath/libchecktur.so checker
Jun 30 09:08:41 | checker tur: message table size = 3
Jun 30 09:08:41 | loading //lib/multipath/libprioconst.so prioritizer
Jun 30 09:08:41 | _init_foreign: foreign library "nvme" is not enabled
Jun 30 09:08:41 | sda: size = 234441648
Jun 30 09:08:41 | sda: vendor = ATA
Jun 30 09:08:41 | sda: product = INTEL SSDSC2BB12
Jun 30 09:08:41 | sda: rev = 0370
Jun 30 09:08:41 | sda: h:b:t:l = 0:0:0:0
Jun 30 09:08:41 | sda: tgt_node_name = 0x4433221103000000
Jun 30 09:08:41 | sda: 14593 cyl, 255 heads, 63 sectors/track, start at 0
Jun 30 09:08:41 | sda: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sda: serial = CVWL425100UJ120LGN
Jun 30 09:08:41 | sda: detect_checker = yes (setting: multipath internal)
Jun 30 09:08:41 | sda: path_checker = tur (setting: multipath internal)
Jun 30 09:08:41 | sda: checker timeout = 30 s (setting: kernel sysfs)
Jun 30 09:08:41 | sda: tur state = up
Jun 30 09:08:41 | sda: uid_attribute = ID_SERIAL (setting: multipath.conf defaults/devices section)
Jun 30 09:08:41 | sda: uid = 355cd2e404bcaeb31 (udev)
Jun 30 09:08:41 | sda: wwid 355cd2e404bcaeb31 blacklisted
Jun 30 09:08:41 | sdb: size = 234441648
Jun 30 09:08:41 | sdb: vendor = ATA
Jun 30 09:08:41 | sdb: product = INTEL SSDSC2BB12
Jun 30 09:08:41 | sdb: rev = 0370
Jun 30 09:08:41 | sdb: h:b:t:l = 0:0:1:0
Jun 30 09:08:41 | sdb: tgt_node_name = 0x4433221102000000
Jun 30 09:08:41 | sdb: 14593 cyl, 255 heads, 63 sectors/track, start at 0
Jun 30 09:08:41 | sdb: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sdb: serial = CVWL425200AW120LGN
Jun 30 09:08:41 | sdb: detect_checker = yes (setting: multipath internal)
Jun 30 09:08:41 | sdb: path_checker = tur (setting: multipath internal)
Jun 30 09:08:41 | sdb: checker timeout = 30 s (setting: kernel sysfs)
Jun 30 09:08:41 | sdb: tur state = up
Jun 30 09:08:41 | sdb: uid_attribute = ID_SERIAL (setting: multipath.conf defaults/devices section)
Jun 30 09:08:41 | sdb: uid = 355cd2e404bcb446e (udev)
Jun 30 09:08:41 | sdb: wwid 355cd2e404bcb446e blacklisted
Jun 30 09:08:41 | sr0: device node name blacklisted
Jun 30 09:08:41 | sdf: size = 8589934593
Jun 30 09:08:41 | sdf: vendor = TrueNAS
Jun 30 09:08:41 | sdf: product = iSCSI Disk
Jun 30 09:08:41 | sdf: rev = 0123
Jun 30 09:08:41 | sdf: h:b:t:l = 10:0:0:0
Jun 30 09:08:41 | sdf: tgt_node_name = iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb
Jun 30 09:08:41 | sdf: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sdf: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sdf: serial = ecf4bbd04714000
Jun 30 09:08:41 | sdf: detect_checker = yes (setting: multipath internal)
Jun 30 09:08:41 | sdf: path_checker = tur (setting: storage device autodetected)
Jun 30 09:08:41 | sdf: checker timeout = 30 s (setting: kernel sysfs)
Jun 30 09:08:41 | sdf: tur state = up
Jun 30 09:08:41 | sdf: uid_attribute = ID_SERIAL (setting: multipath.conf defaults/devices section)
Jun 30 09:08:41 | sdf: uid = 36589cfc0000003058d269875bf10299b (udev)
Jun 30 09:08:41 | sdf: wwid 36589cfc0000003058d269875bf10299b whitelisted
Jun 30 09:08:41 | sdf: detect_prio = yes (setting: multipath internal)
Jun 30 09:08:41 | loading //lib/multipath/libpriosysfs.so prioritizer
Jun 30 09:08:41 | sdf: prio = sysfs (setting: storage device autodetected)
Jun 30 09:08:41 | sdf: prio args = "" (setting: storage device autodetected)
Jun 30 09:08:41 | sdf: sysfs prio = 50
Jun 30 09:08:41 | sdc: size = 8589934593
Jun 30 09:08:41 | sdc: vendor = TrueNAS
Jun 30 09:08:41 | sdc: product = iSCSI Disk
Jun 30 09:08:41 | sdc: rev = 0123
Jun 30 09:08:41 | sdc: h:b:t:l = 7:0:0:0
Jun 30 09:08:41 | sdc: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
Jun 30 09:08:41 | sdc: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sdc: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sdc: serial = 246e96471b80001
Jun 30 09:08:41 | sdc: detect_checker = yes (setting: multipath internal)
Jun 30 09:08:41 | sdc: path_checker = tur (setting: storage device autodetected)
Jun 30 09:08:41 | sdc: checker timeout = 30 s (setting: kernel sysfs)
Jun 30 09:08:41 | sdc: tur state = up
Jun 30 09:08:41 | sdc: uid_attribute = ID_SERIAL (setting: multipath.conf defaults/devices section)
Jun 30 09:08:41 | sdc: uid = 36589cfc00000061b98ff986eb4c5d026 (udev)
Jun 30 09:08:41 | sdc: wwid 36589cfc00000061b98ff986eb4c5d026 whitelisted
Jun 30 09:08:41 | sdc: detect_prio = yes (setting: multipath internal)
Jun 30 09:08:41 | sdc: prio = sysfs (setting: storage device autodetected)
Jun 30 09:08:41 | sdc: prio args = "" (setting: storage device autodetected)
Jun 30 09:08:41 | sdc: sysfs prio = 50
Jun 30 09:08:41 | sdd: size = 8589934593
Jun 30 09:08:41 | sdd: vendor = TrueNAS
Jun 30 09:08:41 | sdd: product = iSCSI Disk
Jun 30 09:08:41 | sdd: rev = 0123
Jun 30 09:08:41 | sdd: h:b:t:l = 8:0:0:0
Jun 30 09:08:41 | sdd: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
Jun 30 09:08:41 | sdd: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sdd: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sdd: serial = 246e96471b80000
Jun 30 09:08:41 | sdd: detect_checker = yes (setting: multipath internal)
Jun 30 09:08:41 | sdd: path_checker = tur (setting: storage device autodetected)
Jun 30 09:08:41 | sdd: checker timeout = 30 s (setting: kernel sysfs)
Jun 30 09:08:41 | sdd: tur state = up
Jun 30 09:08:41 | sdd: uid_attribute = ID_SERIAL (setting: multipath.conf defaults/devices section)
Jun 30 09:08:41 | sdd: uid = 36589cfc000000c942157c6355c3e1c7e (udev)
Jun 30 09:08:41 | sdd: wwid 36589cfc000000c942157c6355c3e1c7e whitelisted
Jun 30 09:08:41 | sdd: detect_prio = yes (setting: multipath internal)
Jun 30 09:08:41 | sdd: prio = sysfs (setting: storage device autodetected)
Jun 30 09:08:41 | sdd: prio args = "" (setting: storage device autodetected)
Jun 30 09:08:41 | sdd: sysfs prio = 50
Jun 30 09:08:41 | sde: size = 6979321857
Jun 30 09:08:41 | sde: vendor = TrueNAS
Jun 30 09:08:41 | sde: product = iSCSI Disk
Jun 30 09:08:41 | sde: rev = 0123
Jun 30 09:08:41 | sde: h:b:t:l = 9:0:0:0
Jun 30 09:08:41 | sde: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb
Jun 30 09:08:41 | sde: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sde: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sde: serial = 246e96471b80002
Jun 30 09:08:41 | sde: detect_checker = yes (setting: multipath internal)
Jun 30 09:08:41 | sde: path_checker = tur (setting: storage device autodetected)
Jun 30 09:08:41 | sde: checker timeout = 30 s (setting: kernel sysfs)
Jun 30 09:08:41 | sde: tur state = up
Jun 30 09:08:41 | sde: uid_attribute = ID_SERIAL (setting: multipath.conf defaults/devices section)
Jun 30 09:08:41 | sde: uid = 36589cfc000000564f17ba1e2c35fde22 (udev)
Jun 30 09:08:41 | sde: wwid 36589cfc000000564f17ba1e2c35fde22 whitelisted
Jun 30 09:08:41 | sde: detect_prio = yes (setting: multipath internal)
Jun 30 09:08:41 | sde: prio = sysfs (setting: storage device autodetected)
Jun 30 09:08:41 | sde: prio args = "" (setting: storage device autodetected)
Jun 30 09:08:41 | sde: sysfs prio = 50
Jun 30 09:08:41 | loop0: device node name blacklisted
Jun 30 09:08:41 | loop1: device node name blacklisted
Jun 30 09:08:41 | loop2: device node name blacklisted
Jun 30 09:08:41 | loop3: device node name blacklisted
Jun 30 09:08:41 | loop4: device node name blacklisted
Jun 30 09:08:41 | loop5: device node name blacklisted
Jun 30 09:08:41 | loop6: device node name blacklisted
Jun 30 09:08:41 | loop7: device node name blacklisted
Jun 30 09:08:41 | dm-0: device node name blacklisted
Jun 30 09:08:41 | dm-1: device node name blacklisted
Jun 30 09:08:41 | dm-10: device node name blacklisted
Jun 30 09:08:41 | dm-11: device node name blacklisted
Jun 30 09:08:41 | dm-12: device node name blacklisted
Jun 30 09:08:41 | dm-13: device node name blacklisted
Jun 30 09:08:41 | dm-14: device node name blacklisted
Jun 30 09:08:41 | dm-15: device node name blacklisted
Jun 30 09:08:41 | dm-16: device node name blacklisted
Jun 30 09:08:41 | dm-17: device node name blacklisted
Jun 30 09:08:41 | dm-18: device node name blacklisted
Jun 30 09:08:41 | dm-19: device node name blacklisted
Jun 30 09:08:41 | dm-2: device node name blacklisted
Jun 30 09:08:41 | dm-20: device node name blacklisted
Jun 30 09:08:41 | dm-21: device node name blacklisted
Jun 30 09:08:41 | dm-22: device node name blacklisted
Jun 30 09:08:41 | dm-23: device node name blacklisted
Jun 30 09:08:41 | dm-24: device node name blacklisted
Jun 30 09:08:41 | dm-25: device node name blacklisted
Jun 30 09:08:41 | dm-26: device node name blacklisted
Jun 30 09:08:41 | dm-27: device node name blacklisted
Jun 30 09:08:41 | dm-28: device node name blacklisted
Jun 30 09:08:41 | dm-29: device node name blacklisted
Jun 30 09:08:41 | dm-3: device node name blacklisted
Jun 30 09:08:41 | dm-30: device node name blacklisted
Jun 30 09:08:41 | dm-31: device node name blacklisted
Jun 30 09:08:41 | dm-32: device node name blacklisted
Jun 30 09:08:41 | dm-33: device node name blacklisted
Jun 30 09:08:41 | dm-34: device node name blacklisted
Jun 30 09:08:41 | dm-35: device node name blacklisted
Jun 30 09:08:41 | dm-36: device node name blacklisted
Jun 30 09:08:41 | dm-37: device node name blacklisted
Jun 30 09:08:41 | dm-38: device node name blacklisted
Jun 30 09:08:41 | dm-39: device node name blacklisted
Jun 30 09:08:41 | dm-4: device node name blacklisted
Jun 30 09:08:41 | dm-40: device node name blacklisted
Jun 30 09:08:41 | dm-41: device node name blacklisted
Jun 30 09:08:41 | dm-42: device node name blacklisted
Jun 30 09:08:41 | dm-43: device node name blacklisted
Jun 30 09:08:41 | dm-44: device node name blacklisted
Jun 30 09:08:41 | dm-45: device node name blacklisted
Jun 30 09:08:41 | dm-46: device node name blacklisted
Jun 30 09:08:41 | dm-47: device node name blacklisted
Jun 30 09:08:41 | dm-48: device node name blacklisted
Jun 30 09:08:41 | dm-49: device node name blacklisted
Jun 30 09:08:41 | dm-5: device node name blacklisted
Jun 30 09:08:41 | dm-50: device node name blacklisted
Jun 30 09:08:41 | dm-51: device node name blacklisted
Jun 30 09:08:41 | dm-52: device node name blacklisted
Jun 30 09:08:41 | dm-53: device node name blacklisted
Jun 30 09:08:41 | dm-54: device node name blacklisted
Jun 30 09:08:41 | dm-55: device node name blacklisted
Jun 30 09:08:41 | dm-56: device node name blacklisted
Jun 30 09:08:41 | dm-57: device node name blacklisted
Jun 30 09:08:41 | dm-58: device node name blacklisted
Jun 30 09:08:41 | dm-8: device node name blacklisted
Jun 30 09:08:41 | dm-9: device node name blacklisted
===== paths list =====
uuid                              hcil     dev dev_t pri dm_st chk_st vend/pro
36589cfc0000003058d269875bf10299b 10:0:0:0 sdf 8:80  50  undef undef  TrueNAS,
36589cfc00000061b98ff986eb4c5d026 7:0:0:0  sdc 8:32  50  undef undef  TrueNAS,
36589cfc000000c942157c6355c3e1c7e 8:0:0:0  sdd 8:48  50  undef undef  TrueNAS,
36589cfc000000564f17ba1e2c35fde22 9:0:0:0  sde 8:64  50  undef undef  TrueNAS,
Jun 30 09:08:41 | libdevmapper version 1.02.175 (2021-01-08)
Jun 30 09:08:41 | DM multipath kernel driver v1.14.0
Jun 30 09:08:41 | sdd: size = 8589934593
Jun 30 09:08:41 | sdd: vendor = TrueNAS
Jun 30 09:08:41 | sdd: product = iSCSI Disk
Jun 30 09:08:41 | sdd: rev = 0123
Jun 30 09:08:41 | sdd: h:b:t:l = 8:0:0:0
Jun 30 09:08:41 | sdd: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
Jun 30 09:08:41 | sdd: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sdd: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sdd: serial = 246e96471b80000
Jun 30 09:08:41 | sdd: tur state = up
Jun 30 09:08:41 | sdc: size = 8589934593
Jun 30 09:08:41 | sdc: vendor = TrueNAS
Jun 30 09:08:41 | sdc: product = iSCSI Disk
Jun 30 09:08:41 | sdc: rev = 0123
Jun 30 09:08:41 | sdc: h:b:t:l = 7:0:0:0
Jun 30 09:08:41 | sdc: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
Jun 30 09:08:41 | sdc: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sdc: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sdc: serial = 246e96471b80001
Jun 30 09:08:41 | sdc: tur state = up
Jun 30 09:08:41 | sde: size = 6979321857
Jun 30 09:08:41 | sde: vendor = TrueNAS
Jun 30 09:08:41 | sde: product = iSCSI Disk
Jun 30 09:08:41 | sde: rev = 0123
Jun 30 09:08:41 | sde: h:b:t:l = 9:0:0:0
Jun 30 09:08:41 | sde: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb
Jun 30 09:08:41 | sde: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sde: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sde: serial = 246e96471b80002
Jun 30 09:08:41 | sde: tur state = up
Jun 30 09:08:41 | sdf: size = 8589934593
Jun 30 09:08:41 | sdf: vendor = TrueNAS
Jun 30 09:08:41 | sdf: product = iSCSI Disk
Jun 30 09:08:41 | sdf: rev = 0123
Jun 30 09:08:41 | sdf: h:b:t:l = 10:0:0:0
Jun 30 09:08:41 | sdf: tgt_node_name = iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb
Jun 30 09:08:41 | sdf: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:08:41 | sdf: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:08:41 | sdf: serial = ecf4bbd04714000
Jun 30 09:08:41 | sdf: tur state = up
Jun 30 09:08:41 | sdf: udev property ID_WWN whitelisted
Jun 30 09:08:41 | sdf: wwid 36589cfc0000003058d269875bf10299b whitelisted
Jun 30 09:08:41 | sdc: udev property ID_WWN whitelisted
Jun 30 09:08:41 | sdc: wwid 36589cfc00000061b98ff986eb4c5d026 whitelisted
Jun 30 09:08:41 | sdd: udev property ID_WWN whitelisted
Jun 30 09:08:41 | sdd: wwid 36589cfc000000c942157c6355c3e1c7e whitelisted
Jun 30 09:08:41 | sde: udev property ID_WWN whitelisted
Jun 30 09:08:41 | sde: wwid 36589cfc000000564f17ba1e2c35fde22 whitelisted
Jun 30 09:08:41 | unloading sysfs prioritizer
Jun 30 09:08:41 | unloading const prioritizer
Jun 30 09:08:41 | unloading tur checker
 
PVE NODE 4 (working)
(trimmed first part due to character limit)

code_language.shell:
pve-04:~# multipath -v3
===== paths list =====
uuid                              hcil     dev dev_t pri dm_st chk_st vend/pro
36589cfc000000564f17ba1e2c35fde22 10:0:0:0 sdf 8:80  50  undef undef  TrueNAS,
36589cfc00000061b98ff986eb4c5d026 11:0:0:0 sdg 8:96  50  undef undef  TrueNAS,
36589cfc00000061b98ff986eb4c5d026 12:0:0:0 sdh 8:112 50  undef undef  TrueNAS,
36589cfc0000003058d269875bf10299b 13:0:0:0 sdi 8:128 50  undef undef  TrueNAS,
36589cfc000000c942157c6355c3e1c7e 7:0:0:0  sdc 8:32  50  undef undef  TrueNAS,
36589cfc000000c942157c6355c3e1c7e 8:0:0:0  sdd 8:48  50  undef undef  TrueNAS,
36589cfc000000564f17ba1e2c35fde22 9:0:0:0  sde 8:64  50  undef undef  TrueNAS,
Jun 30 09:07:58 | libdevmapper version 1.02.175 (2021-01-08)
Jun 30 09:07:58 | DM multipath kernel driver v1.14.0
Jun 30 09:07:58 | sdd: size = 8589934593
Jun 30 09:07:58 | sdd: vendor = TrueNAS
Jun 30 09:07:58 | sdd: product = iSCSI Disk
Jun 30 09:07:58 | sdd: rev = 0123
Jun 30 09:07:58 | sdd: h:b:t:l = 8:0:0:0
Jun 30 09:07:58 | sdd: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
Jun 30 09:07:58 | sdd: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:07:58 | sdd: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:07:58 | sdd: serial = 246e96471b80000
Jun 30 09:07:58 | sdd: tur state = up
Jun 30 09:07:58 | sdc: size = 8589934593
Jun 30 09:07:58 | sdc: vendor = TrueNAS
Jun 30 09:07:58 | sdc: product = iSCSI Disk
Jun 30 09:07:58 | sdc: rev = 0123
Jun 30 09:07:58 | sdc: h:b:t:l = 7:0:0:0
Jun 30 09:07:58 | sdc: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
Jun 30 09:07:58 | sdc: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:07:58 | sdc: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:07:58 | sdc: serial = 246e96471b80000
Jun 30 09:07:58 | sdc: tur state = up
Jun 30 09:07:58 | sdg: size = 8589934593
Jun 30 09:07:58 | sdg: vendor = TrueNAS
Jun 30 09:07:58 | sdg: product = iSCSI Disk
Jun 30 09:07:58 | sdg: rev = 0123
Jun 30 09:07:58 | sdg: h:b:t:l = 11:0:0:0
Jun 30 09:07:58 | sdg: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
Jun 30 09:07:58 | sdg: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:07:58 | sdg: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:07:58 | sdg: serial = 246e96471b80001
Jun 30 09:07:58 | sdg: tur state = up
Jun 30 09:07:58 | sdh: size = 8589934593
Jun 30 09:07:58 | sdh: vendor = TrueNAS
Jun 30 09:07:58 | sdh: product = iSCSI Disk
Jun 30 09:07:58 | sdh: rev = 0123
Jun 30 09:07:58 | sdh: h:b:t:l = 12:0:0:0
Jun 30 09:07:58 | sdh: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
Jun 30 09:07:58 | sdh: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:07:58 | sdh: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:07:58 | sdh: serial = 246e96471b80001
Jun 30 09:07:58 | sdh: tur state = up
Jun 30 09:07:58 | sdf: size = 6979321857
Jun 30 09:07:58 | sdf: vendor = TrueNAS
Jun 30 09:07:58 | sdf: product = iSCSI Disk
Jun 30 09:07:58 | sdf: rev = 0123
Jun 30 09:07:58 | sdf: h:b:t:l = 10:0:0:0
Jun 30 09:07:58 | sdf: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb
Jun 30 09:07:58 | sdf: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:07:58 | sdf: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:07:58 | sdf: serial = 246e96471b80002
Jun 30 09:07:58 | sdf: tur state = up
Jun 30 09:07:58 | sde: size = 6979321857
Jun 30 09:07:58 | sde: vendor = TrueNAS
Jun 30 09:07:58 | sde: product = iSCSI Disk
Jun 30 09:07:58 | sde: rev = 0123
Jun 30 09:07:58 | sde: h:b:t:l = 9:0:0:0
Jun 30 09:07:58 | sde: tgt_node_name = iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb
Jun 30 09:07:58 | sde: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:07:58 | sde: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:07:58 | sde: serial = 246e96471b80002
Jun 30 09:07:58 | sde: tur state = up
Jun 30 09:07:58 | sdi: size = 8589934593
Jun 30 09:07:58 | sdi: vendor = TrueNAS
Jun 30 09:07:58 | sdi: product = iSCSI Disk
Jun 30 09:07:58 | sdi: rev = 0123
Jun 30 09:07:58 | sdi: h:b:t:l = 13:0:0:0
Jun 30 09:07:58 | sdi: tgt_node_name = iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb
Jun 30 09:07:58 | sdi: 0 cyl, 64 heads, 32 sectors/track, start at 0
Jun 30 09:07:58 | sdi: vpd_vendor_id = 0 "undef" (setting: multipath internal)
Jun 30 09:07:58 | sdi: serial = ecf4bbd04714000
Jun 30 09:07:58 | sdi: tur state = up
Jun 30 09:07:58 | sdf: udev property ID_WWN whitelisted
Jun 30 09:07:58 | sdf: wwid 36589cfc000000564f17ba1e2c35fde22 whitelisted
Jun 30 09:07:58 | sdg: udev property ID_WWN whitelisted
Jun 30 09:07:58 | sdg: wwid 36589cfc00000061b98ff986eb4c5d026 whitelisted
Jun 30 09:07:58 | sdh: udev property ID_WWN whitelisted
Jun 30 09:07:58 | sdh: wwid 36589cfc00000061b98ff986eb4c5d026 whitelisted
Jun 30 09:07:58 | sdi: udev property ID_WWN whitelisted
Jun 30 09:07:58 | sdi: wwid 36589cfc0000003058d269875bf10299b whitelisted
Jun 30 09:07:58 | sdc: udev property ID_WWN whitelisted
Jun 30 09:07:58 | sdc: wwid 36589cfc000000c942157c6355c3e1c7e whitelisted
Jun 30 09:07:58 | sdd: udev property ID_WWN whitelisted
Jun 30 09:07:58 | sdd: wwid 36589cfc000000c942157c6355c3e1c7e whitelisted
Jun 30 09:07:58 | sde: udev property ID_WWN whitelisted
Jun 30 09:07:58 | sde: wwid 36589cfc000000564f17ba1e2c35fde22 whitelisted
Jun 30 09:07:58 | unloading sysfs prioritizer
Jun 30 09:07:58 | unloading const prioritizer
Jun 30 09:07:58 | unloading tur checker
 
What's the output of "lsscsi" from the "good" and "bad" nodes?


Code:
pve-01:~# lsscsi
[0:0:0:0]    disk    ATA      INTEL SSDSC2BB12 0370  /dev/sda
[0:0:1:0]    disk    ATA      INTEL SSDSC2BB12 0370  /dev/sdb
[5:0:0:0]    cd/dvd  TSSTcorp DVD-ROM SN-108DN D150  /dev/sr0
[7:0:0:0]    disk    TrueNAS  iSCSI Disk       0123  /dev/sdc
[8:0:0:0]    disk    TrueNAS  iSCSI Disk       0123  /dev/sdd
[9:0:0:0]    disk    TrueNAS  iSCSI Disk       0123  /dev/sde
[10:0:0:0]   disk    TrueNAS  iSCSI Disk       0123  /dev/sdf

Code:
pve-04:~# lsscsi
[0:0:4:0]    disk    ATA      CT240BX500SSD1   052   /dev/sda
[0:0:5:0]    disk    ATA      CT240BX500SSD1   052   /dev/sdb
[5:0:0:0]    cd/dvd  TSSTcorp DVD-ROM SN-108DN D150  /dev/sr0
[7:0:0:0]    disk    TrueNAS  iSCSI Disk       0123  /dev/sdc
[8:0:0:0]    disk    TrueNAS  iSCSI Disk       0123  /dev/sdd
[9:0:0:0]    disk    TrueNAS  iSCSI Disk       0123  /dev/sde
[10:0:0:0]   disk    TrueNAS  iSCSI Disk       0123  /dev/sdf
[11:0:0:0]   disk    TrueNAS  iSCSI Disk       0123  /dev/sdg
[12:0:0:0]   disk    TrueNAS  iSCSI Disk       0123  /dev/sdh
[13:0:0:0]   disk    TrueNAS  iSCSI Disk       0123  /dev/sdi
 
Are you logged in to each portal? Have you tried rescanning?

Code:
iscsiadm -m session --rescan
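
If the rescan turns up nothing new, dumping full session detail shows which LUNs each session actually sees (-P 3 prints per-session device detail):
Code:
iscsiadm -m session -P 3
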
There doesn't appear to be any change:

code_language.shell:
pve-01:~# iscsiadm -m session --rescan
Rescanning session [sid: 1, target: iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2, portal: 10.202.202.5,3260]
Rescanning session [sid: 2, target: iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1, portal: 10.202.202.5,3260]
Rescanning session [sid: 3, target: iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb, portal: 10.202.202.5,3260]
Rescanning session [sid: 4, target: iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb, portal: 10.202.202.6,3260]
root@c4-pve-01:~# multipath -ll
mpatha (36589cfc000000c942157c6355c3e1c7e) dm-22 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 8:0:0:0  sdd 8:48 active ready running
mpathb (36589cfc00000061b98ff986eb4c5d026) dm-0 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 7:0:0:0  sdc 8:32 active ready running
mpathc (36589cfc000000564f17ba1e2c35fde22) dm-24 TrueNAS,iSCSI Disk
size=3.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 9:0:0:0  sde 8:64 active ready running
mpathd (36589cfc0000003058d269875bf10299b) dm-23 TrueNAS,iSCSI Disk
size=4.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  `- 10:0:0:0 sdf 8:80 active ready running
 
multipath can only do its job if the raw devices are presented to the kernel. The "lsscsi" output clearly shows that the "bad" node only has single-path connectivity to the storage, and this is a view from before multipath gets involved.

Compare the output of "iscsiadm -m node" and "iscsiadm -m session" between the nodes.
Start by looking at your storage side for any changes in zoning, and make sure your network paths are functioning.
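
A quick way to see the difference side by side (a sketch; hostnames from this thread, assuming root SSH between the nodes):
Code:
diff <(ssh pve-01 'iscsiadm -m session' | sort) <(ssh pve-04 'iscsiadm -m session' | sort)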


How are you connecting iSCSI to PVE? Are you using the PVE storage configuration or a manual setup?


 
For the storage, I added the iSCSI target at the datacenter level and created an LVM on top of the disk, then added that via the Proxmox GUI as well. No manual setup. I believe you actually helped me with the initial setup in https://forum.proxmox.com/threads/mpio-with-proxmox-iscsi-and-truenas.123832
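
For reference, the resulting PVE-managed entries in /etc/pve/storage.cfg look roughly like this (a sketch with hypothetical storage IDs and volume group names; the exact base volume name will differ per setup):
Code:
iscsi: truenas-600gb-1
        portal 10.201.201.5
        target iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
        content none

lvm: lvm-600gb-1
        vgname vg-600gb-1
        base truenas-600gb-1:0.0.0.scsi-36589cfc000000c942157c6355c3e1c7e
        content images
        shared 1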

I can ping the storage on both network paths, 10.201.201.x and 10.202.202.x. Comparing the "iscsiadm -m node" output, it actually looks the same on both nodes:

code_language.shell:
pve-01:~# iscsiadm -m node
10.201.201.6:3260,1 iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb
10.202.202.6:3260,1 iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb
10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb
10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb

code_language.shell:
pve-04:~# iscsiadm -m node
10.201.201.6:3260,1 iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb
10.202.202.6:3260,1 iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb
10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1
10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2
10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb
10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb

The sessions show a different story:
code_language.shell:
pve-01:~# iscsiadm -m session
tcp: [1] 10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2 (non-flash)
tcp: [2] 10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1 (non-flash)
tcp: [3] 10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb (non-flash)
tcp: [4] 10.202.202.6:3260,1 iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb (non-flash)

pve-04:~# iscsiadm -m session
tcp: [1] 10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1 (non-flash)
tcp: [2] 10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1 (non-flash)
tcp: [3] 10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb (non-flash)
tcp: [4] 10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb (non-flash)
tcp: [5] 10.202.202.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2 (non-flash)
tcp: [6] 10.201.201.5:3260,1 iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2 (non-flash)
tcp: [7] 10.201.201.6:3260,1 iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb (non-flash)
 
I would examine "journalctl" for any iSCSI-related messages on the "bad" node, to see if there is any indication of why the session was logged out or failed.
You can try to fix it by manually logging the session in, i.e.:
iscsiadm -m node -T $mytarget -p $myportal --login
Or rebooting the host.
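
For example, filled in with one of the target/portal pairs from this thread:
Code:
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1 -p 10.201.201.5:3260 --login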


 
Unfortunately, rebooting doesn't resolve it; that was one of the first things I attempted. I'll try to look at the logs, but I'm not completely sure what to look out for.
 
"journalctl -b0 -u iscsid" is a good start

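If the iscsid unit log is quiet, the kernel also logs iSCSI connection errors, so a broader sweep can help (a sketch):
Code:
journalctl -b0 -u iscsid
journalctl -b0 -k | grep -i iscsi
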


It seems to have an issue connecting to the TrueNAS box (which is on 10.201.201.x and 10.202.202.x). The 10.201.201.x path fails on this node (it's fine on two others... very odd):

code_language.shell:
Jun 29 07:20:31 c4-pve-01 iscsid[2639]: Connection1:0 to [target: iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-2, portal: 10.202.202.5,3260] through [iface: default] is operational now
Jun 29 07:20:31 c4-pve-01 iscsid[2639]: Connection2:0 to [target: iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1, portal: 10.202.202.5,3260] through [iface: default] is operational now
Jun 29 07:20:31 c4-pve-01 iscsid[2639]: Connection3:0 to [target: iqn.2005-10.org.freenas.ctl:ssd-z2-x4-2tb, portal: 10.202.202.5,3260] through [iface: default] is operational now
Jun 29 07:20:31 c4-pve-01 iscsid[2639]: Connection4:0 to [target: iqn.2005-10.org.freenas.ctl:flash-z1-x4-1500gb, portal: 10.202.202.6,3260] through [iface: default] is operational now
Jun 29 07:20:34 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)
Jun 29 07:20:34 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)
Jun 29 07:20:34 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)
Jun 29 07:20:35 c4-pve-01 iscsid[2639]: connect to 10.201.201.6:3260 failed (No route to host)
Jun 29 07:20:43 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)
Jun 29 07:20:43 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)
Jun 29 07:20:43 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)
Jun 29 07:20:44 c4-pve-01 iscsid[2639]: connect to 10.201.201.6:3260 failed (No route to host)
Jun 29 07:20:50 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)
Jun 29 07:20:50 c4-pve-01 iscsid[2639]: connect to 10.201.201.5:3260 failed (No route to host)

But I can ping and telnet to it just fine.
code_language.shell:
root@c4-pve-01:~# ping 10.201.201.5
PING 10.201.201.5 (10.201.201.5) 56(84) bytes of data.
64 bytes from 10.201.201.5: icmp_seq=1 ttl=64 time=0.192 ms
64 bytes from 10.201.201.5: icmp_seq=2 ttl=64 time=0.225 ms
64 bytes from 10.201.201.5: icmp_seq=3 ttl=64 time=0.111 ms
64 bytes from 10.201.201.5: icmp_seq=4 ttl=64 time=0.237 ms
^C
--- 10.201.201.5 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3058ms
rtt min/avg/max/mdev = 0.111/0.191/0.237/0.049 ms

root@c4-pve-01:~# telnet 10.201.201.5 3260
Trying 10.201.201.5...
Connected to 10.201.201.5.
Escape character is '^]'.
 
iscsiadm -m node -T $mytarget -p $myportal --login
I logged all the missing sessions back in on the nodes and everything appears normal again. Thank you! It looks like we might have a switch flaking out. I'm curious why the sessions don't get logged back in automatically when I restart the node, however.
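
For anyone hitting the same thing later: whether open-iscsi logs a record back in at boot is controlled by node.startup in each node record (a sketch using a target/portal from this thread; note that PVE-defined iSCSI storage normally triggers the login itself):
Code:
# show the current setting for one node record
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1 -p 10.201.201.5:3260 | grep node.startup
# make the record log in automatically at startup
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:ssd-z2-x10-600gb-1 -p 10.201.201.5:3260 --op update -n node.startup -v automatic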
 
The error you're getting is a generic TCP/IP-layer error, as described here: https://bugzilla.redhat.com/show_bug.cgi?id=736957

Some of the retry behavior can be controlled in /etc/iscsi/iscsid.conf:
Code:
# To specify the number of times iscsid should retry a login
# if the login attempt fails due to the node.conn[0].timeo.login_timeout
# expiring modify the following line. Note that if the login fails
# quickly (before node.conn[0].timeo.login_timeout fires) because the network
# layer or the target returns an error, iscsid may retry the login more than
# node.session.initial_login_retry_max times.
#
# This retry count along with node.conn[0].timeo.login_timeout
# determines the maximum amount of time iscsid will try to
# establish the initial login. node.session.initial_login_retry_max is
# multiplied by the node.conn[0].timeo.login_timeout to determine the
# maximum amount.
#
# The default node.session.initial_login_retry_max is 8 and
# node.conn[0].timeo.login_timeout is 15 so we have:
#
# node.conn[0].timeo.login_timeout * node.session.initial_login_retry_max =
#                                                               120 seconds
#
# Valid values are any integer value. This only
# affects the initial login. Setting it to a high value can slow
# down the iscsi service startup. Setting it to a low value can
# cause a session to not get logged into, if there are disruptions
# during startup or if the network is not ready at that time.
node.session.initial_login_retry_max = 8
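
So with the defaults, iscsid gives up on the initial login after roughly two minutes. If a flaky switch can take longer than that to recover after a reboot, raising the retry count stretches that window, e.g. (keeping the default 15 s login timeout):
Code:
# 16 retries * 15 s node.conn[0].timeo.login_timeout = up to ~240 s of login attempts
node.session.initial_login_retry_max = 16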


 
