iSCSI multipath does not switch

sergei56

New Member
Jul 13, 2021
Hello.
I have a proxmox cluster and ME4024 storage.
I configured iSCSI multipath as described in the documentation.
As a test, I unplugged the cable from the storage port whose address is configured in Proxmox. The VM kept running until I shut it down myself.
The main question:
After pulling the cable, the iSCSI device and the LVM storage are shown with question marks.
I think that means iSCSI multipath is not working and does not switch to another path.
Is that how it should be or not?

root@proxn1cl1:~# multipath -ll
Jul 13 11:24:01 | /etc/multipath.conf line 25, invalid keyword: polling_interval
mpath0 (3600c0ff0005050974bfde26001000000) dm-5 DellEMC,ME4
size=16T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=23 status=active
|- 15:0:0:0 sdb 8:16 failed faulty running
|- 16:0:0:0 sdc 8:32 active ready running
|- 17:0:0:0 sdd 8:48 active ready running
`- 18:0:0:0 sde 8:64 active ready running




defaults {
polling_interval 2
path_selector "round-robin 0"
path_grouping_policy multibus
getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
rr_min_io 100
failback immediate
no_path_retry queue
}
blacklist {
wwid .*
}

blacklist_exceptions {
wwid 3600c0ff0005050974bfde26001000000

}

devices {
device {
vendor "DELL"
product "MD32xxi"
path_grouping_policy group_by_prio
prio rdac
polling_interval 5
path_checker rdac
path_selector "round-robin 0"
hardware_handler "1 rdac"
failback immediate
features "2 pg_init_retries 50"
no_path_retry 30
rr_min_io 100
}
}
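
A side note on the parse warning in the multipath -ll output above: in recent multipath-tools, polling_interval is only accepted in the defaults section, which is presumably what triggers the "invalid keyword" message, and the device stanza above matches DELL/MD32xxi rather than the DellEMC,ME4 string that multipath -ll reports, so it most likely never applies to this array. A hedged sketch of what an ME4-matching stanza could look like follows; the authoritative values must come from the Dell ME4 Series documentation.

devices {
    device {
        # Hedged sketch only: vendor/product taken from the "DellEMC,ME4"
        # string reported by multipath -ll; verify every value against the
        # Dell ME4 Series deployment guide before using it.
        vendor "DellEMC"
        product "ME4"
        path_grouping_policy group_by_prio
        prio alua
        hardware_handler "1 alua"
        path_checker tur
        path_selector "round-robin 0"
        failback immediate
        no_path_retry 30
        # polling_interval is intentionally omitted here: it belongs in the
        # defaults section only, which appears to be what the warning is about.
    }
}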
 

Attachments: Screenshot_34.png (7.5 KB)
Please provide the storage config (/etc/pve/storage.cfg), the output of iscsiadm -m node, the output of iscsiadm -m session and your network config (/etc/network/interfaces).

Also please provide the output of pveversion -v.


Please note that those SAN configs are only examples and you should get the right config from the SAN vendor.
 
/etc/pve/storage.cfg

dir: local
path /var/lib/vz
content vztmpl,iso,backup

lvmthin: local-lvm
thinpool data
vgname pve
content images,rootdir

iscsi: delliscsi
portal 192.168.30.10
target iqn.1988-11.com.dell:01.array.bc305bf0ffcf
content none

lvm: delllvm
vgname grdell
base delliscsi:0.0.0.dm-uuid-mpath-3600c0ff0005050974bfde26001000000
content images,rootdir
shared 1



root@proxn1cl1:~# iscsiadm -m node
192.168.30.10:3260,1 iqn.1988-11.com.dell:01.array.bc305bf0ffcf
192.168.30.30:3260,3 iqn.1988-11.com.dell:01.array.bc305bf0ffcf
192.168.30.40:3260,8 iqn.1988-11.com.dell:01.array.bc305bf0ffcf
192.168.30.10:3260,2 iqn.1988-11.com.dell:01.array.bc305bf0ffcf
192.168.30.20:3260,6 iqn.1988-11.com.dell:01.array.bc305bf0ffcf


root@proxn1cl1:~# iscsiadm -m session
tcp: [1] 192.168.30.10:3260,3 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [2] 192.168.30.30:3260,3 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [3] 192.168.30.40:3260,8 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [4] 192.168.30.20:3260,6 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)


/etc/network/interfaces

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet static
address 192.168.30.1/24

auto vmbr0
iface vmbr0 inet static
address 1.1.0.31/20
gateway 1.1.1.9
bridge-ports eno1
bridge-stp off
bridge-fd 0




root@proxn1cl1:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.4-4 (running version: 6.4-4/337d6701)
pve-kernel-5.4: 6.4-1
pve-kernel-helper: 6.4-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.4-1
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-2
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-1
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.5-3
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-1
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1
 
Am I seeing this right? You have 4 paths, all connected via the same interface and in the same subnet?
Then it's to be expected that the PVE host doesn't see the storage as online once you pull the cable.
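
For reference, a hedged sketch of what two dedicated storage NICs in separate subnets could look like in /etc/network/interfaces. The interface name eno3 and the 192.168.31.0/24 subnet are hypothetical and only illustrate the idea of giving multipath a second, independent host-side path; it also assumes the ME4 host ports would be split across the two subnets accordingly.

auto eno2
iface eno2 inet static
    address 192.168.30.1/24
    # existing storage path

auto eno3
iface eno3 inet static
    address 192.168.31.1/24
    # hypothetical second storage path on a separate NIC and subnet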
 
The scheme is as follows.
The storage has 4 interfaces and, accordingly, 4 addresses.
They are connected to a switch.
192.168.30.10
192.168.30.20
192.168.30.30
192.168.30.40
Each node is connected to this switch with its own interface.
The storage is added via the Proxmox web GUI, using the 192.168.30.10 portal.

When I pull the cable out of the 192.168.30.10 interface, the device and the disk show as disconnected.
But we have iSCSI multipath.
Shouldn't Proxmox reconnect via a different path?
 
Ah, my mistake.
As long as there's a session available for that target, it should be shown as `online`.

What's the output of iscsiadm -m session when you disconnect the 192.168.30.10 link?
 
Before disconnecting the cable and after disconnecting it, nothing changes.

root@proxn1cl1:~# iscsiadm -m session
tcp: [1] 192.168.30.10:3260,1 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [2] 192.168.30.30:3260,3 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [3] 192.168.30.40:3260,8 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [4] 192.168.30.20:3260,6 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
root@proxn1cl1:~# ping 192.168.30.10
PING 192.168.30.10 (192.168.30.10) 56(84) bytes of data.
From 192.168.30.1 icmp_seq=1 Destination Host Unreachable
From 192.168.30.1 icmp_seq=2 Destination Host Unreachable
From 192.168.30.1 icmp_seq=3 Destination Host Unreachable
From 192.168.30.1 icmp_seq=4 Destination Host Unreachable
From 192.168.30.1 icmp_seq=5 Destination Host Unreachable
From 192.168.30.1 icmp_seq=6 Destination Host Unreachable
From 192.168.30.1 icmp_seq=7 Destination Host Unreachable
From 192.168.30.1 icmp_seq=8 Destination Host Unreachable
From 192.168.30.1 icmp_seq=9 Destination Host Unreachable
From 192.168.30.1 icmp_seq=10 Destination Host Unreachable
From 192.168.30.1 icmp_seq=11 Destination Host Unreachable
From 192.168.30.1 icmp_seq=12 Destination Host Unreachable
From 192.168.30.1 icmp_seq=13 Destination Host Unreachable
From 192.168.30.1 icmp_seq=14 Destination Host Unreachable
From 192.168.30.1 icmp_seq=15 Destination Host Unreachable
From 192.168.30.1 icmp_seq=16 Destination Host Unreachable
From 192.168.30.1 icmp_seq=17 Destination Host Unreachable
From 192.168.30.1 icmp_seq=18 Destination Host Unreachable
^C
--- 192.168.30.10 ping statistics ---
19 packets transmitted, 0 received, +18 errors, 100% packet loss, time 372ms
pipe 4
root@proxn1cl1:~# iscsiadm -m session
tcp: [1] 192.168.30.10:3260,1 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [2] 192.168.30.30:3260,3 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [3] 192.168.30.40:3260,8 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
tcp: [4] 192.168.30.20:3260,6 iqn.1988-11.com.dell:01.array.bc305bf0ffcf (non-flash)
 
When checking whether the storage is `online`, the configured portal is tested. That means once that link is down, the check fails.
We might be able to check something else once the first connection has been made, but that wouldn't help if the link the portal is on is unavailable at the time of the first connection.
 
You have to think about your setup as two independent parts:

1) OS + multipath-tools. This is completely independent of Proxmox. iSCSI should be configured directly in the OS using iscsiadm, and multipath should then be layered on top of it.
2) Proxmox, which is configured to use the *mpath* device with LVM on top of it.

Path failover happens at the multipath daemon/kernel layer. The easiest way to test it is to use "fio": start writing to the disk, then pull the cable and verify that fio keeps writing. Check the status of the paths with "multipath -ll".
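
A hedged example of such a test, assuming the multipath device is /dev/mapper/mpath0 as in the output above. A read-only run is shown so the LVM data on the device is not touched; for a write test, point --filename at a scratch logical volume or a test VM disk rather than the raw mapper device.

# Run read I/O against the multipath device, then pull a cable and watch
# "multipath -ll" in a second shell: I/O should continue on the remaining paths.
fio --name=mpath-failover-test \
    --filename=/dev/mapper/mpath0 \
    --rw=read --bs=1M --direct=1 \
    --ioengine=libaio --iodepth=16 \
    --time_based --runtime=120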

The Proxmox configuration only allows a single portal IP, which is used for a high-level health check ("does it respond?"). It plays no role in the actual path failover; that is handled at a lower level by multipath.
Some storage solutions do provide a "virtual" failover IP that keeps responding regardless of a path failure, but yours is not one of them. I don't think there is much value in configuring the iSCSI storage entry in Proxmox; the LVM storage is enough.
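
A hedged sketch of what that could look like in /etc/pve/storage.cfg, reusing the existing grdell volume group; treat it as an illustration rather than a drop-in replacement:

lvm: delllvm
    vgname grdell
    content images,rootdir
    shared 1

Without the delliscsi entry and its base line, the iSCSI sessions have to be brought up by the OS itself before the volume group becomes visible, for example by setting node.startup to automatic in the open-iscsi node records (iscsiadm -m node -o update -n node.startup -v automatic).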
 
