Multipath iSCSI problems with 8.1

mweigelt

Renowned Member
Jul 26, 2014
Hi,

8.1 shipped an iSCSI "improvement" that tries to log in to all portals delivered by SendTargets.

Problem: Some specific iSCSI servers (e.g. Open-E) send you all locally configured IP addresses, even the ones that are not in use, and there is no way to change this behavior. So you get a big list of targets, but you cannot use all of them. The regular Linux iSCSI stack tries to connect to every advertised target and simply ignores the paths that fail; only successfully established paths are used for multipath and redundancy.

With the new "PVE stack over Linux stack" approach, PVE tries to connect to the SendTargets portals again and again. Because some paths are not reachable (e.g. the addresses of local interfaces in an HA cluster), those connections fail in a loop and the PVE node does not start correctly.
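
You can see the oversized list yourself with a plain discovery against one reachable portal (the address here is just an example taken from my logs below, adjust for your setup):
Code:
# SendTargets discovery: the array answers with every locally configured
# address, including the ones the initiator can never reach
iscsiadm --mode discovery --type sendtargets --portal 10.20.4.101:3260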

The syslog is flooded with:
2023-12-11T13:31:55.678771+01:00 host-003 pvestatd[1999]: command '/usr/bin/iscsiadm --mode node --targetname iqn.2019-03:stor1.vg00 --login' failed: exit code 15
2023-12-11T13:31:55.832249+01:00 host-003 kernel: [ 550.553383] scsi host15: iSCSI Initiator over TCP/IP
2023-12-11T13:31:55.838261+01:00 host-003 kernel: [ 550.560050] connection19:0: detected conn error (1020)
2023-12-11T13:31:55.912233+01:00 host-003 kernel: [ 550.632201] scsi host15: iSCSI Initiator over TCP/IP
2023-12-11T13:31:55.916231+01:00 host-003 kernel: [ 550.636848] connection20:0: detected conn error (1020)
2023-12-11T13:31:55.920232+01:00 host-003 kernel: [ 550.639074] scsi host16: iSCSI Initiator over TCP/IP
2023-12-11T13:31:55.921482+01:00 host-003 kernel: [ 550.643286] connection21:0: detected conn error (1020)
2023-12-11T13:31:56.049591+01:00 host-003 iscsid: Connection-1:0 to [target: iqn.2019-03:stor1.vg00, portal: 172.20.235.2,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049629+01:00 host-003 iscsid: Connection-1:0 to [target: iqn.2019-03:stor1.vg00, portal: 172.20.232.2,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049647+01:00 host-003 iscsid: Connection-1:0 to [target: iqn.2019-03:stor1.vg00, portal: 172.20.233.2,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049664+01:00 host-003 iscsid: Connection-1:0 to [target: iqn.2019-03:stor1.vg00, portal: 172.20.237.1,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049679+01:00 host-003 iscsid: Connection-1:0 to [target: iqn.2019-03:stor1.vg00, portal: 192.168.225.2,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049697+01:00 host-003 iscsid: connection19:0 login rejected: initiator error - target not found (02/03)
2023-12-11T13:31:56.049715+01:00 host-003 iscsid: Connection19:0 to [target: iqn.2023-07:stor1.vg02, portal: 10.20.4.102,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049735+01:00 host-003 iscsid: connection20:0 login rejected: initiator error - target not found (02/03)
2023-12-11T13:31:56.049752+01:00 host-003 iscsid: Connection20:0 to [target: iqn.2023-07:stor1.vg02, portal: 10.20.4.101,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049768+01:00 host-003 iscsid: connection21:0 login rejected: initiator error - target not found (02/03)
2023-12-11T13:31:56.049785+01:00 host-003 iscsid: Connection21:0 to [target: iqn.2023-07:stor1.vg02, portal: 10.20.2.101,3260] through [iface: default] is shutdown.
2023-12-11T13:31:56.049801+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
2023-12-11T13:32:02.050026+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
2023-12-11T13:32:05.050199+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
2023-12-11T13:32:08.050441+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
2023-12-11T13:32:11.050658+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
2023-12-11T13:32:14.050822+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
2023-12-11T13:32:17.051057+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
2023-12-11T13:32:20.051213+01:00 host-003 iscsid: connect to 192.168.225.2:3260 failed (No route to host)
 
Furthermore: the iSCSI multipath and volumes are up as they should be.
But e.g. "pvesm status" does not work, with the same errors:
iscsiadm: No portals found
iscsiadm: No portals found
iscsiadm: No portals found
iscsiadm: default: 1 session requested, but 1 already present.
iscsiadm: could not read session targetname: 5
iscsiadm: could not find session info for session40
iscsiadm: could not read session targetname: 5
iscsiadm: could not find session info for session40
iscsiadm: default: 1 session requested, but 1 already present.
iscsiadm: Could not login to [iface: default, target: iqn.2023-07:stor1.vg02, portal: 10.20.4.102,3260].
iscsiadm: initiator reported error (19 - encountered non-retryable iSCSI login failure)
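For comparison, the sessions and multipath maps themselves can be checked directly with the standard tools, and they look fine here:
Code:
# list the iSCSI sessions the kernel initiator actually has open
iscsiadm --mode session
# show the multipath topology built on top of them
multipath -ll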
 
Problem: Some specific iSCSI servers (e.g. Open-E) send you all locally configured IP addresses, even the ones that are not in use, and there is no way to change this behavior.
This seems to be the main problem, in my opinion. If the storage array provides a set of IPs in the "discover target" response which are known to be unusable under any circumstances, I'd think it's the array's responsibility?
When you said "local IP", I thought maybe 127.x, but then you showed a list of all private network ranges. If the vendor is unwilling to adjust their handling, then one option for you is to clone the plugin responsible and implement your own custom filtering.
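
As a rough, untested sketch (assuming custom storage plugins still live under /usr/share/perl5/PVE/Storage/Custom/), you could start from a copy of the stock iSCSI plugin:
Code:
# untested sketch: clone the stock plugin as a base for custom portal filtering
mkdir -p /usr/share/perl5/PVE/Storage/Custom
cp /usr/share/perl5/PVE/Storage/ISCSIPlugin.pm \
   /usr/share/perl5/PVE/Storage/Custom/FilteredISCSIPlugin.pm
# then rename the package in the copy, register it as its own storage type,
# and drop unwanted portals from the discovery result before any login attempt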

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I thought maybe 127.x, but then you showed a list of all private network ranges. If the vendor is unwilling to adjust their handling, then one option for you is to clone the plugin responsible and implement your own custom filtering.
I don't think this is the case. Best practice for iSCSI generally suggests a different subnet per port (or port pair in a dual-controller setup), the idea being that the host should have an address on each VLAN and the controller will announce all available paths. @mweigelt please post your /etc/network/interfaces and every IP for every port on your storage.
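
For illustration, that kind of layout looks roughly like this (hypothetical interface names and host addresses, subnets borrowed from your logs):
Code:
# hypothetical /etc/network/interfaces fragment: one subnet per iSCSI port
auto ens1
iface ens1 inet static
    address 10.20.2.10/24

auto ens2
iface ens2 inet static
    address 10.20.4.10/24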
 
I don't think this is the case. Best practice for iSCSI generally suggests a different subnet per port (or port pair in a dual-controller setup), the idea being that the host should have an address on each VLAN and the controller will announce all available paths
Yes, we are on the same page. As long as the client has a mirror config, it can easily decide on traffic flow.
However, if, as the OP said, the server advertises 10.10.10.40 in the iSCSI target and this IP is only used for inter-cluster communication, perhaps on a private VLAN or even direct-connected, then the client has no way to reach it. IMHO, that IP should not be in the iSCSI target in the first place.
As is, PVE will try to connect to it and status-check it every minute or so.
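
You can watch that retry loop yourself (assuming the default systemd units):
Code:
# follow pvestatd to see the periodic login attempts against the dead portal
journalctl -u pvestatd -f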


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I have two subnets/physical interfaces for iSCSI communication. 192.168.255.0/24 is the internal HA/DRBD network for the redundant storage pair, so it's not reachable from the Proxmox hosts.

Maybe it's the iscsid behavior on the storage, which binds to all available IP addresses (including management, non-virtual and DRBD interfaces) and advertises them as targets. Until now I excluded all unwanted paths/addresses with an iSCSI ACL, so the storage denies the unwanted paths and the initiator stops using them after the first connection fails. That still works. The problem is the one path where no TCP connection can be established at all, so there is never a deny from the iSCSI stack on the storage.
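
A possible stopgap would be to delete the discovered node record for the unreachable portal by hand, although the next discovery will recreate it (target and portal taken from my logs above):
Code:
# remove the stale node record for the portal that can never be reached
iscsiadm --mode node --targetname iqn.2019-03:stor1.vg00 \
         --portal 192.168.225.2:3260 --op delete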

As I said, until now there was no problem. No log entries. Working multipath. No abnormalities. Correct behavior via the wanted paths.
 
As I said, until now there was no problem. No log entries. Working multipath. No abnormalities. Correct behavior via the wanted paths.
Yes, the behavior has changed and exposed a deficiency in your storage product. You have a somewhat valid claim that the change broke a long-standing behavior. However, that claim should be filed at https://bugzilla.proxmox.com/.

On the other hand, a quick read of the RFC (https://datatracker.ietf.org/doc/html/rfc3720) turns up:
Code:
A system that contains targets MUST support discovery sessions on
   each of its iSCSI IP address-port pairs, and MUST support the
   SendTargets command on the discovery session.  In a discovery
   session, a target MUST return all path information (target name and
   IP address-port pairs and portal group tags) for the targets on the
   target network entity which the requesting initiator is authorized to
   access.
Note, I did not do an extensive (re)-read of the RFC, just a quick glance.

Bring it up to the PVE devs to make a call.
Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
So if a path to one of the targets is down when the node boots up, the node will never come up? I guess that's not the right idea.
 
To be clear: the Linux iSCSI initiator causes no problems and creates correct multipath sessions. It's the Proxmox layer around it that keeps restarting the logins all the time when a target is not reachable. THIS should not be the behavior when a single path fails...
 
To be clear: the Linux iSCSI initiator causes no problems and creates correct multipath sessions. It's the Proxmox layer around it that keeps restarting the logins all the time when a target is not reachable. THIS should not be the behavior when a single path fails...
I wouldn't say I disagree with you. However, if you had two IPs reported at initial setup and only one was available, I can see where you might expect PVE to automatically establish a connection to the second one when it becomes available.
As with anything, there are always edge and special cases that need to be accommodated. I am not sure that working around incorrectly reported information from one storage vendor is such a case, but perhaps it is.
Likely the easiest way to deal with it is to provide a fallback to the prior behavior. As I said, make a detailed report in Bugzilla and the PVE devs can then properly track this.

good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I agree with you that PVE should try to reconnect to the second one, but it should not flap the node between online and offline the whole time.
 
