iSCSI Login negotiation failed

coder

Member
Feb 5, 2018
Hi,

We are using a ReadyNAS storage device from Netgear, which runs a Linux kernel and uses the LIO iSCSI target.

The Netgear storage device continuously logs the following messages at 10-second intervals:

Jan 28 19:43:42 Storage01 kernel: rx_data returned 0, expecting 48.
Jan 28 19:43:42 Storage01 kernel: iSCSI Login negotiation failed.
Jan 28 19:43:52 Storage01 kernel: rx_data returned 0, expecting 48.
Jan 28 19:43:52 Storage01 kernel: iSCSI Login negotiation failed.
Jan 28 19:44:02 Storage01 kernel: rx_data returned 0, expecting 48.
Jan 28 19:44:02 Storage01 kernel: iSCSI Login negotiation failed.
Jan 28 19:44:12 Storage01 kernel: rx_data returned 0, expecting 48.
Jan 28 19:44:12 Storage01 kernel: iSCSI Login negotiation failed.
Jan 28 19:44:22 Storage01 kernel: rx_data returned 0, expecting 48.
Jan 28 19:44:22 Storage01 kernel: iSCSI Login negotiation failed.

The iSCSI LUNs work and can be used from within the Proxmox 4.x environment. We have now migrated to Proxmox 5 and are seeing the following on our Proxmox systems:

[Sun Jan 28 19:44:21 2018] sd 5:0:0:0: alua: supports implicit and explicit TPGS
[Sun Jan 28 19:44:21 2018] sd 5:0:0:0: alua: device naa.60014057075af940f7e4d9e97f1a2569 port group 0 rel port 1
[Sun Jan 28 19:44:21 2018] sd 5:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
[Sun Jan 28 19:44:21 2018] sd 5:0:0:1: alua: supports implicit and explicit TPGS
[Sun Jan 28 19:44:21 2018] sd 5:0:0:1: alua: device naa.600140563b743d68c0d47288d772f587 port group 0 rel port 1
[Sun Jan 28 19:44:21 2018] sd 5:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[Sun Jan 28 19:44:21 2018] sd 6:0:0:0: alua: supports implicit and explicit TPGS
[Sun Jan 28 19:44:21 2018] sd 6:0:0:0: alua: device naa.60014057075af940f7e4d9e97f1a2569 port group 0 rel port 1
[Sun Jan 28 19:44:21 2018] sd 6:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
[Sun Jan 28 19:44:21 2018] sd 6:0:0:1: alua: supports implicit and explicit TPGS

The messages are logged constantly, like the ones on the storage device.
I tried to pull some iSCSI traces to narrow it down, but so far I have not been able to find the cause.

Are there any suggestions or ideas on this?

Thanks and Best Regards,
Volker
 

coder

Member
Feb 5, 2018
Hi,

Replying to my own post...

While debugging, I found that this is triggered by pvestatd doing its automated health checks and SCSI scans. In particular, the "iSCSI Login negotiation failed" messages are the result of a simple connect/disconnect to the portal port of the storage array.

I was able to track it down to the following function, which is part of the ISCSIPlugin module used by pvestatd:

Code:
sub iscsi_test_portal {
    my ($portal) = @_;

    my ($server, $port) = PVE::Tools::parse_host_and_port($portal);
    return 0 if !$server;
    return PVE::Network::tcp_ping($server, $port || 3260, 2);
}
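For illustration, the probe can be reproduced by hand. The following is a sketch of what PVE::Network::tcp_ping effectively does (the real check is Perl inside pvestatd; this shell version uses bash's /dev/tcp pseudo-device): open a bare TCP connection to the portal port and close it again without ever sending the 48-byte iSCSI Login PDU. A LIO-based target logs "rx_data returned 0, expecting 48" / "iSCSI Login negotiation failed" for exactly this connect-and-hang-up pattern.

```shell
# Hand-rolled stand-in for PVE::Network::tcp_ping (sketch, not the real code).
tcp_ping() {
    local host=$1
    local port=${2:-3260}
    # bash's /dev/tcp opens the connection; when the bash -c child exits,
    # the socket is closed with no data ever having been sent.
    if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "portal $host:$port reachable"
    else
        echo "portal $host:$port down"
    fi
}

# Example (192.0.2.10 is a placeholder -- substitute your storage IP):
# tcp_ping 192.0.2.10 3260
tcp_ping 127.0.0.1 1   # port 1 is normally closed, so this reports "down"
```

Running this against a LIO portal should produce exactly one pair of "rx_data returned 0" / "iSCSI Login negotiation failed" lines in the target's kernel log per invocation.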

So even though the messages are annoying and fill the logs, they can be ignored.

When working with LVM on iSCSI, one way to suppress this behavior is to deactivate the iSCSI storage target in Proxmox and configure open-iscsi for automatic login and target discovery. In that scenario, an LVM storage can be configured directly in Proxmox without the iSCSI target as a backend. However, I haven't fully tested this yet.
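For illustration, the resulting storage definition might look like the following /etc/pve/storage.cfg fragment (a sketch: `san-lvm` and `vg_san` are made-up names, and the volume group is assumed to live on a LUN that open-iscsi logs into automatically at boot). The key point is that the lvm entry carries no `base` property pointing at a PVE iSCSI storage, so Proxmox has no iSCSI storage entry left to health-check:

```
lvm: san-lvm
        vgname vg_san
        content images
        shared 1
```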

Best Regards,
Volker
 

daemaz

New Member
Jan 24, 2017
Hi, have you done any more research on this? I'm seeing something similar. I'm not sure which was the deciding factor, but after upgrading Proxmox from 5.1-41 to 5.1-46 and also upgrading my SAN software (StarWind SAN) to the latest version, I started getting these errors. I was originally running an older StarWind version (v8.0.7354) and have now upgraded to v8.0.11818. I initially downgraded again because the new version was crashing the SAN service; it turned out that VAAI doesn't play well with iSCSI, and once I turned it off everything worked fine, apart from the log messages. On the Proxmox node:


Code:
Mar 04 18:56:16 Proxmox kernel: sd 11:0:0:0: alua: supports implicit TPGS
Mar 04 18:56:16 Proxmox kernel: sd 11:0:0:0: alua: device eui.dae4a70d707241f1 port group 1 rel port 1
Mar 04 18:56:16 Proxmox kernel: sd 11:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA
Mar 04 18:56:16 Proxmox kernel: sd 12:0:0:0: alua: supports implicit TPGS
Mar 04 18:56:16 Proxmox kernel: sd 12:0:0:0: alua: device eui.dae4a70d707241f1 port group 1 rel port 1
Mar 04 18:56:16 Proxmox kernel: sd 12:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA
Mar 04 18:56:36 Proxmox kernel: sd 9:0:0:0: alua: supports implicit TPGS
Mar 04 18:56:36 Proxmox kernel: sd 9:0:0:0: alua: device eui.0191d2c4c8226ff0 port group 1 rel port 1
Mar 04 18:56:36 Proxmox kernel: sd 9:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA
Mar 04 18:56:36 Proxmox kernel: sd 10:0:0:0: alua: supports implicit TPGS
Mar 04 18:56:36 Proxmox kernel: sd 10:0:0:0: alua: device eui.0191d2c4c8226ff0 port group 1 rel port 1
Mar 04 18:56:36 Proxmox kernel: sd 10:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA
Mar 04 18:56:46 Proxmox kernel: sd 7:0:0:0: alua: supports implicit TPGS
Mar 04 18:56:46 Proxmox kernel: sd 7:0:0:0: alua: device eui.518a6e00cbfcdcf2 port group 1 rel port 1
Mar 04 18:56:46 Proxmox kernel: sd 7:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA
Mar 04 18:56:46 Proxmox kernel: sd 8:0:0:0: alua: supports implicit TPGS
Mar 04 18:56:46 Proxmox kernel: sd 8:0:0:0: alua: device eui.518a6e00cbfcdcf2 port group 1 rel port 1
Mar 04 18:56:46 Proxmox kernel: sd 8:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA
Mar 04 18:57:05 Proxmox kernel: sd 7:0:0:0: alua: supports implicit TPGS
Mar 04 18:57:05 Proxmox kernel: sd 7:0:0:0: alua: device eui.518a6e00cbfcdcf2 port group 1 rel port 1
Mar 04 18:57:05 Proxmox kernel: sd 7:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA
Mar 04 18:57:05 Proxmox kernel: sd 8:0:0:0: alua: supports implicit TPGS
Mar 04 18:57:05 Proxmox kernel: sd 8:0:0:0: alua: device eui.518a6e00cbfcdcf2 port group 1 rel port 1
Mar 04 18:57:05 Proxmox kernel: sd 8:0:0:0: alua: port group 01 state A non-preferred supports TolUsNA

And on the SAN side (Starwind SAN software on 2012R2) a snippet of the event log:

Code:
3/4 19:00:00.310 6548 Srv: Accepted iSCSI connection from 172.16.1.2:34588 to 172.16.1.1:3260. (Id = 0x142b7)
3/4 19:00:00.310 6548 S[142b7]: Session (000000E00BF35200)
3/4 19:00:00.310 6548 C[142b7], FREE: Event - CONNECTED.
3/4 19:00:00.310 6548 C[142b7], XPT_UP: T3.
3/4 19:00:00.310 1850 C[142b7], XPT_UP: recvData returned 10058 (0x274a)!
3/4 19:00:00.310 1850 C[142b7], XPT_UP: *** 'recv' thread: recv failed 10058.
3/4 19:00:00.326 6548 Srv: Accepted iSCSI connection from 172.16.0.2:40632 to 172.16.0.1:3260. (Id = 0x142b8)
3/4 19:00:00.326 6548 S[142b8]: Session (000000E00BF32F40)
3/4 19:00:00.326 6548 C[142b8], FREE: Event - CONNECTED.
3/4 19:00:00.326 6548 C[142b8], XPT_UP: T3.
3/4 19:00:00.326 3e88 C[142b8], XPT_UP: recvData returned 10058 (0x274a)!
3/4 19:00:00.326 3e88 C[142b8], XPT_UP: *** 'recv' thread: recv failed 10058.
3/4 19:00:00.342 6548 Srv: Accepted iSCSI connection from 172.16.0.2:40634 to 172.16.0.1:3260. (Id = 0x142b9)
3/4 19:00:00.342 6548 S[142b9]: Session (000000E00BF34C40)
3/4 19:00:00.342 6548 C[142b9], FREE: Event - CONNECTED.
3/4 19:00:00.342 6548 C[142b9], XPT_UP: T3.
3/4 19:00:00.342 1bb4 C[142b9], XPT_UP: recvData returned 10058 (0x274a)!
3/4 19:00:00.342 1bb4 C[142b9], XPT_UP: *** 'recv' thread: recv failed 10058.
3/4 19:00:00.357 6548 Srv: Accepted iSCSI connection from 172.16.1.2:34594 to 172.16.1.1:3260. (Id = 0x142ba)
3/4 19:00:00.357 6548 S[142ba]: Session (000000E00BF34680)
3/4 19:00:00.357 6548 C[142ba], FREE: Event - CONNECTED.
3/4 19:00:00.357 6548 C[142ba], XPT_UP: T3.
3/4 19:00:00.357 41c8 C[142ba], XPT_UP: recvData returned 10058 (0x274a)!
3/4 19:00:00.357 41c8 C[142ba], XPT_UP: *** 'recv' thread: recv failed 10058.
3/4 19:00:00.357 6548 Srv: Accepted iSCSI connection from 172.16.1.2:34596 to 172.16.1.1:3260. (Id = 0x142bb)

Other than that, I don't seem to have any issues. Based on a cursory glance at the StarWind forums, the "recv failed 10058" messages seem to be informational rather than errors. Also, my multipath runs over two 1 Gbit/s links and shows as working properly:

Code:
root@Proxmox:~# multipath -ll
mpathc (2dae4a70d707241f1) dm-6 STARWIND,STARWIND
size=250G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 12:0:0:0 sdq 65:0  active ready running
  `- 11:0:0:0 sdp 8:240 active ready running
mpathb (2518a6e00cbfcdcf2) dm-5 STARWIND,STARWIND
size=250G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 8:0:0:0  sdl 8:176 active ready running
  `- 7:0:0:0  sdm 8:192 active ready running
mpatha (20191d2c4c8226ff0) dm-4 STARWIND,STARWIND
size=1.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
  |- 9:0:0:0  sdn 8:208 active ready running
  `- 10:0:0:0 sdo 8:224 active ready running

multipath.conf
Code:
defaults {
        polling_interval        4
        path_selector           "round-robin 0"
        path_grouping_policy    multibus
        uid_attribute           ID_SERIAL
        rr_min_io               100
        failback                immediate
        no_path_retry           queue
        user_friendly_names     yes
}
blacklist {
        wwid .*
}
blacklist_exceptions {
        wwid "2dae4a70d707241f1"
        wwid "2518a6e00cbfcdcf2"
        wwid "20191d2c4c8226ff0"
}

/etc/iscsi/nodes/examplenodeconfig
Code:
# BEGIN RECORD 2.0-873
node.name = iqn.xxxxxxxxxxxxxxxxxxxxxxxxxxx
node.tpgt = -1
node.startup = automatic
node.leading_login = No
iface.iscsi_ifacename = default
iface.transport_name = tcp
iface.vlan_id = 0
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
node.discovery_address = 172.16.0.1
node.discovery_port = 3260
node.discovery_type = send_targets
node.session.initial_cmdsn = 0
node.session.initial_login_retry_max = 8
node.session.xmit_thread_priority = -20
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.nr_sessions = 1
node.session.auth.authmethod = None
node.session.timeo.replacement_timeout = 15
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.err_timeo.host_reset_timeout = 60
node.session.iscsi.FastAbort = Yes
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.session.iscsi.MaxConnections = 1
node.session.iscsi.MaxOutstandingR2T = 1
node.session.iscsi.ERL = 0
node.conn[0].address = 172.16.0.1
node.conn[0].port = 3260
node.conn[0].startup = manual
node.conn[0].tcp.window_size = 524288
node.conn[0].tcp.type_of_service = 0
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.HeaderDigest = None
node.conn[0].iscsi.DataDigest = None
node.conn[0].iscsi.IFMarker = No
node.conn[0].iscsi.OFMarker = No
# END RECORD

pveversion -v
Code:
root@Proxmox:~# pveversion -v
proxmox-ve: 5.1-41 (running kernel: 4.13.13-6-pve)
pve-manager: 5.1-46 (running version: 5.1-46/ae8241d4)
pve-kernel-4.13.13-6-pve: 4.13.13-41
pve-kernel-4.13.13-4-pve: 4.13.13-35
pve-kernel-4.4.98-3-pve: 4.4.98-103
pve-kernel-4.4.40-1-pve: 4.4.40-82
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.16-1-pve: 4.4.16-64
corosync: 2.4.2-pve3
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-common-perl: 5.0-28
libpve-guest-common-perl: 2.0-14
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-17
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-2
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-11
pve-cluster: 5.0-20
pve-container: 2.0-19
pve-docs: 5.1-16
pve-firewall: 3.0-5
pve-firmware: 2.0-3
pve-ha-manager: 2.0-5
pve-i18n: 1.0-4
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.9.1-9
pve-xtermjs: 1.0-2
qemu-server: 5.0-22
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3

open-iscsi version
Code:
root@Proxmox:~# iscsiadm --version
iscsiadm version 2.0-874

multipath-tools version
Code:
root@Proxmox:~# apt search multipath-tools
Sorting... Done
Full Text Search... Done
kpartx/stable,now 0.6.4-5 amd64 [installed,automatic]
  create device mappings for partitions

multipath-tools/stable,now 0.6.4-5 amd64 [installed]
  maintain multipath block device access

multipath-tools-boot/stable 0.6.4-5 all
  Support booting from multipath devices

Like I said before, everything seems to be working normally except the log flooding on both the Proxmox node and the SAN node.

 

ChrisDB

New Member
Jan 28, 2020
coder said: (quoting the earlier reply about pvestatd and iscsi_test_portal)
In my case, I have both the target (via the targetcli-fb package) and the initiators running on the same cluster: the target is configured on one node and an initiator on each node.
I can confirm that configuring the iSCSI initiators via iscsiadm, instead of via the PVE web interface, fixed the issue: the "TCP pings", which the target interpreted as connection failures, are no longer sent.
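For reference, configuring the initiator directly with iscsiadm (rather than through the PVE storage GUI) typically looks like the transcript below. The portal address and IQN are hypothetical placeholders; substitute your own, and note that setting node.startup to automatic makes open-iscsi re-establish the session at boot:

```
root@pve1:~# iscsiadm -m discovery -t sendtargets -p 172.16.0.1:3260
root@pve1:~# iscsiadm -m node -T iqn.2003-01.org.example:target1 -p 172.16.0.1:3260 --login
root@pve1:~# iscsiadm -m node -T iqn.2003-01.org.example:target1 -p 172.16.0.1:3260 -o update -n node.startup -v automatic
```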
 
