Node startup with iSCSI multipath, some errors

We have a Proxmox cluster with 3 nodes. Our SAN is a Dell MD3820i and we use iSCSI with LVM on top. The iSCSI storage with LVM was configured in Storage Manager.

Now, on the first node, I am trying to add a multipath connection to the iSCSI SAN. During node startup the boot process hangs for around 2 minutes while the iSCSI connection is initialised. The error is: iscsiadm[3908]: iscsiadm: default: 1 session requested, but 1 already present.

Code:
root@pve1:~# systemctl status open-iscsi.service
● open-iscsi.service - Login to default iSCSI targets
   Loaded: loaded (/lib/systemd/system/open-iscsi.service; enabled; vendor preset: enabled)
   Active: active (exited) since Thu 2020-12-17 17:39:37 CET; 18h ago
     Docs: man:iscsiadm(8)
           man:iscsid(8)
  Process: 3901 ExecStartPre=/bin/systemctl --quiet is-active iscsid.service (code=exited, status=0/SUCCESS)
  Process: 3908 ExecStart=/sbin/iscsiadm -m node --loginall=automatic (code=exited, status=15)
  Process: 4408 ExecStart=/lib/open-iscsi/activate-storage.sh (code=exited, status=0/SUCCESS)
 Main PID: 4408 (code=exited, status=0/SUCCESS)

Dec 17 17:37:31 pve1 iscsiadm[3908]: iscsiadm: default: 1 session requested, but 1 already present.
Dec 17 17:37:31 pve1 iscsiadm[3908]: iscsiadm: default: 1 session requested, but 1 already present.
Dec 17 17:39:37 pve1 iscsiadm[3908]: iscsiadm: Could not login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd, portal: 192.168.131.102,3260].
Dec 17 17:39:37 pve1 iscsiadm[3908]: iscsiadm: initiator reported error (8 - connection timed out)
Dec 17 17:39:37 pve1 iscsiadm[3908]: iscsiadm: Could not login to [iface: default, target: iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd, portal: 192.168.130.102,3260].
Dec 17 17:39:37 pve1 iscsiadm[3908]: iscsiadm: initiator reported error (8 - connection timed out)
Dec 17 17:39:37 pve1 iscsiadm[3908]: iscsiadm: Could not log into all portals
Dec 17 17:39:37 pve1 iscsiadm[3908]: Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd, portal: 192.168.131.102,3260] (multiple)
Dec 17 17:39:37 pve1 iscsiadm[3908]: Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd, portal: 192.168.130.102,3260] (multiple)
Dec 17 17:39:37 pve1 systemd[1]: Started Login to default iSCSI targets.
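A side note: both portals that time out in that log (the .102 address on each subnet) are ones the initiator has node records for but cannot reach at boot, so the boot-time "--loginall=automatic" keeps retrying them until the timeout. A minimal sketch of how those node records could be inspected and switched to manual startup (or removed) with iscsiadm; note that the Proxmox storage layer may re-create records when it activates the storage, so treat this as a diagnostic aid rather than a permanent fix:

Code:
# list node records and currently active sessions
iscsiadm -m node
iscsiadm -m session

# set the record for an unreachable portal to manual startup
iscsiadm -m node \
  -T iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd \
  -p 192.168.131.102:3260 \
  -o update -n node.startup -v manual

# or delete the record entirely
iscsiadm -m node \
  -T iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd \
  -p 192.168.131.102:3260 \
  -o delete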

Multipath seems to work and I see all disks with lvs:

Code:
root@pve1:~# multipath -ll
os (3600a098000bd9fb9000003235d26) dm-6 DELL,MD38xxi
size=9.0T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
`-+- policy='round-robin 0' prio=14 status=active
  |- 15:0:0:3 sdc 8:32 active ready running
  `- 16:0:0:3 sde 8:64 active ready running
daten (3600a098000bd9fda000002895d26) dm-23 DELL,MD38xxi
size=15T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
`-+- policy='round-robin 0' prio=14 status=active
  |- 15:0:0:1 sdb 8:16 active ready running
  `- 16:0:0:1 sdd 8:48 active ready running

Multipath configuration:

Code:
defaults {
        polling_interval        2
        path_selector           "round-robin 0"
        path_grouping_policy    multibus
        path_checker            rdac
        getuid_callout          "/lib/udev/scsi_id -g -u -d /dev/%n"
        rr_min_io               100
        failback                immediate
        no_path_retry           queue
        user_friendly_names     yes
}
blacklist {
        wwid .*
}
blacklist_exceptions {
        wwid "3600a098000bd9fda000002895d26"
        wwid "3600a098000bd9fb9000003235d26"
        wwid "600a098000bd9fb9000000005cd"
        property "(ID_SCSI_VPD|ID_WWN|ID_SERIAL)"
}
multipaths {
  multipath {
        wwid "3600a098000bd9fda000002895d26"
        alias daten
  }
  multipath {
        wwid "3600a098000bd9fb9000003235d26"
        alias os
  }

  multipath {
        wwid "600a098000bd9fb9000000005cd"
        alias portal
  }

}
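For reference, a sketch of how the WWIDs used in blacklist_exceptions and multipaths can be cross-checked against what the block devices actually report (using the same getuid_callout as in the config above), and how multipathd can be told to re-read the file afterwards; the device names are the ones from the multipath -ll output:

Code:
# print the WWID the same way the configured getuid_callout does
/lib/udev/scsi_id -g -u -d /dev/sdb
/lib/udev/scsi_id -g -u -d /dev/sdc

# re-read /etc/multipath.conf and rebuild the maps
multipathd -k'reconfigure'
multipath -r
multipath -ll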

What is wrong in my configuration, and how can I improve the startup behaviour to avoid these errors?
 
Please post your storage config (/etc/pve/storage.cfg).
 
Storage config:

Code:
dir: local
        path /var/lib/vz
        content rootdir,images,snippets,vztmpl,iso
        maxfiles 3
        shared 0

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

iscsi: san2
        portal 192.168.130.101
        target iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd
        content none

lvm: san2-os
        vgname iscsi-os
        content rootdir,images
        shared 1

lvm: san2-daten
        vgname iscsi-daten
        content rootdir,images
        shared 1
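As a side note, the iscsi entry points at a single portal (192.168.130.101). A quick sendtargets discovery against that portal shows which portals and target names the array actually advertises, which can be compared with what the initiator later tries to log in to at boot (a sketch, using the portal and IQN from the config above):

Code:
iscsiadm -m discovery -t sendtargets -p 192.168.130.101
# expected output is one line per advertised portal, along the lines of:
# 192.168.130.101:3260,1 iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd
# 192.168.131.101:3260,1 iqn.1984-05.com.dell:powervault.md3800i.600a098000bd9fb9000000005cd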
 
The MD series from Dell are active/passive arrays (at least a lot of them), which means only one controller has the LUN active at a time.
I have seen these timeouts in the past when the system tries to access the inactive controller for a LUN.
It seems you have 2 paths per LUN. I guess there are 4 paths in total (2 per controller, 2 controllers).
Have you tried disconnecting one controller to see whether the behaviour persists?
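For reference, the multipath.conf device section commonly used for Dell MD3xxx-series arrays groups paths by RDAC priority, so that the paths to the owning controller are preferred; a rough sketch (check it against Dell's documentation for the MD3820i before using it, the product string is the one reported by multipath -ll above):

Code:
devices {
        device {
                vendor                  "DELL"
                product                 "MD38xxi"
                path_grouping_policy    group_by_prio
                prio                    rdac
                path_checker            rdac
                hardware_handler        "1 rdac"
                failback                immediate
                no_path_retry           30
        }
}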
 
Yes, I have two controllers, active/passive, in my MD3820i. At the moment I have only two paths to the active controller. The passive controller is not attached to the switch yet.

I tried disconnecting one of the two network connections from the active controller. It worked and I did not lose the connection. Do you mean that the node is missing the 2 additional paths to the passive controller during startup?

I will check it tomorrow and post the results here.
 
At the moment I have only two paths to the active controller. The passive controller is not attached to the switch yet.
Can you please provide more information about your cabling setup, the IP addresses used, the portals connected, iscsi-initiator.conf, etc.?
Because I obviously jumped to the wrong conclusion from the multipath output.
If the passive controller is not yet attached to the switch and the initiators are already pointing to it, this can also have these effects. In that case the initiator tries to connect to the target portal and retries multiple times until it reaches the timeout.
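How long those retries take at boot is governed by the login timeout and retry settings in /etc/iscsi/iscsid.conf; a sketch of the relevant knobs (the values shown are the usual defaults, check your installed file):

Code:
# /etc/iscsi/iscsid.conf (excerpt)
node.conn[0].timeo.login_timeout = 15
node.session.initial_login_retry_max = 8
# roughly login_timeout * initial_login_retry_max seconds (about 2 minutes
# with these defaults) can be spent per unreachable portal, which would
# match the hang described at the start of the thread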
 
Each node has two network connections, one for 192.168.130.0/24 and the other for 192.168.131.0/24.
SAN IP addresses:
Controller 0:
port 0 - 192.168.130.101/24
port 1 - 192.168.131.101/24

Controller 1:
port 0 - 192.168.130.102/24
port 1 - 192.168.131.102/24

The passive controller is now attached to the switch. Node startup is faster, but I got these errors:

Code:
Failed to start LVM event activation on device 253:8
Failed to start LVM event activation on device 253:9

connect to 192.168.131.101:3260 failed (No route to host)
connect to 192.168.131.102:3260 failed (No route to host)

Attached is the output of journalctl -b showing the iSCSI startup.
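To see which device-mapper devices the "LVM event activation on device 253:8/253:9" messages refer to, and what those pvscan units logged, something like this could help (a sketch; the minor numbers are the ones from the log above):

Code:
# map dm major:minor numbers to names
dmsetup info -c -o name,major,minor
lsblk -o NAME,MAJ:MIN,TYPE,SIZE

# the failing units are instantiated per device
systemctl status 'lvm2-pvscan@253:8.service'
journalctl -b -u 'lvm2-pvscan@253:8.service'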
 

The connect issue on IP subnet 131 would likely explain your lengthy boot times, since the system retries multiple times with wait cycles in between.
The question is: why can these IP addresses not be reached? Can you ping them?
 
Yes, the nodes can ping all SAN IPs. Is it possible that the iSCSI service starts before the network device for 192.168.131.0 is up?
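One way to check that ordering is to look at the boot chain of open-iscsi and compare it with the timestamps of the networking service and the storage interface (a sketch; unit and interface names assume a standard Proxmox/Debian setup like the one shown later in the thread):

Code:
systemd-analyze critical-chain open-iscsi.service
systemctl list-dependencies --after open-iscsi.service
journalctl -b -u networking.service -u iscsid.service -u open-iscsi.service
ip -br addr show dev ens1f3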
 
Is there a route between subnet 130 and 131 or are the subnets / vlans not routed?

Could you please try using "ping -I" and specify the source address from that subnet? Does this work as well?
-I interface address Set source address to specified interface address. Argument may be numeric IP address or name of device. When pinging IPv6 link-local address this option is required.
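For example (a sketch; the .2 source addresses are the node's own addresses on the two storage subnets, as shown in the routing table below):

Code:
ping -c 3 -I 192.168.130.2 192.168.130.101
ping -c 3 -I 192.168.130.2 192.168.130.102
ping -c 3 -I 192.168.131.2 192.168.131.101
ping -c 3 -I 192.168.131.2 192.168.131.102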
 
There is no route between subnet 130 and 131 on the switch. Each node has two network cards, one for subnet 130 and a second one for subnet 131. Neither card has a gateway configured.

Code:
root@pve2:~# ip route
default via 192.168.90.254 dev vlan90 proto kernel onlink
172.16.0.0/24 dev vlan3000 proto kernel scope link src 172.16.0.2
192.168.90.0/24 dev vlan90 proto kernel scope link src 192.168.90.2
192.168.130.0/24 dev eno2 proto kernel scope link src 192.168.130.2
192.168.131.0/24 dev ens1f3 proto kernel scope link src 192.168.131.2
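For completeness, the matching /etc/network/interfaces stanzas for the two storage NICs would typically look like this on a Proxmox node (a sketch with the addresses from the routing table above; no gateway is set on either interface):

Code:
auto eno2
iface eno2 inet static
        address 192.168.130.2/24

auto ens1f3
iface ens1f3 inet static
        address 192.168.131.2/24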
 
Hm.
Are your connections running through a switch? (I assume yes.)
How are your ports configured?
I am thinking that the NIC might be trying to auto-negotiate and therefore takes a little longer. This could be caused by a defective cable, a bad port, or something similar.
Do the ports actually report OK from a speed perspective?
 
Yes, all connections between the nodes and the SAN run through a switch. The switch ports are set to "auto" and the indicated speed is 10Gig.

Do I need a route between SAN VLAN 130 and 131 on my switch?
 
Do I need a route between SAN VLAN 130 and 131 on my switch?
No, please don't!
I was just asking because in that case (routing in place) some of your IP tests could have been invalidated.

All this sounds like a timing issue to me. And those are the hardest to troubleshoot :/
Do you have portfast enabled on your switch (assuming that feature is available)?
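On a Cisco-style switch, enabling portfast on the edge ports facing the nodes and the SAN would look roughly like this (a sketch; the interface name is a placeholder, and other vendors have an equivalent edge-port setting):

Code:
interface TenGigabitEthernet1/0/1
 description pve1 eno2 (iSCSI VLAN 130)
 spanning-tree portfast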
 
