Hi,
We have an Open-E DSS7 iSCSI cluster with two virtual IPs for the target. We also have a Proxmox 2.1 cluster with 3 nodes, but I have massive problems getting stable iSCSI connections on all nodes:
Code:
[...]
May 16 18:28:01 node-01 iscsid: Kernel reported iSCSI connection 1:0 error (1021) state (3)
May 16 18:28:01 node-01 iscsid: connection1:0 is operational after recovery (1 attempts)
May 16 18:28:01 node-01 kernel: scsi 6:0:0:0: Device offlined - not ready after error recovery
May 16 18:28:43 node-01 kernel: connection2:0: detected conn error (1021)
May 16 18:28:44 node-01 iscsid: Kernel reported iSCSI connection 2:0 error (1021) state (3)
May 16 18:28:45 node-01 iscsid: connection2:0 is operational after recovery (1 attempts)
May 16 18:28:54 node-01 kernel: connection2:0: detected conn error (1021)
May 16 18:28:55 node-01 kernel: scsi 7:0:0:0: Device offlined - not ready after error recovery
May 16 18:28:55 node-01 iscsid: Kernel reported iSCSI connection 2:0 error (1021) state (3)
May 16 18:28:55 node-01 iscsid: connection2:0 is operational after recovery (1 attempts)
May 16 18:28:55 node-01 pvestatd[2201]: status update time (411.585 seconds)
May 16 18:29:57 node-01 kernel: connection3:0: detected conn error (1021)
May 16 18:29:58 node-01 iscsid: Kernel reported iSCSI connection 3:0 error (1021) state (3)
May 16 18:29:59 node-01 iscsid: connection3:0 is operational after recovery (1 attempts)
[...]
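To see which connection drops most often, the iscsid lines can be tallied per connection. This is only a rough sketch that assumes the syslog format shown in the excerpt above; the function name `count_conn_errors` is made up for illustration, and it counts only the iscsid messages so that the matching kernel lines are not counted twice.

```python
import re
from collections import Counter

# Matches iscsid lines such as:
#   "... iscsid: Kernel reported iSCSI connection 2:0 error (1021) state (3)"
# Assumption: the log format is the one shown in the excerpt above.
ISCSID_ERR = re.compile(
    r"iscsid: Kernel reported iSCSI connection (\d+):(\d+) error \((\d+)\)"
)

def count_conn_errors(lines):
    """Return a Counter keyed by (connection id, error code)."""
    counts = Counter()
    for line in lines:
        match = ISCSID_ERR.search(line)
        if match:
            session, conn, code = match.groups()
            counts[(f"{session}:{conn}", int(code))] += 1
    return counts

if __name__ == "__main__":
    # A few lines copied from the excerpt above.
    sample = [
        "May 16 18:28:01 node-01 iscsid: Kernel reported iSCSI connection 1:0 error (1021) state (3)",
        "May 16 18:28:44 node-01 iscsid: Kernel reported iSCSI connection 2:0 error (1021) state (3)",
        "May 16 18:28:55 node-01 iscsid: Kernel reported iSCSI connection 2:0 error (1021) state (3)",
        "May 16 18:29:58 node-01 iscsid: Kernel reported iSCSI connection 3:0 error (1021) state (3)",
    ]
    for (conn, code), n in sorted(count_conn_errors(sample).items()):
        print(f"connection {conn}: error {code} seen {n} time(s)")
```

Running it over a whole day of syslog should show whether the errors are spread evenly over all connections or concentrated on a few of them.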
The nodes and the iSCSI storage are connected over their own LACP-bonded channel.
Code:
Current active iSCSI sessions:
tcp: [1] 192.168.220.20:3260,1 iqn.2013-03:san.backuphost
tcp: [10] 130.xx.xx.xx:3260,1 iqn.2013-04:san.ldap2host
tcp: [2] 130.xx.xx.xx:3260,1 iqn.2013-03:san.backuphost
tcp: [3] 192.168.220.20:3260,1 iqn.2013-05:san.supporthost
tcp: [4] 130.xx.xx.xx:3260,1 iqn.2013-05:san.supporthost
tcp: [5] 192.168.220.20:3260,1 iqn.2013-04:san.icingahost
tcp: [6] 130.xx.xx.xx:3260,1 iqn.2013-04:san.icingahost
tcp: [7] 192.168.220.20:3260,1 iqn.2013-05:san.nypdhost
tcp: [8] 130.xx.xx.xx:3260,1 iqn.2013-05:san.nypdhost
tcp: [9] 192.168.220.20:3260,1 iqn.2013-04:san.ldap2host
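To make the two paths per target easier to see, the session list can be grouped by target name. This is a small sketch that assumes the line format shown in the list above; `portals_by_target` is a hypothetical helper name, not part of any tool.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   "tcp: [1] 192.168.220.20:3260,1 iqn.2013-03:san.backuphost"
# Assumption: the session list has the format shown above.
SESSION = re.compile(r"^tcp: \[(\d+)\] (\S+):(\d+),(\d+) (\S+)")

def portals_by_target(lines):
    """Map each target IQN to a sorted list of (session id, portal address)."""
    paths = defaultdict(list)
    for line in lines:
        match = SESSION.match(line.strip())
        if match:
            sid, addr, _port, _tpgt, iqn = match.groups()
            paths[iqn].append((int(sid), addr))
    return {iqn: sorted(entries) for iqn, entries in paths.items()}

if __name__ == "__main__":
    # Two targets copied from the list above.
    sample = [
        "tcp: [1] 192.168.220.20:3260,1 iqn.2013-03:san.backuphost",
        "tcp: [10] 130.xx.xx.xx:3260,1 iqn.2013-04:san.ldap2host",
        "tcp: [2] 130.xx.xx.xx:3260,1 iqn.2013-03:san.backuphost",
        "tcp: [9] 192.168.220.20:3260,1 iqn.2013-04:san.ldap2host",
    ]
    for iqn, entries in sorted(portals_by_target(sample).items()):
        print(iqn, entries)
```

For the list above this gives two sessions per target, one through 192.168.220.20 and one through the 130.xx.xx.xx address.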
Each node has three IPs:
1. 130.xx.xx.xx -> IP for external connections and vmbr0 -> also the second path to the cluster IP for iSCSI
2. 192.168.200.x -> cluster communication -> bond0 -> LACP
3. 192.168.220.x -> main iSCSI initiator -> the iSCSI target is the Open-E cluster IP 192.168.220.20
The target is only added via 192.168.220.20.
I also tried multipath, but that changes nothing: both paths are lost. Sometimes it works for only a few minutes, sometimes for hours.
We also have one other initiator (Ubuntu), which works without losing the connection at all.
Any suggestions?