Cluster iSCSI issues

troycarpenter

Renowned Member
Feb 28, 2012
Central Texas
I have a cluster that was originally set up with 5 nodes. Each node has a management interface, as well as a private network for corosync to communicate over. Over time I added three more nodes to the cluster and all seems to be working.

I have now configured iSCSI at the datacenter level as outlined in various sources, and set up LVM on top of it. What I have noticed is that the original 5 nodes in the cluster have no issues using the shared storage. What is weird is that the three new nodes are having all kinds of issues with it.

I've seen the following errors on one of the new nodes:
[431436.828980] blk_update_request: I/O error, dev dm-0, sector 4294967168
[431436.835805] blk_update_request: I/O error, dev dm-0, sector 4294967280
[431436.842690] blk_update_request: I/O error, dev dm-0, sector 0
[431436.848771] blk_update_request: I/O error, dev dm-0, sector 8

I'm also seeing this error on another new node (this one is by far the most common error I see):
[3284852.201673] connection1:0: detected conn error (1020)

Like I said, the original nodes in the cluster do not have the issue and appear to be working perfectly with the iSCSI configuration; only the three nodes I added at a later date are affected.

Any ideas?
 
I tried a couple of times to get LXC on iSCSI LVM working, but had issues. See the iscsi/napp-it wiki page, "lxc on iscsi" section.

On the nodes with issues, can you check /etc/lvm/archive/ to see if it is filling up?
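For reference, a quick way to check that (a sketch; the path is the stock Debian/Proxmox default, adjust if your lvm.conf archives elsewhere):

```shell
# Count archived LVM metadata backups; thousands of files here means the
# directory is filling up, which can slow down every LVM command.
ARCHIVE_DIR=${ARCHIVE_DIR:-/etc/lvm/archive}
echo "entries: $(ls -1 "$ARCHIVE_DIR" 2>/dev/null | wc -l)"
du -sh "$ARCHIVE_DIR" 2>/dev/null
```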
 
Sorry, been on PTO for the past 3 weeks. I'm not using LXC on iSCSI LVM; these are all KVM.

So far what I found is that the three new nodes all had the same iSCSI initiator ID. I've changed them so they are all unique (the initiator IDs were already unique for the other nodes) and that seems to have solved the connection issues. However, I'm still having trouble with the one node giving sector errors.

UPDATE: I made all the iSCSI initiators unique and rebooted the trouble nodes. Now all nodes seem to be working fine with iSCSI.
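A quick way to verify that across the whole cluster (a sketch; the node names below are placeholders for your own hosts):

```shell
# Collect each node's InitiatorName over SSH and flag duplicates.
# Any line printed by `uniq -d` is an IQN shared by two or more nodes.
for node in pve1 pve2 pve3; do
    ssh "$node" cat /etc/iscsi/initiatorname.iscsi
done | sort | uniq -d
```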
 
how do you check for 'iSCSI initiator ID' ?
 
cat /etc/iscsi/initiatorname.iscsi

I guess more properly called the InitiatorName.

I should have paid closer attention to the FreeNAS logs...they were telling me the problem all along.
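For anyone who cloned their nodes: a sketch of regenerating a unique name on the copy (the Debian-style IQN prefix below is an assumption; adjust it for your distro, or use `iscsi-iname` from the open-iscsi package if it's installed):

```shell
# Build a fresh random IQN so the cloned node no longer shares an
# InitiatorName with the node it was imaged from.
NEW_IQN="iqn.1993-08.org.debian:01:$(tr -dc 'a-f0-9' </dev/urandom | head -c 12)"
echo "InitiatorName=$NEW_IQN"
# Then write that line to /etc/iscsi/initiatorname.iscsi and restart
# iscsid (or reboot, as above) so the target sees the new name.
```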
 
So the InitiatorName= should be different on every node. Any idea how there were duplicates?
 
I'm guessing it's because the newer nodes were made from a disk image of one of the other nodes. That would explain how they had duplicate names...I'll have to add it to my list of things to modify when a new node is added to the cluster.
 
