Hello,
We are running a 5-node cluster with shared storage (1x Synology 3617, 1x DELL MD3200i) connected to all nodes over iSCSI, with LVM on top.
Last week we had a problem when I removed an unused disk from a VM stored on the Synology. Everything looked fine and the disk was removed from the VM config, but afterwards lvscan/vgscan/pvscan became very slow on every node (it took minutes before the system responded).
The load on the cluster was low, yet all VMs became very slow. On the Synology, I saw spikes in the disk queue graph and high latency values. On the DELL storage everything seemed to be OK (no queues, low latency).
We did not find any process on any node that could be responsible for the heavy load on iSCSI.
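For anyone who wants to repeat that kind of check, here is a minimal sketch of one way to list processes stuck in uninterruptible sleep (state "D"), which is where tasks blocked on storage I/O usually end up. It assumes a Linux /proc filesystem; the function names `parse_stat` and `find_blocked` are hypothetical, and this is not necessarily the exact check we ran.

```python
import os


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def find_blocked(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited during the scan, or stat was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)


if __name__ == "__main__":
    for pid, comm in find_blocked():
        print(pid, comm)
```

Run on each node while the slowdown is happening, this prints the PID and command name of every task currently blocked in the kernel. A single snapshot can miss short-lived waits, so it is worth running a few times.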
After stopping node 1, where I had started "remove unused disk" from the GUI, all other nodes returned to a healthy state (lvscan/vgscan worked normally again).
So it seems clear that something related to LVM on node 1 was overloading or breaking iSCSI communication with the iSCSI target.
Have you seen this kind of problem before? Do you have any idea how to resolve this kind of situation without restarting the node?