Hi,
we have two Proxmox clusters with about 5 hosts each and a separate Ceph cluster with 7 hosts. All our VMs run off RBD storage on the Ceph cluster. Occasionally a VM (not a specific one, not on a specific host, but just one) will hang for exactly 15 minutes. dmesg inside the VM shows that processes are blocked for this time, and it looks like no disk access is possible from within the VM during those 15 minutes. After these 15 minutes the VM operates as normal again without any problems.
Packet loss could be causing this, but we are unable to change any network devices or configuration. Our main goal for now is to find out which timeout could be responsible for the 15-minute hang and whether we can lower it.
Thanks for any suggestions on where to look or even how to fix this.
Greetings,
Micha Kersloot.