On one of the nodes in our 2-node cluster I can see 1-4 instances of "vgs" running at 100% CPU.
"ps -ef" tells me:
root 21183 6134 99 16:48 ? 00:01:30 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free
Apparently these calls fail, because every 2 minutes there's a new attempt with a different PID. Also, vgdisplay needs roughly 5 minutes to display its results, which look fine though.
Furthermore, there are no errors in /var/log/messages or dmesg.
This causes machine starts/stops to take VERY long, and the web service on this node is unresponsive, which means the other node cannot display stats about the VMs on the faulty node #2.
The weird thing is: you can interact with the VMs on that node in the usual way and you can also shut them down from inside the VM just fine.
Any pointers to help resolve this?
PS: we are using SAN storage, and it should go without saying that it's connected via FC, including a shared storage for KVM guests.