I am seeing this error message on one of my newly upgraded Proxmox 3.3 nodes.
When the error occurs, it freezes all VMs, prevents SSH access, and shuts down the Ceph MONs. The only way to bring everything back is to reboot the node. Any idea what is causing it? It seems to happen at least twice a day.
Code:
Oct 28 16:54:19 CA-00-01-01-13 kernel: INFO: task linkwatch:97 blocked for more than 120 seconds.
Oct 28 16:54:19 CA-00-01-01-13 kernel: Tainted: G W --------------- 2.6.32-32-pve #1
Oct 28 16:54:19 CA-00-01-01-13 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 28 16:54:19 CA-00-01-01-13 kernel: linkwatch D ffff88061c756e70 0 97 2 0 0x00000000
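
In case it helps narrow down when the hangs occur, here is a rough sketch of how the hung-task reports could be pulled out of a syslog file. It assumes the lines follow the same format as the excerpt above. The function name find_hung_tasks and the regular expression are my own illustration, not part of any Proxmox or kernel tool.

```python
import re

# Matches the kernel hung-task report line, e.g.
# "... kernel: INFO: task linkwatch:97 blocked for more than 120 seconds."
# An optional "[12345.678]" uptime stamp after "kernel:" is tolerated (assumption:
# some syslog configurations add it; the excerpt in this post does not have it).
HUNG_TASK = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?:\[\s*[\d.]+\]\s+)?"
    r"INFO: task (?P<task>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (timestamp, host, task, pid, seconds) for each hung-task report."""
    hits = []
    for line in lines:
        m = HUNG_TASK.match(line)
        if m:
            hits.append(
                (m["stamp"], m["host"], m["task"], int(m["pid"]), int(m["secs"]))
            )
    return hits


# Sample input taken from the log excerpt in this post.
sample = [
    "Oct 28 16:54:19 CA-00-01-01-13 kernel: INFO: task linkwatch:97 blocked for more than 120 seconds.",
    "Oct 28 16:54:19 CA-00-01-01-13 kernel: linkwatch D ffff88061c756e70 0 97 2 0 0x00000000",
]
for hit in find_hung_tasks(sample):
    print(*hit)
```

Feeding it the lines of the real log file instead of the sample list should give one row per hang, which can then be compared against backup jobs, Ceph activity, or network events at the same times.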