I've run into a peculiar issue. I have a 3-node cluster set up with CephFS. I created an Ubuntu desktop VM, added a new raw disk on local storage, and mounted it to a folder. The OS disk is located on CephFS.
Everything appears to work fine at first. However, after pulling a bunch of files through git and writing large amounts of data to the raw local disk, the disk locks up within a few minutes. All disk operations fail; even a simple command like "ls" or "touch" just hangs. The OS keeps running and the Ceph-backed drive remains readable and writable. I tried this on multiple nodes and get the same failure every time.
On one of the nodes, a physical disk in the Ceph cluster has failed, but the overall Ceph health status still reports OK. I don't think a single bad OSD disk could cause a local drive to fail, and I'm going to replace it next time I'm at the datacenter, but I wanted to see whether anyone has run into the same issue. Where can I see logs for VM disk I/O errors?
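In case it helps, this is the sort of thing I'm planning to check inside the VM and on the Proxmox host (the VM ID and /dev/sdX are placeholders, adjust for your setup):

# Inside the VM: kernel messages for I/O errors or hung tasks on the local disk
sudo dmesg -T | grep -iE 'i/o error|blk_update_request|blocked for more than'
sudo journalctl -k -b

# On the Proxmox host: system log and the VM's disk configuration (VM ID 100 is a placeholder)
journalctl -b | grep -i qemu
qm config 100

# Ceph side: overall status and OSD tree, plus SMART data for the suspect disk
ceph -s
ceph osd tree
smartctl -a /dev/sdX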
Version: Proxmox VE 7.1-4
VM OS: Ubuntu 22.04 Desktop, Ubuntu 18.04 Desktop
Thanks,
Manit