Local Disk Locking Up

manitc

Jan 4, 2022
I've just run into a peculiar issue. I have a 3-node cluster set up with CephFS. I created an Ubuntu desktop VM, added a new raw drive on local storage, and mounted it to a folder. The OS drive is located on CephFS.

Everything appears to work fine at first. But after I pull a bunch of files through git and write large amounts of data to the raw local drive, the disk locks up within a few minutes. All disk operations on it fail. Even a simple command like "ls" or "touch" just hangs. The OS is still running and the Ceph drive is still readable and writable. I tried this on multiple nodes and I get the same failure.

On one of the nodes, a physical disk in the Ceph cluster has failed, but the overall Ceph health status reports OK. I don't think a single bad drive in the Ceph cluster could cause a local drive to fail. I'm going to replace the drive next time I'm at the datacenter, but I wanted to see if anyone has run into the same issue. Where can I see logs for VM disk I/O errors?

Version: Proxmox 7.1-4
VM OS: Ubuntu 22.04 Desktop, Ubuntu 18.04 Desktop

Thanks,

Manit
 
I've got a Windows 10 VM with the same setup in the cluster. The OS is on the local drive and a secondary drive is on CEPH. The OS keeps locking up.

On the Ubuntu desktops, I moved the disks to local storage. I'll run some tests today and update with my findings.
 
I moved all drives to local storage and I'm still seeing intermittent freezes/locks on the drive.
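
On the question of where to look for disk I/O errors: one rough starting point is to filter the kernel log, both inside the guest and on the Proxmox host, for the messages Linux usually prints when a block device stalls. The sketch below is only an illustration under some assumptions: the pattern list is my own guess at useful strings (it is not exhaustive), the helper names `find_disk_errors` and `kernel_log_lines` are made up for this example, and it assumes a systemd-based system where `journalctl -k` is available.

```python
import re
import subprocess

# Assumed, non-exhaustive list of kernel messages that often accompany
# a stalled or failing block device.
PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"blocked for more than \d+ seconds"),
    re.compile(r"hung_task"),
]


def find_disk_errors(lines):
    """Return the log lines that match any of the patterns above."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(p.search(line) for p in PATTERNS)
    ]


def kernel_log_lines():
    """Read kernel messages via journalctl (may require root)."""
    result = subprocess.run(
        ["journalctl", "-k", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.splitlines()
```

Calling `find_disk_errors(kernel_log_lines())` in the VM and again on the node hosting it should show whether the hang is reported by the guest kernel, the host kernel, or neither.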
 
