Not able to read from Ceph

EliteEng

Jan 6, 2022
I have just converted a 2 node cluster to use the Ceph file store. The migration seemed to go well, and all the data was written onto Ceph.
When it came time to fire up the VMs, I noticed that they just sit there trying to start.
I created some backups on the cephfs (which worked fine), then I tried to debug why the VMs wouldn't start.

I noticed that on the Ceph dashboard in Proxmox I was getting around 10MB/s write speed (about what I expected) and 0B/s read speed.
So I tried to migrate the VMs back out to local disks, and I found that there is a short burst of read speed (up to 300MB/s) for about 2-3 seconds, then it drops back to 0B/s, where it just sits trying to migrate the VM until I stop it.

There is not a lot of data on the ceph filestore (~700GB / 10% used)

I have tried doing backups, snapshots and migrations through the web GUI, all with the same result.
I have also tried rbd export and copy from the command line.
I have also tried to rsync the backups from the cephfs with the same result: I get a burst of data, then it hangs.

Recovery and re-balancing are working between the nodes at around 10-20MB/s.

ceph health is showing 1 slow metadata IO:
Code:
mds.alpha(mds.0): 1 slow metadata IOs are blocked > 30 secs, oldest blocked for 72899 secs
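
To put that number in perspective, here is a minimal sketch that converts the "oldest blocked for" value into hours and minutes. It assumes the health line keeps exactly the format shown above; the function name and regex are my own and not part of any Ceph tooling.

```python
import re
from datetime import timedelta

# Health line copied verbatim from the ceph health output above.
HEALTH_LINE = (
    "mds.alpha(mds.0): 1 slow metadata IOs are blocked > 30 secs, "
    "oldest blocked for 72899 secs"
)


def oldest_blocked(line: str) -> timedelta:
    """Extract the 'oldest blocked for N secs' duration (hypothetical helper)."""
    match = re.search(r"oldest blocked for (\d+) secs", line)
    if match is None:
        raise ValueError("no 'oldest blocked for N secs' field found")
    return timedelta(seconds=int(match.group(1)))


blocked = oldest_blocked(HEALTH_LINE)
print(blocked)  # 20:14:59
```

So the oldest metadata IO has been stuck for a little over 20 hours, which is roughly how long the reads have been hanging.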


Hardware setup:
2 node cluster, 3 OSDs per node.

Proxmox versions:
Code:
proxmox-ve: 7.1-1 (running kernel: 5.13.19-2-pve)
pve-manager: 7.1-8 (running version: 7.1-8/5b267f33)
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-1
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-4
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3

Any help would be appreciated.
At this point I would be happy to be able to read any data even if some of it is corrupted/missing.
 
