[SOLVED] KRBD and external CEPH slow - VM disk use 100%

infrant

Member
Apr 30, 2021
Hello, I have had a problem in my Proxmox cluster for some time. I have 16 nodes in the cluster, with their VMs stored on an external Ceph cluster of 6 nodes and 72 SSD OSDs. Everything is connected over 25Gb networks to 2 100Gb core switches each. I use an RBD pool, and in Proxmox, when mounting the storage, I select RBD and enable KRBD in the configuration to gain speed. The problem is that all VMs (mainly Windows) drive the disks on that storage to 100% usage. If I run a test or use a VM for a while, disk usage reaches 100% and takes time to get back to normal; sometimes just moving the mouse pushes it above 50%.


My doubt is whether I need to activate something else in the storage for KRBD to work properly, because the VMs (all of which run Windows) sit at 100% disk usage even when nothing is running much of the time.

As I use the machine, or run a disk speed test, writing to the storage gets slow. Check the images below: disk usage stays at 100% the whole time. Up until about a month ago the machines with KRBD active were like a rocket, fast and stable, but a few weeks ago something changed in the cluster and KRBD is no longer efficient.

On the first test the result is very, very good:

[screenshot: first benchmark result]

After a few seconds I get this result and the disk sits at 100%:
[screenshot: benchmark result after a few seconds, disk at 100%]

This is only a test, but I get the same result when I use the VM for browsing, video editing, etc. for a few hours.


When I use an NFS pool, for example, reading and writing are mostly better and the disk does not hit 100%; with RBD and KRBD active, however, I have this problem.

Thanks
 
Could you please also post your configurations in Proxmox VE? A start would be
Code:
qm config <vmid>
cat /etc/pve/storage.cfg
 
Hey Dominic, these are my configurations.

qm config 101

[screenshot: qm config 101 output]

root@devpve02:~# cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content backup,iso,vztmpl

lvmthin: local-lvm
thinpool data
vgname pve
content images,rootdir

nfs: pvebackup
disable
export /mnt/zpool1/pvebackup
path /mnt/pve/pvebackup
server 10.21.8.6
content backup
prune-backups keep-all=1

pbs: pbsbackup
disable
datastore BackupStore
server 10.21.8.31
content backup
fingerprint c1:1c:b7:99:c5:ed:78:0a:fb:c1:77:00:9b:82:7d:c5:79:8e:e5:8f:aa:e9:c7:70:51:1c:82:7a:0e:2d:70:b8
prune-backups keep-all=1
username root@pam

rbd: storevm
disable
content images
krbd 1
monhost 10.21.22.101 10.21.22.102 10.21.22.103 10.21.22.104 10.21.22.105 10.21.22.106
pool storevm
username admin

rbd: ceph_vmstore
content images
krbd 1
monhost 10.21.22.101 10.21.22.102 10.21.22.103 10.21.22.104 10.21.22.105 10.21.22.106
pool vmstore
username admin

rbd: vmtest
content images
krbd 1
monhost 10.21.22.101 10.21.22.102 10.21.22.103 10.21.22.104 10.21.22.105 10.21.22.106
pool vmtest
username admin
 
For more information, I ran an iperf test inside my Ceph cluster and this is the result:

[screenshot: iperf results within the Ceph cluster]

And from Proxmox to Ceph:

[screenshot: iperf results from Proxmox to Ceph]
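
For reference, a run like that can be reproduced with something along these lines (just a sketch; the IP is one of the Ceph monitors from the storage.cfg above, and iperf3 is assumed to be installed on both ends):
Code:
# on a Ceph node
iperf3 -s

# on a Proxmox node: 4 parallel streams for 30 seconds
iperf3 -c 10.21.22.101 -P 4 -t 30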
 
What versions of Proxmox VE, Ceph & virtio drivers do you use exactly? Have you done any upgrades in the last few weeks?
Code:
pveversion -v

Could you maybe run a benchmark in the Windows VM with fio? There are some example commands in the Proxmox VE Ceph Benchmark on page 20. With fio you can control a lot of parameters. It would be great if you could run something comparable. Comparing this to fio in a Linux VM might be interesting to pin down driver issues.
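
Something along these lines would be comparable (not the exact command from the benchmark paper; file name, size and runtime are placeholders, and on Linux --ioengine=libaio would be used instead of windowsaio):
Code:
fio --name=writetest --ioengine=windowsaio --filename=fio.test --size=4G --direct=1 --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based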
 
I did not make any updates, just reinstalled Ceph (the whole cluster), but in Proxmox, using KRBD, I have this problem of the disk at 100% usage. This same Ceph cluster ran without problems before, but from one day to the next the Windows machines started showing this symptom.

When I use NFS everything is OK, but my Ceph storage data is in RBD and I use KRBD in Proxmox for the speed.


This is the Ceph version:

[screenshot: Ceph version]


This is my PVE version:

root@pve1:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-1
pve-kernel-helper: 6.3-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-6
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
 
from one day to the next
The day you reinstalled Ceph?

So, first of all, you can upgrade to Proxmox VE 6.4 with a single apt update; apt full-upgrade.
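
Assuming the repositories already point to a Proxmox VE 6.x repo, that is simply:
Code:
apt update
apt full-upgrade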

After a few seconds I get this result and the disk sits at 100%
I assume that some caches fill up when the performance seems to drop. Therefore the fio tests would be interesting. You would also have to choose something different than cache=unsafe. A comparison to librbd (with fio) might also be interesting.
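
A rough way to compare the two paths directly on a Proxmox node, outside any VM, could look like this (only a sketch: the test image is hypothetical and has to be created first, the pool and client name are taken from the storage.cfg above, /dev/rbd0 is whatever device rbd map prints, and the rbd ioengine requires fio built with librbd support):
Code:
# create a throwaway test image
rbd create vmtest/fiotest --size 10G --id admin

# krbd path: map the image with the kernel client and benchmark the block device
rbd map vmtest/fiotest --id admin
fio --name=krbd-test --ioengine=libaio --filename=/dev/rbd0 --direct=1 --rw=write --bs=4M --iodepth=16 --runtime=60 --time_based

# librbd path: fio talks to the cluster through librbd directly
fio --name=librbd-test --ioengine=rbd --clientname=admin --pool=vmtest --rbdname=fiotest --direct=1 --rw=write --bs=4M --iodepth=16 --runtime=60 --time_based

# clean up
rbd unmap /dev/rbd0
rbd rm vmtest/fiotest --id admin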
 
In my tests I saw something interesting: when I run a task on another VM (not the test VM), it influences the test VM on the same storage. Another thing we noticed: when we take new SSDs straight out of the box, put them into only 3 hosts and deploy Ceph on them, the RBD pool seems to run at 100% performance; but when we use disks we had formatted, even ones that were originally new, the performance drops after we format them again. Is there anything special that has to be done when formatting the disks? I believe we are not doing it correctly.
 
cache=unsafe -> fsyncs are not going to Ceph but are kept in your host memory (fast), and when your host memory is full, the host tries to flush the data (and it'll be super slow for a long time).

Also, if you have a power failure, your VM will be dead with fs corruption.

So never use cache=unsafe (or maybe only for a swap disk).
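
If a disk is still on cache=unsafe, it can be switched per disk, e.g. (a sketch only; the bus/slot and volume name are examples, take the real ones from qm config 101):
Code:
qm set 101 --scsi0 ceph_vmstore:vm-101-disk-0,cache=writeback,discard=on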
 

Thanks for the answer. Yes, we saw that unsafe is problematic, but I'm testing with all of the cache modes; we normally use writeback and discard, unsafe was only for testing. My default configuration is this:
[screenshot: default disk configuration (writeback + discard)]
 
So, if you test with cache=writeback, what are the results?
It's the same :( . In my Ceph cluster I have many slow ops, and in the Windows VM I see disk usage at 100%.

[screenshot: Ceph cluster showing slow ops]

When I use another storage, NFS for example, I have no problems. I did the same test with KRBD on Proxmox 6.4.5 and I get the same result, so I can't see what could be causing this slowdown. When I move a disk from the Ceph storage to another storage it goes fast. My problem is writing: reads are very fast, above 4Gb, but writes start at 1Gb and then go down until everything hangs, as if something queues up and doesn't release the disk, which stays at 100% usage for about 60 seconds.
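
For reference, the slow ops can be watched on the Ceph side while a test is running (standard Ceph commands, nothing cluster-specific):
Code:
ceph -s
ceph health detail
ceph osd perf    # per-OSD commit/apply latency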
 
I stopped the test and the disk stays at 100%:

[screenshot: disk still at 100% after stopping the test]
 
A problem I noticed across several tests: when I disable KRBD in the storage settings in Proxmox, the disk behaves as expected and doesn't sit at 100%, but it is much slower, of course.

My doubt: what could make KRBD stop performing, and how can I fix it?
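
For anyone who wants to compare the two modes, the KRBD flag can be toggled on an existing RBD storage without removing it (a sketch; this assumes pvesm accepts the krbd option for RBD storages, and a VM normally has to be stopped and started again before the change takes effect):
Code:
pvesm set ceph_vmstore --krbd 0    # use librbd
pvesm set ceph_vmstore --krbd 1    # back to the kernel RBD client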
 
We were able to find the performance problem. The SSDs had been formatted so they could be put back into Ceph, and apparently that formatting was not done correctly; I did not find any documentation about how to prepare a disk for use in Ceph. Since they are WD disks, we used WD's software to reset the SSDs to their factory-default, zeroed state, then reinstalled the cluster, and to our surprise everything returned to 100% performance. Thanks to everyone, and I hope this helps someone else too. We banged our heads until we reached this simple solution, and it turned out that was it.
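
For reference, we used WD's own software, so this is only a sketch of the generic approach for cleaning a disk before handing it back to Ceph (destructive; /dev/sdX is a placeholder, and blkdiscard only helps if the SSD supports discard):
Code:
# remove leftover LVM/partition data from a previous OSD
ceph-volume lvm zap /dev/sdX --destroy

# optionally discard all blocks so the SSD starts from a clean state
blkdiscard /dev/sdX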
 