Why do I see such high IO delays?

proxwolfe

Well-Known Member
Jun 20, 2020
Hi,

I am running a little cluster in my homelab:

2x PVE on Xeon E3-1220 with 64GB RAM each
1x PVE on a virtual Xeon E3-1220 with 48GB (running in a VM on a real Xeon E3-1220 with 64GB; the physical machine also houses a PBS)

Each node has a 512GB NVMe drive as part of a Ceph pool for VM storage and a 3GB HDD as part of a Ceph pool for data storage. The PVE cluster and the Ceph cluster each have their own 10GbE network.

On each physical node I have approx. 10 VMs running and I see CPU usage of around 10% (peaking at 20% once in a while). My IO delay oscillates around 5% (peaking at 10% once in a while).

Why is my IO delay so high? How can I improve (reduce) IO delay? (Adding more disks would not be my preferred option.)

Thanks!
 
1x PVE on a virtual Xeon E3-1220 with 48GB (running in a VM on a real Xeon E3-1220 with 64GB; the physical machine also houses a PBS)
Why? PVE in a VM is a bad idea, because of nested virtualization. It's totally fine to run PVE+PBS both bare metal on the same server. How to install PBS on a PVE node is described here: https://pbs.proxmox.com/docs/installation.html#install-proxmox-backup-server-on-proxmox-ve

My IO delay oscillates around 5% (peaking at 10% once in a while).

Why is my IO delay so high?
That isn't that high. I'm seeing 0-5% with local SSDs and 30+% with local HDDs. So with a mix of SSDs and HDDs, and all of it going over the network because of Ceph, your numbers don't sound that unreasonable.
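For context on what that percentage means, here is a minimal sketch, assuming the IO delay figure roughly corresponds to the kernel's iowait share of CPU time as reported in /proc/stat. How exactly the PVE graph is derived is an assumption here, and the function names are made up for illustration.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two /proc/stat snapshots."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so only the
    # first eight fields are summed for the total.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def live_iowait_percent(interval=1.0):
    """Sample /proc/stat twice on a Linux host (hypothetical helper)."""
    with open("/proc/stat") as f:
        before = f.read()
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = f.read()
    return iowait_percent(before, after)
```

Calling live_iowait_percent(5) on a node gives a value averaged over five seconds, which can be compared against the graph in the web UI. Note that iowait only counts time the CPU was idle while waiting for IO, so it also depends on how busy the CPU is otherwise.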
 
Well, a 3 node cluster is already overkill for my purposes. So I figured I would only run two physical nodes and a quorum device instead of the third. But then I thought, I might as well virtualize the third node on the machine I use for PBS. And I think one needs a minimum of three nodes for Ceph. And I wanted to keep the machine I use for PBS outside the cluster so that if anything happens to the cluster, I can still access my backups.
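As a rough illustration of the vote counting behind that choice (a sketch assuming simple majority quorum with one vote per node or quorum device; the function names are hypothetical):

```python
def quorum_threshold(total_votes):
    """Votes needed for a strict majority."""
    return total_votes // 2 + 1


def tolerated_failures(total_votes):
    """How many votes can be lost while a majority still remains."""
    return total_votes - quorum_threshold(total_votes)


for votes in (2, 3):
    print(votes, "votes: need", quorum_threshold(votes),
          "and can lose", tolerated_failures(votes))
```

With two votes, losing either one loses the majority, which is why a third vote (quorum device or third node) is needed either way.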

PVE in a VM is a bad idea, because of nested virtualization.
What is the issue with nested virtualization? I would expect there to be a performance penalty but otherwise... And when it comes to performance, the virtual node actually has the lowest IO delay of the three (oscillating around 4%). So far, I am happy with my decision.

It's totally fine to run PVE+PBS both bare metal on the same server. How to install PBS on a PVE node is described here: https://pbs.proxmox.com/docs/installation.html#install-proxmox-backup-server-on-proxmox-ve
That's how I am running it. It's just that I run another PVE (node 3) inside PVE.

That isn't that high. I'm seeing 0-5% with local SSDs and 30+% with local HDDs. So with a mix of SSDs and HDDs, and all of it going over the network because of Ceph, your numbers don't sound that unreasonable.
Oh, I thought 5% was on the high end of what is acceptable. It is good to know your numbers to put mine into perspective! Thanks for that.
 
