Proxmox disk performance is slow

Mar 24, 2022
Hi,

Like many other posters in this forum, I'm struggling with "very slow" disk I/O from guest VMs in Proxmox.
For reference, I'm on the Buster-based release, which I think is 6.4(?). One of the threads I read yesterday suggested moving from ceph-nautilus to ceph-octopus, which I did, and that has helped (write speed went up from 70MB/s to 100MB/s). Something that I don't recall seeing before was the option to choose "VirtIO" as the type of hard drive when adding a new disk to a VM.
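(For anyone who wants the CLI equivalent: something like the below should attach a disk with the VirtIO bus - the VM ID, storage name and size here are just placeholders, not from my setup.)

# attach a new 32G disk on the Ceph-backed storage as a VirtIO block device
qm set 100 --virtio0 <ceph-storage>:32
# or re-attach an existing volume on the VirtIO bus
qm set 100 --virtio0 <ceph-storage>:vm-100-disk-1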

In my case the OSDs are all RAID groups of SSDs, with the raw devices ("/dev/sd*") presented to Proxmox, not LVM volumes. The RAID volumes for the OSDs are distinct from the volume Proxmox itself runs on. Via /sys, the RAID disks present as devices with a physical block size of 4096 bytes and a logical block size of 512 bytes. Which size will ceph be using? How do I verify whether ceph is using 4K, and make it use 4K if it isn't?
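(This is roughly how I've been reading the sizes - "sdX" is whichever RAID device backs the OSD, and the metadata key name is my assumption from what octopus reports, so it may differ:)

# what the raw RAID device reports
cat /sys/block/sdX/queue/physical_block_size
cat /sys/block/sdX/queue/logical_block_size
# what the OSD itself believes it is sitting on
ceph osd metadata 0 | grep bdev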

In addition to the "Metrics Server" (influxdb) configuration being set, I've got performance metrics from the OS going into influxdb, which I can then query with grafana.
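(As an aside, if the OS metrics come in via something like telegraf's diskio input - that part is an assumption, measurement and field names will differ with other collectors - the grafana panel boils down to a query that can also be tested from the influx CLI:)

# per-disk write IOPS over the last hour, assuming telegraf's "diskio" measurement
influx -database telegraf -execute \
  'SELECT non_negative_derivative(mean("writes"),1s) FROM "diskio" WHERE time > now() - 1h GROUP BY time(10s),"name"'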

What I see is about 40% disk load generating 100MB/s with about 400 IOPS. When the system is busy with other workloads, I can see it doing > 10,000 IOPS (i.e. the RAID controller isn't the bottleneck).

The cluster I'm running has a "front end" network that's 1G and a back-end network that's 10G. When I'm doing disk performance tests, the 1G network is running at 100%, suggesting there's traffic that could be moved to the 10G side.

In ceph.conf, global.cluster_network is the 10G /24 and global.mon_host has 10G IP#s, but all of the mds.*.host entries point to a hostname that resolves to a 1G IP#, and the mon.*.public_addr entries point to an IP# on the public (1G) network. Can all of these be pushed to use IP#s on the 10G network?
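(The addresses the mons and OSDs are actually advertising can be checked with something like the below - commands only, my output omitted:)

# monitor addresses as recorded in the monmap
ceph mon dump
# per-OSD public/cluster addresses
ceph osd dump | grep "osd\."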
 
Two significant things have helped here.
1) Upgrading from ceph-nautilus to ceph-octopus (~40-50% performance uplift)
2) Getting ceph to use the 10G network for all activity. Seemingly "cluster_network = " is not used the way I expected it to be. I needed to change "public_network = ", fix the IP#s throughout ceph.conf, and dual-list the hostnames in /etc/hosts. My expectation was that putting "cluster_network =" in ceph.conf would result in all mgr/mon/osd traffic using the IP addresses of cluster members that match that subnet - that's not what happens.
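(Roughly what that ends up looking like - the subnet, IP#s and hostnames below are placeholders, not my real ones:)

# /etc/pve/ceph.conf, global section
[global]
    public_network  = 10.10.10.0/24
    cluster_network = 10.10.10.0/24
    mon_host = 10.10.10.11 10.10.10.12 10.10.10.13

# /etc/hosts on every node - hostnames listed against the 10G addresses as well
10.10.10.11   pve1
10.10.10.12   pve2
10.10.10.13   pve3

(One caveat as I understand it: existing mons keep the address stored in the monmap, so they need to be recreated rather than just restarted to actually move to the new network.)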
 
In my case, disabling KSM sharing, disabling swap, and setting "No cache" on every disk helped.
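(For anyone searching later, this is roughly what those three changes look like on a node - the VM ID, storage and disk names are placeholders:)

# stop KSM and unmerge pages that are already shared
systemctl disable --now ksmtuned
echo 2 > /sys/kernel/mm/ksm/run

# turn swap off now (also remove/comment the swap line in /etc/fstab to keep it off)
swapoff -a

# set an existing disk to "No cache" (cache=none) from the CLI
qm set 100 --virtio0 <storage>:vm-100-disk-0,cache=none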
 
2) Getting ceph to use the 10G network for all activity.
Have a look at the network configuration chapter of the Ceph documentation.

TL;DR: As the diagram in that chapter shows, the public network is mandatory and carries all traffic if no cluster network is configured. If you configure the optional cluster network, it is used for inter-OSD traffic (heartbeat, replication, recovery), thereby reducing the load on the public network if you are reaching limits there.
 
