Ceph random 4K write performance difference on PVE host/PVE VM/ESXi VM

imoniker, Aug 28, 2023
Hi, we did some PVE Ceph performance testing; here are the results:
- random 4K write on the PVE host OS: IOPS = 121K, BW = 472 MiB/s, Storage: 100GB block device on Ceph
- random 4K write inside a PVE VM: IOPS = 23K, BW = 90.3 MiB/s, Storage: 500GB virtual disk on the same Ceph, virtio-scsi/no cache/Kernel Native AIO/CFQ
- random 4K write inside an ESXi VM: IOPS = 66.4K, BW = 259 MiB/s, Storage: 100GB virtual disk on local storage (single Micron 5100 MAX 1.92TB SSD, RAID pass-through mode)

My questions are:
1. Are these test results normal?
2. IOPS is lower inside the VM than on the hypervisor; is there anything we can do to optimize it?
3. IOPS is lower inside the PVE VM than inside the ESXi VM; is that normal?

PVE Ceph Environment: 6 servers, each a Dell R730xd with 2 x Xeon E5-2696 v3 CPUs, 384GB DDR4-2133 memory, 1 x 10Gb Ethernet interconnect, and 6 x Micron 5100 MAX 1.92TB SSDs.
ESXi Environment: 1 server, same configuration as the PVE Ceph servers.

Test 1: on the hypervisor:
fio -direct=1 -iodepth=128 -rw=randwrite -ioengine=libaio -bs=4k -size=50G -numjobs=48 -runtime=120 -group_reporting -filename=/mnt/rbd/iotest -name=Rand_Write_IOPS_Test

Test 2: in the PVE VM:
fio -bs=4k -ioengine=libaio -iodepth=32 -numjobs=16 -direct=1 -rw=randwrite -thread -time_based -runtime=60 -refill_buffers -norandommap -randrepeat=0 -group_reporting -name=fio-randread-lat -size=500G -filename=/dev/sdb

Test 3: in the ESXi VM:
fio -bs=4k -ioengine=libaio -iodepth=32 -numjobs=16 -direct=1 -rw=randwrite -thread -time_based -runtime=120 -refill_buffers -norandommap -randrepeat=0 -group_reporting -name=fio-randread-lat -size=50G -filename=/dev/sdb
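
Note that the three runs above do not use identical fio parameters (iodepth, numjobs, size and runtime differ), so they are not a strict apples-to-apples comparison. Something like the following single command (parameters chosen only as an example; the filename/target has to be adjusted per environment) could be run unchanged in all three places:

fio -name=randwrite-4k-compare -filename=/dev/sdb -rw=randwrite -bs=4k -size=50G -direct=1 -ioengine=libaio -iodepth=32 -numjobs=16 -time_based -runtime=120 -group_reporting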
 
Looking at the FIO commands, they do include the direct flag (-direct=1), which avoids caching as much as possible.

I assume the Ceph storage on PVE does not have the KRBD flag enabled? In that case QEMU connects directly to the RBD image via librbd with only a small cache, and the VM's disk image does not have any caching enabled either.

Try setting the cache of the VM's disk to "writeback" and also check whether enabling KRBD in the storage config changes the result (this needs a cold boot or a live migration of the VM to take effect).
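
A minimal sketch of how that could look (the VM ID 100, the storage name "ceph-rbd" and the extra disk options are placeholders; keep whatever "qm config 100" currently shows for the disk and only change the cache setting):

Code:
# set the existing disk to writeback cache
qm set 100 --scsi0 ceph-rbd:vm-100-disk-0,cache=writeback,discard=on,iothread=1
# enable KRBD on the RBD storage so VMs use the kernel RBD client instead of librbd
pvesm set ceph-rbd --krbd 1
# then cold-boot (stop/start) or live-migrate the VM so the changes take effect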

Maybe the benchmark paper from 2020 is of interest to you: https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2020-09-hyper-converged-with-nvme.76516/ where we tested the different VM and storage settings. And yes, a new one is on the todo list ;)
 
Most of my virtual disks look like this:

Code:
scsihw: virtio-scsi-single
scsi0: CephRBD_NVMe:vm-9901-disk-0,aio=native,cache=writeback,discard=on,iothread=1,size=301G
scsi1: CephRBD_NVMe:vm-9901-disk-1,aio=native,cache=writeback,discard=on,iothread=1,size=401G

and I do not see a tremendous disparity between rados bench and in-VM performance.
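
For reference, the rados bench side of that comparison is something along these lines (the pool name is a placeholder; --no-cleanup keeps the benchmark objects around so a read test can follow):

Code:
# 4K writes straight against the pool, 16 concurrent ops for 60 seconds
rados bench -p <pool> 60 write -b 4096 -t 16 --no-cleanup
# random reads against the objects written above, then remove them
rados bench -p <pool> 60 rand -t 16
rados -p <pool> cleanup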
 
