What about on your side? Do you have access to a Proxmox host with a local LV on an SSD drive? Can you see if you get the same difference between VM and host-mounted storage on your side?
So, I've done some extensive testing on one of our servers (wiped it beforehand and set up PVE from scratch) and my findings pretty much align with yours.
What I've tested were the following configurations with fio on a pretty decent NVMe drive:
Host
pve-manager/8.2.7/3e0176e6bb2ade3b (running kernel: 6.8.12-2-pve)
- Bare ext4
- ext4 on LVM
- Bare xfs
- xfs on LVM
- ZFS (single disk, dataset with primarycache=metadata, compression=on, recordsize=128K)
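In case anyone wants to reproduce this: the fio runs were along the lines of the job below. The exact parameters (block size, queue depth, runtime, file size) are representative rather than a verbatim copy of every run I did.
Code:
# representative 4k random write job with direct IO (small-block case);
# the target path matches the ext4 mount used in the ioping runs below
fio --name=randwrite-4k \
    --filename=/mnt/bench/ext4/fio-test \
    --rw=randwrite --bs=4k --iodepth=1 \
    --direct=1 --ioengine=libaio \
    --size=4G --runtime=60 --time_based \
    --group_reporting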
Guest
6.1.0-25-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.106-3 (2024-08-26) x86_64 GNU/Linux
Code:
# qm config 100
agent: 1
boot: order=scsi0;ide2;net0
cores: 8
cpu: host
ide2: none,media=cdrom
memory: 32768
meta: creation-qemu=9.0.2,ctime=1727455008
name: deb-bench-test
net0: virtio=BC:24:11:9D:4E:B8,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-100-disk-0,iothread=1,size=32G,ssd=1
scsi1: lvm-bench-vm:vm-100-disk-0,cache=writeback,iothread=1,size=200G,ssd=1
scsi2: zfs-bench-vm-single:vm-100-disk-0,cache=writeback,iothread=1,size=200G,ssd=1
scsi3: lvm-bench-vm:vm-100-disk-1,cache=writeback,iothread=1,size=200G,ssd=1
scsi4: zfs-bench-vm-single:vm-100-disk-1,cache=writeback,iothread=1,size=200G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=fa27540b-2ff2-470a-9fdc-e52ad4301f97
sockets: 1
- ext4 on LVM (non-thin) storage
- xfs on LVM (non-thin) storage
- ext4 on ZFS (primarycache=all, compression=on, volblocksize=16K)
- xfs on ZFS (primarycache=all, compression=on, volblocksize=16K)
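For completeness, the extra disks were carved out of a (non-thin) LVM VG and a single-disk ZFS pool on the host; the setup was roughly along these lines (the VG/pool names and the partition are placeholders, the storage IDs match the config above):
Code:
# non-thin LVM storage on the NVMe (VG name is a placeholder)
pvesm add lvm lvm-bench-vm --vgname bench-vg --content images

# single-disk ZFS pool plus matching storage (pool name/partition are placeholders)
zpool create zfs-bench-single /dev/nvme0n1p2
pvesm add zfspool zfs-bench-vm-single --pool zfs-bench-single --content images

# allocate and attach a 200G benchmark disk with writeback cache, as in the config above
qm set 100 --scsi1 lvm-bench-vm:200,cache=writeback,iothread=1,ssd=1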
For guests in general, IOPS (and bandwidth) are down by quite a bit for smaller reads and writes when compared to the same FS being used on the host; however, both IOPS and bandwidth are roughly on par with the host for larger reads and writes. This is kind of what I'd expect anyway.
What's also to be expected is that ZFS performance tanks quite a bit compared to using it on the host, as there's no way for its ARC to properly cache anything in the benchmarks I've made. There's probably also some minor overhead due to compression. (And it's ZFS on a single disk, which isn't something you'd usually want to do anyway.)
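To illustrate what I mean: with primarycache=metadata the ARC only holds metadata for that dataset, so every data read in the benchmark has to go to disk. You can check and flip that per dataset, roughly like this (dataset name is a placeholder):
Code:
# check the properties used for the benchmark dataset
zfs get primarycache,compression,recordsize rpool/bench

# let the ARC cache data again for normal workloads
zfs set primarycache=all rpool/bench

# watch ARC hit rates while benchmarking (arcstat ships with the ZFS userland tools)
arcstat 1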
Either way, I digress; I've found that IO latencies are quite a bit higher than on the host itself, which aligns with your findings. For comparison:
ioping -S64M -L -s4k -W -q on the host (counts: 10, 100, 1000):
Code:
--- /mnt/bench/ext4 (ext4 /dev/nvme0n1p1 245.0 GiB) ioping statistics ---
9 requests completed in 654.2 us, 36 KiB written, 13.8 k iops, 53.7 MiB/s
generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
min/avg/max/mdev = 69.1 us / 72.7 us / 78.9 us / 3.07 us
--- /mnt/bench/ext4 (ext4 /dev/nvme0n1p1 245.0 GiB) ioping statistics ---
99 requests completed in 7.68 ms, 396 KiB written, 12.9 k iops, 50.4 MiB/s
generated 100 requests in 1.65 min, 400 KiB, 1 iops, 4.04 KiB/s
min/avg/max/mdev = 68.0 us / 77.5 us / 102.4 us / 7.16 us
--- /mnt/bench/ext4 (ext4 /dev/nvme0n1p1 245.0 GiB) ioping statistics ---
999 requests completed in 79.9 ms, 3.90 MiB written, 12.5 k iops, 48.8 MiB/s
generated 1 k requests in 16.7 min, 3.91 MiB, 1 iops, 4.00 KiB/s
min/avg/max/mdev = 43.6 us / 80.0 us / 143.2 us / 11.7 us
ioping -S64M -L -s4k -W -q in the VM (counts: 10, 100, 1000):
Code:
--- /mnt/bench/lvm-ext4 (ext4 /dev/sdb1 195.8 GiB) ioping statistics ---
9 requests completed in 3.72 ms, 36 KiB written, 2.42 k iops, 9.46 MiB/s
generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
min/avg/max/mdev = 286.4 us / 413.0 us / 476.2 us / 55.5 us
--- /mnt/bench/lvm-ext4 (ext4 /dev/sdb1 195.8 GiB) ioping statistics ---
99 requests completed in 41.8 ms, 396 KiB written, 2.37 k iops, 9.24 MiB/s
generated 100 requests in 1.65 min, 400 KiB, 1 iops, 4.04 KiB/s
min/avg/max/mdev = 233.2 us / 422.7 us / 571.7 us / 47.5 us
--- /mnt/bench/lvm-ext4 (ext4 /dev/sdb1 195.8 GiB) ioping statistics ---
999 requests completed in 418.8 ms, 3.90 MiB written, 2.38 k iops, 9.32 MiB/s
generated 1 k requests in 16.7 min, 3.91 MiB, 1 iops, 4.00 KiB/s
min/avg/max/mdev = 182.7 us / 419.2 us / 658.9 us / 55.4 us
I believe that this is very much due to the virtualization overhead; though, specifically for latency, I cannot say whether this has always been the case or not; perhaps some of the more experienced users can weigh in here. In my personal workloads, latency was never really an issue (low latency was never a hard requirement, and I never hit a point where it ended up mattering).
I'm not sure what Xen does in particular to make it that fast; perhaps it just passes any IO straight through to LVM, or it does whatever it wants and tells you the operation is done before it actually is (kind of like setting cache=unsafe on a disk in PVE, perhaps?). I think the best bet would be to just use a CT if you can, if you really want to minimise the overhead. Alternatively, you could also try mounting some kind of network storage inside the VM, if you happen to have one that's particularly fast.
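For reference, the cache=unsafe comparison would look something like this (purely for benchmarking; it ignores flushes, so you can lose data on a crash or power loss):
Code:
# re-attach the existing LVM-backed disk with unsafe caching, just to compare numbers
qm set 100 --scsi3 lvm-bench-vm:vm-100-disk-1,cache=unsafe,iothread=1,size=200G,ssd=1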
Also, there are a couple of extra resources I dug up that might be of interest to you:
You might want to look into optimising MySQL itself, if you haven't already. Specifically, if there's a way to reduce IOPS by issuing fewer, larger reads and writes instead of a bunch of small ones, that might be worth a shot.
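As a starting point, these are the usual InnoDB knobs for trading a bit of durability and write granularity for fewer small IOs; the values below are only examples to tune against your own workload, not recommendations:
Code:
[mysqld]
# flush the redo log once per second instead of on every commit
# (can lose up to ~1s of transactions on a crash)
innodb_flush_log_at_trx_commit = 2

# larger redo log -> fewer, larger checkpoint writes
# (MySQL >= 8.0.30; use innodb_log_file_size on older versions)
innodb_redo_log_capacity = 2G

# serve more reads from memory instead of hitting the disk
innodb_buffer_pool_size = 8G

# allow more background IO on fast NVMe
innodb_io_capacity     = 2000
innodb_io_capacity_max = 4000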
I hope that helps!