On our Proxmox servers we've been surprised at how much slower the VM clients are than the host. How much disk IO overhead is normal or acceptable for VMs?
These tests were done on a server running Proxmox VE 6.2-15; writing to the dpool on the host machine itself, and then performing the same write benchmark within the client, which has its disk stored on the same dpool.
Code:
NAME                                            STATE     READ WRITE CKSUM
dpool                                           ONLINE       0     0     0
  mirror-0                                      ONLINE       0     0     0
    ata-WDC_WD3000FYYZ-01UL1B1_WD-WCC132177822  ONLINE       0     0     0
    ata-WDC_WD3000FYYZ-DATTO-1_WD-WCC131198607  ONLINE       0     0     0
  mirror-1                                      ONLINE       0     0     0
    ata-WDC_WD3000FYYZ-DATTO-1_WD-WCC131330748  ONLINE       0     0     0
    ata-WDC_WD3000FYYZ-DATTO-1_WD-WCC1F0366551  ONLINE       0     0     0
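For reference, the dataset properties that most affect sync-write behaviour can be read on the host like this; the zvol name below is only a guess based on the VM config further down, so check zfs list first:
Code:
# on the Proxmox host; the zvol name is assumed, check 'zfs list -t volume' first
zfs get sync,compression,recordsize,atime dpool
zfs get sync,volblocksize dpool/vm-105-disk-0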
Testing overall throughput of large file writes...
Code:
root@chestnut:~# dd if=/dev/zero of=/dpool/test bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 9.10997 s, 118 MB/s
I ran 5 tests of this same command, with the following results:
MIN: 99.6 MB/s MAX: 119 MB/s AVG: 108.92 MB/s
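A loop along these lines reproduces the five-run measurement (a sketch, not the exact wrapper used):
Code:
# repeat the same dsync write five times; dd prints its stats on stderr
for i in 1 2 3 4 5; do
    dd if=/dev/zero of=/dpool/test bs=1G count=1 oflag=dsync 2>&1 | tail -n 1
    rm -f /dpool/test
done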
On this same host, now testing from a VM:
Code:
boot:
cores: 8
cpu: host
ide2: local:iso/ubuntu-20.04-live-server-amd64.iso,media=cdrom
memory: 12288
name: cloudberry
net0: e1000=2A:5B:65:68:19:BE,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
parent: before_install
scsi0: VHDs:vm-105-disk-0,cache=none,size=500G
scsihw: virtio-scsi-single
smbios1: uuid=c73ce295-12a4-41fb-9925-0f29dbfefd7e
sockets: 1
vmgenid: 9114d856-946b-42ef-a55d-66d0676f0201
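For reference, the cache mode on scsi0 can be changed from the host with qm set; the extra options in the comment are just other knobs that exist on the same line, not settings we have benchmarked:
Code:
# switch the existing scsi0 disk of VM 105 between cache modes
qm set 105 --scsi0 VHDs:vm-105-disk-0,cache=none,size=500G
qm set 105 --scsi0 VHDs:vm-105-disk-0,cache=writeback,size=500G
# other options available on this line (untested here): iothread=1, aio=native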
With cache=none, the results of 5 runs are as follows (command: dd if=/dev/zero of=test bs=1G count=1 oflag=dsync):
MIN: 32.4 MB/s MAX: 112 MB/s AVG: 89.22 MB/s
Result: roughly an 18% average performance hit for the VM vs. bare metal on large writes, and a much greater spread in the results. 32 MB/s??
Now testing smaller writes using the following command:
dd if=/dev/zero of=test bs=512 count=1000 oflag=dsync
Average of 5 runs on bare metal, the Proxmox host:
MIN: 41.3 kB/s MAX: 52.2 kB/s AVG: 49.08 kB/s
Average of 5 runs on the Ubuntu VM described above:
MIN: 21.4 kB/s MAX: 22.4 kB/s AVG: 22.04 kB/s
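Back-of-the-envelope, those averages work out to roughly 49,080 / 512 ≈ 96 synchronous 512-byte writes per second on the host versus 22,040 / 512 ≈ 43 per second in the VM, i.e. about 10 ms vs. 23 ms per dsync write, so the gap looks like added per-write latency rather than lost bandwidth.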
Result: 55% performance hit, which hardly seems acceptable?
We also tested with cache=writeback and the results were within a percentage point or two. On another system, where the ZFS raid is configured with an L2ARC cache and a ZIL, the max throughput on the client is 90% lower than on the host.
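To see where the sync writes actually land on that system (ZIL device vs. data vdevs), watching the pool from the host while the guest benchmark runs is informative; the pool name here is just the one from above:
Code:
# run on the host while the guest benchmark is going
zpool iostat -v dpool 1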
So in summary, we're seeing a 30-90% disk IO performance penalty on ZFS raid between the host and the client.
Any suggestions on how to resolve this would be much appreciated.