Hi!
I am benchmarking my setup and found that the disk write performance in a VM is approx. 42% lower than on the host.
I am using dd for testing. I know it is not ideal, but it should give a rough
estimate, and many others use it as well.
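For a more realistic measurement, something like fio would probably be better suited; a minimal sketch of a roughly comparable sequential write job (the parameters are just my approximation of the dd runs below):

# sequential direct write, roughly comparable to the dd runs below
fio --name=seqwrite --filename=/dev/vg_vms/vm-100-disk-2 \
    --rw=write --bs=1M --size=512m --ioengine=libaio \
    --direct=1 --iodepth=16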
I have an LSI RAID1 with a BBU, configured in "Writeback with BBU" mode, exposing two
logical drives. One is 100 GB (sda1: LVM VG PVE), used for the Proxmox installation; the
other is 1.8 TB (sdb2: LVM VG VMs), used for the VM disks.
There is also another 4 TB Hitachi disk with an ext4 file system, used for backups.
For testing I created two 2 GB LVs, one on each of the VGs.
There is only ONE VM running on the system. The root disk of this VM is on
a DRBD device, but the second server is disconnected (so there is no influence from DRBD).
The VM test disk is on a RAID-backed LVM LV.
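For reference, a 2 GB test LV like /dev/pve/test can be created like this (the one on vg_vms is the VM disk vm-100-disk-2, added through PVE; VG/LV names as they appear in the device paths below):

# create a 2 GB test LV in VG "pve"
lvcreate -L 2G -n test pve
# verify name, size and backing device
lvs -o lv_name,vg_name,lv_size,devices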
On the host RAID (LVM LV 2G on VMs(/dev/sdb1)) I get:
Direct:
dd if=/dev/zero of=/dev/vg_vms/vm-100-disk-2 bs=512M count=1 oflag=direct
536870912 bytes (537 MB) copied, 6,41299 s, 83,7 MB/s
536870912 bytes (537 MB) copied, 6,36552 s, 84,3 MB/s
536870912 bytes (537 MB) copied, 6,39561 s, 83,9 MB/s
No direct:
dd if=/dev/zero of=/dev/vg_vms/vm-100-disk-2 bs=512M count=1
536870912 bytes (537 MB) copied, 1,14586 s, 469 MB/s
536870912 bytes (537 MB) copied, 1,12022 s, 479 MB/s
536870912 bytes (537 MB) copied, 1,09654 s, 490 MB/s
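Note that without oflag=direct, dd here largely measures the page cache. Adding conv=fdatasync makes dd flush to disk before reporting, which usually gives a more honest buffered number:

# flush data to disk before the rate is reported
dd if=/dev/zero of=/dev/vg_vms/vm-100-disk-2 bs=512M count=1 conv=fdatasync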
On the host RAID (LVM LV 2G on PVE(/dev/sda1)) I get:
Direct:
dd if=/dev/zero of=/dev/pve/test bs=512M count=1 oflag=direct
536870912 bytes (537 MB) copied, 4,09055 s, 131 MB/s
536870912 bytes (537 MB) copied, 4,20133 s, 128 MB/s
536870912 bytes (537 MB) copied, 4,25934 s, 126 MB/s
No direct:
dd if=/dev/zero of=/dev/pve/test bs=512M count=1
536870912 bytes (537 MB) copied, 5,03491 s, 107 MB/s
536870912 bytes (537 MB) copied, 5,16682 s, 104 MB/s
536870912 bytes (537 MB) copied, 6,31498 s, 85,0 MB/s
On the 4 TB disk (mounted partition with ext4) I get:
Direct:
dd if=/dev/zero of=/mnt/scratch/xx/ff bs=512M count=1 oflag=direct
536870912 bytes (537 MB) copied, 4,20657 s, 128 MB/s
536870912 bytes (537 MB) copied, 4,18068 s, 128 MB/s
536870912 bytes (537 MB) copied, 4,09591 s, 131 MB/s
No direct:
dd if=/dev/zero of=/mnt/scratch/xx/ff bs=512M count=1
536870912 bytes (537 MB) copied, 1,37137 s, 391 MB/s
536870912 bytes (537 MB) copied, 4,44111 s, 121 MB/s
536870912 bytes (537 MB) copied, 4,27835 s, 125 MB/s
In the VM, on the 2 GB disk (vm-100-disk-2) [cache=directsync], I get:
Direct:
dd if=/dev/zero of=/dev/vdd bs=512M count=1 oflag=direct
536870912 bytes (537 MB) copied, 11.6293 s, 46.2 MB/s
536870912 bytes (537 MB) copied, 10.9080 s, 49.2 MB/s
536870912 bytes (537 MB) copied, 11.0009 s, 48.8 MB/s
No direct:
dd if=/dev/zero of=/dev/vdd bs=512M count=1
536870912 bytes (537 MB) copied, 11.1444 s, 48.2 MB/s
536870912 bytes (537 MB) copied, 11.4851 s, 46.7 MB/s
536870912 bytes (537 MB) copied, 11.0797 s, 48.5 MB/s
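Since cache=directsync bypasses the host page cache and opens the volume with O_DIRECT|O_DSYNC, it might be worth comparing against cache=none as an experiment. A sketch (the virtio slot and storage name are assumptions; adjust to the real configuration):

# show how the test disk is currently attached
qm config 100
# try cache=none: still O_DIRECT, but without per-request syncing
qm set 100 -virtio3 VMs:vm-100-disk-2,cache=none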
On the 4 TB disk (mounted partition with ext4):
pveperf /mnt/scratch/xx
CPU BOGOMIPS: 28727.28
REGEX/SECOND: 951930
HD SIZE: 1932.59 GB (/dev/sdc2)
BUFFERED READS: 135.83 MB/sec
AVERAGE SEEK TIME: 16.00 ms
FSYNCS/SECOND: 57.79
DNS EXT: 27.09 ms
DNS INT: 1007.16 ms (anw.at)
On the host RAID (LVM LV 2G on PVE(/dev/sda1)[ ext4 ]):
pveperf /mnt/test
CPU BOGOMIPS: 28727.28
REGEX/SECOND: 931980
HD SIZE: 1.91 GB (/dev/mapper/pve-test)
BUFFERED READS: 108.39 MB/sec
AVERAGE SEEK TIME: 7.22 ms
FSYNCS/SECOND: 50.52
DNS EXT: 27.38 ms
DNS INT: 1006.04 ms (anw.at)
On the host RAID (LVM LV 2G on PVE(/dev/sda1)[ ext3 ]):
pveperf /mnt/test
CPU BOGOMIPS: 28727.28
REGEX/SECOND: 955866
HD SIZE: 1.91 GB (/dev/mapper/pve-test)
BUFFERED READS: 147.09 MB/sec
AVERAGE SEEK TIME: 6.91 ms
FSYNCS/SECOND: 35.23
DNS EXT: 28.51 ms
DNS INT: 1006.71 ms (anw.at)
On the host RAID (LVM LV 2G on PVE(/dev/sda1)[ XFS ]):
pveperf /mnt/test
CPU BOGOMIPS: 28727.28
REGEX/SECOND: 933939
HD SIZE: 1.99 GB (/dev/mapper/pve-test)
BUFFERED READS: 140.27 MB/sec
AVERAGE SEEK TIME: 6.17 ms
FSYNCS/SECOND: 86.79
DNS EXT: 26.49 ms
DNS INT: 1006.47 ms (anw.at)
So there is a big difference depending on which file system is used!
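The FSYNCS/SECOND differences likely come down to how each file system handles journaling and write barriers. As a quick check whether barriers are what is being paid for, ext4 can be remounted without them (as a test only; this is unsafe for real data unless the BBU-backed cache is healthy):

# compare fsync rates with and without write barriers (test data only!)
mount -o remount,barrier=0 /mnt/test
pveperf /mnt/test
mount -o remount,barrier=1 /mnt/test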
On the host RAID (LVM LV 2G on VMs(/dev/sdb1))[ ext4 ]:
pveperf /mnt/test
CPU BOGOMIPS: 28727.28
REGEX/SECOND: 924504
HD SIZE: 1.91 GB (/dev/vg_vms/vm-100-disk-2)
BUFFERED READS: 81.18 MB/sec
AVERAGE SEEK TIME: 7.47 ms
FSYNCS/SECOND: 26.23
DNS EXT: 27.70 ms
DNS INT: 1006.62 ms (anw.at)
The LV on the bigger disk also performs poorly here compared to the
LV on the smaller one.
a) I can't understand why there is such a big difference on the host between
the two disks provided by the RAID controller.
b) Why does the smaller RAID disk perform worse in NON-direct mode!?
c) What could be the reason that the VM disk performance is so bad
(host: 84 MB/s vs. VM: 48 MB/s = -42%)?
I am using the latest PVE 4.4 with a slightly older kernel.
proxmox-ve: 4.4-76 (running kernel: 4.4.24-1-pve)
pve-manager: 4.4-1 (running version: 4.4-1/eb2d6f1e)
pve-kernel-4.4.35-1-pve: 4.4.35-76
pve-kernel-4.4.24-1-pve: 4.4.24-72
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-101
pve-firmware: 1.1-10
libpve-common-perl: 4.0-83
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-70
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-1
pve-qemu-kvm: 2.7.0-9
pve-container: 1.0-88
pve-firewall: 2.0-33
pve-ha-manager: 1.0-38
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.6-2
lxcfs: 2.0.5-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80
Edit: I just discovered that my BBU is not in an optimal state. But this should
have no impact on the questions, especially on why the VM write
performance is so bad.
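For what it's worth, the BBU state and the effective cache policy can be checked with MegaCli (binary name and path vary by installation). In "Writeback with BBU" mode the controller typically falls back to write-through while the BBU is degraded, which could at least depress the host-side direct-write and fsync numbers above:

# BBU state
megacli -AdpBbuCmd -GetBbuStatus -aALL
# effective cache policy per logical drive (WriteBack vs WriteThrough)
megacli -LDGetProp -Cache -LALL -aALL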
BR,
Jasmin