PVE 6.0 slow SSD RAID1 performance in Windows VM

freeman1doma

Hello.
I have PVE 6.0-7 on a Dell R730.
PVE is installed on a RAID1 of HDDs,
and there is a separate SSD RAID1 for MSSQL VMs and their SQL data.
SSD model: 2 x SSDSC2KB480G8R Dell Certified Intel S4x00/D3-S4x10 Series SSDs (Intel D3-S4510 480GB)

# megacli -LDinfo -Lall -aALL


Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :sysraid1hdd
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 3.637 TB
Sector Size : 512
Is VD emulated : No
Mirror Data : 3.637 TB
State : Optimal
Strip Size : 256 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Enabled
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No


Virtual Drive: 1 (Target Id: 1)
Name :dbssd
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 401.940 GB
Sector Size : 512
Is VD emulated : Yes
Mirror Data : 401.940 GB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Enabled
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: No
LD has drives that support T10 power conditions: No
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No

On the SSD RAID1 I created a GPT partition formatted with ext4:
# parted /dev/sdb print
Model: DELL PERC H730 Mini (scsi)
Disk /dev/sdb: 432GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number Start End Size File system Name Flags
1 1049kB 432GB 432GB ext4 primary
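
Roughly, the partition and filesystem were set up like this (a sketch rather than the exact command history; the /mnt/ssd-lin mount point is the one used further below):

Code:
# create a GPT label and a single aligned ext4 partition on the SSD virtual drive
parted -s /dev/sdb mklabel gpt
parted -s -a optimal /dev/sdb mkpart primary ext4 1MiB 100%
mkfs.ext4 /dev/sdb1
# mount it where the PVE directory storage will live
mkdir -p /mnt/ssd-lin
mount /dev/sdb1 /mnt/ssd-lin
echo '/dev/sdb1 /mnt/ssd-lin ext4 defaults 0 2' >> /etc/fstab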

# cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content backup,vztmpl,iso

lvmthin: local-lvm
thinpool data
vgname pve
content rootdir,images

dir: ssddir
path /mnt/ssd-lin/images
content rootdir,images
shared 0
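
The ssddir entry is a plain directory storage; it can also be added via pvesm instead of editing storage.cfg by hand (equivalent to the entry above):

Code:
# register the mounted SSD filesystem as a directory storage for VM images
pvesm add dir ssddir --path /mnt/ssd-lin/images --content rootdir,images --shared 0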

I created a Windows Server 2016 Standard VM as described in the official Proxmox wiki (Windows best-practices / performance guide: raw image, paravirtualized drivers and so on):
# cat /etc/pve/qemu-server/101.conf
agent: 1
bootdisk: scsi0
cores: 2
cpu: host
ide0: local:iso/virtio-win-0.1.171.iso,media=cdrom,size=363020K
ide2: local:iso/en_windows_server_2016_updated_feb_2018_x64_dvd_11636692.iso,media=cdrom
memory: 8192
name: win-ssddir
net0: virtio=7A:50:73:82:CE:38,bridge=vmbr1,firewall=1
numa: 1
ostype: win10
scsi0: ssddir:101/vm-101-disk-0.raw,cache=writeback,discard=on,size=50G
scsihw: virtio-scsi-pci
smbios1: uuid=cab7bd37-7015-40d6-bb2a-c7dd9c097b66
sockets: 1
vmgenid: cf54452a-7a3d-4b4a-af00-634576b7a7c7

A CrystalDiskMark test shows very slow 4k random read and write (((
[Screenshot: win16-crystmr.png — CrystalDiskMark results inside the Windows VM]
VMs on the HDD storage get better speeds than VMs on the SSD storage.

On the host itself the SSD performance looks normal.

pveperf /mnt/ssd-lin/
CPU BOGOMIPS: 134418.40
REGEX/SECOND: 2137222
HD SIZE: 394.63 GB (/dev/sdb1)
BUFFERED READS: 353.10 MB/sec
AVERAGE SEEK TIME: 0.13 ms
FSYNCS/SECOND: 3293.86
DNS EXT: 14.72 ms
DNS INT: 7.95 ms (server.com)

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test4kreadwrite --filename=/mnt/ssd-lin/4test.raw --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test4kreadwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.12
Starting 1 process
test4kreadwrite: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=270MiB/s,w=89.1MiB/s][r=69.1k,w=22.8k IOPS][eta 00m:00s]
test4kreadwrite: (groupid=0, jobs=1): err= 0: pid=8932: Mon Sep 30 20:33:16 2019
read: IOPS=71.5k, BW=279MiB/s (293MB/s)(3070MiB/10990msec)
bw ( KiB/s): min=208904, max=331560, per=100.00%, avg=289717.33, stdev=46596.04, samples=21
iops : min=52226, max=82890, avg=72429.33, stdev=11649.01, samples=21
write: IOPS=23.9k, BW=93.4MiB/s (97.9MB/s)(1026MiB/10990msec); 0 zone resets
bw ( KiB/s): min=68944, max=110672, per=100.00%, avg=96882.67, stdev=15946.85, samples=21
iops : min=17236, max=27668, avg=24220.67, stdev=3986.71, samples=21
cpu : usr=13.60%, sys=80.33%, ctx=63044, majf=0, minf=11
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=279MiB/s (293MB/s), 279MiB/s-279MiB/s (293MB/s-293MB/s), io=3070MiB (3219MB), run=10990-10990msec
WRITE: bw=93.4MiB/s (97.9MB/s), 93.4MiB/s-93.4MiB/s (97.9MB/s-97.9MB/s), io=1026MiB (1076MB), run=10990-10990msec

Disk stats (read/write):
sdb: ios=779122/260515, merge=0/27, ticks=381438/59662, in_queue=0, util=99.17%

**********************************

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test4kreadwrite --filename=/mnt/ssd-lin/1g-test.raw --bs=4k --iodepth=64 --size=1G --readwrite=randrw --rwmixread=75
test4kreadwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.12
Starting 1 process
test4kreadwrite: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=316MiB/s,w=106MiB/s][r=80.8k,w=27.1k IOPS][eta 00m:00s]
test4kreadwrite: (groupid=0, jobs=1): err= 0: pid=15812: Mon Sep 30 21:08:18 2019
read: IOPS=64.7k, BW=253MiB/s (265MB/s)(768MiB/3037msec)
bw ( KiB/s): min=212008, max=328528, per=99.77%, avg=258214.67, stdev=54747.00, samples=6
iops : min=53002, max=82132, avg=64553.67, stdev=13686.75, samples=6
write: IOPS=21.6k, BW=84.4MiB/s (88.5MB/s)(256MiB/3037msec); 0 zone resets
bw ( KiB/s): min=70832, max=110728, per=99.79%, avg=86278.67, stdev=18693.19, samples=6
iops : min=17708, max=27682, avg=21569.67, stdev=4673.30, samples=6
cpu : usr=13.41%, sys=81.46%, ctx=14246, majf=0, minf=115
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=196498,65646,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=253MiB/s (265MB/s), 253MiB/s-253MiB/s (265MB/s-265MB/s), io=768MiB (805MB), run=3037-3037msec
WRITE: bw=84.4MiB/s (88.5MB/s), 84.4MiB/s-84.4MiB/s (88.5MB/s-88.5MB/s), io=256MiB (269MB), run=3037-3037msec

Disk stats (read/write):
sdb: ios=185452/61913, merge=0/1, ticks=76622/11295, in_queue=0, util=96.74%

I have also tried an LVM volume on top of the SSD RAID, with the same result as with the directory storage.
What can I do to bring the SSD performance in the Windows VM back to a normal level?
 
Please do not double post.
 
I have also tried cache=none + iothread on VirtIO SCSI single, with no luck either... Has anyone had a similar situation?
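For what it is worth, that attempt looked roughly like this (a sketch of the qm commands, not the exact history; the disk line matches 101.conf above):

Code:
# switch the controller to VirtIO SCSI single and re-add the disk with an IO thread and no cache
qm set 101 --scsihw virtio-scsi-single
qm set 101 --scsi0 ssddir:101/vm-101-disk-0.raw,cache=none,discard=on,iothread=1,size=50G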
 
barrier=0 in fstab also gives the same result ((
Maybe CrystalDiskMark 6.02 is not meaningful inside a VM and I need an alternative benchmark? What is the best tool to measure real 4k random read and write in a Windows VM? Thanks. Maybe fio — which parameters are right?
 
Maybe CrystalDiskMark 6.02 is not meaningful inside a VM and I need an alternative benchmark?
Generally I would use FIO for Windows [1].
Then you have the same benchmark in the VM and on the host.
The only difference with FIO on Windows/Linux is the IO engine.

1.) https://bsdio.com/fio/
 
Generally I would use FIO for Windows [1].
Then you have the same benchmark in the VM and on the host.
The only difference with FIO on Windows/Linux is the IO engine.

1.) https://bsdio.com/fio/


Thanks. I tested fio with CrystalDiskMark-like settings, both in the Windows VM and on the host:

[global]
ioengine=windowsaio
group_reporting
filename=C:\Users\Administrator\Documents\fio-diskmark.raw
size=1G
direct=1
ramp_time=5s
runtime=25s
refill_buffers
norandommap
randrepeat=0
allrandrepeat=0

[128K-Q32T1-Seq-Read]
rw=read
bs=128K
iodepth=32
stonewall

[128K-Q32T1-Seq-Write]
rw=write
bs=128K
iodepth=32
stonewall

[4K-Q8T8-Rand-Read]
rw=randread
bs=4K
iodepth=8
numjobs=8
stonewall

[4K-Q8T8-Rand-Write]
rw=randwrite
bs=4K
iodepth=8
numjobs=8
stonewall

[4K-Q32T1-Rand-Read]
rw=randread
bs=4K
iodepth=32
stonewall

[4K-Q32T1-Rand-Write]
rw=randwrite
bs=4K
iodepth=32
stonewall

[4K-Q1T1-Rand-Read]
rw=randread
bs=4K
iodepth=1
stonewall

[4K-Q1T1-Rand-Write]
rw=randwrite
bs=4K
iodepth=1
stonewall

# cat fio-crystal.cmd
[global]
ioengine=libaio
group_reporting
filename=/mnt/ssd-lin/fio-diskmark.raw
size=1G
direct=1
ramp_time=5s
runtime=25s
refill_buffers
norandommap
randrepeat=0
allrandrepeat=0

[128K-Q32T1-Seq-Read]
rw=read
bs=128K
iodepth=32
stonewall

[128K-Q32T1-Seq-Write]
rw=write
bs=128K
iodepth=32
stonewall

[4K-Q8T8-Rand-Read]
rw=randread
bs=4K
iodepth=8
numjobs=8
stonewall

[4K-Q8T8-Rand-Write]
rw=randwrite
bs=4K
iodepth=8
numjobs=8
stonewall

[4K-Q32T1-Rand-Read]
rw=randread
bs=4K
iodepth=32
stonewall

[4K-Q32T1-Rand-Write]
rw=randwrite
bs=4K
iodepth=32
stonewall

[4K-Q1T1-Rand-Read]
rw=randread
bs=4K
iodepth=1
stonewall

[4K-Q1T1-Rand-Write]
rw=randwrite
bs=4K
iodepth=1
stonewall
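
Both job files were run directly against fio (the Windows job file name here is just illustrative):

Code:
# on the PVE host
fio fio-crystal.cmd
# in the Windows VM (job file name illustrative)
fio.exe fio-crystal-win.ini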


In the Windows VM: 4k random read 19.6 MB/s and 4k random write 17.8 MB/s.
On the PVE 6 host: 4k random read 35.0 MB/s and 4k random write 67.8 MB/s.

Those are dramatically low values for an SSD in a VM that is planned to run MSSQL ((
 

On the PVE 6 host: 4k random read 35.0 MB/s and 4k random write 67.8 MB/s.
If the write is faster than the read, then a cache must be involved somewhere in this system.
The question is where it is.

I would retest with LVM (not LVM-thin) instead of ext4; a rough sketch is below.
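Such a thick-LVM storage could be set up along these lines (a sketch; the volume group name ssdvg and storage ID ssdlvm are made up for the example, and it assumes the old ext4 filesystem on /dev/sdb1 has been removed first):

Code:
# create a PV/VG on the SSD virtual drive and register it as thick LVM storage
pvcreate /dev/sdb1
vgcreate ssdvg /dev/sdb1
pvesm add lvm ssdlvm --vgname ssdvg --content images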
 
The IO path of a VM is different from the host's.
This is why it is essential to understand where the caches are and whether they are used or not.
Often the problem is that too many caches are involved in the IO stack, and the result is a performance decrease.
Have you tried with the VM disk cache disabled?

Also, another possibility would be the Windows disk driver.
Did you try your benchmark with a Linux VM?
 
Thanks for the reply.

Have you tried with the VM disk cache disabled?
Yes, I tried the different cache modes for the VM disk (writeback, none, writethrough, directsync). The best performance is in writeback mode; with cache=none random read and write are about half of what I get with writeback.

Also, another possibility would be the Windows disk driver.
Did you try your benchmark with a Linux VM?
I also tried several VirtIO SCSI drivers for Windows (the latest and the stable builds from the Fedora site), no difference.
A fresh Debian install shows the same speeds as Windows.

My other attempt 1: I removed the SSD RAID1, switched the SSDs to JBOD and created a ZFS pool (mirror) with the two SSDs; the result is identical performance to LVM or the ext4 directory.

My other attempt 2: RAID1 removed, JBOD enabled, ZFS pool created on a single SSD; the results:
Code:
- on the host machine fio performance is wonderful: rand 4k READ: IOPS=59.5k bw=423MiB/s (444MB/s), 423MiB/s-423MiB/s (444MB/s-444MB/s), io=1024MiB (1074MB), run=2418-2418msec

rand 4k WRITE: IOPS=66.2k bw=259MiB/s (271MB/s), 259MiB/s-259MiB/s (271MB/s-271MB/s), io=1024MiB (1074MB), run=3957-3957msec


- in the Windows and Linux VMs at most 38 MB/s read and 28 MB/s write (VM disk with cache=none in both cases)


All tests were done on a fresh PVE 6.0-7 after apt-get full-upgrade.
GPT and alignment on the zpool (compression lz4, atime off), LVM with defaults; a rough sketch of the pool creation is below.
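The pools were created roughly like this (a sketch — the device paths and pool names are placeholders; ashift=12 matches the 4K physical sectors reported by parted above):

Code:
# mirrored pool over the two JBOD SSDs, 4K alignment, lz4 compression, no atime
zpool create -o ashift=12 -O compression=lz4 -O atime=off ssdpool mirror /dev/disk/by-id/SSD1 /dev/disk/by-id/SSD2
# single-disk pool for the second test
zpool create -o ashift=12 -O compression=lz4 -O atime=off ssdsingle /dev/disk/by-id/SSD1
# register a pool as PVE storage
pvesm add zfspool ssdzfs --pool ssdpool --content images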
 
I will test tomorrow and see what I get here.
Thanks, I am waiting for it.

Could the bottleneck be that the system is installed on the hard drives? In all my tests the VMs were located on the SSD storage, but the PVE OS itself is installed on the HDD hardware RAID1.
 
Your fio settings are not 100% working:
1.) ioengine=libaio
Note that Linux may only support queued behavior with non-buffered I/O (set `direct=1' or `buffered=0').
So I used posixaio instead, which is comparable to windowsaio.
2.) size will trigger a bug when you use numjobs>1, so I removed it; it is better anyway if the test always runs for 25 s rather than "25 s or 1 G, whichever comes first".
3.) refill_buffers
randrepeat=0
allrandrepeat=0
These can be a problem in the VM because of missing entropy.
4.) norandommap
With an async I/O engine and an I/O depth > 1, it is possible for the same block to be overwritten, which can cause verification errors. Either do not use norandommap in this case, or also use the lfsr random generator.

My results are better, but in the 4K Q1T1 case there is a bottleneck (limit) somewhere.
 

Code:
[global]
ioengine=[posixaio|windowsaio]
group_reporting
filename=/dev/nvme1n1
direct=1
ramp_time=5s
runtime=25s

[128K-Q32T1-Seq-Read]
rw=read
bs=128K
iodepth=32
stonewall

[128K-Q32T1-Seq-Write]
rw=write
bs=128K
iodepth=32
stonewall

[4K-Q8T8-Rand-Read]
rw=randread
bs=4K
iodepth=8
numjobs=8
stonewall

[4K-Q8T8-Rand-Write]
rw=randwrite
bs=4K
iodepth=8
numjobs=8
stonewall

[4K-Q32T1-Rand-Read]
rw=randread
bs=4K
iodepth=32
stonewall

[4K-Q32T1-Rand-Write]
rw=randwrite
bs=4K
iodepth=32
stonewall

[4K-Q1T1-Rand-Read]
rw=randread
bs=4K
iodepth=1
stonewall

[4K-Q1T1-Rand-Write]
rw=randwrite
bs=4K
iodepth=1
stonewall
 
Also, ensure that the CPU frequency is stable and as high as possible.
I noticed the latency gets worse if the CPU frequency is not stable.
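On the PVE host this can be checked and pinned roughly like this (a sketch; the cpupower tool comes from the linux-cpupower package on Debian):

Code:
# show the current governor and frequency range
cpupower frequency-info
# pin all cores to the performance governor
cpupower frequency-set -g performance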
 
Without size, fio cannot start in the Windows VM:
4K-Q1T1-Rand-Write: you need to specify size=
fio: pid=0, err=22/file:filesetup.c:1007, func=total_file_size, error=Invalid argument

I set size=1G in your fio config for the Windows VM; on the host everything runs fine without size. Here is what I got on the single Intel SSD on the ZFS pool:

CPU usage during the test is shown in the attached image, first the VM and then the host:
[Screenshot: cpuuse.png — CPU usage during the fio runs in the VM and on the host]
 

Also, ensure that the CPU frequency is stable and as high as possible.
I noticed the latency gets worse if the CPU frequency is not stable.
In the Windows VM the CPU is always at 100% while fio is running (Task Manager).
In the Debian VM the CPU jumps between 35 and 55% (top inside the VM).
On the host the CPU jumps between 30 and 100%, split between fio and z_wr_iss (seen in top).
 
If you are still interested, you can try vhost-scsi to improve the performance.

See https://zonedstorage.io/projects/qemu/
For the VM config, you must include:
args: -device vhost-scsi-pci,wwpn=naa.500140506c409f62,bus=pci.0,addr=0xf
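That line can be added with qm set (a sketch; the wwpn must match your own vhost-scsi target — the value below is just the one from the example above):

Code:
# add the raw QEMU argument to the VM; it then appears as an args: line in /etc/pve/qemu-server/101.conf
qm set 101 --args '-device vhost-scsi-pci,wwpn=naa.500140506c409f62,bus=pci.0,addr=0xf'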
 
