[SOLVED] Only 8xMB/s writes when SATA controller is passed through

aqsss

New Member
Jun 23, 2022
Hi all,
I tried to PCI-passthrough a SATA controller (not the onboard one) to my VM and found a significant performance drop: the write speed dropped to 8x MB/s.
But when I mount it directly on PVE, the write speed goes back to normal. Can someone help me with this issue? Thank you!

Here is my GRUB setting:

GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on drm.debug=0 kvm_amd.nested=1 kvm.ignore_msrs=1 kvm.report_ignored_msrs=0 vfio_iommu_type1.allow_unsafe_interrupts=1"
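(These parameters only take effect after regenerating the GRUB config and rebooting; as a quick sketch, this is the usual way to apply and verify them:)

Code:
update-grub
# after the reboot, confirm the IOMMU is active (AMD systems log AMD-Vi)
dmesg | grep -i -e iommu -e amd-vi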

These are my PCI passthrough settings in the PVE GUI:
[screenshots of the passthrough settings omitted]

Here is dmesg in the VM:
[ 0.148830] pci 0000:06:10.0: [197b:0585] type 00 class 0x010601
[ 0.150356] pci 0000:06:10.0: reg 0x10: [io 0x9200-0x927f]
[ 0.153708] pci 0000:06:10.0: reg 0x14: [io 0x9180-0x91ff]
[ 0.155051] pci 0000:06:10.0: reg 0x18: [io 0x9100-0x917f]
[ 0.156358] pci 0000:06:10.0: reg 0x1c: [io 0x9080-0x90ff]
[ 0.157652] pci 0000:06:10.0: reg 0x20: [io 0x9000-0x907f]
[ 0.158992] pci 0000:06:10.0: reg 0x24: [mem 0xc1600000-0xc1601fff]
[ 0.159811] pci 0000:06:10.0: PME# supported from D3hot
[ 0.756494] ahci 0000:06:10.0: SSS flag set, parallel bus scan disabled
[ 0.757025] ahci 0000:06:10.0: AHCI 0001.0301 32 slots 5 ports 6 Gbps 0x1f impl SATA mode
[ 0.757387] ahci 0000:06:10.0: flags: 64bit ncq sntf stag pm led clo pmp fbs pio slum part ccc apst boh
 
Please provide the output of pveversion -v.

Can you run a fio benchmark both on the host using the controller and inside the VM, running for 10 minutes each?
Use the following command lines and adapt them to your actual disk path:
fio --ioengine=psync --direct=1 --sync=1 --rw=write --bs=4K --numjobs=1 --iodepth=1 --name write_4k --filename=/path/to/disk
fio --ioengine=psync --direct=1 --sync=1 --rw=write --bs=4m --numjobs=1 --iodepth=1 --name write_4m --filename=/path/to/disk

This will show the performance of 4K writes and 4M writes.
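As a sketch, fio can also enforce the 10-minute duration itself via its --runtime and --time_based options, instead of an external timeout (the 25G size is just an example):

fio --ioengine=psync --direct=1 --sync=1 --rw=write --bs=4K --numjobs=1 --iodepth=1 --runtime=600 --time_based --size=25G --name write_4k --filename=/path/to/disk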
 
Thanks for the prompt reply! Here are the results.

pveversion -v
Code:
root@pve:~# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.30-2-pve)
pve-manager: 7.2-3 (running version: 7.2-3/c743d6c1)
pve-kernel-helper: 7.2-2
pve-kernel-5.15: 7.2-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-8
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-6
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.2-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.12-1
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.1.8-1
proxmox-backup-file-restore: 2.1.8-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-10
pve-cluster: 7.2-1
pve-container: 4.2-1
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.4-1
pve-ha-manager: 3.3-4
pve-i18n: 2.7-1
pve-qemu-kvm: 6.2.0-5
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1

How the disk is mounted via fstab:
/dev/sda /mnt/data1 ext4 rw,relatime,data=writeback,barrier=0,nobh,errors=remount-ro 0 2
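For completeness, the options the kernel actually applied can be double-checked with findmnt (part of util-linux); some of the legacy ext4 options above may be ignored by newer kernels:

Code:
findmnt --target /mnt/data1 -o TARGET,SOURCE,FSTYPE,OPTIONS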

In VM (SATA controller passthrough) - 4K
Code:
aqsss@vm:~$ timeout 600 fio --ioengine=psync --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --size=25G --name write_4k --filename=/mnt/data1/test.bin
write_4k: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.28
Starting 1 process
write_4k: Laying out IO file (1 file / 25600MiB)
Jobs: 1 (f=1): [W(1)][14.7%][w=6192KiB/s][w=1548 IOPS][eta 57m:55s]
fio: terminating on signal 15

fio: terminating on signal 15

write_4k: (groupid=0, jobs=1): err= 0: pid=1986: Thu Jun 23 14:17:49 2022
  write: IOPS=1608, BW=6435KiB/s (6590kB/s)(3769MiB/599819msec); 0 zone resets
    clat (usec): min=307, max=730386, avg=620.71, stdev=4174.24
     lat (usec): min=307, max=730386, avg=620.80, stdev=4174.24
    clat percentiles (usec):
     |  1.00th=[   383],  5.00th=[   412], 10.00th=[   416], 20.00th=[   416],
     | 30.00th=[   420], 40.00th=[   433], 50.00th=[   449], 60.00th=[   449],
     | 70.00th=[   449], 80.00th=[   453], 90.00th=[   482], 95.00th=[   586],
     | 99.00th=[   832], 99.50th=[   898], 99.90th=[ 57934], 99.95th=[ 99091],
     | 99.99th=[183501]
   bw (  KiB/s): min=  488, max=11456, per=100.00%, avg=6442.38, stdev=884.41, samples=1198
   iops        : min=  122, max= 2864, avg=1610.58, stdev=221.10, samples=1198
  lat (usec)   : 500=91.99%, 750=6.74%, 1000=0.96%
  lat (msec)   : 2=0.03%, 4=0.01%, 10=0.02%, 20=0.02%, 50=0.10%
  lat (msec)   : 100=0.09%, 250=0.05%, 500=0.01%, 750=0.01%
  cpu          : usr=0.24%, sys=1.98%, ctx=1962853, majf=0, minf=14
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,964991,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=6435KiB/s (6590kB/s), 6435KiB/s-6435KiB/s (6590kB/s-6590kB/s), io=3769MiB (3953MB), run=599819-599819msec

Disk stats (read/write):
  sdb: ios=5/2894105, merge=0/1929726, ticks=730/579048, in_queue=579779, util=99.99%

In PVE HOST - 4K
Code:
root@pve:~# timeout 600 fio --ioengine=psync --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --size=25G --name write_4k --filename=/mnt/data1/test.bin
write_4k: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.25
Starting 1 process
write_4k: Laying out IO file (1 file / 25600MiB)
Jobs: 1 (f=1): [W(1)][15.3%][w=7147KiB/s][w=1786 IOPS][eta 55m:05s]
fio: terminating on signal 15

write_4k: (groupid=0, jobs=1): err= 0: pid=2923: Thu Jun 23 22:53:10 2022
  write: IOPS=1678, BW=6714KiB/s (6875kB/s)(3933MiB/599747msec); 0 zone resets
    clat (usec): min=342, max=248238, avg=595.37, stdev=3865.19
     lat (usec): min=342, max=248238, avg=595.43, stdev=3865.19
    clat percentiles (usec):
     |  1.00th=[   379],  5.00th=[   383], 10.00th=[   383], 20.00th=[   388],
     | 30.00th=[   416], 40.00th=[   416], 50.00th=[   416], 60.00th=[   416],
     | 70.00th=[   429], 80.00th=[   449], 90.00th=[   453], 95.00th=[   553],
     | 99.00th=[   832], 99.50th=[   898], 99.90th=[ 57410], 99.95th=[ 84411],
     | 99.99th=[175113]
   bw (  KiB/s): min= 4240, max=10192, per=100.00%, avg=6717.17, stdev=755.05, samples=1199
   iops        : min= 1060, max= 2548, avg=1679.24, stdev=188.75, samples=1199
  lat (usec)   : 500=92.15%, 750=6.57%, 1000=0.99%
  lat (msec)   : 2=0.02%, 4=0.01%, 10=0.02%, 20=0.02%, 50=0.11%
  lat (msec)   : 100=0.10%, 250=0.04%
  cpu          : usr=0.17%, sys=2.39%, ctx=2013456, majf=0, minf=13
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1006720,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=6714KiB/s (6875kB/s), 6714KiB/s-6714KiB/s (6875kB/s-6875kB/s), io=3933MiB (4124MB), run=599747-599747msec

Disk stats (read/write):
  sda: ios=0/3019492, merge=0/2013341, ticks=0/577127, in_queue=577127, util=100.00%

In VM (SATA controller passthrough) - 4M
Code:
aqsss@vm:~$ timeout 600 fio --ioengine=psync --direct=1 --sync=1 --rw=write --bs=4m --numjobs=1 --iodepth=1 --size=25G --name write_4k --filename=/mnt/data1/test.bin
write_4k: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=psync, iodepth=1
fio-3.28
Starting 1 process
write_4k: Laying out IO file (1 file / 25600MiB)
Jobs: 1 (f=1): [W(1)][99.7%][w=36.0MiB/s][w=9 IOPS][eta 00m:02s]
fio: terminating on signal 15
Jobs: 1 (f=1): [W(1)][99.8%][w=44.0MiB/s][w=11 IOPS][eta 00m:01s]
write_4k: (groupid=0, jobs=1): err= 0: pid=2086: Thu Jun 23 14:29:12 2022
  write: IOPS=10, BW=42.6MiB/s (44.6MB/s)(24.9GiB/599904msec); 0 zone resets
    clat (msec): min=54, max=709, avg=93.91, stdev=10.92
     lat (msec): min=54, max=709, avg=93.98, stdev=10.92
    clat percentiles (msec):
     |  1.00th=[   84],  5.00th=[   88], 10.00th=[   88], 20.00th=[   89],
     | 30.00th=[   91], 40.00th=[   92], 50.00th=[   94], 60.00th=[   95],
     | 70.00th=[   95], 80.00th=[   96], 90.00th=[  102], 95.00th=[  104],
     | 99.00th=[  110], 99.50th=[  117], 99.90th=[  178], 99.95th=[  236],
     | 99.99th=[  709]
   bw (  KiB/s): min= 8208, max=49250, per=100.00%, avg=43627.34, stdev=4323.64, samples=1199
   iops        : min=    2, max=   12, avg=10.60, stdev= 1.07, samples=1199
  lat (msec)   : 100=89.53%, 250=10.42%, 500=0.03%, 750=0.02%
  cpu          : usr=0.08%, sys=0.14%, ctx=19063, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,6383,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=42.6MiB/s (44.6MB/s), 42.6MiB/s-42.6MiB/s (44.6MB/s-44.6MB/s), io=24.9GiB (26.8GB), run=599904-599904msec

Disk stats (read/write):
  sdb: ios=5/38322, merge=0/13190, ticks=780/1681843, in_queue=1682622, util=100.00%

In PVE HOST - 4M
Code:
root@pve:~# timeout 600 fio --ioengine=psync --direct=1 --sync=1 --rw=write --bs=4m --numjobs=1 --iodepth=1 --size=25G --name write_4k --filename=/mnt/data1/test.bin
write_4k: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=psync, iodepth=1
fio-3.25
Starting 1 process
write_4k: Laying out IO file (1 file / 25600MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=44.0MiB/s][w=11 IOPS][eta 00m:00s]
write_4k: (groupid=0, jobs=1): err= 0: pid=4376: Thu Jun 23 23:02:47 2022
  write: IOPS=11, BW=47.7MiB/s (50.1MB/s)(25.0GiB/536314msec); 0 zone resets
    clat (msec): min=48, max=268, avg=83.72, stdev=17.01
     lat (msec): min=48, max=268, avg=83.79, stdev=17.01
    clat percentiles (msec):
     |  1.00th=[   52],  5.00th=[   53], 10.00th=[   54], 20.00th=[   56],
     | 30.00th=[   85], 40.00th=[   88], 50.00th=[   89], 60.00th=[   91],
     | 70.00th=[   94], 80.00th=[   95], 90.00th=[   97], 95.00th=[  102],
     | 99.00th=[  108], 99.50th=[  114], 99.90th=[  180], 99.95th=[  226],
     | 99.99th=[  271]
   bw (  KiB/s): min=24576, max=82084, per=100.00%, avg=48938.36, stdev=11489.71, samples=1072
   iops        : min=    6, max=   20, avg=11.93, stdev= 2.81, samples=1072
  lat (msec)   : 50=0.03%, 100=93.72%, 250=6.22%, 500=0.03%
  cpu          : usr=0.11%, sys=0.18%, ctx=12804, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,6400,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=47.7MiB/s (50.1MB/s), 47.7MiB/s-47.7MiB/s (50.1MB/s-50.1MB/s), io=25.0GiB (26.8GB), run=536314-536314msec

Disk stats (read/write):
  sda: ios=0/64008, merge=0/13267, ticks=0/2551072, in_queue=2551072, util=100.00%
 
One quick update. The drive I used for testing is an old disk.
[attachment: disk benchmark screenshot]

But the fio result only achieves ~50 MB/s, not 80 MB/s.
 
That's not the problem; the problem is the EXTREMELY low IOPS: ~10 in passthrough mode compared to ~1600. That is a factor of 160!
I think you're mixing 4K and 4M results here.

fio, especially when you let it run for 10 minutes with a test size bigger than any cache, will show the actual performance you can expect.
How long does the `Disk Test Suite` run? If it is similar to `CrystalDisk`, don't expect it to reflect the sustained performance.

With a filesystem on top, you have to expect a slight to moderate performance hit, depending on the filesystem.
And based on the results, your controller in the VM doesn't lose much performance; a small hit is to be expected when using virtualization.
 
Thank you so much! One follow-up question.

When I tried to copy a 2.6 GB file to the disk, the VM took 28 seconds (~80 MB/s) but the PVE host took less than 10 seconds. Is that due to the cache?

P.S. my PVE host has 32 GB RAM, whereas the VM only has 4 GB.
 
Yes, that sounds like the cache.

On Linux you can check the current cache size by running `free -h`.
Look for `buff/cache` in its output.
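To take the cache out of the equation before re-testing, you can also drop it manually as root (harmless, but everything will be slower until the cache warms up again):

Code:
sync                               # flush dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches  # drop page cache, dentries and inodes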
 
You can also try to copy the file directly with dd (without using cache):

Code:
dd if=infile of=outfile bs=1M count=1024 iflag=direct oflag=sync
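If you want to copy the whole file instead of a fixed 1 GiB, drop `count` and add `status=progress` (GNU dd) to watch the throughput:

Code:
dd if=infile of=outfile bs=1M iflag=direct oflag=sync status=progress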
 
After using this command and the `nocache` tool, I can confirm it was caused by the cache!
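For anyone else who finds this thread: nocache is used as a simple command prefix (the paths here are just placeholders):

Code:
apt install nocache                     # Debian/Ubuntu package
nocache cp /path/to/file /mnt/data1/    # copy while minimizing page cache usage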
Thanks all for the help!
 
