Little remark: MS SQL can now run on Linux, too.
Wow, I almost fell out of my chair when I saw your message!
That is amazing! Unfortunately, I still need Windows Server for other things besides MS SQL, but that was certainly one of the reasons.
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
4kqd32_read: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=windowsaio, iodepth=32
fio-2.15
Starting 1 thread
4kqd32_read: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/355.8MB/0KB /s] [0/91.8K/0 iops] [eta 00m:00s]
4kqd32_read: (groupid=0, jobs=1): err= 0: pid=3928: Tue Dec 13 04:02:36 2016
Description : [4K QD32]
write: io=10810MB, bw=368956KB/s, iops=92239, runt= 30001msec
slat (usec): min=4, max=3019.6K, avg= 8.69, stdev=1819.83
clat (usec): min=1, max=3309.1K, avg=169.36, stdev=3294.51
lat (usec): min=18, max=3333.8K, avg=178.05, stdev=3855.91
clat percentiles (usec):
| 1.00th=[ 25], 5.00th=[ 39], 10.00th=[ 52], 20.00th=[ 78],
| 30.00th=[ 104], 40.00th=[ 131], 50.00th=[ 157], 60.00th=[ 183],
| 70.00th=[ 209], 80.00th=[ 235], 90.00th=[ 266], 95.00th=[ 318],
| 99.00th=[ 454], 99.50th=[ 548], 99.90th=[ 996], 99.95th=[ 2128],
| 99.99th=[ 5728]
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.45%, 50=8.81%
lat (usec) : 100=18.97%, 250=57.13%, 500=13.95%, 750=0.51%, 1000=0.07%
lat (msec) : 2=0.05%, 4=0.04%, 10=0.01%, 20=0.01%, 50=0.01%
lat (msec) : >=2000=0.01%
cpu : usr=6.67%, sys=76.67%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=1.2%, 4=11.7%, 8=27.7%, 16=55.9%, 32=3.5%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=96.6%, 8=0.1%, 16=0.1%, 32=3.4%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=2767263/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: io=10810MB, aggrb=368956KB/s, minb=368956KB/s, maxb=368956KB/s, mint=30001msec, maxt=30001msec
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
4kqd32_read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=windowsaio, iodepth=32
fio-2.15
Starting 1 thread
4kqd32_read: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 1 (f=0): [f(1)] [100.0% done] [252.9MB/0KB/0KB /s] [64.8K/0/0 iops] [eta 00m:00s]
4kqd32_read: (groupid=0, jobs=1): err= 0: pid=3688: Wed Dec 14 02:33:16 2016
Description : [4K QD32]
read : io=13490MB, bw=460428KB/s, iops=115106, runt= 30001msec
slat (usec): min=3, max=340, avg= 6.60, stdev= 2.17
clat (usec): min=0, max=8624, avg=152.31, stdev=117.12
lat (usec): min=17, max=8632, avg=158.91, stdev=117.67
clat percentiles (usec):
| 1.00th=[ 23], 5.00th=[ 33], 10.00th=[ 47], 20.00th=[ 74],
| 30.00th=[ 98], 40.00th=[ 121], 50.00th=[ 145], 60.00th=[ 167],
| 70.00th=[ 191], 80.00th=[ 213], 90.00th=[ 237], 95.00th=[ 266],
| 99.00th=[ 410], 99.50th=[ 636], 99.90th=[ 1672], 99.95th=[ 1848],
| 99.99th=[ 2008]
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.92%, 50=9.75%
lat (usec) : 100=20.13%, 250=62.98%, 500=5.52%, 750=0.30%, 1000=0.01%
lat (msec) : 2=0.38%, 4=0.01%, 10=0.01%
cpu : usr=6.67%, sys=86.67%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=1.4%, 4=10.7%, 8=27.0%, 16=57.0%, 32=4.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=96.6%, 8=0.1%, 16=0.1%, 32=3.4%, 64=0.0%, >=64=0.0%
issued : total=r=3453324/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: io=13490MB, aggrb=460427KB/s, minb=460427KB/s, maxb=460427KB/s, mint=30001msec, maxt=30001msec
What is random (16)? Please use fio and report back.
E:\bin\fio\2.15>fio C:\fio.test
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
4kqd32_read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=windowsaio, iodepth=32
fio-2.15
Starting 1 thread
4kqd32_read: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 1 (f=1): [r(1)] [100.0% done] [891.8MB/0KB/0KB /s] [228K/0/0 iops] [eta 00m:00s]
4kqd32_read: (groupid=0, jobs=1): err= 0: pid=10280: Wed Dec 14 17:56:21 2016
Description : [4K QD32]
read : io=26528MB, bw=905463KB/s, iops=226365, runt= 30001msec
slat (usec): min=2, max=143, avg= 3.63, stdev= 2.43
clat (usec): min=12, max=7200, avg=123.96, stdev=49.74
lat (usec): min=62, max=7203, avg=127.58, stdev=49.70
clat percentiles (usec):
| 1.00th=[ 73], 5.00th=[ 81], 10.00th=[ 87], 20.00th=[ 94],
| 30.00th=[ 100], 40.00th=[ 106], 50.00th=[ 113], 60.00th=[ 122],
| 70.00th=[ 133], 80.00th=[ 149], 90.00th=[ 175], 95.00th=[ 199],
| 99.00th=[ 262], 99.50th=[ 290], 99.90th=[ 370], 99.95th=[ 410],
| 99.99th=[ 1864]
lat (usec) : 20=0.01%, 50=0.01%, 100=29.11%, 250=69.56%, 500=1.30%
lat (usec) : 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%
cpu : usr=13.33%, sys=70.00%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.8%, 16=73.2%, 32=26.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=94.5%, 8=3.7%, 16=1.5%, 32=0.3%, 64=0.0%, >=64=0.0%
issued : total=r=6791197/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: io=26528MB, aggrb=905462KB/s, minb=905462KB/s, maxb=905462KB/s, mint=30001msec, maxt=30001msec
E:\bin\fio\2.15>fio C:\fio.test
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
4kqd32_read: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=windowsaio, iodepth=32
fio-2.15
Starting 1 thread
Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/308.7MB/0KB /s] [0/79.2K/0 iops] [eta 00m:00s]
4kqd32_read: (groupid=0, jobs=1): err= 0: pid=6828: Wed Dec 14 17:57:51 2016
Description : [4K QD32]
write: io=9929.3MB, bw=338906KB/s, iops=84726, runt= 30001msec
slat (usec): min=2, max=198, avg= 4.66, stdev= 2.20
clat (usec): min=19, max=41468, avg=366.38, stdev=499.59
lat (usec): min=23, max=41472, avg=371.03, stdev=499.61
clat percentiles (usec):
| 1.00th=[ 102], 5.00th=[ 126], 10.00th=[ 139], 20.00th=[ 165],
| 30.00th=[ 199], 40.00th=[ 235], 50.00th=[ 274], 60.00th=[ 318],
| 70.00th=[ 378], 80.00th=[ 466], 90.00th=[ 652], 95.00th=[ 820],
| 99.00th=[ 1640], 99.50th=[ 2704], 99.90th=[ 7008], 99.95th=[ 8512],
| 99.99th=[12736]
lat (usec) : 20=0.01%, 50=0.05%, 100=0.79%, 250=42.99%, 500=38.58%
lat (usec) : 750=10.89%, 1000=4.17%
lat (msec) : 2=1.76%, 4=0.48%, 10=0.25%, 20=0.03%, 50=0.01%
cpu : usr=6.67%, sys=33.33%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=48.1%, 32=51.7%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=98.0%, 8=1.6%, 16=0.4%, 32=0.1%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=2541881/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: io=9929.3MB, aggrb=338906KB/s, minb=338906KB/s, maxb=338906KB/s, mint=30001msec, maxt=30001msec
My test was on a 950 Pro 512 GB, so the difference may come from the drive rather than from Windows being slow, but we'll see.
Here is my test on Windows 10 (64-bit, version 1607), run as Administrator and using your example from page 1.
So there are "only" 228k IOPS on Windows, whereas Proxmox VE had 331k IOPS.
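A quick way to sanity-check these figures is the arithmetic below. It is only a sketch: the Windows values are taken from the averaged summary line of the run (226365 IOPS, slightly below the 228k shown in the last progress line), and the Proxmox VE value is the rounded "331k" quoted above, since the exact number is not shown here. The variable names are made up for illustration.

```python
BLOCK_KB = 4  # bs=4K, as reported in the fio job line

windows_iops = 226365      # iops= from the Windows 10 summary line
windows_bw_kb_s = 905463   # bw= from the same summary line
proxmox_iops = 331_000     # assumption: rounded "331k" figure quoted in the post

# Bandwidth should be roughly IOPS multiplied by the block size.
expected_bw_kb_s = windows_iops * BLOCK_KB

# Fraction of the Proxmox VE result that the Windows run reached.
ratio = windows_iops / proxmox_iops

print(f"expected bw: {expected_bw_kb_s} KB/s, reported: {windows_bw_kb_s} KB/s")
print(f"Windows / Proxmox VE: {ratio:.2f}")
```

The reported bandwidth agrees with IOPS times block size to within a few KB/s, so the two numbers in the summary line are consistent with each other, and the Windows run lands at a bit over two thirds of the Proxmox VE figure.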
Are you sure your NVMe slot is actually a full four-lane slot and not a capped mSATA slot?
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
4kqd32_read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=windowsaio, iodepth=32
fio-2.15
Starting 1 thread
4kqd32_read: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 1 (f=1): [r(1)] [100.0% done] [20852KB/0KB/0KB /s] [5213/0/0 iops] [eta 00m:00s]
4kqd32_read: (groupid=0, jobs=1): err= 0: pid=3852: Thu Dec 15 03:29:26 2016
Description : [4K QD32]
read : io=606960KB, bw=20228KB/s, iops=5056, runt= 30006msec
slat (usec): min=7, max=187, avg=12.54, stdev= 3.61
clat (usec): min=264, max=20773, avg=6294.17, stdev=2666.21
lat (usec): min=277, max=20787, avg=6306.71, stdev=2666.18
clat percentiles (usec):
| 1.00th=[ 1048], 5.00th=[ 2040], 10.00th=[ 2768], 20.00th=[ 3792],
| 30.00th=[ 4640], 40.00th=[ 5472], 50.00th=[ 6240], 60.00th=[ 7072],
| 70.00th=[ 7904], 80.00th=[ 8768], 90.00th=[ 9792], 95.00th=[10560],
| 99.00th=[12096], 99.50th=[12736], 99.90th=[14144], 99.95th=[14912],
| 99.99th=[17024]
lat (usec) : 500=0.12%, 750=0.30%, 1000=0.47%
lat (msec) : 2=3.86%, 4=17.55%, 10=69.30%, 20=8.39%, 50=0.01%
cpu : usr=0.00%, sys=6.67%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued : total=r=151740/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: io=606960KB, aggrb=20227KB/s, minb=20227KB/s, maxb=20227KB/s, mint=30006msec, maxt=30006msec
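One way to read the run above is through Little's law: with a fixed number of requests in flight, throughput is roughly queue depth divided by average latency. The sketch below applies that to the numbers in this output (depth 32, average total latency 6306.71 usec). The function name is hypothetical, and the check only shows that the reported figures are self-consistent; it does not say why the latency is that high.

```python
def expected_iops(queue_depth, avg_latency_usec):
    """Little's law estimate: requests in flight divided by time per request."""
    return queue_depth / (avg_latency_usec / 1_000_000)

# Values copied from the run above: iodepth=32, lat avg=6306.71 usec.
estimate = expected_iops(32, 6306.71)
reported = 5056  # iops= from the summary line

print(f"estimated: {estimate:.0f} IOPS, reported: {reported} IOPS")
```

The estimate comes out within about one percent of the reported value, which fits the "IO depths" line showing the queue at 32 for essentially the whole run: the result is bounded by per-request latency of several milliseconds rather than by the queue running empty.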
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
4kqd32_read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=windowsaio, iodepth=32
fio-2.15
Starting 1 thread
4kqd32_read: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 1 (f=1): [r(1)] [9.7% done] [836.2MB/0KB/0KB /s] [214K/0/0 iops] [eta 00m:
Jobs: 1 (f=1): [r(1)] [12.9% done] [852.3MB/0KB/0KB /s] [218K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [16.1% done] [858.6MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [19.4% done] [858.9MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [22.6% done] [855.9MB/0KB/0KB /s] [219K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [25.8% done] [862.6MB/0KB/0KB /s] [221K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [29.0% done] [842.7MB/0KB/0KB /s] [216K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [32.3% done] [862.5MB/0KB/0KB /s] [221K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [35.5% done] [860.2MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [38.7% done] [863.6MB/0KB/0KB /s] [221K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [41.9% done] [860.2MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [46.7% done] [857.9MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [48.4% done] [862.6MB/0KB/0KB /s] [221K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [51.6% done] [858.8MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [54.8% done] [857.8MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [58.1% done] [860.4MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [61.3% done] [858.7MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [64.5% done] [855.1MB/0KB/0KB /s] [219K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [67.7% done] [857.4MB/0KB/0KB /s] [219K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [71.0% done] [852.4MB/0KB/0KB /s] [218K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [74.2% done] [854.1MB/0KB/0KB /s] [219K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [77.4% done] [858.9MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [80.6% done] [859.8MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [83.9% done] [858.7MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [90.0% done] [858.1MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [90.3% done] [859.2MB/0KB/0KB /s] [220K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [93.5% done] [856.7MB/0KB/0KB /s] [219K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [96.8% done] [862.6MB/0KB/0KB /s] [221K/0/0 iops] [eta 00m
Jobs: 1 (f=1): [r(1)] [100.0% done] [861.9MB/0KB/0KB /s] [221K/0/0 iops] [eta 00m:00s]
4kqd32_read: (groupid=0, jobs=1): err= 0: pid=4020: Thu Dec 15 01:54:47 2016
Description : [4K QD32]
read : io=24955MB, bw=851762KB/s, iops=212940, runt= 30001msec
slat (usec): min=1, max=68, avg= 3.97, stdev= 2.56
clat (usec): min=32, max=16147, avg=120.92, stdev=117.22
lat (usec): min=58, max=16152, avg=124.89, stdev=117.16
clat percentiles (usec):
| 1.00th=[ 65], 5.00th=[ 74], 10.00th=[ 80], 20.00th=[ 89],
| 30.00th=[ 96], 40.00th=[ 103], 50.00th=[ 110], 60.00th=[ 117],
| 70.00th=[ 125], 80.00th=[ 137], 90.00th=[ 157], 95.00th=[ 181],
| 99.00th=[ 258], 99.50th=[ 310], 99.90th=[ 2320], 99.95th=[ 2320],
| 99.99th=[ 2416]
lat (usec) : 50=0.01%, 100=34.78%, 250=64.11%, 500=0.85%, 750=0.02%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.03%, 4=0.21%, 10=0.01%, 20=0.01%
cpu : usr=10.00%, sys=53.33%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=90.8%, 32=9.1%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=92.1%, 8=1.2%, 16=6.5%, 32=0.2%, 64=0.0%, >=64=0.0%
issued : total=r=6388427/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: io=24955MB, aggrb=851761KB/s, minb=851761KB/s, maxb=851761KB/s, mint=30001msec, maxt=30001msec
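To compare the runs posted in this thread without eyeballing each block, the per-job summary lines can be pulled out with a small script. This is a minimal sketch that assumes the fio 2.15 line layout seen above with bandwidth reported in KB/s; other fio versions or units would need a different pattern. `SUMMARY_RE` and `parse_summary` are names invented for this example.

```python
import re

# Matches fio 2.x summary lines as they appear in this thread, e.g.
#   read : io=26528MB, bw=905463KB/s, iops=226365, runt= 30001msec
SUMMARY_RE = re.compile(
    r"^\s*(read|write)\s*:\s*io=\S+,\s*bw=([\d.]+)KB/s,"
    r"\s*iops=(\d+),\s*runt=\s*(\d+)msec"
)

def parse_summary(line):
    """Return (direction, bw_kb_s, iops, runtime_ms), or None if no match."""
    m = SUMMARY_RE.match(line)
    if m is None:
        return None
    return m.group(1), float(m.group(2)), int(m.group(3)), int(m.group(4))

# The four random-read summary lines posted in this thread.
runs = [
    "read : io=13490MB, bw=460428KB/s, iops=115106, runt= 30001msec",
    "read : io=26528MB, bw=905463KB/s, iops=226365, runt= 30001msec",
    "read : io=606960KB, bw=20228KB/s, iops=5056, runt= 30006msec",
    "read : io=24955MB, bw=851762KB/s, iops=212940, runt= 30001msec",
]

for line in runs:
    direction, bw_kb_s, iops, runtime_ms = parse_summary(line)
    print(f"{direction:5s} {iops:>7d} IOPS  {bw_kb_s / 1024:8.1f} MB/s")
```

Lines that are not summary lines (the banner, percentiles, and so on) return `None`, so the same function can be run over a whole pasted log.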