Lower-than-expected IOPS in Windows VM on Ceph

Ibrahim Yashau

Feb 28, 2019
Hi,

I have a fairly good setup here:

4 Nodes each with:
2x Dual EPYC CPU
256GB memory
6x 1.6TB P4610
2x25G CX4 LACP (no RDMA)

Code:
Object prefix: benchmark_data_pve1_154426
 sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
   0       0         0         0         0         0           -           0
   1     255       565       310   1239.94      1240    0.455183    0.449924
   2     255      1121       866   1731.81      2224    0.463446    0.456692
   3     255      1531      1276    1701.1      1640    0.711298    0.477325
   4     255      1908      1653   1652.77      1508    0.521955    0.550107
   5     255      2470      2215   1771.67      2248    0.461535    0.526798
   6     255      3053      2798   1865.01      2332    0.450543    0.508571
   7     255      3617      3362   1920.82      2256    0.443641    0.499579
   8     255      4173      3918   1958.68      2224    0.447039    0.493914
   9     255      4741      4486   1993.45      2272    0.447463    0.488368
  10     255      5308      5053   2020.88      2268    0.448922    0.484186
  11     255      5889      5634   2048.41      2324    0.437174    0.480094
  12     255      6470      6215   2071.35      2324      0.4385    0.476092
  13     255      7025      6770   2082.76      2220    0.454969     0.47459
  14     255      7583      7328    2093.4      2232    0.446134    0.473655
  15     255      8134      7879   2100.75      2204    0.484177    0.472777
  16     255      8672      8417   2103.94      2152    0.469763    0.472979
  17     255      9246      8991   2115.22      2296    0.442974    0.471614
  18     255      9778      9523   2115.91      2128    0.490992    0.471736
  19     255     10349     10094   2124.74      2284    0.442785    0.470837

............
Total time run:         60.203895
Total writes made:      33557
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     2229.56
Stddev Bandwidth:       184.158
Max bandwidth (MB/sec): 2332
Min bandwidth (MB/sec): 1240
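As a sanity check on the summary above, the reported bandwidth can be reproduced from the totals. This is only a minimal sketch: the function names are made up for illustration, and it assumes rados bench counts 1 MB as 2^20 bytes, which is the interpretation under which the figures line up.

```python
# Figures copied from the rados bench summary above.
TOTAL_WRITES = 33557          # "Total writes made"
WRITE_SIZE_BYTES = 4194304    # "Write size" (4 MiB objects)
TOTAL_TIME_S = 60.203895      # "Total time run"


def ops_per_second(total_ops: int, seconds: float) -> float:
    """Average number of completed operations per second."""
    return total_ops / seconds


def bandwidth_mib_per_s(total_ops: int, op_bytes: int, seconds: float) -> float:
    """Average bandwidth, assuming 1 MB means 2**20 bytes (an assumption)."""
    return total_ops * op_bytes / (1024 * 1024) / seconds


if __name__ == "__main__":
    ops = ops_per_second(TOTAL_WRITES, TOTAL_TIME_S)
    bw = bandwidth_mib_per_s(TOTAL_WRITES, WRITE_SIZE_BYTES, TOTAL_TIME_S)
    print(f"{ops:.1f} writes/s, {bw:.2f} MB/s")
```

So the roughly 2.2 GB/s figure corresponds to only a few hundred large writes per second with many operations in flight. That is a very different workload from a 4K benchmark at queue depth 1.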

Performance in rados bench is fine (it could be a bit better), but it is not going well inside Windows guests. I'm getting only about 7 MB/s read (under 2,000 IOPS) in a 4K Q1T1 benchmark. I've tried all of the caching modes and drivers, and none of them makes a difference. I assume I have to adjust read-ahead in the guest OS, but I don't believe that's possible in Windows. All drivers are up to date as of this post.
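To put the 4K Q1T1 number in perspective, throughput can be converted to IOPS and to an implied per-request latency. This is a rough sketch with hypothetical function names. It assumes 1 MB is 10^6 bytes and a 4K block is 4096 bytes, and that with one thread at queue depth 1 only a single request is in flight at any time.

```python
def iops_from_throughput(mb_per_s: float, block_bytes: int = 4096) -> float:
    """Convert throughput to IOPS, assuming 1 MB = 10**6 bytes (an assumption)."""
    return mb_per_s * 1_000_000 / block_bytes


def qd1_latency_ms(iops: float) -> float:
    """Average time per request in milliseconds when exactly one
    request is outstanding at a time (queue depth 1, one thread)."""
    return 1000.0 / iops


if __name__ == "__main__":
    iops = iops_from_throughput(7.0)   # the ~7 MB/s read from the post
    print(f"{iops:.0f} IOPS, {qd1_latency_ms(iops):.2f} ms per request")
```

Under those assumptions, 7 MB/s works out to roughly 1,700 IOPS, or just under 0.6 ms per request. At queue depth 1 the result is bounded by the round-trip latency of each individual I/O, not by the aggregate bandwidth of the cluster.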

This system is not in production right now, so I have the freedom to experiment. I found a forum thread on the same issue, but it's not conclusive.

I'm open to any ideas.

Thanks.
 
Dual-CPU EPYC has known I/O limitations. Try it again with one CPU removed.
Do you have any literature on this? I'd love to read up.
Regarding the removal of one CPU, that's not possible, unfortunately. The cluster is mostly going to use the CPUs to offload encoding tasks. If I have to live with slightly lower IOPS, I'll just take that compromise.