Windows Server 2022 poor random read and write performance

Kukwiak

New Member
Dec 12, 2023
Hello,
I am looking for a way to improve the performance of random reads and writes on a virtual machine running Windows Server 2022. VM configuration:
agent: 1
boot: order=virtio0;ide2;net0;ide0
cores: 6
cpu: qemu64
machine: pc-i440fx-9.0
memory: 16384
meta: creation-qemu=9.0.0,ctime=1724249118
name: win2022-virtio
numa: 1
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=23024246-a995-44cc-a67d-d2af956fec5f
sockets: 2
virtio0: vms:vm-101-disk-0,discard=on,iothread=1,size=50G
vmgenid: ab52be2a-c650-4af6-b1e4-e035fb838120
The machine is located on a single node on which Ceph is installed; its configuration is the default:
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 172.16.18.1/24
fsid = 17f1b434-e301-4db9-b5dc-7645f8cbbb3
mon_allow_pool_delete = true
mon_host = 172.16.17.1
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 2
public_network = 172.16.17.1/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[client.crash]
keyring = /etc/pve/ceph/$cluster.$name.keyring

[mon.cf01]
public_addr = 172.16.17.1


Switching the disk type between VirtIO and SCSI does not make any meaningful difference.
Is there any option in the configuration to improve the performance of random reads and writes?
(Attached screenshot: Screenshot 2024-08-23 at 17.22.33.png, showing the benchmark results)
 
Set cache to writeback. What is the backing storage for the virtual disk?

I must say, 1600MB/sec read and ~600MB/sec write is pretty good for a VM. That's definitely faster than you would get from spinning disk.

For speeding up reads, mirroring is a typical solution. For faster writes, use a good NVMe with a high TBW rating, possibly in a RAID10 / ZFS mirror pool.
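Setting writeback on the existing disk from the CLI is a one-liner with qm set; a minimal sketch, assuming VM ID 101 and the volume name from the config in the first post:

# switch the existing VirtIO disk to writeback caching (keeps the other options)
qm set 101 --virtio0 vms:vm-101-disk-0,cache=writeback,discard=on,iothread=1,size=50G

For an RBD-backed disk, cache=writeback also enables the librbd client-side cache, which usually helps small writes.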
 
cpu: qemu64
Change the CPU type to host.

sockets: 2
Two sockets with 6 cores each is a lot for a simple Windows VM. Usually 4 cores on one socket are enough; too many vCPUs create a lot of scheduling overhead.

Changing the CPU type will probably improve the result by 5-10%, but the most important question is: what kind of storage does your host use?
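If you prefer the CLI over the GUI for those two changes, a hedged sketch (VM ID 101 assumed from the config above):

# use the host CPU model instead of the generic qemu64
qm set 101 --cpu host
# one socket with fewer vCPUs; adjust the count to the real workload
qm set 101 --sockets 1 --cores 4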
 
In my testing, enabling hugepages increased I/O by around 10-15%.
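For reference, a sketch of how that can be set up on Proxmox; the 1 GiB page size and the count of 16 pages (enough for the 16 GiB VM) are assumptions:

# reserve 1 GiB hugepages on the host at boot via the kernel command line:
#   default_hugepagesz=1G hugepagesz=1G hugepages=16
# then back the VM's memory with 1 GiB hugepages (PVE expects NUMA to be enabled, which this VM already has)
qm set 101 --hugepages 1024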

Disabling CPU mitigations can also help on older hardware, provided it is not a production server where those mitigations matter for security.
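A sketch of what that looks like on a GRUB-booted host (on a ZFS-root/systemd-boot install, edit /etc/kernel/cmdline and run proxmox-boot-tool refresh instead):

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet mitigations=off"
# apply and reboot
update-grub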

I would suggest placing the pagefile on a different disk if that is an option. Personally, I use a second NVMe drive and add virtual disks from it to VMs for swap/pagefile; this helps performance in general quite significantly. And if the NVMe fails, it is not exactly hard to set a new pagefile (Windows will do it automatically if the drive is missing), so it poses essentially no risk for a decent reward.
(Best if you can use an Optane drive or a PCIe enterprise SSD, but anything works.)
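In Proxmox terms that is just an extra small virtual disk from a fast storage; the storage name local-nvme and the 32 GiB size below are placeholders:

# add a second virtual disk for the Windows pagefile from an assumed NVMe-backed storage
qm set 101 --virtio1 local-nvme:32
# then move the pagefile to the new disk inside Windows (System Properties > Advanced > Virtual memory)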

I am still having some latency issues in the Windows VM on Proxmox with things like compiling software and other workloads that make many small requests in succession. But these steps, plus some general tweaks to Windows such as disabling Defender, core isolation, etc. and minimising parts of the system that consume I/O, help a lot.
 
Change the CPU type to host.

Two sockets with 6 cores each is a lot for a simple Windows VM. Usually 4 cores on one socket are enough; too many vCPUs create a lot of scheduling overhead.

Changing the CPU type will probably improve the result by 5-10%, but the most important question is: what kind of storage does your host use?
What do you mean by storage type? The VM uses CephFS and its controller is VirtIO.
 
Set cache to writeback. What is the backing storage for the virtual disk?

I must say, 1600MB/sec read and ~600MB/sec write is pretty good for a VM. That's definitely faster than you would get from spinning disk.

For speeding up reads, mirroring is a typical solution. For faster writes, use a good NVMe with a high TBW rating, possibly in a RAID10 / ZFS mirror pool.
As for sequential writes, I have no complaints about their performance. My problem is with random writes and reads. Is this how it should work, or is there a way to improve these values?
Could isolating block.db and block.wal help in this situation?
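For reference, separating the DB/WAL is done when (re)creating an OSD, for example with pveceph; a hedged sketch, where the device names and the DB size are placeholders:

# recreate an OSD with its RocksDB (and, unless split off, the WAL) on a separate fast device
pveceph osd create /dev/sdb --db_dev /dev/nvme0n1 --db_size 60
# optionally put the WAL on its own device as well
pveceph osd create /dev/sdb --db_dev /dev/nvme0n1 --wal_dev /dev/nvme1n1

Splitting DB/WAL mainly pays off when the data devices are noticeably slower than the DB/WAL device; on all-flash OSDs the gain is usually small.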
 
What do you mean by storage type? The VM uses CephFS and its controller is VirtIO.
Are you sure you are using CephFS and not RBD?
If you have VM disks on a CephFS, you can expect poor random performance.
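A quick way to check is to look at how the storage is defined; a sketch, assuming VM 101 and the storage name vms from the earlier config:

# which storage does the disk live on?
qm config 101 | grep virtio0
# how is that storage defined? (an RBD storage starts with "rbd: vms", a CephFS one with "cephfs: <name>")
grep -A5 'vms' /etc/pve/storage.cfg
# or list all storages with their types
pvesm status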
 
Are you sure you are using CephFS and not RBD?
If you have VM disks on a CephFS, you can expect poor random performance.
Sorry, my mistake. The storage is RBD, and these are the results for a VM with its disk on RBD storage.
 
Sorry, my mistake. The storage is RBD, and these are the results for a VM with its disk on RBD storage.
Random I/O with Ceph is extremely dependent on the network and, of course, on the performance of the individual SSDs.
I always recommend 25 GBit and up (not 40 GBit, which is only 4x 10 GBit) because the latency is less than half that of 10/40 GBit.
What does your hardware look like, and do you use LACP for the Ceph network?
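One way to separate Ceph's share of the problem from Windows' is to benchmark the pool directly on the node. A hedged sketch, assuming the pool is called vms:

# 4 KiB writes against the pool for 60 seconds with 16 parallel ops, keeping the objects for the read test
rados bench -p vms 60 write -b 4096 -t 16 --no-cleanup
# random reads against the objects written above
rados bench -p vms 60 rand -t 16
# remove the benchmark objects afterwards
rados -p vms cleanup

If the pool itself only manages a few thousand 4 KiB IOPS, no guest-side tuning will get past that ceiling.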
 
Random I/O with Ceph is extremely dependent on the network and, of course, on the performance of the individual SSDs.
I always recommend 25 GBit and up (not 40 GBit, which is only 4x 10 GBit) because the latency is less than half that of 10/40 GBit.
What does your hardware look like, and do you use LACP for the Ceph network?
I don't think I mentioned it before: Ceph consists of 4 U.2 OSD drives. Ceph sits behind a single switch because of the lab's limitations, but it is bonded over 2x 10 Gbps links with LACP.
 
I don't think I mentioned it before: Ceph consists of 4 U.2 OSD drives. Ceph sits behind a single switch because of the lab's limitations, but it is bonded over 2x 10 Gbps links with LACP.
Is the LACP bond layer 3+4? Anything else will not have the desired effect.
With 10 GBit and a standard pool with 3 replicas, you only have 3.3 GBit write performance in the worst case. With small random writes, this can be significantly worse due to the transfer overhead.
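To spell the arithmetic out: a 10 Gbit/s link carries roughly 1.25 GB/s, and with 3 replicas every byte a client writes crosses the network about three times (once to the primary OSD and once to each of the two replicas). In the worst case, when public and cluster traffic share the same links, that leaves about 10/3 ≈ 3.3 Gbit/s (roughly 400 MB/s) of client-visible write bandwidth, before any protocol overhead.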
 
Is the LACP bond layer 3+4? Anything else will not have the desired effect.
With 10 GBit and a standard pool with 3 replicas, you only have 3.3 GBit write performance in the worst case. With small random writes, this can be significantly worse due to the transfer overhead.
I am using Open vSwitch to create the bond, and the type I have chosen is balance-tcp. As far as I understand, it uses layer 2+4, while balance-slb uses layer 3+4. Should I use balance-slb instead?
 
I am using Open vSwitch to create the bond, and the type I have chosen is balance-tcp. As far as I understand, it uses layer 2+4, while balance-slb uses layer 3+4. Should I use balance-slb instead?
SLB bonding is only layer 2 balancing, without LACP.
balance-tcp is also a balancing algorithm that can be used without the cooperation of the connected switch.
I have not worked with OVS yet, but since OVS also supports LACP, please switch to LACP. If you can set the hash policy, set it to layer 3+4.
If you cannot set it, the bond uses the peer's setting; in that case, enable layer 3+4 on the switch.
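For reference, in OVS balance-tcp is the bond mode intended to be combined with LACP, and it hashes on L2-L4 headers including TCP/UDP ports, which is effectively the layer 3+4 behaviour being asked for. A sketch of the corresponding /etc/network/interfaces stanza on Proxmox; the NIC names eno1/eno2 and the bridge name vmbr1 are assumptions:

auto bond0
iface bond0 inet manual
    ovs_type OVSBond
    ovs_bridge vmbr1
    ovs_bonds eno1 eno2
    ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast

auto vmbr1
iface vmbr1 inet manual
    ovs_type OVSBridge
    ovs_ports bond0

After reloading the network, ovs-appctl bond/show bond0 shows whether LACP actually negotiated and how flows are hashed; the switch side needs an LACP (802.3ad) port-channel on the two ports as well.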
 
