Proxmox vs. Hyper-V storage performance

fdcastel

I’m evaluating Proxmox for potential use in a professional environment to host Windows VMs. The current production setup runs on Microsoft Hyper-V Server.

I ran CrystalDiskMark in a Windows Server 2022 guest with three different virtual disk configurations. Results follow:


1) Using
--scsi0 "$VM_STORAGE:$VM_DISKSIZE,discard=on,iothread=1,ssd=1":

Code:
C:\> fsutil fsinfo sectorInfo C:
LogicalBytesPerSector :                                 512
PhysicalBytesPerSectorForAtomicity :                    512
PhysicalBytesPerSectorForPerformance :                  512
FileSystemEffectivePhysicalBytesPerSectorForAtomicity : 512
Device Alignment :                                      Aligned (0x000)
Partition alignment on device :                         Aligned (0x000)
No Seek Penalty
Trim Supported
Not DAX capable
Is Thinly-Provisioned, SlabSize :                       4,096 bytes (4.0 KB)

Code:
------------------------------------------------------------------------------
CrystalDiskMark 9.0.1 x64 (C) 2007-2025 hiyohiyo
                                  Crystal Dew World: https://crystalmark.info/
------------------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

[Read]
  SEQ    1MiB (Q=  8, T= 1): 27767.491 MB/s [  26481.1 IOPS] <   261.24 us>
  SEQ    1MiB (Q=  1, T= 1):  8474.260 MB/s [   8081.7 IOPS] <   123.38 us>
  RND    4KiB (Q= 32, T= 1):   430.900 MB/s [ 105200.2 IOPS] <    32.57 us>
  RND    4KiB (Q=  1, T= 1):   156.498 MB/s [  38207.5 IOPS] <    25.86 us>

[Write]
  SEQ    1MiB (Q=  8, T= 1):  4769.246 MB/s [   4548.3 IOPS] <  1744.31 us>
  SEQ    1MiB (Q=  1, T= 1):  3817.791 MB/s [   3640.9 IOPS] <   272.73 us>
  RND    4KiB (Q= 32, T= 1):   356.408 MB/s [  87013.7 IOPS] <    48.21 us>
  RND    4KiB (Q=  1, T= 1):   136.599 MB/s [  33349.4 IOPS] <    29.65 us>

Profile: Default
   Test: 1 GiB (x3) [C: 8% (9/119GiB)]
   Mode: [Admin]
   Time: Measure 5 sec / Interval 5 sec
   Date: 2025/12/04 21:29:04
     OS: Windows Server 2022 Server Standard 21H2 [10.0 Build 20348] (x64)




2) Using
--scsi0 "$VM_STORAGE:$VM_DISKSIZE,discard=on,iothread=1,ssd=1" \
--args "-global scsi-hd.physical_block_size=4096 -global scsi-hd.logical_block_size=4096"

Code:
C:\> fsutil fsinfo sectorInfo C:
LogicalBytesPerSector :                                 4096
PhysicalBytesPerSectorForAtomicity :                    4096
PhysicalBytesPerSectorForPerformance :                  4096
FileSystemEffectivePhysicalBytesPerSectorForAtomicity : 4096
Device Alignment :                                      Aligned (0x000)
Partition alignment on device :                         Aligned (0x000)
No Seek Penalty
Trim Supported
Not DAX capable
Is Thinly-Provisioned, SlabSize :                       4,096 bytes (4.0 KB)

Code:
------------------------------------------------------------------------------
CrystalDiskMark 9.0.1 x64 (C) 2007-2025 hiyohiyo
                                  Crystal Dew World: https://crystalmark.info/
------------------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

[Read]
  SEQ    1MiB (Q=  8, T= 1): 26848.890 MB/s [  25605.1 IOPS] <   273.71 us>
  SEQ    1MiB (Q=  1, T= 1):  8394.444 MB/s [   8005.6 IOPS] <   124.57 us>
  RND    4KiB (Q= 32, T= 1):   439.610 MB/s [ 107326.7 IOPS] <    32.42 us>
  RND    4KiB (Q=  1, T= 1):   156.111 MB/s [  38113.0 IOPS] <    25.93 us>

[Write]
  SEQ    1MiB (Q=  8, T= 1):  3486.522 MB/s [   3325.0 IOPS] <  1777.70 us>
  SEQ    1MiB (Q=  1, T= 1):  1679.348 MB/s [   1601.6 IOPS] <   623.50 us>
  RND    4KiB (Q= 32, T= 1):   370.713 MB/s [  90506.1 IOPS] <    50.52 us>
  RND    4KiB (Q=  1, T= 1):   131.360 MB/s [  32070.3 IOPS] <    30.85 us>

Profile: Default
   Test: 1 GiB (x3) [C: 8% (9/119GiB)]
   Mode: [Admin]
   Time: Measure 5 sec / Interval 5 sec
   Date: 2025/12/04 21:42:41
     OS: Windows Server 2022 Server Standard 21H2 [10.0 Build 20348] (x64)




3) Using
--scsi0 "$VM_STORAGE:$VM_DISKSIZE,discard=on,iothread=1,ssd=1" \
--args "-global scsi-hd.physical_block_size=4096 -global scsi-hd.logical_block_size=512"

Code:
C:\> fsutil fsinfo sectorInfo C:
LogicalBytesPerSector :                                 512
PhysicalBytesPerSectorForAtomicity :                    4096
PhysicalBytesPerSectorForPerformance :                  4096
FileSystemEffectivePhysicalBytesPerSectorForAtomicity : 4096
Device Alignment :                                      Aligned (0x000)
Partition alignment on device :                         Aligned (0x000)
No Seek Penalty
Trim Supported
Not DAX capable
Is Thinly-Provisioned, SlabSize :                       4,096 bytes (4.0 KB)

Code:
------------------------------------------------------------------------------
CrystalDiskMark 9.0.1 x64 (C) 2007-2025 hiyohiyo
                                  Crystal Dew World: https://crystalmark.info/
------------------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

[Read]
  SEQ    1MiB (Q=  8, T= 1): 26964.330 MB/s [  25715.2 IOPS] <   272.93 us>
  SEQ    1MiB (Q=  1, T= 1):  8267.413 MB/s [   7884.4 IOPS] <   126.46 us>
  RND    4KiB (Q= 32, T= 1):   432.969 MB/s [ 105705.3 IOPS] <    32.67 us>
  RND    4KiB (Q=  1, T= 1):   152.006 MB/s [  37110.8 IOPS] <    26.63 us>

[Write]
  SEQ    1MiB (Q=  8, T= 1):  4361.074 MB/s [   4159.0 IOPS] <  1144.96 us>
  SEQ    1MiB (Q=  1, T= 1):  3845.416 MB/s [   3667.3 IOPS] <   271.88 us>
  RND    4KiB (Q= 32, T= 1):   366.912 MB/s [  89578.1 IOPS] <    47.63 us>
  RND    4KiB (Q=  1, T= 1):   130.553 MB/s [  31873.3 IOPS] <    31.05 us>

Profile: Default
   Test: 1 GiB (x3) [C: 8% (9/119GiB)]
   Mode: [Admin]
   Time: Measure 5 sec / Interval 5 sec
   Date: 2025/12/04 21:56:48
     OS: Windows Server 2022 Server Standard 21H2 [10.0 Build 20348] (x64)




Test system:
- Proxmox 9.1.1 running on AMD EPYC 4585PX / 256 GB RAM
- Storage: 4x1.92TB Samsung PM9A3
Code:
# lsblk -o NAME,FSTYPE,LABEL,MOUNTPOINT,SIZE,MODEL,ALIGNMENT,STATE,OPT-IO,PHY-SEC,LOG-SEC,MIN-IO,OPT-IO
NAME        FSTYPE            LABEL          MOUNTPOINT   SIZE MODEL                      ALIGNMENT STATE   OPT-IO PHY-SEC LOG-SEC MIN-IO   OPT-IO
nvme3n1                                                   1.7T SAMSUNG MZQL21T9HCJR-00A07         0 live    131072    4096     512 131072   131072
├─nvme3n1p1 linux_raid_member                             511M                                    0         131072    4096     512 131072   131072
│ └─md1     vfat              EFI_SYSPART    /boot/efi  510.9M                                    0         131072    4096     512 131072   131072
├─nvme3n1p2 linux_raid_member md2                           1G                                    0         131072    4096     512 131072   131072
│ └─md2     ext4              boot           /boot       1022M                                    0         131072    4096     512 131072   131072
├─nvme3n1p3 linux_raid_member md3                          20G                                    0         131072    4096     512 131072   131072
│ └─md3     ext4              root           /             20G                                    0         131072    4096     512 131072   131072
├─nvme3n1p4 swap              swap-nvme1n1p4 [SWAP]         1G                                    0         131072    4096     512 131072   131072
└─nvme3n1p5 zfs_member        data                        1.7T                                    0         131072    4096     512 131072   131072
nvme1n1                                                   1.7T SAMSUNG MZQL21T9HCJR-00A07         0 live    131072    4096     512 131072   131072
├─nvme1n1p1 zfs_member        spool                       1.7T                                    0         131072    4096     512 131072   131072
└─nvme1n1p9                                                 8M                                    0         131072    4096     512 131072   131072
nvme2n1                                                   1.7T SAMSUNG MZQL21T9HCJR-00A07         0 live    131072    4096     512 131072   131072
├─nvme2n1p1 zfs_member        spool                       1.7T                                    0         131072    4096     512 131072   131072
└─nvme2n1p9                                                 8M                                    0         131072    4096     512 131072   131072
nvme0n1                                                   1.7T SAMSUNG MZQL21T9HCJR-00A07         0 live    131072    4096     512 131072   131072
├─nvme0n1p1 linux_raid_member                             511M                                    0         131072    4096     512 131072   131072
│ └─md1     vfat              EFI_SYSPART    /boot/efi  510.9M                                    0         131072    4096     512 131072   131072
├─nvme0n1p2 linux_raid_member md2                           1G                                    0         131072    4096     512 131072   131072
│ └─md2     ext4              boot           /boot       1022M                                    0         131072    4096     512 131072   131072
├─nvme0n1p3 linux_raid_member md3                          20G                                    0         131072    4096     512 131072   131072
│ └─md3     ext4              root           /             20G                                    0         131072    4096     512 131072   131072
├─nvme0n1p4 swap              swap-nvme0n1p4 [SWAP]         1G                                    0         131072    4096     512 131072   131072
├─nvme0n1p5 zfs_member        data                        1.7T                                    0         131072    4096     512 131072   131072
└─nvme0n1p6 iso9660           config-2                      2M                                40960         131072    4096     512 131072   131072

Code:
zpool list -o name,size,alloc,free,ckpoint,expandsz,frag,cap,dedup,health,altroot,ashift
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT  ASHIFT
data   1.72T  21.5G  1.70T        -         -     0%     1%  1.00x    ONLINE  -            12
spool  1.73T  20.5G  1.71T        -         -     1%     1%  1.00x    ONLINE  -            12

The data zpool is a 2-disk mirror used for the operating system, and the spool zpool is a 2-disk mirror dedicated to the VMs.

According to the SSD specifications, the drive physical page size is 16 KB. I plan to rerun the tests tomorrow using ashift=13 and ashift=14.
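Since ashift is fixed when a vdev is created, that re-test means recreating the pool. A sketch of what I plan to run, with whole-disk device names derived from the lsblk output above (this destroys everything on spool):

Code:
# WARNING: destroys all data on 'spool'. Device names derived from the lsblk output above.
zpool destroy spool
zpool create -o ashift=14 spool mirror /dev/nvme1n1 /dev/nvme2n1
zpool get ashift spool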

Any comments are welcome.
 
For reference, the existing Hyper-V Server deployment (on identical hardware) yields the following results:

Code:
C:\> fsutil fsinfo sectorInfo C:
LogicalBytesPerSector :                                 512
PhysicalBytesPerSectorForAtomicity :                    4096
PhysicalBytesPerSectorForPerformance :                  4096
FileSystemEffectivePhysicalBytesPerSectorForAtomicity : 4096
Device Alignment :                                      Aligned (0x000)
Partition alignment on device :                         Aligned (0x000)
Performs Normal Seeks
Trim Supported
Not DAX capable
Is Thinly-Provisioned, SlabSize :                       1.048.576 bytes (1,0 MB)

Code:
------------------------------------------------------------------------------
CrystalDiskMark 9.0.1 x64 (C) 2007-2025 hiyohiyo
                                  Crystal Dew World: https://crystalmark.info/
------------------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

[Read]
  SEQ    1MiB (Q=  8, T= 1):  6811.150 MB/s [   6495.6 IOPS] <  1230.51 us>
  SEQ    1MiB (Q=  1, T= 1):  1944.947 MB/s [   1854.8 IOPS] <   538.76 us>
  RND    4KiB (Q= 32, T= 1):   805.810 MB/s [ 196731.0 IOPS] <   148.90 us>
  RND    4KiB (Q=  1, T= 1):    47.140 MB/s [  11508.8 IOPS] <    86.79 us>

[Write]
  SEQ    1MiB (Q=  8, T= 1):  2768.327 MB/s [   2640.1 IOPS] <  3026.41 us>
  SEQ    1MiB (Q=  1, T= 1):  2726.607 MB/s [   2600.3 IOPS] <   384.29 us>
  RND    4KiB (Q= 32, T= 1):   458.443 MB/s [ 111924.6 IOPS] <   274.98 us>
  RND    4KiB (Q=  1, T= 1):   103.928 MB/s [  25373.0 IOPS] <    39.33 us>

Profile: Default
   Test: 1 GiB (x3) [C: 70% (167/240GiB)]
   Mode: [Admin]
   Time: Measure 5 sec / Interval 5 sec
   Date: 2025/12/04 22:17:20
     OS: Windows Server 2022 Server Standard 21H2 [10.0 Build 20348] (x64)

 
At first glance, Proxmox appears to offer substantial improvements over the old setup, with a few important observations:

1) According to Samsung’s official specifications, this model is rated for 6,800 MB/s sequential read and 2,700 MB/s sequential write -- both numbers closely matching the Hyper-V results.

2) I can’t account for the large performance difference shown in the Proxmox results, especially considering I’m using cache=none. Given the manufacturer’s specs, I’m starting to think the results aren’t telling the whole story -- or I’m doing something blatantly wrong.

3) I also don’t yet understand why Hyper-V delivers significantly better performance in the RND4K Q32T1 test.

4) Using scsi-hd.logical_block_size=4096 had a measurable (negative) impact on sequential write performance.
The remaining differences between the block-size configurations on Proxmox appear to be within measurement noise.
 
What exactly are you asking? The read speeds would be affected by the ZFS read cache (ARC), which lives in RAM. Writes would be affected by the fact that you are mirroring, so a write can only complete as fast as the slowest drive responds, and that comes down to IOPS.

You should really test with deeper queue depths, more threads, and larger block sizes if you want to measure full performance; according to the spec sheet, a single die will only go as fast as the numbers above indicate.
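As an illustration of that kind of test, an fio run on the Proxmox host along these lines would push deeper queues across several threads (assuming fio is installed; the file path, size, and runtime below are arbitrary choices):

Code:
# Example deeper-queue random-read test (path, size, and runtime are assumptions)
fio --name=randread-deep --filename=/spool/fio-test --size=8G \
    --rw=randread --bs=4k --ioengine=libaio --direct=1 \
    --iodepth=64 --numjobs=8 --runtime=30 --time_based --group_reporting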

Manufacturers typically quote peak performance, where the load can be spread across multiple flash chips and large block sizes minimize overhead.

Hyper-V, I am assuming, is talking directly to one drive. NTFS comes from an era when large RAM caches and fast drives weren’t feasible.
 
I believe @spirit has nailed the issue of RND4K Q32T1 performance:

What is your CPU usage during the bench? Currently iothread uses only one core, so you could be CPU-limited in RND4K. (I'm working on adding support for the new multithreading feature.) It could explain the difference, if Hyper-V is able to use multiple cores per disk.



Windows guest on Proxmox:
During the RND4K Q32T1 test, CPU usage went to 12% (100% of a single core on an 8-vCPU guest).
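One way to confirm this from the host side would be to watch the per-thread CPU usage of the VM's QEMU process while the benchmark runs (VM ID 100 is a placeholder); the iothread should show up as a single thread pinned near 100%.

Code:
# Per-thread CPU view of the QEMU process for VM 100 (VM ID is an assumption)
top -H -p "$(cat /run/qemu-server/100.pid)"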



Windows guest on Hyper-V Server:
During the RND4K Q32T1 test, CPU usage reached about 15% (equivalent to roughly 50–60% of a single core on a 4-vCPU system).

I also just realized this guest has only 4 vCPUs, compared to 8 on the other system. Sorry for the confusion -- but the results are still meaningful.

The Hyper-V implementation also seems to operate on a single core, yet it delivers significantly better performance while using less CPU.
 
What exactly are you asking?

- Why do identical tests on identical hardware produce significantly different results?

- Why do the Hyper-V benchmarks seem to align more closely with the manufacturer’s published performance? (It might simply be a coincidence.)

- Why does Hyper-V appear to perform better in one specific case of random reads? (Probably already answered by @spirit.)

The Proxmox numbers seem “too good to be true” unless some additional caching is happening -- as you said, ZFS caching may be another factor -- which could be masking the real performance.
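If ARC caching is the suspect, it can be inspected on the host and even capped temporarily for a re-test (the 4 GiB limit below is only an example value):

Code:
# Show current ARC size, limit, and hit/miss counters
awk '$1 ~ /^(size|c_max|hits|misses)$/ {print $1, $3}' /proc/spl/kstat/zfs/arcstats
# Temporarily cap the ARC at 4 GiB (example value; resets on reboot)
echo $((4 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_arc_max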



Please don’t misunderstand me: I’m just trying to understand what’s actually going on before making a final decision. That’s why I’m testing both systems right now. I intend to run application-level tests later, but for now I'm restricting the tests to raw storage performance.

This is also why I’m using CrystalDiskMark with its limited settings. This is just an initial benchmark. I understand your suggestion to use different parameters to extract the full potential of the hardware, but that’s not the goal at this stage.



For now, I simply want to reproduce the same test faithfully on both hypervisors and identify any pros and cons. So far, I’ve noticed:
- ZFS caching makes everything extremely fast
- Hyper-V seems to outperform in certain random-access tests, with lower CPU usage.
 
Hyper-V, I am assuming, is talking directly to one drive. NTFS comes from an era when large RAM caches and fast drives weren’t feasible.

No. Both servers are configured identically: 2x1.92 TB drives (mirrored, "RAID 1") for the operating system, and another 2x1.92 TB drives (mirrored, "RAID 1") for the VMs.
 
2) I can’t account for the large performance difference shown in the Proxmox results, especially considering I’m using cache=none. Given the manufacturer’s specs, I’m starting to think the results aren’t telling the whole story -- or I’m doing something blatantly wrong.

cache=none leaves the cache of the storage system enabled; use directsync instead. See here for a comparison of the caching modes: https://pve.proxmox.com/wiki/Performance_Tweaks#Disk_Cache
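For a re-test with that mode, the cache setting can be changed on the existing disk with qm set (the VM ID and volume name below are placeholders):

Code:
# Switch the existing disk to directsync (VM ID and volume name are assumptions)
qm set 100 --scsi0 spool:vm-100-disk-0,cache=directsync,discard=on,iothread=1,ssd=1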
 
You are not comparing apples to apples, though. How exactly does Hyper-V do RAID 1? Software RAID, Intel VROC, hardware RAID, or at a higher data-distribution layer?

Windows is closer to which expected benchmark? A proper benchmark covers a range of inputs and outputs; what the vendor tests for its marketing materials will be the optimal settings for the best numbers, which doesn’t mean it is realistic for your workload. Windows gets close to a synthetic benchmark precisely because it doesn’t do much optimization. ZFS is a volume manager with features like CoW, a ZIL, ARC and L2ARC, encryption, compression, and checksums; NTFS does none of that and is much closer to raw writing of blocks (which works until the power goes out).
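To see which of those ZFS features are actually in play for the benchmark disk, the relevant dataset properties can be listed (the zvol name below is an assumption):

Code:
# Properties that most affect benchmark behaviour (zvol name is an assumption)
zfs get compression,checksum,sync,primarycache,volblocksize spool/vm-100-disk-0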
 