[TUTORIAL] Proxmox VE vs VMware ESXi performance comparison

bbgeek17

Many discussions have compared Proxmox and VMware from a feature perspective, but almost none compare performance. We tested PVE 7.2 (kernel=5.15.53-1-pve) and VMware ESXi 7.0 (update 3c) with a 32 VM workload to see which performs better under load with storage-heavy applications.

The results were surprising:
  • Proxmox VE beat VMware ESXi in 56 of 57 tests, delivering IOPS gains of nearly 50%. Peak gains in individual test cases with large queue depths and small I/O sizes exceeded 70%.
  • Proxmox VE reduced latency by more than 30% while simultaneously delivering higher IOPS, besting VMware in 56 of 57 tests.
  • Proxmox VE achieved 38% higher bandwidth than VMware ESXi during peak load conditions: 12.8 GB/s for Proxmox versus 9.3 GB/s for VMware ESXi.
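
For anyone who wants to reproduce a similar storage-heavy load, here's a minimal sketch of the kind of per-guest fio run we're describing. The device path, runtime, and depth/size values are illustrative placeholders, not the published test matrix; sweep bs and iodepth (e.g., 4k-256k, QD 1-128) to build your own matrix.

    # Random-read pass at a large queue depth and a small I/O size (run inside each guest).
    fio --name=randread-qd128 --filename=/dev/sdb \
        --ioengine=libaio --direct=1 \
        --rw=randread --bs=4k --iodepth=128 --numjobs=1 \
        --time_based --runtime=60 --group_reporting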
Here's a link to the analysis, with graphs, data, and a short overview of the architectural differences:
If you find this helpful, please let me know. Questions, comments, and corrections are welcome.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Thanks for sharing such a detailed comparison between Proxmox and VMware. The performance results are quite impressive, and it's great to see Proxmox outperforming VMware in multiple tests.
 
Impressive, indeed. What's more impressive is that Proxmox didn't require any tuning to get these results. With VMware, you practically need a PhD in ESX to make the most of the hardware.

FWIW, we can get more out of the hardware with Proxmox via low-level tuning. However, we're really trying to avoid comparisons of special-case configurations.


 
Very interesting read indeed. One thing I noticed is that the links in the sidebar for the Proxmox section don't work; they jump to the VMware section because the IDs of the HTML elements are duplicated.
 
Hi,

1. You could use the NVMe guest HBA in VMware.
AFAIK it does not provide more IOPS by itself vs. the SCSI adapter,
but it is supposed to use fewer CPU cycles on the host than the SCSI version.
So depending on your host load, the NVMe adapter could speed things up in high-load situations.

2. "Note: VVOLs and Raw Device Mappings offer a more direct path but are not supported for NVMeOF devices."
I think VVOLs are supported in vSphere 8.0 for NVMeOF:
VMware NVMe Storage
I don't have v8 or VVOLs in use as of now, so no idea how well it works.

3. I think this is a bit of apples vs. oranges.
If you use VMFS/disk images on the one side, you should use xfs/ext4 and qcow2 images on the other side,
so both sides can do snapshots and disk moves at the hypervisor level.

4. When using NVMeOF, I assume this is for a max-performance use case.
You could mount the NVMeOF storage inside the guest OS for the high-performance data and boot the VM from traditional images (see the sketch at the end of this post).
That way you eliminate the whole hypervisor disk layer for the high-performance stuff.

5. Last time I checked, there is a section in the VMware EULA that states you are only permitted to share benchmarks after you get the OK from VMware.
And they do enforce this sometimes; they even removed stuff I posted on the VMware forum where I shared numbers but did not compare them to anything... so keep an eye out for angry VMware mails.
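
Regarding point 4, here's a minimal sketch of what attaching NVMe/TCP storage from inside a Linux guest could look like with nvme-cli. The target address, port, and NQN are made-up placeholders.

    # Discover and connect to an NVMe/TCP target from inside the guest
    # (needs the nvme-tcp kernel module and the nvme-cli package).
    modprobe nvme-tcp
    nvme discover -t tcp -a 192.0.2.10 -s 4420
    nvme connect  -t tcp -a 192.0.2.10 -s 4420 -n nqn.2009-12.com.example:guest-data
    # The namespace then appears as a local /dev/nvmeXnY device, bypassing the
    # hypervisor's virtual disk layer entirely.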
 
Hi @ALFi, really great observations!

1. We tried using the virtual NVMe controller on ESX. Unfortunately, it appeared to have stability issues under load in update 3c. We plan to test more recent updates for our guidance on best practices. The symptoms we saw during testing were I/O timeouts in the guest, followed by a complete VM hang. This condition ultimately required a reboot of ESX because the VMs were not killable. Under light load (i.e., a single VM), we did get some data suggesting only minor performance differences. However, we'd need to be more confident in the data before sharing it.

2. v8 is on our list. FWIW, VVOLs are just "raw device to VM" connections. This is very similar to what we tested with Proxmox. VVOLs were a bigger deal in the old days when devices didn't have a lot of queue depth. However, VVOLs were never very popular in customer environments.

3. Agreed. It's impossible to make a straightforward comparison between solutions with such different architectures. However, the data volumes in both cases have snapshots and mobility across hosts orchestrated by the hypervisor. Our objective is to observe the performance of applications running in standard deployment models in an enterprise/datacenter HA environment. We could add additional layers to the testing on the Proxmox side, but people would not do this in a production environment. Even with the default Ceph storage, no filesystem or QCOW2 is involved.

4. Believe it or not, NVMeOF is not always max performance. It depends on what sort of performance you are looking for. For example, iSCSI QD1 latency tends to be a few microseconds lower than NVMe/TCP because of the Linux driver implementation. However, for aggregate IOPS, NVMe/TCP wins hands down. We've run experiments similar to what you describe in the past with iSCSI, without seeing an advantage, but it is worth trying again with NVMe/TCP. The "inside the VM" strategy effectively tests the networking stack rather than the hypervisor storage path. That said, it's not exactly scalable from a management perspective.
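
To make the QD1 point concrete, here's a hypothetical fio probe: run it once against an iSCSI-attached device and once against an NVMe/TCP-attached one, and the few-microsecond gap shows up in the clat percentiles. The device path and runtime are placeholders.

    # QD1 latency probe: a single outstanding 4k random read; compare the "clat"
    # percentiles between the two attachment types. /dev/sdX is a placeholder.
    fio --name=qd1-latency --filename=/dev/sdX --ioengine=libaio --direct=1 \
        --rw=randread --bs=4k --iodepth=1 --time_based --runtime=30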


 
One other thought on #4 for those who come across this thread in the future: when you move device management into the VM, you lose essential hypervisor-integrated data management features, like Proxmox Backup, templates, cloning, snapshots, and rollback.


 
Great comparison, thanks for this.

What I believe would provide great(er) value (at least in my case) would be a benchmark on more "humane" hardware.

As a network engineer I drool over those 100Gbps Mellanox switches, but realistically, for smaller shops, 25/40Gbps setups are more affordable.

Thx,
Seb
 
Hi @sebyp,

Part of the purpose of the comparison is to understand the relative efficiency of performing storage operations in each platform. Under normal circumstances, you would not spend 100% of your CPU cycles on storage. If we used lower bandwidth links, the network would become the bottleneck instead of the hypervisor, calling the comparison into question.

What could be interesting is to measure average latency at a fixed IOPS workload on lower bandwidth links. Then, we could quantify the impact on an application. What do you think?
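
For what it's worth, fio can pin a workload to a fixed IOPS rate, which is roughly what that experiment would look like. The rate, device path, queue depth, and runtime below are illustrative assumptions.

    # Hold the workload at ~10k IOPS and compare average/percentile latency
    # (the "clat" lines in the output) across the different link speeds.
    fio --name=fixed-rate --filename=/dev/sdb --ioengine=libaio --direct=1 \
        --rw=randread --bs=4k --iodepth=32 --rate_iops=10000 \
        --time_based --runtime=60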


 
Hello back @bbgeek17 ,

You're right; on a more careful read, the test is "only" about storage, my bad for interfering :). I was looking for a more comprehensive test, including CPU and actual VM performance comparisons. I did find something on Reddit a while ago claiming VMware VMs are faster, but the methodology seemed a bit strange and not really relevant in a server environment.

Anyway, thanks a lot for the info!
 
