Very poor performance vs ESXi -- At first glance

oturn

New Member
Jul 8, 2023
Wanted to get some opinions on surprisingly poor performance I’m seeing with a brand new Proxmox VE 8.0-2 installation on a 2018 Mac Mini.

Mac Mini Late 2018 8,1
3.2 GHz Core i7 "Coffee Lake" (i7-8700B)
64 GB 2666 MHz DDR4 PC4-21300 SDRAM
2TB Apple NVMe SSD AP2048

After disabling Secure Boot on the Mac Mini, Proxmox installed on the SSD with no problems. I'm not using any other storage device. After booting, Proxmox does complain about missing support for the built-in WiFi and Bluetooth, but I won't be using either; everything else is working great. I went with the default installation settings and let Proxmox format the entire disk, so I have the typical local and local-lvm storage listed under the server.

As a test I created a Debian VM using debian-12.0.0-amd64-netinst.iso. I configured the VM with the defaults, changing only the CPU to 1 socket / 2 cores and the memory to 2 GB, and enabling the QEMU guest agent and discard.
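(For anyone who wants to reproduce this from the shell instead of the GUI, a rough sketch of an equivalent VM is below. The VMID, storage names, and 32 GB disk size are placeholder values, not necessarily my exact setup.)

# hypothetical CLI equivalent of the Debian 12 test VM
qm create 100 --name debian-test --sockets 1 --cores 2 --memory 2048 \
  --agent enabled=1 --scsihw virtio-scsi-single \
  --scsi0 local-lvm:32,discard=on --net0 virtio,bridge=vmbr0 \
  --cdrom local:iso/debian-12.0.0-amd64-netinst.iso
qm start 100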

The total install time for Debian was 11 minutes and 30 seconds. The boot time is 16 seconds.

This is where it gets interesting…

Next I installed VMware ESXi 7.0 U3i on the Mac Mini, completely overwriting Proxmox. ESXi is installed on the SSD, and the datastore is also on the SSD.

I created a new VM with the same Debian ISO, the same CPU and memory, and selecting 64-bit Debian Linux as the OS.

The total install time with VMware was 3 minutes and 30 seconds. The boot time is 13 seconds.

I ran both of these tests twice, on Proxmox and on ESXi. The results were the same both times.

This was a simple test, but I was surprised at the huge gap in performance. I’m new to Proxmox and didn’t expect this at all. From everything I’ve read I was under the impression the performance of Proxmox should at least be equivalent to ESXi, if not better. Is this a Mac Mini driver or compatibility issue with Proxmox vs. ESXi? Is there something I can change on Proxmox or the Debian VM that might improve the performance?
 
I assume you tested with different storage types? VMware's file format vs. LVM-thin.
Speaking of LVM-thin, it is known that first-time block allocation is quite slow. My guess is
that this is the cause of the poor result in your benchmark.
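If you want to test that theory, a simple sketch (file name and size are arbitrary) is to run the same fio random-write job twice against one file inside the VM: on the first pass the thin pool still has to allocate the blocks, on the second pass the same blocks are already allocated, so a big gap between the two runs points at allocation overhead.

# pass 1: most writes hit unallocated thin blocks
fio --name=alloc-test --filename=/root/fio-testfile --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --iodepth=1
# pass 2: identical job, blocks already allocated; should be clearly faster on LVM-thin
fio --name=alloc-test --filename=/root/fio-testfile --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --iodepth=1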
 
There are lots of options to optimize a VM, like CPU type, NIC type, BIOS, etc. Also, Debian 12 is quite new; a more mature release might be a better choice.
Thanks for testing.
 
Boot and installation times are also not a great benchmark to test performance ;)
 
Also, PVE sets the VM disk cache to "Default (No cache)", whereas on ESXi, IIRC, the VM write cache is enabled.
So PVE is more resilient in case of a power failure or crash.
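If you want to rule that out, you could (for testing only) switch the disk's cache mode to writeback and re-run. That's on the disk's options in the GUI, or from the shell something like the following, where the VMID and volume name are placeholders for your existing disk entry:

# writeback is closer to ESXi's behaviour, but less safe on power loss
qm set 100 --scsi0 local-lvm:vm-100-disk-0,cache=writeback
# back to the Proxmox default
qm set 100 --scsi0 local-lvm:vm-100-disk-0,cache=none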
 
Yes, ESXi is using VMFS with thin provisioning. I'll plan on running some real benchmarks, first in ESXi, then I'll reinstall Proxmox. Is there a recommended benchmark tool for Debian? Also, should I benchmark in Proxmox with the write cache at its default and then with it enabled? Any other changes on the Proxmox side?
 
Setting the CPU type to "host", for example, using VirtIO block / VirtIO SCSI single / VirtIO NIC, and disabling mitigations.
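Concretely, that would be something along these lines (the VMID is a placeholder; mitigations=off trades security for speed, needs a reboot, and can be applied on the PVE host, in the guest, or both):

# on the Proxmox host, for the VM:
qm set 100 --cpu host
qm set 100 --scsihw virtio-scsi-single
# in the Debian guest (or on the PVE host): add mitigations=off to
# GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then:
update-grub && reboot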
 
My first impression of Proxmox (that was 2-3 years ago) was even without a VirtIO NIC and without a VirtIO block device
(I was a newbie at the time and used a SCSI device + e1000 NIC xD).

However, the first impression I had back then was: FML, the Windows VM is running so much faster, everything is snappy compared to ESXi.

That was back when I hated Proxmox, because I was a hardcore "ESXi admin" and we used only ESXi servers with iSCSI storage.

It took a year, and since then 50% of our servers have been running Proxmox.
The only reason we don't run Proxmox everywhere is HA, and on some servers we need Hyper-V.

However, nowadays I install/run every VM on the Proxmox servers with host CPU and VirtIO wherever possible.
I'm even slowly switching everything to q35, but that's not because of performance, more because of passthrough; passthrough somehow works better with q35.

At some point UEFI VMs also got very stable, so that has just become a "use whatever you prefer" thing.

However, you should try a Windows VM, since in my opinion you feel the difference there a lot more than with a Linux VM.
Like Dunuin said, boot time and installation time aren't something I would use as a benchmark.

Try using host CPU and VirtIO devices.
Though don't use any write-back option for the VM storage, as I found out not long ago that, for whatever reason, it decreases performance instead of increasing it, at least for me.
Not sure why that happens (4x 990 Pro in ZFS RAID10).

Cheers
 
Ok here it is, and of course the experts here were correct. :) In summary, Proxmox came out on top after running several performance tests. I used sysbench and fio. These tests are by no means comprehensive. I just wanted some quick metrics.

In addition, there were a couple of optimizations pointed out vs. ESXi that I found the most interesting and useful. By default, ESXi has CPU mitigations turned off, while Proxmox has them on. Also, ESXi has EVC mode disabled by default; I believe having EVC disabled is similar to choosing the "host" CPU type in Proxmox. After learning this, I disabled CPU mitigations in Proxmox, set the VM CPU type to "host", and verified I was using VirtIO SCSI single for the SCSI controller and VirtIO for the NIC.
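(If anyone wants to verify that those settings took effect, a quick check inside the guest is to read the sysfs vulnerability files and the CPU model: with mitigations=off most entries report "Vulnerable", and with the "host" CPU type lscpu shows the real i7-8700B instead of a generic QEMU CPU.)

grep . /sys/devices/system/cpu/vulnerabilities/*
lscpu | grep 'Model name'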

One other note about ESXi. On my 2018 Mac mini, the disks tests would frequently cause the SSD to have extremely high latency, to the point of not responding for several seconds after the tests were completed. This did not happen with Proxmox.

Apologies if the formatting and readability aren't great in these results. Also, I have to add that the Mac Mini may be older and slower by some measures, but it's an extremely capable little machine. Plus, it is completely silent, which is very important to me. I ran ESXi on a 2012 Mac Mini for almost 10 years before recently getting this 2018 Mac Mini on eBay.

Finally, I still do find the disparity in installation times interesting. I'm assuming it's mostly because of the first-time block allocation in LVM-thin, but I'm also wondering if it has anything to do with the performance of reading the ISO file on Proxmox local vs. VMFS.

Thanks very much for the assistance from the community! I look forward to becoming more proficient with Proxmox.


Here are the results:


ESXi - sysbench

Test the speed of a single CPU core

sysbench cpu run
CPU speed:
events per second: 1444.60

General statistics:
total time: 10.0002s
total number of events: 14448

Latency (ms):
min: 0.67
avg: 0.69
max: 1.54
95th percentile: 0.73
sum: 9997.91

Threads fairness:
events (avg/stddev): 14448.0000/0.00
execution time (avg/stddev): 9.9979/0.00

Test the speed of multiple CPU cores or threads
sysbench cpu --threads=6 run
CPU speed:
events per second: 2865.79

General statistics:
total time: 10.0011s
total number of events: 28664

Latency (ms):
min: 0.68
avg: 2.09
max: 24.70
95th percentile: 16.71
sum: 59950.57

Threads fairness:
events (avg/stddev): 4777.3333/8.10
execution time (avg/stddev): 9.9918/0.01

Conduct a prolonged benchmark to generate sustained CPU load
sysbench cpu --threads=6 --time=60 run
CPU speed:
events per second: 2859.93

General statistics:
total time: 60.0007s
total number of events: 171601

Latency (ms):
min: 0.67
avg: 2.10
max: 36.71
95th percentile: 16.71
sum: 359892.51

Threads fairness:
events (avg/stddev): 28600.1667/13.26
execution time (avg/stddev): 59.9821/0.01

Memory Test
sysbench --test=memory --memory-block-size=1M --memory-total-size=8G run
Total operations: 8192 (25347.25 per second)

8192.00 MiB transferred (25347.25 MiB/sec)

General statistics:
total time: 0.3220s
total number of events: 8192

Latency (ms):
min: 0.04
avg: 0.04
max: 0.08
95th percentile: 0.04
sum: 320.41

Threads fairness:
events (avg/stddev): 8192.0000/0.00
execution time (avg/stddev): 0.3204/0.00


Proxmox - sysbench

Test the speed of a single CPU core

sysbench cpu run
CPU speed:
events per second: 1510.26

General statistics:
total time: 10.0005s
total number of events: 15105

Latency (ms):
min: 0.65
avg: 0.66
max: 1.09
95th percentile: 0.70
sum: 9998.26

Threads fairness:
events (avg/stddev): 15105.0000/0.00
execution time (avg/stddev): 9.9983/0.00

Test the speed of multiple CPU cores or threads
sysbench cpu --threads=6 run
CPU speed:
events per second: 2961.01

General statistics:
total time: 10.0010s
total number of events: 29616

Latency (ms):
min: 0.67
avg: 2.02
max: 32.69
95th percentile: 16.71
sum: 59970.44

Threads fairness:
events (avg/stddev): 4936.0000/11.06
execution time (avg/stddev): 9.9951/0.01

Conduct a prolonged benchmark to generate sustained CPU load
sysbench cpu --threads=6 --time=60 run
CPU speed:
events per second: 2943.68

General statistics:
total time: 60.0011s
total number of events: 176627

Latency (ms):
min: 0.67
avg: 2.04
max: 28.77
95th percentile: 16.71
sum: 359889.28

Threads fairness:
events (avg/stddev): 29437.8333/13.17
execution time (avg/stddev): 59.9815/0.01

Memory Test
sysbench --test=memory --memory-block-size=1M --memory-total-size=8G run
Total operations: 8192 (26803.03 per second)

8192.00 MiB transferred (26803.03 MiB/sec)

General statistics:
total time: 0.3044s
total number of events: 8192

Latency (ms):
min: 0.04
avg: 0.04
max: 0.11
95th percentile: 0.04
sum: 302.85

Threads fairness:
events (avg/stddev): 8192.0000/0.00
execution time (avg/stddev): 0.3029/0.00



ESXi - fio (ran 3 tests)

4K random write test

fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --size=4g --iodepth=1 --runtime=60 --time_based --end_fsync=1
-WRITE: bw=14.8MiB/s (15.5MB/s), 14.8MiB/s-14.8MiB/s (15.5MB/s-15.5MB/s), io=1330MiB (1394MB), run=89899-89899msec
-WRITE: bw=19.4MiB/s (20.4MB/s), 19.4MiB/s-19.4MiB/s (20.4MB/s-20.4MB/s), io=2108MiB (2210MB), run=108516-108516msec
-WRITE: bw=151MiB/s (159MB/s), 151MiB/s-151MiB/s (159MB/s-159MB/s), io=9136MiB (9579MB), run=60413-60413msec

16 parallel 64KiB random write processes
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=64k --size=256m --numjobs=16 --iodepth=16 --runtime=60 --time_based --end_fsync=1
-WRITE: bw=68.1MiB/s (71.5MB/s), 3148KiB/s-6167KiB/s (3224kB/s-6315kB/s), io=5053MiB (5298MB), run=60047-74148msec
-WRITE: bw=1405MiB/s (1473MB/s), 83.2MiB/s-93.1MiB/s (87.2MB/s-97.6MB/s), io=82.6GiB (88.7GB), run=60060-60214msec
-WRITE: bw=1418MiB/s (1487MB/s), 87.3MiB/s-90.1MiB/s (91.5MB/s-94.5MB/s), io=83.4GiB (89.5GB), run=60039-60213msec

Single 1MiB random write process
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
-WRITE: bw=37.3MiB/s (39.1MB/s), 37.3MiB/s-37.3MiB/s (39.1MB/s-39.1MB/s), io=2647MiB (2776MB), run=71049-71049msec
-WRITE: bw=65.2MiB/s (68.4MB/s), 65.2MiB/s-65.2MiB/s (68.4MB/s-68.4MB/s), io=6406MiB (6717MB), run=98233-98233msec
-WRITE: bw=73.1MiB/s (76.6MB/s), 73.1MiB/s-73.1MiB/s (76.6MB/s-76.6MB/s), io=8419MiB (8828MB), run=115199-115199msec


Proxmox - fio (ran 3 tests)

4K random write test

fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --size=4g --iodepth=1 --runtime=60 --time_based --end_fsync=1
-WRITE: bw=148MiB/s (156MB/s), 148MiB/s-148MiB/s (156MB/s-156MB/s), io=8952MiB (9386MB), run=60361-60361msec
-WRITE: bw=147MiB/s (154MB/s), 147MiB/s-147MiB/s (154MB/s-154MB/s), io=8889MiB (9320MB), run=60391-60391msec
-WRITE: bw=145MiB/s (152MB/s), 145MiB/s-145MiB/s (152MB/s-152MB/s), io=8760MiB (9185MB), run=60393-60393msec
16 parallel 64KiB random write processes
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=64k --size=256m --numjobs=16 --iodepth=16 --runtime=60 --time_based --end_fsync=1
-WRITE: bw=1828MiB/s (1917MB/s), 107MiB/s-122MiB/s (112MB/s-128MB/s), io=108GiB (116GB), run=60022-60267msec
-WRITE: bw=1799MiB/s (1886MB/s), 105MiB/s-119MiB/s (110MB/s-125MB/s), io=106GiB (114GB), run=60041-60248msec
-WRITE: bw=1849MiB/s (1939MB/s), 108MiB/s-122MiB/s (113MB/s-128MB/s), io=109GiB (117GB), run=60038-60273msec

Single 1MiB random write process
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
-WRITE: bw=1965MiB/s (2061MB/s), 1965MiB/s-1965MiB/s (2061MB/s-2061MB/s), io=115GiB (124GB), run=60123-60123msec
-WRITE: bw=1965MiB/s (2060MB/s), 1965MiB/s-1965MiB/s (2060MB/s-2060MB/s), io=115GiB (124GB), run=60120-60120msec
-WRITE: bw=1928MiB/s (2021MB/s), 1928MiB/s-1928MiB/s (2021MB/s-2021MB/s), io=113GiB (122GB), run=60125-60125msec
 
@oturn Great tests, very helpful.
But could you also check ESXi 8.0? xD

They improved a lot, and I don't think I'm the only one who would be happy to see a performance comparison without doing it myself xD
If not, no problem; thanks anyway for the benchmarks!
 
