Unusual setup: Proxmox with only one VM, how to maximize KVM performance without "hurting" the host?

Jan 25, 2025
Hello Community!

First of all, I know that your time is precious, so I tried to be as concise as possible.

This is only my second message on this forum but I would like to thank everyone.

Many of my questions were answered by "only" reading topics and their answers.

# ---

I am (a bit) aware of some "facts": (please send me an RTFM link if I am wrong... ^^)

- Proxmox is designed to handle multiple/many VM/CT competing (or not) on a single host
- using only a single VM on Proxmox (this is temporary *1) is a very unusual situation
- to avoid wasting any CPU cycles, using a bare metal server is highly recommended *2
- AFAIK Spectre/Meltdown are local attacks *3 so using "mitigations=off" would be fine
- SMT/HyperThreading are "tricks", yet getting even a 15% to 30% boost is still a massive gain
- disabling SMT/HT gives each thread its core's full L1/L2 caches, which could be a more efficient solution
- reading "Sockets vs Cores vs Threads vs vCPU vs CPU Units" was really helpful! *4 *5

*1) The main goal is to switch later to multiple unprivileged LXC containers to separate each "layer"
*2) Automating bare-metal provisioning is currently beyond my knowledge: Puppet, Ansible, ...
*3) I am the only one able to access the host plus a trusted person using SSH (non root) on the VM
*4) "1 vm core = 1 qemu thread" => setting a VM to use all CPU threads is possible but not advised
*5) "You will always compete with the host" => when a VM is configured to use all host CPU threads

# ---

If it can help, here is the host currently used with Proxmox:

- AMD EPYC GENOA 9554 64c/128t 3.1GHz/3.75GHz
- 128GB DDR5 ECC 4800MHz
- 2x SSD NVMe 960GB Datacenter Class

About Proxmox:

- pve-manager/8.4.1/2a5fa54a8503f96d (running kernel: 6.8.12-11-pve)
- everything is up to date, Debian (bookworm) and Proxmox (pve-enterprise)

About the VM (QEMU config):

agent: 1
balloon: 0
boot: order=scsi0;ide2;net0
cores: 128
cpu: EPYC-Genoa
ide2: none,media=cdrom
memory: 98304
meta: creation-qemu=9.0.2,ctime=1738866549
name: www.XXXXXXXXXX.com
net0: virtio=BC:24:11:35:DC:9B,bridge=vmbr1
numa: 0
onboot: 1
ostype: l26
rng0: source=/dev/urandom
scsi0: qemu:vm-1001-disk-0,discard=on,iothread=1,size=256G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=2c38474a-5ec1-4d2d-9d2a-fed4c92f928c
sockets: 1
vmgenid: 338325a5-afe3-47ea-8675-533e9ee55186
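
To make footnotes *4 and *5 concrete, here is a minimal Python sketch that parses a config like the one above and compares the vCPU count with the host's thread count. The function names are made up for illustration, the sample is only an excerpt of the posted config, and the fallback of 1 for a missing "sockets" or "cores" entry is my assumption.

```python
def parse_vm_config(text):
    """Parse 'key: value' lines of a qemu-server style config into a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values such as MAC addresses survive.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def vcpu_count(config):
    """Total vCPUs = sockets * cores (assumed to default to 1 if absent)."""
    return int(config.get("sockets", 1)) * int(config.get("cores", 1))


def host_headroom(config, host_threads):
    """Host threads left over if every vCPU is busy at once (may be <= 0)."""
    return host_threads - vcpu_count(config)


# Excerpt of the config posted above.
SAMPLE = """\
cores: 128
cpu: EPYC-Genoa
memory: 98304
sockets: 1
"""

if __name__ == "__main__":
    cfg = parse_vm_config(SAMPLE)
    # 128 is the host thread count from the post (64c/128t).
    print(vcpu_count(cfg), host_headroom(cfg, host_threads=128))
```

With the values from this post the headroom is zero, which is exactly the "you will always compete with the host" situation.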

# ---

Because I have many doubts, here are my questions: (again, please feel free to send me an RTFM link!)

- Is it possible to run KVM processes (not QEMU processes) with "nice -n 19"?

IIUC, this would help the host and Proxmox to run smoothly even if a VM is using all CPU threads.

- Since only trusted users can SSH the host or the VM, is it advisable to disable all mitigations?

Currently mitigated: "Spec rstack overflow", "Spec store bypass", "Spectre v1", "Spectre v2"
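
For reference, the kernel exposes this status under sysfs, one file per vulnerability. A minimal Python sketch for listing it, assuming a reasonably recent Linux kernel; the function names are hypothetical and the directory is passed as a parameter so the code can be tried against a copy:

```python
import os


def read_mitigations(vuln_dir="/sys/devices/system/cpu/vulnerabilities"):
    """Map each vulnerability name to the kernel's one-line status string."""
    status = {}
    if not os.path.isdir(vuln_dir):
        return status  # not Linux, or a kernel without this interface
    for name in sorted(os.listdir(vuln_dir)):
        try:
            with open(os.path.join(vuln_dir, name)) as fh:
                status[name] = fh.read().strip()
        except OSError:
            continue  # skip entries that cannot be read
    return status


def active_mitigations(status):
    """Keep only the entries whose status begins with 'Mitigation'."""
    return {k: v for k, v in status.items() if v.startswith("Mitigation")}


if __name__ == "__main__":
    for name, state in read_mitigations().items():
        print(f"{name}: {state}")
```

Running it before and after a change to the kernel command line shows which entries switch from "Mitigation: ..." to "Vulnerable".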

- SMT/Hyper-Threading seems to be just a marketing "lie", would it be beneficial to disable it?

AFAIK, each real CPU will then fully use its L1 (and L2?) cache, leading to good results.

# ---

The goal is to understand how to get maximum performance with only one VM in Proxmox.

As I said earlier, the goal is to switch to LXC containers because there is almost no CPU overhead.

But, due to lack of time (and knowledge), using a single VM was the simplest option.

# ---

Again, many thanks to the Proxmox team and the community!

PS: I am not a native English speaker so if I misused a term or a word, please let me know.
 
- Is it possible to run KVM processes (not QEMU processes) with "nice -n 19"?

Probably. But why? The kernel scheduler has been optimized over years of development to distribute compute power in a sane manner.

Usually it is plainly wrong to believe that you know it better. (Exceptions just confirm this rule.)
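
For anyone who still wants to experiment: on Linux the nice value is a per-thread attribute, so a VM's threads have to be adjusted one by one. Below is a minimal Python sketch, not a recommendation. The helper names are hypothetical, and the PID file location is my assumption about the usual Proxmox layout, so please verify it on your own host.

```python
import os


def thread_ids(pid, proc_root="/proc"):
    """Return the IDs of all threads of `pid`, as listed in /proc/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


def renice_threads(pid, niceness):
    """Set the nice value of every thread of `pid`; return the thread IDs changed.

    Raising niceness is allowed for the process owner; lowering it again
    needs root (or CAP_SYS_NICE).
    """
    changed = []
    for tid in thread_ids(pid):
        try:
            os.setpriority(os.PRIO_PROCESS, tid, niceness)
        except ProcessLookupError:
            continue  # the thread exited between listing and renicing
        changed.append(tid)
    return changed


def renice_vm(vmid, niceness, pid_dir="/var/run/qemu-server"):
    """Renice all threads of a VM, reading its PID from <pid_dir>/<vmid>.pid.

    Assumption: the PID file path follows the usual Proxmox layout.
    """
    with open(os.path.join(pid_dir, f"{vmid}.pid")) as fh:
        return renice_threads(int(fh.read().strip()), niceness)
```

Called as renice_vm(1001, 19) on the host (1001 being the VM ID that appears in the disk name of the posted config), this would touch every thread of that VM, vCPU threads and I/O threads alike. It does not try to tell them apart.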

- Since only trusted users can SSH the host or the VM, is it advisable to disable all mitigations?

No.

If you only allow trusted people to use your car, is it ok to ignore the seat belt?

- SMT/Hyper-Threading seems to be just a marketing "lie", would it be beneficial to disable it?

This depends on your actual workload. If you never have more active processes than there are physical cores, then yes, disable HT.

Usually you have a lot of processes running concurrently. Then having more threads is a good thing.

(During the very early days of HT there was an actual performance hit for single-threaded applications. I believe that became irrelevant several years ago, at least for most use cases.)

Why don't you test it? Please post your results here :-)
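
A minimal Python sketch of such a test, assuming a Linux system: it reports whether the kernel currently has SMT active and measures the median wall time of a command over a few runs. The function names are hypothetical, and the command you pass in stands for whatever benchmark represents your real workload.

```python
import statistics
import subprocess
import time


def smt_active(path="/sys/devices/system/cpu/smt/active"):
    """Return True/False for the kernel's SMT state, or None if unknown."""
    try:
        with open(path) as fh:
            return fh.read().strip() == "1"
    except OSError:
        return None  # file missing: not Linux, or an older kernel


def median_runtime(command, runs=5):
    """Run `command` several times and return the median wall time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the benchmark itself fails.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

Run the same command once with SMT enabled and once with it disabled (in the firmware setup, or via /sys/devices/system/cpu/smt/control on kernels that support it), note smt_active() each time, and compare the two medians.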
 