Proxmox vCPU flags

Kobus · May 15, 2020

Good afternoon

Now please dont shoot as you read this, but I have recently taken over a Proxmox installation that is rather old, very much in use and I need to make some changes to make sure that the move to a new Proxmox installation on the latest version goes smoothly.

Some version details:
Proxmox 3.4-15/e1daa307 (I know this is ancient. I am in the process of upgrading this proxmox, but before I can I need to make sure that the current environment is ready)

So the first question is:
we have a series of containers that are used for video transcoding. On the containers the video transcoding takes around 1-4 minutes depending on the settings.
I created a new VM and the VM runs about twice as slow. Transcoding around 3-10 minutes. The difference between the containers and the VM's is the CPU flags. The containers flags looks like this:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf cpuid_faulting pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx xsaveopt cqm_llc cqm_occup_llc

The VM flags looks like this:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb lm constant_tsc arch_perfmon rep_good unfair_spinlock eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch xsaveopt invpcid_single pti retpoline fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx

How do I get the VM flags to be the same as the container flags?

And please, I know this is an old version, but I need this so I can move all of our containers to vm's.

Thank you in advance.

Kobus

BobhWasatch · May 15, 2020

Have you tried setting the CPU type to "host"? Because that's what a container will see.

Kobus · May 15, 2020

Yes I did. I have also tried various CPU types and Broadwell came closest but not the same.

Kobus · May 15, 2020

Is it possible to set these flags manually?

BobhWasatch · May 15, 2020

On pve 3.4? No idea. Way before my time.

If you don't mind my asking, why are you convinced that it is the flags causing the performance loss? It looks like the obvious ones for transcoding are there (sse*, avx*, etc). Switching from containers to VM's will generally cause some loss of I/O performance, particularly with such old versions of kvm and qemu. If you are transcoding large files I could see that being a significant problem.

Kobus · May 15, 2020

In that case, how do I make transcoding perform quicker on VM's closer to how containers perform?

Kobus · May 15, 2020

What made me look at the flags is that is the only thing really significantly different.

LnxBil · May 15, 2020

Kobus said:
In that case, how do I make transcoding perform quicker on VM's closer to how containers perform?

Nothing. If the problem is the I/O you cannot strip away layers, if it's the CPU with cpu=host setting, you cannot change anymore.

Have you checked if the container software is stock software or is it optimized for the CPU flags?

BobhWasatch · May 15, 2020

Kobus said:
What made me look at the flags is that is the only thing really significantly different.

Well, except for all of the I/O devices being virtualized rather than handled directly by the host OS. Which is a pretty significant difference. Some virtual devices are more efficient, while others are more compatible. For Linux, which has had virtio drivers for a while, generally you want to use those rather than emulating legacy devices like IDE.

You might want to do some profiling. Is the transcoding using 100% CPU, or is I/O bound in the new environment?

Kobus · May 16, 2020

LnxBil said:
Nothing. If the problem is the I/O you cannot strip away layers, if it's the CPU with cpu=host setting, you cannot change anymore.

Have you checked if the container software is stock software or is it optimized for the CPU flags?

Thank you.
I enquired internally a bit more and the container has been optimised for the CPU flags.
Is the same possible for VM’s?

Kobus · May 16, 2020

BobhWasatch said:
Well, except for all of the I/O devices being virtualized rather than handled directly by the host OS. Which is a pretty significant difference. Some virtual devices are more efficient, while others are more compatible. For Linux, which has had virtio drivers for a while, generally you want to use those rather than emulating legacy devices like IDE.

You might want to do some profiling. Is the transcoding using 100% CPU, or is I/O bound in the new environment?

In the VM it is using 100% but not in the container. The container reaches around 80-90%, bit for very brief timings where as the VM sits with 100% for about 3-4 times longer. Up to about 10 mins.

fabiosun · May 16, 2020

Kobus said:
Is it possible to set these flags manually?

I do not know if I have understood well your question, but if cpu are the same models you can put args you want in vm config (boot args)
I do it for my OSX VM

agent: 1
args: -smbios type=2 -cpu host,vendor=GenuineIntel,+invtsc,+fpu,+avx2,+tsc_adjust...and so on
bios: ovmf
boot: d
cores: 64
cpu: host
efidisk0: local-lvm:vm-100-disk-1,size=128K
hostpci0: 21:00,pcie=1,x-vga=1

Kobus · May 16, 2020

fabiosun said:
I do not know if I have understood well your question, but if cpu are the same models you can put args you want in vm config (boot args)
I do it for my OSX VM

agent: 1
args: -smbios type=2 -cpu host,vendor=GenuineIntel,+invtsc,+fpu,+avx2,+tsc_adjust...and so on
bios: ovmf
boot: d
cores: 64
cpu: host
efidisk0: local-lvm:vm-100-disk-1,size=128K
hostpci0: 21:00,pcie=1,x-vga=1

Thank you. I was looking at doing this about 20 mins ago. This truly helps.

LnxBil · May 17, 2020

Kobus said:
I enquired internally a bit more and the container has been optimised for the CPU flags.
Is the same possible for VM’s?

Everything is possible, the problem is how. You can optimize everything, download the sources and do all the optimization by compiling for your host. The question is if you know how to do that. That involves knowledge of the compile infrastructure of the problem you want to compile and such....

If you could convert the convert the container into a real VM, you can use the same compiled code, you can also try to chroot the container in a VM - but if you don't know what I'm talking about, don't go also down that hole.

In general, I assume this is a very specific use case with optimizations in place that the previous admin/developer created. You cannot simple convert this with all the optimizations in place. Is there a documentation how it was set up? Reverse engineering everything will take a considerate amount of time and may be completely out of scope.

Kobus · May 18, 2020

Kobus said:
Thank you. I was looking at doing this about 20 mins ago. This truly helps.

Hi
Thank you for your reply. I am unable to get it to recognise the changed flags. Here is what I tried:
I tried running the vm with
/usr/bin/kvm -id 40039 -chardev 'socket,id=qmp,path=/var/run/qemu-server/40039.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -vnc unix:/var/run/qemu-server/40039.vnc,x509,password -pidfile /var/run/qemu-server/40039.pid -daemonize -smbios 'type=1,uuid=69a53e73-7f81-4b6e-ac93-875ab0a3e411' -name pdcpavpr012 -smp '8,sockets=4,cores=2,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -cpu host,+x2apic,+vme,+acpi,+ss,+tm,+pbe,+pdpe1gb,+rdtscp,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+fma,+xtpr,+pdcm,+pcid,+dca,+movbe,+xsave,+avx,+f16c,+rdrand,+abm,+3dnowprefetch,+fsgsbase,+bmi1,+hle,+avx2,+smep,+bmi2,+erms,+invpcid,+rtm,+rdseed,+adx -m 4096 -k en-gb -cpuunits 1000 -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:b3921d2a445c' -device 'lsi,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/zvol/rpool/zfsdisks/vm-40039-disk-1,if=none,id=drive-scsi0,cache=writeback,aio=threads,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,scsi-id=0,drive=drive-scsi0,id=scsi0,bootindex=100' -drive 'if=none,id=drive-ide2,media=cdrom,aio=native' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -netdev 'type=tap,id=net0,ifname=tap40039i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=C6:87:19:7E:8E:A1,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300'

I also tried the following:
I change the qemu conf file for this vm from:
/root/40039.conf
#Production DC - Production - Proc - 012
#Shared Proc Pool
bootdisk: scsi0
cores: 2
cpu: host
ide2: none,media=cdrom
memory: 4096
name: pdcpavpr012
net0: e1000=C6:87:19:7E:8E:A1,bridge=vmbr1257
numa: 0
onboot: 1
ostype: l24
scsi0: ZFSDisks:vm-40039-disk-1,cache=writeback,size=32G
smbios1: uuid=69a53e73-7f81-4b6e-ac93-875ab0a3e411
sockets: 4

to:
cat 40039.conf
#Production DC - Production - Procs - 012
#Shared VodProc Pool
args: -cpu host,vendor=GenuineIntel,+x2apic,+vme,+acpi,+ss,+tm,+pbe,+pdpe1gb,+rdtscp,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+fma,+xtpr,+pdcm,+pcid,+dca,+movbe,+xsave,+avx,+f16c,+rdrand,+abm,+3dnowprefetch,+fsgsbase,+bmi1,+hle,+avx2,+smep,+bmi2,+erms,+invpcid,+rtm,+rdseed,+adx
bootdisk: scsi0
cores: 2
cpu: host
ide2: none,media=cdrom
memory: 4096
name: pdcpavpr012
net0: e1000=C6:87:19:7E:8E:A1,bridge=vmbr1257
numa: 1
onboot: 1
ostype: l24
scsi0: ZFSDisks:vm-40039-disk-1,cache=writeback,size=32G
smbios1: uuid=69a53e73-7f81-4b6e-ac93-875ab0a3e411
sockets: 4

I also tried various other options but in the end, none of these flags are available.

Do I have to eneble these in the kernel maybe?

Thanks
Kobus

LnxBil · May 18, 2020

Kobus said:
I also tried various other options but in the end, none of these flags are available.

If you have cpu=host, there are already all flags that your CPU supports available in your VM, you cannot increase this beyond the limits of your CPU.

Kobus · May 18, 2020

LnxBil said:
If you have cpu=host, there are already all flags that your CPU supports available in your VM, you cannot increase this beyond the limits of your CPU.

Thank you.

Maybe I dont understand something here. I have a container that has and supports far more cpu flags than a vm with cpu type host. Why is that? Why cant I get the vm to support the same cpu flags as the container?

BobhWasatch · May 18, 2020

Some flags won't pass thru to a VM without doing something special (e.g. vmx/svm for nested virtualization). Do you know specifically which ones relate to your workload?

LnxBil · May 19, 2020

BobhWasatch said:
Some flags won't pass thru to a VM without doing something special (e.g. vmx/svm for nested virtualization).

You're partly right. Some are not passed through, some are completely new, but e.g. svm actually is passed through in my case. But all flags that are relevant for computing (sse, mmx, aes-ni) are also available.

Kobus · May 19, 2020

Last question please. How can I improve the VM performance so that it does not take 3 times longer than a container? I have tried so many things now, I just dont know where to go with this?

@BobhWasatch The flags I am most interested in are these:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf cpuid_faulting pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx xsaveopt cqm_llc cqm_occup_llc

We have containers that have these flags enabled and if I can get a VM to have the same as these, I am sure I will be a lot closer to a solution.
At the moment, if I use CPU=host I get these:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb lm constant_tsc arch_perfmon rep_good unfair_spinlock eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch xsaveopt invpcid_single pti retpoline fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx

Any other CPU type give me even less flags and some are not 3 times but 4 times slower.

Kobus

Proxmox vCPU flags

New Member

Famous Member

New Member

New Member

Famous Member

New Member

New Member

Distinguished Member

Famous Member

New Member

New Member

New Member

New Member

Distinguished Member

New Member

Distinguished Member

New Member

Famous Member

Distinguished Member

New Member

We value your privacy