How to statically set CPU affinity for a VM?

mmenaz

Hi, I have a problem and, as a workaround, I need to set CPU affinity.
The system has 2 physical CPUs without hyperthreading, so I have 8 "CPUs".
I have 2 KVM Win2003 VMs, each configured with 4 cores.
Yesterday I had really bad performance, and htop showed the first 4 cores at 10% while the last 4 were very often at 100% (the Proxmox web interface showed 50-60%, of course, that being the mean across the 2 CPUs). One of the VMs seems to be the problem, but probably only because it is the one that really does some hard work.
Also suspicious is that processes are "balanced" across CPUs 1-8, but 100% is reached only by the last 4. I opened the chassis and checked the airflow and fans, but they seem to work.
With htop I moved the tasks of VM1 (the heavier workload) onto the first 4 cores and those of VM2 onto the last 4, and everything works smoothly now, but, at least as it seems to me, other processes are created from time to time that have no affinity set.
Also, if I reboot, the affinity is lost.
To make it short, what I am asking is: what can I set in Proxmox so that all KVM processes of VM1 are automatically pinned to CPU1 (cores 1-4) and those of VM2 to CPU2?
It has to work at boot and apply to every new process that VMx creates.
Thanks a lot!

Code:
 pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.7-9
pve-kernel-2.6.32-4-pve: 2.6.32-30
pve-kernel-2.6.35-1-pve: 2.6.35-9
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4

Intel(R) Xeon(R) CPU  E5506  2.13GHz
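In case it helps, this is roughly what I do by hand today, wrapped in a script (only a sketch: I am assuming qemu-server keeps its pidfile in /var/run/qemu-server/<vmid>.pid, which is what I see on this version, and the script must be re-run to catch threads the VM spawns later):

Code:
#!/bin/sh
# pin-vm.sh - pin every thread of a running KVM guest to a CPU list
# usage: pin-vm.sh <vmid> <cpulist>   e.g.  pin-vm.sh 101 0-3
VMID="$1"
CPUS="$2"
# assumed pidfile location for this qemu-server version
PID=$(cat "/var/run/qemu-server/${VMID}.pid") || exit 1
# taskset -p changes one thread at a time, so walk /proc/<pid>/task
# to reach the vcpu and I/O threads the VM has already created
for TID in "/proc/${PID}/task/"*; do
    taskset -cp "$CPUS" "${TID##*/}"
done

What I would like is for Proxmox itself to apply something like this at VM start, so it also survives reboots.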
 
Unfortunately, at least with 2.6.32 at installation time, the kernel did not recognize "RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)", so Proxmox was installed on an AHCI SATA disk; after the kernel upgrade to 2.6.35 the LVM RAID10 storage was seen (kernel module in use: megaraid_sas).
I ask: is it normal that when a VM goes crazy, the whole system is brought to its knees even though half of the resources are free? Would a 2.6.32 RHEL6.x kernel fix it?
Also problematic is that during the daytime we can't turn off the server, and testing a kernel change remotely is too risky.
In my limited experience I seem to have found a paradox:
the bigger the customer:
- the more complex the IT structure and its interconnections
- the bigger the data (e.g. I have a VM with 300GB)
- the higher the availability required
BUT
- with big data, backup/restore/move is damn slow (several hours)
- you have almost no opportunity to experiment and try to prevent problems through updates
- if something goes wrong, a lot of related systems go down too
- if something goes wrong, you are in a panic and risk a lot of money
- the customer in any case wants to spend very little money, so forget external storage and 2 clustered servers
 
First, if you do business with Proxmox VE you should think about buying server support subscriptions. Also use only well-known working components; never use outdated packages.

If you do not want to spend money on support, you can't expect support outside this community forum.

And again, 2.6.35 is end of life, so don't use it anymore; no one here will do tests and updates for this combination.
 
You are right, but my boss has a vision of Free software that I really don't like, and does not invest in it.
I made a (small) donation to Proxmox some time ago from my own pocket, because I use it at home and feel that it is, above all, the morally correct thing to do. But morals apart, investing in FOSS is the best investment, IMHO. We have to change people's mentality, split after decades of proprietary-software poisoning into "programs you pay for and programs you don't", and transform it into "programs that belong to others and programs you are part of and 'have' to help and contribute to".
Maybe, just an idea, you (or we, as a community) should prepare some "marketing material" to help people focus on Proxmox's strong points, to make "sellers" well aware of the target market they can reach with Proxmox, and of the opportunity to raise the right amount of money and give the right share to the Proxmox team as a reward for their work and future improvements.
Also, some users' case descriptions of good example installations could help.
Just an idea.
Thanks a lot!
 
Just for the record: the whole problem was probably due to the BBU no longer working and the LSI RAID controller falling back to "write through" mode; FSYNCS/SECOND dropped from the usual 2600 to... 80!
I don't know if there is a simple way from the Proxmox shell to see whether high CPU is due to a VM waiting for I/O (the I/O delay graph was not high, under 10; I wonder whether it monitors the LVM storage too, or only the pve root partition).
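In case someone else lands here, two checks from the shell that would have shown the storage (and not the CPU) was the problem. pveperf accepts a path, so it can be pointed at a directory on the LVM storage instead of the root filesystem; /var/lib/vz below is only an example, and iostat comes from the sysstat package:

Code:
# fsync rate on the filesystem actually backing the VMs, not only /
pveperf /var/lib/vz
# per-device utilization and wait times; high %util and await point at storage
iostat -x 2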
 
You should monitor your RAID card. I do not know the management software for this card in detail, but if the BBU has a problem you should be informed, e.g. via email. Adaptec RAID cards can do this, as far as I know.
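For the LSI card in this thread, something like the following cron job could send that mail. A rough sketch only: it assumes LSI's MegaCli binary is installed at its usual path, and the exact output wording can differ between firmware versions:

Code:
#!/bin/sh
# daily BBU check for a MegaRAID SAS controller; mails root on trouble
MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64   # adjust to your install path
STATUS=$("$MEGACLI" -AdpBbuCmd -GetBbuStatus -aALL)
# the "Battery Replacement required" wording is assumed; check your firmware
echo "$STATUS" | grep -qi "Battery Replacement required.*Yes" && \
    echo "$STATUS" | mail -s "RAID BBU problem on $(hostname)" root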
 
I'm doing some research. I'm looking for a Free (as in freedom) solution, and I've found that there is the dpt-i2o-raidutils package for Adaptec and mpt-status for LSI (these are the controllers we have been using so far, sold by Fujitsu with the servers). I'll do some tests and report back. I'm surprised that Adaptec (I was just researching the 6405 and the AFM module, which should be more reliable and longer-lasting than a BBU) releases Free drivers but, it seems so far, no Free monitoring utilities (binary only, and only as rpm).
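First check before any test, to be sure the tool matches the driver actually in use (mpt-status targets the Fusion-MPT family, so it may not even apply to our megaraid_sas card):

Code:
# which kernel driver is bound to the RAID controller?
lspci -k | grep -B1 -A2 -i raid
lsmod | grep -E 'megaraid|mpt'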
Thanks for your attention :)