32-bit OpenBSD VM unbearably slow after kernel 5.13

landryb

New Member
Jul 1, 2023
I run a Proxmox install on an old 24-core X5670 host with OpenBSD VMs, both i386 (32-bit) and amd64 (64-bit). All VMs had the default `kvm64` CPU type.

In March I upgraded from kernel 5.11 to 5.15 and the i386 VMs became super slow at runtime: startup takes something like 10 minutes where an amd64 VM takes 30 seconds, ssh takes ages, and the kvm process on the host gobbles CPU like crazy. At the time I think I tried changing the CPU type, the SCSI type, the machine type... to no avail.

At the time, rolling back the PVE kernel to 5.11 worked around the issue and I forgot about it.

I just upgraded the host from 7.4 to 8.0 (and from QEMU 7 to 8), moving to the 6.2 kernel at the same time... and the i386 VMs are slow again and consume host CPU like crazy.

So far I've tried the following on the 6.2 kernel, without any positive change:
- switching from i440fx to q35
- switching from kvm64 to kvm32, to qemu32, to the host CPU
- switching all SCSI configs to virtio-scsi

VM config (I have 3 i386 VMs; all exhibit the same behaviour):

Code:
bootdisk: virtio0
cores: 4
cpu: qemu32
machine: q35
memory: 4096
name: c32
net0: virtio=A2:6E:16:9F:42:81,bridge=vmbr2
numa: 0
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=0df36665-e125-42cd-a925-2a1519861922
sockets: 1
virtio0: local-lvm:vm-101-disk-1
virtio1: local-lvm:vm-101-disk-2

I will probably roll back to kernel 5.11 again to confirm it's a regression in the kernel version, but I wonder how I can debug this issue and where to start (what should I look for in the kernel changelogs?).
 
I can confirm that rolling back to 5.11.22-7-pve (still on Proxmox 8) gives me normal behaviour of my OpenBSD/i386 VMs, with the exact same qm config.

Code:
root@openbsd-amd64:~# pveversion -v
proxmox-ve: 8.0.1 (running kernel: 5.11.22-7-pve)
pve-manager: 8.0.3 (running version: 8.0.3/bbf3993334bfa916)
pve-kernel-6.2: 8.0.2
pve-kernel-5.15: 7.4-4
pve-kernel-6.1: 7.3-6
pve-kernel-5.11: 7.0-10
pve-kernel-6.2.16-3-pve: 6.2.16-3
pve-kernel-6.1.15-1-pve: 6.1.15-1
pve-kernel-5.15.108-1-pve: 5.15.108-1
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-qemu-kvm: 8.0.2-3
qemu-server: 8.0.6

Is there an archive of all 5.11/5.13/5.15 kernels that can be installed on 8.0 so that I can at least try bisecting? Or should I look for something more obvious elsewhere? I can of course start QEMU with debug mode/output if instructed how...
 
Hi,
is there an archive of all 5.11/5.13/5.15 kernels that can be installed on 8.0 so that i can at least try bisecting ?
For Proxmox VE 8, only the 6.2 kernel is packaged. You could manually install older kernels, but you'd need to pick them from the Proxmox VE 7 repository, and that's not recommended.

What you can also do is bisect with the Ubuntu mainline kernels: https://kernel.ubuntu.com/~kernel-ppa/mainline/ Maybe try the newest one first, to see if the regression has already been fixed? If you're lucky, the change happened in a minor version and is then easier to find in the git history, but in many cases you'd need to bisect all the way down to the problematic commit by building the kernel yourself.
 
Hi,

for Proxmox VE 8, only the 6.2 kernel is packaged. You could manually install older kernels, but you'd need to pick them from the Proxmox VE 7 repository, and that's not recommended.

Well, even if it's not recommended, at least I know how to manually fetch and install older pve-kernel debs, provided the Debian dependencies are okay.
What you can also do is bisect with the Ubuntu mainline kernels: https://kernel.ubuntu.com/~kernel-ppa/mainline/ Maybe try the newest one first, to see if the regression has already been fixed? If you're lucky, the change happened in a minor version and is then easier to find in the git history, but in many cases you'd need to bisect all the way down to the problematic commit by building the kernel yourself.
Given that the problem shows up after 5.11, I suspect nobody has cared much about i386/32-bit emulation since then... and I'm not going to go through the insane hassle of rebuilding kernels :) I would have expected/hoped someone at Proxmox might know where to look in QEMU itself, or could try/recommend various other QEMU/KVM options.

I'm not sure it matters, but on the kernel cmdline I had (from "back in the days...") added the kvm-intel.preemption_timer=0 option to fix problems with OpenBSD VMs some years ago. I'll have to check whether that kernel setting is related, and whether the problems it was supposed to solve have been fixed in the meantime. I have another ZFS-based PVE install on the same hardware; I can also install an i386 VM there to see whether the storage matters.
 
So I have more data points to add:
- on a second PVE install on the same hardware, but with ZFS, the problem shows up the same way with the latest/6.2 kernel
- I've tried booting OpenBSD/i386 kernels 7.1, 7.2 and 7.3: same behaviour, so it's not an OpenBSD regression afaict
- I've tried setting the CPU type to pentium, host or kvm32 -> no change on the i386 VMs
- I've removed kvm-intel.preemption_timer=0 from the kernel cmdline -> no change on the i386 VMs

That kernel option was originally added to work around a problem with clock drift and hangs, cf:
- https://marc.info/?l=openbsd-tech&m=151807652801188&w=2
- https://marc.info/?l=openbsd-misc&m=164997171012392&w=2
- https://marc.info/?l=openbsd-misc&m=151605213329615&w=2
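As a side note, whether that workaround is currently in effect can be checked at runtime via sysfs; this is just a sketch, since the parameter file only exists while the kvm_intel module is loaded:

```shell
#!/bin/sh
# Check whether the kvm-intel preemption-timer workaround is active.
# The parameter file only exists while the kvm_intel module is loaded.
P=/sys/module/kvm_intel/parameters/preemption_timer
if [ -r "$P" ]; then
    echo "preemption_timer=$(cat "$P")"   # "N" means preemption_timer=0 took effect
else
    echo "kvm_intel not loaded"
fi
```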

I've started bisecting host/PVE kernels, taking Debian packages from http://download.proxmox.com/debian/pve/dists/bullseye/pve-no-subscription/binary-amd64/:
- 5.11.22-7 is ok (last 5.11 kernel)
- 5.13.14-1 is ok (first 5.13 kernel)
- 5.13.19-6 is ok (last 5.13 kernel)
- 5.15.12-1 is NOT ok (first 5.15 kernel)
- 5.15.108-1 is NOT ok (last 5.15 kernel)
- 6.2.16-3 is NOT ok (current 6.2 kernel)

Afaict the 'regression window' is now between the 5.13 and 5.15 branches..
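When picking further kernels to test inside that window, a tiny helper can do the version comparison; this is just a sketch using `sort -V` from GNU coreutils, with the boundaries taken from the results above:

```shell
#!/bin/sh
# Report whether a candidate kernel version falls strictly between the
# last known-good and first known-bad builds from the bisect above.
LAST_GOOD=5.13.19-6
FIRST_BAD=5.15.12-1

between() {
    [ "$1" = "$LAST_GOOD" ] && return 1
    [ "$1" = "$FIRST_BAD" ] && return 1
    # version-sort each pair: $1 is inside the window iff it sorts after
    # LAST_GOOD and before FIRST_BAD
    [ "$(printf '%s\n%s\n' "$LAST_GOOD" "$1" | sort -V | head -n1)" = "$LAST_GOOD" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$FIRST_BAD" | sort -V | head -n1)" = "$1" ]
}

for v in 5.14.1-1 5.13.14-1 6.2.16-3; do
    if between "$v"; then
        echo "$v: inside the window, worth testing"
    else
        echo "$v: outside the window"
    fi
done
```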
 
Afaict the 'regression window' is now between the 5.13 and 5.15 branches..
Yes, but that is a huge window unfortunately.

To reduce it further, as already said:
What you can also do is bisect with the Ubuntu mainline kernels: https://kernel.ubuntu.com/~kernel-ppa/mainline/ Maybe try the newest one first, to see if the regression has already been fixed? If you're lucky, the change happened in a minor version and is then easier to find in the git history, but in many cases you'd need to bisect all the way down to the problematic commit by building the kernel yourself.
 
Yes, but that is a huge window unfortunately.

I've started having a look at:
- https://kernelnewbies.org/Linux_5.14
- https://kernelnewbies.org/Linux_5.15
- https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14
- https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15

Looking for KVM-related commits, some potential candidates:
- KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernels: https://git.kernel.org/pub/scm/linu.../?id=1383279c6494c6b62d1d6939f34906a4d2ef721c (dunno if that can be related, cpu has nx flag on the host)
- https://git.kernel.org/pub/scm/linu.../?id=36824f198c621cebeb22966b5e244378fa341295
- https://git.kernel.org/pub/scm/linu.../?id=192ad3c27a4895ee4b2fa31c5b54a932f5bb08c1

but I wouldn't know what to do with that list of commits..

To reduce it further, as already said:

Well... are those Ubuntu kernels equivalent to the pve-kernel package? Same build options/no exotic things? I guess I need both the linux-image and linux-modules packages?
 
Looking for KVM-related commits, some potential candidates:
- KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernels: https://git.kernel.org/pub/scm/linu.../?id=1383279c6494c6b62d1d6939f34906a4d2ef721c (dunno if that can be related, cpu has nx flag on the host)
- https://git.kernel.org/pub/scm/linu.../?id=36824f198c621cebeb22966b5e244378fa341295
- https://git.kernel.org/pub/scm/linu.../?id=192ad3c27a4895ee4b2fa31c5b54a932f5bb08c1

but I wouldn't know what to do with that list of commits..
For the stand-alone commit you found, you could build a custom kernel, checking out the source tree once before the commit and once after it, and test both. If you're lucky, this is the problematic change.

Well... are those Ubuntu kernels equivalent to the pve-kernel package? Same build options/no exotic things?
Not entirely, but I'd be surprised if it mattered much. Otherwise, we'd know that a difference in the config is at fault, which is also progress ;)

I guess I need both the linux-image and linux-modules packages?
Yes.
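For what it's worth, fetching one mainline build could look roughly like this. This is only a sketch: the directory layout on kernel.ubuntu.com and the exact .deb file names vary per release, so verify them in a browser first.

```shell
#!/bin/sh
# Print the steps to fetch and install one Ubuntu mainline kernel build.
# File names below are illustrative of the naming scheme, not exact.
BASE=https://kernel.ubuntu.com/~kernel-ppa/mainline

plan() {
    ver="$1"
    echo "wget -r -np -nd -A '*.deb' $BASE/v$ver/amd64/"
    echo "dpkg -i linux-image-unsigned-$ver*.deb linux-modules-$ver*.deb"
    echo "update-grub"
}

plan 5.14
```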
 
Whichever version of OpenBSD/i386 is slow, you can try 7.4 from here: https://ftp.fr.openbsd.org/pub/OpenBSD/7.4/i386/install74.iso

I've kept the Proxmox host where I run a bunch of i386 VMs on Linux kernel 5.13 for now, as 5.15 was the first kernel showing the issue; my other Proxmox instances are on more recent kernels (6.2, 6.5...).

Sorry, I was pretty sure I had looked in Bugzilla before posting here, but apparently I missed the already-open issue.
 
Ouch, I missed that. I guess since I do not use OpenBSD/FreeBSD besides pfSense, I did not remember that I had also come across that one.

And it's not mentioned here either:

https://www.sfritsch.de/openbsd-virtualization.html

I installed OpenBSD 7.4 i386 in a VM on my Futro S740 mini PC (Celeron J4105) and I cannot see an orders-of-magnitude performance difference:

Code:
linux(host):
# time(for i in {1..1000} ; do dd if=/dev/zero of=/dev/null bs=1024k count=1 >/dev/null 2>&1 ; done)
real    0m3.905s
user    0m1.191s
sys    0m2.905s

openbsd(vm):
# time(for i in {1..1000} ; do dd if=/dev/zero of=/dev/null bs=1024k count=1 >/dev/null 2>&1 ; done)

real    0m5.514s
user    0m0.330s
sys    0m4.660s


Any advice on which command would better show the performance difference?

> and the kvm process on the host gobbles CPU like crazy

my VM sits at <=10% host CPU when idle, with 2 vCPUs
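A purely CPU-bound loop might separate emulation overhead from I/O cost better than dd, which mostly measures syscall and block-layer overhead. A sketch (run the same line on the host and in the guest and compare the real times):

```shell
#!/bin/sh
# CPU-bound shell arithmetic loop: no I/O in the hot path, so the
# elapsed time mostly reflects raw (emulated) CPU speed.
time sh -c 'i=0; while [ "$i" -lt 100000 ]; do i=$((i+1)); done; echo "counted to $i"'
```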
 
Just booting the VM was unbearably slow in my experience; e.g. from the time the kernel messages in blue are done until you get a login prompt takes several minutes, while it should be ~5s under normal conditions.

I think I've created test VMs on 3 Proxmox hosts with very different hardware, and all showed the issue.

I hadn't thought about the USB emulation; I'll have to give that option a try.
 
I can confirm that my two OpenBSD/i386 VMs now work normally on top of the latest Proxmox 6.5.13-1-pve kernel, once the disable acpimadt/disable mpbios trick is applied.
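For reference, this is my recollection of that trick; check the OpenBSD boot(8) and config(8) man pages before relying on it, and note that disabling acpimadt/mpbios leaves the guest kernel without MP tables, i.e. effectively single-CPU:

```
boot> boot -c            # enter the User Kernel Config at the boot prompt
ukc> disable acpimadt
ukc> disable mpbios
ukc> quit                # continue booting with those devices disabled

# to make it persistent on the installed kernel, from the running system:
# config -ef /bsd       (same disable commands, then "quit" to save)
```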
 
