We're having a VM freeze/CPU usage issue on our Proxmox server.
We are running a program in the host OS, and another program inside a KVM virtual machine.
We send the output of the host program to the program in the VM using a host-only bridge interface. We send the data using multicast.
There is a high volume of traffic flowing from the host to the VM (around 600 Mbps).
While the system is running, when I pull up 'top' on the host and switch to thread mode by hitting 'H', I can see that the main KVM thread uses 99.9% of a core. The VM is receiving the network traffic and doing a lot of disk I/O.
The issue is that once in a while the VM seems to lock up for half a second or so. When the VM comes back, I get a CPU spike as it tries to make up for lost time. During these outages, I also get packet loss on the network bridge interface.
My hunch is that the main KVM thread is trying to do too much, since it is using 99.9% of a core, and that once in a while this causes the entire VM to lock up.
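For anyone who wants to look at the same per-thread picture without 'top', here is a rough sketch that reads cumulative CPU ticks per thread straight from /proc. The function name thread_cpu_ticks is made up for illustration. The VMID 100 (guessed from the tap100i1 name) and the pidfile path in the usage comment are assumptions.

```shell
# Print "TID TICKS NAME" for every thread of a process, busiest first.
# TICKS is utime + stime from /proc/<pid>/task/<tid>/stat (cumulative
# clock ticks since the thread started, not a percentage).
thread_cpu_ticks() {
    pid="$1"
    for task in /proc/"$pid"/task/*; do
        [ -r "$task/stat" ] || continue
        tid="${task##*/}"
        name="$(cat "$task/comm" 2>/dev/null)"
        stat="$(cat "$task/stat" 2>/dev/null)" || continue
        [ -n "$stat" ] || continue
        # Drop the leading "pid (comm) " so that a thread name containing
        # spaces or parentheses cannot shift the field positions.
        rest="${stat##*") "}"
        # After the strip: $1 = state, $12 = utime, $13 = stime.
        set -- $rest
        echo "$tid $(( ${12} + ${13} )) $name"
    done | sort -k2,2 -nr
}

# Usage (assumed VMID and pidfile location):
# thread_cpu_ticks "$(cat /var/run/qemu-server/100.pid)"
```

Running it twice a few seconds apart and comparing the tick counts shows which thread is actually accumulating CPU time.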
I have tried quite a few things to help:
- giving the VM exclusive access to its cores (by pinning the VM to specific cores and setting the host OS not to schedule onto those cores).
- using aio=io_uring and setting iothread=1 for disk access.
- for the network, turning multiqueue on with the queue count set to the total number of cores I've assigned to the VM (22 cores).
- using virtio for the GPU, SCSI controller, and network devices.
- trying both an OVS bridge and the default Linux bridge.
So far none of these things has helped. I have been able to work around the issue by setting the txqueuelen on the bridge port to 4096 (with the command ip link set tap100i1 txqueuelen 4096), but this seems less than ideal.
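To check whether a queue length change makes a difference, the interface drop counters can be read from sysfs before and after a freeze. This is only a sketch: the function name iface_drops is made up for illustration, and it assumes the lost packets show up in the tap device's tx_dropped or rx_dropped counters, which I have not confirmed for this setup.

```shell
# Print the TX queue length and drop counters for a network interface.
# Returns non-zero if the interface does not exist.
iface_drops() {
    dev="$1"
    base="/sys/class/net/$dev"
    if [ ! -d "$base" ]; then
        echo "no such interface: $dev" >&2
        return 1
    fi
    echo "tx_queue_len=$(cat "$base/tx_queue_len")"
    echo "tx_dropped=$(cat "$base/statistics/tx_dropped")"
    echo "rx_dropped=$(cat "$base/statistics/rx_dropped")"
}

# Usage (interface name taken from the ip link command above):
# iface_drops tap100i1
```

If tx_dropped stops growing once txqueuelen is raised, that would suggest the larger queue is just absorbing the bursts that build up while the VM is stalled, rather than fixing the stall itself.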
I'm wondering if anyone has ideas or suggestions about what may be going on, and whether there is a way to get the main KVM thread to use less CPU.
Thanks for any help.
Mark S