High HDD activity kills network throughput with VirtIO SCSI

damarrin

Oct 11, 2022
I'm running Proxmox 7 (upgraded from 6) on an HPE MicroServer Gen8 with a Xeon E3-1260L. On Proxmox I'm running IPFire in a VM as my main router, with VirtIO SCSI set as the SCSI controller.

Every 20 seconds or so Proxmox generates high HDD activity for a few seconds and this kills network throughput for IPFire.

As far as I can tell, the command that generates the activity is
Code:
kvm -id [...] -name [...] -no-shutdown -chardev socket,id=qmp,path=/var/ru~ -machine type=pc+pve0 -device virtio-rng-pci,id=rng0,bus=pci.0,addr=0x17
...and that's as much as fits on the screen in iotop.

I found this out while trying to stream games to a PS5 from PlayStation Plus. Whenever this happens, the stream breaks up and the console complains of a poor network connection. This always coincides with the server showing high HDD activity and the HDD audibly crunching. I had seen my internet connection hesitate from time to time before, but put it down to Wi-Fi being unreliable. The streaming brought the problem out in full, and it runs entirely over cable.
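To put timestamps on the bursts instead of going by ear, the write counters in /proc/diskstats could be sampled once a second and compared with when the stream stutters. A minimal sketch, assuming the disk shows up as "sda" (a guess, adjust to the real device) and treating more than about 1 MiB written per second as a burst (an arbitrary threshold):

```python
import time


def sectors_written(diskstats_text: str) -> dict[str, int]:
    """Map device name -> total sectors written, parsed from /proc/diskstats text."""
    totals: dict[str, int] = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 10:
            # fields[2] is the device name, fields[9] is sectors written
            totals[fields[2]] = int(fields[9])
    return totals


def write_deltas(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Sectors written per device between two snapshots."""
    return {dev: after[dev] - before[dev] for dev in after if dev in before}


def watch(device: str = "sda", interval: float = 1.0,
          threshold: int = 2048, samples: int = 60) -> None:
    """Print a timestamp whenever `device` writes more than `threshold` sectors
    in one interval. Sectors in /proc/diskstats are 512 bytes, so 2048 is 1 MiB."""
    with open("/proc/diskstats") as f:
        previous = sectors_written(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open("/proc/diskstats") as f:
            current = sectors_written(f.read())
        delta = write_deltas(previous, current).get(device, 0)
        if delta > threshold:
            print(time.strftime("%H:%M:%S"), device, f"{delta * 512 / 1e6:.1f} MB written")
        previous = current
```

Calling `watch("sda")` on the host while a stream is running would give a list of burst times to line up against the moments the console reports a poor connection.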

Setting the SCSI controller in the VM to LSI 53C895A fixes the problem, BUT (and it's a really big but) this causes a CPU soft lockup in the VM after a week or so, so it isn't really a solution. I also tried VirtIO SCSI single, but that didn't help either.

Any ideas or help would be much appreciated.
 
Please share your VM config, i.e. the output of qm config <VMID>.
Does the high HDD activity happen only when the VM is running, or also when it's shut down?
 
Thank you very much for your response.

Here's the output of qm config:
Code:
args: -device virtio-rng-pci,id=rng0,bus=pci.0,addr=0x17
boot: order=scsi0;net0;net1
cores: 2
memory: 2048
name: ipfire
net0: virtio=9E:B8:C9:01:AE:30,bridge=vmbr0
net1: virtio=A2:AE:E0:F6:44:19,bridge=vmbr1
numa: 0
onboot: 1
ostype: l26
scsi0: wd160gb-lvm:vm-102-disk-1,size=40G
smbios1: uuid=029984e9-c930-4a74-a5c2-661f29c157e3
sockets: 1

The (possibly?) interesting thing is that in args the device is virtio-rng-pci even though the SCSI controller is now set to LSI. IDK if that's significant or not.

The high disk activity doesn't happen when the VM is off.
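For what it's worth, the output above has no scsihw line at all. As far as I understand the Proxmox documentation, that key is simply omitted when the controller is the default LSI 53C895A ("lsi"), which would match the current setting. A small sketch of checking this from saved qm config output (the helper names are made up, and the "lsi" fallback is my reading of the docs rather than something verified on this machine):

```python
def parse_qm_config(text: str) -> dict[str, str]:
    """Parse 'key: value' lines as printed by `qm config <VMID>` into a dict."""
    config: dict[str, str] = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out lines
        # Split on the first colon only, so MAC addresses and volume IDs survive.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def effective_scsihw(config: dict[str, str]) -> str:
    """Return the SCSI controller type, falling back to the documented default."""
    return config.get("scsihw", "lsi")  # assumption: "lsi" is the default


sample = """\
args: -device virtio-rng-pci,id=rng0,bus=pci.0,addr=0x17
net0: virtio=9E:B8:C9:01:AE:30,bridge=vmbr0
scsi0: wd160gb-lvm:vm-102-disk-1,size=40G
"""
print(effective_scsihw(parse_qm_config(sample)))
```

With the controller switched back to VirtIO SCSI, I'd expect the same check to report virtio-scsi-pci instead.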
 
Do you need the RNG device? Maybe try removing it and see if the issue persists.

The (possibly?) interesting thing is that in args the device is virtio-rng-pci even though the SCSI controller is now set to LSI. IDK if that's significant or not.
Correct me if I'm wrong, but the args line was added by you manually. If you add an RNG device through the GUI, it should show up as rng0: in the config.
Either way, the device is virtio-rng-pci because that's QEMU's RNG device, and it isn't related to the SCSI controller.
 
Oh, right, this is the random number generator, isn't it? IIRC I did not add the line itself manually, but IPFire does need it, otherwise it takes an age to start while waiting for entropy. I don't remember what procedure I used; it was years ago. I'll try without it and report back.
 
So I commented out the args line but the issue persists. Every time the kvm command runs and writes to the HDD (it no longer includes the rng device) network transfer dies.
 
Hello again,

I'm still no closer to solving this. Perhaps there's a way to stop the virtual LSI controller locking up the CPU periodically?
 
