Redhat VirtIO developers would like to coordinate with Proxmox devs re: "[vioscsi] Reset to device ... system unresponsive"

@liim

Using IDE / SATA disks has been a workaround for some time.

The VM config isn't much use since you moved away from vioscsi...
Having said that, I presumed you were still using aio=io_uring or aio=native with vioscsi, and so were exposed to the QEMU global mutex.
This means other operations will share the thread with your disk.
Perhaps the scrubbing was the root cause...?
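For reference, this is roughly what the relevant part of the config looks like with the disk on its own I/O thread - a minimal sketch only, with VMID 100 and the local-zfs storage/volume names as placeholders:

  # Hypothetical excerpt of `qm config 100` (VMID, storage and size are placeholders)
  scsihw: virtio-scsi-single
  # iothread=1 gives this disk a dedicated I/O thread instead of the main QEMU thread
  scsi0: local-zfs:vm-100-disk-0,iothread=1,aio=io_uring,size=32G

Without iothread=1, disk I/O is submitted from the main QEMU thread and competes with everything else running under the global mutex.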

The lack of swap might rule out memory pressure from using a relatively large ARC size.

You mentioned VirtIO Block. This is the viostor driver, not vioscsi.
You need to use VirtIO SCSI single, not VirtIO Block...!
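For anyone needing to make that switch, a rough sketch from the host side (VMID 100 and the volume name are placeholders, and the vioscsi driver must already be installed in the guest before changing the bus):

  # Sketch only: re-attach the existing volume as a SCSI disk under VirtIO SCSI single
  qm set 100 --scsihw virtio-scsi-single
  qm set 100 --delete virtio0                            # old VirtIO Block entry becomes an unused disk
  qm set 100 --scsi0 local-zfs:vm-100-disk-0,iothread=1
  qm set 100 --boot order=scsi0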

I'm still inclined to think your issue is something else... Given the pressure has dropped off, maybe we will never know...

If you had a screenshot of one of the errors, that would be helpful too.
I presume you didn't get any kvm errors in the PVE journal. This is a distinctive feature of this problem.
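A rough way to check for them (the exact wording of the kvm error lines varies, so this is only a sketch):

  # On the PVE host: search the current boot's journal for kvm/QEMU related messages
  journalctl -b | grep -iE 'kvm|qemu'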
 
Switched VM OS disk from VirtIO Block to IDE
Neither of those uses the fixed driver.
VirtIO Block isn't affected because it isn't QEMU-thread independent yet.
Only the SCSI disk type + VirtIO SCSI Single controller combination is fixed in this thread.
IDE and SATA don't use VirtIO drivers at all; they aren't impacted because they are slower than VirtIO SCSI, and they aren't QEMU-thread independent either.
 
Version 100.85.104.20800 installed now.

I made a few other changes since that last post.
  • Stopped scrubbing runs on both Proxmox servers - this seemed to be having the most impact, and is probably the cause of the OS lockups (see the zpool scrub commands sketched after this post)
  • Switched the VM OS disk from VirtIO Block to IDE (this has no requirement to be fast, it just needs to be reliable) - there had been no reset warnings related to this
  • Downgraded the vioscsi driver to 208 as mentioned
I was still seeing errors (1/min), but interestingly the general disk stability was a lot better, avoiding the 1-minute pauses seen previously. The backups are now working, so the pressure to fix this has dropped off substantially.
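For anyone wanting to do the same, pausing or cancelling a scrub on the host is done per pool - shown here as a sketch, with rpool as a placeholder pool name:

  zpool status rpool       # shows whether a scrub is running and its progress
  zpool scrub -p rpool     # pause a running scrub (resume later with: zpool scrub rpool)
  zpool scrub -s rpool     # stop/cancel the scrub entirely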
I'd tend to agree with the other suggestions in this thread that this might rather be a performance problem of the underlying storage. I'd expect a RAIDZ2 pool with spinning disks to be quite slow, and the ahcistor warnings when attaching the disks via SATA (the VirtIO SCSI/Block guest drivers wouldn't be involved in that case), as well as the fact that the issues improve when scrubbing on the host is paused, also point in that direction.

If you'd like to debug this further -- can you check the I/O pressure in /proc/pressure/io [1] while the VM is running / while you are seeing the issues? Also, would it be possible to temporarily move the VM disks to a fast local storage (like a local SSD or NVMe) and see if you still see issues then? If you'd like to look into this further, it would be great if you could open a new thread -- feel free to reference it here.

[1] https://facebookmicrosites.github.io/psi/docs/overview
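As a sketch of what that check could look like while reproducing the issue:

  # On the PVE host: sample I/O pressure once per second while the VM is under load
  # 'some' = share of time at least one task was stalled on I/O
  # 'full' = share of time all non-idle tasks were stalled on I/O
  watch -n 1 cat /proc/pressure/io

Sustained non-zero 'full' averages while the VM is slow would point to the storage being saturated.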
 
I regret to share that I am using Proxmox VE 8.3 with spinning disks and kernel 6.8, and despite varying the VM configurations, there is consistently a very high I/O delay. This issue is particularly noticeable during operations involving both network and disk activity, such as backups, restores, and snapshots.
 
Are these Windows VMs using VirtIO SCSI, and if yes, do you also see the device resets discussed in this thread in the Windows event viewer? The issue rather sounds like the underlying storage may be the culprit. Could you please open a new thread and provide some more details, including the output of pveversion -v, the config of an affected VM (the output of qm config VMID), the storage configuration (the contents of /etc/pve/storage.cfg) and some more details on your storage setup?
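For reference, the requested information can be collected like this (100 stands in for the affected VMID):

  pveversion -v                # package versions
  qm config 100                # config of an affected VM
  cat /etc/pve/storage.cfg     # storage configuration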
 
These are Linux-based VMs only. All of the VMs experienced significant I/O wait issues, but the Proxmox host itself does not appear to be affected.

After thorough testing of system procedures such as backups, restores, and snapshots, all performed on the same storage, the results were clear: Proxmox 7 does not exhibit I/O wait problems, while Proxmox 8 consistently shows these issues regardless of the kernel version. I tested multiple kernels (6.2, 6.5, 6.8, and 6.11) with Proxmox 8, all using the same storage, and the I/O wait problem persisted across all configurations.

This is deeply frustrating, as it raises concerns about the reliability of Proxmox 8 and has led me to consider abandoning Proxmox entirely.
 
If your issue is not related (you have only Linux VMs, and this topic is specifically about the VirtIO drivers for Windows VMs), please open a new thread.
Also post all the information requested by @fweber; without that information it is impossible to help find the cause of your issue.
 
