I run KVM virtual machines on PVE and see very high I/O wait (steal time, st) on a virtual machine, sustained at up to 90%

丶小樱

New Member
Dec 8, 2024
I run KVM virtual machines on PVE and found very high I/O wait (shown as steal time, st) on a virtual machine, sustained at up to 90%. Is this caused by cache=writeback? Should it be changed to cache=writethrough? If so, does the change have to be made on the host node and on all VMs?

https://pve.proxmox.com/wiki/Performance_Tweaks

Attachments: 1.png, 2.png, 3.png, 4.gif

The following report was sent to me by a user. I have confirmed the same behavior on multiple host nodes. The VMs frequently stall in I/O wait, which leads to kernel soft lockups:
Message from syslogd@vm906233 at Dec 5 13:31:08 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 49s! [kworker/0:2:25910]
Message from syslogd@vm906233 at Dec 5 13:37:13 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 24s! [kworker/0:3:26358]

What should I do to restore availability?
Hello, I found that the st (steal time) value of the server is very high,
which pushes the system load to about 10. The normal system load should be
0.00-0.05.
The current average st share is 90%.
#90xxxx-xxxxx
server ip 45.140.xxx.xxx
[login credentials redacted]

Top Task Manager Wiki README
0.0% st: the percentage of CPU time taken away from this virtual machine
by the hypervisor (steal time), for example because the host is oversold
The st value should always remain at 0.0%. If there is any
fluctuation, it means that the host is oversold.
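The thread uses "I/O wait" and "steal time" almost interchangeably, but `top` reports them as two separate counters (`wa` and `st`). A minimal sketch for measuring both inside the guest, assuming a Linux VM and the standard field order of the aggregate `cpu` line in `/proc/stat`; the function names are made up for illustration:

```python
# Field order of the aggregate "cpu" line in /proc/stat (Linux).
# The trailing guest/guest_nice fields are skipped on purpose because
# the kernel already counts them inside user/nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into a dict of cumulative ticks."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Return each field's share (in percent) of the CPU time between two samples."""
    b, a = parse_cpu_line(before), parse_cpu_line(after)
    deltas = {name: a[name] - b[name] for name in b}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as fh:
        return fh.readline()
```

Calling `read_cpu_line()` twice a few seconds apart and passing both lines to `cpu_shares` gives separate `iowait` and `steal` percentages. A high `steal` value points at CPU contention on the host, while a high `iowait` value points at slow storage, so knowing which of the two is at 90% narrows down where to look.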

Dear [Service Provider],

I hope this message finds you well.

We have encountered severe performance issues with your service, and after researching Proxmox VE documentation, we suspect the issue is due to the host machine's cache settings. The current setting, `cache=writeback`, seems to be causing significant performance degradation. I strongly recommend switching all VMs to `cache=writethrough` to improve performance.

In a Proxmox VE environment, cache settings greatly impact storage performance. The `cache=writeback` option caches I/O operations in memory, making writes appear faster. However, this can lead to increased I/O wait times (Steal Time, st) under high loads, which is contributing to the high load issues we're experiencing.

According to official performance optimization documentation, the `cache=writethrough` mode is more reliable, because a write operation is acknowledged to the guest only after it has reached the storage device. While it may slightly decrease apparent I/O speed, it significantly reduces system load over the long term and improves stability and reliability. This is especially important in virtualized environments, as it maximizes data consistency and durability, avoiding data loss risks from uncommitted cache writes.

Moreover, our analysis shows that the `cache=writeback` setting leads to high CPU usage on the host machine, particularly during heavy data requests, which significantly increases the st value. This indicates uneven resource consumption, affecting the normal operation of the VPS and user experience.

To ensure system stability and performance enhancement, we suggest promptly adjusting the VM configurations on the host to `cache=writethrough`. This change should alleviate the current high load issues and ensure reliable future operations. Additionally, we recommend monitoring the system after making these changes to ensure the desired outcomes. Please feel free to reach out if you have further questions.

Thank you for your attention and cooperation.
 
Hi,
I'd recommend turning on the iothread option for the VM disks, otherwise the IO load is handled by the QEMU main thread and that can cause virtual CPUs to get stuck.
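As a rough illustration of that suggestion, here is a minimal sketch that builds (but does not run) a `qm set` command enabling `iothread=1` on an existing disk. It assumes the disk line is copied from `qm config <vmid>`; the helper names, VM ID, and volume name are hypothetical, and for SCSI disks it assumes the controller should be switched to `virtio-scsi-single`:

```python
def with_iothread(disk_value):
    """Return the disk option string with iothread=1 set exactly once."""
    volume, *options = disk_value.split(",")
    # Drop any existing iothread setting so the option is not duplicated.
    kept = [opt for opt in options if not opt.startswith("iothread=")]
    return ",".join([volume, *kept, "iothread=1"])


def qm_set_args(vmid, disk_key, disk_value):
    """Build the argument list for 'qm set'; nothing is executed here."""
    args = ["qm", "set", str(vmid)]
    if disk_key.startswith("scsi"):
        # Assumption: SCSI disks use the VirtIO SCSI single controller
        # so that each disk can get its own I/O thread.
        args += ["--scsihw", "virtio-scsi-single"]
    args += [f"--{disk_key}", with_iothread(disk_value)]
    return args


# Hypothetical VM ID and volume, as they might appear in 'qm config 100'.
print(" ".join(qm_set_args(100, "scsi0", "local-lvm:vm-100-disk-0,cache=writeback,size=32G")))
```

The sketch leaves the existing `cache=` option untouched, since the reply only recommends the iothread change. Review the printed command before running it on a host, and expect the VM to need a full stop and start before the new setting applies.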
 
