Guys, this thread is about the CPU stalls / hung tasks that happen in KVM guests when there is high IO load on the host. I have no idea why you are hijacking it to talk about the effects of double COW, but rest assured: the issue discussed here happens on ZFS+ZVOL nodes, EXT4+LVM nodes, etc...
I was really hoping that jumping to the 4.10 kernel in PVE 5 would solve this issue, but unfortunately, from the reports pouring in, it looks like this kernel / KVM / VirtIO issue is still there.
Let's recap what we know: when the host does heavy IO on the block device where KVM guests are stored...
The slowdowns and hung tasks in KVM guests during high IO activity on the host are most likely due to a kernel issue between the virtual memory subsystem and KVM+VirtIO that will hopefully get solved eventually.
We have found that the following settings - while they do not solve the problem completely - considerably lessen its impact. All of these settings are for the Proxmox hosts.
1. Linux virtual memory subsystem tuning
vm.dirty_ratio and vm.dirty_background_ratio
You need to lower these considerably...
Increasing vm.dirty_ratio and vm.dirty_background_ratio did not help, but decreasing them afterwards did help a little: with the values below set on the host, the KVM guests still slow down during backups (website response times increase 10x), but they less often fully time out (or produce...
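For reference, here is a minimal sketch of how such a change is typically applied on a Proxmox host; the percentages below are illustrative placeholders, not the specific values from this thread:

# /etc/sysctl.d/90-dirty.conf - illustrative values, tune to your RAM and workload
vm.dirty_ratio = 5
vm.dirty_background_ratio = 1

# apply immediately without a reboot
sysctl -p /etc/sysctl.d/90-dirty.conf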
What ZFS upgrade? Do you have any more information on that?
BTW this forum is full of threads about this issue, and many users are experiencing hung tasks and CPU stalls on LVM+RAW, LVM+ext4 and NFS filesystems when backups, restores and migrations are running, so I don't expect that a ZFS...
Thank you for your post. I don't think that metadata is the problem, because we are backing up single QCOW2 files at a time, which hardly use any metadata. We don't store big filesystem trees on ZFS, only a few QCOW2 disk files on every node.
The errors are always happening well into the...
@fabian I have tried to move all the affected VMs to 32k-blocksize ZVOLs: same errors. I also tried converting all VMs to QCOW2 disks (stored on ZFS). No change. We have even tried disabling KSM, which did not help either.
Unfortunately, the problem remains: every night during vzdump backups, some of...
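For anyone who wants to try the same mitigations, here is a rough sketch of the commands involved (the pool, VM and storage names are placeholders, not our actual configuration):

# create a ZVOL-backed disk with a 32k volblocksize
zfs create -V 32G -o volblocksize=32k rpool/data/vm-100-disk-1

# or set the blocksize for a whole ZFS storage in /etc/pve/storage.cfg:
#   zfspool: local-zfs
#       pool rpool/data
#       blocksize 32k
#       content images,rootdir

# disable KSM on the host
systemctl stop ksmtuned
systemctl disable ksmtuned
echo 2 > /sys/kernel/mm/ksm/run   # unmerge pages that were already shared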
Thanks for these benchmarks.
Do I understand correctly that you put ZFS on top of an HP HW RAID array? This is not a recommended setup, so there is not much point in benchmarking it, IMHO. I would gladly see ZFS benchmarks, though, when the Proxmox installer is instructed to build a ZFS RAID10 array...
You will in fact need to change a few things, because the default Proxmox ZFS setup is far from being optimized, and can lead to unstable servers (during high memory pressure and/or when vzdump backups are running).
1. Limit the ZFS ARC size
We aggressively limit the ZFS ARC size, as it has led...
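As a rough sketch of how the ARC limit is usually set on a Proxmox host (the 4 GB figure is only an example, not necessarily the value we use):

# /etc/modprobe.d/zfs.conf - example limit, size it to your RAM and workload
options zfs zfs_arc_max=4294967296

# make the limit part of the boot image, then reboot
update-initramfs -u

# or apply it at runtime without rebooting
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max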
Not sure where you read about a JBOD expander. Never used one.
We are using an Adaptec 6805E RAID controller in many of our nodes for connecting the member SATA disks, in Morphed JBOD mode, which passes through whole disks to the OS but keeps them bootable via the controller's BIOS.
Morphed JBOD is the only way to give a full disk to ZFS while still being able to boot from the Adaptec card. Any more insight on why this is not optimal?
We have a whole cluster of servers, each full of hard drives and some with solid state drives. As I wrote above, all of the servers exhibit...
All of our servers are affected by this.
- some nodes use SSD drives connected to Intel ICH controller (ZFS RAIDZ)
- some nodes use Toshiba DT01ACA drives connected to Adaptec 6805E controllers (disks are Morphed JBOD, ZFS RAID10)
- we use default compression, no dedup
- we use default ZVOL...
There is not much point in posting vm configs or S.M.A.R.T. reports, because this issue affects all of our servers and many different VMs.
But you can reproduce it easily: according to our tests, when VMs have their RAW disks on ZFS ZVOLs and you start to restore another (big) VM to the same...
This issue has been with us since we upgraded our cluster to Proxmox 4.x, and converted our guests from OpenVZ to KVM. We have single and dual socket Westmere, Sandy Bridge and Ivy Bridge nodes, using ZFS RAID10 HDD or ZFS RAIDZ SSD arrays, and every one of them is affected.
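To make the reproduction above concrete, a minimal sketch (the archive path, VMID and storage name are placeholders):

# restore a large backup onto the same ZFS storage that hosts running VMs
qmrestore /var/lib/vz/dump/vzdump-qemu-123.vma.lzo 999 --storage local-zfs

# meanwhile, watch IO on the host...
iostat -x 2
# ...and the kernel log inside a guest, looking for
# "INFO: task ... blocked for more than 120 seconds"
dmesg -w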
Description
When...
If you replied to my post: I never said our VMs use the same Ceph storage. In fact they run on local storage, and we already use Ceph as backup storage that can withstand node failure. Anyway, this is not an argument against CephFS as backup storage, which would be very useful either way.
We would love to use CephFS for backup storage. Currently, to back up VMs to Ceph, we have to run OpenMediaVault as a KVM guest with a huge RAW disk on a Ceph pool, shared over NFS. This has a number of shortcomings:
- it's a single point of failure, even though Ceph isn't
-...
Thanks for your detailed answer. I didn't know that the cache tier was mandatory for RBD; however, CephFS would be very useful for Proxmox - think backups or installation images.
I'm not familiar with your quoted figures, even though I have read many articles about EC pools. In theory, 3 nodes...