Cluster configuration / resource starvation issues

lightnet-barry

Feb 7, 2017
Hi all,

I've had an HA cluster running for a few years now and I'm looking for pointers, since I'm sure a lot has changed and there are probably better practices than the ones I used when I built it. The original configuration dates from around 2012; the current nodes were slotted in to replace the original Dell R310s.

The current setup is a 3-node cluster, with each node consisting of:
Dual Intel Xeon E5-2620 v4 (16 cores total) @ 2.1 GHz
256 GB DDR4 ECC RAM
RAID 1 M.2 SSDs for OS
PCIe NVMe SSD for Ceph journal
4x 4 TB SATA HDD JBOD

Ceph is configured with 1 monitor and 3 OSDs on each node (the 4th disk in each node is kept as a spare, just in case).
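For anyone sizing against this layout, here is a rough sketch of the capacity arithmetic. The node, OSD and disk counts come from the list above; the replica count of 3 is an assumption, since the pool size is not stated in this post, and the function name is made up for illustration.

```python
def ceph_capacity_tb(nodes, osds_per_node, disk_tb, replica_size):
    """Return (raw_tb, usable_tb) for a replicated Ceph pool.

    replica_size is an assumption here: the post does not state the
    pool's replication factor. Overheads and fill limits are ignored.
    """
    if replica_size <= 0:
        raise ValueError("replica_size must be positive")
    raw_tb = nodes * osds_per_node * disk_tb
    usable_tb = raw_tb / replica_size
    return raw_tb, usable_tb


# 3 nodes x 3 OSDs x 4 TB, assuming 3 replicas
raw, usable = ceph_capacity_tb(nodes=3, osds_per_node=3, disk_tb=4, replica_size=3)
print(f"raw: {raw} TB, usable (approx.): {usable} TB")
```

This only counts the 9 active OSDs, not the spare disks, and in practice the usable figure is lower still because Ceph should not be run close to full.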

I've started to notice a lot of rcu_sched CPU stall detections, rcu_sched kthread starvation messages, OOM errors, ATA WRITE DMA errors and so on in certain VMs.
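To get a feel for which of these messages dominates on a given VM, a small tally over the kernel log can help. This is only a sketch: the regular expressions are assumptions based on typical kernel log wording and may need adjusting for a particular kernel version, and the category names are made up for illustration.

```python
import re
from collections import Counter

# Assumed patterns, based on common kernel log wording; adjust as needed.
PATTERNS = {
    "rcu_stall": re.compile(r"rcu_sched.*(detected stalls|self-detected stall)"),
    "rcu_starved": re.compile(r"rcu_sched kthread starved"),
    "oom": re.compile(r"Out of memory|oom-killer", re.IGNORECASE),
    "ata_write_dma": re.compile(r"WRITE DMA"),
}


def tally_kernel_messages(lines):
    """Count how many lines match each category in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Example use (inside a guest): save the kernel log to a file first,
# e.g. with `dmesg > kernel.log`, then:
#
#     with open("kernel.log") as f:
#         print(tally_kernel_messages(f))
```

Comparing the counts per VM against the times of heavy Ceph or backup activity may show whether the stalls track storage latency or CPU contention.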

I've made some tweaks recently, e.g. ensured the pcid flag is on for all VM processors and adjusted CPU units for lower-priority VMs, which has improved things a bit. However, I'm still seeing server loads of around 16 and IO delay of around 10%, and VMs are being starved of resources. Two nodes are also showing more than 0 B of KSM sharing even though none of them is near 80% RAM utilisation.
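To put those figures in context, the sketch below normalises a load average by CPU count and converts the KSM shared-page counter into bytes. The paths `/proc/loadavg` and `/sys/kernel/mm/ksm/pages_sharing` are standard Linux interfaces; the function names are made up for illustration, and the figure of 32 logical CPUs in the example assumes hyper-threading is enabled, which the post does not state.

```python
import os


def load_per_cpu(load_avg, logical_cpus):
    """Load average divided by the number of logical CPUs."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return load_avg / logical_cpus


def ksm_shared_bytes(pages_sharing, page_size=4096):
    """Approximate memory deduplicated by KSM, in bytes."""
    return pages_sharing * page_size


def read_host_figures():
    """Read the live values on a Linux host (run on the node itself)."""
    with open("/proc/loadavg") as f:
        load_1min = float(f.read().split()[0])
    with open("/sys/kernel/mm/ksm/pages_sharing") as f:
        pages = int(f.read())
    page_size = os.sysconf("SC_PAGE_SIZE")
    return (load_per_cpu(load_1min, os.cpu_count()),
            ksm_shared_bytes(pages, page_size))


# Load of 16 on 16 physical cores, or on 32 threads if HT is on (assumed)
print(load_per_cpu(16, 16))
print(load_per_cpu(16, 32))
```

Note that the Linux load average also counts tasks in uninterruptible I/O wait, so a load of 16 alongside 10% IO delay may reflect storage waits as much as CPU demand.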

I intend to replace this cluster again soon with new hardware; the current hardware will then move to a remote site, replacing the original Dells that run backup services.

The purpose of this post is two-fold:
1. Is there anything anyone would suggest I do on my current setup to improve VM performance, particularly around resource starvation?
2. Are there better practices I could implement on my future setup (I haven't specified the hardware yet)?

Any input appreciated :cool:

Barry.
 
