Proxmox ceph cluster - high system cpu usage

martinb

New Member
Apr 28, 2014
Hello,

We recently built a 4-node Proxmox/Ceph cluster and are having performance issues with it. Roughly every 5 s, system CPU usage spikes, and the processes involved appear to be the migration/x kernel threads. We tried increasing kernel.sched_migration_cost from 500000 to 5000000, but it made little difference. Any ideas what could be causing this?
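A minimal sketch for putting numbers on spikes like these, assuming a Linux host where /proc/stat is readable. It samples the aggregate `cpu` line once per second and prints the share of CPU time spent in kernel (system) mode, so a roughly 5 s period would show up in the output. The function names `read_cpu_line`, `system_share` and `watch` are made up for this illustration and are not part of any Proxmox or Ceph tooling.

```python
import time


def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line from /proc/stat."""
    with open(path) as f:
        for line in f:
            if line.startswith("cpu "):
                return line
    raise RuntimeError("no aggregate cpu line found")


def system_share(before, after):
    """Percent of elapsed CPU time spent in system mode between two 'cpu' lines.

    Field order after the 'cpu' label: user nice system idle iowait irq
    softirq steal [guest ...]. Guest time is already counted in user time,
    so only the first eight fields go into the total.
    """
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    delta = [y - x for x, y in zip(b, a)]
    total = sum(delta[:8])
    if total <= 0:
        return 0.0
    # Index 2 is 'system'; irq and softirq time are deliberately not included.
    return 100.0 * delta[2] / total


def watch(samples=30, interval=1.0):
    """Print one system-CPU percentage per interval."""
    prev = read_cpu_line()
    for _ in range(samples):
        time.sleep(interval)
        cur = read_cpu_line()
        print(f"{time.strftime('%H:%M:%S')} sys={system_share(prev, cur):5.1f}%")
        prev = cur
```

Calling `watch()` on one of the nodes prints 30 one-second samples. Running it before and after a tuning change gives a rough comparison, although it does not say which process is responsible.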

Cluster info (4 servers):

Proxmox 3.4
Kernel 2.6.32-39-pve (2.6.32-156)
Dual Xeon E5-2640 v3 (8 physical cores per CPU, plus HT)
40 Gb/s InfiniBand
128 GB RAM
PERC H730/830 disk controllers
4 TB 7.2k rpm SAS drives
~15 OSDs per server (65 in total)
In addition to the Ceph OSDs and monitor, each server runs a VM with 4 vCPUs and 8 GB RAM acting as a file server.

Thanks,
Martin
 


We found that the high system CPU usage and load were caused by the Ceph journal sync. Changing 'filestore max sync interval' from 5 s to 30 s improved things a lot: the load average dropped from 10-12 to 2-4. There may still be some other problem that makes the journal sync inefficient.
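For reference, the change described here might look like the fragment below in ceph.conf. This is only a sketch: the post does not say which section the option was set in, so placing it under `[osd]` is an assumption, and so is the file location (on Proxmox VE the cluster-wide file is usually /etc/pve/ceph.conf).

```ini
[osd]
# Assumed placement; the value is the one reported in the post.
# Raises the filestore sync interval from 5 s to 30 s.
filestore max sync interval = 30
```

OSDs normally need a restart to pick up ceph.conf changes. A longer sync interval also means more data can sit in the journal between syncs. The Ceph documentation suggests sizing the journal to at least 2 x (expected throughput x filestore max sync interval), so it may be worth checking the journal size after a change like this.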
 
