ceph failure scenario

czechsys

Renowned Member
Nov 18, 2015
Hi,

we are testing some failure scenarios and noticed an unexpectedly long delay before disk access becomes available again:

Scenario: 3 x PVE hosts, each with mgr, mon and 2 OSDs; replica 3/2

1] hard power loss on one node
result: ~ 24s before disks available (grace period >= 20 seconds)
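
A stall like the ~24 s above can be measured from inside a guest with a small write probe. The following is only a minimal sketch, not the method used for the test in this post: it appends one byte at a time with `fsync` and reports the longest gap between completed writes. The function names and the example path are hypothetical.

```python
import os
import time


def largest_gap(timestamps):
    """Return the largest difference between consecutive timestamps.

    Returns 0.0 when fewer than two timestamps are given.
    """
    return max((b - a for a, b in zip(timestamps, timestamps[1:])), default=0.0)


def probe_writes(path, duration_s=60.0, pause_s=0.1):
    """Append one byte per iteration with fsync; return the completion times."""
    done = []
    deadline = time.monotonic() + duration_s
    with open(path, "ab") as fh:
        while time.monotonic() < deadline:
            fh.write(b"x")
            fh.flush()
            os.fsync(fh.fileno())  # blocks while the underlying storage is stalled
            done.append(time.monotonic())
            time.sleep(pause_s)
    return done


# Example use (hypothetical path on a disk backed by the pool under test):
#   times = probe_writes("/mnt/cephtest/probe.bin", duration_s=120.0)
#   print(f"longest stall: {largest_gap(times):.1f} s")
```

Start the probe, cut power to one node while it runs, and read off the longest gap. The reported value includes the `pause_s` sleep, so it slightly overstates the real stall.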

I thought that a node failure would not have such an impact, so I looked up these defaults:

osd_heartbeat_grace = 20 #changed to 10
osd_heartbeat_interval = 6 #changed to 3

I ran the test again and the result is still more than 20 seconds, so some other parameter needs tuning.

Has anybody tuned this? And how? Or is the majority running with the defaults? 20+ seconds during which all cluster disks are unavailable looks like a huge gap...
 
I think you may also be suffering more than usual because you have only a small number of test OSDs. When one node goes down, the remaining disks are hit hard during the peering stage.

With more OSDs and nodes, the peering work is spread across more hardware.


How many PGs do you have set in the pool?
 
Currently there are 2 pools with 128 PGs each, so 256 in total.
The disks are SSDs, so the hit isn't that hard.
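
To put those numbers next to the earlier point about small clusters, here is a rough back-of-the-envelope sketch. It assumes the 6 OSDs from the first post (3 hosts x 2 OSDs), replica size 3, one replica per host (a host-level failure domain, which is an assumption here) and an even PG distribution. The function names are made up for illustration.

```python
def pg_replicas_per_osd(pools, pgs_per_pool, size, osds):
    """Average number of PG replicas stored on each OSD."""
    return pools * pgs_per_pool * size / osds


def fraction_of_pgs_on_host(size, hosts):
    """Share of PGs that keep a replica on any one host.

    Assumes at most one replica per host and an even spread across hosts.
    """
    return min(1.0, size / hosts)


print(pg_replicas_per_osd(2, 128, 3, 6))  # 128.0
print(fraction_of_pgs_on_host(3, 3))      # 1.0 -> every PG is touched by a node loss
print(fraction_of_pgs_on_host(3, 6))      # 0.5 -> only half the PGs would be affected
```

Under these assumptions, losing one of three nodes forces all 256 PGs to re-peer at once, while the same failure in a six-node cluster would touch only about half of them.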
 
