ceph failure scenario

czechsys

Renowned Member
Nov 18, 2015
Hi,

we are testing some failure scenarios and noticed an unexpectedly long delay before disk access became available again:

Scenario: 3 x PVE hosts, each with mgr, mon and 2 OSDs, replica 3/2

1] hard power loss on one node
result: ~24 s before disks were available again (grace period >= 20 seconds)
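
For anyone wanting to reproduce this: the down/out transition can be watched from a surviving node with the standard ceph CLI, e.g.:

ceph -w          # streams cluster events, shows when the OSDs are reported down
ceph osd tree    # up/down state per OSD
ceph -s          # overall health and degraded PG counts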

I didn't expect a single node failure to have such an impact, so I looked up these defaults:

osd_heartbeat_grace = 20 #changed to 10
osd_heartbeat_interval = 6 #changed to 3
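
For completeness, this is roughly how the values can be applied (a sketch; on Proxmox the cluster-wide ceph.conf lives in /etc/pve/ceph.conf, and the runtime commands assume a Ceph release with the centralized config database, i.e. Mimic or newer):

# in ceph.conf:
[osd]
osd_heartbeat_grace = 10
osd_heartbeat_interval = 3

# or at runtime, without editing the file:
ceph config set osd osd_heartbeat_grace 10
ceph config set osd osd_heartbeat_interval 3

# verify what a running OSD actually uses:
ceph config show osd.0 | grep heartbeat

# note: the grace value is also consulted by the MONs when they decide to mark
# an OSD down, so [global] may be the safer place to set it.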

A new test again gives a result of more than 20 seconds, so apparently some other parameter needs tuning.

Has anybody tuned this? And how? Or is the majority running with the defaults? 20+ seconds of all cluster disks being unavailable looks like a huge gap...
 
I think you might also be suffering more because you only have a small number of test OSDs, so while the one node is down the other disks will be hit hard during the peering stage.

With more OSDs and nodes, the peering work is staggered across more hardware.


How many PGs do you have set in the pool?
 

Currently 2 pools, 128 PGs each, so 256 in total.
The disks are SSDs, so the hit isn't that hard.
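
If it helps, the per-pool numbers can be checked with the standard commands (replace <poolname> with the actual pool name):

ceph osd pool ls detail                # lists every pool with its pg_num/pgp_num
ceph osd pool get <poolname> pg_num    # single pool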