[SOLVED] Ceph Scrub Errors not going away.

damon1

Hi all,
Simple setup: 3 nodes, each node has 1 SSD and 1 HDD, connected via a 10Gb network.
I shut down (via the GUI) one node and moved it from one location to another, with a downtime of about 15 minutes.
I had moved all the VMs from that node to other nodes beforehand.
When I started the node it all seemed to start fine and Ceph went to work, but then it stopped and showed the error below.
I thought it would fix itself after a day or two, but the message hasn't changed for 5 days.

[screenshot]

All the VMs are running fine.
All the monitors and nodes are on the same version.

1575426300309.png

Looking in the log, this indicates the problem:

[screenshot]

OSDs 2, 4 and 0 are all the SSD drives, which are then cache-tiered to the HDDs.

What command should I run to repair the problem?

thanks
Damon
 
I shut down (via the GUI) one node and moved it from one location to another, with a downtime of about 15 minutes.
Are all OSDs up?

OSDs 2, 4 and 0 are all the SSD drives, which are then cache-tiered to the HDDs.
How did you configure the cache tiering?
 
Yes, all OSDs are up and running the latest version = 14.2.4.1

[screenshot]
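For reference, the same OSD and version information can also be checked from the CLI (standard status commands, nothing specific to this setup):

ceph osd tree       # each OSD with its host, device class and up/down state
ceph versions       # running version of every mon/mgr/osd daemon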

The instructions I followed for cache tiering are from this page:
http://tacoisland.net/2019/01/06/hyperconverged-hybrid-storage-on-the-cheap-with-proxmox-and-ceph/


Basically, it sets up one rule for SSD and one for HDD:

ceph osd crush rule create-replicated ssd-only default osd ssd
ceph osd crush rule create-replicated hdd-only default osd hdd


You then add your disks and assign the pools to the correct CRUSH rule.
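(I haven't copied the article's exact pool-creation commands here, but roughly it is something like this, using the pool names from the next step and placeholder PG counts:)

ceph osd pool create hdd-pool 128 128 replicated hdd-only
ceph osd pool create ssd-pool 64 64 replicated ssd-only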

Then the SSD pool is attached in front of the HDD pool as a writeback cache tier:
ceph osd tier add hdd-pool ssd-pool
ceph osd tier cache-mode ssd-pool writeback
ceph osd tier set-overlay hdd-pool ssd-pool

Then the hit set and promotion settings:
ceph osd pool set ssd-pool hit_set_type bloom
ceph osd pool set ssd-pool hit_set_count 1
ceph osd pool set ssd-pool hit_set_period 3600
ceph osd pool set ssd-pool min_read_recency_for_promote 1
ceph osd pool set ssd-pool min_write_recency_for_promote 1

Then lastly, the flush/evict tuning:
ceph osd pool set ssd-pool cache_target_dirty_ratio 0.5
ceph osd pool set ssd-pool cache_target_dirty_high_ratio 0.75
ceph osd pool set ssd-pool cache_target_full_ratio 0.9
ceph osd pool set ssd-pool cache_min_flush_age 60
ceph osd pool set ssd-pool cache_min_evict_age 300
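(Not from the article, but worth noting: as far as I understand, the dirty/full ratios above are evaluated against the cache pool's absolute size limit, so target_max_bytes usually needs to be set as well or the tiering agent has nothing to flush/evict against. For example:)

ceph osd pool set ssd-pool target_max_bytes 200000000000   # example value, adjust to your SSD capacity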


With 15 Windows VMs running I get good read and write speeds.

[screenshot]

thanks
 

In case anyone else finds this, or I need it again:

Run
ceph osd repair all

on any node. This will get rid of the scrub errors, but you then need to tell Ceph to forget about the errors and "revert" to a previous version.
It doesn't take too long to run, and the log should show when it is finished.
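You can watch the progress with the usual status commands, e.g.

ceph -s     # overall health and error counters
ceph -w     # follow the cluster log live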

Once finished, run
ceph health detail

At the top it will list all of the "parts" (PGs) that are bad; it will also probably tell you that the same PGs haven't been scrubbed or deep scrubbed.

The lines start like this:

pg 1.41

So for each PG you need to run this command:

ceph pg 1.41 mark_unfound_lost revert

This will count down your errors.
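If you have a lot of affected PGs, a small shell loop saves some typing (the PG IDs here are just the examples from this thread; substitute your own from ceph health detail):

for pg in 1.41 1.5a; do ceph pg $pg mark_unfound_lost revert; done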

The next thing to do is wait for Ceph to do its thing; it should also do a scrub and deep scrub of everything now, which will leave you with a clean health status.

This can happen within an hour or overnight.

thanks
Damon
 
If you want to force the PGs to deep scrub, use "ceph health detail" to find the PG IDs.

[screenshot]


Then run
ceph pg deep-scrub 1.5a
etc...
You can also do
ceph pg scrub 1.5a

Not sure if both are needed?
 
ceph osd crush rule create-replicated ssd-only default osd ssd
ceph osd crush rule create-replicated hdd-only default osd hdd
The failure domain is OSD. This places a replica on an OSD, regardless of node boundaries.

This is why the objects were marked lost: in the worst case, two copies of the same object landed on the same node. A setup like this is bound to lose data.

I consider this a dangerous article, as it doesn't go into the purpose and implications of the setup.
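For a small cluster like this, the failure domain should normally be host, so that each replica lands on a different node. The same rules with host instead of osd would look like this (illustrative only; switching an existing pool to a new rule will trigger data movement):

ceph osd crush rule create-replicated ssd-only default host ssd
ceph osd crush rule create-replicated hdd-only default host hdd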

But I am glad that you could repair your cluster.

ceph pg deep-scrub 1.5a
etc...
you can also do
ceph pg scrub 1.5a
A deep scrub also checks the copies against each other, while a scrub only compares metadata.
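If you want to verify what actually ran, the per-PG scrub timestamps can be inspected, for example:

ceph pg 1.5a query    # output should include last_scrub_stamp / last_deep_scrub_stamp for that PG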
 
