24 Scrub Errors, 3 pgs inconsistent

proxwolfe

Hi,

On my 3-node home lab cluster, Ceph tells me that it has discovered 24 scrub errors and that 3 pgs are inconsistent. That does not sound overly promising.

More importantly, however, I have no idea what to do with this information...

I take it that the nature of this error prevents Ceph from fixing it itself. But what can I do?

Of course, all important stuff is backed up. But how can I identify which VMs (or rather which VMs' data) are affected so that I can restore them from the backups?

Or is there anything else I could do?

Thanks!
 
I take it that the nature of this error prevents Ceph from fixing it itself. But what can I do?
You can try ceph pg repair <PGID>.

See: https://docs.ceph.com/en/latest/rados/operations/pg-repair/

But how can I identify which VMs (or rather which VMs' data) are affected so that I can restore them from the backups?
Since your cluster does not consist of several hundred servers and thousands of OSDs, you can assume that all of your pools are affected and therefore potentially all VMs.

But you can find out in detail with the following command:
Code:
ceph osd map POOLNAME OBJECTNAME

Then you have to compare whether the resulting PG is one of the three affected PGs.
But that doesn't necessarily mean you have to do a restore right away. If you are working with replica 3, a repair will probably bring you back to a consistent state, or you may have to remove or replace a disk.
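A rough sketch of how that could look for an RBD pool (POOLNAME and vm-100-disk-0 are just placeholders, adjust them to your setup):
Code:
# list the disk images in the pool
rbd ls POOLNAME

# each image has an object prefix (block_name_prefix), e.g. rbd_data.1234567890abcd
rbd info POOLNAME/vm-100-disk-0 | grep block_name_prefix

# check which PG (and which OSDs) an object of that image maps to
ceph osd map POOLNAME rbd_data.1234567890abcd.0000000000000000

In practice it is often easier the other way round: take the rbd_data.<prefix> object names that rados list-inconsistent-obj <PGID> reports and match the prefix against the block_name_prefix of your images.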

Can you post the output of ceph -s and ceph health detail?
With that we can definitely help you better.
 
So scrubbing has continued and apparently it has identified two more errors:

ceph -s

Code:
  cluster:
    id:     xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    health: HEALTH_ERR
            26 scrub errors
            Possible data damage: 3 pgs inconsistent

  services:
    mon: 3 daemons, quorum node1,node3,node2 (age 3w)
    mgr: node2(active, since 3w), standbys: node1, node3
    mds: 2/2 daemons up, 1 standby
    osd: 14 osds: 11 up (since 5d), 11 in (since 9d)

  data:
    volumes: 2/2 healthy
    pools:   8 pools, 417 pgs
    objects: 4.00M objects, 15 TiB
    usage:   45 TiB used, 27 TiB / 72 TiB avail
    pgs:     414 active+clean
             3   active+clean+inconsistent

  io:
    client:   3.3 KiB/s rd, 1.0 MiB/s wr, 1 op/s rd, 99 op/s wr

ceph health

HEALTH_ERR 26 scrub errors; Possible data damage: 3 pgs inconsistent

Should I try ceph pg repair <PGID>? Or does any of the above suggest a different approach?

Thanks!
 
osd: 14 osds: 11 up (since 5d), 11 in (since 9d)
There seem to be 3 OSDs missing!?

Please post the output of the following commands (please use the Code tag [CODE][/CODE] and not the Inline Code tag):
Code:
ceph health detail
ceph pg dump_stuck inactive
ceph osd df tree
 
There seem to be 3 OSDs missing!?

Yes, they were on another node that I have since removed, but I missed the window to remove the OSDs first... I haven't got around to deleting them from <wherever>. They don't play a role here.

ceph health detail

Code:
HEALTH_ERR 26 scrub errors; Possible data damage: 3 pgs inconsistent
[ERR] OSD_SCRUB_ERRORS: 26 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 3 pgs inconsistent
    pg 3.13 is active+clean+inconsistent, acting [1,3,16]
    pg 3.36 is active+clean+inconsistent, acting [3,1,16]
    pg 3.48 is active+clean+inconsistent, acting [1,16,3]

ceph pg dump_stuck inactive
ok

ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME -1 86.70790 - 72 TiB 45 TiB 45 TiB 410 KiB 84 GiB 27 TiB 62.51 1.00 - root default -3 25.24937 - 25 TiB 15 TiB 15 TiB 122 KiB 25 GiB 10 TiB 59.50 0.95 - host node1 16 hdd 20.00969 1.00000 20 TiB 13 TiB 13 TiB 21 KiB 21 GiB 6.9 TiB 65.74 1.05 192 up osd.16 6 nvme 1.74660 1.00000 1.7 TiB 169 GiB 168 GiB 43 KiB 675 MiB 1.6 TiB 9.44 0.15 32 up osd.6 11 ssd 3.49309 1.00000 3.5 TiB 1.7 TiB 1.7 TiB 58 KiB 3.4 GiB 1.8 TiB 48.80 0.78 193 up osd.11 -9 23.43108 - 23 TiB 15 TiB 15 TiB 141 KiB 32 GiB 8.4 TiB 64.15 1.03 - host node2 3 hdd 5.45799 1.00000 5.5 TiB 4.2 TiB 4.2 TiB 8 KiB 7.5 GiB 1.2 TiB 77.21 1.24 62 up osd.3 8 hdd 12.73340 1.00000 13 TiB 8.9 TiB 8.9 TiB 36 KiB 19 GiB 3.8 TiB 70.25 1.12 130 up osd.8 10 nvme 1.74660 1.00000 1.7 TiB 170 GiB 168 GiB 42 KiB 1.9 GiB 1.6 TiB 9.51 0.15 32 up osd.10 13 ssd 3.49309 1.00000 3.5 TiB 1.7 TiB 1.7 TiB 55 KiB 3.6 GiB 1.8 TiB 48.80 0.78 193 up osd.13 -5 23.43108 - 23 TiB 15 TiB 15 TiB 147 KiB 28 GiB 8.4 TiB 64.13 1.03 - host node3 0 hdd 5.45799 1.00000 5.5 TiB 4.0 TiB 4.0 TiB 11 KiB 7.3 GiB 1.4 TiB 73.45 1.17 62 up osd.0 1 hdd 12.73340 1.00000 13 TiB 9.1 TiB 9.1 TiB 26 KiB 16 GiB 3.6 TiB 71.84 1.15 130 up osd.1 5 nvme 1.74660 1.00000 1.7 TiB 169 GiB 168 GiB 53 KiB 660 MiB 1.6 TiB 9.44 0.15 32 up osd.5 14 ssd 3.49309 1.00000 3.5 TiB 1.7 TiB 1.7 TiB 57 KiB 3.7 GiB 1.8 TiB 48.81 0.78 193 up osd.14 -7 14.59637 - 0 B 0 B 0 B 0 B 0 B 0 B 0 0 - host obsolete 2 hdd 12.73340 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down osd.2 4 nvme 0.93149 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down osd.4 12 nvme 0.93149 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down osd.12 TOTAL 72 TiB 45 TiB 45 TiB 414 KiB 84 GiB 27 TiB 62.51
 
Sorry, I can't really read it on my smartphone. As already noted, please use the Code tag and not the Inline Code tag. Then it stays readable and the spacing is preserved.
 
Sorry, I had missed that. I edited my post above accordingly.
It's still the inline code tag :)

This is the code tag:
Code:
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME           
-1         86.70790         -   72 TiB   45 TiB   45 TiB  410 KiB   84 GiB   27 TiB  62.51  1.00    -          root default       
-3         25.24937         -   25 TiB   15 TiB   15 TiB  122 KiB   25 GiB   10 TiB  59.50  0.95    -              host node1
16    hdd  20.00969   1.00000   20 TiB   13 TiB   13 TiB   21 KiB   21 GiB  6.9 TiB  65.74  1.05  192      up          osd.16     
 6   nvme   1.74660   1.00000  1.7 TiB  169 GiB  168 GiB   43 KiB  675 MiB  1.6 TiB   9.44  0.15   32      up          osd.6       
11    ssd   3.49309   1.00000  3.5 TiB  1.7 TiB  1.7 TiB   58 KiB  3.4 GiB  1.8 TiB  48.80  0.78  193      up          osd.11     
-9         23.43108         -   23 TiB   15 TiB   15 TiB  141 KiB   32 GiB  8.4 TiB  64.15  1.03    -              host node2
 3    hdd   5.45799   1.00000  5.5 TiB  4.2 TiB  4.2 TiB    8 KiB  7.5 GiB  1.2 TiB  77.21  1.24   62      up          osd.3       
 8    hdd  12.73340   1.00000   13 TiB  8.9 TiB  8.9 TiB   36 KiB   19 GiB  3.8 TiB  70.25  1.12  130      up          osd.8       
10   nvme   1.74660   1.00000  1.7 TiB  170 GiB  168 GiB   42 KiB  1.9 GiB  1.6 TiB   9.51  0.15   32      up          osd.10     
13    ssd   3.49309   1.00000  3.5 TiB  1.7 TiB  1.7 TiB   55 KiB  3.6 GiB  1.8 TiB  48.80  0.78  193      up          osd.13     
-5         23.43108         -   23 TiB   15 TiB   15 TiB  147 KiB   28 GiB  8.4 TiB  64.13  1.03    -              host node3
 0    hdd   5.45799   1.00000  5.5 TiB  4.0 TiB  4.0 TiB   11 KiB  7.3 GiB  1.4 TiB  73.45  1.17   62      up          osd.0       
 1    hdd  12.73340   1.00000   13 TiB  9.1 TiB  9.1 TiB   26 KiB   16 GiB  3.6 TiB  71.84  1.15  130      up          osd.1       
 5   nvme   1.74660   1.00000  1.7 TiB  169 GiB  168 GiB   53 KiB  660 MiB  1.6 TiB   9.44  0.15   32      up          osd.5       
14    ssd   3.49309   1.00000  3.5 TiB  1.7 TiB  1.7 TiB   57 KiB  3.7 GiB  1.8 TiB  48.81  0.78  193      up          osd.14     
-7         14.59637         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host obsolete
 2    hdd  12.73340         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.2       
 4   nvme   0.93149         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.4       
12   nvme   0.93149         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.12     
                        TOTAL   72 TiB   45 TiB   45 TiB  414 KiB   84 GiB   27 TiB  62.51

Yes, they were on another node that I have since removed, but I missed the window to remove the OSDs first... I haven't got around to deleting them from <wherever>. They don't play a role here.
You shouldn't leave it like this, as it will potentially continue to have an impact on your crush map.

See: https://docs.ceph.com/en/latest/rados/operations/crush-map/#modifying-the-crush-map
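For reference, a sketch of the usual cleanup of such leftover OSDs (double-check the IDs first; osd.2, osd.4, osd.12 and the host bucket "obsolete" are what your osd df tree shows):
Code:
# remove a leftover OSD completely (CRUSH entry, auth key, OSD id)
ceph osd crush remove osd.2
ceph auth del osd.2
ceph osd rm 2

# or, as a shortcut:
ceph osd purge 2 --yes-i-really-mean-it

# once all its OSDs are gone, remove the now-empty host bucket as well
ceph osd crush remove obsolete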

ceph pg dump_stuck inactive
ok
One moment of not paying attention... "inactive" obviously doesn't fit here. Take a look at what ceph pg dump_stuck unclean outputs. If nothing comes back, then it doesn't matter.

Have you checked whether the affected OSDs (1, 3 and 16) show any abnormalities? Are there any suspicious SMART values, or is there something in the syslog?
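If you want to check that on the command line, something like this (the device paths are just examples, pick the disks behind osd.1, osd.3 and osd.16):
Code:
# SMART health, attributes and error log of the disk behind an OSD
smartctl -a /dev/sdX

# kernel messages / syslog entries about I/O errors
dmesg -T | grep -i "i/o error"
journalctl -k --since "7 days ago" | grep -i "error"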

Please run the following commands:
Code:
rados list-inconsistent-obj 3.13 --format=json-pretty
rados list-inconsistent-obj 3.36 --format=json-pretty
rados list-inconsistent-obj 3.48 --format=json-pretty

Before you even consider a repair, you should make sure that these errors are not caused by physical defects in the disks. If all of your affected OSDs are defective, Ceph cannot necessarily help you get back to a consistent state.

But I also assume that you run the HDDs in their own pool and have not mixed them with SSD and NVMe. Please note that your current fill level will end in a complete catastrophe. You have two HDDs each in node2 and node3; with a replica of 3, Ceph has no choice but to keep one copy of the data on each node. However, on node2 and node3 your HDDs are so full that the failure of one HDD can no longer be compensated for and will inevitably end in read-only. To keep operating even in the event of a failure, your HDDs should be filled to at most about 42.5%: if one of the two HDDs in a node dies, the surviving one has to absorb its data and then sits at roughly 85%, i.e. right at the near-full threshold, which is not healthy either, but at your current fill level a failure would put you straight at full.

With only one NVMe and one SSD per node, this problem does not exist, because there is simply no second device of the same class to recover onto, so Ceph does not attempt any self-healing there. The problem only arises once a node has two devices of the same type.
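The thresholds this is based on can be checked like this (the values shown are the Ceph defaults; yours may differ):
Code:
ceph osd dump | grep -i ratio
# typically:
#   full_ratio 0.95
#   backfillfull_ratio 0.9
#   nearfull_ratio 0.85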
 
  • Like
Reactions: fluxX04
You shouldn't leave it like this, as it will potentially continue to have an impact on your crush map.
Done.

One moment of not paying attention... "inactive" obviously doesn't fit here.
Sorry, I don't understand. Where/when wasn't I paying attention? And what do you mean by "'inactive' obviously doesn't fit here"? Sorry, my Ceph knowledge is at "noob" level...

Take a look at what ceph pg dump_stuck unclean outputs. If nothing comes back, then it doesn't matter.
Code:
PG_STAT  STATE                                     UP        UP_PRIMARY  ACTING    ACTING_PRIMARY
3.76                  active+remapped+backfilling  [0,8,16]           0  [1,8,16]               1
3.6f                  active+remapped+backfilling  [16,3,1]          16  [16,3,0]              16
3.68                  active+remapped+backfilling  [8,16,0]           8  [8,16,1]               8
3.64                  active+remapped+backfilling  [0,3,16]           0  [1,16,8]               1
3.14                  active+remapped+backfilling  [3,1,16]           3  [0,16,8]               0
3.22                  active+remapped+backfilling  [3,16,1]           3  [3,16,0]               3
3.5c                  active+remapped+backfilling  [3,1,16]           3  [3,16,0]               3
3.57                  active+remapped+backfilling  [0,16,8]           0  [1,16,8]               1
3.39                  active+remapped+backfilling  [8,0,16]           8  [0,16,3]               0
3.23                  active+remapped+backfilling  [8,0,16]           8  [0,16,3]               0
3.3a                  active+remapped+backfilling  [16,1,3]          16  [16,1,8]              16
3.2f                  active+remapped+backfilling  [1,16,8]           1  [0,16,8]               0
3.43                  active+remapped+backfilling  [3,16,1]           3  [3,16,0]               3
3.36     active+remapped+inconsistent+backfilling  [3,0,16]           3  [3,16,1]               3
3.48     active+remapped+inconsistent+backfilling  [1,16,8]           1  [1,16,3]               1
3.2d                  active+remapped+backfilling  [16,1,3]          16  [16,1,8]              16
3.2c                  active+remapped+backfilling  [16,0,8]          16  [16,8,1]              16
3.45                  active+remapped+backfilling  [0,16,8]           0  [0,16,3]               0
3.2a                  active+remapped+backfilling  [16,8,1]          16  [16,1,3]              16
ok

There is suddenly a lot more going on, which I believe was triggered by my removing the no-longer-existing OSDs (and the respective node) from the CRUSH map.

Have you checked whether the affected OSDs (1, 3 and 16) show any abnormalities? Are there any suspicious SMART values, or is there something in the syslog?
They all seem to be performing perfectly fine. No SMART errors or abnormal values.

rados list-inconsistent-obj 3.13 --format=json-pretty
Code:
{
    "epoch": 35190,
    "inconsistents": [
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000ac9f6",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 253390
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000ac9f6",
                    "key": "",
                    "snapid": -2,
                    "hash": 3075686803,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'287249",
                "prior_version": "29581'253390",
                "last_reqid": "osd.1.0:6867405",
                "user_version": 253390,
                "size": 2867200,
                "mtime": "2023-08-01T02:02:00.479577+0200",
                "local_mtime": "2023-08-01T02:02:00.482681+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0x0b39ce3d",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 2867200,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x0b39ce3d"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 2867200
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 2867200,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x0b39ce3d"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000cfe64",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 261741
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000cfe64",
                    "key": "",
                    "snapid": -2,
                    "hash": 4027877011,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'286775",
                "prior_version": "29590'261741",
                "last_reqid": "osd.1.0:6865883",
                "user_version": 261741,
                "size": 4194304,
                "mtime": "2023-08-02T20:18:40.541011+0200",
                "local_mtime": "2023-08-02T20:18:40.542417+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0xba1280a7",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xba1280a7"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xba1280a7"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000d8103",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 262985
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000d8103",
                    "key": "",
                    "snapid": -2,
                    "hash": 1966136979,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'287141",
                "prior_version": "29591'262985",
                "last_reqid": "osd.1.0:6867081",
                "user_version": 262985,
                "size": 4194304,
                "mtime": "2023-08-02T23:14:57.219986+0200",
                "local_mtime": "2023-08-02T23:14:57.223069+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0x09cedf2b",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x09cedf2b"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x09cedf2b"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000e2696",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 264575
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000e2696",
                    "key": "",
                    "snapid": -2,
                    "hash": 2539886995,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'287558",
                "prior_version": "29593'264575",
                "last_reqid": "osd.1.0:6868421",
                "user_version": 264575,
                "size": 4194304,
                "mtime": "2023-08-03T02:36:22.059766+0200",
                "local_mtime": "2023-08-03T02:36:22.064144+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0xf41e4424",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xf41e4424"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xf41e4424"
                }
            ]
        },
to be continued due to post size limitations...
 
Code:
        {
            "object": {
                "name": "rbd_data.453d0387ce7766.00000000001028e4",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 371918
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.453d0387ce7766.00000000001028e4",
                    "key": "",
                    "snapid": -2,
                    "hash": 3162397075,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "35232'371918",
                "prior_version": "29661'302824",
                "last_reqid": "client.55474939.0:843152145",
                "user_version": 371918,
                "size": 4194304,
                "mtime": "2024-01-02T21:49:09.007652+0100",
                "local_mtime": "2024-01-02T21:49:09.108641+0100",
                "lost": 0,
                "flags": [
                    "dirty",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0xffffffff",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xae6d213f"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xae6d213f"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.453d0387ce7766.00000000001c7b4a",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 331944
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.453d0387ce7766.00000000001c7b4a",
                    "key": "",
                    "snapid": -2,
                    "hash": 3892734611,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "30394'334313",
                "prior_version": "30206'331944",
                "last_reqid": "osd.1.0:12408390",
                "user_version": 331944,
                "size": 2523136,
                "mtime": "2023-09-29T14:07:21.843659+0200",
                "local_mtime": "2023-09-29T14:07:21.847260+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0x5b6b4d3a",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 2523136,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x5b6b4d3a"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 2523136
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 2523136,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x5b6b4d3a"
                }
            ]
        },
 
Code:
                "data_digest": "0xec9c58e1",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4145152,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xec9c58e1"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4145152
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4145152,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xec9c58e1"
                }
            ]
        }
    ]
}
 
rados list-inconsistent-obj 3.36 --format=json-pretty

Code:
No scrub information available for pg 3.36
error 2: (2) No such file or directory

rados list-inconsistent-obj 3.48 --format=json-pretty

Code:
No scrub information available for pg 3.48
error 2: (2) No such file or directory

But I also assume that you run the HDDs in their own pool and have not mixed them with SSD and NVMe.
Correct.

Please note that your current fill level will end in a complete catastrophe. You have two HDDs each in node2 and node3; with a replica of 3, Ceph has no choice but to keep one copy of the data on each node. However, on node2 and node3 your HDDs are so full that the failure of one HDD can no longer be compensated for and will inevitably end in read-only. To keep operating even in the event of a failure, your HDDs should be filled to at most about 42.5%: if one of the two HDDs in a node dies, the surviving one has to absorb its data and then sits at roughly 85%, i.e. right at the near-full threshold, which is not healthy either, but at your current fill level a failure would put you straight at full.

With only one NVMe and one SSD per node, this problem does not exist, because there is simply no second device of the same class to recover onto, so Ceph does not attempt any self-healing there. The problem only arises once a node has two devices of the same type.

Thank you for the recommendation. I am aware of this problem and am in the process of rectifying it. But this being a hobby project, I can't replace all disks at once and am acquiring one new disk per month. Node1 has been completed, Node2 is due this month, and Node3 will have to wait until February.
 
Sorry, I don't understand. Where/when wasn't I paying attention? And what do you mean by "'inactive' obviously doesn't fit here"? Sorry, my Ceph knowledge is at "noob" level...
Sorry, that wasn't directed at you but at me. I wasn't paying attention and sent you the wrong command.
{ "osd": 3, "primary": false, "errors": [ "read_error" ], "size": 2867200 },
So there was a read error, which usually indicates a problem with the disk itself. You should therefore examine osd.3 more closely or at least keep an eye on it.

You can now try whether a repair fixes the problem, but the error may persist, as a read_error usually points to an underlying storage problem.
Code:
ceph pg repair 3.13
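If you want to follow what the repair does and verify the result afterwards, something along these lines:
Code:
# watch the cluster log / health while the repair runs
ceph -w

# afterwards, check whether the PG is clean again
ceph health detail
rados list-inconsistent-obj 3.13 --format=json-pretty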

If this occurs frequently, it could also indicate a problem with your NTP service. You should definitely monitor your cluster for clock skew as well; if the clocks on the nodes drift apart, this can lead to serious problems.
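A quick way to check that (assuming chrony, which recent Proxmox VE versions install by default; adjust if you use systemd-timesyncd or ntpd):
Code:
# clock skew as seen by the Ceph monitors
ceph time-sync-status

# time synchronisation status on each node
chronyc tracking
timedatectl status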
 
So there was a read error, which usually indicates a problem with the disk itself. You should therefore examine osd.3 more closely or at least keep an eye on it.
I just went through upgrading my cluster to PVE version 8. After the upgrade, osd.3 did not come back online (it kept being shown as "in" but "down", no matter how often I restarted it or the entire node).

I then checked the SMART values for the drive again, and this time I did find several read errors (unsure how I could have missed them the first time). So I replaced the drive and am now waiting for it to be filled again. Interestingly, Ceph's health status page now only shows 2 pgs as inconsistent.

(So what does it mean anyway when a pg is inconsistent? Does that mean that the other two drives holding the "same" data are actually holding different data? In that case, it would seem easy to repair the information on the inconsistent pg...)

The remapping and backfilling is estimated to take between another 13 hours and a day. Let's see where this goes. I'll report back.
 
Ceph's health status page now only shows 2 pgs as inconsistent.
The PG 3.36 is gone, right?

(So what does it mean anyway when a pg is inconsistent? Does that mean that the other two drives holding the "same" data are actually holding different data? In that case, it would seem easy to repair the information on the inconsistent pg...)
Here you can find some more information: https://docs.ceph.com/en/latest/rados/operations/pg-repair/#more-information-on-pg-repair
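If you want to re-verify a specific PG yourself once the backfill has finished, you can also trigger a deep scrub of that PG manually, e.g.:
Code:
ceph pg deep-scrub 3.13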

If that doesn't answer your question or you have more, just ask again :)
However, I always think it is better to refer to the official documentation when there is an article for it than to try to explain it myself.
 
3.36 is now gone as well.

3.13 is active, clean and inconsistent. (But there is one pg still being deep scrubbed - so maybe that is 3.13 and it will be gone soon, too?)

I'll report back.

BTW: Today, the second 22TB HDD for this pool arrived and I can replace the OSDs in one more node. But I will wait for the deep scrubbing to finish first.
 
3.13 is now gone as well. (1 pg is still being deep scrubbed)

So (other than the 1 pg being deep scrubbed) all is green again - woohoo!

And now I will shake things up a bit again by replacing the two smaller HDDs in Node2 with the one new 22TB HDD (and hope this goes smoothly).
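For the record, the rough sequence I intend to follow per replaced disk, as I understand it from the docs (so take it with a grain of salt):
Code:
# drain the old OSD and wait until the cluster is healthy again
ceph osd out <old-osd-id>
watch ceph -s

# then stop it, remove it, and create the new OSD on the new disk
systemctl stop ceph-osd@<old-osd-id>.service
pveceph osd destroy <old-osd-id> --cleanup
pveceph osd create /dev/<new-disk>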

Thanks for your support!
 
