24 Scrub Errors, 3 pgs inconsistent

proxwolfe

Hi,

On my 3-node home lab cluster, Ceph tells me that it has discovered 24 scrub errors and that 3 pgs are inconsistent. That does not sound overly promising.

More importantly, however, I have no idea what to do with this information...

I take it that the nature of this error prevents Ceph from fixing it itself. But what can I do?

Of course, all important stuff is backed up. But how can I identify which VMs (or rather which VMs' data) are affected so that I can restore them from the backups?

Or is there anything else I could do?

Thanks!
 
I take it that the nature of this error prevents Ceph from fixing it itself. But what can I do?
You can try ceph pg repair <PGID>.

See: https://docs.ceph.com/en/latest/rados/operations/pg-repair/

But how can I identify which VMs (or rather which VMs' data) are affected so that I can restore them from the backups?
Since your cluster does not consist of several hundred servers and thousands of OSDs, you can assume that all of your pools are affected and therefore potentially all VMs.

But you can find out in detail with the following command:
Code:
ceph osd map POOLNAME OBJECTNAME

Then you have to compare whether the resulting PG is one of the three affected PGs.
But that doesn't necessarily mean you have to do a restore right away. If you are working with replica 3, a repair will probably bring you back to a consistent state, or you may have to remove or replace a disk.
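A rough sketch of how that could look for an RBD pool (POOLNAME and vm-100-disk-0 are just placeholders, adjust them to your setup):
Code:
# list the disk images in the pool
rbd ls POOLNAME

# each image has an object prefix (block_name_prefix), e.g. rbd_data.1234567890abcd
rbd info POOLNAME/vm-100-disk-0 | grep block_name_prefix

# check which PG (and which OSDs) an object of that image maps to
ceph osd map POOLNAME rbd_data.1234567890abcd.0000000000000000

In practice it is often easier the other way round: take the rbd_data.<prefix> object names that rados list-inconsistent-obj <PGID> reports and match the prefix against the block_name_prefix of your images.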

Can you post the output of ceph -s and ceph health detail?
With that we can definitely help you better.
 
So scrubbing has continued and apparently it has identified two more errors:

ceph -s

Code:
  cluster:
    id:     xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    health: HEALTH_ERR
            26 scrub errors
            Possible data damage: 3 pgs inconsistent

  services:
    mon: 3 daemons, quorum node1,node3,node2 (age 3w)
    mgr: node2(active, since 3w), standbys: node1, node3
    mds: 2/2 daemons up, 1 standby
    osd: 14 osds: 11 up (since 5d), 11 in (since 9d)

  data:
    volumes: 2/2 healthy
    pools:   8 pools, 417 pgs
    objects: 4.00M objects, 15 TiB
    usage:   45 TiB used, 27 TiB / 72 TiB avail
    pgs:     414 active+clean
             3   active+clean+inconsistent

  io:
    client:   3.3 KiB/s rd, 1.0 MiB/s wr, 1 op/s rd, 99 op/s wr

ceph health

HEALTH_ERR 26 scrub errors; Possible data damage: 3 pgs inconsistent

Should I try ceph pg repair <PGID>? Or does any of the above suggest a different approach?

Thanks!
 
osd: 14 osds: 11 up (since 5d), 11 in (since 9d)
There seem to be 3 OSDs missing!?

Please post the output of the following commands (please use the Code tag [CODE][/CODE] and not the Inline Code tag):
Code:
ceph health detail
ceph pg dump_stuck inactive
ceph osd df tree
 
There seem to be 3 OSDs missing!?

Yes, they were on another node that I have since removed, but I missed the window to remove the OSDs first... I haven't got around to deleting them from <wherever>. They don't play a role here.

ceph health detail

Code:
HEALTH_ERR 26 scrub errors; Possible data damage: 3 pgs inconsistent
[ERR] OSD_SCRUB_ERRORS: 26 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 3 pgs inconsistent
    pg 3.13 is active+clean+inconsistent, acting [1,3,16]
    pg 3.36 is active+clean+inconsistent, acting [3,1,16]
    pg 3.48 is active+clean+inconsistent, acting [1,16,3]

ceph pg dump_stuck inactive
ok

ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME -1 86.70790 - 72 TiB 45 TiB 45 TiB 410 KiB 84 GiB 27 TiB 62.51 1.00 - root default -3 25.24937 - 25 TiB 15 TiB 15 TiB 122 KiB 25 GiB 10 TiB 59.50 0.95 - host node1 16 hdd 20.00969 1.00000 20 TiB 13 TiB 13 TiB 21 KiB 21 GiB 6.9 TiB 65.74 1.05 192 up osd.16 6 nvme 1.74660 1.00000 1.7 TiB 169 GiB 168 GiB 43 KiB 675 MiB 1.6 TiB 9.44 0.15 32 up osd.6 11 ssd 3.49309 1.00000 3.5 TiB 1.7 TiB 1.7 TiB 58 KiB 3.4 GiB 1.8 TiB 48.80 0.78 193 up osd.11 -9 23.43108 - 23 TiB 15 TiB 15 TiB 141 KiB 32 GiB 8.4 TiB 64.15 1.03 - host node2 3 hdd 5.45799 1.00000 5.5 TiB 4.2 TiB 4.2 TiB 8 KiB 7.5 GiB 1.2 TiB 77.21 1.24 62 up osd.3 8 hdd 12.73340 1.00000 13 TiB 8.9 TiB 8.9 TiB 36 KiB 19 GiB 3.8 TiB 70.25 1.12 130 up osd.8 10 nvme 1.74660 1.00000 1.7 TiB 170 GiB 168 GiB 42 KiB 1.9 GiB 1.6 TiB 9.51 0.15 32 up osd.10 13 ssd 3.49309 1.00000 3.5 TiB 1.7 TiB 1.7 TiB 55 KiB 3.6 GiB 1.8 TiB 48.80 0.78 193 up osd.13 -5 23.43108 - 23 TiB 15 TiB 15 TiB 147 KiB 28 GiB 8.4 TiB 64.13 1.03 - host node3 0 hdd 5.45799 1.00000 5.5 TiB 4.0 TiB 4.0 TiB 11 KiB 7.3 GiB 1.4 TiB 73.45 1.17 62 up osd.0 1 hdd 12.73340 1.00000 13 TiB 9.1 TiB 9.1 TiB 26 KiB 16 GiB 3.6 TiB 71.84 1.15 130 up osd.1 5 nvme 1.74660 1.00000 1.7 TiB 169 GiB 168 GiB 53 KiB 660 MiB 1.6 TiB 9.44 0.15 32 up osd.5 14 ssd 3.49309 1.00000 3.5 TiB 1.7 TiB 1.7 TiB 57 KiB 3.7 GiB 1.8 TiB 48.81 0.78 193 up osd.14 -7 14.59637 - 0 B 0 B 0 B 0 B 0 B 0 B 0 0 - host obsolete 2 hdd 12.73340 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down osd.2 4 nvme 0.93149 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down osd.4 12 nvme 0.93149 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down osd.12 TOTAL 72 TiB 45 TiB 45 TiB 414 KiB 84 GiB 27 TiB 62.51
 
Sorry, I can't really read it on my smartphone. As already noted, please use the Code tag and not the Inline Code tag. Then it stays readable and the spacing is preserved.
 
Sorry, I had missed that. I edited my post above accordingly.
It's still the inline code tag :)

This is the code tag:
Code:
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME           
-1         86.70790         -   72 TiB   45 TiB   45 TiB  410 KiB   84 GiB   27 TiB  62.51  1.00    -          root default       
-3         25.24937         -   25 TiB   15 TiB   15 TiB  122 KiB   25 GiB   10 TiB  59.50  0.95    -              host node1
16    hdd  20.00969   1.00000   20 TiB   13 TiB   13 TiB   21 KiB   21 GiB  6.9 TiB  65.74  1.05  192      up          osd.16     
 6   nvme   1.74660   1.00000  1.7 TiB  169 GiB  168 GiB   43 KiB  675 MiB  1.6 TiB   9.44  0.15   32      up          osd.6       
11    ssd   3.49309   1.00000  3.5 TiB  1.7 TiB  1.7 TiB   58 KiB  3.4 GiB  1.8 TiB  48.80  0.78  193      up          osd.11     
-9         23.43108         -   23 TiB   15 TiB   15 TiB  141 KiB   32 GiB  8.4 TiB  64.15  1.03    -              host node2
 3    hdd   5.45799   1.00000  5.5 TiB  4.2 TiB  4.2 TiB    8 KiB  7.5 GiB  1.2 TiB  77.21  1.24   62      up          osd.3       
 8    hdd  12.73340   1.00000   13 TiB  8.9 TiB  8.9 TiB   36 KiB   19 GiB  3.8 TiB  70.25  1.12  130      up          osd.8       
10   nvme   1.74660   1.00000  1.7 TiB  170 GiB  168 GiB   42 KiB  1.9 GiB  1.6 TiB   9.51  0.15   32      up          osd.10     
13    ssd   3.49309   1.00000  3.5 TiB  1.7 TiB  1.7 TiB   55 KiB  3.6 GiB  1.8 TiB  48.80  0.78  193      up          osd.13     
-5         23.43108         -   23 TiB   15 TiB   15 TiB  147 KiB   28 GiB  8.4 TiB  64.13  1.03    -              host node3
 0    hdd   5.45799   1.00000  5.5 TiB  4.0 TiB  4.0 TiB   11 KiB  7.3 GiB  1.4 TiB  73.45  1.17   62      up          osd.0       
 1    hdd  12.73340   1.00000   13 TiB  9.1 TiB  9.1 TiB   26 KiB   16 GiB  3.6 TiB  71.84  1.15  130      up          osd.1       
 5   nvme   1.74660   1.00000  1.7 TiB  169 GiB  168 GiB   53 KiB  660 MiB  1.6 TiB   9.44  0.15   32      up          osd.5       
14    ssd   3.49309   1.00000  3.5 TiB  1.7 TiB  1.7 TiB   57 KiB  3.7 GiB  1.8 TiB  48.81  0.78  193      up          osd.14     
-7         14.59637         -      0 B      0 B      0 B      0 B      0 B      0 B      0     0    -              host obsolete
 2    hdd  12.73340         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.2       
 4   nvme   0.93149         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.4       
12   nvme   0.93149         0      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.12     
                        TOTAL   72 TiB   45 TiB   45 TiB  414 KiB   84 GiB   27 TiB  62.51

Yes, they were on another node that I have since removed, but I missed the window to remove the OSDs first... I haven't got around to deleting them from <wherever>. They don't play a role here.
You shouldn't leave it like this, as it will potentially continue to have an impact on your crush map.

See: https://docs.ceph.com/en/latest/rados/operations/crush-map/#modifying-the-crush-map
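For reference, a sketch of the usual cleanup of such leftover OSDs (double-check the IDs first; osd.2, osd.4, osd.12 and the host bucket "obsolete" are what your osd df tree shows):
Code:
# remove a leftover OSD completely (CRUSH entry, auth key, OSD id)
ceph osd crush remove osd.2
ceph auth del osd.2
ceph osd rm 2

# or, as a shortcut:
ceph osd purge 2 --yes-i-really-mean-it

# once all its OSDs are gone, remove the now-empty host bucket as well
ceph osd crush remove obsolete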

ceph pg dump_stuck inactive
ok
One moment of not paying attention... "inactive" obviously doesn't fit here. Take a look at what ceph pg dump_stuck unclean outputs. If nothing comes back, then it doesn't matter.

Have you checked whether the affected OSDs (1, 3 and 16) show any abnormalities? Are there any suspicious SMART values, or is there something in the syslog?
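If you want to check that on the command line, something like this (the device paths are just examples, pick the disks behind osd.1, osd.3 and osd.16):
Code:
# SMART health, attributes and error log of the disk behind an OSD
smartctl -a /dev/sdX

# kernel messages / syslog entries about I/O errors
dmesg -T | grep -i "i/o error"
journalctl -k --since "7 days ago" | grep -i "error"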

Please run the following commands:
Code:
rados list-inconsistent-obj 3.13 --format=json-pretty
rados list-inconsistent-obj 3.36 --format=json-pretty
rados list-inconsistent-obj 3.48 --format=json-pretty

Before you even consider a repair, you should make sure that these errors are not caused by physical defects in the disks. If all of your affected OSDs are defective, Ceph cannot necessarily help you get back to a consistent state.

But I also assume that you run the HDDs in their own pool and have not mixed them with SSD and NVMe. Please note that your current fill level will end in a complete catastrophe. You have two HDDs each in node2 and node3; with a replica of 3, Ceph has no choice but to keep one copy of the data on each node. However, on node2 and node3 your HDDs are so full that the failure of one HDD can no longer be compensated for and will inevitably end in read-only. To keep operating even in the event of a failure, your HDDs should be filled to at most about 42.5%: if one of the two HDDs in a node dies, the surviving one has to absorb its data and then sits at roughly 85%, i.e. right at the near-full threshold, which is not healthy either, but at your current fill level a failure would put you straight at full.

With only one NVMe and one SSD per node, this problem does not exist, because there is simply no second device of the same class to recover onto, so Ceph does not attempt any self-healing there. The problem only arises once a node has two devices of the same type.
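The thresholds this is based on can be checked like this (the values shown are the Ceph defaults; yours may differ):
Code:
ceph osd dump | grep -i ratio
# typically:
#   full_ratio 0.95
#   backfillfull_ratio 0.9
#   nearfull_ratio 0.85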
 
  • Like
Reactions: fluxX04
You shouldn't leave it like this, as it will potentially continue to have an impact on your crush map.
Done.

One moment of not paying attention... "inactive" obviously doesn't fit here.
Sorry, I don't understand. Where/when wasn't I paying attention? And what do you mean by "'inactive' obviously doesn't fit here"? Sorry, my Ceph knowledge is at "noob" level...

Take a look at what ceph pg dump_stuck unclean outputs. If nothing comes back, then it doesn't matter.
Code:
PG_STAT  STATE                                     UP        UP_PRIMARY  ACTING    ACTING_PRIMARY
3.76                  active+remapped+backfilling  [0,8,16]           0  [1,8,16]               1
3.6f                  active+remapped+backfilling  [16,3,1]          16  [16,3,0]              16
3.68                  active+remapped+backfilling  [8,16,0]           8  [8,16,1]               8
3.64                  active+remapped+backfilling  [0,3,16]           0  [1,16,8]               1
3.14                  active+remapped+backfilling  [3,1,16]           3  [0,16,8]               0
3.22                  active+remapped+backfilling  [3,16,1]           3  [3,16,0]               3
3.5c                  active+remapped+backfilling  [3,1,16]           3  [3,16,0]               3
3.57                  active+remapped+backfilling  [0,16,8]           0  [1,16,8]               1
3.39                  active+remapped+backfilling  [8,0,16]           8  [0,16,3]               0
3.23                  active+remapped+backfilling  [8,0,16]           8  [0,16,3]               0
3.3a                  active+remapped+backfilling  [16,1,3]          16  [16,1,8]              16
3.2f                  active+remapped+backfilling  [1,16,8]           1  [0,16,8]               0
3.43                  active+remapped+backfilling  [3,16,1]           3  [3,16,0]               3
3.36     active+remapped+inconsistent+backfilling  [3,0,16]           3  [3,16,1]               3
3.48     active+remapped+inconsistent+backfilling  [1,16,8]           1  [1,16,3]               1
3.2d                  active+remapped+backfilling  [16,1,3]          16  [16,1,8]              16
3.2c                  active+remapped+backfilling  [16,0,8]          16  [16,8,1]              16
3.45                  active+remapped+backfilling  [0,16,8]           0  [0,16,3]               0
3.2a                  active+remapped+backfilling  [16,8,1]          16  [16,1,3]              16
ok

There is suddenly a lot more going on, which I believe was triggered by my removing the no-longer-existing OSDs (and the respective node) from the CRUSH map.

Have you checked whether the affected OSDs (1, 3 and 16) show any abnormalities? Are there any suspicious SMART values, or is there something in the syslog?
They all seem to be performing perfectly fine. No SMART errors or abnormal values.

rados list-inconsistent-obj 3.13 --format=json-pretty
Code:
{
    "epoch": 35190,
    "inconsistents": [
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000ac9f6",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 253390
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000ac9f6",
                    "key": "",
                    "snapid": -2,
                    "hash": 3075686803,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'287249",
                "prior_version": "29581'253390",
                "last_reqid": "osd.1.0:6867405",
                "user_version": 253390,
                "size": 2867200,
                "mtime": "2023-08-01T02:02:00.479577+0200",
                "local_mtime": "2023-08-01T02:02:00.482681+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0x0b39ce3d",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 2867200,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x0b39ce3d"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 2867200
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 2867200,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x0b39ce3d"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000cfe64",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 261741
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000cfe64",
                    "key": "",
                    "snapid": -2,
                    "hash": 4027877011,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'286775",
                "prior_version": "29590'261741",
                "last_reqid": "osd.1.0:6865883",
                "user_version": 261741,
                "size": 4194304,
                "mtime": "2023-08-02T20:18:40.541011+0200",
                "local_mtime": "2023-08-02T20:18:40.542417+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0xba1280a7",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xba1280a7"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xba1280a7"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000d8103",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 262985
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000d8103",
                    "key": "",
                    "snapid": -2,
                    "hash": 1966136979,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'287141",
                "prior_version": "29591'262985",
                "last_reqid": "osd.1.0:6867081",
                "user_version": 262985,
                "size": 4194304,
                "mtime": "2023-08-02T23:14:57.219986+0200",
                "local_mtime": "2023-08-02T23:14:57.223069+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0x09cedf2b",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x09cedf2b"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x09cedf2b"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.08743f59d73563.00000000000e2696",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 264575
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.08743f59d73563.00000000000e2696",
                    "key": "",
                    "snapid": -2,
                    "hash": 2539886995,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "29607'287558",
                "prior_version": "29593'264575",
                "last_reqid": "osd.1.0:6868421",
                "user_version": 264575,
                "size": 4194304,
                "mtime": "2023-08-03T02:36:22.059766+0200",
                "local_mtime": "2023-08-03T02:36:22.064144+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0xf41e4424",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xf41e4424"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xf41e4424"
                }
            ]
        },
to be continued due to post size limitations...
 
Code:
        {
            "object": {
                "name": "rbd_data.453d0387ce7766.00000000001028e4",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 371918
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.453d0387ce7766.00000000001028e4",
                    "key": "",
                    "snapid": -2,
                    "hash": 3162397075,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "35232'371918",
                "prior_version": "29661'302824",
                "last_reqid": "client.55474939.0:843152145",
                "user_version": 371918,
                "size": 4194304,
                "mtime": "2024-01-02T21:49:09.007652+0100",
                "local_mtime": "2024-01-02T21:49:09.108641+0100",
                "lost": 0,
                "flags": [
                    "dirty",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0xffffffff",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xae6d213f"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4194304
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xae6d213f"
                }
            ]
        },
        {
            "object": {
                "name": "rbd_data.453d0387ce7766.00000000001c7b4a",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 331944
            },
            "errors": [],
            "union_shard_errors": [
                "read_error"
            ],
            "selected_object_info": {
                "oid": {
                    "oid": "rbd_data.453d0387ce7766.00000000001c7b4a",
                    "key": "",
                    "snapid": -2,
                    "hash": 3892734611,
                    "max": 0,
                    "pool": 3,
                    "namespace": ""
                },
                "version": "30394'334313",
                "prior_version": "30206'331944",
                "last_reqid": "osd.1.0:12408390",
                "user_version": 331944,
                "size": 2523136,
                "mtime": "2023-09-29T14:07:21.843659+0200",
                "local_mtime": "2023-09-29T14:07:21.847260+0200",
                "lost": 0,
                "flags": [
                    "dirty",
                    "data_digest",
                    "omap_digest"
                ],
                "truncate_seq": 0,
                "truncate_size": 0,
                "data_digest": "0x5b6b4d3a",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 2523136,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x5b6b4d3a"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 2523136
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 2523136,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x5b6b4d3a"
                }
            ]
        },
 
Code:
                "data_digest": "0xec9c58e1",
                "omap_digest": "0xffffffff",
                "expected_object_size": 4194304,
                "expected_write_size": 4194304,
                "alloc_hint_flags": 0,
                "manifest": {
                    "type": 0
                },
                "watchers": {}
            },
            "shards": [
                {
                    "osd": 1,
                    "primary": true,
                    "errors": [],
                    "size": 4145152,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xec9c58e1"
                },
                {
                    "osd": 3,
                    "primary": false,
                    "errors": [
                        "read_error"
                    ],
                    "size": 4145152
                },
                {
                    "osd": 16,
                    "primary": false,
                    "errors": [],
                    "size": 4145152,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0xec9c58e1"
                }
            ]
        }
    ]
}
 
rados list-inconsistent-obj 3.36 --format=json-pretty

Code:
No scrub information available for pg 3.36
error 2: (2) No such file or directory

rados list-inconsistent-obj 3.48 --format=json-pretty

Code:
No scrub information available for pg 3.48
error 2: (2) No such file or directory

But I also assume that you run the HDDs in their own pool and have not mixed them with SSD and NVMe.
Correct.

Please note that your current fill level will end in a complete catastrophe. You have two HDDs each in node2 and node3; with a replica of 3, Ceph has no choice but to keep one copy of the data on each node. However, on node2 and node3 your HDDs are so full that the failure of one HDD can no longer be compensated for and will inevitably end in read-only. To keep operating even in the event of a failure, your HDDs should be filled to at most about 42.5%: if one of the two HDDs in a node dies, the surviving one has to absorb its data and then sits at roughly 85%, i.e. right at the near-full threshold, which is not healthy either, but at your current fill level a failure would put you straight at full.

With only one NVMe and one SSD per node, this problem does not exist, because there is simply no second device of the same class to recover onto, so Ceph does not attempt any self-healing there. The problem only arises once a node has two devices of the same type.

Thank you for the recommendation. I am aware of this problem and am in the process of rectifying it. But this being a hobby project, I can't replace all disks at once and am acquiring one new disk per month. Node1 has been completed, Node2 is due this month, and Node3 will have to wait until February.
 
Sorry, I don't understand. Where/when wasn't I paying attention? And what do you mean by "'inactive' obviously doesn't fit here"? Sorry, my Ceph knowledge is at "noob" level...
Sorry, that wasn't directed at you but at me. I wasn't paying attention and sent you the wrong command.
{ "osd": 3, "primary": false, "errors": [ "read_error" ], "size": 2867200 },
So there was a read error, which usually indicates a problem with the disk itself. You should therefore examine osd.3 more closely or at least keep an eye on it.

You can now try whether a repair fixes the problem, but the error may persist, as a read_error usually points to an underlying storage problem.
Code:
ceph pg repair 3.13
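If you want to follow what the repair does and verify the result afterwards, something along these lines:
Code:
# watch the cluster log / health while the repair runs
ceph -w

# afterwards, check whether the PG is clean again
ceph health detail
rados list-inconsistent-obj 3.13 --format=json-pretty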

If this occurs frequently, it could also indicate a problem with your NTP service. You should definitely monitor your cluster for clock skew as well; if the clocks on the nodes drift apart, this can lead to serious problems.
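A quick way to check that (assuming chrony, which recent Proxmox VE versions install by default; adjust if you use systemd-timesyncd or ntpd):
Code:
# clock skew as seen by the Ceph monitors
ceph time-sync-status

# time synchronisation status on each node
chronyc tracking
timedatectl status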
 
So there was a read error, which usually indicates a problem with the disk itself. You should therefore examine osd.3 more closely or at least keep an eye on it.
I just went through upgrading my cluster to PVE version 8. After the upgrade, osd.3 did not come back online (it kept being shown as "in" but "down", no matter how often I restarted it or the entire node).

I then checked the SMART values for the drive again, and this time I did find several read errors (unsure how I could have missed them the first time). So I replaced the drive and am now waiting for it to be filled again. Interestingly, Ceph's health status page now only shows 2 pgs as inconsistent.

(So what does it mean anyway when a pg is inconsistent? Does that mean that the other two drives holding the "same" data are actually holding different data? In that case, it would seem easy to repair the information on the inconsistent pg...)

The remapping and backfilling is estimated to take between another 13 hours and a day. Let's see where this goes. I'll report back.
 
Ceph's health status page now only shows 2 pgs as inconsistent.
The PG 3.36 is gone, right?

(So what does it mean anyway when a pg is inconsistent? Does that mean that the other two drives holding the "same" data are actually holding different data? In that case, it would seem easy to repair the information on the inconsistent pg...)
Here you can find some more information: https://docs.ceph.com/en/latest/rados/operations/pg-repair/#more-information-on-pg-repair
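If you want to re-verify a specific PG yourself once the backfill has finished, you can also trigger a deep scrub of that PG manually, e.g.:
Code:
ceph pg deep-scrub 3.13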

If that doesn't answer your question or you have more, just ask again :)
However, I always think it is better to refer to the official documentation when there is an article for it than to try to explain it myself.
 
3.36 is now gone as well.

3.13 is active, clean and inconsistent. (But there is one pg still being deep scrubbed - so maybe that is 3.13 and it will be gone soon, too?)

I'll report back.

BTW: Today, the second 22TB HDD for this pool arrived and I can replace the OSDs in one more node. But I will wait for the deep scrubbing to finish first.
 
3.13 is now gone as well. (1 pg is still being deep scrubbed)

So (other than the 1 pg being deep scrubbed) all is green again - woohoo!

And now I will shake things up a bit again by replacing the two smaller HDDs in Node2 with the one new 22TB HDD (and hope this goes smoothly).
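For the record, the rough sequence I intend to follow per replaced disk, as I understand it from the docs (so take it with a grain of salt):
Code:
# drain the old OSD and wait until the cluster is healthy again
ceph osd out <old-osd-id>
watch ceph -s

# then stop it, remove it, and create the new OSD on the new disk
systemctl stop ceph-osd@<old-osd-id>.service
pveceph osd destroy <old-osd-id> --cleanup
pveceph osd create /dev/<new-disk>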

Thanks for your support!
 
