[SOLVED] Ceph Health Warning

ssaman

Current state:
Code:
HEALTH_WARN 2202024/8010258 objects misplaced (27.490%); Degraded data redundancy: 1529819/8010258 objects degraded (19.098%), 297 pgs degraded, 292 pgs undersized
OBJECT_MISPLACED 2202024/8010258 objects misplaced (27.490%)
PG_DEGRADED Degraded data redundancy: 1529819/8010258 objects degraded (19.098%), 297 pgs degraded, 292 pgs undersized
    pg 10.1a1 is active+undersized+degraded+remapped+backfill_wait, acting [9,0]
    pg 10.1a2 is stuck undersized for 1080.687532, current state active+undersized+degraded+remapped+backfill_wait, last acting [9,2]
    pg 10.1a3 is stuck undersized for 1094.744400, current state active+undersized+degraded+remapped+backfill_wait, last acting [8,2]
    pg 10.1a5 is stuck undersized for 1094.536700, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,7]
    pg 10.1a6 is stuck undersized for 1107.473438, current state active+undersized+degraded+remapped+backfill_wait, last acting [6,0]
    pg 10.1a7 is stuck undersized for 1081.659077, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,9]
    pg 10.1aa is stuck undersized for 1161.469348, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,6]
    pg 10.1ab is stuck undersized for 1081.685938, current state active+undersized+degraded+remapped+backfill_wait, last acting [8,2]
    pg 10.1ac is stuck undersized for 1080.660866, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,9]
    pg 10.1ae is stuck undersized for 1081.681883, current state active+undersized+degraded+remapped+backfill_wait, last acting [0,8]
    pg 10.1af is stuck undersized for 1093.752812, current state active+undersized+degraded+remapped+backfill_wait, last acting [6,1]
    pg 10.1b0 is stuck undersized for 1080.694461, current state active+undersized+degraded+remapped+backfill_wait, last acting [9,2]
    pg 10.1b1 is stuck undersized for 1080.691981, current state active+undersized+degraded+remapped+backfill_wait, last acting [9,2]
    pg 10.1b3 is stuck undersized for 1107.482624, current state active+undersized+degraded+remapped+backfill_wait, last acting [8,2]
    pg 10.1b4 is stuck undersized for 1161.475753, current state active+undersized+degraded+remapped+backfill_wait, last acting [9,3]
    pg 10.1b5 is stuck undersized for 1081.701087, current state active+undersized+degraded+remapped+backfill_wait, last acting [6,3]
    pg 10.1b7 is stuck undersized for 1080.664060, current state active+undersized+degraded+remapped+backfill_wait, last acting [1,8]
    pg 10.1b9 is stuck undersized for 1080.675157, current state active+undersized+degraded+remapped+backfill_wait, last acting [0,6]
    pg 10.1ba is stuck undersized for 1080.700778, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,1]
    pg 10.1bb is stuck undersized for 1081.679482, current state active+undersized+degraded+remapped+backfill_wait, last acting [8,1]
    pg 10.1be is stuck undersized for 1094.541709, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,7]
    pg 10.1c0 is stuck undersized for 1094.700153, current state active+undersized+degraded+remapped+backfill_wait, last acting [0,7]
    pg 10.1c1 is stuck undersized for 1081.667484, current state active+undersized+degraded+remapped+backfill_wait, last acting [1,9]
    pg 10.1c5 is stuck undersized for 1080.697481, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,0]
    pg 10.1c8 is stuck undersized for 1093.659052, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,9]
    pg 10.1ca is stuck undersized for 1093.704534, current state active+undersized+degraded+remapped+backfill_wait, last acting [0,8]
    pg 10.1cd is stuck undersized for 1081.683283, current state active+undersized+degraded+remapped+backfill_wait, last acting [0,8]
    pg 10.1ce is stuck undersized for 1108.482487, current state active+undersized+degraded+remapped+backfill_wait, last acting [0,8]
    pg 10.1d3 is stuck undersized for 1094.662001, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,9]
    pg 10.1d4 is stuck undersized for 1107.449976, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,8]
    pg 10.1d7 is stuck undersized for 1161.477670, current state active+undersized+degraded+remapped+backfill_wait, last acting [1,9]
    pg 10.1dc is stuck undersized for 1108.505061, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,0]
    pg 10.1df is stuck undersized for 1081.675267, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,9]
    pg 10.1e0 is stuck undersized for 1093.747868, current state active+undersized+degraded+remapped+backfill_wait, last acting [9,1]
    pg 10.1e1 is stuck undersized for 1094.657532, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,6]
    pg 10.1e3 is stuck undersized for 1093.728888, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,6]
    pg 10.1e4 is stuck undersized for 1107.466004, current state active+undersized+degraded+remapped+backfill_wait, last acting [1,9]
    pg 10.1e5 is stuck undersized for 1093.739119, current state active+undersized+degraded+remapped+backfill_wait, last acting [1,9]
    pg 10.1e7 is stuck undersized for 1107.490762, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,1]
    pg 10.1e8 is stuck undersized for 1081.670516, current state active+undersized+degraded+remapped+backfill_wait, last acting [3,9]
    pg 10.1e9 is stuck undersized for 1080.703190, current state active+undersized+degraded+remapped+backfill_wait, last acting [7,2]
    pg 10.1eb is stuck undersized for 1080.700021, current state active+undersized+degraded+remapped+backfill_wait, last acting [9,1]
    pg 10.1ec is stuck undersized for 1080.691338, current state active+undersized+degraded+remapped+backfill_wait, last acting [6,1]
    pg 10.1f1 is stuck undersized for 1080.662770, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,6]
    pg 10.1f2 is stuck undersized for 1107.476919, current state active+undersized+degraded+remapped+backfill_wait, last acting [8,1]
    pg 10.1f3 is stuck undersized for 1081.655963, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,7]
    pg 10.1f5 is stuck undersized for 1107.469215, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,7]
    pg 10.1f6 is stuck undersized for 1093.755114, current state active+undersized+degraded+remapped+backfill_wait, last acting [9,2]
    pg 10.1f8 is stuck undersized for 1107.488552, current state active+undersized+degraded+remapped+backfill_wait, last acting [8,3]
    pg 10.1fb is stuck undersized for 1081.680847, current state active+undersized+degraded+remapped+backfill_wait, last acting [0,9]
    pg 10.1fc is stuck undersized for 1093.714856, current state active+undersized+degraded+remapped+backfill_wait, last acting [2,7]
Code:
{
    "mon": {
        "ceph version 12.2.11 (c96e82ac735a75ae99d4847983711e1f2dbf12e5) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.11 (c96e82ac735a75ae99d4847983711e1f2dbf12e5) luminous (stable)": 4
    },
    "osd": {
        "ceph version 12.2.11 (c96e82ac735a75ae99d4847983711e1f2dbf12e5) luminous (stable)": 14
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.11 (c96e82ac735a75ae99d4847983711e1f2dbf12e5) luminous (stable)": 21
    }
}
Code:
ID  CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
 -1       75.09892 root default
 -3       31.43779     host node1
  0   hdd  7.27730         osd.0      up  1.00000 1.00000
  1   hdd  7.27730         osd.1      up  1.00000 1.00000
  2   hdd  7.27730         osd.2      up  1.00000 1.00000
  3   hdd  7.27730         osd.3      up  1.00000 1.00000
  4   ssd  1.45540         osd.4      up  1.00000 1.00000
  5   ssd  0.87320         osd.5      up  1.00000 1.00000
 -7       21.83057     host node2
  6   hdd  5.45740         osd.6      up  1.00000 1.00000
  7   hdd  5.45740         osd.7      up  1.00000 1.00000
  8   hdd  5.45789         osd.8      up  1.00000 1.00000
  9   hdd  5.45789         osd.9      up  1.00000 1.00000
-10       21.83057     host node3
 10   hdd  5.45740         osd.10     up  1.00000 1.00000
 11   hdd  5.45740         osd.11     up  1.00000 1.00000
 12   hdd  5.45789         osd.12     up  1.00000 1.00000
 13   hdd  5.45789         osd.13     up  1.00000 1.00000
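The per-OSD utilisation can be listed as well, which shows the distribution across the hosts more directly; this is the standard command:
Code:
# usage and weight per OSD, grouped by host
ceph osd df tree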
Code:
pool 10 'hdd_mainpool' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 512 pgp_num 512 last_change 222 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3]
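For completeness, the individual pool settings from the line above can also be queried directly (standard commands; 'hdd_mainpool' is the pool from the dump):
Code:
ceph osd pool get hdd_mainpool size
ceph osd pool get hdd_mainpool min_size
ceph osd pool get hdd_mainpool pg_num
ceph osd pool get hdd_mainpool crush_rule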
Code:
  cluster:
    id:     e999c2ba-bd91-41d1-92b1-c7874b4b2b40
    health: HEALTH_WARN
            2203136/8010258 objects misplaced (27.504%)
            Degraded data redundancy: 1530933/8010258 objects degraded (19.112%), 298 pgs degraded, 293 pgs undersized

  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: c6-node1(active), standbys: node1, node2, node3
    osd: 14 osds: 14 up, 14 in; 512 remapped pgs

  data:
    pools:   1 pools, 512 pgs
    objects: 2.67M objects, 10.2TiB
    usage:   26.0TiB used, 49.1TiB / 75.1TiB avail
    pgs:     1530933/8010258 objects degraded (19.112%)
             2203136/8010258 objects misplaced (27.504%)
             292 active+undersized+degraded+remapped+backfill_wait
             214 active+remapped+backfill_wait
             5   active+recovery_wait+degraded+remapped
             1   active+undersized+degraded+remapped+backfilling

  io:
    client:   104KiB/s rd, 2.72MiB/s wr, 11op/s rd, 438op/s wr
    recovery: 25.8MiB/s, 6objects/s
Background: we had a wrongly calculated pool for node1.
After we added two nodes to the cluster, we created a new pool with the settings you can see above.
After that, we moved all VM disks to the new pool and destroyed the old one.
Unfortunately, we now have this warning.

I hope someone can help us.
 
Can you please share your crush rule for the pool?
The 'ceph osd tree' shows an uneven distribution of your OSDs; the smaller nodes will limit the amount of data that can be stored.

Background: we had a wrongly calculated pool for node1.
After we added two nodes to the cluster, we created a new pool with the settings you can see above.
A pool always spans the whole cluster. Depending on your settings, the recovery would take place anyway.
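The full CRUSH map can be extracted and decompiled like this (the file names are only examples):
Code:
# dump the compiled CRUSH map and decompile it into readable text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt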
 
We use replicated_hdd for the pool:
Code:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class ssd
device 5 osd.5 class ssd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host node1 {
   id -3       # do not change unnecessarily
   id -2 class hdd       # do not change unnecessarily
   id -5 class ssd       # do not change unnecessarily
   # weight 31.438
   alg straw2
   hash 0   # rjenkins1
   item osd.0 weight 7.277
   item osd.1 weight 7.277
   item osd.2 weight 7.277
   item osd.3 weight 7.277
   item osd.4 weight 1.455
   item osd.5 weight 0.873
}
host node2 {
   id -7       # do not change unnecessarily
   id -8 class hdd       # do not change unnecessarily
   id -9 class ssd       # do not change unnecessarily
   # weight 21.831
   alg straw2
   hash 0   # rjenkins1
   item osd.6 weight 5.457
   item osd.7 weight 5.457
   item osd.8 weight 5.458
   item osd.9 weight 5.458
}
host node3 {
   id -10       # do not change unnecessarily
   id -11 class hdd       # do not change unnecessarily
   id -12 class ssd       # do not change unnecessarily
   # weight 21.831
   alg straw2
   hash 0   # rjenkins1
   item osd.10 weight 5.457
   item osd.11 weight 5.457
   item osd.12 weight 5.458
   item osd.13 weight 5.458
}
root default {
   id -1       # do not change unnecessarily
   id -4 class hdd       # do not change unnecessarily
   id -6 class ssd       # do not change unnecessarily
   # weight 75.099
   alg straw2
   hash 0   # rjenkins1
   item node1 weight 31.438
   item node2 weight 21.831
   item node3 weight 21.831
}

# rules
rule replicated_rule {
   id 0
   type replicated
   min_size 1
   max_size 10
   step take default
   step chooseleaf firstn 0 type host
   step emit
}
rule replicated_ssd {
   id 1
   type replicated
   min_size 1
   max_size 10
   step take default class ssd
   step chooseleaf firstn 0 type host
   step emit
}
rule replicated_hdd {
   id 2
   type replicated
   min_size 1
   max_size 10
   step take default class hdd
   step chooseleaf firstn 0 type host
   step emit
}

# end crush map
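The rule the pool points to (crush_rule 2) can also be dumped directly, in case the decompiled map is not at hand:
Code:
ceph osd crush rule dump replicated_hdd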
 
After that, we moved all VM disks to the new pool and destroyed the old one.
How did you move the disks? With 'move disk' in the GUI or directly through Ceph?
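(With 'move disk' I mean the GUI action, which roughly corresponds to the following CLI call; the VM ID, disk name and target storage below are placeholders:)
Code:
# move a VM disk to another storage and delete the source copy
qm move_disk <vmid> <disk> <target-storage> --delete 1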
 
How was the timeline for Ceph? Did you first add the OSDs and then create the pool, or was it the other way around?
 
  1. added OSDs on node1
  2. created new pool (old pool)
  3. created VMs on node1
  4. added node2
  5. created new pool with current settings
  6. moved disks from old pool to current pool
  7. removed unused disks (old pool) via the GUI, except for 2 VMs
  8. destroyed old pool
  9. removed the last unused disks (from the 2 VMs mentioned before) from the old pool
 
I assume the recovery has made progress?

4. added node2
Does this mean you added the OSDs to node2 & node3? If so, then the creation of the first pool (the destroyed one) never completed properly. The min_size of 2 should have made it impossible to write to the pool in the first place.
 
The old pool had "min_size 1" because it was only a temporary pool to add VMs from an old cluster.

Yes, we added node2 and node3 to the cluster and also added the disks (OSDs) on those nodes to the new pool.
 
Aside from the data movement, is the recovery progressing?
 
Yes, the Ceph health is getting better.
Since yesterday:
Code:
# ceph health
HEALTH_WARN 1987253/8010258 objects misplaced (24.809%); Degraded data redundancy: 970715/8010258 objects degraded (12.118%), 187 pgs degraded, 187 pgs undersized
Less misplaced and degraded data.
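To follow the progress we simply re-run the usual status commands, e.g.:
Code:
# stream cluster status/health changes continuously
ceph -w
# or print a one-shot summary
ceph -s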

Edit:
Do you mean we just have to wait until Ceph recovers by itself?
 
Do you mean we just have to wait until Ceph recovers by itself?
:rolleyes::D

Well, that's one of Ceph's features: self-healing. At the beginning of the thread the recovery was running at 25 MB/s and there were roughly 4 TB left to recover, so the recovery time could be around 2 days if the recovery speed stays steady at 25 MB/s.

The recovery can be accelerated, but this usually comes at the cost of interfering with client I/O.
https://forum.proxmox.com/threads/increase-ceph-recovery-speed.36728/
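As a rough sketch of what such tuning looks like on Luminous (the values are only examples and should be reverted to the defaults once the recovery has finished):
Code:
# allow more parallel backfill/recovery per OSD (example values)
ceph tell 'osd.*' injectargs '--osd-max-backfills 2 --osd-recovery-max-active 4'
# revert to the Luminous defaults afterwards
ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'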
 
