PVE Ceph Stuck

rvalencia0 | New Member | Apr 21, 2022
A couple of hours ago I set up 3 SSDs and 4 SATA disks and added them to the Ceph configuration. At first everything seemed fine, but now I can no longer reach the Proxmox GUI on one of the servers. I can still reach it through another server in the cluster, but it does not let me make any changes. When I try to view the Ceph status there, it shows no information at all, as if it were stuck. Any modification has to be done through the CLI, since the GUI does not show the OSDs or pools created in Proxmox. If someone could help me (at cost) ASAP, that would be great.
 
# ceph -s
  cluster:
    id:     45f49255-3148-4086-8c7d-eb470216a1cc
    health: HEALTH_WARN
            1 pools have many more objects per pg than average
            Reduced data availability: 372 pgs inactive
            Degraded data redundancy: 35/61538 objects degraded (0.057%), 31 pgs degraded, 372 pgs undersized

  services:
    mon: 2 daemons, quorum cls1-ca,clm-ca (age 9d)
    mgr: cls1-ca(active, since 9d), standbys: clm-ca
    osd: 7 osds: 7 up (since 16h), 7 in (since 16h); 12 remapped pgs

  data:
    pools:   3 pools, 385 pgs
    objects: 30.77k objects, 120 GiB
    usage:   247 GiB used, 53 TiB / 54 TiB avail
    pgs:     96.623% pgs not active
             35/61538 objects degraded (0.057%)
             1/61538 objects misplaced (0.002%)
             341 undersized+peered
             31 undersized+degraded+peered
             12 active+clean+remapped
             1 active+clean

  progress:
    PG autoscaler decreasing pool 3 PGs from 256 to 32 (16h)
      [............................]
    PG autoscaler decreasing pool 4 PGs from 128 to 32 (16h)
      [............................]
 
Seems like I need some additional information. How many nodes do you have in your ceph cluster? Is this running on one node or do you have multiple?

You have 2 monitors running, which is a bit iffy because if one monitor fails the cluster loses quorum. For production systems a minimum of 3 monitors is recommended.
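Once the cluster is responsive again, adding a monitor on a third node is quick. A rough sketch with the PVE tooling (run on the node that should become the monitor; Ceph packages must already be installed there, and the exact command name may differ between PVE versions):

pveceph mon create        # newer PVE releases; older ones use "pveceph createmon"
ceph mon stat             # afterwards the quorum should list three monitors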

Can you post the output of:
ceph health detail
 
I have 6 servers in the cluster; Ceph was installed a few weeks ago in order to move the VMs to Ceph storage, but they are still stored locally on each server.
We recently added one server to provide SSD storage to the Ceph cluster and created 2 pools, one for HDD and one for SSD. That server is not a monitor yet, but we can add all servers as Ceph monitors to avoid losing quorum.

Here is the output of ceph health detail:

HEALTH_WARN 1 pools have many more objects per pg than average; Reduced data availability: 372 pgs inactive; Degraded data redundancy: 35/61538 objects degraded (0.057%), 31 pgs degraded, 372 pgs undersized
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than average
pool device_health_metrics objects per pg (30733) is more than 389.025 times cluster average (79)
[WRN] PG_AVAILABILITY: Reduced data availability: 372 pgs inactive
pg 3.61 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.62 is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.63 is stuck inactive for 15h, current state undersized+peered, last acting [3]
pg 3.64 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.65 is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.66 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.67 is stuck inactive for 15h, current state undersized+peered, last acting [3]
pg 3.68 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.69 is stuck inactive for 15h, current state undersized+peered, last acting [3]
pg 3.6a is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.6b is stuck inactive for 15h, current state undersized+peered, last acting [2]
pg 3.6c is stuck inactive for 15h, current state undersized+peered, last acting [2]
pg 3.6d is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.6e is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.6f is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.70 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.71 is stuck inactive for 15h, current state undersized+peered, last acting [3]
pg 3.72 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.73 is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.74 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.75 is stuck inactive for 15h, current state undersized+degraded+peered, last acting [1]
pg 3.76 is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.77 is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.78 is stuck inactive for 15h, current state undersized+degraded+peered, last acting [0]
pg 3.79 is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.7a is stuck inactive for 15h, current state undersized+peered, last acting [0]
pg 3.fd is stuck inactive for 15h, current state undersized+peered, last acting [1]
pg 3.ff is stuck inactive for 15h, current state undersized+degraded+peered, last acting [0]
pg 4.60 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.61 is stuck inactive for 15h, current state undersized+peered, last acting [6]
pg 4.62 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.63 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.64 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.65 is stuck inactive for 15h, current state undersized+peered, last acting [6]
pg 4.66 is stuck inactive for 15h, current state undersized+peered, last acting [4]
pg 4.68 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.69 is stuck inactive for 15h, current state undersized+peered, last acting [4]
pg 4.6a is stuck inactive for 15h, current state undersized+peered, last acting [6]
pg 4.6b is stuck inactive for 15h, current state undersized+peered, last acting [6]
pg 4.6c is stuck inactive for 15h, current state undersized+peered, last acting [6]
pg 4.6d is stuck inactive for 15h, current state undersized+peered, last acting [4]
pg 4.6e is stuck inactive for 15h, current state undersized+peered, last acting [4]
pg 4.6f is stuck inactive for 15h, current state undersized+peered, last acting [4]
pg 4.70 is stuck inactive for 15h, current state undersized+peered, last acting [6]
pg 4.71 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.72 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.73 is stuck inactive for 15h, current state undersized+peered, last acting [4]
pg 4.74 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.75 is stuck inactive for 15h, current state undersized+peered, last acting [6]
pg 4.76 is stuck inactive for 15h, current state undersized+peered, last acting [5]
pg 4.77 is stuck inactive for 15h, current state undersized+peered, last acting [5]
[WRN] PG_DEGRADED: Degraded data redundancy: 35/61538 objects degraded (0.057%), 31 pgs degraded, 372 pgs undersized
pg 3.61 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.62 is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.63 is stuck undersized for 15h, current state undersized+peered, last acting [3]
pg 3.64 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.65 is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.66 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.67 is stuck undersized for 15h, current state undersized+peered, last acting [3]
pg 3.68 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.69 is stuck undersized for 15h, current state undersized+peered, last acting [3]
pg 3.6a is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.6b is stuck undersized for 15h, current state undersized+peered, last acting [2]
pg 3.6c is stuck undersized for 15h, current state undersized+peered, last acting [2]
pg 3.6d is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.6e is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.6f is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.70 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.71 is stuck undersized for 15h, current state undersized+peered, last acting [3]
pg 3.72 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.73 is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.74 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.75 is stuck undersized for 15h, current state undersized+degraded+peered, last acting [1]
pg 3.76 is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.77 is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.78 is stuck undersized for 15h, current state undersized+degraded+peered, last acting [0]
pg 3.79 is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.7a is stuck undersized for 15h, current state undersized+peered, last acting [0]
pg 3.fd is stuck undersized for 15h, current state undersized+peered, last acting [1]
pg 3.ff is stuck undersized for 15h, current state undersized+degraded+peered, last acting [0]
pg 4.60 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.61 is stuck undersized for 15h, current state undersized+peered, last acting [6]
pg 4.62 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.63 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.64 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.65 is stuck undersized for 15h, current state undersized+peered, last acting [6]
pg 4.66 is stuck undersized for 15h, current state undersized+peered, last acting [4]
pg 4.68 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.69 is stuck undersized for 15h, current state undersized+peered, last acting [4]
pg 4.6a is stuck undersized for 15h, current state undersized+peered, last acting [6]
pg 4.6b is stuck undersized for 15h, current state undersized+peered, last acting [6]
pg 4.6c is stuck undersized for 15h, current state undersized+peered, last acting [6]
pg 4.6d is stuck undersized for 15h, current state undersized+peered, last acting [4]
pg 4.6e is stuck undersized for 15h, current state undersized+peered, last acting [4]
pg 4.6f is stuck undersized for 15h, current state undersized+peered, last acting [4]
pg 4.70 is stuck undersized for 15h, current state undersized+peered, last acting [6]
pg 4.71 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.72 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.73 is stuck undersized for 15h, current state undersized+peered, last acting [4]
pg 4.74 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.75 is stuck undersized for 15h, current state undersized+peered, last acting [6]
pg 4.76 is stuck undersized for 15h, current state undersized+peered, last acting [5]
pg 4.77 is stuck undersized for 15h, current state undersized+peered, last acting [5]
 
Can I destroy all OSDs and pools in order to fix this issue? I have no important data on the Ceph storage.
 
I have my suspicions; can you maybe also provide the output of ceph osd tree? Could it be that the new pools are only available on one host? That would explain the problem in my opinion, and the output of the command should clarify this.
 
here is the osd tree:

ID  CLASS  WEIGHT    TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         53.55347  root default
-7          2.61987      host clc5-ca
 4    ssd   0.87329          osd.4          up   1.00000  1.00000
 5    ssd   0.87329          osd.5          up   1.00000  1.00000
 6    ssd   0.87329          osd.6          up   1.00000  1.00000
-3         50.93359      host cls1-ca
 0    hdd  12.73340          osd.0          up   1.00000  1.00000
 1    hdd  12.73340          osd.1          up   1.00000  1.00000
 2    hdd  12.73340          osd.2          up   1.00000  1.00000
 3    hdd  12.73340          osd.3          up   1.00000  1.00000
 
So I think the issue here is that you have one host with only ssd storage and one host with only hdd storage. When Ceph tries to replicate the objects of a pool, it cannot find a second host that also has hdd/ssd storage, which is why I think you are seeing the problems you describe.
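You can also see this per PG: querying one of the stuck PGs from your health output (3.61 is just the first one listed) should show that only a single OSD ends up in the acting set, i.e. Ceph cannot find a second host to place the replica on:

ceph pg 3.61 query | less     # look at the "up"/"acting" sets and the recovery_state section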
 
Additionally, your crushmap would be interesting. You can retrieve it via the GUI: node -> Ceph -> Configuration; the right column should contain the crushmap.
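If the GUI stays stuck, the crushmap can also be dumped on the CLI; something along these lines (standard Ceph tooling, the file names are just examples):

ceph osd getcrushmap -o crushmap.bin        # fetch the compiled crushmap
crushtool -d crushmap.bin -o crushmap.txt   # decompile it into readable text
cat crushmap.txt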


Just for clarification: with 6 nodes, do you mean the PVE cluster or the Ceph cluster itself?
 
The 6 nodes are in the Proxmox cluster.
The idea was to create an ssdpool to move the virtual machines off the compute nodes. Each of those nodes has 4 SSDs, of which 3 would be added to the pool gradually to grow it. That is why there is currently only 1 node with SSDs and 1 with HDDs; the intention is to add more disks by reusing the existing nodes in the next cluster.

here is the crushmap:

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class ssd
device 5 osd.5 class ssd
device 6 osd.6 class ssd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 zone
type 10 region
type 11 root

# buckets
host cls1-ca {
	id -3		# do not change unnecessarily
	id -4 class hdd		# do not change unnecessarily
	id -10 class ssd		# do not change unnecessarily
	# weight 50.934
	alg straw2
	hash 0	# rjenkins1
	item osd.0 weight 12.733
	item osd.1 weight 12.733
	item osd.2 weight 12.733
	item osd.3 weight 12.733
}
host clc5-ca {
	id -7		# do not change unnecessarily
	id -8 class hdd		# do not change unnecessarily
	id -11 class ssd		# do not change unnecessarily
	# weight 2.620
	alg straw2
	hash 0	# rjenkins1
	item osd.5 weight 0.873
	item osd.6 weight 0.873
	item osd.4 weight 0.873
}
root default {
	id -1		# do not change unnecessarily
	id -2 class hdd		# do not change unnecessarily
	id -12 class ssd		# do not change unnecessarily
	# weight 53.553
	alg straw2
	hash 0	# rjenkins1
	item cls1-ca weight 50.934
	item clc5-ca weight 2.620
}

# rules
rule replicated_rule {
	id 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}
rule ssdpool {
	id 1
	type replicated
	min_size 1
	max_size 10
	step take default class ssd
	step chooseleaf firstn 0 type host
	step emit
}
rule hddpool {
	id 2
	type replicated
	min_size 1
	max_size 10
	step take default class hdd
	step chooseleaf firstn 0 type host
	step emit
}

# end crush map
 
Can I delete all pools and OSDs to fix it?
Yes, you can, but with your placement rules (depending on whether they are assigned to your storage pools) I think you will just run into the exact same problems again. To get a basic Ceph cluster running you should have at least 3 nodes, all of which have both an hdd-class and an ssd-class OSD.
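If you do decide to start over, a rough outline of the teardown with the PVE/Ceph tooling (pool names and OSD IDs below are only examples, and this destroys all data on them):

pveceph pool destroy hddpool          # repeat for each pool you created
ceph osd out 0                        # mark the OSD out
systemctl stop ceph-osd@0.service     # stop its daemon on the host it runs on
pveceph osd destroy 0 --cleanup       # remove the OSD and wipe the disk; repeat per OSD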

I assume you are using the hddpool and ssdpool rules for your storage pools. If that is the case, Ceph will run into problems, since there is only one host with hdd-class and one host with ssd-class devices available. In its default configuration Ceph wants to keep three copies of every object (size=3) on different hosts and needs at least two copies (min_size=2) before a placement group becomes active. That is not possible with your setup: with only one host per device class, Ceph cannot distribute the replicas across multiple hdd/ssd hosts, which is why the PGs stay undersized+peered and therefore inactive.
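You can verify this directly on the CLI: the pool listing shows the size/min_size and the crush rule of every pool, and the shadow tree shows how many hosts actually carry each device class (both are standard Ceph commands):

ceph osd pool ls detail               # check size, min_size and crush_rule per pool
ceph osd crush tree --show-shadow     # per-class view; it should show only one host each under default~hdd and default~ssd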

It is advisable to distribute the SSDs and HDDs evenly across the hosts, so that every host has at least one ssd-class and one hdd-class OSD available. Again, you should have at least 3 nodes in your Ceph cluster; with fewer than that it does not really make sense to run Ceph, even for testing purposes.
 
At the moment there is only one node, but as I said, the intention is to add 6 more nodes with 3 SSDs each; in the end there will be 7 nodes contributing 3 disks each (21 SSDs in total). Additionally, one more node with HDDs is being integrated to complement the current one. The question is: if I keep the current configuration, could I integrate those nodes later? And if so, would there be any problems adding them later?
 
Adding those additional nodes with SSDs/HDDs should fix the problems you are seeing, as far as I can tell. If it doesn't, you can still tear the Ceph cluster down and recreate it. And if you would rather start clean right away, you can also remove the current Ceph setup now and rebuild it once the new nodes are in place.
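Once the new nodes have joined the PVE cluster and have Ceph installed, creating their OSDs is one command per disk. A hedged sketch (device paths are just examples; the device class is normally detected automatically):

pveceph osd create /dev/sdb                             # hdd/ssd class is auto-detected
pveceph osd create /dev/sdc --crush-device-class ssd    # or set the class explicitly if needed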
 
