RBD storage 100% full

Hello!

I have successfully set up a PVE cluster with Ceph.

After creating Ceph pools and the related RBD storages, I moved the VM's disk to this newly created RBD storage.

Due to some issues I needed to reboot all cluster nodes, one after the other.
Since then the PVE storage reports that all RBD storages are 100% allocated.
This is impossible considering the available OSDs.

Code:
root@ld3955:~# pvesm status
Name       Type     Status     Total        Used         Available    %
iso        dir      active     20971520     6078812      13942084     28.99%
local      dir      active     20971520     6078812      13942084     28.99%
pve_ct     rbd      active     4765684      4765684      0            100.00%
pve_k8s    rbd      active     34049196     34049196     0            100.00%
pve_vm     rbd      active     4765684      4765684      0            100.00%


Can you please advise how to fix this?

I have already tried to move the VM's disk back to local storage via the WebUI, but this is not working.
How can I perform this task on the CLI?

THX
 
What do 'ceph -s' and 'ceph osd df tree' show?
 
root@ld3955:~# ceph -s
cluster:
id: 6b1b5117-6e08-4843-93d6-2da3cf8a6bae
health: HEALTH_WARN
1 MDSs report slow metadata IOs
34080/25026 objects misplaced (136.178%)
Reduced data availability: 5115 pgs inactive, 19 pgs peering
Degraded data redundancy: 5052 pgs undersized

services:
mon: 3 daemons, quorum ld5505,ld5506,ld5507
mgr: ld5506(active), standbys: ld5507, ld5505
mds: pve_cephfs-1/1/1 up {0=ld3955=up:creating}
osd: 268 osds: 268 up, 268 in; 5696 remapped pgs

data:
pools: 8 pools, 10880 pgs
objects: 8.34k objects, 32.5GiB
usage: 521GiB used, 448TiB / 449TiB avail
pgs: 46.829% pgs unknown
0.184% pgs not active
34080/25026 objects misplaced (136.178%)
5095 unknown
5052 active+undersized+remapped
644 active+clean+remapped
69 active+clean
19 creating+peering
1 creating+activating


root@ld3955:~# ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-50 0 - 0B 0B 0B 0 0 - root hdd
-46 0 - 0B 0B 0B 0 0 - host ld5505-hdd
-47 0 - 0B 0B 0B 0 0 - host ld5506-hdd
-48 0 - 0B 0B 0B 0 0 - host ld5507-hdd
-49 0 - 0B 0B 0B 0 0 - host ld5508-hdd
-32 0 - 0B 0B 0B 0 0 - root nvme
-31 0 - 0B 0B 0B 0 0 - host ld5505-nvme
-37 0 - 0B 0B 0B 0 0 - host ld5506-nvme
-40 0 - 0B 0B 0B 0 0 - host ld5507-nvme
-43 0 - 0B 0B 0B 0 0 - host ld5508-nvme
-17 0 - 0B 0B 0B 0 0 - root hdd_strgbox
-16 0 - 0B 0B 0B 0 0 - host ld5505-hdd_strgbox
-22 0 - 0B 0B 0B 0 0 - host ld5506-hdd_strgbox
-25 0 - 0B 0B 0B 0 0 - host ld5507-hdd_strgbox
-28 0 - 0B 0B 0B 0 0 - host ld5508-hdd_strgbox
-1 446.36633 - 449TiB 521GiB 448TiB 0.11 1.00 - root default
-3 111.59158 - 112TiB 103GiB 112TiB 0.09 0.79 - host ld5505
8 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 0 osd.8
9 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 0 osd.9
10 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 2 osd.10
11 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 1 osd.11
12 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 0 osd.12
13 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 0 osd.13
14 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 0 osd.14
15 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 2 osd.15
[...]
121 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 1 osd.121
122 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 3 osd.122
123 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 2 osd.123
0 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 0 osd.0
1 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 0 osd.1
-7 111.59158 - 112TiB 123GiB 112TiB 0.11 0.94 - host ld5506
25 hdd 1.62723 1.00000 1.64TiB 3.13GiB 1.63TiB 0.19 1.65 3 osd.25
26 hdd 1.62723 1.00000 1.64TiB 2.55GiB 1.63TiB 0.15 1.34 2 osd.26
27 hdd 1.62723 1.00000 1.64TiB 3.46GiB 1.63TiB 0.21 1.82 4 osd.27
28 hdd 1.62723 1.00000 1.64TiB 2.61GiB 1.63TiB 0.16 1.38 2 osd.28
29 hdd 1.62723 1.00000 1.64TiB 2.00GiB 1.64TiB 0.12 1.05 3 osd.29
[...]
168 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 12 osd.168
169 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 5 osd.169
170 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 8 osd.170
171 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 12 osd.171
2 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 68 osd.2
3 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 79 osd.3
-10 111.59158 - 112TiB 129GiB 112TiB 0.11 0.99 - host ld5507
42 hdd 1.62723 1.00000 1.64TiB 2.82GiB 1.63TiB 0.17 1.49 10 osd.42
43 hdd 1.62723 1.00000 1.64TiB 2.58GiB 1.63TiB 0.15 1.36 7 osd.43
44 hdd 1.62723 1.00000 1.64TiB 3.30GiB 1.63TiB 0.20 1.74 19 osd.44
45 hdd 1.62723 1.00000 1.64TiB 3.10GiB 1.63TiB 0.18 1.63 11 osd.45
46 hdd 1.62723 1.00000 1.64TiB 3.71GiB 1.63TiB 0.22 1.95 6 osd.46
47 hdd 1.62723 1.00000 1.64TiB 2.77GiB 1.63TiB 0.17 1.46 9 osd.47
48 hdd 1.62723 1.00000 1.64TiB 3.06GiB 1.63TiB 0.18 1.61 9 osd.48
[...]
216 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 66 osd.216
217 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 74 osd.217
218 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 40 osd.218
219 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 55 osd.219
4 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 147 osd.4
5 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 151 osd.5
-13 111.59158 - 112TiB 166GiB 112TiB 0.14 1.27 - host ld5508
59 hdd 1.62723 1.00000 1.64TiB 5.46GiB 1.63TiB 0.33 2.88 26 osd.59
60 hdd 1.62723 1.00000 1.64TiB 5.85GiB 1.63TiB 0.35 3.08 32 osd.60
61 hdd 1.62723 1.00000 1.64TiB 5.53GiB 1.63TiB 0.33 2.92 22 osd.61
62 hdd 1.62723 1.00000 1.64TiB 4.85GiB 1.63TiB 0.29 2.55 30 osd.62
63 hdd 1.62723 1.00000 1.64TiB 5.75GiB 1.63TiB 0.34 3.03 22 osd.63
64 hdd 1.62723 1.00000 1.64TiB 4.49GiB 1.63TiB 0.27 2.37 23 osd.64
65 hdd 1.62723 1.00000 1.64TiB 5.18GiB 1.63TiB 0.31 2.73 26 osd.65
66 hdd 1.62723 1.00000 1.64TiB 5.02GiB 1.63TiB 0.30 2.65 21 osd.66
[...]
264 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 151 osd.264
265 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 142 osd.265
266 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 165 osd.266
267 hdd 1.62723 1.00000 1.64TiB 1.54GiB 1.64TiB 0.09 0.81 168 osd.267
6 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 253 osd.6
7 nvme 2.91089 1.00000 2.91TiB 1.42GiB 2.91TiB 0.05 0.42 234 osd.7
TOTAL 449TiB 521GiB 448TiB 0.11
MIN/MAX VAR: 0.42/3.15 STDDEV: 0.06
 
usage: 521GiB used, 448TiB / 449TiB avail
The usage of the cluster is only 521 GiB of the 449 TiB.

root@ld3955:~# ceph osd df tree
The output looks weird to me. I suppose these 4x nodes are high-density storage servers, as they seem to have roughly 60 disks each, and you wanted to segment these through an altered crushmap.

pgs: 46.829% pgs unknown
0.184% pgs not active
34080/25026 objects misplaced (136.178%)
5095 unknown
The placement of the PGs seems to have changed after the reboot, hence the full storage in 'pvesm status' (this goes for the available/used space of a pool).
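
The pool usage that 'pvesm status' reports is taken from Ceph itself, so it can be cross-checked directly on the Ceph side; the commands below are plain Ceph CLI calls, nothing PVE-specific:
Code:
# per-pool USED and MAX AVAIL as Ceph calculates them
ceph df detail

# list PGs that are stuck inactive
ceph pg dump_stuck inactive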

Check your crushmap: could it be that the OSDs were put back into the default root and not into the ones you set? Or is this a configuration leftover/preparation?

You can separate the pools by device classes. If you need to distinguish the HDDs further, then you could try to set a new device class for each OSD.
https://pve.proxmox.com/pve-docs/chapter-pveceph.html#pve_ceph_device_classes
https://ceph.com/community/new-luminous-crush-device-classes/

EDIT: you can see them with 'ceph osd crush tree --show-shadow'
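
If you go down the device-class route, the usual sequence looks roughly like the following; the class name strgbox, the OSD id and the pool name pve_vm are only examples taken from this thread, adjust them to your setup:
Code:
# drop the auto-detected class and assign a custom one
ceph osd crush rm-device-class osd.8
ceph osd crush set-device-class strgbox osd.8

# create a replicated rule that only selects OSDs of that class
ceph osd crush rule create-replicated strgbox_rule default host strgbox

# let the pool place its data according to the new rule
ceph osd pool set pve_vm crush_rule strgbox_rule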
 
Actually I defined device classes.

The output looks strange to me:
root@ld3955:~# ceph osd crush tree --show-shadow
ID CLASS WEIGHT TYPE NAME
-52 nvme 0 root hdd~nvme
-60 nvme 0 host ld5505-hdd~nvme
-58 nvme 0 host ld5506-hdd~nvme
-56 nvme 0 host ld5507-hdd~nvme
-54 nvme 0 host ld5508-hdd~nvme
-51 hdd 0 root hdd~hdd
-59 hdd 0 host ld5505-hdd~hdd
-57 hdd 0 host ld5506-hdd~hdd
-55 hdd 0 host ld5507-hdd~hdd
-53 hdd 0 host ld5508-hdd~hdd
-50 0 root hdd
-46 0 host ld5505-hdd
-47 0 host ld5506-hdd
-48 0 host ld5507-hdd
-49 0 host ld5508-hdd
-36 nvme 0 root nvme~nvme
-35 nvme 0 host ld5505-nvme~nvme
-39 nvme 0 host ld5506-nvme~nvme
-42 nvme 0 host ld5507-nvme~nvme
-45 nvme 0 host ld5508-nvme~nvme
-34 hdd 0 root nvme~hdd
-33 hdd 0 host ld5505-nvme~hdd
-38 hdd 0 host ld5506-nvme~hdd
-41 hdd 0 host ld5507-nvme~hdd
-44 hdd 0 host ld5508-nvme~hdd
-32 0 root nvme
-31 0 host ld5505-nvme
-37 0 host ld5506-nvme
-40 0 host ld5507-nvme
-43 0 host ld5508-nvme
-21 nvme 0 root hdd_strgbox~nvme
-20 nvme 0 host ld5505-hdd_strgbox~nvme
-24 nvme 0 host ld5506-hdd_strgbox~nvme
-27 nvme 0 host ld5507-hdd_strgbox~nvme
-30 nvme 0 host ld5508-hdd_strgbox~nvme
-19 hdd 0 root hdd_strgbox~hdd
-18 hdd 0 host ld5505-hdd_strgbox~hdd
-23 hdd 0 host ld5506-hdd_strgbox~hdd
-26 hdd 0 host ld5507-hdd_strgbox~hdd
-29 hdd 0 host ld5508-hdd_strgbox~hdd
-17 0 root hdd_strgbox
-16 0 host ld5505-hdd_strgbox
-22 0 host ld5506-hdd_strgbox
-25 0 host ld5507-hdd_strgbox
-28 0 host ld5508-hdd_strgbox

-6 nvme 23.28711 root default~nvme
-5 nvme 5.82178 host ld5505~nvme
0 nvme 2.91089 osd.0
1 nvme 2.91089 osd.1
-9 nvme 5.82178 host ld5506~nvme
2 nvme 2.91089 osd.2
3 nvme 2.91089 osd.3
-12 nvme 5.82178 host ld5507~nvme
4 nvme 2.91089 osd.4
5 nvme 2.91089 osd.5
-15 nvme 5.82178 host ld5508~nvme
6 nvme 2.91089 osd.6
7 nvme 2.91089 osd.7
-2 hdd 423.07922 root default~hdd


I'll share my crushmap with you here.
 

Attachments

  • crush_map.txt
Update:
This crush map does not reflect the device classes.
Therefore it must be customized.

I did this already before, therefore my question is:
How can the crush map get "reset" after a cluster node reboot? Why does this happen?
 
This crush map does not reflect the device classes.
Therefore it must be customized.
You don't have to use the predefined ssd/hdd/nvme classes, other names are allowed too, e.g. strgbox. It can be specified on OSD creation.
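
For new OSDs the class can also be set at creation time; with ceph-volume that would look something like this (the device path and the class name are placeholders, and it assumes a ceph-volume version that supports the flag):
Code:
# create a BlueStore OSD and tag it with a custom device class right away
ceph-volume lvm create --data /dev/sdX --crush-device-class strgbox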

How can the crush map get "reset" after a cluster node reboot? Why does this happen?
On startup the location of the OSD gets checked, and with a non-default location the OSD moves itself back to the default root in the crushmap. I suppose that when using device classes this will not trigger, as the OSDs stay in the same place. You can disable this behaviour with:
Code:
osd crush update on start = false
You can find more information in the link.
http://docs.ceph.com/docs/luminous/rados/operations/crush-map/#crush-location
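
To make the setting permanent it goes into the ceph.conf used by the OSD nodes, for example in the [osd] section; on PVE, /etc/ceph/ceph.conf is a symlink to /etc/pve/ceph.conf:
Code:
[osd]
     osd crush update on start = false

Whether a running OSD has picked it up can be checked via its admin socket, e.g. with 'ceph daemon osd.0 config get osd_crush_update_on_start'.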
 
I have modified the crush map and the Ceph cluster runs stable again.
Please check the attached document for this crush map; if you don't mind, please comment on it in case there's an error.

Now there's only one issue left, but this is related to "unknown PGs" and I will open another thread for it.
 

Attachments

  • crush_map_new.txt
The crushmap looks cleaner; besides that, I did not see anything odd jump out.
 
