Hello,
I'm testing a 3-node Ceph cluster, and I noticed that when one node is down, the total size of the Ceph storage shown in the GUI increases. Here's the configuration of each node:
pve01
2 x 32GB Boot disks with ZFS mirror
2 x 50GB OSDs
pve02
2 x 32GB Boot disks with ZFS mirror
2 x 50GB OSDs
pve03
2 x 32GB Boot disks with ZFS mirror
2 x 50GB OSDs
I've created a Ceph pool called "ceph" with size/min_size 3/2.
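For reference, this should be roughly the CLI equivalent of how the pool was set up (written from memory, so the exact flags may differ from what I actually ran):
Bash:
# roughly how the pool was created: 3 replicas, 2 required to stay writable
pveceph pool create ceph --size 3 --min_size 2 --add_storages
# size/min_size can also be adjusted on an existing pool:
ceph osd pool set ceph size 3
ceph osd pool set ceph min_size 2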
The storage size shown in the GUI is 96.16 GB when the pool is healthy, but it increases to 144.24 GB when one node is down. MAX AVAIL (from ceph df) also increases.
I don't understand this behaviour. I assume it shows 96.16 GB as the total space because the pool keeps 3 copies, but why does it increase when one node is down?
Usage also increases, even though no new data was written.
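My guess (and it is only a guess) is that the GUI total is roughly MAX AVAIL + STORED of the pool, converted from GiB to GB. A quick sanity check using the values from the ceph df output below, with the rounding in ceph df presumably explaining the small differences:
Bash:
# healthy:    74 GiB MAX AVAIL + 15 GiB STORED = 89 GiB
# node down: 112 GiB MAX AVAIL + 23 GiB STORED = 135 GiB
echo "89  * 1024^3 / 1000^3" | bc -l   # ~95.56 GB  (GUI shows 96.16 GB)
echo "135 * 1024^3 / 1000^3" | bc -l   # ~144.96 GB (GUI shows 144.24 GB)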
Healthy pool
Code:
root@pve01:~# ceph -s
  cluster:
    id:     44c1909c-1de4-4933-a606-ceea387e4a03
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum pve01,pve02,pve03 (age 2d)
    mgr: pve02(active, since 2d), standbys: pve01, pve03
    mds: 1/1 daemons up, 2 standby
    osd: 6 osds: 6 up (since 2d), 6 in (since 2d)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 73 pgs
    objects: 4.46k objects, 17 GiB
    usage:   52 GiB used, 248 GiB / 300 GiB avail
    pgs:     73 active+clean

  io:
    client: 4.7 KiB/s wr, 0 op/s rd, 0 op/s wr
Bash:
root@pve01:~# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 300 GiB 248 GiB 52 GiB 52 GiB 17.44
TOTAL 300 GiB 248 GiB 52 GiB 52 GiB 17.44
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 1 1 577 KiB 2 1.7 MiB 0 74 GiB
ceph 2 32 15 GiB 4.29k 46 GiB 16.96 74 GiB
ISO_Ceph_data 3 32 598 MiB 150 1.8 GiB 0.78 74 GiB
ISO_Ceph_metadata 4 8 120 KiB 22 441 KiB 0 74 GiB
One node down
Bash:
root@pve01:~# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 300 GiB 248 GiB 52 GiB 52 GiB 17.44
TOTAL 300 GiB 248 GiB 52 GiB 52 GiB 17.44
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 1 1 865 KiB 2 1.7 MiB 0 112 GiB
ceph 2 32 23 GiB 4.29k 46 GiB 16.96 112 GiB
ISO_Ceph_data 3 32 898 MiB 150 1.8 GiB 0.78 112 GiB
ISO_Ceph_metadata 4 8 180 KiB 22 441 KiB 0 112 GiB
Bash:
root@pve01:~# ceph -s
  cluster:
    id:     44c1909c-1de4-4933-a606-ceea387e4a03
    health: HEALTH_WARN
            1/3 mons down, quorum pve01,pve02
            2 osds down
            1 host (2 osds) down
            Degraded data redundancy: 4460/13380 objects degraded (33.333%), 71 pgs degraded

  services:
    mon: 3 daemons, quorum pve01,pve02 (age 58s), out of quorum: pve03
    mgr: pve02(active, since 2d), standbys: pve01
    mds: 1/1 daemons up, 2 standby
    osd: 6 osds: 4 up (since 44s), 6 in (since 2d)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 73 pgs
    objects: 4.46k objects, 17 GiB
    usage:   52 GiB used, 248 GiB / 300 GiB avail
    pgs:     4460/13380 objects degraded (33.333%)
             71 active+undersized+degraded
             2 active+undersized

  io:
    client: 11 KiB/s wr, 0 op/s rd, 1 op/s wr
In the Ceph GUI, the raw space gets corrected a few minutes after one node goes down:
[Screenshot: healthy cluster]
[Screenshot: one node down]
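My suspicion is that this correction happens once the down OSDs get marked out after the default timeout; in case it's relevant, that timeout can be checked like this (assuming a Ceph release that has the ceph config subcommand):
Bash:
# default mon_osd_down_out_interval should be 600 s (10 minutes), if I remember correctly
ceph config get mon mon_osd_down_out_interval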
Could anyone shed some light on this?
Thank you!