Hi Everyone,
I have been running a 3-node Proxmox + Ceph cluster in my home lab for 2 years now, serving as RBD storage for virtual machines.
When I installed it, I did some testing to ensure that if one node failed, the remaining 2 nodes would keep the system up while the 3rd node was being replaced.
Recently I had to reboot a node on that cluster and realized that the redundancy was gone.
Each of the 3 nodes has 4x 4TB OSDs, which makes 16TB per node, or 48TB in total.
As mentioned, I use Proxmox, so I used its interface to set up the OSDs and pools.
I have 2 pools: one for my virtual machines, one for CephFS.
Each pool has size/min_size set to 3/2, 256 PGs, and the autoscaler on.
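For reference, this is roughly how I would double-check those pool settings from the shell (just a sketch; the pool names are the ones from my outputs below):

Code:
# confirm replication and PG settings of the two pools
ceph osd pool get ceph-vm size
ceph osd pool get ceph-vm min_size
ceph osd pool get ceph-iso_data size
ceph osd pool get ceph-iso_data pg_num
# show what the autoscaler is doing
ceph osd pool autoscale-status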
And now here's what I don't understand: for whatever reason, it seems as if my cluster is over-provisioned.
As the command outputs below show, ceph-iso_data consumes 19TB according to ceph df; however, the mounted ceph-iso filesystem is only 9.2TB in size.
The same goes for my ceph-vm storage, which Ceph believes is 8.3TB but in reality is only 6.3TB (according to the Proxmox GUI).
The problem now is obvious: out of my 48TB of raw capacity I should not be using more than 16TB, otherwise I can't afford to lose a node.
Now Ceph tells me that in total I am using 27TB, but judging by the mounted volumes/storage I am not using more than 16TB.
So, where did the 11TB (27 - 16) go?
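To spell out the arithmetic I am doing (assuming that with size=3 only a third of the raw capacity is usable):

Code:
# usable capacity I expect with 3 replicas across 3 nodes
48 TB raw / 3 replicas                                   = 16 TB usable
# data I can actually see on the client side
9.2 TB (ceph-iso mount) + 6.3 TB (ceph-vm, Proxmox GUI)  = ~16 TB
# versus what ceph df reports as used
27 TB used - 16 TB visible                               = 11 TB I cannot account for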
What am I not understanding?
Thank you for any hints on this.
regards,
Felix
Code:
ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL    USED  RAW USED  %RAW USED
hdd    44 TiB  17 TiB  27 TiB    27 TiB      61.70
TOTAL  44 TiB  17 TiB  27 TiB    27 TiB      61.70

--- POOLS ---
POOL                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics   1    1      0 B        0      0 B      0    3.0 TiB
ceph-vm                 2  256  2.7 TiB  804.41k  8.3 TiB  47.76    3.0 TiB
ceph-iso_data           3  256  6.1 TiB    3.11M   19 TiB  67.23    3.0 TiB
ceph-iso_metadata       4   32  3.1 GiB  132.51k  9.3 GiB   0.10    3.0 TiB
rados df
POOL_NAME                 USED  OBJECTS  CLONES   COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED       RD_OPS      RD       WR_OPS       WR  USED COMPR  UNDER COMPR
ceph-iso_data           19 TiB  3105013       0  9315039                   0        0         0        75202  97 GiB        28776  9.2 MiB         0 B          0 B
ceph-iso_metadata      9.3 GiB   132515       0   397545                   0        0         0  15856613330  13 TiB  28336539064   93 TiB         0 B          0 B
ceph-vm                8.3 TiB   804409       0  2413227                   0        0         0     94160784  40 TiB     62581002  4.4 TiB         0 B          0 B
device_health_metrics      0 B        0       0        0                   0        0         0            0     0 B            0      0 B         0 B          0 B

total_objects   4041937
total_used      27 TiB
total_avail     17 TiB
total_space     44 TiB
df -h
Size  Used  Avail  Use%  Mounted on
9,2T  6,2T   3,1T   67%  /mnt/pve/ceph-iso