[SOLVED] Ceph & cephfs_data Pools

linuxmanru

Member
Jun 29, 2021
Hi everyone,
In my case I have 7 PVE nodes, with these pool defaults:
osd_pool_default_min_size = 2
osd_pool_default_size = 3
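(For reference: these two defaults normally live in the [global] section of /etc/pve/ceph.conf and are only applied when a new pool is created; pools that already exist keep their own size/min_size. A minimal sketch of that section, assuming the standard PVE layout:)

Code:
[global]
    # only used as defaults for NEWLY created pools;
    # existing pools keep the size/min_size they were created with
    osd_pool_default_size     = 3
    osd_pool_default_min_size = 2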
ceph osd pool autoscale-status
POOL                    SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
device_health_metrics   15426k               2.0   106.4T        0.0000                                 1.0        1              on
vm.pool                  2731G               3.0   106.4T        0.0752  1.0000        1.0000           1.0      512              on
cephfs_data               1978               2.0   106.4T        0.0000                                 1.0       32              on
cephfs_metadata         23799k               2.0   106.4T        0.0000                                 4.0        2              on

ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL   USED     RAW USED  %RAW USED
hdd    106 TiB  98 TiB  8.1 TiB  8.1 TiB   7.60
TOTAL  106 TiB  98 TiB  8.1 TiB  8.1 TiB   7.60

--- POOLS ---
POOL                   ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1    1  15 MiB        22  30 MiB       0     46 TiB
vm.pool                 2  512  2.7 TiB  741.87k  8.0 TiB   8.01     31 TiB
cephfs_data             3   32  1.9 KiB        0  3.9 KiB      0     46 TiB
cephfs_metadata         4    2  23 MiB        28  48 MiB       0     46 TiB
rados df
POOL_NAME              USED     OBJECTS  CLONES   COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED     RD_OPS      RD     WR_OPS       WR  USED COMPR  UNDER COMPR
cephfs_data            3.9 KiB        0       0        0                   0        0         0          0     0 B          0      0 B         0 B          0 B
cephfs_metadata        48 MiB        28       0       56                   0        0         0          0     0 B         14   13 KiB         0 B          0 B
device_health_metrics  30 MiB        22       0       44                   0        0         0         22  44 KiB         22  231 KiB         0 B          0 B
vm.pool                8.0 TiB   741870       0  2225610                   0        0         0  113390981  74 TiB  707422564   12 TiB         0 B          0 B

total_objects    741920
total_used       8.1 TiB
total_avail      98 TiB
total_space      106 TiB

The pools cephfs_data & cephfs_metadata were created when I used Create CephFS.
The 7th node was added last.
Right now, checking with the CLI (ceph df): MAX AVAIL for cephfs_data and cephfs_metadata is 46 TiB, but for vm.pool it is 31 TiB.
I don't understand why the pool I created last has less MAX AVAIL space than the pools created with Create CephFS.

Since I don't use the cephfs_data pool yet, I am thinking of destroying it as described in: Destroy CephFS
I need to ask: if I destroy the CephFS pools, will that affect my other pool (vm.pool) or the Ceph storage?
And most importantly, if I destroy them, will the MAX AVAIL space increase for vm.pool?
If I need the cephfs_data pool again, I can recreate it after destroying it.
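(For context, the Destroy CephFS procedure referenced above boils down to something like the following sketch; the FS name "cephfs" is an assumption here, it is the PVE default:)

Code:
# first stop/disable all MDS daemons and remove the CephFS storage entry in PVE, then:
ceph fs rm cephfs --yes-i-really-mean-it
pveceph pool destroy cephfs_data
pveceph pool destroy cephfs_metadata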

Proxmox VE version is:
proxmox-ve: 6.4-1 (running kernel: 5.4.162-1-pve)
pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
pve-kernel-5.4: 6.4-12
pve-kernel-helper: 6.4-12
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-5.4.157-1-pve: 5.4.157-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph: 15.2.15-pve1~bpo10
ceph-fuse: 15.2.15-pve1~bpo10
corosync: 3.1.5-pve2~bpo10+1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve4~bpo10
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve2~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.3-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.7-pve1
 
I need to ask: if I destroy the CephFS pools, will that affect my other pool (vm.pool) or the Ceph storage?
And most importantly, if I destroy them, will the MAX AVAIL space increase for vm.pool?
Looking at the output of ceph df, I don't think you will gain any noteworthy space if you destroy the Ceph FS.

If you run pveceph pool ls you should see which size/min_size are set.
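(As a sketch, the same values can also be read directly from Ceph, e.g.:)

Code:
# list every pool with its replicated size, min_size, pg_num, crush rule, ...
ceph osd pool ls detail

# or query a single pool, for example the vm.pool from above
ceph osd pool get vm.pool size
ceph osd pool get vm.pool min_size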
 
Looking at the output of ceph df, I don't think you will gain any noteworthy space if you destroy the Ceph FS.

If you run pveceph pool ls you should see which size/min_size are set.
If I won't gain more space by destroying the CephFS pools, then I think there is no need to destroy them.
The output:
pveceph pool ls
┌───────────────────────┬──────┬──────────┬────────┬─────────────┬────────────────┬───────────────────┬──────────────────────────┬───────────────────────────┬─────────────────┬──────────────────────┬───────────────┐
│ Name                  │ Size │ Min Size │ PG Num │ min. PG Num │ Optimal PG Num │ PG Autoscale Mode │ PG Autoscale Target Size │ PG Autoscale Target Ratio │ Crush Rule Name │               %-Used │          Used │
╞═══════════════════════╪══════╪══════════╪════════╪═════════════╪════════════════╪═══════════════════╪══════════════════════════╪═══════════════════════════╪═════════════════╪══════════════════════╪═══════════════╡
│ cephfs_data           │    2 │        2 │     32 │             │             32 │ on                │                          │                           │ replicated_rule │                    0 │             0 │
├───────────────────────┼──────┼──────────┼────────┼─────────────┼────────────────┼───────────────────┼──────────────────────────┼───────────────────────────┼─────────────────┼──────────────────────┼───────────────┤
│ cephfs_metadata       │    2 │        2 │      2 │           2 │              2 │ on                │                          │                           │ replicated_rule │ 5.03807939367107e-07 │      49749196 │
├───────────────────────┼──────┼──────────┼────────┼─────────────┼────────────────┼───────────────────┼──────────────────────────┼───────────────────────────┼─────────────────┼──────────────────────┼───────────────┤
│ device_health_metrics │    2 │        2 │      1 │           1 │              1 │ on                │                          │                           │ replicated_rule │ 3.77427340936265e-07 │      37269565 │
├───────────────────────┼──────┼──────────┼────────┼─────────────┼────────────────┼───────────────────┼──────────────────────────┼───────────────────────────┼─────────────────┼──────────────────────┼───────────────┤
│ vm.pool               │    3 │        2 │    512 │             │            512 │ on                │                          │                         1 │ replicated_rule │   0.0823325961828232 │ 8859461644192 │
└───────────────────────┴──────┴──────────┴────────┴─────────────┴────────────────┴───────────────────┴──────────────────────────┴───────────────────────────┴─────────────────┴──────────────────────┴───────────────┘
 
Your other pools are only size 2, which is not recommended at all. This explains the difference in available space.

You will not gain any more space by deleting the other pools. These only hold a couple of megabytes.
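(Rough back-of-the-envelope math, assuming all pools share the same OSDs and Ceph's default full ratio of about 0.95: MAX AVAIL is roughly the free raw capacity, scaled by the full ratio, divided by the pool's replica count.)

Code:
98 TiB avail * 0.95 / 2 replicas ≈ 46 TiB   -> cephfs_data, cephfs_metadata, device_health_metrics
98 TiB avail * 0.95 / 3 replicas ≈ 31 TiB   -> vm.pool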
 
Your other pools are only size 2, which is not recommended at all. This explains the difference in available space.

You will not gain any more space by deleting the other pools. These only hold a couple of megabytes.
If I understand correctly: if I set the other pools to size 3 / min_size 2, then the total available space can change?
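(A minimal sketch of how that could be done per pool from the CLI, using the pool names from the table above:)

Code:
# min_size is already 2 on these pools, so only the size needs raising
ceph osd pool set cephfs_data size 3
ceph osd pool set cephfs_metadata size 3
ceph osd pool set device_health_metrics size 3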
 
Now I have set 3/2 on the other pools and ceph df shows:

ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL   USED     RAW USED  %RAW USED
hdd    106 TiB  98 TiB  8.1 TiB  8.1 TiB   7.62
TOTAL  106 TiB  98 TiB  8.1 TiB  8.1 TiB   7.62

--- POOLS ---
POOL                   ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1    1  17 MiB        22  51 MiB       0     30 TiB
vm.pool                 2  512  2.7 TiB  743.78k  8.1 TiB   8.23     30 TiB
cephfs_data             3   32  0 B            0  0 B          0     30 TiB
cephfs_metadata         4    2  22 MiB        28  69 MiB       0     30 TiB

Is this normal when set to 3/2?
 
Yes, you have 98 TiB raw available; divided by 3, that is roughly 30 TiB. As all pools share the same OSDs, each pool reports the same max avail space. If one pool starts to fill up, the other pools will show less max avail space.
Thanks. Now I understand.
Please tell me, what would you expect if I set the vm.pool replication size to 4?
 
Yes, you have 98 TiB raw available; divided by 3, that is roughly 30 TiB. As all pools share the same OSDs, each pool reports the same max avail space. If one pool starts to fill up, the other pools will show less max avail space.
One question about available space.
If I try these commands:
set a fixed expected pool size
# ceph osd pool set MY_POOL_NAME target_size_bytes 60T (or 50T)

or a relative pool size (fraction of the full space)

# ceph osd pool set MY_POOL_NAME target_size_ratio .9

Will this change the avail space on vm.pool?
 
any target_size / target_ratio is only there for the autoscaler to have an idea how large the pool will be. It will not have any effect on how much space the pool has / sees.
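(A quick way to convince yourself of that, as a sketch using the vm.pool name from this thread: set a target ratio, then compare the autoscaler view with ceph df.)

Code:
ceph osd pool set vm.pool target_size_ratio 0.9

# TARGET RATIO / PG_NUM may change here ...
ceph osd pool autoscale-status

# ... but MAX AVAIL in here stays the same
ceph df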

As @gurubert already mentioned, take the raw capacity of the cluster and divide it by the size of the pool, and you get roughly how large the pool can get.

BUT: do not fill the pool all the way up. Ceph needs free space in case one of your nodes fails. Also, if the OSDs are used very differently, it can happen that an OSD is getting quite a bit fuller than most others. This will also reduce the available space. If that happens, you can check if the pg_num could be optimized or the balancer activated.
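(A sketch of how the per-OSD fill level and the balancer can be checked on an Octopus cluster like this one:)

Code:
# show how full each OSD is -- watch the spread of the %USE column
ceph osd df tree

# check whether the balancer is active and what it is doing
ceph balancer status

# if needed, enable it (upmap mode needs all clients to be Luminous or newer)
ceph balancer mode upmap
ceph balancer on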

If you require more space, add more / larger OSDs or more nodes to the cluster. Do NOT change the size/min_size value to anything lower than 3/2 as it is a recipe for data loss.

Ceph can deal and recover from a lot of situations quite well. Running out of space is one of the few things you want to avoid at all costs though!
 
any target_size / target_ratio is only there for the autoscaler to have an idea how large the pool will be. It will not have any effect on how much space the pool has / sees.

As @gurubert already mentioned, take the raw capacity of the cluster and divide it by the size of the pool, and you get roughly how large the pool can get.

BUT: do not fill the pool all the way up. Ceph needs free space in case one of your nodes fails. Also, if the OSDs are used very differently, it can happen that an OSD is getting quite a bit fuller than most others. This will also reduce the available space. If that happens, you can check if the pg_num could be optimized or the balancer activated.

If you require more space, add more / larger OSDs or more nodes to the cluster. Do NOT change the size/min_size value to anything lower than 3/2 as it is a recipe for data loss.

Ceph can deal and recover from a lot of situations quite well. Running out of space is one of the few things you want to avoid at all costs though!
Thank you aaron, now I understand everything.
 
