Hi,
Our cluster consists of 8 servers, each with six 450GB disks, giving an aggregate raw storage capacity of about 21TB.
The overall design idea is to use erasure coding with the jerasure plugin and the liber8tion technique, i.e. going with k=6 and m=2.
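As a quick back-of-the-envelope sketch (plain shell arithmetic, just my own rough numbers), this is roughly the capacity I expect from that layout:
Code:
# raw capacity: 8 servers x 6 disks x 450 GB each
echo $(( 8 * 6 * 450 ))          # 21600 GB, i.e. about 21 TB raw
# with k=6 and m=2, only k/(k+m) = 6/8 of the raw space carries data
echo $(( 8 * 6 * 450 * 6 / 8 ))  # 16200 GB, i.e. roughly 16 TB usable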
When creating the pool with plain ceph tools, things look as follows.
Code:
root@hugin-1:~# ceph osd erasure-code-profile set cephtest k=6 m=2 plugin=jerasure technique=liber8tion
root@hugin-1:~# ceph osd erasure-code-profile get cephtest
crush-device-class=
crush-failure-domain=host
crush-root=default
k=6
m=2
packetsize=2048
plugin=jerasure
technique=liber8tion
w=8
root@hugin-1:~# ceph osd pool create cephtest erasure cephtest
pool 'cephtest' created
root@hugin-1:~# ceph osd pool ls detail
pool 1 'device_health_metrics' replicated size 2 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 13 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 13 'cephtest' erasure profile cephtest size 8 min_size 7 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 423 flags hashpspool stripe_width 393216
root@hugin-1:~# ceph df detail
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 21 TiB 21 TiB 2.3 GiB 2.3 GiB 0.01
TOTAL 21 TiB 21 TiB 2.3 GiB 2.3 GiB 0.01
--- POOLS ---
POOL ID PGS STORED (DATA) (OMAP) OBJECTS USED (DATA) (OMAP) %USED MAX AVAIL QUOTA OBJECTS QUOTA BYTES DIRTY USED COMPR UNDER COMPR
device_health_metrics 1 1 0 B 0 B 0 B 0 0 B 0 B 0 B 0 10 TiB N/A N/A N/A 0 B 0 B
cephtest 13 32 0 B 0 B 0 B 0 0 B 0 B 0 B 0 15 TiB N/A N/A N/A 0 B 0 B
The min_size of 7 in the "ceph osd pool ls detail" output puzzles me a bit, but the MAX AVAIL limit of 15 TiB in the "ceph df detail" output seems as expected with k=6 and m=2.
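For completeness, the commands I would use to inspect or, if that is even advisable, lower min_size on the EC pool (my assumption being that Ceph defaults to min_size = k+1 for erasure coded pools) would be something like:
Code:
# show the current min_size of the EC pool
ceph osd pool get cephtest min_size
# lowering it to k would keep I/O going with two hosts down, but with no
# remaining redundancy at that point
ceph osd pool set cephtest min_size 6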
Next I try to create the same pool setup with the pveceph tool, which also nicely integrates the pool into the PVE cluster so that it can be used for VMs and CTs. This gives the following:
Code:
root@hugin-1:~# ceph osd erasure-code-profile set cephtest k=6 m=2 plugin=jerasure technique=liber8tion
root@hugin-1:~# ceph osd erasure-code-profile get cephtest
crush-device-class=
crush-failure-domain=host
crush-root=default
k=6
m=2
packetsize=2048
plugin=jerasure
technique=liber8tion
w=8
root@hugin-1:~# pveceph pool create cephtest --erasure-coding profile=cephtest
400 Parameter verification failed.
erasure-coding: invalid format - format error
erasure-coding.k: property is missing and it is not optional
erasure-coding.m: property is missing and it is not optional
pveceph pool create <name> [OPTIONS]
root@hugin-1:~# pveceph pool create cephtest --erasure-coding k=6,m=2,profile=cephtest
pool cephtest-data: applying allow_ec_overwrites = true
pool cephtest-data: applying application = rbd
pool cephtest-data: applying pg_autoscale_mode = warn
pool cephtest-data: applying pg_num = 128
pool cephtest-metadata: applying size = 3
pool cephtest-metadata: applying application = rbd
pool cephtest-metadata: applying min_size = 2
pool cephtest-metadata: applying pg_autoscale_mode = warn
pool cephtest-metadata: applying pg_num = 32
root@hugin-1:~# ceph df detail
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 21 TiB 21 TiB 2.4 GiB 2.4 GiB 0.01
TOTAL 21 TiB 21 TiB 2.4 GiB 2.4 GiB 0.01
--- POOLS ---
POOL ID PGS STORED (DATA) (OMAP) OBJECTS USED (DATA) (OMAP) %USED MAX AVAIL QUOTA OBJECTS QUOTA BYTES DIRTY USED COMPR UNDER COMPR
device_health_metrics 1 1 0 B 0 B 0 B 0 0 B 0 B 0 B 0 10 TiB N/A N/A N/A 0 B 0 B
cephtest-data 16 128 0 B 0 B 0 B 0 0 B 0 B 0 B 0 15 TiB N/A N/A N/A 0 B 0 B
cephtest-metadata 17 32 0 B 0 B 0 B 0 0 B 0 B 0 B 0 6.6 TiB N/A N/A N/A 0 B 0 B
root@hugin-1:~# ceph osd pool ls detail
pool 1 'device_health_metrics' replicated size 2 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 13 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 16 'cephtest-data' erasure profile cephtest size 8 min_size 7 crush_rule 2 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn last_change 446 flags hashpspool,ec_overwrites stripe_width 393216 application rbd
pool 17 'cephtest-metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 452 flags hashpspool stripe_width 0 application rbd
root@hugin-1:~# pvesm status
Name Type Status Total Used Available %
cephtest rbd active 7123198464 0 7123198464 0.00%
local dir active 57225328 3471220 50814820 6.07%
local-lvm lvmthin active 147238912 0 147238912 0.00%
This raises a number of observations and questions concerning pveceph.
- Although my erasure coding profile already contains k=6 and m=2, pveceph does not seem to pick these values up from the profile, so I have to specify them explicitly on the command line.
- Compared with the result of the plain ceph commands, pg_num is set to 128, which "ceph health detail" warns should be 32 instead, just as it is when the pool is created with the plain ceph commands (a possible after-the-fact adjustment is sketched after this list).
- I am very unsure how to interpret the MAX AVAIL limits of 15 TiB and 6.6 TiB for the pools cephtest-data and cephtest-metadata, respectively (my rough attempt at the arithmetic is also sketched below).
- The pvesm command reports rather unexpected availability numbers, pretty much one third of the total aggregate storage, which could suggest that pvesm does not see the pool as an erasure coded pool but merely as a replicated one.
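Regarding the pg_num warning, the adjustment I would try after the fact (assuming pveceph does not mind the pool being changed underneath it) would be along these lines:
Code:
# either hand pg_num back to the autoscaler, as with the plain-ceph pool
ceph osd pool set cephtest-data pg_autoscale_mode on
# or explicitly set the value that "ceph health detail" suggests
ceph osd pool set cephtest-data pg_num 32
And my rough attempt at the MAX AVAIL arithmetic, purely assuming that MAX AVAIL is more or less the raw free space times each pool's data fraction (ignoring full ratios and data imbalance):
Code:
# erasure coded cephtest-data, k=6 m=2: 21 TiB * 6/8
echo $(( 21 * 6 / 8 ))   # 15 (integer arithmetic), close to the 15 TiB shown
# replicated cephtest-metadata, size 3: 21 TiB / 3
echo $(( 21 / 3 ))       # 7, in the same ballpark as the 6.6 TiB shown
Is that the right way to read those numbers?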
Thanks.
Best regards.
Thomas.