I would appreciate your advice on the following problem.
I need to implement datacenter-level fault tolerance in a hyperconverged Proxmox VE cluster (pve-manager/8.1.4/ec5affc9e41f1d79, kernel 6.5.13-1-pve) running Ceph Reef 18.2.1.
To test future changes, I created a virtual test bench in VirtualBox that closely mimics my cluster in production.
I reproduced the exact number of servers in the cluster (17) and fully recreated the network configuration and the IP addresses on the network interfaces. I could not match the real disk sizes due to the limitations of my work computer, so the real 4 TB disks are simulated with 4 GB virtual disks.
Following the official documentation on the Ceph project website and various Internet sources, I executed the following commands.
Bash:
# Create two datacenter buckets and place them under the default root
ceph osd crush add-bucket dc1 datacenter
ceph osd crush add-bucket dc2 datacenter
ceph osd crush move dc1 root=default
ceph osd crush move dc2 root=default
# Move the host buckets into their datacenters
ceph osd crush move pn1 datacenter=dc1
...
ceph osd crush move pn2 datacenter=dc2
...
# Create a replicated rule with "datacenter" as the failure domain
# and switch the rbd pool to it
ceph osd crush rule create-replicated StarDCreplicated default datacenter hdd
ceph osd pool set rbd crush_rule StarDCreplicated
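If I read the CRUSH documentation correctly, the create-replicated call above should compile to a rule along these lines (a sketch of the decompiled form; the actual rule id may differ):

```
rule StarDCreplicated {
    id 2
    type replicated
    step take default class hdd
    step chooseleaf firstn 0 type datacenter
    step emit
}
```

As far as I understand, "chooseleaf firstn 0 type datacenter" picks one OSD under each distinct datacenter bucket, so with only dc1 and dc2 it can satisfy at most two of the pool's three replicas.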
As a result, the Ceph cluster reports HEALTH_OK, but the data rebalancing never completes: pgs: 1100/3306 objects misplaced (33.273%)
The "ceph balancer status" command outputs: "optimize_result": "Too many objects (0.332728 > 0.050000) are misplaced; try again later",
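The misplaced fraction looks suspiciously like exactly one copy per object: with ~1.10k objects in a size-3 pool there are 3306 replicas in total, and 1100 of them are misplaced. A quick cross-check of the percentage (plain awk, numbers taken from the status output below):

```shell
# 1100 misplaced replicas out of 3306 total copies (size-3 pool)
awk 'BEGIN { printf "%.3f%%\n", 1100 / 3306 * 100 }'
# prints 33.273%, matching ceph -s
```

One misplaced replica per object would be consistent with the rule only being able to place two of the three replicas across the two datacenters; and at 33% misplaced the balancer refuses to act, since that is far above its 5% threshold.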
Since the testbed was created precisely so I could experiment and learn from my mistakes, I tried to fix the situation by adding more disks. It didn't help.
I tried restarting the servers one by one. That didn't help either.
I also tried a different rule:
Bash:
# Try a "simple" rule with the same datacenter choose type
ceph osd crush rule create-simple StarDCsimple default datacenter
ceph osd pool set rbd crush_rule StarDCsimple
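For comparison, if I understand create-simple correctly, this second rule should decompile to essentially the same placement logic (sketch; the id will differ), which would explain why switching rules changed nothing:

```
rule StarDCsimple {
    id 3
    type replicated
    step take default
    step chooseleaf firstn 0 type datacenter
    step emit
}
```

Both rules choose leaves across datacenter buckets; the only visible difference is the missing hdd device-class filter.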
But the cluster still makes no attempt to finish rebalancing the data.
My virtual testbed looks like this:
Code:
  cluster:
    id:     dfce5bc5-428f-4ede-af8d-2d801e84578e
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum pn1,pn2,pn3 (age 2h)
    mgr: pn2(active, since 2h), standbys: pn1, pn3
    osd: 34 osds: 34 up (since 2h), 34 in (since 2h); 128 remapped pgs

  data:
    pools:   2 pools, 129 pgs
    objects: 1.10k objects, 4.2 GiB
    usage:   14 GiB used, 116 GiB / 131 GiB avail
    pgs:     1100/3306 objects misplaced (33.273%)
             128 active+clean+remapped
             1   active+clean
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 19 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 33.33
pool 2 'rbd' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 2056 lfor 0/866/1096 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 2.13
Code:
ID   CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
 -1         0.12711  root default
-33         0.07912      datacenter dc1
 -3         0.01556          host pn1
  0    hdd  0.00389              osd.0        up   1.00000  1.00000
  1    hdd  0.00389              osd.1        up   1.00000  1.00000
  2    hdd  0.00389              osd.2        up   1.00000  1.00000
  3    hdd  0.00389              osd.3        up   1.00000  1.00000
-23         0.00130          host pn13
 25    hdd  0.00130              osd.25       up   1.00000  1.00000
-25         0.00389          host pn16
 26    hdd  0.00389              osd.26       up   1.00000  1.00000
-27         0.00389          host pn17
 27    hdd  0.00389              osd.27       up   1.00000  1.00000
-29         0.00389          host pn18
 28    hdd  0.00389              osd.28       up   1.00000  1.00000
-31         0.00389          host pn19
 29    hdd  0.00389              osd.29       up   1.00000  1.00000
 -7         0.01556          host pn3
  8    hdd  0.00389              osd.8        up   1.00000  1.00000
  9    hdd  0.00389              osd.9        up   1.00000  1.00000
 10    hdd  0.00389              osd.10       up   1.00000  1.00000
 11    hdd  0.00389              osd.11       up   1.00000  1.00000
 -9         0.01556          host pn4
 12    hdd  0.00389              osd.12       up   1.00000  1.00000
 13    hdd  0.00389              osd.13       up   1.00000  1.00000
 14    hdd  0.00389              osd.14       up   1.00000  1.00000
 15    hdd  0.00389              osd.15       up   1.00000  1.00000
-11         0.01556          host pn5
 16    hdd  0.00389              osd.16       up   1.00000  1.00000
 17    hdd  0.00389              osd.17       up   1.00000  1.00000
 18    hdd  0.00389              osd.18       up   1.00000  1.00000
 19    hdd  0.00389              osd.19       up   1.00000  1.00000
-34         0.04799      datacenter dc2
-17         0.00778          host pn10
 22    hdd  0.00389              osd.22       up   1.00000  1.00000
 32    hdd  0.00389              osd.32       up   1.00000  1.00000
-19         0.00778          host pn11
 23    hdd  0.00389              osd.23       up   1.00000  1.00000
 33    hdd  0.00389              osd.33       up   1.00000  1.00000
-21         0.00389          host pn12
 24    hdd  0.00389              osd.24       up   1.00000  1.00000
 -5         0.01556          host pn2
  4    hdd  0.00389              osd.4        up   1.00000  1.00000
  5    hdd  0.00389              osd.5        up   1.00000  1.00000
  6    hdd  0.00389              osd.6        up   1.00000  1.00000
  7    hdd  0.00389              osd.7        up   1.00000  1.00000
-13         0.00908          host pn6
 20    hdd  0.00130              osd.20       up   1.00000  1.00000
 30    hdd  0.00389              osd.30       up   1.00000  1.00000
 31    hdd  0.00389              osd.31       up   1.00000  1.00000
-15         0.00389          host pn9
 21    hdd  0.00389              osd.21       up   1.00000  1.00000
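One more thing that stands out in the tree: the two datacenter buckets carry quite different CRUSH weights (0.07912 for dc1 vs 0.04799 for dc2). Quantifying the split (plain awk, weights taken from the tree above):

```shell
# Capacity share of each datacenter bucket by CRUSH weight
awk 'BEGIN {
    dc1 = 0.07912; dc2 = 0.04799
    printf "dc1: %.1f%%  dc2: %.1f%%\n", dc1 / (dc1 + dc2) * 100, dc2 / (dc1 + dc2) * 100
}'
# prints dc1: 62.2%  dc2: 37.8%
```

So dc1 holds about 62% of the total weight. I am not sure whether this imbalance is related to the stuck rebalancing, but I assume it would limit usable capacity once replicas are pinned per datacenter.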