Hi all,
I'm new to Ceph and ran into an interesting warning message that I don't think I can interpret correctly. I would appreciate any suggestion or comment on where to start unrolling the case, and/or on what I might be missing.
First of all, the configuration is structured as follows:
3 nodes, each one identical:
32GB RAM
i7-8700 CPU
1GbE for public network
1GbE for cluster communication
dual-port 10GbE bonding
2x120GB SSD in RAID1 -> Proxmox VE boot
1x120GB NVMe -> cache
2x6TB HDD -> ceph osd
2x4TB HDD -> ceph osd
1x 10GbE MikroTik switch -> for Ceph storage
1x 1GbE Ethernet switch -> for cluster and public network communication
The warning messages are the following:
Code:
103740/793608 objects misplaced (13.072%)
Degraded data redundancy: 28528/793614 objects degraded (3.595%), 55 pgs degraded, 201 pgs undersized
pg 1.45 is stuck undersized for 89295.255902, current state active+undersized+remapped, last acting [8,7,6,5]
pg 1.46 is stuck undersized for 89343.453644, current state active+undersized+remapped, last acting [7,5,0,10,11]
pg 1.47 is stuck undersized for 89344.213165, current state active+undersized+degraded, last acting [0,2,1]
pg 1.48 is stuck undersized for 89284.596868, current state active+undersized+remapped, last acting [4,9,5,8,1]
pg 1.49 is stuck undersized for 89343.148491, current state active+undersized+remapped, last acting [0,10,5,8,1]
pg 1.4a is stuck undersized for 89295.261484, current state active+undersized+remapped, last acting [6,10,11,5]
pg 1.4d is stuck undersized for 89348.559683, current state active+undersized+remapped, last acting [12,5,7,8]
pg 1.4e is stuck undersized for 89284.596518, current state active+undersized+remapped, last acting [4,5,9,8,10]
pg 1.4f is stuck undersized for 89295.240541, current state active+undersized+remapped, last acting [8,6,10,4,0]
pg 1.52 is stuck undersized for 89344.550538, current state active+undersized+degraded, last acting [5,0,10]
pg 1.53 is stuck undersized for 89284.573238, current state active+undersized+remapped, last acting [8,4,9,5,0]
pg 1.54 is stuck undersized for 89295.253566, current state active+undersized+remapped, last acting [5,6,10,2,7]
pg 1.55 is stuck undersized for 89284.574872, current state active+undersized+remapped, last acting [8,1,9,5]
pg 1.56 is stuck undersized for 89348.905412, current state active+undersized+remapped, last acting [4,5,12,10]
pg 1.57 is stuck undersized for 89343.150310, current state active+undersized+remapped, last acting [0,1,8,10]
pg 1.58 is stuck undersized for 89295.254172, current state active+undersized+remapped, last acting [5,6,10,11]
pg 1.59 is stuck undersized for 89348.906011, current state active+undersized+remapped, last acting [7,11,12,8]
pg 1.5a is stuck undersized for 89343.494564, current state active+undersized+remapped, last acting [1,0,8,11]
pg 1.5b is stuck undersized for 89295.254653, current state active+undersized+remapped, last acting [5,6,7,8,0]
pg 1.5c is stuck undersized for 89284.577999, current state active+undersized+remapped, last acting [9,8,7,10]
pg 1.5d is stuck undersized for 89348.902290, current state active+undersized+remapped, last acting [8,12,4,7]
pg 1.5e is stuck undersized for 89284.581713, current state active+undersized+remapped, last acting [10,11,9,7,6]
pg 1.5f is stuck undersized for 89295.255562, current state active+undersized+remapped, last acting [8,6,7,0]
pg 1.c8 is stuck undersized for 89295.254029, current state active+undersized+remapped, last acting [5,4,6,1,8]
pg 1.ca is stuck undersized for 89348.906285, current state active+undersized+remapped, last acting [7,2,12,5,10]
pg 1.cb is stuck undersized for 89343.148769, current state active+undersized+remapped, last acting [0,10,8,7,11]
pg 1.cd is stuck undersized for 89348.560341, current state active+undersized+remapped, last acting [12,10,11,5,7]
pg 1.ce is stuck undersized for 89295.255590, current state active+undersized+remapped, last acting [1,8,6,4,5]
pg 1.d2 is stuck undersized for 89295.261832, current state active+undersized+remapped, last acting [6,5,1,10,0]
pg 1.d3 is stuck undersized for 89343.492630, current state active+undersized+remapped, last acting [8,10,0,4,5]
pg 1.d4 is stuck undersized for 89295.255590, current state active+undersized+remapped, last acting [2,1,6,5,7]
pg 1.d5 is stuck undersized for 89284.593449, current state active+undersized+remapped, last acting [1,11,9,5,8]
pg 1.d6 is stuck undersized for 89348.902759, current state active+undersized+remapped, last acting [8,12,10,4,5]
pg 1.d7 is stuck undersized for 89343.453711, current state active+undersized+remapped, last acting [7,5,0,1,8]
pg 1.d8 is stuck undersized for 89343.493305, current state active+undersized+remapped, last acting [1,11,0,5,10]
pg 1.d9 is stuck undersized for 89348.904691, current state active+undersized+remapped, last acting [7,5,12,8]
pg 1.db is stuck undersized for 89343.149529, current state active+undersized+remapped, last acting [0,5,7,8]
pg 1.dc is stuck undersized for 89295.263541, current state active+undersized+remapped, last acting [6,4,5,7,8]
pg 1.e3 is stuck undersized for 89295.250517, current state active+undersized+remapped, last acting [7,6,8,1,5]
pg 1.e5 is stuck undersized for 89295.254107, current state active+undersized+remapped, last acting [8,1,6,7,10]
pg 1.e7 is stuck undersized for 89348.559345, current state active+undersized+remapped, last acting [12,7,5,1]
pg 1.eb is stuck undersized for 89348.884054, current state active+undersized+remapped, last acting [11,12,4,2,8]
pg 1.ec is stuck undersized for 89348.560150, current state active+undersized+remapped, last acting [12,10,11,2,4]
pg 1.ed is stuck undersized for 89343.497267, current state active+undersized+remapped, last acting [11,10,0,1,5]
pg 1.ee is stuck undersized for 89295.261962, current state active+undersized+remapped, last acting [6,10,8,1,7]
pg 1.f1 is stuck undersized for 89343.149813, current state active+undersized+remapped, last acting [0,7,5,2,4]
pg 1.f2 is stuck undersized for 89348.904245, current state active+undersized+remapped, last acting [4,12,11,1]
pg 1.f3 is stuck undersized for 89343.493227, current state active+undersized+remapped, last acting [8,0,10,2,5]
pg 1.f5 is stuck undersized for 89343.148341, current state active+undersized+remapped, last acting [0,2,10,1,8]
pg 1.fa is stuck undersized for 89284.573764, current state active+undersized+remapped, last acting [11,9,4,5,7]
pg 1.ff is stuck undersized for 89343.452833, current state active+undersized+remapped, last acting [7,0,8,4,5]
I checked what these states mean:
"active": Ceph will process requests to the placement group.
"undersized": the placement group has fewer copies than the configured pool replication level.
"remapped": the placement group is temporarily mapped to a different set of OSDs from what CRUSH specified.
We had to replace one dual-port 10GbE interface on node2 because overheating caused latency errors (this has been solved), and we also had a bad OSD in node1 that we replaced (also solved). All the OSDs and the communication are working fine now; there is no error in any OSD's log or anywhere else in the cluster, apart from the warning message above.
ceph osd tree
Code:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 54.57889 root default
-3 18.19296 host node1
0 hdd 3.63860 osd.0 up 1.00000 1.00000
6 hdd 5.45789 osd.6 up 1.00000 1.00000
9 hdd 5.45789 osd.9 up 1.00000 1.00000
12 hdd 3.63860 osd.12 up 1.00000 1.00000
-5 18.19296 host node2
1 hdd 3.63860 osd.1 up 1.00000 1.00000
4 hdd 3.63860 osd.4 up 1.00000 1.00000
7 hdd 5.45789 osd.7 up 1.00000 1.00000
10 hdd 5.45789 osd.10 up 1.00000 1.00000
-7 18.19296 host node3
2 hdd 3.63860 osd.2 up 1.00000 1.00000
5 hdd 3.63860 osd.5 up 1.00000 1.00000
8 hdd 5.45789 osd.8 up 1.00000 1.00000
11 hdd 5.45789 osd.11 up 1.00000 1.00000
crush map
Code:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host node1 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
# weight 18.193
alg straw2
hash 0 # rjenkins1
item osd.0 weight 3.639
item osd.6 weight 5.458
item osd.9 weight 5.458
item osd.12 weight 3.639
}
host node2 {
id -5 # do not change unnecessarily
id -6 class hdd # do not change unnecessarily
# weight 18.193
alg straw2
hash 0 # rjenkins1
item osd.1 weight 3.639
item osd.4 weight 3.639
item osd.7 weight 5.458
item osd.10 weight 5.458
}
host node3 {
id -7 # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
# weight 18.193
alg straw2
hash 0 # rjenkins1
item osd.2 weight 3.639
item osd.5 weight 3.639
item osd.8 weight 5.458
item osd.11 weight 5.458
}
root default {
id -1 # do not change unnecessarily
id -2 class hdd # do not change unnecessarily
# weight 54.579
alg straw2
hash 0 # rjenkins1
item node1 weight 18.193
item node2 weight 18.193
item node3 weight 18.193
}
# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
# end crush map
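(The map above was exported and decompiled the usual way; roughly:)
Code:
ceph osd getcrushmap -o crushmap.bin        # export the compiled CRUSH map
crushtool -d crushmap.bin -o crushmap.txt   # decompile it into the text shown above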
I've checked the pool's "size" and "min_size", which are 6/2. As I understand it, this means Ceph wants to replicate each PG to 6 hosts, and if fewer than 2 of those copies are available, the pool becomes write-protected.
Since we only have 3 active nodes, could this be the cause of the warning message, because Ceph cannot distribute the PGs to 6 hosts?
Is it possible to reduce the pool size from 6 to 3 on the fly, without data loss?
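For reference, this is roughly how I checked the current values; <poolname> is just a placeholder for our actual pool name:
Code:
ceph osd pool ls detail                 # shows size, min_size and pg_num for every pool
ceph osd pool get <poolname> size       # currently 6
ceph osd pool get <poolname> min_size   # currently 2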
In this thread:
https://forum.proxmox.com/threads/urgent-proxmox-ceph-support-needed.24302
Q-wulf wrote that:
"You should be able to reduce the number of replicas on a replicated pool as long as you keep your new 'size' >= your 'min_size', else you cause your cluster to not be able to do any I/O on the pool in question."
Does this reduction really work as long as the pool "size" is kept greater than the pool "min_size"?
If yes, how will it affect the cluster performance?
The related documentation here is not so clear to me:
http://docs.ceph.com/docs/master/rados/operations/pools/#set-the-number-of-object-replicas
Set the Number of Object Replicas
To set the number of object replicas on a replicated pool, execute the following:
ceph osd pool set {poolname} size {num-replicas}
Important
The {num-replicas} includes the object itself. If you want the object and two copies of the object for a total of three instances of the object, specify 3.
For example:
ceph osd pool set data size 3
You may execute this command for each pool. Note: An object might accept I/Os in degraded mode with fewer than pool size replicas. To set a minimum number of required replicas for I/O, you should use the min_size setting. For example:
ceph osd pool set data min_size 2
This ensures that no object in the data pool will receive I/O with fewer than min_size replicas.
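If I understand this correctly, on our 3-node cluster the change would be something like the following (<poolname> again stands for our actual pool; this is only my sketch of the plan, so please correct me if it is wrong):
Code:
ceph osd pool set <poolname> size 3       # one replica per host
ceph osd pool set <poolname> min_size 2   # unchanged, shown only for clarity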
Thanks,
Gabor A. Tar