I have 4 nodes with identical specs:
HP DL20
Xeon E-2278G
64GB RAM
6x PM883 480GB SSD (Ceph OSDs)
1x Intel DC3700 800GB PCIe NVMe card (Proxmox OS)
1x Mellanox ConnectX-3 dual-port 40G card (FW 2_36_5000, HP OEM, Ethernet mode)
The internal network config is also identical on every node:
iLO : 192.168.88.201~204, internal 1Gbps, shared on eth1
Internet : 192.168.88.211~214, internal 1Gbps, shared on eth1 (Linux bridge)
Ceph public : 10.10.11.11~14, internal 1Gbps, dedicated eth2 (Linux bridge)
Ceph cluster : 10.10.10.11~14, additional dual 40G eth3~eth4, Linux bond in broadcast mode
The 40G AOC cables are connected in a ring, as below:
server1 eth4 -> server2 eth3
server2 eth4 -> server3 eth3
server3 eth4 -> server4 eth3
server4 eth4 -> server1 eth3
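For reference, the Ceph cluster bond on each node looks roughly like this in /etc/network/interfaces (I'm writing this from memory, so interface names and options are a sketch, not a verbatim copy of my config):

```
auto bond0
iface bond0 inet static
    address 10.10.10.11/24      # .11~.14 depending on the node
    bond-slaves eth3 eth4
    bond-mode broadcast
    bond-miimon 100
# Ceph cluster network over the dual 40G ring
```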
I created the Ceph config with all 4 nodes set up as monitors and managers, created all 24 OSDs, and made one pool over the full capacity (10.48TB total) with 128 PGs.
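Roughly the commands I ran (monitor/manager/OSD creation on each node, pool creation once; device names are from memory and the pool name is just an example):

```
pveceph mon create
pveceph mgr create
pveceph osd create /dev/sda      # repeated for each of the 6 SSDs
pveceph pool create vmpool --pg_num 128
```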
The problem is that Ceph health shows only 40 PGs as active+clean; another 55 PGs are creating+peering and 34 PGs are peering. It also shows 3 warning messages:
1. Reduced data availability: 89 pgs inactive, 89 pgs peering
2. 1 pools have too many placement groups
3. 81 slow ops, oldest one blocked for 2134 sec, daemons [osd.22,osd.7,mon.pve1] have slow ops
I restarted each monitor and manager in sequence, but the same messages are still shown.
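By "restarted in sequence" I mean running this on one node at a time, waiting for quorum to recover before moving to the next (unit names assume the default Proxmox Ceph setup, with the hostname as the daemon ID):

```
systemctl restart ceph-mon@pve1.service
systemctl restart ceph-mgr@pve1.service
# then the same on pve2, pve3, pve4
```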
How can I fix this?