I had two Ceph pools for RBD virtual disks: vm_images (boot disk images) and rbd_data (extra disk images).
Then, while adding pools for a RADOS Gateway (.rgw.*), ceph health suddenly reported that my vm_images pool had too few PGs, so I ran:
ceph osd pool set vm_images pg_num <larger_number>
ceph osd pool set vm_images pgp_num <larger_number>
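For reference, the current PG values and the exact health warning can be checked with the standard ceph CLI before and after such a change, e.g.:

ceph osd pool get vm_images pg_num    # current number of placement groups
ceph osd pool get vm_images pgp_num   # PGs actually used for data placement
ceph health detail                    # shows which pool triggers the "too few PGs" warning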
Those two commands kicked off about 20 minutes of rebalancing with a lot of IO in the Ceph cluster. Eventually the cluster was healthy again, but almost all of my PVE VMs ended up in the stopped state. I'm wondering why, a watchdog thing maybe...
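If the rebalance IO turns out to be the culprit, I'm assuming it could be throttled next time before bumping pg_num with the usual OSD backfill/recovery knobs (illustrative values only, not something I actually ran; exact syntax depends on the Ceph release):

# temporarily limit backfill/recovery concurrency so client IO keeps priority (example values, not tuned)
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
# watch the rebalance progress until HEALTH_OK
ceph -w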
/Steffen
PS! I admit my Ceph public and cluster networks are on the same physical 2-3 Gb/s LACP load-balanced network (some nodes with 2x 1 Gb/s NICs, some with 3x 1 Gb/s NICs), since my only other physical network is a slow 100 Mb/s public network.