Ceph went down after reinstalling 1 OSD

elmacus · Aug 16, 2019

Ceph cluster: 4 nodes, 24 OSDs (mixed SSD and HDD), Ceph Nautilus 14.2.1 (via Proxmox 6, 7 nodes).
PG autoscale is ON. 5 pools, including 1 big pool with all the VMs at 512 PGs (all SSD). This size did not change when I turned on autoscale on the SSD pool; only the smaller pools for HDD and test changed.
All OSDs were installed under Luminous.
I took out and destroyed an NVMe P900 (set as SSD); it was GPT and named osd.22.
I created it again as new. It now shows up as LVM and stores about 22% of 447 GB, circa 100 GB.

After about 5 minutes the reported size of the RBD pool dropped from 3.77 TB of data to 100 GB, and PG num went from 512 to 4.
Then the whole Ceph cluster went unresponsive; no VM responded for 1 minute.

Then I turned off autoscale for this pool and set it manually to 512 again, and BAAM, the whole Proxmox cluster went down and restarted, which took 10 minutes. Probably HA went down because Ceph traffic was sky-high on the same network (it shouldn't be).
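
For readers who want to reproduce the manual override described above, here is a minimal sketch. It assumes the standard Nautilus `ceph osd pool set` syntax; the helper names `run` and `pin_pg_num` and the `DRY_RUN` switch are made up for illustration and are not part of Ceph or Proxmox. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: pin a pool's PG count by hand instead of letting the autoscaler choose.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: pin_pg_num POOL PG_NUM
pin_pg_num() {
    pool="$1"
    pg_num="$2"
    # Stop the autoscaler from changing this pool again.
    run ceph osd pool set "$pool" pg_autoscale_mode off || return 1
    # Set the PG count back to the intended value (512 in the post above).
    run ceph osd pool set "$pool" pg_num "$pg_num" || return 1
}
```

Calling `pin_pg_num rbd 512` prints the two commands (`rbd` is a placeholder pool name, not necessarily the one used in this cluster). Changing pg_num moves data between OSDs, so the traffic spike described above is plausible; running it outside busy hours, and with HA traffic on a separate network, is probably the safer choice.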

Now everything works again, but the SSD pool shows used: 0.52 % of 95.61 GB. It should be 3.77 TB of 10 TB.
All files are there, nothing is missing.

So is this a bug in Ceph, that autoscale can go wrong when reinstalling 1 disk?

Any hints on how to restore the reported size of RBD in this pool?

elmacus · Aug 16, 2019

All disks were ceph-disk, then I upgraded following:
https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus#Restart_the_OSD_daemon_on_all_nodes

According to Ceph:
https://docs.ceph.com/docs/master/ceph-volume/#migrating

Should I (and everyone who upgrades from Luminous to Nautilus) reinstall ALL OSDs with ceph-volume?
https://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#rados-replacing-an-osd

I guess that ONE OSD reinstall triggered (bug or not) something so that RBD only sees this new ceph-volume disk and not all the old ceph-disk ones, and therefore PG autoscale can now only see ceph-volume disks?

elmacus · Aug 16, 2019

Could be that this is fixed in 14.2.2:
https://ceph.io/releases/v14-2-2-nautilus-released/

Earlier Nautilus releases (14.2.1 and 14.2.0) have an issue where deploying a single new (Nautilus) BlueStore OSD on an upgraded cluster (i.e. one that was originally deployed pre-Nautilus) breaks the pool utilization stats reported by ceph df.
Until all OSDs have been reprovisioned or updated (via ceph-bluestore-tool repair), the pool stats will show values that are lower than the true value.
This is resolved in 14.2.2, such that the cluster only switches to using the more accurate per-pool stats after all OSDs are 14.2.2 (or later), are BlueStore, and (if they were created prior to Nautilus) have been updated via the repair function.

So Proxmox team, when can we expect Ceph 14.2.2?

tom · Aug 16, 2019

elmacus said:
So Proxmox team, when can we expect Ceph 14.2.2?

Already in the test repo.

http://download.proxmox.com/debian/ceph-nautilus/dists/buster/test/binary-amd64/

elmacus · Aug 16, 2019

So the fix until 14.2.2 is installed: for every OSD in the system, stop one at a time, then run:
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-22 (change the number to the correct OSD)

Watch disk usage go up with: ceph df
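
The one-OSD-at-a-time procedure above can be sketched as a small script. This is only a sketch under assumptions: it assumes the OSDs run as systemd units named `ceph-osd@<id>` and that setting `noout` during the work is acceptable; the function names and the `DRY_RUN` switch are hypothetical. By default it prints the commands instead of running them, and it does not wait for the cluster to become healthy between OSDs, which you should check yourself.

```shell
#!/bin/sh
# Sketch: run "ceph-bluestore-tool repair" on OSDs one at a time.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop one OSD, repair its BlueStore data, start it again.
repair_osd() {
    id="$1"
    run systemctl stop "ceph-osd@${id}" || return 1
    run ceph-bluestore-tool repair --path "/var/lib/ceph/osd/ceph-${id}" || return 1
    run systemctl start "ceph-osd@${id}" || return 1
}

# Usage: repair_osds ID [ID...]  (only IDs of OSDs hosted on this node)
repair_osds() {
    # Keep Ceph from marking the stopped OSD out and rebalancing.
    run ceph osd set noout || return 1
    for osd_id in "$@"; do
        # On failure, stop here; noout stays set so the OSD can be inspected.
        repair_osd "$osd_id" || return 1
    done
    run ceph osd unset noout
}
```

For example, `repair_osds 22 23` prints the full command sequence for osd.22 and osd.23. Review it, then rerun with `DRY_RUN=0` on the node that hosts those OSDs.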

Or reinstall all disks: zap and create via the GUI. Ugh.

Proxmox, you should warn users about this in the wiki.

kyriazis · Dec 13, 2019

With Ceph 14.2.4.1, are there any gotchas for using pg autoscale? Is the correct method to use Ceph commands to enable it, or is there a more Proxmox-friendly way of doing it?

Alwin · Dec 13, 2019

@kyriazis, best see the Ceph docs.
https://docs.ceph.com/docs/nautilus/rados/operations/placement-groups/
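
As a rough illustration of what the linked Nautilus docs describe, here is a cautious way to try the autoscaler from the Ceph CLI. It is a sketch, not a Proxmox-specific procedure: it enables the `pg_autoscaler` manager module, puts one pool in `warn` mode (the autoscaler only reports what it would change) and then shows the status table. The helper names and the `DRY_RUN` switch are hypothetical; by default the commands are only printed.

```shell
#!/bin/sh
# Sketch: try the PG autoscaler in "warn" mode before letting it change anything.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: autoscale_warn POOL
autoscale_warn() {
    pool="$1"
    # The autoscaler is a manager module and must be enabled first.
    run ceph mgr module enable pg_autoscaler || return 1
    # "warn" raises a health warning instead of changing pg_num itself.
    run ceph osd pool set "$pool" pg_autoscale_mode warn || return 1
    # Show current and suggested PG counts per pool.
    run ceph osd pool autoscale-status || return 1
}
```

`autoscale_warn rbd` prints the three commands for a pool called `rbd` (a placeholder name). Given the stats issue discussed earlier in this thread, it seems wise to confirm that `ceph df` reports sane pool sizes before switching any pool from `warn` to `on`.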
