@midsize_erp Yeah, for sure. As far as this Ceph problem with the long ping times goes, it's gone. We are still seeing some odd VM behavior on the latest kernels (5.15.x), but Ceph has been fine.
cluster [WRN] Health check failed: Slow OSD heartbeats on back (longest 1064.667ms) (OSD_SLOW_PING_TIME_BACK)
cluster [WRN] Health check failed: Slow OSD heartbeats on front (longest 1244.808ms) (OSD_SLOW_PING_TIME_FRONT)
cluster [WRN] Health check update: Slow OSD heartbeats on back (longest 2670.215ms) (OSD_SLOW_PING_TIME_BACK)
cluster [WRN] Health check update: Slow OSD heartbeats on front (longest 1744.936ms) (OSD_SLOW_PING_TIME_FRONT)
cluster [WRN] Health check update: Slow OSD heartbeats on back (longest 2889.051ms) (OSD_SLOW_PING_TIME_BACK)
cluster [WRN] Health check update: Slow OSD heartbeats on front (longest 2889.027ms) (OSD_SLOW_PING_TIME_FRONT)
cluster [WRN] Health check update: Slow OSD heartbeats on back (longest 1872.833ms) (OSD_SLOW_PING_TIME_BACK)
cluster [WRN] Health check update: Slow OSD heartbeats on front (longest 2481.821ms) (OSD_SLOW_PING_TIME_FRONT)
Weird thing is that the issue went away and sorted itself out. There was no change in config or kernel. I don't believe we ever got a root cause. Remediation came via upgrading to a kernel that didn't have the issue.