Did upgrades today that included Ceph 14.2.5 and had to restart all OSDs, Monitors, and Managers.
After restarting all Monitors and Managers, I was still getting errors every 5 seconds:
Dec 17 21:59:05 pve11 ceph-mon[3925461]: 2019-12-17 21:59:05.214 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:10 pve11 ceph-mon[3925461]: 2019-12-17 21:59:10.214 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:15 pve11 ceph-mon[3925461]: 2019-12-17 21:59:15.214 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:20 pve11 ceph-mon[3925461]: 2019-12-17 21:59:20.214 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:25 pve11 ceph-mon[3925461]: 2019-12-17 21:59:25.218 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:30 pve11 ceph-mon[3925461]: 2019-12-17 21:59:30.218 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:35 pve11 ceph-mon[3925461]: 2019-12-17 21:59:35.218 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:40 pve11 ceph-mon[3925461]: 2019-12-17 21:59:40.218 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:45 pve11 ceph-mon[3925461]: 2019-12-17 21:59:45.218 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:50 pve11 ceph-mon[3925461]: 2019-12-17 21:59:50.221 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Dec 17 21:59:55 pve11 ceph-mon[3925461]: 2019-12-17 21:59:55.221 7f29ff2c5700 -1 mon.pve11@0(leader) e5 get_health_metrics reporting 1 slow ops, oldest is osd_failure(failed timeout osd.13 [v2:10.10.3.14:6802/3287700,v1:10.10.3.14:6805/3287700] for 25sec e3725 v3725)
Restarting all Monitors and Managers a second time has so far stopped the error messages...
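If anyone wants to check whether the messages have really stopped rather than eyeballing the journal, here is a rough sketch that counts the slow-op reports per monitor and OSD. The function name `summarize_slow_ops` and the regex are my own, written only against the log lines quoted above; other Ceph versions may format these messages differently.

```python
import re
from collections import Counter

# Assumption: pattern is derived only from the log lines quoted in this post.
SLOW_OPS_RE = re.compile(
    r"(?P<mon>mon\.[^@\s]+)\S*\s.*get_health_metrics reporting \d+ slow ops, "
    r"oldest is osd_failure\(failed timeout (?P<osd>osd\.\d+) "
)


def summarize_slow_ops(lines):
    """Count slow-op reports per (monitor, OSD) pair in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SLOW_OPS_RE.search(line)
        if match:  # lines that are not slow-op reports are skipped
            counts[(match.group("mon"), match.group("osd"))] += 1
    return counts
```

One way to use it would be to pipe recent monitor journal output into a small script that calls `summarize_slow_ops(sys.stdin)` and prints the result; an empty result after the second restart would suggest the stuck `osd_failure` op is gone.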