Hello Proxmox Team,
I am running a 3-node Ceph cluster (ceph1, ceph2, and ceph3). Currently, ceph1 is configured as the primary local time source, while ceph2 and ceph3 use Chrony to sync their time from ceph1.
Recently, ceph3 experienced an unexpected incident and required a reboot to recover. A few days after the reboot, the active ceph1 dashboard started reporting a health warning: "Clock skew detected among monitors (WARNING)" on ceph2 and ceph3.
Here is the status payload from the cluster:
"ceph2": {
    "skew": -0.16136153235400391,
    "latency": 0.00091213023097808923,
    "health": "HEALTH_WARN",
    "details": "clock skew 0.161362s > max 0.05s"
}
The current Chrony tracking status on the affected nodes also shows the clock skew exceeding the 0.05 s maximum that Ceph allows.
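To make the comparison explicit, here is a minimal Python sketch of the check as I understand it: a monitor is flagged when the absolute value of its reported skew exceeds the limit. The function name `skewed_monitors` and the constant `MAX_SKEW_SECONDS` are hypothetical names chosen for illustration, and the 0.05 s value is taken from the "details" field above rather than read from the cluster configuration.

```python
import json

# Assumption: 0.05 s is the limit quoted in the payload's "details" field.
MAX_SKEW_SECONDS = 0.05


def skewed_monitors(payload, max_skew=MAX_SKEW_SECONDS):
    """Return {monitor_name: skew} for monitors whose |skew| exceeds max_skew."""
    return {
        name: info["skew"]
        for name, info in payload.items()
        if abs(info.get("skew", 0.0)) > max_skew
    }


# The status fragment from this post, wrapped in braces so it parses as JSON.
sample = json.loads("""
{
  "ceph2": {
    "skew": -0.16136153235400391,
    "latency": 0.00091213023097808923,
    "health": "HEALTH_WARN",
    "details": "clock skew 0.161362s > max 0.05s"
  }
}
""")

# Print each flagged monitor with its skew in seconds.
for name, skew in skewed_monitors(sample).items():
    print(f"{name}: skew {skew:+.6f}s exceeds {MAX_SKEW_SECONDS}s")
```

With the payload above, ceph2 is flagged because its skew of roughly 0.161 s is about three times the limit, even though the measured latency is under 1 ms.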
Based on your experience, what is the best practice to resolve this permanently?
Thank you in advance for your support and guidance.