Ceph integration - clock skew

pvps1 · Oct 27, 2016

Hi
we have a problem with a 4 node cluster running integrated ceph (meaning nodes are pve and ceph-cluster in one).

3 nodes are ceph mons and osds, 2 of them report:
health HEALTH_WARN
clock skew detected on mon.1
Monitor clock skew detected

we cannot detect why. all nodes are running NTP and have accurate time.

any hints/tips?
networking bottleneck (only 1gbit networking)?

thx in advance,
Peter

wolfgang · Oct 27, 2016

Hi,
sometimes after a reboot it can take a while until they are in sync again.
How long have you been getting this message?

pvps1 · Oct 27, 2016

wolfgang said:
Hi,
sometimes after a reboot it can take a while until they are in sync again.
How long have you been getting this message?

it happens regularly, not just after rebooting. e.g. the last report was this morning (checks run hourly). all nodes have uptime > 7 days

wolfgang · Oct 27, 2016

Could it be that the nodes have different time zones?

pvps1 · Oct 27, 2016

wolfgang said:
Could it be that the nodes have different time zones?

No.
see (run with cssh on all 4 nodes):
root@dkcpn0001:~# cat /etc/timezone ; date
Europe/Vienna
Thu Oct 27 10:11:24 CEST 2016

root@dkcpn0002:~# cat /etc/timezone ; date
Europe/Vienna
Thu Oct 27 10:11:24 CEST 2016

root@dkcpn0003:~# cat /etc/timezone ; date
Europe/Vienna
Thu Oct 27 10:11:24 CEST 2016

root@dkcpr0001:~# cat /etc/timezone ; date
Europe/Vienna
Thu Oct 27 10:11:24 CEST 2016

pvps1 · Oct 27, 2016

the 3 "pn" nodes
Linux dkcpn0002 4.4.19-1-pve #1 SMP Wed Sep 14 14:33:50 CEST 2016 x86_64 GNU/Linux

the "pr" node (router, no guests, no ceph -> just doing quorum):
Linux dkcpr0001 4.4.6-1-pve #1 SMP Thu Apr 21 11:25:40 CEST 2016 x86_64 GNU/Linux

all:
pve-manager Version: 4.3-3

spirit · Oct 27, 2016

This warning appears when your ceph mon nodes are not time-synced (> 50ms difference, and only between the mons).
try

Code:

date +"%T.%3N"

wolfgang · Oct 27, 2016

Maybe try another NTP server that is near your cluster.

pvps1 · Oct 27, 2016

spirit said:
This warning appears when your ceph mon nodes are not time-synced (> 50ms difference, and only between the mons).
try

Code:

date +"%T.%3N"

running cssh:

root@dkcpn0001:~# date +"%T.%3N"
14:16:05.967
root@dkcpn0002:~# date +"%T.%3N"
14:16:05.888
root@dkcpn0003:~# date +"%T.%3N"
14:16:05.967

don't know if the time difference can come from cssh runtime or the node's load.

all nodes sync to node dkcpr0001 which is part of the cluster (therefore 1 hop, switched)
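
To put a number on the readings above, here is a minimal sketch that computes the spread between timestamps in the `date +"%T.%3N"` format and compares it with the 50 ms limit mentioned earlier in the thread. The function name `clock_spread_ms` is made up for illustration. It ignores midnight wrap-around, and timestamps collected through cssh also include command start-up latency, so treat the result as a rough indicator only.

```shell
#!/bin/sh
# Hypothetical helper: prints the difference in milliseconds between the
# earliest and latest HH:MM:SS.mmm timestamps given as arguments.
# Assumption: all timestamps fall on the same day (no midnight wrap-around).
clock_spread_ms() {
    printf '%s\n' "$@" | awk -F'[:.]' '
        {
            # Milliseconds since midnight for this timestamp.
            ms = (($1 * 60 + $2) * 60 + $3) * 1000 + $4
            if (NR == 1 || ms < min) min = ms
            if (NR == 1 || ms > max) max = ms
        }
        END { print max - min }'
}

# The three readings posted above (dkcpn0001, dkcpn0002, dkcpn0003).
spread=$(clock_spread_ms 14:16:05.967 14:16:05.888 14:16:05.967)
if [ "$spread" -gt 50 ]; then
    echo "spread of ${spread} ms exceeds the 50 ms limit"
else
    echo "spread of ${spread} ms is within the 50 ms limit"
fi
```

With the three values posted above, the spread is 79 ms, which is over the limit and matches the warning for the mon on dkcpn0002.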

pvps1 · Oct 27, 2016

wolfgang said:
Maybe try another NTP server that is near your cluster.

all nodes sync to node dkcpr0001 which is part of the cluster (therefore 1 hop, switched). pr0001 fetches time from some external server

pvps1 · Oct 27, 2016

realized -> it is always mon.1 that is reported. this is node pn0002, which really has a time difference of ~100ms (see last reply).
did a manual resync with the internal timeserver and immediately after that a date +"%T.%3N" again shows between 50 and 100ms difference...

the node has no extraordinary load; on the contrary, it's the least used node.
hmmm, have to refresh my ntpd knowledge on how to increase precision on that node or find the reason for the bias

Leron · Mar 8, 2018

I am testing a Proxmox cluster with 3 nodes: 1 old HP and 2 new HPE servers. I don't know why, but the new servers are not correctly synced despite the ntpd process running.
ntpq -p shows a network jitter problem... I don't know why...
I synced the new HPE servers to my old server: same problem, jitter is high.
I tested with chrony, another NTP client, and it works!
chrony has an "Estimation of asymmetric jitter" feature; maybe that was my problem.
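
For anyone who wants to spot the same symptom quickly, here is a rough sketch that filters `ntpq -p` output for peers with a large offset or jitter. It assumes the usual ten-column layout (offset in column 9, jitter in column 10, both in milliseconds, after two header lines). The function name `flag_ntp_peers` and the limits in the usage comment are arbitrary choices for illustration, not recommended values.

```shell
#!/bin/sh
# Hypothetical helper: reads `ntpq -p` output on stdin and prints peers whose
# absolute offset or jitter exceeds the given limits.
#   $1: offset limit in ms
#   $2: jitter limit in ms
# Assumption: standard ntpq -p columns, with two header lines to skip.
flag_ntp_peers() {
    awk -v max_off="$1" -v max_jit="$2" '
        NR > 2 && NF >= 10 {
            off = $9 + 0
            if (off < 0) off = -off
            if (off > max_off || $10 + 0 > max_jit)
                printf "%s offset=%s ms jitter=%s ms\n", $1, $9, $10
        }'
}

# Usage on a node where ntpd is running (limits are illustrative only):
#   ntpq -pn | flag_ntp_peers 25 10
```

The first character of each reported peer name is ntpq's tally code (for example `*` for the currently selected peer), so it is kept in the output on purpose.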

aderumier · Mar 9, 2018

Hi,
I'm using chrony too now; no more clock skew, and it's also able to handle leap seconds.

https://chrony.tuxfamily.org/comparison.html
