Proxmox VE Ceph clock issue

nethfel

Member
Dec 26, 2014
Hi all,

I had some issues with my prox ceph cluster. I had one comp fail, no problem - swapped the hard drives into identical hardware and restarted the node. In checking everything I found a hdd in another node had failed, so I removed it as an osd and replaced it. No problem - everything rebuilt ok, but I still saw a health_warn. I did a ceph health detail and I saw:

HEALTH_WARN clock skew detected on mon.1, mon.2
mon.1 addr 172.16.6.6:6789/0 clock skew 0.0936987s > max 0.05s (latency 0.0187915s)
mon.2 addr 172.16.6.7:6789/0 clock skew 0.102262s > max 0.05s (latency 0.0181998s)

I tried to force an NTP sync with ntpdate, but the warning didn't go away. I've let it sit a few hours and plan to check on it soon, but I was wondering if there was anything else I could do to force them all to sync up properly...
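
For anyone who wants to pick the affected monitors out of that output, here is a minimal sketch. It assumes the `ceph health detail` line format shown in the post above; the function name `skewed_mons` is made up for illustration and is not part of Ceph.

```shell
#!/bin/sh
# Hypothetical helper (not a Ceph tool): read `ceph health detail` output on
# stdin and print each monitor whose reported skew exceeds the reported max.
# Assumes lines shaped like:
#   mon.1 addr 172.16.6.6:6789/0 clock skew 0.0936987s > max 0.05s (latency ...)
skewed_mons() {
    awk '/clock skew/ && / > max / {
        for (i = 1; i <= NF; i++) {
            if ($i == "skew") skew = $(i + 1) + 0   # "0.0936987s" -> 0.0936987
            if ($i == "max")  max  = $(i + 1) + 0   # "0.05s" -> 0.05
        }
        if (skew > max) printf "%s skew=%.4fs max=%.2fs\n", $1, skew, max
    }'
}

# Usage, on a node with the ceph CLI available:
#   ceph health detail | skewed_mons
```

The summary line (HEALTH_WARN ...) is skipped because it has no "> max" part, so only the per-monitor lines are reported.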
 
Usually in a Ceph cluster the best idea is to sync the nodes with each other to make sure all nodes have the same time. It will correct itself, though, so no worries as long as the gap is not significant. With a big enough clock skew you will notice the Ceph cluster preventing writes to the OSDs of that node. I have seen a skewed clock take as long as 3 days to fix itself without any intervention. A node reboot will also force a clock sync.
 
Is there a way to make prox nodes self sync or do I just need to choose one to be an ntp server and point the other nodes to it?

ps when I went to check this evening it had self corrected finally
 
Is there a way to make prox nodes self sync or do I just need to choose one to be an ntp server and point the other nodes to it?
Basically that's what you have to do. I usually pick one critical Proxmox node that I know for sure "must" run all the time, then point all other nodes to it for time sync.
 
I understand that this post is over a year old, but I would like to add my solution to this problem for others to see.

Solution: On all Proxmox nodes in the cluster, disable systemd's time sync service, install ntp, and configure it to use your NTP server and all other Proxmox nodes as peers. The Services view in the Proxmox web GUI will show systemd-timesyncd as dead. This is correct, as it has been replaced by the ntp daemon.

1. Disable Systemd's time sync service.
[node2.mynet.com]# systemctl stop systemd-timesyncd.service
[node2.mynet.com]# systemctl disable systemd-timesyncd.service
[node2.mynet.com]# systemctl mask systemd-timesyncd.service
Note: systemd's disable command does not fully disable a daemon. It only removes the links that start it at boot, so it can still be started manually or pulled in by another unit. The mask command has to be used as well; it replaces the unit with a link to /dev/null so that it cannot be started at all. Strange logic to say the least.

2. Install ntpd
[node2.mynet.com]# apt-get install ntp

3. Configure ntpd on every node with one NTP server and the other Proxmox nodes as peers (three peers per node in this four-node cluster). Do not include the current node as a peer to itself.
[node2.mynet.com]# nano /etc/ntp.conf
Edit:
--
# pool.ntp.org maps to about 1000 low-stratum NTP servers. Your server will
# pick a different set every time it starts up. Please consider joining the
# pool: <http://www.pool.ntp.org/join.html>
#server 0.debian.pool.ntp.org iburst
#server 1.debian.pool.ntp.org iburst
#server 2.debian.pool.ntp.org iburst
#server 3.debian.pool.ntp.org iburst
--
# NTP server
server ntp1.mynet.com iburst
# NTP peers: Each node in the cluster peers with all other cluster nodes in
# order to keep all nodes in sync. This has to be updated whenever node(s) are
# added or removed from the cluster.
peer node1.mynet.com
#peer node2.mynet.com # This is the current node.
peer node3.mynet.com
peer node4.mynet.com
--
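
Since the peer list differs on each node only by which line is left out, it can be generated instead of edited by hand. This is a rough sketch, not part of the original steps: the function name `ntp_peer_lines` is made up, and it assumes the node name you pass in matches the name used in the list exactly.

```shell
#!/bin/sh
# Hypothetical helper: print one "peer" line for every cluster node except
# the node named in the first argument (the current node).
ntp_peer_lines() {
    self=$1
    shift
    for node in "$@"; do
        # A node must not peer with itself, so skip it.
        [ "$node" = "$self" ] && continue
        printf 'peer %s\n' "$node"
    done
}

# Usage on node2 (review the output before pasting it into /etc/ntp.conf):
#   ntp_peer_lines node2.mynet.com node1.mynet.com node2.mynet.com \
#       node3.mynet.com node4.mynet.com
```

When a node is added to or removed from the cluster, only the argument list changes and the same call can be rerun on every node.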

4. Sync the time on all Proxmox nodes. (This will sync the time even if the offset exceeds the panic threshold; see man ntpd for details.)
This gets all the nodes time-synced within a short space of time: a few minutes or even seconds.
[node1.mynet.com]# /etc/init.d/ntp stop
[node1.mynet.com]# ntpd -gq
[node1.mynet.com]# /etc/init.d/ntp start

5. Check the time sync on all nodes
[node1.mynet.com]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.mynet.com  ....             1 u   74 1024  377   68.218  -61.079  32.217
+node1.mynet.com 192.168.100.24   2 u  150 1024  376    6.049  -31.594  10.073
+node3.mynet.com 192.168.100.24   2 u   70 1024  257    0.475  -24.275  42.678
-node4.mynet.com 192.168.100.24   2 u  724 1024  377    1.123  -12.062  16.346
==============================================================================
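
To scan that output quickly, the offset column (in milliseconds) can be compared against a limit. The sketch below is only a rough indicator: ntpq reports each offset relative to that remote, which is not the same measurement as the skew Ceph reports between monitors. The function name `ntpq_offsets_over` is made up, and the default of 50 ms simply mirrors the 0.05s maximum from the health warning.

```shell
#!/bin/sh
# Hypothetical helper: read `ntpq -p` output on stdin and print every peer
# whose absolute offset (9th column, milliseconds) exceeds the given limit.
ntpq_offsets_over() {
    limit_ms=${1:-50}
    awk -v limit="$limit_ms" '
        # Peer rows have 10 columns and a numeric offset; this skips the
        # header and the separator lines.
        NF == 10 && $9 ~ /^[-+]?[0-9.]+$/ {
            off = $9 + 0
            if (off < 0) off = -off
            if (off > limit) print $1, $9 " ms"
        }'
}

# Usage:
#   ntpq -p | ntpq_offsets_over 50
```

The peer name is printed with its leading tally character (*, +, -) so you can still see which source ntpd selected.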

Repeat the above on all Proxmox nodes in the cluster.

Why use the ntp daemon and disable systemd's time-sync service:
Firstly, I could not find a way to make systemd's time-sync service sync with multiple peers.
Secondly, why reinvent the wheel and replace it with a hexagon? The ntp daemon is stable, reliable, and versatile, so why replace it with something that is not stable and offers less?
 
I was having this issue too.

My solution was to install ONLY ntp, set a local (LAN) NTP server in /etc/ntp.conf, and then restart the ntp service. Do not install ntpdate (see below).

After that, timedatectl status shows
Code:
NTP enabled: yes

After a few minutes all servers were back in sync and ceph was happy: HEALTH_OK

See:
http://askubuntu.com/questions/7658...ctl-and-network-time-synchronisation-in-16-04

Second answer links to this "bug":
https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg4946295.html

-> "It maybe that ntp can't open port 123 since it's already opened by ntpdate."

From your original post it sounds like you have ntpdate installed (you mention trying to force an NTP sync with ntpdate).
 
