Best practice for internal time server cluster

ozy

New Member
Jul 19, 2023
Hello everyone!

I'm new to this forum. First of all, thank you for creating this awesome product!

I have 3 server nodes and I created an internal Ceph cluster on them.
As you know, Ceph is very sensitive to time, and I don't want to rely on external time servers.

I want to create an internal time server cluster on my server nodes.
This 3-node internal time server cluster will be connected to an external time server.
Proxmox should use only the internal time servers, so that sync is fast.

My goal: with an internal time server cluster, time stays synced at all times and the Ceph cluster keeps working through any external network outage.

Is there any best practice for this with Proxmox?
If not, can you please help me implement this setup?

Thanks.
 
I hope you will not run your NTP cluster inside your PVE cluster. If you ever need to do a cold start, you may run into trouble.
I'd set up one internal NTP server, e.g. on your router or a Pi, and on each node configure the same upstream in addition to the internal one.
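
For what it's worth, a node-side chrony.conf along those lines could be generated roughly like this. It is only a sketch: the hostnames ntp1.lan and upstream.example.org are placeholders I made up, and using `prefer` is just one way to bias a node towards the internal box.

```shell
#!/bin/bash
# Print a chrony.conf fragment for a PVE node: one internal server plus the
# same upstream the internal server uses. Both hostnames are placeholders.
node_conf() {
  internal="$1"
  upstream="$2"
  cat <<EOF
# Internal time source, preferred while it is selectable
server $internal iburst prefer
# Same upstream as the internal server, so a cold start still finds time
server $upstream iburst
EOF
}

node_conf ntp1.lan upstream.example.org
```

Redirect the output into a scratch file and review it before merging anything into /etc/chrony/chrony.conf.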
 

This 3-node Proxmox cluster is going to be my management system.
Every VM inside this cluster will be used for management and automation, like Grafana, Ansible, LDAP, etc.

I'm not planning to deploy the NTP servers as virtual machines.
I want to run the NTP cluster directly on the hosts, with the NTP service starting right after the network is online.
I want an election mechanism so that there is always exactly one leader. The actual reference NTP server will be outside my system.
My main goal is to prevent time differences during outside network problems.

I have 30+ servers in my system, and I'm thinking of pointing all of them at this internal 3-node NTP cluster, with an outside NTP server as fallback.

I have 2x clustered Cisco ASA5515-X and 2x stacked (vPC) Cisco 93180YC main switches.
I wonder if I can create an NTP server cluster on these devices with an outside NTP server as upstream. I will check :)
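
One possible way to get the "one leader" behaviour on the hosts themselves is chrony's orphan mode. The helper below is a minimal sketch, not a tested production setup: ntp.example.org, 10.0.0.0/24 and the pve2.lan / pve3.lan names are assumptions for illustration, and whether you list the other hosts as `peer` or `server` is a choice you should check against the chrony documentation.

```shell
#!/bin/bash
# Print a chrony.conf fragment for one of the three hosts.
# Usage: cluster_conf UPSTREAM SUBNET OTHER_HOST [OTHER_HOST...]
# All names and the subnet passed below are illustrative placeholders.
cluster_conf() {
  upstream="$1"
  subnet="$2"
  shift 2
  echo "# Real time comes from the external server while it is reachable"
  echo "server $upstream iburst"
  echo "# The other cluster hosts"
  for p in "$@"; do
    echo "peer $p"
  done
  echo "# With the upstream gone, orphan mode leaves only the host with the"
  echo "# smallest reference ID serving its local clock; the others follow it"
  echo "local stratum 10 orphan"
  echo "# Let the rest of the network query this host"
  echo "allow $subnet"
}

cluster_conf ntp.example.org 10.0.0.0/24 pve2.lan pve3.lan
```

Each host would get the same fragment with the other two hosts listed, so the configuration stays identical apart from the peer names.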
 
Is there a stratum 0 source available in your region that is reachable by some means other than the internet? You can also run physical NTP servers that get their time e.g. over DCF77 (here in Germany) and so are completely offline (with respect to the internet).
 
I have a reliable external time server. That part is already covered.
I just want to add an internal layer to be safer.
For example, I recently had to work on my edge routers and the outside network was unreachable for a day.
I immediately started to see time differences, and by the end of the day the Ceph cluster had become unresponsive.

With internal NTP servers, I just want all of my servers to be in SYNC with each other. I don't care whether the time itself is correct.
Also, for a well-behaved Ceph storage, I want time differences of ±1 ms or less across all of my nodes.
I believe I can reach this goal with internal clustered time servers backed by an external time server.
This way, when I want to change the external time server, I only have to change it in one place.
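
A rough way to check the 1 ms target on a node is to parse `chronyc tracking`. The function below is a sketch under the assumption that the "System time" line has the usual layout shown in the comment; `offset_ok` is a hypothetical helper name. It only tells you how far a node is from its own selected source, not the pairwise difference between two nodes.

```shell
#!/bin/bash
# Read `chronyc tracking` output on stdin and report whether the system clock
# offset is within a limit given in seconds (default 0.001 = 1 ms).
offset_ok() {
  limit="${1:-0.001}"
  awk -v limit="$limit" '
    /^System time/ {
      # Expected layout: "System time     : 0.000123456 seconds fast of NTP time"
      # Field 4 is the unsigned offset; "fast"/"slow" carries the direction.
      found = 1
      if ($4 + 0 <= limit + 0) print "OK " $4; else print "SKEW " $4
    }
    END { if (!found) print "NO DATA" }
  '
}

# Typical use on a node (not run here):
#   chronyc tracking | offset_ok 0.001
```

As far as I know, Ceph's own clock skew warning (mon_clock_drift_allowed) defaults to 0.05 s, so 1 ms is a much stricter target than Ceph itself requires.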
 
Yup, you are going to need those, and working time servers, or clustering goes wild. I found out the hard way that the US government's time servers ban clients that poll too aggressively, which is easy to do with multiple set-it-and-forget-it servers: "All users should ensure that their software NEVER queries a server more frequently than once every 4 seconds. Systems that exceed this rate will be refused service. In extreme cases, systems that exceed this limit may be considered as attempting a denial-of-service attack."

Open your firewall's WAN side for inbound NTP (UDP port 123). You can also allow ICMP echo requests, but only for this ping test and sorting, and only for the listed time servers. Put a deny-everyone-else rule at the very bottom of the table, below your Timeserver alias list (pfSense); otherwise you will get traffic from everywhere.

So the solution is to find a list of stratum 1 time servers that are OPEN, and then find the ones closest to you in ping time. Once we have those, we just paste them with nano into the chrony.conf of your time server:

# sudo nano /etc/chrony/chrony.conf
# systemctl restart chronyd
# journalctl --since -1h -u chrony

Then update your firewall alias rules for the time servers, or add them individually, and call it a day.
Good luck with stratum 0 sources: those are reference clocks and cannot be used over the network unless you own the physical equipment.

First, find your server lists or pool lists.

*** Make sure you can receive WAN UDP traffic on port 123 from at least the servers you want to test. You could just open it all up (allow all UDP port 123) and close it again when done. You'll also need ICMP echo request open. So let's do this quickly.

Second, we need to ping each server and sort by lowest ping time. Yes, pool names are round-robin, but the times stay close to the same. So we build our list of NTP time servers AFTER we have tested which ones are worthy.

  1. Open a new file in the nano text editor.
  2. Run # nano ping_script.sh to create a new file named ping_script.sh.
  3. Copy and paste the entire script below into the file. Just make sure you replace my servers with the ones you want to test near you, or use mine if you live on the US east coast.
  4. Each time server must be quoted, like "mytimeserver1", so don't forget this when you add yours.
  5. Save and close the file. In nano, press Ctrl+X, then Y to confirm that you want to save the changes, and then Enter to confirm the file name.
  6. Make the script executable by running chmod +x ping_script.sh.
  7. Run the script by typing # ./ping_script.sh in the CLI.

Code:
#!/bin/bash

servers=(
"tock.gpsclock.com"
"3.amazon.pool.ntp.org"
"clock.nyc.he.net"
"time1.facebook.com"
"tick.gpsclock.com"
"0.north-america.pool.ntp.org"
"time.cloudflare.com"
"time.facebook.com"
"time.euro.apple.com"
"time.apple.com"
"time3.facebook.com"
"bonehed.lcs.mit.edu"
"ntp.quidnet.com"
"time4.facebook.com"
"2.amazon.pool.ntp.org"
"time.google.com"
"time4.google.com"
"time3.google.com"
"time5.facebook.com"
"time2.google.com"
"time1.google.com"
"time.keneli.org"
"1.amazon.pool.ntp.org"
"utcnist2.colorado.edu"
"1.north-america.pool.ntp.org"
"3.north-america.pool.ntp.org"
"time.windows.com"
"clock.sjc.he.net"
"clock.isc.org"
"tick.ucla.edu"
"timekeeper.isi.edu"
"0.amazon.pool.ntp.org"
"montpelier.ilan.caltech.edu"
"2.north-america.pool.ntp.org"
"ntp.ubuntu.com"
"time2.facebook.com"
)

results=()

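for server in "${servers[@]}"; do
  echo "Pinging $server"
  # The last line of ping output looks like "rtt min/avg/max/mdev = a/b/c/d ms":
  # awk takes the a/b/c/d field, cut takes the second value (the average).
  # An empty value usually means the name did not resolve or the host did not answer.
  pingresult=$(ping -c 4 "$server" | tail -1 | awk '{print $4}' | cut -d '/' -f 2)
  results+=("$pingresult ms to $server")
  # Pause between hosts so we stay well clear of any rate limits.
  sleep 5
done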

# Sort the results
IFS=$'\n' sorted=($(sort -n <<<"${results[*]}"))
unset IFS

# Print the sorted results
for result in "${sorted[@]}"; do
  echo "$result"
done


Final results will take a while because of the 5-second sleep between servers. Please don't make this faster. Don't be that guy ;)
Code:
$ ./ping_script.sh
Pinging tock.gpsclock.com
Pinging 3.amazon.pool.ntp.org
Pinging clock.nyc.he.net
Pinging time1.facebook.com
Pinging tick.gpsclock.com
Pinging 0.north-america.pool.ntp.org
Pinging time.cloudflare.com
Pinging time.facebook.com
Pinging time.euro.apple.com
Pinging time.apple.com
Pinging time3.facebook.com
Pinging bonehed.lcs.mit.edu
Pinging ntp.quidnet.com
Pinging time4.facebook.com
Pinging 2.amazon.pool.ntp.org
Pinging time.google.com
Pinging time4.google.com
Pinging time3.google.com
Pinging time5.facebook.com
Pinging time2.google.com
Pinging time1.google.com
Pinging time.keneli.org
Pinging 1.amazon.pool.ntp.org
Pinging utcnist2.colorado.edu
Pinging 1.north-america.pool.ntp.org
Pinging 3.north-america.pool.ntp.org
Pinging time.windows.com
Pinging clock.sjc.he.net
Pinging clock.isc.org
Pinging tick.ucla.edu
Pinging timekeeper.isi.edu
Pinging 0.amazon.pool.ntp.org
Pinging montpelier.ilan.caltech.edu
Pinging 2.north-america.pool.ntp.org
Pinging ntp.ubuntu.com
Pinging time2.facebook.com
 ms to 2.amazon.pool.ntp.org
7.260 ms to tick.gpsclock.com
7.738 ms to time.euro.apple.com
7.778 ms to time1.facebook.com
7.847 ms to tock.gpsclock.com
8.441 ms to time.apple.com
9.192 ms to 0.amazon.pool.ntp.org
9.712 ms to clock.nyc.he.net
11.010 ms to time.cloudflare.com
12.518 ms to time3.facebook.com
14.046 ms to bonehed.lcs.mit.edu
15.365 ms to time4.facebook.com
15.943 ms to time.facebook.com
17.114 ms to ntp.quidnet.com
17.192 ms to 1.amazon.pool.ntp.org
20.815 ms to 3.north-america.pool.ntp.org
23.618 ms to time4.google.com
25.319 ms to time3.google.com
25.741 ms to time5.facebook.com
26.781 ms to time.google.com
26.990 ms to time1.google.com
30.609 ms to time.keneli.org
31.540 ms to time2.google.com
40.254 ms to 0.north-america.pool.ntp.org
47.358 ms to utcnist2.colorado.edu
52.578 ms to time.windows.com
67.785 ms to clock.isc.org
68.588 ms to clock.sjc.he.net
68.904 ms to 1.north-america.pool.ntp.org
69.388 ms to tick.ucla.edu
70.285 ms to montpelier.ilan.caltech.edu
71.538 ms to timekeeper.isi.edu
74.218 ms to 3.amazon.pool.ntp.org
74.637 ms to 2.north-america.pool.ntp.org
76.616 ms to ntp.ubuntu.com
110.130 ms to time2.facebook.com
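
The first result line above has no number because the script got nothing usable back for that host. A slightly more defensive way to pull the average out of ping's summary is sketched below; `avg_rtt` is a hypothetical helper, and it assumes the summary line contains "min/avg/max" followed by " = " and slash-separated values, as iputils ping prints it.

```shell
#!/bin/bash
# Read the output of `ping -c N host` on stdin and print the average RTT in ms,
# or "unreachable" when no summary line with min/avg/max is present.
avg_rtt() {
  awk '
    /min\/avg\/max/ {
      # Split "rtt min/avg/max/mdev = a/b/c/d ms" at " = ", then at "/".
      split($0, halves, " = ")
      split(halves[2], values, "/")
      print values[2]
      found = 1
    }
    END { if (!found) print "unreachable" }
  '
}

# Typical use (not run here):
#   ping -c 4 time.cloudflare.com | avg_rtt
```

In the loop, entries that come back as "unreachable" could be skipped instead of being added to the results.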


So now you can add these to your chrony.conf if you want the lowest latency. However, some of these are individual IPs and not really reliable for enterprise use, so we will take the best of the best. Use the pooled servers from the NTP Pool Project FIRST, because each pool name has so many IPs behind it. There are 638 active servers in this zone alone, so reliability is solved.

server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org
server 3.pool.ntp.org

But we must not get banned for polling every 4 seconds, which is easy to trigger if you hard-code these into other servers, routers, or other devices. So let's keep this enterprise grade; all the IoT junk can use the individual servers we pinged above. In conclusion, the following gives a good balance of reliability and low ping time.
To add these:
# nano /etc/chrony/chrony.conf
Code:
# NTP Pool servers
# These servers are part of the NTP Pool Project, a large cluster of time servers.
# They are polled frequently (every 64 to 1024 seconds) to provide accurate and reliable time synchronization.
# The 'iburst' option allows for quicker initial synchronization.
# 'minpoll 6' and 'maxpoll 10' set the minimum and maximum polling intervals to 2^6 (64) and 2^10 (1024) seconds, respectively.
server 0.pool.ntp.org iburst minpoll 6 maxpoll 10
server 1.pool.ntp.org iburst minpoll 6 maxpoll 10
server 2.pool.ntp.org iburst minpoll 6 maxpoll 10
server 3.pool.ntp.org iburst minpoll 6 maxpoll 10

# Other servers
# These servers are polled less frequently (roughly every 4.5 hours) to provide additional redundancy and to detect any issues with the NTP Pool servers.
# The 'iburst' option allows for quicker initial synchronization.
# 'minpoll 14' and 'maxpoll 14' set both the minimum and maximum polling intervals to 2^14 (16384) seconds, which is about 4.5 hours.
server time.facebook.com iburst minpoll 14 maxpoll 14
server time.apple.com iburst minpoll 14 maxpoll 14
server time1.google.com iburst minpoll 14 maxpoll 14
server time.windows.com iburst minpoll 14 maxpoll 14
server ntp.ubuntu.com iburst minpoll 14 maxpoll 14
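
It is worth checking the arithmetic on the poll exponents, since minpoll and maxpoll are powers of two in seconds. A quick sketch (`poll_seconds` is just a helper name for this example):

```shell
#!/bin/bash
# Convert an NTP poll exponent into the interval it stands for (2^n seconds).
poll_seconds() {
  n="$1"
  s=1
  i=0
  while [ "$i" -lt "$n" ]; do
    s=$((s * 2))
    i=$((i + 1))
  done
  echo "$s"
}

# Print the intervals for the exponents used in the config above.
for n in 6 10 14; do
  s=$(poll_seconds "$n")
  echo "poll $n = $s s = $((s / 60)) min"
done
```

2^14 is 16384 seconds, which is about 4.5 hours. A full day is 86400 seconds, which falls between exponents 16 and 17. All of these intervals are far above the 4-second limit quoted earlier.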

assuming the latest version of Proxmox VE

# sudo nano /etc/chrony/chrony.conf

Find the "# Use Debian vendor zone." comment and paste your FQDNs there, in the same form as the existing line:
pool 2.debian.pool.ntp.org iburst


Save the file and exit the text editor.

# systemctl restart chrony.service
# systemctl status chrony.service

You can check that the changes have been applied correctly by looking at the system logs
# journalctl --since -1h -u chrony
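
Besides the journal, `chronyc sources` shows which server chrony has actually selected (the line marked `^*`). Here is a small sketch for pulling that name out in a script; `selected_source` is a hypothetical helper, and the sample layout in the comment is how I'd expect the output to look rather than a capture from a real system.

```shell
#!/bin/bash
# Read `chronyc sources` output on stdin and print the source chrony is
# currently synchronised to, i.e. the line whose first column is "^*".
# Prints "none" when no source is selected.
selected_source() {
  awk '
    # Example line: "^* 1.pool.ntp.org   2   6   377   10   -120us[ -150us] +/-   18ms"
    substr($1, 2, 1) == "*" { print $2; found = 1 }
    END { if (!found) print "none" }
  '
}

# Typical use (not run here):
#   chronyc sources | selected_source
```

If this prints "none" a few minutes after the restart, the server lines or the firewall rules are the first things to look at.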

Finally, go add these to your firewall rules:

Allow ONLY these time servers into WAN UDP 123
Deny everything else on WAN UDP 123

Allow ONLY these time servers to send ICMP echo requests
Deny ICMP echo requests from everyone else

The ping script should work as long as the awk and cut commands are available on your system.
It pings each server in the list, extracts the average ping time, and then sorts and prints the results.
If awk is missing:
# sudo apt-get install gawk

You will never want to hear about time servers again until you add a few more.
Hope this helps someone who trusts a single set of IPs like the NIST Internet Time Servers. Without a backup source that is polled a few times a day, you will never know when it goes down.
 