Hello,
We usually setup plain old "ntp" service in our Proxmox nodes and sync time of the VM from them. The configuration is the default that comes with the package. Today I got a couple warnings regarding clock skew on 2 of 3 nodes of a cluster. Checking the time on them, I saw that ntp was not running, but it was up and running in the third node. This cluster runs on Proxmox 6.4-5.
Tried to start ntp as usual:
Service stayed up for like 20 seconds:
Then it exited:
While it was running, it adjusted the time on both servers and I could query ntp with "ntpq -p".
Checked /var/log/syslog and saw that for some reason systemd thinks that something is not ok and stops ntp (Proxmox node IPs removed from log):
Both the "kernel" and "dbus-daemon" messages can be ignored as the show in the third node and in at least 3 other clusters I've checked, were ntp runs properly.
Then, I checked the systemd file for the NTP service (vi /lib/systemd/system/ntp.service). it was the same in all nodes (as expected). That file specifies a "Conflicts=" directive with systemd-timesyncd.service, as expected, too.
I wanted to be sure if the problem was ntp itself or systemd, so I started ntp using the wrapper script configured in the systemd file
As I need ntp to be running, I've end up disabling systemd-timesyncd service and then I can start ntp using systemd properly.
Why is systemd thinking that ntp isn't starting properly ("Condition check resulted in Network Time Synchronization being skipped")?
Why this happens only on two of the three servers of this cluster?
Thanks
We usually setup plain old "ntp" service in our Proxmox nodes and sync time of the VM from them. The configuration is the default that comes with the package. Today I got a couple warnings regarding clock skew on 2 of 3 nodes of a cluster. Checking the time on them, I saw that ntp was not running, but it was up and running in the third node. This cluster runs on Proxmox 6.4-5.
Tried to start ntp as usual:
systemctl restart ntp
Service stayed up for like 20 seconds:
systemctl status ntp
● ntp.service - Network Time Service
Loaded: loaded (/lib/systemd/system/ntp.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2021-05-26 17:55:42 CEST; 1s ago
Docs: man:ntpd(8)
Process: 8326 ExecStart=/usr/lib/ntp/ntp-systemd-wrapper (code=exited, status=0/SUCCESS)
Main PID: 8339 (ntpd)
Tasks: 2 (limit: 4915)
Memory: 2.2M
CGroup: /system.slice/ntp.service
└─8339 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 110:117
Then it exited:
systemctl status ntp
● ntp.service - Network Time Service
Loaded: loaded (/lib/systemd/system/ntp.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Wed 2021-05-26 17:55:56 CEST; 33s ago
Docs: man:ntpd(8)
Process: 8326 ExecStart=/usr/lib/ntp/ntp-systemd-wrapper (code=exited, status=0/SUCCESS)
Main PID: 8339 (code=exited, status=0/SUCCESS)
While it was running, it adjusted the time on both servers and I could query ntp with "ntpq -p".
Checked /var/log/syslog and saw that for some reason systemd thinks that something is not ok and stops ntp (Proxmox node IPs removed from log):
Code:
May 26 17:59:35 maigmo02 systemd[1]: Starting Network Time Service...
May 26 17:59:35 maigmo02 ntpd[560155]: ntpd 4.2.8p12@1.3728-o (1): Starting
May 26 17:59:35 maigmo02 ntpd[560155]: Command line: /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 110:117
May 26 17:59:35 maigmo02 systemd[1]: Started Network Time Service.
May 26 17:59:35 maigmo02 ntpd[560168]: proto: precision = 0.050 usec (-24)
May 26 17:59:35 maigmo02 ntpd[560168]: leapsecond file ('/usr/share/zoneinfo/leap-seconds.list'): good hash signature
May 26 17:59:35 maigmo02 ntpd[560168]: leapsecond file ('/usr/share/zoneinfo/leap-seconds.list'): loaded, expire=2021-12-28T00:00:00Z last=2017-01-01T00:00:00Z ofs=37
May 26 17:59:35 maigmo02 ntpd[560168]: Listen and drop on 0 v6wildcard [::]:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen and drop on 1 v4wildcard 0.0.0.0:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 2 lo 127.0.0.1:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 3 vmbr0 :123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 4 vmbr110 :123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 5 vmbr111 :123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 6 vmbr31 :123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 7 vmbr112 :123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 8 vmbr113 :123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 9 lo [::1]:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 10 vmbr0 []:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 11 vmbr1 []:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 12 vmbr110 []:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 13 vmbr111 []:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 14 vmbr31 []:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 15 vmbr112 []:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listen normally on 16 vmbr113 []:123
May 26 17:59:35 maigmo02 ntpd[560168]: Listening on routing socket on fd #33 for interface updates
May 26 17:59:35 maigmo02 ntpd[560168]: kernel reports TIME_ERROR: 0x6041: Clock Unsynchronized
May 26 17:59:35 maigmo02 ntpd[560168]: kernel reports TIME_ERROR: 0x6041: Clock Unsynchronized
May 26 17:59:35 maigmo02 kernel: [1277187.992065] audit: type=1400 audit(1622044775.617:23): apparmor="DENIED" operation="connect" profile="/usr/sbin/ntpd" name="/run/dbus/system_bus_socket" pid=560168 comm="ntpd" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
May 26 17:59:36 maigmo02 ntpd[560168]: Soliciting pool server 146.59.16.42
May 26 17:59:37 maigmo02 ntpd[560168]: Soliciting pool server 5.196.160.139
May 26 17:59:37 maigmo02 ntpd[560168]: Soliciting pool server 51.77.221.70
May 26 17:59:38 maigmo02 ntpd[560168]: Soliciting pool server 91.224.149.41
May 26 17:59:38 maigmo02 ntpd[560168]: Soliciting pool server 62.210.136.184
May 26 17:59:38 maigmo02 ntpd[560168]: Soliciting pool server 62.210.244.146
May 26 17:59:39 maigmo02 ntpd[560168]: Soliciting pool server 151.80.211.8
May 26 17:59:39 maigmo02 ntpd[560168]: Soliciting pool server 212.83.158.83
May 26 17:59:39 maigmo02 ntpd[560168]: Soliciting pool server 51.68.44.27
May 26 17:59:39 maigmo02 ntpd[560168]: Soliciting pool server 95.81.173.8
May 26 17:59:40 maigmo02 ntpd[560168]: Soliciting pool server 94.23.21.189
May 26 17:59:40 maigmo02 ntpd[560168]: Soliciting pool server 37.187.122.11
May 26 17:59:40 maigmo02 ntpd[560168]: Soliciting pool server 194.177.34.116
May 26 17:59:41 maigmo02 ntpd[560168]: Soliciting pool server 51.15.157.154
May 26 17:59:41 maigmo02 ntpd[560168]: Soliciting pool server 162.159.200.1
May 26 17:59:42 maigmo02 ntpd[560168]: Soliciting pool server 2a01:e34:ec2a:94a0::2
May 26 18:00:30 maigmo02 dbus-daemon[2018]: [system] Activating via systemd: service name='org.freedesktop.timesync1' unit='dbus-org.freedesktop.timesync1.service' requested by ':1.145936' (uid=0 pid=562385 comm="timedatectl timesync-status " label="unconfined")
May 26 18:00:30 maigmo02 ntpd[560168]: ntpd exiting on signal 15 (Terminated)
May 26 18:00:30 maigmo02 ntpd[560168]: 146.59.16.42 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 systemd[1]: Stopping Network Time Service...
May 26 18:00:30 maigmo02 ntpd[560168]: 5.196.160.139 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 systemd[1]: Condition check resulted in Network Time Synchronization being skipped.
May 26 18:00:30 maigmo02 ntpd[560168]: 51.77.221.70 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 62.210.136.184 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 62.210.244.146 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 91.224.149.41 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 51.68.44.27 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 151.80.211.8 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 212.83.158.83 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 95.81.173.8 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 systemd[1]: ntp.service: Succeeded.
May 26 18:00:30 maigmo02 ntpd[560168]: 37.187.122.11 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 94.23.21.189 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 194.177.34.116 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 51.15.157.154 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 ntpd[560168]: 162.159.200.1 local addr 51.178.75.34 -> <null>
May 26 18:00:30 maigmo02 systemd[1]: Stopped Network Time Service.
Both the "kernel" and "dbus-daemon" messages can be ignored as the show in the third node and in at least 3 other clusters I've checked, were ntp runs properly.
Then, I checked the systemd file for the NTP service (vi /lib/systemd/system/ntp.service). it was the same in all nodes (as expected). That file specifies a "Conflicts=" directive with systemd-timesyncd.service, as expected, too.
I wanted to be sure if the problem was ntp itself or systemd, so I started ntp using the wrapper script configured in the systemd file
/usr/lib/ntp/ntp-systemd-wrapper
. It did run correctly and did not stop. So the problem lies somewhere in systemdAs I need ntp to be running, I've end up disabling systemd-timesyncd service and then I can start ntp using systemd properly.
Why is systemd thinking that ntp isn't starting properly ("Condition check resulted in Network Time Synchronization being skipped")?
Why this happens only on two of the three servers of this cluster?
Thanks