Syslog is spammed with: unable to determine idle time

zygis

New Member
Feb 6, 2019
Some time ago I started to receive syslog errors like this every few seconds for all containers in a physical node:
Code:
bindings.c: 4526: cpuview_proc_stat: cpuX from /lxc/CID/ns has unexpected cpu time: 9164403 in /proc/stat, 9393872 in cpuacct.usage_all; unable to determine idle time

I added the '-l' flag in /lib/systemd/system/lxcfs.service to separate the container load averages from the physical node's (https://forum.proxmox.com/threads/lxc-loadavg.44463/), and then these errors showed up. After removing the flag and restarting the physical node, the errors are still there, so I'm not sure that's actually the cause. Should I recreate the containers to get rid of these errors? Is there any other way to fix this without recreation?

Code:
# pveversion -v
proxmox-ve: 5.4-1 (running kernel: 4.15.18-15-pve)
pve-manager: 5.4-6 (running version: 5.4-6/aa7856c5)
pve-kernel-4.15: 5.4-3
pve-kernel-4.15.18-15-pve: 4.15.18-40
pve-kernel-4.15.18-12-pve: 4.15.18-36
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-10
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-52
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-43
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-37
pve-container: 2.0-39
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-52
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
 
hi,

see this github issue from lxcfs[0]

looks like just a harmless debug log?

> htop will randomly show 600% CPU usage on all processes when this happens

but this one is quite interesting. do you have this too, or does it cause any other problems that you can see?

[0]: https://github.com/lxc/lxcfs/issues/283
 
Actually no, I can't see any spikes in CPU usage when this happens. When /lib/systemd/system/lxcfs.service was running with the -l flag, there was a problem with one tool: the CPU usage was always shown at 100%. If I recall correctly, it was htop, while top, glances and the monitoring solution showed correct values.

I just noticed that those messages appear only on each htop tick (every second) and when the monitoring agent runs its checks (telegraf, every 10 seconds). But only from a few containers and only for a few CPU cores.

Code:
pct cpusets
-------------------------------------------------------------------
106:    1   3 4
108:  0         5
137:                    9                   16             21
148:  0 1 2       6 7 8   10 11 12 13 14 15    17 18 19 20    22 23
160:              6                               18
-------------------------------------------------------------------

The messages are for CPUs 1, 3 and 5, and only from containers 106 and 108.
 
it likely has to do with the different methods the various tools use to obtain/calculate the cpu usage.

i wouldn't worry about it too much
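fwiw, the message comes from a simple consistency check: lxcfs compares the per-cpu time it reads from the host's /proc/stat with the container cgroup's cpuacct.usage_all, and when the cgroup counter is ahead it can't compute a non-negative idle value, so it logs instead. a rough sketch of that check using the numbers from the log line above (not the actual lxcfs source):

```shell
# values from the log line in this thread, both already in clock ticks
proc_stat=9164403
cpuacct=9393872

# if the cgroup accounting is ahead of the host's, idle time would come
# out negative, so lxcfs gives up and logs the message instead
if [ "$cpuacct" -gt "$proc_stat" ]; then
    echo "unable to determine idle time"
fi
```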
 
to clean up the logs, you can try putting this in /etc/rsyslog.d/pve-local.conf:
Code:
#  filter out
:msg, contains, "unable to determine idle time" stop

then
Code:
systemctl restart rsyslog
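the `:msg, contains, ... stop` rule drops every message whose text contains that substring before it reaches the log files; the effect is comparable to this grep filter (illustration only, not how rsyslog works internally):

```shell
# two sample lines: one containing the filtered phrase, one not;
# only the second survives the filter
printf '%s\n' \
  "lxcfs: cpuview_proc_stat: unable to determine idle time" \
  "kernel: eth0: link up" |
  grep -v "unable to determine idle time"
```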
 
I noticed the same behavior on a couple of containers after a recent upgrade to current; it spams every few seconds. The rsyslog change suppresses it. I would be interested to see what the fix turns out to be.
 
I just started seeing this issue too with an LXC container I've had running for 2.5 years. It posts every second. I'll try the log filter posted above.

Update: The filter text added to /etc/rsyslog.d/pve-local.conf doesn't seem to be working for me.

[screenshot attached: Screen Shot 2020-05-20 at 9.46.58 PM.png]
 
> Update: The filter text added to /etc/rsyslog.d/pve-local.conf doesn't seem to be working for me.

did you restart the service for the filter to take effect?

are you running lxcfs with or without the -l flag?
 
> did you restart the service for the filter to take effect?
>
> are you running lxcfs with or without the -l flag?
Yes, I had restarted the service and the server. I am running lxcfs.service without the -l flag. If I should have it, I can follow these instructions after hours later today.
Code:
[Unit]
Description=FUSE filesystem for LXC
ConditionVirtualization=!container
Before=lxc.service
Documentation=man:lxcfs(1)

[Service]
ExecStart=/usr/bin/lxcfs /var/lib/lxcfs
KillMode=process
Restart=on-failure
ExecStopPost=-/bin/fusermount -u /var/lib/lxcfs
Delegate=yes
ExecReload=/bin/kill -USR1 $MAINPID

[Install]
WantedBy=multi-user.target
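In case I do end up adding the -l flag, my understanding is that a systemd drop-in is cleaner than editing the shipped unit file directly (a sketch, assuming the standard override mechanism; the empty ExecStart= line clears the original entry before replacing it):
Code:
# /etc/systemd/system/lxcfs.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/lxcfs -l /var/lib/lxcfs

followed by systemctl daemon-reload and a restart of lxcfs.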
 
> Yes, I had restarted the service and the server. I am running lxcfs.service without the -l flag. If I should have it, I can follow these instructions after hours later today.
> Code:
> [Unit]
> Description=FUSE filesystem for LXC
> ConditionVirtualization=!container
> Before=lxc.service
> Documentation=man:lxcfs(1)
>
> [Service]
> ExecStart=/usr/bin/lxcfs /var/lib/lxcfs
> KillMode=process
> Restart=on-failure
> ExecStopPost=-/bin/fusermount -u /var/lib/lxcfs
> Delegate=yes
> ExecReload=/bin/kill -USR1 $MAINPID
>
> [Install]
> WantedBy=multi-user.target

* what is the output of pveversion -v ?

* how many containers are running?

* what kind of storage are the containers on? (zfs, local, lvm, etc.)

* does the log spam stop when you stop a certain container? (so we can narrow it down)
 
> * what is the output of pveversion -v ?
>
> * how many containers are running?
>
> * what kind of storage are the containers on? (zfs, local, lvm, etc.)
>
> * does the log spam stop when you stop a certain container? (so we can narrow it down)
I'm going to reply, but for some reason I'm no longer getting spammed with the idle time message even though I didn't add the -l flag.

Here's the pveversion -v output:
Code:
proxmox-ve: 6.2-1 (running kernel: 5.4.41-1-pve)
pve-manager: 6.2-4 (running version: 6.2-4/9824574a)
pve-kernel-5.4: 6.2-2
pve-kernel-helper: 6.2-2
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.13.13-5-pve: 4.13.13-38
pve-kernel-4.13.4-1-pve: 4.13.4-26
pve-kernel-4.10.17-3-pve: 4.10.17-23
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.15-1-pve: 4.10.15-15
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-2
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-1
pve-cluster: 6.1-8
pve-container: 3.1-6
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-2
pve-qemu-kvm: 5.0.0-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1
I'm running two containers and one VM.

All the containers are on local-lvm.

The log spam was always related to one container, but it has stopped now and I honestly can't explain why.
 
I have the same problem:
Code:
proc_fuse.c: 1018: proc_stat_read: cpu18 from /lxc/XXX/ns has unexpected cpu time: 5729917 in /proc/stat, 5763564 in cpuacct.usage_all; unable to determine idle time