Syslog is spammed with: unable to determine idle time

zygis

Feb 6, 2019
Some time ago I started receiving syslog errors like this every few seconds for all containers on a physical node:
Code:
bindings.c: 4526: cpuview_proc_stat: cpuX from /lxc/CID/ns has unexpected cpu time: 9164403 in /proc/stat, 9393872 in cpuacct.usage_all; unable to determine idle time

I added the '-l' flag to /lib/systemd/system/lxcfs.service to separate the containers' load averages from the physical node's (https://forum.proxmox.com/threads/lxc-loadavg.44463/), and then these errors showed up. After removing the flag and restarting the physical node, the errors are still there, so I'm not sure that's actually the cause. Should I recreate the containers to get rid of these errors? Is there any other way to fix this without recreating them?
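
For reference, the change from that thread amounts to appending the -l flag (per-container load averages) to the ExecStart line; a rough sketch of the relevant part of the unit:
Code:
# /lib/systemd/system/lxcfs.service (excerpt)
[Service]
ExecStart=/usr/bin/lxcfs -l /var/lib/lxcfs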

Code:
# pveversion -v
proxmox-ve: 5.4-1 (running kernel: 4.15.18-15-pve)
pve-manager: 5.4-6 (running version: 5.4-6/aa7856c5)
pve-kernel-4.15: 5.4-3
pve-kernel-4.15.18-15-pve: 4.15.18-40
pve-kernel-4.15.18-12-pve: 4.15.18-36
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-10
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-52
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-43
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-37
pve-container: 2.0-39
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-52
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
 
hi,

see this GitHub issue from lxcfs [0]

looks like just a harmless debug log?

> htop will randomly show 600% CPU usage on all processes when this happens

but this one is quite interesting. do you have this too, or does it cause any other problems that you can see?

[0]: https://github.com/lxc/lxcfs/issues/283
 
Actually, no. I can't see any spikes in CPU usage when this happens. When /lib/systemd/system/lxcfs.service was running with the -l flag, there was a problem with one tool: its CPU usage reading was always at 100%. If I recall correctly, it was htop, while top, glances, and the monitoring solution showed correct values.

I just noticed that those messages appear only on each htop tick (every second) and when the monitoring agent runs its checks (telegraf, every 10 seconds), but only from a few containers and only for a few CPU cores.

Code:
pct cpusets
-------------------------------------------------------------------
106:    1   3 4
108:  0         5
137:                    9                   16             21
148:  0 1 2       6 7 8   10 11 12 13 14 15    17 18 19 20    22 23
160:              6                               18
-------------------------------------------------------------------

The messages are for CPUs 1, 3, and 5, and only from containers 106 and 108.
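
For anyone who wants to see the two counters lxcfs compares, a rough manual check (paths assume cgroup v1 and container 106 from the table above; adjust to your setup):
Code:
# host-side tick counters for cpu1 (user nice system idle iowait ...)
grep '^cpu1 ' /proc/stat

# per-cpu usage of the container's cgroup, in nanoseconds (user/system)
awk '$1 == 1' /sys/fs/cgroup/cpuacct/lxc/106/ns/cpuacct.usage_all

# lxcfs converts the nanosecond values to ticks (USER_HZ, usually 100)
# before comparing; the warning fires when the cgroup's usage exceeds
# the host's total time for that cpu, which would make the computed
# idle time negative.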
 
it likely has to do with how different tools use various methods to obtain/calculate CPU usage.

i wouldn't worry about it too much
 
to clean up the logs, try putting this into /etc/rsyslog.d/pve-local.conf:
Code:
# filter out the lxcfs idle-time messages
:msg, contains, "unable to determine idle time" stop

then restart rsyslog:
Code:
systemctl restart rsyslog
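
to verify the filter is active, you can emit a matching test line (logger is assumed to go through rsyslog on the node) and check that it doesn't land in the log:
Code:
logger "test: unable to determine idle time"
tail -n 5 /var/log/syslog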
 
I noted the same behavior on a couple of containers after a recent upgrade to current; it spams every few seconds. The rsyslog change suppresses it. I would be interested to see what the fix turns out to be.
 
I just started seeing this issue too, with an LXC container I've had running for 2.5 years. It logs every second. I'll try the log filter posted above.

Update: The filter text added to /etc/rsyslog.d/pve-local.conf doesn't seem to be working for me.

 
> Update: The filter text added to /etc/rsyslog.d/pve-local.conf doesn't seem to be working for me.

did you restart the service for the filter to take effect?

are you running lxcfs with or without the -l flag?
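
one way to check, assuming the standard unit name:
Code:
# show the arguments lxcfs was started with
ps -o args= -C lxcfs
# or inspect the effective unit, including any drop-in overrides
systemctl cat lxcfs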
 
> did you restart the service for the filter to take effect?
>
> are you running lxcfs with or without the -l flag?
Yes, I had restarted the service and the server. I am running lxcfs.service without the -l flag. If I should have it, I can follow those instructions later today, after hours.
Code:
[Unit]
Description=FUSE filesystem for LXC
ConditionVirtualization=!container
Before=lxc.service
Documentation=man:lxcfs(1)

[Service]
ExecStart=/usr/bin/lxcfs /var/lib/lxcfs
KillMode=process
Restart=on-failure
ExecStopPost=-/bin/fusermount -u /var/lib/lxcfs
Delegate=yes
ExecReload=/bin/kill -USR1 $MAINPID

[Install]
WantedBy=multi-user.target
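
For anyone who does add the flag, a drop-in override keeps the packaged unit file intact; a minimal sketch:
Code:
# create an override instead of editing /lib/systemd/system/lxcfs.service
systemctl edit lxcfs
# in the editor that opens, add (ExecStart= must be cleared before redefining):
#   [Service]
#   ExecStart=
#   ExecStart=/usr/bin/lxcfs -l /var/lib/lxcfs
systemctl restart lxcfs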
 
> Yes, I had restarted the service and the server. I am running lxcfs.service without the -l flag.

* what is the output of pveversion -v?

* how many containers are running?

* what kind of storage are the containers on? (zfs, local, lvm, etc.)

* does the log spam stop when you stop a certain container? (so we can narrow it down)
 
> * what is the output of pveversion -v?
>
> * how many containers are running?
>
> * what kind of storage are the containers on? (zfs, local, lvm, etc.)
>
> * does the log spam stop when you stop a certain container? (so we can narrow it down)
I'm going to reply, but for some reason I'm no longer getting spammed with the idle time message even though I didn't add the -l flag.

Here's the pveversion -v output:
proxmox-ve: 6.2-1 (running kernel: 5.4.41-1-pve)
pve-manager: 6.2-4 (running version: 6.2-4/9824574a)
pve-kernel-5.4: 6.2-2
pve-kernel-helper: 6.2-2
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.13.13-5-pve: 4.13.13-38
pve-kernel-4.13.4-1-pve: 4.13.4-26
pve-kernel-4.10.17-3-pve: 4.10.17-23
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.15-1-pve: 4.10.15-15
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-2
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-1
pve-cluster: 6.1-8
pve-container: 3.1-6
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-2
pve-qemu-kvm: 5.0.0-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1
I'm running two containers and one VM.

All the containers are on local-lvm.

The log spam was always related to one container, but it has stopped now and I honestly can't explain why.
 
I have the same problem:
Code:
proc_fuse.c: 1018: proc_stat_read: cpu18 from /lxc/XXX/ns has unexpected cpu time: 5729917 in /proc/stat, 5763564 in cpuacct.usage_all; unable to determine idle time
 
