Hi,

My node has three LXC containers:
- One shows no CPU usage and no memory usage stats (always 0).
- The others show correct stats, except diskread and diskwrite, which are always 0.

I tried rebooting the node, restarting the containers, and deleting /var/lib/rrdcached/db/pve2-vm/*.
The API also returns 0.

The containers and the software inside them work perfectly, and the Munin graphs show that the stats are not 0.

Any ideas?
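
For reference, the same values can also be queried through the API with something like this (node name and CT ID are placeholders; the endpoint is the standard PVE status call):

Code:
pvesh get /nodes/<node>/lxc/<ctid>/status/current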
 

Attachments

  • Capture Ecran 2017-02-23 à 11.15.17.jpg (98.1 KB)
what is the config of the container?

Code:
pct config <ctid>
 
Code:
root@alien /root$ pct config 100
arch: amd64
cores: 5
hostname: mysql2.humanoid.fr
memory: 71680
nameserver: 8.8.8.8 8.8.4.4
net0: name=eth0,bridge=vmbr0,firewall=1,gw=62.210.0.1,hwaddr=00:50:56:00:0e:b8,ip=212.83.180.208/32,type=veth
net1: name=eth1,bridge=vmbr1,hwaddr=DE:F6:CC:C3:6C:AE,ip=192.168.0.100/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-zfs:subvol-100-disk-1,size=80G
searchdomain: humanoid.fr
swap: 4096
 
what about
Code:
pct status 100 --verbose

?
 
Code:
root@alien /var/lib/rrdcached/db$ pct status 100 --verbose
cpu: 0
cpus: 5
disk: 8950251520
diskread: 0
diskwrite: 0
lock:
maxdisk: 85899345920
maxmem: 75161927680
maxswap: 4294967296
mem: 0
name: mysql2.humanoid.fr
netin: 43340003299
netout: 1450873518810
pid: 63740
status: running
swap: 0
template:
type: lxc
uptime: 79097

Here it is!
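
For what it's worth, as far as I understand these numbers come from the container's cgroups, so the raw values can also be compared directly on the host with something like this (cgroup v1 paths for CT 100, adjust as needed):

Code:
cat /sys/fs/cgroup/cpuacct/lxc/100/cpuacct.usage
cat /sys/fs/cgroup/memory/lxc/100/memory.usage_in_bytes
cat /sys/fs/cgroup/blkio/lxc/100/blkio.throttle.io_service_bytes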
 
I'm facing a similar problem too: on all my LXC containers, only the network stats are live.
Interestingly, I had to migrate one container and change its ID (with stop/backup/restore), and while restoring to a new ID, I noticed that the old one had begun to show some stats, probably some time after the migration. Alas, after the restore, the new one still shows null values.
Code:
# pveversion -v
proxmox-ve: 4.4-80 (running kernel: 4.4.40-1-pve)
pve-manager: 4.4-12 (running version: 4.4-12/e71b7a74)
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.40-1-pve: 4.4.40-80
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-109
pve-firmware: 1.1-10
libpve-common-perl: 4.0-92
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-73
pve-libspice-server1: 0.12.8-1
vncterm: 1.3-1
pve-docs: 4.4-3
pve-qemu-kvm: 2.7.1-3
pve-container: 1.0-94
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-3
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
Code:
# pct status 102 --verbose
cpu: 0
cpus: 1
disk: 5634506752
diskread: 143360
diskwrite: 0
lock:
maxdisk: 8320901120
maxmem: 3221225472
maxswap: 536870912
mem: 0
name: xxx.xxx.com
netin: 390570508
netout: 512469702
pid: 25167
status: running
swap: 0
template:
type: lxc
uptime: 52859
Code:
# pct config 102
arch: amd64
cores: 1
hostname: xxx.xxx.com
memory: 3072
net0: name=eth0,bridge=vmbr0,gw=xx.xx.xx.1,hwaddr=00:xx:xx:xx:xx:5a,ip=xx.xx.xx.127/24,type=veth
onboot: 1
ostype: debian
rootfs: storage:vm-102-disk-1,quota=1,size=8G
swap: 512
 
I started a previous container (ID 112) and, surprise, this one shows stats...
So, checking what might differ between this one and the others, I found that this one has an empty log in /var/log/lxc/112.log, while all the others show errors in their respective log files:
Code:
# cat /var/log/lxc/102.log
      lxc-start 20160218122918.961 ERROR    lxc_conf - conf.c:run_buffer:405 - Script exited with status 1.
      lxc-start 20160218122918.961 ERROR    lxc_start - start.c:lxc_fini:546 - Failed to run lxc.hook.post-stop for container "102".
      lxc-start 20160218122920.430 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1317 - Path "/sys/fs/cgroup/systemd//lxc/102" already existed.
      lxc-start 20160218122920.431 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1381 - No such file or directory - Failed to create /sys/fs/cgroup/systemd//lxc/102: No such file or directory
      lxc-start 20160223015306.471 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1317 - Path "/sys/fs/cgroup/systemd//lxc/102" already existed.
      lxc-start 20160223015306.471 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1381 - No such file or directory - Failed to create /sys/fs/cgroup/systemd//lxc/102: No such file or directory

Notice the double slash after "systemd"; could this be a hint?
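
In case it helps, the cgroup paths that currently exist for a container can be listed on the host with something like this (CT 102 here):

Code:
ls -ld /sys/fs/cgroup/systemd/lxc/102*
ls -ld /sys/fs/cgroup/*/lxc/102*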
 
After migrating the containers to another host, the stats are live again... (roughly the commands I used are shown after the list below)

That tells us three things:
- it is not container-dependent: I have done this on 8 different containers
- it is not host-dependent: migrating back to the original host keeps the stats working
- there is something flawed with backup/restore, but it can be cured with a migration
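
For completeness, the migration was just the standard one; roughly something like this per container (the target node name is a placeholder):

Code:
pct shutdown 102
pct migrate 102 <targetnode>
# then, on the target node:
pct start 102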
 
- there is something flawed with backup/restore

For information: restoring a new backup does give correct stats, and IIRC the previous backups were OpenVZ ones that were converted to LXC.

So it seems it was the OpenVZ-to-LXC conversion that had some problem on all containers, but a further backup/restore works OK.
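
For reference, a fresh backup/restore cycle along these lines is what now gives correct stats (the storage names are from my config above; the new CT ID and the archive timestamp are placeholders):

Code:
vzdump 102 --mode snapshot --compress lzo --storage local
pct restore <newid> /var/lib/vz/dump/vzdump-lxc-102-<timestamp>.tar.lzo --storage storage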
 
I'm noticing a similar issue. My /var/log/lxc/121.log is full of lines like:

Code:
lxc-start 20170306022553.225 ERROR lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1317 - Path "/sys/fs/cgroup/systemd//lxc/121-2" already existed.
lxc-start 20170306022553.225 ERROR lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1381 - No such file or directory - Failed to create /sys/fs/cgroup/systemd//lxc/121-2: No such file or directory.

The contents of /sys/fs/cgroup/systemd/lxc are as follows:

Code:
drwxr-xr-x 3 root root 0 Mar 5 18:15 121
drwxr-xr-x 3 root root 0 Mar 5 18:15 121-1
drwxr-xr-x 3 root root 0 Mar 5 18:15 121-2
drwxr-xr-x 3 root root 0 Mar 5 18:36 121-3

What's up with all the 121 container directories?
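
For what it's worth, which of those directories the running container actually uses can be checked via its init PID (taken from pct status --verbose; <pid> is a placeholder):

Code:
pct status 121 --verbose | grep ^pid
cat /proc/<pid>/cgroup | grep systemd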
 
Yes, there is currently a bug when rebooting containers from within the container; a fix is in the works.
 
I have noticed the same problem when backing up an LXC container on one Proxmox host and restoring it on another. Kindly looking for an update :)
 
The fixed package is available in all repositories. Reboot the host after upgrading (the upgraded packages also contain a kernel update with a security fix anyway) and everything should work as expected again.
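
In other words, something along these lines on each node:

Code:
apt-get update
apt-get dist-upgrade
reboot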
 
