Still happened today. It seems to be related to the call to sync and not arc_prune. Here are some more logs. Let me know if you need any more information:
root 6959 1.5 0.0 199056 72048 ? SLsl 2018 1112:56 /usr/sbin/corosync -f
root 19671 0.0 0.0 7060 664 ? D Jan14...
Still happened today when rolling back a snapshot
root 25899 0.0 0.0 35148 3108 ? D 10:56 0:00 zfs rollback rpool/data/subvol-101-disk-0@pre_deploy
EDIT: Realised the host is behind on kernel updates. I will update and reboot to see if that solves the issue.
INFO: task kworker/u66:3:17147...
Yeah, I've looked into netdata, but the end goal is to have a script on the host that reports the CPU usage so we can plug it into Nagios for alerting.
Hello,
I'm trying to use this monitoring script for Nagios, which monitors LXC CPU and RAM usage.
https://www.claudiokuenzler.com/nagios-plugins/check_lxc.php#CommandDefinition
The script uses the command lxc-cgroup
I tried
lxc-cgroup -n 101 memory.stat
which returns nothing and no error...
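For anyone scripting around this, here is a minimal Python sketch of how a check could treat empty memory.stat output as an explicit UNKNOWN result instead of passing silently. It assumes the output consists of "key value" lines and that an `rss` key is present (as on cgroup v1); the function names are hypothetical and not taken from the check_lxc plugin.

```python
def parse_cgroup_stat(output):
    """Parse 'key value' lines (the memory.stat layout) into a dict of ints."""
    stats = {}
    for line in output.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <unsigned integer>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def check_memory_stat(output):
    """Return a Nagios-style (exit_code, message) pair.

    Exit code 3 (UNKNOWN) is used when the command printed nothing,
    which is the symptom described above.
    """
    stats = parse_cgroup_stat(output)
    if not stats:
        return 3, "UNKNOWN - memory.stat output was empty"
    # 'rss' is an assumed key name; adjust to whatever the host reports.
    rss = stats.get("rss", 0)
    return 0, "OK - rss={} bytes".format(rss)
```

The text passed in would be whatever `lxc-cgroup -n 101 memory.stat` prints, captured for example with `subprocess.run(..., capture_output=True, text=True).stdout`.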
Hello,
There seems to be a problem with one of our servers where arc_prune is stuck in D state. The sync from the initramfs-update never completes and waits on something, which makes me believe it's arc_prune.
The server is a Dell R710 with 142GB of RAM and SSDs
I also have a bunch of errors...
It seems the bug was never fixed. I found a workaround: connect to the host and run a ping to another machine through lxc-attach, which brings the network up. Here's the full command:
ssh proxmox-1 'sudo lxc-attach -n ct_id -- ping -c 10 any_server'
Hopefully it can help others...
How would I go about monitoring the CPU usage of an LXC container? Solutions like the ones in Nagios read /proc, but since the CPUs are shared we get a lot of false positives.
Example:
CT A has 4 cores and uses 2 at 100% - 50% total CPU usage
CT B has 2 cores which are shared with CT A - 100% CPU usage...
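One way to avoid that kind of false positive is to compute each container's usage against its own core allocation, from two samples of cumulative CPU time. The Python sketch below assumes the cumulative counter is in nanoseconds (as cgroup v1 `cpuacct.usage` reports it). Where that counter lives on a given host is not covered here, and the function names and thresholds are hypothetical.

```python
def container_cpu_percent(usage_start_ns, usage_end_ns, interval_s, allocated_cores):
    """CPU usage of one container as a percentage of its allocated cores.

    usage_*_ns are two readings of a cumulative CPU-time counter in
    nanoseconds, taken interval_s seconds apart.
    """
    if interval_s <= 0 or allocated_cores <= 0:
        raise ValueError("interval_s and allocated_cores must be positive")
    used_core_seconds = (usage_end_ns - usage_start_ns) / 1e9
    return 100.0 * used_core_seconds / (interval_s * allocated_cores)


def nagios_status(percent, warn=80.0, crit=95.0):
    """Map a percentage to Nagios exit codes: 0 OK, 1 WARNING, 2 CRITICAL.

    The default thresholds are placeholders, not recommendations.
    """
    if percent >= crit:
        return 2
    if percent >= warn:
        return 1
    return 0
```

With the numbers from the example, a container with 4 cores that burned 2 core-seconds over a 1 second window comes out at 50%, while a container with 2 cores burning the same amount comes out at 100%.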
Sorry Wolfgang, maybe my original post wasn't clear enough. This is the RAM usage for the host (node), not a KVM/CT. The issue is that I'm trying to monitor the RAM usage with Nagios, but I don't know whether I should rely on the RAM reported by the webui or by free.
Nothing
Startup finished in 8.443s (kernel) + 1min 11.598s (userspace) = 1min 20.041s
I'm starting to think maybe it's my HBA. I will research some more. Thanks for the help
Hello,
I'm trying to understand how the webui reports the RAM usage for the host.
As you can see here, the RAM usage reported by the webui is 50GB
but according to free, shouldn't that value be the total minus the "available" column (125-45=80GB used)?
I'm not sure which one to believe. Could ZFS...
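For reference, the "total minus available" figure can be reproduced directly from /proc/meminfo, which is also where free takes its numbers. This is a minimal Python sketch with hypothetical function names. It only computes that one figure and makes no assumption about how the webui derives its own value.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Size entries in /proc/meminfo are reported in kB.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_excluding_available_kib(meminfo):
    """Memory in use, computed as MemTotal minus MemAvailable."""
    return meminfo["MemTotal"] - meminfo["MemAvailable"]
```

On a host this would be called as `used_excluding_available_kib(parse_meminfo(open("/proc/meminfo").read()))`, and the result can then be compared against the value shown in the webui.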
Everything seems normal in dmesg and journalctl. Could it be the HBA? I'm still not sure, since I have other servers with the same hardware config and the same firmware version that do not exhibit the problem.
We have some servers that display really long boot times. We are talking about 30 to 45 minutes each.
At first I thought it was related to the server POST, but the image here shows that the server finished its POST correctly and gave control to the OS. This is where they hang for 30+ min.
We have...