Ouch! We are very used to other tools showing the actual CPU usage, so I'm surprised that this is a percentage graph. The left-hand scale does not indicate that, and it is wrong in that case: the top should be 100% and the bottom 0%. As it is, the scale shows 14 at the top. Is that 14%?
I think...
I have an interesting situation. An LXC running Power-mail-in-a-box has 4 cores assigned (with 8GB RAM and 100GB NVMe ceph pool storage).
The graph below shows the following:
The section from 9:32 to around 10:02 is when I only had 4 cores assigned. Before and after that time I had 12 cores...
Surely this is a bug and can't be by design. If the main service goes down because the place one backs up to cannot be reached consistently, the backup service should balk, but the main service must stay stable.
I'll open a bug report for this.
The only one of these commands that shows a problem is this one:
root@FT1-NodeD:~# systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-10-19...
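As an aside (a generic check, not something this output shows): if the GUI stats ever look stale, restarting the status daemon is harmless:
systemctl restart pvestatd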
It seems that my remote backup server (PBS) may be the cause of this. I have now determined that the link to it is very slow (1 Mb/s) instead of the Gb/s it used to be, so I'm investigating that.
I find it peculiar though that not being able to do a fast backup can kill a whole node's running...
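For anyone wanting to reproduce the measurement: a quick iperf3 run between the node and the PBS host shows the raw link speed (the PBS hostname below is a placeholder):
root@pbs:~# iperf3 -s
root@FT1-NodeD:~# iperf3 -c pbs.example.lan -t 10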
Regarding the hanging specifically: when the "pct status xxx" command times out, it's not what's inside the container, it's the system. Some of these LXCs run on NVMe storage, and for the rest of the system everything is running perfectly fine.
I still don't know why some nodes just look...
This morning it is another node that displays the exact same symptoms! Node B. It gets stuck trying to take a snapshot of an LXC and then all the LXCs become unresponsive.
I upgraded all the nodes yesterday to ensure that I've got all the latest patches, including PBS, so what could be...
We had an interesting situation this morning. For some reason one node in our cluster was not showing as active (the green "running" arrows on the guest icons in the tree) and none of the LXCs were responding. We managed to address the issue as quickly as possible by simply resetting the node and...
The full isolation of each LXD OS is better than with LXC, is it not? I think the toolset with LXD is more powerful as well. I may be wrong, but aren't there also more ready-to-launch images available for LXD than for LXC?
To allow a tunnel to be established into a container, this post describes a method to do so.
The essence of it is this:
Add these lines to the container config:
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net dev/net none bind,create=dir
Then change the /dev/net/tun...
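Pulling it together as a rough sketch (CTID 101 is just a placeholder, and the /dev/net/tun step is my assumption since the exact instruction is cut off above):
# on the Proxmox host, append the two lines to the container config (placeholder CTID 101)
cat >> /etc/pve/lxc/101.conf <<'EOF'
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net dev/net none bind,create=dir
EOF
# assumption: the truncated step adjusts access to the host's tun node; one common variant is
chmod 0666 /dev/net/tun
pct stop 101 && pct start 101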
Yes, indeed, except that in this case the predictable names get swapped counter to the explicit renaming rules I provide. It works perfectly on 3 nodes and used to work on the 4th as well. Now it doesn't. Did you see the detail in the reference?
I have 4 identical pmx nodes, on which I have renamed the NICs to the more workable eth0, 1, 2, 3. However, after a recent outage in the DC (due to a power test), one of these nodes swaps eth2 and 3 for no reason that I can find.
Please see...
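One common way to pin such names (my setup may differ in detail) is a systemd .link file per NIC, matched on its MAC address; the MAC and file name below are placeholders:
# /etc/systemd/network/10-eth2.link
[Match]
MACAddress=aa:bb:cc:dd:ee:02

[Link]
Name=eth2
# likewise a 10-eth3.link for eth3; then update-initramfs -u and reboot so the rename applies in early boot (assuming the Debian default of udev in the initramfs)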
This is literally a naming bug. If I simply add eth1 to the vmbr0 bridge and use eth2 for corosync, the node works correctly.
I'll wait to see who has an explanation; otherwise I'll file a bug with Debian. Or should it be filed with Proxmox?
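For illustration, the working layout boils down to something like this in /etc/network/interfaces (addresses here are placeholders, not my real ones):
auto eth2
iface eth2 inet static
    address 10.10.10.4/24    # corosync link, placeholder
auto vmbr0
iface vmbr0 inet static
    address 192.168.1.4/24   # placeholder
    gateway 192.168.1.1
    bridge-ports eth1
    bridge-stp off
    bridge-fd 0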