Search results

  1. [BUG] Network is only working selectively, can't see why

    This morning a restart of a node that had not been restarted for quite some time caused the same symptoms as those reported here. It dawned on me that this swapping of ports might occur because a new kernel is now running. On further investigation, here's what I found. NodeA was running...
  2. ceph thin provisioning for lxc's not working as expected?

    :redface: Of course, the command has to run on the node on which the container is running...! ~# pct fstrim 192 /var/lib/lxc/192/rootfs/: 88.9 GiB (95446147072 bytes) trimmed /var/lib/lxc/192/rootfs/home/user-data/owncloud: 1.6 TiB (1795599138816 bytes) trimmed However, when I ask rbd for the... (see the rbd du sketch after this list)
  3. ceph thin provisioning for lxc's not working as expected?

    I don't think it's a good idea to run privileged containers for clients, no? If a UID matches one of the host's UIDs that has rights to locations a client should not have access to, it could create a big problem...
  4. ceph thin provisioning for lxc's not working as expected?

    Does it mean that if you have a mountpoint (over and above the boot drive), thin-provisioning doesn't work? ~# cat /etc/pve/lxc/192.conf arch: amd64 cores: 4 features: nesting=1 hostname: productive memory: 8192 nameserver: 8.8.8.8 net0...
  5. ceph thin provisioning for lxc's not working as expected?

    Of course that gives the same result. For some reason the container believes that the storage doesn't support trimming, i.e. it's not thin provisioned. However, some other volumes on the same ceph storage pool are completely ok with trimming. Could there be something that's set in the...
  6. ceph thin provisioning for lxc's not working as expected?

    The response is: "fstrim: /: FITRIM ioctl failed: Operation not permitted". This is Ubuntu 22.04 running on a ceph storage cluster. Why is this?
  7. ceph thin provisioning for lxc's not working as expected?

    I have an LXC that is provisioned with a 100GB boot drive using ceph RBD storage. However, see the following: ~# df -h Filesystem Size Used Avail Use% Mounted on /dev/rbd10 98G 8.8G 85G 10% / This is in the running container. Checking the disk usage in ceph, however, claims...
  8. [SOLVED] LXC with more cores assigned uses dramatically less CPU. Why?

    I know the matter has been resolved, but just for reference, here's what I was referring to: ~# uptime 14:17:19 up 21 days, 19:44, 1 user, load average: 5.37, 5.36, 5.55 This refers to CPUs, not percentages (see the load-average sketch after this list).
  9. How to fence off ceph monitor processes?

    In the continuous process of learning about running a pmx environment with ceph, I came across a note regarding ceph performance: "... if running in shared environments, fence off monitor processes." Can someone explain what is meant by this and how one achieves it? Thanks! (One possible approach is sketched after this list.)
  10. [SOLVED] LXC with more cores assigned uses dramatically less CPU. Why?

    Then it must be changed to say percentage use. There is an enhancement request open for this.
  11. [SOLVED] LXC with more cores assigned uses dramatically less CPU. Why?

    40 CPUs, the number 40 in the graph. Not percentages. Go to a VM, however: 6 vCPUs assigned, and the graph shows 7. Why on earth would anyone think that this is a percentage graph?
  12. [SOLVED] LXC with more cores assigned uses dramatically less CPU. Why?

    Ouch! We are very used to other tools showing the actual CPU usage, so I'm surprised that this is a percentage graph. The left-hand scale does not indicate that, and is wrong in that case: the top should be 100% and the bottom 0%. As it is, the scale shows 14 at the top. Is that 14%? I think...
  13. [SOLVED] LXC with more cores assigned uses dramatically less CPU. Why?

    I have an interesting situation. An LXC running Power-mail-in-a-box has 4 cores assigned (with 8GB RAM and 100GB NVMe ceph pool storage). The graph below shows the following: The section from 9:32 to around 10:02 is when I only had 4 cores assigned. Before and after that time I had 12 cores...
  14. How to deal with unresponsive lxc and kvm guests in HA context

    Surely this is a bug and can't be by design. If the main service goes down because the place one backs up to cannot be reached consistently, the backup service should balk, but the main service must stay stable. I'll open a bug report for this.
  15. How to deal with unresponsive lxc and kvm guests in HA context

    The only one of these commands that shows a problem is this one: root@FT1-NodeD:~# systemctl status pvestatd ● pvestatd.service - PVE Status Daemon Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled) Active: active (running) since Wed 2022-10-19...
  16. How to deal with unresponsive lxc and kvm guests in HA context

    It seems that my remote backup server (PBS) may be the cause of this. I have now determined that the link to it is very slow (1 Mb/s) instead of the Gb/s it used to be, so I'm investigating that (see the link-speed sketch after this list). I find it peculiar, though, that not being able to do a fast backup can kill a whole node's running...
  17. How to deal with unresponsive lxc and kvm guests in HA context

    Regarding the hanging specifically: when the "pct status xxx" command times out, it's not what's inside the container, it's the system. Some of these LXCs run on NVMe storage, and for the rest of the system everything is running perfectly fine. I still don't know why some nodes just look...
  18. How to deal with unresponsive lxc and kvm guests in HA context

    This morning it is another node that displays the exact same symptoms! Node B. It gets stuck trying to take a snapshot of an LXC and then all the LXCs become unresponsive. I upgraded all the nodes yesterday to ensure that I've got all the latest patches, including PBS, so what could be...
  19. How to deal with unresponsive lxc and kvm guests in HA context

    This same thing has happened two days in a row now... :-(
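
A rough sketch related to the thin-provisioning results above (items 2, 6 and 7), assuming an RBD-backed container with VMID 192 on a pool named ceph-pool (the pool and image names are assumptions, not taken from the posts): since fstrim inside an unprivileged container fails with "FITRIM ioctl failed: Operation not permitted", the trim is run from the host and the reclaimed space is then checked with rbd du.

    # On the Proxmox node that is currently running the container:
    pct fstrim 192

    # Then ask ceph how much space the image actually occupies.
    # List the images first if the name is unknown (vm-192-disk-0 is assumed):
    rbd ls ceph-pool
    rbd du ceph-pool/vm-192-disk-0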
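
A small worked example for the load-average discussion (items 8 and 10-13): the uptime figures count runnable tasks, not percentages, so a 1-minute load of 5.37 is about 134% of capacity on 4 cores and about 45% on 12 cores. The one-liner below is only an illustration and assumes it is run inside the container, where nproc reflects the assigned cores.

    # Divide the 1-minute load average by the visible CPU count to get a
    # rough utilisation ratio (e.g. 5.37 over 4 CPUs ~= 134%).
    awk -v cpus="$(nproc)" '{ printf "1-min load %.2f over %d CPUs = %.0f%%\n", $1, cpus, $1/cpus*100 }' /proc/loadavg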
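
For the question in item 9 about fencing off ceph monitor processes, the quoted note gives no method; one possible reading (an assumption, not something stated in the posts) is to pin the monitor daemon to dedicated cores so it does not compete with guest workloads. A sketch using a plain systemd drop-in, with arbitrary example core numbers, could look like this:

    # Open a drop-in for the monitor unit on this node (the unit name is
    # assumed to follow the usual ceph-mon@<hostname> pattern):
    systemctl edit ceph-mon@$(hostname).service

    # In the editor, add:
    #   [Service]
    #   CPUAffinity=0 1
    #
    # then restart so the monitor only runs on cores 0 and 1:
    systemctl restart ceph-mon@$(hostname).service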
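
For the slow PBS link mentioned in item 16, a quick way to confirm raw throughput between the node and the backup server is an iperf3 run; the hostname below is a placeholder, not taken from the posts.

    # On the PBS host (placeholder name), start a listener:  iperf3 -s
    # From the Proxmox node, measure throughput to it for 10 seconds:
    iperf3 -c pbs.example.lan -t 10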