Search results

  1. dakralex

    Clarification related to HA Maintenance Mode/Affinity Rules

    I proposed to change that since the failback flag's description [0] explicitly states it's only concerned with moving back to a higher priority node class, but for now it's a proposal ;). Exactly, in the end if two nodes have the exact same score (e.g. exactly the same number of guests on each...
  2. dakralex

    Proxmox pve-manager 9.1.8 / pve-ha-manager 5.2.0 [Dynamic load intro]

    Thanks for the input! This should be doable, but since we haven't tracked whether migrations are initiated by a user or some automatic mechanism yet, this should be tracked in a separate Bugzilla entry, so feel free to create one for this here [0]. This could also be related to another feature...
  3. dakralex

    Proxmox pve-manager 9.1.8 / pve-ha-manager 5.2.0 [Dynamic load intro]

    Thanks a lot for testing and sending in a report! There is a patch series in review, which overhauls the CRS section itself and adds documentation for the new load balancing system here [0], but this will certainly be available before or in the Proxmox VE 9.2 release. External feedback on these...
  4. dakralex

    Clarification related to HA Maintenance Mode/Affinity Rules

    Not directly, this will only be respected if there are no affinity rules that prohibit this behavior. For example, if the node affinity rule in your second scenario had been strict, then the behavior would be correct as the HA resource on node3 does not have any other place to go but...
  5. dakralex

    Clarification related to HA Maintenance Mode/Affinity Rules

    In general, there shouldn't really be a precedence as all of those conditions should hold at the same time. The rule verification system dismisses many types of affinity rules that cannot be determined to be resolvable at runtime, see [0]. There are still valid cases, which still could...
  6. dakralex

    disarm-ha and arm-ha commands

    The disarm-ha and arm-ha commands are mainly intended for specific maintenance tasks where the whole cluster communication stack is temporarily unavailable, or for other situations where one wants to prevent the HA stack from fencing a node. The HA Manager should be able to handle complete cluster...
  7. dakralex

    Clarification related to HA Maintenance Mode/Affinity Rules

    Welcome to the Proxmox forum, Libero_AT! For scenario 1, it seems that the current HA stack gives more priority to keeping the resource affinity rule satisfied than to migrating the resource back to its maintenance node. For scenario 2, I assume that the negative resource affinity rule (keep...
  8. dakralex

    [Solved] Recent updates caused problem with migration

    Hi! Could you post the output of pveversion -v and pct start 103 --debug, as well as the syslog covering the start of the container? Also, what version of apparmor is running on the host?
  9. dakralex

    Sporadic error when live migrating VMs on 9.1.7

    Hi! Could you provide the config of the migrated VMs where this happens? Which storages/filesystems are the disks stored on?
  10. dakralex

    Option for "group" gone in edit ct/vm machine HA setting in Prox 9

    Yes, there currently is no option to bulk add them to the HA stack, but feel free to add a feature request in our Bugzilla [0] so that we can track it there, with some description of what it should do. [0] https://bugzilla.proxmox.com/enter_bug.cgi?product=pve&component=HA
  11. dakralex

    Negative affinity moved both VMs

    Good question! But yes, in the proposed 2-node cluster setup with a QDevice, the HA Manager will only move one of the HA resources to another node and keep the other one as-is, because it is known that there aren't any other viable nodes to migrate to. In fact, it is the same behavior as...
  12. dakralex

    Negative affinity moved both VMs

    Hi! Yes, currently, this is the expected behavior as the HA Manager will detect that each HA resource is on the same node as an HA resource it must not be together with and schedules both to move to other nodes. Negative HA resource affinity rules are relatively strict at the moment, as in...
  13. dakralex

    [SOLVED] Fencing Status after HA update

    Hi! There should be no specific reason why the fencing status is either on the top or the bottom. I couldn't yet reproduce the order of the first screenshot. Does it happen randomly on reloading, or does it move while staying on the page, where it updates automatically?
  14. dakralex

    [SOLVED] Why does my PVE node still get rebooted by Watchdog when HA VMs are set to Ignored during a network outage?

    Great that the details helped you! The recommended way to put an LRM service into an idle state would be to either move all of the HA resources placed on that node to another node or put all the HA resources placed on that node into the 'ignored' state and then wait the ~10 minutes until it...
  15. dakralex

    [SOLVED] Why does my PVE node still get rebooted by Watchdog when HA VMs are set to Ignored during a network outage?

    After all the HA resources were set to 'ignored', it takes roughly 10 minutes for the LRM and 15 minutes for the CRM to release their watchdogs and become idle. Maybe the network interruptions for both the production and test setups happened while some of the LRM services and/or the CRM Manager...
  16. dakralex

    VM failback with cloudinit and ZFS replication fails

    To mitigate this for now, zfs2/vm-101-cloudinit can be removed on the failed node to be able to migrate it back there.
  17. dakralex

    VM failback with cloudinit and ZFS replication fails

    Hi! Thanks for the report! I suppose there is an HA node affinity rule which makes the HA resource fail back to the old node. As the node fails, the HA Manager will move the HA resource but not clean up the cloudinit image from the failed node as would happen under normal circumstances... I...
  18. dakralex

    <span>nodename</span> showing up in front of everything

    Hi! Could you try force refreshing the web interface with Ctrl+F5 or Ctrl+Shift+R? Which version of pve-manager is running on the node?
  19. dakralex

    [SOLVED] PVE 9.1.5: Linux VM Freeze Randomly

    Hm, then it could be a kernel panic inside the VM that is causing it and isn't written to the disk anymore if it doesn't show up in the syslog, e.g. journalctl -b -1... Maybe you can set up netconsole [0] [1] or some other log-persisting setup to capture the log when the freeze happens. [0]...
  20. dakralex

    How to modify cgroup trees of LXCs and VMs?

    Hi! There is no simple way to move VMs and CTs under a single root cgroup as for VMs the code already depends on the VM's cgroups to be under the /qemu.slice cgroup and the /lxc cgroup is given by LXC itself, which has many intricacies that make this a requirement. So it can only be limited for...